I want to show you how to revert a commit in a Git repository, in other words, how to rewrite your story! There are a few ways to do it and a few cases to consider, because rewriting local history is different from rewriting remote history. There are risks too, because rewriting history can be very dangerous. So sit back and let’s start changing history.
Ways to roll back changes
Let me remind you briefly how Git works. When working with the repository, we have 4 levels:
- working directory – the files on disk, including changes that have not been staged yet
- staging (index) – changes ready to be committed
- local repository – all local commits, including those not yet pushed
- remote repository – external synchronization point
These layers matter because Git’s rollback operations behave differently at each level. Another important thing to keep in mind is that Git history is linear. OK, we can branch the work, but each branch is still a chain: every commit points to its parent and has its own unique SHA hash, so any attempt to manipulate that chain can cause us trouble.
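To make the layers and the parent link a bit more tangible, here is a minimal sketch that walks one file through the working directory, the index, and the local repository. The scratch directory, the file names a.txt and b.txt, and the demo identity are assumptions made up for this illustration.

```shell
set -e
# Throwaway identity so the commits work without any Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)                       # scratch directory, delete it when done
cd "$demo"
git init -q .

echo "first" > a.txt
git add a.txt
git commit -q -m "A"
commit_a=$(git rev-parse HEAD)

echo "second" > b.txt
in_workdir=$(git status --porcelain)    # file exists only in the working directory
git add b.txt
in_index=$(git status --porcelain)      # file is staged in the index
git commit -q -m "B"
after_commit=$(git status --porcelain)  # nothing pending, the change is in the local repository

parent_of_b=$(git rev-parse HEAD~1)     # commit B records commit A as its parent

echo "working directory: $in_workdir"
echo "staging:           $in_index"
echo "after commit:      ${after_commit:-<clean>}"
echo "parent of B:       $parent_of_b"
```

The remote repository is left out here on purpose; a push would be the step that moves commit B to the fourth level.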
Let’s consider the first case. The last commit in the remote repository is A. We grab this code and start working. We’ve created a new feature and made our own commit B, but it’s still only local, because we haven’t done a push yet. Our linear story goes something like this: A -> B. It turns out, however, that this feature is redundant (or wrong) and we don’t want to have it. We can easily remove it with the reset operation (git reset). We also have to choose a flag: --soft keeps the changes from B staged, --mixed (the default) keeps them in the working directory but unstaged, and --hard discards them entirely. Done! But you need to know that such an operation has serious consequences. Here commit B existed only in our local repository, so there is no problem and we can safely perform this operation.
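The three reset flags are easiest to compare side by side. The sketch below recreates the A and B commits in a scratch repository and resets B three times; the file names app.txt and feature.txt and the demo identity are assumptions for illustration only.

```shell
set -e
# Throwaway identity so the commits work without any Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)                       # scratch directory, delete it when done
cd "$demo"
git init -q .

echo "base" > app.txt
git add app.txt
git commit -q -m "A"
commit_a=$(git rev-parse HEAD)

echo "feature" > feature.txt
git add feature.txt
git commit -q -m "B"

# --soft: remove commit B, keep its changes staged in the index.
git reset -q --soft HEAD~1
soft_status=$(git status --porcelain)
git commit -q -m "B again"              # recreate the commit for the next try

# --mixed: remove the commit, keep the file on disk but unstaged.
git reset -q --mixed HEAD~1
mixed_status=$(git status --porcelain)
git add feature.txt
git commit -q -m "B once more"

# --hard: remove the commit and discard its changes completely.
git reset -q --hard HEAD~1
hard_status=$(git status --porcelain)

echo "soft:  $soft_status"
echo "mixed: $mixed_status"
echo "hard:  ${hard_status:-<clean>}"
```

After the last step the branch points at commit A again and feature.txt is gone from disk, which is why --hard is the flag to be most careful with.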
But what if we wanted to reset a commit that already exists in the remote repository? Well, this is where it gets a little wild. Git will let us do it locally, why not, but we should avoid such an operation. I’ve already mentioned that the history in Git is linear and that each commit has a parent. This is important because when we ‘break’ such a sequence, our local history no longer matches the remote one and we will not be able to synchronize with the external repository. A normal push will end with an error, because by default Git rejects a push that is not a simple continuation (a fast-forward) of the remote branch.
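The rejected push can be reproduced locally. In this sketch a bare repository in a scratch directory stands in for the external repository; the paths, the branch name main, and the file names are assumptions made for the example.

```shell
set -e
# Throwaway identity so the commits work without any Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)                       # scratch directory, delete it when done
git init -q --bare "$demo/remote.git"   # stand-in for the external repository

git init -q "$demo/work"
cd "$demo/work"
git symbolic-ref HEAD refs/heads/main   # name the branch regardless of Git defaults
git remote add origin "$demo/remote.git"

echo "base" > app.txt
git add app.txt
git commit -q -m "A"
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "B"
commit_b=$(git rev-parse HEAD)
git push -q origin main                 # A and B are now public

# Reset the already pushed commit locally, then try an ordinary push.
git reset -q --hard HEAD~1
if git push -q origin main 2>/dev/null; then
  push_result=accepted
else
  push_result=rejected                  # the remote branch is ahead of ours
fi
echo "push was $push_result"

remote_tip=$(git --git-dir="$demo/remote.git" rev-parse refs/heads/main)
```

The remote branch still points at commit B afterwards, so the reset changed nothing outside our own clone.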
How to undo a commit in git
There may come a time when you need to undo something, for instance a local commit, a restored file, or changes you have already made. There are a few basic tools (and rules too), but keep in mind that some of these operations cannot be undone later. To be more specific, we are getting into a dangerous area where you may end up losing data and work, so… be careful. Now, let’s go back to the main topic and find out how to undo changes in Git.
Git revert commit
So how can we undo a commit that already exists outside our machine? Fortunately, there is a safe solution: the git revert operation. With it, we can undo the changes from any commit. How do we use it the right way? We don’t have to check who changed what and where, we just tell Git which commit is to be rolled back. From a code perspective, we are doing something that resembles an ordinary UNDO. There was a change, and now there is no change. We got rid of the unwanted code. However, from Git’s perspective, it looks a bit different. Friendly reminder: Git history is linear, so revert simply makes a new commit that mirrors the one we want to undo. The existing history has not changed (we only got a new commit), so it is a safe operation. You can see that in the picture below.
Figure: How to revert a commit in Git
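As a small sketch of the same idea in commands: the script below creates commits A and B in a scratch repository and then reverts B. The file names and the demo identity are assumptions for illustration.

```shell
set -e
# Throwaway identity so the commits work without any Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)                       # scratch directory, delete it when done
cd "$demo"
git init -q .

echo "base" > app.txt
git add app.txt
git commit -q -m "A"
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "B"
commit_b=$(git rev-parse HEAD)

# Undo B by adding a new commit that mirrors it; --no-edit keeps the default message.
git revert --no-edit HEAD > /dev/null

commit_count=$(git rev-list --count HEAD)
git log --oneline                       # three commits: A, B and the revert
```

Commit B is still part of the history; only the newest commit takes its changes back out, which is what makes the result safe to push normally.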
I have accidentally pushed critical data
We already know several ways to roll back changes, and each has its own advantages and disadvantages. But these were only theoretical examples to learn a few commands. Let us now consider a more difficult case. Suppose I had a password or another very important piece of information saved in plain text in the notes.txt file. I accidentally added this file to my commit and did a push. What’s next? A quick reset won’t work, as it would ruin the history. Revert will not help either, because my commit will still be public. It will not be visible in the current version of the code, in the current HEAD, but it will still be stored in history. This is very dangerous because it gives the illusion that the password is no longer available in the repository. Well… it is. In my article about ransomware attacks on GitHub, Bitbucket and GitLab I describe a case that proves that such things happen in the real world:
“In 2018, an experiment was conducted to search Github for withdrawn commits that contained the words “removed password” in the message. Result? 350k!”
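It is easy to check for yourself that a reverted secret stays readable. This is a minimal sketch in a scratch repository; the file content password=hunter2 is a made-up value and the demo identity is an assumption.

```shell
set -e
# Throwaway identity so the commits work without any Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)                       # scratch directory, delete it when done
cd "$demo"
git init -q .

echo "base" > app.txt
git add app.txt
git commit -q -m "A"

echo "password=hunter2" > notes.txt     # made-up secret for the demo
git add notes.txt
git commit -q -m "Add notes"
commit_b=$(git rev-parse HEAD)

# "Remove" the secret with a revert: the file disappears from HEAD.
git revert --no-edit HEAD > /dev/null

# The old commit is still in history, so the content can be read back.
leaked=$(git show "$commit_b:notes.txt")
echo "still readable from history: $leaked"
```

So after a leak like this, the password itself should be treated as compromised, whatever happens to the repository.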
Resetting public commits
So how can we deal with it? It is not that simple, especially when many people use our repository at the same time. This can be a big problem for our company: not only do we risk exposing important data, but repairing the problem may also stop work on a given project for some time. Why? Let me explain what steps would be taken in such a situation.
We will be using the reset operation, but before that, we need to pause syncing with the external repository for a while. Nobody should pull or push, and all open PRs should be closed.
The person fixing the situation should ideally start from a clean clone of the project (remember to fetch the tags as well). What’s next? Well, now we do a reset, which we already know spoils our history. But don’t worry, in this case we are doing it on purpose, and we know how to get out of it. It is easy enough: we reset with the --hard flag (that is, we discard the changes instead of keeping them in the working directory). Once we are sure that the unwanted change is gone, we push, necessarily with the --force flag, which overwrites the state of the given branch on the remote. This allows us to push even though we changed the commit history. Be careful! This is a very risky operation, and force pushes are often disabled on shared branches for security reasons.
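The steps above can be sketched end to end with a local stand-in for the external repository. Everything named here (the bare repository path, the branch main, the files, the made-up password) is an assumption for the demo, not a recipe for a specific hosting service.

```shell
set -e
# Throwaway identity so the commits work without any Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)                       # scratch directory, delete it when done

# A bare repository stands in for the external one.
git init -q --bare "$demo/remote.git"
git --git-dir="$demo/remote.git" symbolic-ref HEAD refs/heads/main

# Recreate the accident: commit A, then commit B with the secret, both pushed.
git init -q "$demo/work"
cd "$demo/work"
git symbolic-ref HEAD refs/heads/main
git remote add origin "$demo/remote.git"
echo "base" > app.txt
git add app.txt
git commit -q -m "A"
commit_a=$(git rev-parse HEAD)
echo "password=hunter2" > notes.txt     # made-up secret for the demo
git add notes.txt
git commit -q -m "B"
git push -q origin main

# The fix, done from a clean clone.
git clone -q "$demo/remote.git" "$demo/fix"
cd "$demo/fix"
git reset -q --hard HEAD~1              # drop commit B and its changes
git push -q --force origin main         # overwrite the remote branch

remote_tip=$(git --git-dir="$demo/remote.git" rev-parse refs/heads/main)
echo "remote main now points at $remote_tip"
```

Git also offers --force-with-lease, which refuses to overwrite the branch if someone else pushed in the meantime; whether that fits depends on how strictly pushing has been paused.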
Okay, we’ve cleaned up the situation on the branch, but the problem hasn’t been resolved yet. What if someone had already downloaded these changes locally before we fixed them? Ideally, each developer should delete their local copy and make a clean clone, but this is quite problematic. In most cases it should be enough to fetch the corrected branch and rebase local work onto it, carefully resolving conflicts if they appear and making sure the unwanted commit is not replayed along the way.
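Here is one way a developer who already pulled the bad commit might do that fetch and rebase, sketched with local stand-in repositories. It uses git rebase --onto so that only the work made after the unwanted commit is replayed; the repository paths, branch name, and file names are assumptions for the demo.

```shell
set -e
# Throwaway identity so the commits work without any Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)                       # scratch directory, delete it when done
git init -q --bare "$demo/remote.git"   # stand-in for the external repository
git --git-dir="$demo/remote.git" symbolic-ref HEAD refs/heads/main

# Maintainer pushes A and the unwanted commit B.
git init -q "$demo/work"
cd "$demo/work"
git symbolic-ref HEAD refs/heads/main
git remote add origin "$demo/remote.git"
echo "base" > app.txt
git add app.txt
git commit -q -m "A"
commit_a=$(git rev-parse HEAD)
echo "password=hunter2" > notes.txt     # made-up secret for the demo
git add notes.txt
git commit -q -m "B"
commit_b=$(git rev-parse HEAD)
git push -q origin main

# A developer clones while B is still public and adds commit C on top.
git clone -q "$demo/remote.git" "$demo/dev"
cd "$demo/dev"
echo "more work" > c.txt
git add c.txt
git commit -q -m "C"

# Maintainer removes B from the remote branch.
cd "$demo/work"
git reset -q --hard HEAD~1
git push -q --force origin main

# Developer: fetch the corrected branch, replay only what came after B.
cd "$demo/dev"
git fetch -q origin
git rebase -q --onto origin/main "$commit_b"
git log --oneline                       # A followed by the rebased C
```

If a developer has no local work at all, a fresh clone remains the simplest option.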
There is also another way to solve this problem. We can make a local copy of the branch with the unwanted code and then remove the branch completely from the external repository. On our copy, we reset as above, but we don’t need the --force flag when pushing, because from Git’s perspective we are pushing a completely new branch. It just has the same name as before, only with different content.
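A minimal sketch of this delete-and-push-again variant follows. It uses a branch called feature rather than the default branch, because a server may refuse to delete the branch its HEAD points to; the paths, branch names, and files are assumptions for the demo.

```shell
set -e
# Throwaway identity so the commits work without any Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)                       # scratch directory, delete it when done
git init -q --bare "$demo/remote.git"   # stand-in for the external repository
git --git-dir="$demo/remote.git" symbolic-ref HEAD refs/heads/main

git init -q "$demo/work"
cd "$demo/work"
git symbolic-ref HEAD refs/heads/main
git remote add origin "$demo/remote.git"
echo "base" > app.txt
git add app.txt
git commit -q -m "A"
commit_a=$(git rev-parse HEAD)
git push -q origin main

# The unwanted commit lands on a feature branch and gets pushed.
git checkout -q -b feature
echo "password=hunter2" > notes.txt     # made-up secret for the demo
git add notes.txt
git commit -q -m "B"
git push -q origin feature

# Remove the branch from the remote, fix the local copy, push it as a new branch.
git push -q origin --delete feature
git reset -q --hard HEAD~1
git push -q origin feature              # no --force needed, the branch is new again

remote_tip=$(git --git-dir="$demo/remote.git" rev-parse refs/heads/feature)
echo "remote feature now points at $remote_tip"
```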
In both situations there is one more thing to remember, and it is very important from a security perspective. Anyone who knows the hash of our unwanted, deleted commit will still be able to access it and recover the data for some time! The removed commit becomes a so-called orphaned commit: it is not linked to any branch, but it still exists. It would be worthwhile to run Git’s garbage collection (GC) manually here, but we’ll talk about that another time.
Moreover, Git has a restore command (git restore) that can bring back deleted files, for example from a commit that is still stored in the repository.
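Both points can be seen in a scratch repository: after a hard reset the orphaned commit is still readable by its hash, and git restore (available in Git 2.23 and later) can pull the deleted file back out of it. The file names, the made-up password, and the demo identity are assumptions for illustration.

```shell
set -e
# Throwaway identity so the commits work without any Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)                       # scratch directory, delete it when done
cd "$demo"
git init -q .

echo "base" > app.txt
git add app.txt
git commit -q -m "A"
echo "password=hunter2" > notes.txt     # made-up secret for the demo
git add notes.txt
git commit -q -m "B"
commit_b=$(git rev-parse HEAD)

# Remove commit B from the branch; notes.txt disappears from disk.
git reset -q --hard HEAD~1

# No branch points at B any more, yet the object is still in the repository.
still_there=$(git cat-file -t "$commit_b")

# Anyone holding the hash can bring the file back into the working directory.
git restore --source="$commit_b" -- notes.txt
recovered=$(cat notes.txt)
echo "object type: $still_there, recovered content: $recovered"
```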
Another way, and it seems to be the safest, is to use a backup of the repository from before the unfortunate commit, if we have such a backup at all. That might also require some work. The risk here is losing the changes that came later, but we can export them as patches and then apply them to the version recovered from the backup. If we have a properly configured backup process, then deleting the repository and quickly recovering it from a backup seems to be the safest and fastest solution, because it does not require us to manually reset specific changes.
One conclusion can be drawn from the above considerations: make backups as often as possible, because you never know when one may be needed. And when it is needed, and sooner or later it will be, we realize how much it costs us not to have it.
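As a rough sketch of the patch idea, the script below treats a mirror clone as the backup (an assumption, since any backup tool could play that role), then carries a later, harmless commit over to the recovered repository with git format-patch and git am. Paths, file names, and the made-up password are hypothetical.

```shell
set -e
# Throwaway identity so the commits work without any Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)                       # scratch directory, delete it when done
git init -q "$demo/work"
cd "$demo/work"
git symbolic-ref HEAD refs/heads/main

echo "base" > app.txt
git add app.txt
git commit -q -m "A"

# The "backup": a mirror clone taken before the unfortunate commit.
git clone -q --mirror "$demo/work" "$demo/backup.git"

echo "password=hunter2" > notes.txt     # made-up secret for the demo
git add notes.txt
git commit -q -m "B"
echo "more work" > c.txt
git add c.txt
git commit -q -m "C"
commit_c=$(git rev-parse HEAD)

# Export the later change we want to keep as a patch file.
git format-patch -q -o "$demo/patches" -1 "$commit_c" > /dev/null

# Recover from the backup and apply the patch on top of it.
git clone -q "$demo/backup.git" "$demo/restored"
cd "$demo/restored"
git am -q "$demo"/patches/*.patch
git log --oneline                       # A followed by C, without B
```

This only works smoothly when the later commits do not depend on the unwanted one; otherwise the patches need manual attention.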