In this article I want to show you how to rewrite changes in your repositories, in other words how to rewrite your history, and which git commands you need on the command line. There are several ways to do it, because we have to consider both local and remote history. There are also risks, because rewriting history to undo commits can be very dangerous. So sit back and let’s start changing the git log.
Ways to roll back changes
Let me briefly point out how Git works. When we work with a repository, we deal with four levels:
- working directory – the files on disk, including changes that are not staged yet
- staging (index) – changes ready to be committed
- local repository – all local commits, including those not yet synchronized
- remote repository – external synchronization point
Those layers are important because rollback operations work differently at each level. Another important thing, before we move on, is that the history of a Git branch is linear. We can branch the work, but each branch is still a chain of commits. Each git commit has a parent and its own unique SHA hash, and any attempt to manipulate that chain can cause us trouble.
Let’s consider the first case. The last commit in the remote repository is A. We grab this code and start working. We create a new feature and make our own commit B, but it is still only local because we haven’t pushed yet. Our linear history looks like this: A - B. It turns out, however, that this feature is redundant (or wrong) and we don’t want it. We can easily remove this last commit with the git reset command. We also have to choose a flag (--soft, --mixed, or --hard), which determines whether the changes stay in the index, stay only in the working directory, or are discarded. Done! But you need to know that such an operation can have serious consequences. In the example above, the latest commit B existed only in our local repository. In that situation there is no problem and we can safely perform this operation.
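To make the three flags concrete, here is a minimal sketch that plays out the A - B scenario in a throwaway repository. The file names, commit messages, and the temporary directory are assumptions made for the demo, not part of any real project.

```shell
set -e
repo=$(mktemp -d)                 # throwaway repository for the demo
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "A"

echo "feature" > feature.txt
git add feature.txt
git commit -q -m "B"              # local only, never pushed

# --soft: drop commit B, keep its changes staged in the index
git reset -q --soft HEAD~1
soft_status=$(git status --porcelain)

git commit -q -m "B"              # recreate B for the next variant

# --mixed (the default): drop commit B, keep its changes in the working directory only
git reset -q --mixed HEAD~1
mixed_status=$(git status --porcelain)

git add feature.txt
git commit -q -m "B"              # recreate B once more

# --hard: drop commit B and discard its changes entirely
git reset -q --hard HEAD~1
hard_status=$(git status --porcelain)
commits_left=$(git rev-list --count HEAD)
```

After the soft reset the new file is still staged, after the mixed reset it is only an untracked file in the working directory, and after the hard reset it is gone.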
But what if we wanted to reset a git commit that already exists in the shared remote repository? Well, this is where it gets a little wild. Git will allow us to do that locally, but we should avoid such an operation. I’ve already mentioned that the history in Git is linear and that each commit has a parent. This is important because when we ‘break’ such a sequence and our local git log no longer matches the remote one, we will not be able to synchronize with the external repository. We’ll get an error when trying to push, because by default Git rejects a push that would discard commits already on the remote.
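The rejected push can be reproduced locally. In this sketch a bare repository in a temporary directory stands in for the shared remote; the paths and file names are assumptions made for the demo.

```shell
set -e
base=$(mktemp -d)
git init -q --bare "$base/remote.git"     # stand-in for the shared remote

git init -q "$base/work"
cd "$base/work"
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$base/remote.git"

echo "a" > a.txt
git add a.txt
git commit -q -m "A"
echo "b" > b.txt
git add b.txt
git commit -q -m "B"
git push -q origin HEAD                    # A and B are now public

# Remove B locally, then try an ordinary push
git reset -q --hard HEAD~1
if git push -q origin HEAD 2>/dev/null; then
  push_result="accepted"
else
  push_result="rejected"                   # non-fast-forward update is refused
fi

remote_commits=$(git --git-dir="$base/remote.git" rev-list --count --all)
```

The remote still holds both commits afterwards; only the local branch has changed.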
How to undo a commit in git
There may come a time when you need to undo something, for instance a local commit, a git restore, or other changes in git. There are a few basic tools (and rules too), but keep in mind that some of these operations cannot be undone later. To be more specific, we are getting into a dangerous area where you may end up losing data and work, so be careful. Now, let’s go back to the main topic and find out how to undo changes in git.
Git revert commit
So how can we undo changes that already exist outside our local repository? Fortunately, there is a safe solution: the git revert command. With it, we can undo the changes from any git commit. We don’t have to work out by hand who changed what and where, we just tell Git which commit is to be rolled back. From a code perspective, we are doing something that resembles an ordinary UNDO. There was a change, and now there is no change. We got rid of the unwanted code. However, from Git’s perspective, it looks a bit different. Friendly reminder: the Git history is linear, so the git revert command does not remove anything, it makes a new commit. This new commit applies the inverse of the commit we want to undo. The existing history has not changed (we only got a new commit), so it is a safe operation. You can see that in the picture below.
How to revert a commit in git
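The revert described above can be sketched in a throwaway repository. The file names and commit messages are assumptions made for the demo.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "A"

echo "unwanted" > feature.txt
git add feature.txt
git commit -q -m "B"
bad_commit=$(git rev-parse HEAD)

# Create a new commit that applies the inverse of B; --no-edit keeps the default message
git revert --no-edit "$bad_commit" > /dev/null

commit_count=$(git rev-list --count HEAD)       # A, B, and the revert commit
revert_subject=$(git log -1 --format=%s)
```

The unwanted file disappears from the working directory, while commit B itself stays in the history and a third commit records the rollback.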
I have accidentally pushed critical data
We already know several ways to roll back changes, and each has its own advantages and disadvantages. But these were only theoretical examples to learn a few commands. Let us now consider a more difficult case. Suppose I had a password or another very important piece of information saved in plaintext in the notes.txt file. I accidentally added this file to my current commit and pushed it. What now, especially if other commits have followed? A quick git reset won’t work, as it would ruin the commit history. Git revert will not help either, because the original commit will still be public, and there is no point in announcing it to the team with a revert commit message. The password will not be visible in the current version of the code, in the current HEAD, but it will still be stored in the git log. This is very dangerous because it gives the illusion that the password is no longer available in the git repository. Well, it is. In my article about ransomware attacks on GitHub, Bitbucket, and GitLab I describe a case that proves that such things happen in the real world:
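The illusion is easy to demonstrate. In the sketch below the file name notes.txt comes from the example above, while the password value and the rest of the repository are placeholders invented for the demo.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "A"

echo "password=example-only" > notes.txt      # placeholder secret
git add notes.txt
git commit -q -m "Add notes"
leak=$(git rev-parse HEAD)

# Revert the commit: notes.txt disappears from the current HEAD
git revert --no-edit "$leak" > /dev/null

# The old version of the file is still one command away
recovered=$(git show "$leak:notes.txt")
```

Anyone with access to the repository can list the commits that touched the file with git log --all -- notes.txt and read the content the same way.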
“In 2018, an experiment was conducted to search Github for withdrawn commits that contained the words “removed password” in the message. Result? 350k!”
Resetting public commits
So how can we deal with it? It is not that simple, especially when many people use our git repository at the same time. This can be a big problem for our company: not only do we risk exposing important data, but repairing the problem may also stop work on the project for some time. Why? Let me explain what steps would be taken in such a situation.
We will be using the git reset operation, but before that, we need to pause syncing with the external repository for a while. Nobody should pull or push, and all open PRs should be closed.
The person fixing the situation should ideally start from a clean clone of the project (remember to fetch the tags). What’s next? Now we do a git reset, which, as we already know, spoils our commit history. But don’t worry, in this case we’re doing it on purpose, and we’re about to figure out how to get out of it. This is easy enough because we just reset with the --hard flag (i.e. discard everything and do not keep the code in the working directory). When we are sure that the unwanted change is gone, we push, but necessarily with the --force flag, which overwrites the state of the given branch on the remote. This allows us to push even though we changed the git log. Be careful! This is a very risky operation, and force pushes are often disabled on shared branches for security reasons.
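The reset and force push can be sketched as follows. A bare repository in a temporary directory stands in for the shared remote, and the directory names, file names, and placeholder password are assumptions made for the demo.

```shell
set -e
base=$(mktemp -d)
git init -q --bare "$base/remote.git"          # stand-in for the shared remote

# Seed the remote with commit A and the unwanted commit
git init -q "$base/seed"
cd "$base/seed"
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$base/remote.git"
echo "base" > app.txt
git add app.txt
git commit -q -m "A"
echo "password=example-only" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git push -q origin HEAD

# The person fixing the problem works in a clean clone, tags included
git clone -q "$base/remote.git" "$base/fix"
cd "$base/fix"
git fetch -q --tags origin
branch=$(git symbolic-ref --short HEAD)

# Drop the unwanted commit locally, then overwrite the remote branch
git reset -q --hard HEAD~1
git push -q --force origin "$branch"

local_tip=$(git rev-parse HEAD)
remote_tip=$(git --git-dir="$base/remote.git" rev-parse "refs/heads/$branch")
remote_commits=$(git --git-dir="$base/remote.git" rev-list --count "refs/heads/$branch")
```

If the server allows it, git push --force-with-lease is a more cautious variant: it refuses to overwrite the branch when someone else has pushed in the meantime.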
Okay, we’ve cleaned up the situation on the branch, but the problem hasn’t been resolved yet. What if someone had already pulled these changes before we fixed them? Ideally, each developer should delete their local copy and make a clean clone, but this is quite inconvenient. In most cases it should be enough to fetch the corrected branch, rebase local work onto it, and carefully resolve any conflicts that appear.
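Here is one way the fetch and rebase could look for a teammate whose own commit sits on top of the unwanted one. This is a sketch under stated assumptions: the repository layout, names, and placeholder password are invented, and it uses git rebase --onto so that the unwanted commit is left behind, because a plain rebase onto the corrected branch would replay that commit together with the teammate's work.

```shell
set -e
base=$(mktemp -d)
git init -q --bare "$base/remote.git"

# Maintainer publishes A and the unwanted commit
git init -q "$base/maintainer"
cd "$base/maintainer"
git config user.name "Maintainer"
git config user.email "maintainer@example.com"
git remote add origin "$base/remote.git"
echo "base" > app.txt
git add app.txt
git commit -q -m "A"
echo "password=example-only" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
leak=$(git rev-parse HEAD)
branch=$(git symbolic-ref --short HEAD)
git push -q origin "$branch"

# Teammate clones and commits C on top of the unwanted commit
git clone -q "$base/remote.git" "$base/teammate"
cd "$base/teammate"
git config user.name "Teammate"
git config user.email "teammate@example.com"
echo "work" > work.txt
git add work.txt
git commit -q -m "C"

# Maintainer removes the unwanted commit and force-pushes
cd "$base/maintainer"
git reset -q --hard HEAD~1
git push -q --force origin "$branch"

# Teammate fetches the corrected branch and replays only C onto it
cd "$base/teammate"
git fetch -q origin
git rebase -q --onto "origin/$branch" "$leak"

if git merge-base --is-ancestor "$leak" HEAD; then
  leak_in_history="yes"
else
  leak_in_history="no"
fi
teammate_commits=$(git rev-list --count HEAD)   # A and the replayed C
```

The teammate needs to know the hash of the unwanted commit, so it is worth sharing it with the team together with the instructions.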
There is also another way to solve this problem. We can make a local copy of the current branch with the unwanted code and then remove the working branch completely from the external repository. On our copy, we reset as above, but we don’t need the --force flag when pushing, because from Git’s perspective we are pushing a completely new branch. It has the same name as before, only the content is different.
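A minimal sketch of the delete-and-recreate approach, again with a bare repository standing in for the remote. The branch name feature and the file names are assumptions made for the demo; note that many servers refuse to delete the default branch, so this is shown on a separate working branch.

```shell
set -e
base=$(mktemp -d)
git init -q --bare "$base/remote.git"

git init -q "$base/work"
cd "$base/work"
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$base/remote.git"

echo "base" > app.txt
git add app.txt
git commit -q -m "A"

git checkout -q -b feature                    # hypothetical working branch
echo "password=example-only" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git push -q origin feature                    # the unwanted commit is now public

# Clean the local copy, delete the remote branch, push it again as a new branch
git reset -q --hard HEAD~1
git push -q origin --delete feature
git push -q origin feature                    # ordinary push, no --force needed

local_tip=$(git rev-parse HEAD)
remote_tip=$(git --git-dir="$base/remote.git" rev-parse refs/heads/feature)
```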
In both situations, there is one more thing that we must remember, and it is very important from a security perspective. If someone knows the commit hash of our unwanted, deleted git commit, they will be able to access it and recover the data for some time! The removed commit becomes a so-called orphaned commit: it is not linked to any branch, but it still exists. It would be worthwhile to run Git’s garbage collection (git gc) manually here, but we’ll talk about it another time.
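As a small preview, this sketch shows an orphaned commit being read by its hash and then removed from a local repository. The file names and placeholder password are assumptions made for the demo, and the cleanup affects only the local copy; a hosting platform decides on its own when to collect unreachable commits.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "A"
echo "password=example-only" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
leak=$(git rev-parse HEAD)

git reset -q --hard HEAD~1                    # no branch points at the commit now

# Still readable by anyone who knows the hash
recovered=$(git show "$leak:notes.txt")

# Remove the remaining pointers (reflog entries and ORIG_HEAD), then prune
git reflog expire --expire=now --all
git update-ref -d ORIG_HEAD
git gc -q --prune=now

if git cat-file -e "$leak" 2>/dev/null; then
  after_gc="present"
else
  after_gc="gone"
fi
```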
Moreover, Git has a restore command (git restore) that can bring back deleted files from the index or from an earlier commit.
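A short sketch of both cases, assuming Git 2.23 or newer (the version that introduced git restore). The file name and its content are placeholders.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "important" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Case 1: the file was deleted only in the working directory
rm notes.txt
git restore notes.txt                          # bring it back from the index
after_first_restore=$(cat notes.txt)

# Case 2: the deletion was committed
git rm -q notes.txt
git commit -q -m "Remove notes"
git restore --source=HEAD~1 -- notes.txt       # bring it back from the earlier commit
after_second_restore=$(cat notes.txt)
```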
Another way, and it seems to be the safest one, is to use a backup of the repository from before the unfortunate git commit, if we have such a backup at all. It might also require some work. The risk here is losing the changes that came later, but we can export them as patches and then apply them to the version recovered from the backup. If we have a properly configured backup process, then deleting the repository and quickly restoring it from a backup seems to be the safest and fastest solution, because it does not require us to manually type git reset commands to get rid of specific changes.
One conclusion can be drawn from the above considerations: make backups as often as possible, because you never know when one may be needed. And be sure that it will be needed. Only then do we realize how much the lack of a backup costs us.
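The recover-and-patch idea can be sketched with plain Git commands. Here a mirror clone stands in for whatever backup tool is actually used; that substitution, the directory names, and the file contents are all assumptions made for the demo.

```shell
set -e
base=$(mktemp -d)

git init -q "$base/project"
cd "$base/project"
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "base" > app.txt
git add app.txt
git commit -q -m "A"

# Backup taken before the incident (a mirror clone as a stand-in)
git clone -q --mirror "$base/project" "$base/backup.git"

# The unwanted commit, followed by legitimate later work
echo "password=example-only" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
echo "later work" > work.txt
git add work.txt
git commit -q -m "C"

# Export the later commit as a patch file
git format-patch -q -1 HEAD -o "$base/patches"

# Recover from the backup and apply the later work on top
git clone -q "$base/backup.git" "$base/restored"
cd "$base/restored"
git config user.name "Demo User"
git config user.email "demo@example.com"
git am -q "$base/patches"/*.patch

restored_commits=$(git rev-list --count HEAD)  # A and the re-applied C
```

The restored repository contains the later work but never sees the commit with the password.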