Can Git Restore a Deleted File?
Last Updated on December 11, 2024
Git is one of the most popular version control systems today. Beyond everyday commands such as git revert, git push, git reset, and git rebase, it also lets you restore deleted files.
Fortunately, Git has the right tools for the job. In this article we discuss one of the ways to recover deleted files: the git restore command.
Behold – git restore
The restore command was added in Git 2.23 (August 2019), so it is relatively new. It is becoming an increasingly popular option, even though the official documentation still says:
“THIS COMMAND IS EXPERIMENTAL. THE BEHAVIOR MAY CHANGE.”
Anyway, it is a nice tool. It lets us unstage changes from the staging area or discard local changes in the working tree. When we run git status, git restore is the suggested way to undo changes; in that role it replaces the git reset and git checkout hints that older Git versions printed. Here you can read more about other ways to restore or remove files in git.
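The two uses mentioned above can be sketched as follows. This is a minimal illustration in a throwaway repository; the file name notes.txt, the demo identity, and the commit message are assumptions made up for the example.

```shell
set -e

# Throwaway repository so the example does not touch real work
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # hypothetical identity
git config user.name "Demo"

echo "first version" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Make an unwanted edit and stage it
echo "unwanted edit" >> notes.txt
git add notes.txt

# 1. Unstage: the index goes back to HEAD, the working tree keeps the edit
git restore --staged notes.txt

# 2. Discard: the working tree file goes back to the version in the index
git restore notes.txt

git status --short
```

After both steps the file matches the last commit again, so git status has nothing to report.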
Restore removed file
Today, however, we will focus not on cleaning up the staging area but on restoring files that have already been deleted. Let’s consider two cases here:
- restoring a locally deleted file,
- restoring a file removed from the external (remote) git repository.
As we know from the aforementioned article, rewriting git history is a dangerous action. The first case is easier because we can freely manipulate local history and local commits (those that have not yet been pushed to the external repository). In the second case we need to be more careful and make sure the commit history is correct before we push our code. Fortunately, git restore lets us recover files without changing the history.
Whichever of these situations we are in, to restore a removed file we first need to find the commit in which the file was deleted. We will use the rev-list command for this. Let’s assume that we deleted the README.md file and now want to get it back.
The command git rev-list HEAD -- README.md lists the commits that changed the file, newest first.
The first commit on the list (the newest one) is most likely the one in which our file was deleted, so we are more interested in the next one, in this case 6b2f73.
Once you know the hash of a commit that still contains the deleted file, just run git restore with the appropriate parameters, i.e. the --source flag, the hash, and the filename:
git restore --source 6b2f73 README.md
As a result, the deleted file is restored in the working tree. If the deletion had already been committed, git shows the file as untracked, but this is not a problem: we can stage it with git add and commit it at any time.
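The whole sequence can be tried end to end in a scratch repository. The sketch below assumes the deletion was committed; the demo identity and commit messages are invented for the example, and the hash is read from the rev-list output instead of being typed by hand.

```shell
set -e

# Scratch repository with a README.md that gets added and then deleted
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # hypothetical identity
git config user.name "Demo"

echo "# Demo" > README.md
git add README.md
git commit -q -m "Add README"

git rm -q README.md
git commit -q -m "Remove README"

# Commits that changed README.md, newest first:
# line 1 is the deletion, line 2 is the last commit that still has the file
git rev-list HEAD -- README.md
source_commit=$(git rev-list HEAD -- README.md | sed -n '2p')

# Bring the file back into the working tree from that commit
git restore --source "$source_commit" README.md

# The file is back but not tracked yet, so stage and commit it
git status --short
git add README.md
git commit -q -m "Restore README"
```

History is only appended to here: the deletion commit stays in place and a new commit brings the file back.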
You could also use git checkout here to switch to a specific commit and manually copy out the deleted file you are interested in, but that is neither the most convenient nor the smartest approach. In addition, checking out an old commit leaves us in a detached HEAD state, where it is easy to create commits that no branch points to. With git restore we stay on our branch, keep the history, and only make a new change that restores our file.
Restore deleted branches
We already know how to restore individual files. But what if we delete a branch? Is it possible to recover the entire branch? Of course it is, and I’ll show you how to do it in a moment. The only difficulty is obtaining the SHA of the commit the branch pointed to. Usually this is relatively easy, but for a branch deleted long ago it can be hard to dig up.
Anyway, we can use the familiar rev-list command to find the SHA of a commit that contains the file, or we can use the reflog, which records where HEAD has pointed and which I think is a better idea. With the appropriate SHA in hand, we build the following git checkout command:
git checkout -b BRANCH SHA, where BRANCH is the name of the deleted branch, and SHA is the commit hash that the branch pointed to.
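A minimal sketch of this recovery, assuming the branch was deleted recently enough that its last commit is still in the HEAD reflog. The branch name feature, the file name, and the commit messages are made up for the example; in practice you would read the SHA from the git reflog output yourself.

```shell
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # hypothetical identity
git config user.name "Demo"
git commit -q --allow-empty -m "Initial commit"
main_branch=$(git symbolic-ref --short HEAD)

# Do some work on a branch, then delete the branch by mistake
git checkout -q -b feature
echo "work" > feature.txt
git add feature.txt
git commit -q -m "Add feature work"
git checkout -q "$main_branch"
git branch -q -D feature

# The reflog still remembers the commit; pick it out by its message
git reflog
sha=$(git reflog --format='%h %gs' | grep 'commit: Add feature work' | head -n 1 | cut -d' ' -f1)

# Recreate the branch at that commit
git checkout -q -b feature "$sha"
```

Searching the reflog by commit message is only a convenience for the script. Any way of obtaining the SHA works, for example the hash that git branch -D prints when it deletes the branch.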
Orphaned commits
Let’s consider how it is possible to recover data in git at all after we have deleted it. The git restore case should not raise any doubts: git records history as a chain of commits. If a file was ever committed to the local repository, then despite its removal it still exists in some earlier commit, and we simply recover its contents. But what happens when we delete a branch? What is a branch, anyway? By itself it stores no content; the commits do. A branch is just a pointer to a commit, and the only information it contains is the SHA of the commit it points to. This is why branches in git are so fast and so “light”.
But this also has consequences. When we delete a branch, i.e. a pointer, we don’t delete the commit it was pointing to. So this commit and the changes made on that branch are still in our repository. Such commits are called “orphaned commits”. They exist, but nothing points to them. They are alone, unrelated to anything, and in fact invisible. After deleting a branch, we will not see these commits in git log, nor in the browser when we open our git repository on GitHub or Bitbucket.
The only way to recover or view them is to know their SHA codes. Now let’s look at it from a security perspective. We’ve deleted a branch with critical data and we think we’re safe. Well, not really. These files could still be recovered by a criminal or hacker who gains access to the repository.
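The effect is easy to observe in a scratch repository. The sketch below uses an invented branch name (secret-work) and a fake value standing in for sensitive data; it deletes the branch and then shows that the commit is gone from git log but still readable by its SHA.

```shell
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # hypothetical identity
git config user.name "Demo"
git commit -q --allow-empty -m "Initial commit"
main_branch=$(git symbolic-ref --short HEAD)

# Commit something sensitive on a branch and remember the commit's SHA
git checkout -q -b secret-work
echo "api-key-123" > secret.txt            # fake value for the demo
git add secret.txt
git commit -q -m "Add secret"
sha=$(git rev-parse HEAD)

# Delete the branch: only the pointer goes away
git checkout -q "$main_branch"
git branch -q -D secret-work

# The commit no longer appears in the log of any branch...
git log --all --oneline

# ...but it is still in the object database and readable by SHA
git cat-file -t "$sha"
git show "$sha:secret.txt"

# fsck can list objects that no branch or tag reaches (reflogs ignored)
git fsck --no-reflogs --unreachable 2>/dev/null | grep commit || true
```

Anyone who can read the repository's object database and knows or discovers the SHA can do the same, which is why deleting a branch is not a way to remove sensitive data.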
Garbage Collector
In the IT world, you may come across something called a garbage collector. It is a mechanism that cleans up, for example, unused objects or files. Such a mechanism exists in the Java Virtual Machine, among others, but it also exists in git, which many people don’t even realize. By default, reflog entries are kept for 90 days (30 days for entries that are no longer reachable from the current branch tip), after which the garbage collector can get rid of the “unnecessary” data, but only if nothing refers to a given commit any more. It may happen that we still have reflog entries, or that some other branch points to our commit, etc. Only completely “useless” objects are removed. GC also repacks the remaining objects, which reduces disk usage.
With this knowledge of data recovery, deletion (previous article), and GC, we can see both great opportunities and great risks here. We must always bear in mind the possibility of our own mistakes and of harmful actions by criminals. We should be prepared for both and make regular backups. If critical data was introduced into a git repository and then deleted, we should also make sure that GC cooperates with us. Appropriate configuration and the ability to start the GC manually allow us to maximize the security of our data, but they will never give us 100% certainty. A suitable backup tool can prove to be an indispensable ally when restoring removed files or data.
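Starting the GC manually could look like the sketch below. It is deliberately destructive: it expires every reflog entry and prunes every unreachable object immediately, so it also throws away the safety net described earlier. Treat it as an illustration in a scratch repository (branch and file names are invented), not as a routine to run on real work without a backup. It also cleans only the local repository; copies on a server or in other clones are unaffected.

```shell
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # hypothetical identity
git config user.name "Demo"
git commit -q --allow-empty -m "Initial commit"
main_branch=$(git symbolic-ref --short HEAD)

# Same situation as before: a commit left behind by a deleted branch
git checkout -q -b secret-work
echo "api-key-123" > secret.txt            # fake value for the demo
git add secret.txt
git commit -q -m "Add secret"
sha=$(git rev-parse HEAD)
git checkout -q "$main_branch"
git branch -q -D secret-work

# Retention is configurable via gc.reflogExpire, gc.reflogExpireUnreachable
# and gc.pruneExpire; here we skip the waiting period entirely.
git reflog expire --expire=now --all       # drop all reflog entries
git gc -q --prune=now                      # remove unreachable objects now

# Check whether the orphaned commit survived
if git cat-file -e "$sha" 2>/dev/null; then
    status="still present"
else
    status="pruned"
fi
echo "$status"
```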
Git backup – the most reliable way to restore lost data
And what if all the above scenarios don’t work? A professional git backup solution can be your best chance to successfully restore lost files and data. While looking for the right solution, keep in mind that during the software development process your team uses over a dozen specialized tools. Hence, choose a backup solution that covers the DevOps ecosystem and not just a specific platform. GitProtect DevOps backup allows you to secure git service providers like GitHub, Bitbucket, GitLab, plus Jira for secure project management and soon also Confluence, Kubernetes, Zendesk, and more. And it’s the only product on the market that is equipped with true Disaster Recovery. That one feature can be your best choice while dealing with minor or major problems and outages.
And to learn more about other methods at your disposal check other blog posts:
Git – How to Revert a File to the Previous Commit?
Git HEAD – Git HEAD reset and Git HEAD overwrite – what to do?
3 Best Methods to Back Up and Restore Repositories and Metadata in Bitbucket Cloud