Git Clone Explained – How to Clone a Git Repository?
Last Updated on December 11, 2024
Git is the most popular version control system today. It is a free, open-source tool that lets you, among other things, collaborate on source code. Linus Torvalds created Git while working on the Linux kernel; the first version was released in 2005, and the tool has been gaining popularity ever since. According to Stack Overflow’s 2023 survey, as many as 93% of programmers use the Git version control system. The 2020 survey did not include such a question, but it did ask about “Collaboration Tools”, and as many as 82.8% of respondents indicated GitHub. GitHub is only one of several popular services built on Git, so the overall popularity of Git is even higher.
Many popular open-source projects use Git as well. It is enough to mention the raster graphics editor GIMP, the Perl programming language, the Ruby on Rails framework, or the jQuery library. Anyone can contribute to these projects using Git.
What is a Git clone?
To work with Git, whether on open-source, commercial, or our own projects, we need a clone of the repository on our computer. Working on a local clone also makes it easier to resolve merge conflicts, manage files (add or remove them), and prepare larger commits before pushing them. Cloning pulls down all repository data, including every version of every file. Git is a distributed version control system, which means that each full clone is a complete copy of the underlying repository. In an extreme case, e.g. when the remote server fails and there are no backups, we can restore the entire repository from such a copy.
So what is a git clone? It is literally a clone: a complete copy of the target repository along with the whole history of changes from the beginning of the project. At the same time, git clone is also the name of the Git command that makes this copy (type git clone followed by the repository address in a command line). Importantly, this is a one-time operation for a given repository: after the first run, we no longer need the command during further work on that copy.
We already know that git clone makes a local copy of the entire repository. However, there still needs to be a shared synchronization point: the place where everyone publishes their changes and from which they download changes made by others. Thanks to this setup, regardless of the number of people working on one project at the same time, each local copy is connected only to this one so-called remote repository and doesn’t need to know anything about the other copies. The clone command automatically connects our new local repository with the remote one, which it names origin.
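To see the origin connection for ourselves, here is a minimal sketch that works without any network access. It assumes a throwaway repository in a temporary directory that plays the role of the remote; the names server and clone, as well as the Demo identity, are made up for illustration.

```shell
set -eu
work=$(mktemp -d)   # scratch directory; every path below is hypothetical

# Build a tiny repository that plays the role of the shared remote.
git init -q "$work/server"
git -C "$work/server" -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Initial commit"

# Clone it. Git stores the source address under the remote name "origin".
git clone -q "$work/server" "$work/clone"

# List the configured remotes together with their fetch and push URLs.
git -C "$work/clone" remote -v
```

The last command should print two lines for origin, one for fetch and one for push, both pointing at the location we cloned from.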
Let’s clone a Git repository
The git clone operation, like any other Git command, has a default behavior that we can extend with various options. Let’s look at this git clone example:
git clone <repo_address> <directory> --no-tags --filter=blob:none
This command copies the project from the given address into the given local directory. In addition, it skips tags and, because of the blob:none filter, does not download file contents (blobs) up front; Git fetches them later, when they are needed, provided the server supports partial clones. Only the repository address is required; the remaining parameters are optional. And there are quite a few of them! For details about the available options, please refer to the official documentation.
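The effect of one of these optional parameters is easy to check locally. The sketch below compares a default clone with a --no-tags clone of a throwaway repository; the directory names, the v1.0 tag and the Demo identity are assumptions made for the example. The --filter option is left out on purpose, because it only takes effect when the server side allows partial clones.

```shell
set -eu
work=$(mktemp -d)   # scratch directory; all names below are hypothetical

# A throwaway source repository with one commit and one tag.
git init -q "$work/server"
git -C "$work/server" -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Initial commit"
git -C "$work/server" tag v1.0

# Default clone: tags come along.
git clone -q "$work/server" "$work/with-tags"

# Same clone with --no-tags: no tag is copied.
git clone -q --no-tags "$work/server" "$work/without-tags"

git -C "$work/with-tags" tag       # prints: v1.0
git -C "$work/without-tags" tag    # prints nothing
```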
The default branch is the branch that Git checks out right after cloning a repository. It is chosen on the remote side, and it determines which local branch is created by the clone and which remote branch it tracks.
I have already mentioned that git clone downloads all the data along with the entire history of changes. But how does it treat branches? Remote-tracking branches are created for every branch in the remote repository and all their data is fetched, but a local branch is created and checked out only for the default branch. All other branches exist locally only as remote-tracking branches (for example origin/develop). From the perspective of an ordinary user, this may not change much. From the perspective of local Git data, however, it matters: a local branch is a separate reference that maps to its remote-tracking counterpart marked as remote.
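The difference between local and remote-tracking branches is easiest to see in a terminal. The following sketch assumes a throwaway repository with three branches (master as the default, plus develop and feature); all names and the Demo identity are made up for the example.

```shell
set -eu
work=$(mktemp -d)   # scratch directory; all names below are hypothetical

# Source repository whose default branch is master, with two more branches.
git init -q "$work/server"
git -C "$work/server" symbolic-ref HEAD refs/heads/master
git -C "$work/server" -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Initial commit"
git -C "$work/server" branch develop
git -C "$work/server" branch feature

# A plain clone, with no extra options.
git clone -q "$work/server" "$work/clone"

# Local branches: only the default branch was created and checked out.
git -C "$work/clone" branch

# Remote-tracking branches: every branch of the source is present.
git -C "$work/clone" branch -r
```

Running git checkout develop inside the clone would then create a local develop branch from origin/develop.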
To clone a GitHub repository to your local system, you can run git clone with the repository’s HTTPS URL. This creates a full copy of the repository, including all its history. You need the repository URL and, optionally, the name of a folder for the clone. For example:
git clone https://github.com/user/repo.git <directory>
How to Git clone a specific branch?
One of the options of the git clone command is --branch (or -b). By default, clone fetches all branches and checks out only the default branch. The --branch option changes this and checks out the remote branch we specify instead. However, it doesn’t change the fact that Git fetches all branches anyway, which is not what we want to achieve here.
Imagine a repository that has three branches, with master being the main one. A clone with the --branch develop option will fetch and check out the develop branch, but what will happen to the other two? Check out the pictures below:
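The same scenario can be reproduced in a terminal. This is a sketch under assumptions: the three-branch repository is a throwaway one created on the spot, the third branch is called feature, and the Demo identity is made up. A file:// URL is used so that Git goes through its regular transport rather than the shortcut it takes for plain local paths.

```shell
set -eu
work=$(mktemp -d)   # scratch directory; all names below are hypothetical

# Source repository with three branches: master (default), develop, feature.
git init -q "$work/server"
git -C "$work/server" symbolic-ref HEAD refs/heads/master
git -C "$work/server" -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Initial commit"
git -C "$work/server" branch develop
git -C "$work/server" branch feature

# Clone and check out develop instead of the default branch.
git clone -q --branch develop "file://$work/server" "$work/clone"

# Local branches: prints "* develop".
git -C "$work/clone" branch

# Remote-tracking branches: master and feature were fetched as well.
git -C "$work/clone" branch -r
```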
As you can see, all the branches were downloaded anyway. Let’s modify our git clone command so that it clones a single branch only. Since Git 1.7.10 (the current version at the time of writing is 2.32.0, released on June 6, 2021), git clone has had the --single-branch option. What do the release notes tell us about it?
“git clone” learned “--single-branch” option to limit cloning to a single branch (surprise!); tags that do not point into the history of the branch are not fetched.
So let’s check how it works in practice. We will repeat the command from the previous example, but this time we will add another option. Let’s see its effect…
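Here is a sketch of that experiment, again against a throwaway three-branch repository (master, develop and a branch assumed to be called feature; the Demo identity is made up). The file:// URL makes Git use its regular transport, so only the requested branch is actually transferred.

```shell
set -eu
work=$(mktemp -d)   # scratch directory; all names below are hypothetical

# Source repository with three branches: master (default), develop, feature.
git init -q "$work/server"
git -C "$work/server" symbolic-ref HEAD refs/heads/master
git -C "$work/server" -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Initial commit"
git -C "$work/server" branch develop
git -C "$work/server" branch feature

# Same clone as before, plus --single-branch.
git clone -q --branch develop --single-branch \
    "file://$work/server" "$work/clone"

# Remote-tracking branches: origin/master and origin/feature are absent.
git -C "$work/clone" branch -r

# The fetch refspec is narrowed too, so later fetches stay on develop.
git -C "$work/clone" config --get remote.origin.fetch
# prints: +refs/heads/develop:refs/remotes/origin/develop
```

The narrowed refspec is also a quick way to recognize an existing single-branch clone: a regular clone has +refs/heads/*:refs/remotes/origin/* there instead.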
Done! This time we managed to clone a single branch only. Why would we need this? Sometimes the repository we are working on is very large and we don’t need to download all its branches. Such a command is useful both for saving disk space and for keeping things tidy.
Conclusion on how to clone a repository and how to clone a specific branch
Today we learned how to clone any local or remote repository and how to clone a specific branch in Git. Cloning a repository gives us more control over what we do, but it also has its consequences. At the beginning, I mentioned that a local copy of a Git repository allows you, in extreme cases, to restore the project. So each local clone works a bit like a backup of the base repository.
The problem appears when this copy contains only a single branch: then, of course, we do not have the entire remote repository, and we must be aware of it. Proper backup is important and should never rely on local clones. Why? Because the options of the git clone command allow you to filter out many items, so we can never be sure how our copy differs from the external repo. I recommend using dedicated backup solutions like GitProtect.io to avoid surprises.