Last Updated on March 8, 2024

While working on a project, you will often need to combine it with another one, especially when you collaborate with other people. It might be a library built by other developers, or a component developed independently and then reused in several projects. In that case, you want to keep the two projects distinct while still being able to use one inside the other. This post explains how to manage such projects with Git submodules and Git subtrees. We will show you the key differences, so you can decide which option is best for you.

What is git submodule – why and how to use it?

A Git submodule is, to put it simply, a separate repository nested within another repository. Submodules behave like child repositories whose pointer commits must be updated manually in the parent. Because the two histories stay separate, a team can work on the parent project and the nested project at the same time.

When you use submodules, you don’t copy any of the nested project’s code into your repository. Instead, you include links to the nested repository (hosted, for example, on GitHub). Each of these pointers leads to a specific commit in that other repository.

Git submodules let you keep one Git repository as a subdirectory of another. They also allow you to include external code in your Git repository and track its version history.

Submodule support is built into Git, so repositories can be nested within other, separate repositories without extra tooling. To be exact, a submodule entry in the parent corresponds to a specific commit in the child repository.

To manage the versioning of external dependencies for a project, you can use the Git submodules feature. For example, here are the scenarios in which you can use git submodules:

  • You can lock the code to a specific commit for your own safety when an external component or subproject is changing too quickly or forthcoming modifications would break the API.
  • When you wish to track a vendor dependency for a component that isn’t updated very often.
  • When you delegate a project component to a third party and wish to include their work at a certain time or release. When the changes aren’t too frequent, this method works well.

How to use git submodules?

First, create a new submodule using the git submodule add command, which saves the submodule’s path and URL in a file called .gitmodules.
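As a minimal sketch of this step, the script below builds two throwaway repositories and adds one to the other as a submodule. The repository names (lib, parent), the path vendor/lib, and the demo identity are hypothetical. It assumes Git 2.28 or newer for git init -b, and it passes protocol.file.allow=always only because the submodule URL is a local path, which recent Git versions block by default.

```shell
#!/bin/sh
set -eu

# Throwaway identity so commits succeed on a machine without Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Hypothetical child repository "lib" with a single commit.
git init -q -b main "$work/lib"
echo v1 > "$work/lib/lib.txt"
git -C "$work/lib" add lib.txt
git -C "$work/lib" commit -q -m 'lib: v1'

# Hypothetical parent repository.
git init -q -b main "$work/parent"
echo parent > "$work/parent/README"
git -C "$work/parent" add README
git -C "$work/parent" commit -q -m 'parent: initial commit'

# Add the child as a submodule under vendor/lib.
# protocol.file.allow=always is needed only because the URL is a local path.
git -C "$work/parent" -c protocol.file.allow=always \
    submodule --quiet add "$work/lib" vendor/lib
git -C "$work/parent" commit -q -m 'Add lib as a submodule'

# .gitmodules is a plain-text file that records the path and URL.
cat "$work/parent/.gitmodules"

# The parent's tree stores only a commit pointer (mode 160000), not the files.
git -C "$work/parent" ls-tree HEAD vendor/
```

The ls-tree output is where the "pointer to a commit" idea becomes visible: the parent records a commit ID for vendor/lib rather than the files themselves.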

For example, to clone a repository with submodules, use:

git clone --recursive <URL to Git repo>

If you’ve previously cloned a repository and wish to load its submodules, use:

git submodule update --init

If there are nested submodules, do the following:

git submodule update --init --recursive
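The difference between the two ways of cloning can be sketched with throwaway local repositories. Everything named here (lib, parent, vendor/lib, the demo identity) is hypothetical, git init -b assumes Git 2.28 or newer, and protocol.file.allow=always is needed only because the demo uses local paths instead of real URLs.

```shell
#!/bin/sh
set -eu

# Throwaway identity so commits succeed on a machine without Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Hypothetical child repository "lib" with a single commit.
git init -q -b main "$work/lib"
echo v1 > "$work/lib/lib.txt"
git -C "$work/lib" add lib.txt
git -C "$work/lib" commit -q -m 'lib: v1'

# Hypothetical parent repository that uses lib as a submodule.
git init -q -b main "$work/parent"
echo parent > "$work/parent/README"
git -C "$work/parent" add README
git -C "$work/parent" commit -q -m 'parent: initial commit'
git -C "$work/parent" -c protocol.file.allow=always \
    submodule --quiet add "$work/lib" vendor/lib
git -C "$work/parent" commit -q -m 'Add lib as a submodule'

# A plain clone leaves the submodule directory empty...
git clone -q "$work/parent" "$work/plain"
files_before=$(ls -A "$work/plain/vendor/lib" | wc -l | tr -d ' ')
echo "files before init: $files_before"

# ...until the submodules are initialized and checked out.
git -C "$work/plain" -c protocol.file.allow=always \
    submodule --quiet update --init --recursive
cat "$work/plain/vendor/lib/lib.txt"

# Cloning with --recursive performs both steps at once.
git -c protocol.file.allow=always clone -q --recursive \
    "$work/parent" "$work/full"
cat "$work/full/vendor/lib/lib.txt"
```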

Specify a branch for a submodule using:

git submodule set-branch --branch <branch name> -- <submodule path>

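To switch the submodule to a different branch, run set-branch again with the new branch name, then move the submodule to the tip of that branch using:

git submodule update --remote -- <submodule path>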
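Here is a hedged sketch of pointing a submodule at a branch and moving it to that branch's tip. The names lib, parent, vendor/lib, and develop are hypothetical, the set-branch subcommand assumes Git 2.22 or newer, and protocol.file.allow=always is needed only because the demo uses local paths.

```shell
#!/bin/sh
set -eu

# Throwaway identity so commits succeed on a machine without Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Hypothetical child repository "lib" with a single commit on main.
git init -q -b main "$work/lib"
echo v1 > "$work/lib/lib.txt"
git -C "$work/lib" add lib.txt
git -C "$work/lib" commit -q -m 'lib: v1'

# Hypothetical parent repository that pins lib at v1.
git init -q -b main "$work/parent"
echo parent > "$work/parent/README"
git -C "$work/parent" add README
git -C "$work/parent" commit -q -m 'parent: initial commit'
git -C "$work/parent" -c protocol.file.allow=always \
    submodule --quiet add "$work/lib" vendor/lib
git -C "$work/parent" commit -q -m 'Add lib as a submodule'

# Upstream gains a "develop" branch with a newer commit.
git -C "$work/lib" checkout -q -b develop
echo v2 > "$work/lib/lib.txt"
git -C "$work/lib" commit -q -am 'lib: v2 on develop'

# Record the branch to follow in .gitmodules.
git -C "$work/parent" submodule set-branch --branch develop -- vendor/lib

# Fetch and check out the tip of that branch inside the submodule.
git -C "$work/parent" -c protocol.file.allow=always \
    submodule --quiet update --remote -- vendor/lib

# Commit the new pointer and the .gitmodules change in the parent.
git -C "$work/parent" add .gitmodules vendor/lib
git -C "$work/parent" commit -q -m 'Follow lib develop branch'

git -C "$work/parent" config -f .gitmodules submodule.vendor/lib.branch
cat "$work/parent/vendor/lib/lib.txt"
```

Note that the parent still records one exact commit. The branch setting only tells update --remote which tip to move to, and the move takes effect for other users once the new pointer is committed.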

A great git submodule alternative – git subtree

Imagine your Git repository as a tree. Within this structure, a subtree is a smaller tree that lives inside the main one. Like submodules, subtrees let you nest one repository inside another as a subdirectory, but the nested project’s files are stored directly in the parent repository, which makes the integration more seamless and flexible. The subtree’s contents can be committed to, branched, and merged together with the rest of the parent repository. This flexibility makes subtrees an excellent alternative to submodules, particularly when you need to incorporate and manage a project within another. According to Git’s official documentation, when you perform a subtree merge, Git is often able to work out that one project is a subtree of the other and merge appropriately. This approach is especially beneficial for projects that require close integration without the overhead of submodule management.

Why consider git subtree?

  • It has the same functions as a standard repository.
  • It’s easy to use with your main repository because its contents are stored as regular commits.
  • The module’s contents can be changed without the necessity to create a separate repository copy of the dependency.
  • Users of your current repository do not need to learn anything new to use the git subtree. They can forget the fact that you’re managing dependencies with git subtree.
  • Unlike git submodule, git subtree does not create new metadata files (i.e., .gitmodules).

How to use git subtree?

To add a new subtree to a parent repository, first register the child repository as a remote, then run the subtree add command:

git remote add remote-name <URL to Git repo>

git subtree add --prefix=folder/ remote-name subtree-branch-name

After these commands, the commit history of the whole child project is merged into the parent repository. Pass --squash to subtree add if you prefer to import it as a single commit instead.
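The two commands can be tried end to end with throwaway local repositories, as in the sketch below. The names lib, parent, lib-remote, and vendor/lib are hypothetical, and git init -b assumes Git 2.28 or newer. The sketch also assumes that git subtree is available, since it ships in Git's contrib area and some installations package it separately.

```shell
#!/bin/sh
set -eu

# Throwaway identity so commits succeed on a machine without Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Hypothetical child repository "lib" with a single commit.
git init -q -b main "$work/lib"
echo v1 > "$work/lib/lib.txt"
git -C "$work/lib" add lib.txt
git -C "$work/lib" commit -q -m 'lib: v1'

# Hypothetical parent repository.
git init -q -b main "$work/parent"
echo parent > "$work/parent/README"
git -C "$work/parent" add README
git -C "$work/parent" commit -q -m 'parent: initial commit'

# Register the child as a remote, then import its main branch under vendor/lib.
git -C "$work/parent" remote add lib-remote "$work/lib"
git -C "$work/parent" subtree add --prefix=vendor/lib lib-remote main

# The files are now ordinary tracked files in the parent...
cat "$work/parent/vendor/lib/lib.txt"

# ...and the child's commits are part of the parent's history.
git -C "$work/parent" log --format=%s
```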

Changes to and from the subtree are pushed and pulled using:

git subtree push --prefix=folder/ remote-name subtree-branch-name

git subtree pull --prefix=folder/ remote-name subtree-branch-name
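A round trip through both directions might look like the following sketch. All names (lib, parent, lib-remote, vendor/lib, from-parent) are hypothetical, git init -b assumes Git 2.28 or newer, and git subtree is assumed to be installed. The push targets a new branch only because the demo's upstream is a non-bare repository, and Git refuses pushes to a checked-out branch by default.

```shell
#!/bin/sh
set -eu

# Throwaway identity so commits succeed on a machine without Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Hypothetical child repository "lib" with a single commit.
git init -q -b main "$work/lib"
echo v1 > "$work/lib/lib.txt"
git -C "$work/lib" add lib.txt
git -C "$work/lib" commit -q -m 'lib: v1'

# Hypothetical parent repository with lib imported as a subtree.
git init -q -b main "$work/parent"
echo parent > "$work/parent/README"
git -C "$work/parent" add README
git -C "$work/parent" commit -q -m 'parent: initial commit'
git -C "$work/parent" remote add lib-remote "$work/lib"
git -C "$work/parent" subtree add --prefix=vendor/lib lib-remote main

# Upstream moves on; pull the change into vendor/lib.
# -m supplies the merge message so no editor is opened.
echo v2 > "$work/lib/lib.txt"
git -C "$work/lib" commit -q -am 'lib: v2'
git -C "$work/parent" subtree pull --prefix=vendor/lib \
    -m 'Merge lib v2 into vendor/lib' lib-remote main

# Change the subtree from inside the parent, then push it back upstream.
echo 'fix from parent' >> "$work/parent/vendor/lib/lib.txt"
git -C "$work/parent" commit -q -am 'lib: fix made in parent'
git -C "$work/parent" subtree push --prefix=vendor/lib lib-remote from-parent

# The upstream repository now has the fix on the from-parent branch.
git -C "$work/lib" log --format=%s from-parent
```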

Git subtree vs submodule

Similarities

External git repositories can be incorporated into other git repositories using git submodule and git subtree. Both techniques allow you to link a specific version of an external component to the local repository and bundle them. Both tools keep tracking the external repository’s history, enabling you to check out previous commits.

Submodules or subtrees?

Submodules have been around for a long time, and have their own command (git submodule) and extensive documentation. Compared with adding a subtree, adding a submodule is fairly straightforward. The hazards and flaws tend to surface only later, which can be annoying.

Submodules are sometimes the best option. This is especially true if your codebase is big and you don’t want to download all of it every time, which is the case for many existing codebases. Submodules then make it easy to collaborate with users who have no need to download complete blocks of code. Because the submodule code is shared by all the container projects that include it, you should aim to keep it independent of the details of any single container.

Shortcomings of git submodules

  • Cloning a repository that contains submodules requires downloading the submodules separately. The submodule folders stay empty after cloning until the submodules are initialized, and they cannot be populated at all if the source repository has moved or become unavailable.
  • Other major disadvantages of Git submodules include being locked to a certain version of the external repository, a lack of good merge management, and the fact that the parent Git repository is largely unaware that it has become a multi-module repository.

Shortcomings of git subtrees

  • A new merging approach must be learned.
  • It’s a little more difficult to contribute code for the sub-projects upstream.
  • You must be sure that super and sub-project code is not mixed in new commits.

Summary

Each tool has advantages and disadvantages. Here are some aspects to consider when you decide which one is ideal for you.

  • Component-based development favors Git submodules, whereas system-based development favors Git subtrees.
  • Git submodules have a smaller repository size since they are just links to a single commit in a subproject; whereas Git subtrees store the whole subproject, including its history.
  • Subtrees are decentralized, while Git submodules must be accessible on the server.
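The storage difference in the list above can be observed directly by cloning two parents, one built each way. This is only a sketch with hypothetical names (lib, with-submodule, with-subtree, vendor/lib). It assumes Git 2.28 or newer and an installed git subtree, and it uses protocol.file.allow=always only because the submodule URL is a local path.

```shell
#!/bin/sh
set -eu

# Throwaway identity so commits succeed on a machine without Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Hypothetical child repository "lib" with a single commit.
git init -q -b main "$work/lib"
echo v1 > "$work/lib/lib.txt"
git -C "$work/lib" add lib.txt
git -C "$work/lib" commit -q -m 'lib: v1'

# Creates a parent repository with one commit at the given path.
new_parent() {
    git init -q -b main "$1"
    echo parent > "$1/README"
    git -C "$1" add README
    git -C "$1" commit -q -m 'parent: initial commit'
}

# Parent A links to lib as a submodule.
new_parent "$work/with-submodule"
git -C "$work/with-submodule" -c protocol.file.allow=always \
    submodule --quiet add "$work/lib" vendor/lib
git -C "$work/with-submodule" commit -q -m 'Add lib as a submodule'

# Parent B embeds lib as a subtree.
new_parent "$work/with-subtree"
git -C "$work/with-subtree" subtree add --prefix=vendor/lib "$work/lib" main

# Plain clones, with no submodule initialization.
git clone -q "$work/with-submodule" "$work/clone-submodule"
git clone -q "$work/with-subtree" "$work/clone-subtree"

# The subtree clone carries lib's files and commits;
# the submodule clone holds only a pointer to them.
ls -A "$work/clone-submodule/vendor/lib"
ls -A "$work/clone-subtree/vendor/lib"
git -C "$work/clone-submodule" log --format=%s
git -C "$work/clone-subtree" log --format=%s
```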

A Git subtree isn’t the same thing as a Git submodule, and each comes with certain restrictions on when and how it can be used. If you’re going to push code back to the external repository, consider a Git submodule, since pushing is easier. If you have third-party code that you are unlikely to push to, use a Git subtree, since pulling is easier.
