It is hard to find a developer or anyone working in IT who has never heard of GitLab. It is probably one of the most popular Git hosting services for development teams today. Have doubts about whether to back up GitLab or not? Download this ebook and come back to this article to find out the best practices to protect your GitLab. Following them makes every line of your source code not only accessible but also recoverable, so your team can work without interruptions and delays even during severe GitLab outages. Most importantly, you will always have stable access to your intellectual property and, thus, save hours of work, money, reputation, and your customers’ trust.

Table of contents

Part 1 Backup performance
Backup all repositories with related metadata
Save your storage space with incremental and differential backup
Choose the best deployment that suits you: SaaS or On-Premise
Add multiple storage instances and complete the 3-2-1 backup rule
Backup replication matters
Acquire flexible retention – up to unlimited
GitLab backup as a part of the CI/CD process
Monitoring center – email and Slack notifications, tasks, advanced audit logs
Create a dedicated GitLab user account only for backup reasons to bypass throttling

Part 2 Backup security
GitLab backup software for SOC 2, ISO 27001 compliance
Use AES encryption in-flight and at rest
Zero-knowledge encryption
Data Center region of choice
Sharing the responsibility for managing the backup system
Ransomware protection

Part 3 Disaster recovery
Disaster Recovery – use cases & scenarios
Restore multiple git repositories at a time
Point-in-time restore – don’t limit yourself to the last copy
Restore directly to your local machine
Don’t overwrite repositories during the restore process

Part 1
BACKUP PERFORMANCE

Backup all repositories with related metadata

If you want to be sure that your entire GitLab environment is protected, you need to back up all repositories together with their related metadata. Whether you use GitLab or GitLab Ultimate, your copies should include:

  • Repository,
  • Wiki,
  • Issues,
  • Issue comments,
  • Deployment keys,
  • Merge requests (GitLab’s equivalent of pull requests),
  • Merge request comments,
  • Webhooks,
  • Labels,
  • Milestones,
  • Pipelines,
  • Tags,
  • LFS,
  • Releases,
  • Collaborators,
  • Commits,
  • Branches,
  • Variables,
  • Groups,
  • Snippets,
  • Project’s topics.

Always remember that your backup software should let you create multiple backup plans, so that you can adjust your data protection policy to the needs, structure, and workflow of your organization.

What is the best way to achieve it? Set up a backup plan for critical repositories and metadata that tracks changes at least daily; backing up your critical data even more frequently is better still. For this purpose you can use the Grandfather-Father-Son (GFS) rotation scheme, and set up a separate backup plan to keep unused repositories for future reference. If needed, you can store your data indefinitely. You can even delete unused data from your GitLab account and keep only the copy in your backup storage, so that it no longer overloads your GitLab account.
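To make the Grandfather-Father-Son idea more concrete, here is a minimal Python sketch of how such a rotation could decide which copies to keep. It is only an illustration: the function name `gfs_keep` and the default counts (7 daily, 4 weekly, 12 monthly copies) are assumptions made for this example, not settings taken from GitProtect.io or any other product.

```python
from datetime import date, timedelta


def gfs_keep(backup_dates, today, daily=7, weekly=4, monthly=12):
    """Return the set of backup dates a simple GFS rotation would keep."""
    dates = sorted(set(backup_dates), reverse=True)  # newest first
    keep = set()

    # Sons: every copy taken within the last `daily` days.
    keep.update(d for d in dates if (today - d).days < daily)

    # Fathers: the newest copy of each of the latest `weekly` ISO weeks.
    weeks = {}
    for d in dates:
        weeks.setdefault(tuple(d.isocalendar())[:2], d)  # first seen = newest
    keep.update(list(weeks.values())[:weekly])

    # Grandfathers: the newest copy of each of the latest `monthly` months.
    months = {}
    for d in dates:
        months.setdefault((d.year, d.month), d)
    keep.update(list(months.values())[:monthly])

    return keep


# Hypothetical example: one copy per day for the last 90 days.
today = date(2024, 3, 31)
all_copies = [today - timedelta(days=i) for i in range(90)]
kept = gfs_keep(all_copies, today)
```

With one copy per day, a scheme like this keeps the last week in full and thins older copies out to one per week and then one per month.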

Save your storage space with incremental and differential backup

If you want to save storage space, speed up backups, and limit bandwidth use, your backup software should copy only the blocks of GitLab data that have changed since the last copy. Ideally, you should also be able to define different retention and performance schemes for every type of copy: full, incremental, or differential.
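The difference between the copy types shows up most clearly at restore time. The sketch below assumes the usual definitions (an incremental copy stores changes since the previous copy of any type, a differential copy stores changes since the last full copy) and works out which copies are needed to restore a given one. The function name `restore_chain` is hypothetical.

```python
def restore_chain(kinds, target):
    """Return the indices of the copies needed to restore copy number `target`.

    `kinds` lists the copy types in chronological order, each one being
    "full", "incremental", or "differential".
    """
    chain = []
    i = target
    while i >= 0:
        chain.append(i)
        kind = kinds[i]
        if kind == "full":
            return chain[::-1]  # oldest first, the order used for restoring
        if kind == "incremental":
            i -= 1  # builds on whatever copy came right before it
        elif kind == "differential":
            i -= 1
            while i >= 0 and kinds[i] != "full":
                i -= 1  # skip back to the most recent full copy
        else:
            raise ValueError(f"unknown copy type: {kind!r}")
    raise ValueError("no full copy found to start the restore from")


schedule = ["full", "incremental", "incremental", "differential", "incremental"]
print(restore_chain(schedule, 2))  # chain for the second incremental copy
print(restore_chain(schedule, 3))  # chain for the differential copy
```

Incremental copies are the smallest but produce the longest restore chains; differential copies grow over time but need only the last full copy to restore.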

Choose the best deployment that suits you: SaaS or On-Premise 

Whichever edition you use, GitLab or GitLab Ultimate, your backup software has to run somewhere. So here comes the question: should it run in the cloud or be self-hosted in your private infrastructure? The main difference is where the backup service is installed and run.

Suppose you want to deploy it in a SaaS model. In that case, you don’t need to allocate any additional device as a local server, because the service runs within the provider’s cloud infrastructure. You also don’t have to worry about maintenance, administration, or continuity of operation, because the service provider takes care of all of that.

If you prefer on-premise deployment, you install the software on a machine that you provide and control, and it runs locally in your environment. Ideally, you can install it on any computer, whether it runs Windows, Linux, or macOS, or even on popular NAS devices. With this deployment model, copies sent to local storage travel over your local network, which avoids many connectivity problems and can make the backup process faster and more efficient.

At the same time, keep in mind that it is much better when the deployment model doesn’t restrict which data storage you can use.

Whether you choose GitProtect.io Cloud PRO, Cloud Enterprise, or On-Premise Enterprise, you can use GitProtect unlimited cloud storage, because it is always included in the license. With the Enterprise plan, you can also bring your own storage, cloud or on-premise. GitProtect.io supports AWS S3, Wasabi Cloud, Backblaze B2, Google Cloud Storage, and Azure Blob Storage. Moreover, our solution is compatible with S3-compatible storage, with on-premise storage (NFS, CIFS, and SMB network shares, local disk resources), and with hybrid or multi-cloud environments.

Add multiple storage instances and complete the 3-2-1 backup rule

You should be able to add an unlimited number of storage instances to your GitLab backup. They can be cloud or on-premise; ideally, you have both, so that you can replicate backups between storages. Why? Because it reduces the risk from any outage or disaster to a minimum and helps you apply the 3-2-1 backup rule, according to which you should have at least 3 copies, kept on 2 different storage instances, with at least 1 in the cloud.
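The 3-2-1 rule as worded above is easy to express as a check. The following Python sketch is illustrative only; the class `BackupCopy`, the function `satisfies_3_2_1`, and the storage names are made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    storage: str    # name of the storage instance holding the copy
    in_cloud: bool  # True if that storage is a cloud storage


def satisfies_3_2_1(copies):
    """Check: at least 3 copies, on at least 2 storages, at least 1 in the cloud."""
    storages = {c.storage for c in copies}
    return (
        len(copies) >= 3
        and len(storages) >= 2
        and any(c.in_cloud for c in copies)
    )


copies = [
    BackupCopy("local-nas", in_cloud=False),
    BackupCopy("local-nas", in_cloud=False),
    BackupCopy("s3-bucket", in_cloud=True),
]
print(satisfies_3_2_1(copies))
```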

With GitProtect.io, which is a multi-storage system, you can store your data:

  • in the cloud (GitProtect Cloud, AWS S3, Wasabi Cloud, Backblaze B2, Google Cloud Storage, Azure Blob Storage, and any other public cloud compatible with S3),
  • locally (NFS, CIFS, SMB network shares, local disk resources),
  • in a hybrid environment/multi-cloud.

Whatever type of license you choose, you always get GitProtect Unlimited Cloud Storage for free, so you can start protecting your GitLab repositories from the moment you sign in.

Let’s look at an example of how this multi-storage system works. You are an ordinary developer in a company where the Security and Compliance department requires all employees to store their data on Google Cloud Storage. But… you decided to have your own backup plan and started sending your copies to your local server as well. One day, a huge Google outage takes place and you need to instantly restore your copy from three weeks ago. In this situation, all you need to do is log in to your GitProtect.io account and restore the needed data. You can restore it to the same or a new GitLab account, to your local machine, or cross over to another Git hosting platform, whether GitHub or Bitbucket. You will be able to continue your work in about 5 minutes.

Backup replication matters

Another important feature to consider when choosing backup software is backup replication. It lets you keep your backup copies in multiple locations in line with the 3-2-1 backup rule, which gives you redundancy and supports business continuity. Ideally, you can replicate from any data store to any other, cloud to cloud, cloud to local, or local to local, with no limitations.

If you back up your data with GitProtect.io, replication becomes much easier. You can set up a replication plan right from the menu: provide the source and target storage, the agent, and the schedule, and your backup replication plan is ready.

Acquire flexible retention – up to unlimited

Usually, repository providers offer from 30 to 365 days of retention by default. GitLab, for example, keeps its customers’ data for 90 days. But what if you need some data from 4 or 5 years ago? Retention settings are therefore one of the most crucial aspects when choosing GitLab backup software. You need to make sure that the features it offers meet your legal, compliance, and industry requirements. Some organizations have to keep all their data for years. That depends on three things: responsibilities (what data is stored in your repository), time (how long it should be kept), and restoration (the moment from which that data should be restorable in case of failure).

You should be able to set different retention for every backup plan by:

  • indicating the number of copies you want to keep,
  • indicating how long each copy is kept in the storage (these parameters should be set separately for full, differential, and incremental backups),
  • disabling the rules and keeping copies indefinitely.
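The three options in the list above can be sketched as one small function. This is a simplified Python illustration under assumed names (`apply_retention`, `max_copies`, `max_age`); real backup software would apply such rules per backup plan and per copy type.

```python
from datetime import datetime, timedelta


def apply_retention(copy_times, now, max_copies=None, max_age=None):
    """Split copies into (kept, expired), newest first.

    max_copies: keep at most this many of the newest copies (None = no limit).
    max_age:    keep copies no older than this timedelta (None = no limit).
    Leaving both as None disables the rules and keeps everything.
    """
    kept, expired = [], []
    for taken_at in sorted(copy_times, reverse=True):  # newest first
        over_count = max_copies is not None and len(kept) >= max_copies
        over_age = max_age is not None and now - taken_at > max_age
        if over_count or over_age:
            expired.append(taken_at)
        else:
            kept.append(taken_at)
    return kept, expired


# Hypothetical example: copies taken 1, 5, 10, and 40 days ago.
now = datetime(2024, 1, 31)
copies = [now - timedelta(days=d) for d in (1, 5, 10, 40)]
kept, expired = apply_retention(copies, now, max_age=timedelta(days=30))
```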

GitLab backup as a part of the CI/CD process

If you want to build sustainable software, you need to take care of every stage of the deployment process, so that delivery is predictable and flawless. Automation ensures that each release goes through a specified list of steps, and DevSecOps practice says that security should be part of this process. You can, of course, customize your CI/CD steps to match your workflow, type of project, team, and so on, but make sure that backup is included, as it lets your team work with peace of mind. You will also know that your source code is easily recoverable from any point in time. This not only helps your team work faster but also keeps them focused on creating great features without fear of making mistakes.

With GitProtect.io, you can use the API to fit backup into your company’s workflow. For example, you can trigger an automatic backup after each merge request.

If you set up automated backup as a part of your CI/CD process, every change in the code is well protected, recoverable from any point in time, and easy to roll back to a previous version in case of a failure or a human mistake.

Monitoring center – email and Slack notifications, tasks, advanced audit logs 

You may not be directly responsible for managing the backup software and still need to monitor backup performance, check statuses, and see who is responsible for a specific change in the settings, so that you can keep an eye on your admins. For that, you need a comprehensive and customizable monitoring center.

One of the easiest ways to stay aware of everything without logging in is email notifications. You should be able to configure:

  • recipients (thus, only interested parties will be notified about backup statuses),
  • backup plan summary details, including tasks finished with success, with warnings, canceled tasks, not started tasks and failed tasks, 
  • a preferred language, which might be an advantage for some teams. 

It would be ideal to have notifications sent directly to the software you and your team use every day, such as Slack. With Slack notifications, you can be almost certain that you won’t miss any vital information.

When you need to check the status of ongoing tasks and historical events, you can turn to the tasks section. It will provide you with all the necessary information about actions in progress. 

Your GitLab backup software should show advanced audit logs as well. These logs contain information about how applications and services work and about created backups and restored data. What is more, you can see which actions each admin performs, which helps you detect and prevent intentional malicious activity.

For easier, hands-off monitoring, it is an advantage to be able to feed those audit logs into your external monitoring systems and remote management software via webhooks and API.

All of the above should be accessible through a single central management console that helps you manage backup, restore, monitoring, and all system settings. Powerful visual statistics, a data-driven dashboard, and real-time actions will save you time.

GitProtect.io is the only GitLab backup and recovery software on the market that can provide you with an all-in-one solution managed with a single central management console.

Create a dedicated GitLab user account only for backup reasons to bypass throttling

For big enterprise users, the best idea is to create a dedicated GitLab user account that is connected to the GitLab backup software and used only for backup purposes, for example, backup@companyname.com. Why? There are two reasons. The first is security, as this user should have access only to the repositories you want to protect. The second is throttling: every GitLab user has their own pool of requests to the GitLab API, and every application associated with an account draws on that account’s pool. A dedicated backup user therefore doesn’t compete with other tools for requests and can perform backups without delays or queues.

If you manage a big organization with a large number of repositories, it is useful to have several GitLab users that handle backup within your GitLab account. In this case, if the first one exhausts its number of API requests, the next one takes over automatically, and so on. This way, backups run without interruption even in an enormous GitLab environment.
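The hand-over between backup accounts described above can be pictured with a small Python sketch. It is deliberately simplified: the class `BackupAccountPool`, the account names, and the fixed request budgets are all assumptions for illustration, and real API limits are usually counted per time window rather than as a one-off budget. No real GitLab API call is made here.

```python
class BackupAccountPool:
    """Hand out API requests from several backup accounts, one after another."""

    def __init__(self, budgets):
        # budgets: account name -> number of requests it may still make
        self.remaining = dict(budgets)

    def acquire(self):
        """Return the account to use for the next request."""
        for account, left in self.remaining.items():
            if left > 0:
                self.remaining[account] = left - 1
                return account
        raise RuntimeError("all backup accounts have used up their requests")


# Hypothetical accounts with tiny budgets, to make the hand-over visible.
pool = BackupAccountPool({"backup-1": 2, "backup-2": 1})
used = [pool.acquire() for _ in range(3)]
print(used)
```

The first account serves requests until its budget is used up, and only then does the next one take over.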

Part 2
BACKUP SECURITY

GitLab backup software for SOC 2, ISO 27001 compliance

It is a well-known fact that security is the main issue nowadays. And the most sensitive data to be protected for any IT-related organization is the source code. That is the reason why your repository and metadata backup should include a number of security features to ensure data accessibility and recoverability, improve your security posture, and help you meet your shared responsibility duties. To be precise: all of that should allow you to empower your team and keep you on top of regulatory standards at the same time. 

Thus, the software provider you choose and the Data Center where your service is hosted should have world-class security measures, audits, and certificates in place.

Here are the security aspects you should keep in mind and pay attention to:

  • AES encryption and your own encryption key,
  • encryption: in-flight and at rest,
  • flexible, long-term, unlimited retention,
  • the possibility to archive old, unused repositories in accordance with legal requirements,
  • easy monitoring center,
  • multi-tenancy, the possibility to add additional admins and assign privileges,
  • Data Center strict security measures,
  • ransomware protection,
  • Disaster Recovery technologies.

Use AES encryption in-flight and at rest

It is impossible to speak about security and data protection without proper encryption. Moreover, it is vitally important to encrypt your data at every stage: before and during transmission (in-flight) and in storage (at rest). Only then can you be confident that, even if your data is intercepted, nobody can decrypt it.

Your software should also offer AES encryption, which stands for Advanced Encryption Standard. It is a symmetric-key algorithm, meaning that the same key is used for data encryption and decryption. Many DevSecOps specialists consider AES practically unbreakable, which is why many governments and organizations use it.

In a perfect scenario, you should have a choice of encryption strength and level, such as:

  • Low, which runs the AES algorithm in OFB (Output Feedback) mode with a 128-bit encryption key,
  • Medium, which runs the AES algorithm in OFB mode with a longer, 256-bit encryption key,
  • High, which runs the AES algorithm in CBC (Cipher Block Chaining) mode with a 256-bit encryption key.

You should always have a choice, because the encryption method you select affects the backup time and the load on the end device, and can limit selected functionalities. However, there is no need to worry, as all of these AES encryption levels are considered practically unbreakable.

When you configure your encryption level, you should be asked to provide a string of characters from which your encryption key will be built. You should be the only person who knows this string, and it is a good idea to save it in a password manager.
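How a key can be built from such a string is worth a short illustration. The article does not say which method any particular product uses, so the Python sketch below swaps in a common standard technique, PBKDF2 with HMAC-SHA-256 from the standard library. The function name `derive_key`, the iteration count, and the example passphrase are assumptions.

```python
import hashlib
import os


def derive_key(passphrase, salt, key_bits=256, iterations=600_000):
    """Derive an AES-sized key (128 or 256 bits) from a passphrase with PBKDF2."""
    if key_bits not in (128, 256):
        raise ValueError("key_bits must be 128 or 256")
    return hashlib.pbkdf2_hmac(
        "sha256",
        passphrase.encode("utf-8"),
        salt,
        iterations,
        dklen=key_bits // 8,  # key length in bytes
    )


# The salt is random but not secret; it has to be stored so that the same
# key can be derived again at restore time.
salt = os.urandom(16)
key = derive_key("a long passphrase only you know", salt)
print(len(key) * 8, "bit key derived")
```

The same passphrase and salt always produce the same key, which is why losing the passphrase means losing the ability to decrypt the copies.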

Most providers create the encryption keys that secure their users’ data themselves. However, a key that you create yourself gives you stronger protection, because only you know it. GitProtect.io gives you this possibility: our solution reinforces your data security by letting you create custom encryption keys.

Furthermore, you can use your own Vault and provide us with your key only while the backup is being performed. That gives you much stronger control over your access and credentials.

Zero-knowledge encryption

It is important that your device knows nothing about the encryption key and receives it only while performing a backup. In this case, no one but you can decrypt your data. In the security industry, this approach is called zero-knowledge encryption. Thus, when you are looking for reliable backup software, make sure that it offers AES data encryption, your own encryption key, and a zero-knowledge approach.

Data Center region of choice

It is essential for every security-oriented business to know how and where its data is managed and stored. The location of your backup software provider’s Data Center matters to you, as it might impact coverage, application availability, and uptime. Hence, it is important that you can choose where to host your software and store your data. With GitProtect.io, you make this choice at the very beginning, right after signup, when you are asked whether your management service should be hosted in an EU or a US-based Data Center.

On the other hand, no matter which Data Center you choose, what matters most is that it complies with strict security guidelines and meets international certifications and standards such as ISO 27001, EN 50600, EN 1047-2, SOC 2 Type II, SOC 3, FISMA, DOD, DCID, HIPAA, PCI DSS Level 1, ISO 50001, LEED Gold, and SSAE 16.

You should also pay attention to physical security, fire protection and suppression, regular audits, and round-the-clock technical and network support.

Share the responsibility for managing the backup system 

Whatever business area you work in, sharing responsibility speeds up work, improves team morale, and lets you focus on development. Here is what your GitLab backup software should let you do:

  • add new accounts,
  • set roles,
  • set privileges to delegate responsibilities to your team members and administrators,
  • have more control over access and data protection. 

This is achievable only with a central management console and easy monitoring. They give you access to insightful, advanced audit logs, so you know exactly which actions are performed in the system and who made each change.

Ransomware protection

Some people hearing the word “backup” may ask how it is related to ransomware. Backup should always be ransomware-proof, as it is the final line of defense against such malware. Let’s figure out how and why. For example, GitProtect.io compresses and encrypts your data, so it is stored in a non-executable form. Even if ransomware reaches your backup data, it cannot be executed and spread on the storage.

Secure Password Manager keeps the authorization data for GitLab and the storage, and in on-premise instances the agent receives it only for the duration of the backup. Thus, even if ransomware hits the machine our agent runs on, the attacker won’t get access to the authorization data or the storage.

Unfortunately, anything can happen, and if ransomware does encrypt your GitLab data, you can restore a chosen copy from an exact point in time and continue coding without delay.

One more thing: to prevent data from being modified or erased and to be even more ransomware-proof, backup vendors offer immutable, WORM-compliant (write once, read many) storage technology, which writes each file just once and afterwards only allows it to be read.
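The write-once, read-many behavior can be shown with a toy in-memory model in Python. It illustrates only the rule itself; the class `WriteOnceStore` is made up for this example and says nothing about how real immutable storage enforces the rule at the storage layer.

```python
class WriteOnceStore:
    """Toy WORM store: each object can be written once and read many times."""

    def __init__(self):
        self._objects = {}

    def put(self, name, data):
        if name in self._objects:
            # Overwriting is refused, so existing copies cannot be altered.
            raise PermissionError(f"{name!r} already exists and is immutable")
        self._objects[name] = bytes(data)

    def get(self, name):
        return self._objects[name]

    # There is intentionally no update or delete method.


store = WriteOnceStore()
store.put("repo-2024-01-31.bak", b"backup data")
print(store.get("repo-2024-01-31.bak"))
```

A second `put` under the same name raises `PermissionError`, which is the property that keeps malware from silently replacing a good copy with an encrypted one.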

Part 3
DISASTER RECOVERY

Disaster Recovery – use cases & scenarios

When you choose backup and recovery software for your GitLab repositories and metadata, make sure that it has solid Disaster Recovery technology that can respond to every possible data loss scenario. Vendors usually provide data recoverability only for the case when GitLab is down, but, unfortunately, there are far more dangerous situations.

Here we are going to look at how GitProtect.io prepares you for every possible scenario. Before that, let’s see the available data restore options.

Recovery features in a nutshell:

  • point-in-time restore,
  • granular recovery of repositories and only selected metadata,
  • restore to the same or new repository or organization account,
  • cross-over recovery to another Git hosting platform, for example from GitLab to GitHub or Bitbucket and vice versa,
  • easy data migration between platforms,
  • restore to your local device.

The majority of backup vendors offer additional applications for your data restore. But with GitProtect.io you don’t need them, as it provides one central management console for a complete backup & recovery software for your DevOps ecosystem. 

1. What if GitLab is down? 

GitLab outages are rare, but they do happen. In a situation like that, it is essential to know what to do to keep your team working without interruption. So, what should you do if there is an outage? With GitProtect.io, you can instantly restore your repository as .git to your local machine, or use the cross-over feature and restore your repository to another Git hosting platform, whether GitHub or Bitbucket. That’s it, you can continue your work.

2. What if your infrastructure is down?

As long as you follow the 3-2-1 backup rule, you don’t need to worry about your infrastructure being down. This rule has become a widely adopted standard in data protection. According to it, you should have at least 3 copies saved on 2 different storage devices, at least 1 of which is in the cloud. GitProtect.io provides its customers with a multi-storage system that lets them add an unlimited number of storages, including on-premise, cloud, hybrid, or multi-cloud, and replicate backups among them. Moreover, the solution comes with free cloud storage in case you need a reliable second backup target. Thus, even if one backup storage is down, you can easily restore all the needed data from any point in time from your second storage.

3. What if GitProtect’s infrastructure is down? 

GitProtect.io is a product of Xopero Software, a backup & restore company, so data protection is its core business and it is, of course, prepared for every potential outage scenario, especially one that affects its own infrastructure. If GitProtect.io’s environment is down, you will receive the installer of the on-premise application. Your only task is then to log in, assign the storage where your copies are kept, and use all the data restore and Disaster Recovery options the solution provides.

Restore multiple git repositories at a time

There are a lot of situations when you need to instantly restore your entire Git environment, and Restore and Disaster Recovery technologies are the best help here. If you have a backup plan, you can quickly restore your data after a failure, outage, or downtime. The easiest way to do it is to restore multiple GitLab repositories at a time. All you need to do is choose the repositories you want to restore, pick the most recent copies or select copies manually, and restore them to your local machine. Another option is cross-over recovery to another hosting service provider. All of that makes your Disaster Recovery plan easy, fast, and efficient.

Point-in-time restore – don’t limit yourself to the last copy

Human error is one of the most common causes of cybersecurity incidents and data loss, and Git repositories are no exception: the risks range from intentional or unintentional repository or branch deletion to a HEAD overwrite. Once you have identified the exact state and date you want to roll back to, it is essential to be able to restore your GitLab backup from that specific moment in time. It is worth mentioning that most backup vendors let their customers restore only the latest copy or a copy from up to 30 days back, so retention limits can become a real problem.

But what should you do if you notice devastating changes in your source code, for example, 50 days after they occurred? In a situation like that, you need to go back further. Thus, make sure that your GitLab backup software provides point-in-time restore and unlimited retention options. No matter when the mistake or threat is found, you can then roll back to a point before it. You can also use such software for GitLab archiving, for legal compliance, and to overcome GitLab storage limitations.
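Point-in-time restore boils down to picking the newest copy taken at or before the moment you want to go back to. Here is a minimal Python sketch of that lookup; the function name `copy_for_point_in_time` and the dates are made up for illustration.

```python
from bisect import bisect_right
from datetime import datetime


def copy_for_point_in_time(copy_times, wanted):
    """Return the newest copy taken at or before `wanted`."""
    times = sorted(copy_times)
    position = bisect_right(times, wanted)  # copies up to and including `wanted`
    if position == 0:
        raise LookupError("no copy exists that old; retention was too short")
    return times[position - 1]


# Hypothetical copies, and a problem that crept in on 10 February.
copies = [datetime(2024, 1, 1), datetime(2024, 2, 1), datetime(2024, 3, 1)]
good_state = copy_for_point_in_time(copies, datetime(2024, 2, 9))
print(good_state)
```

The `LookupError` branch is exactly the situation short retention creates: the copy you need has already been deleted.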

Restore directly to your local machine

Sometimes those who use GitLab as SaaS want to restore their copies to a local machine. The reason can be a weak internet connection, cloud infrastructure downtime, or a service outage. Thus, together with other restore possibilities, your GitLab backup software should let you restore your entire Git environment to a local machine.

At the same time, it’s worth remembering that it is always great when your software provides you with some additional options, such as restoring to the same or new GitLab repository, and cross-over recovery to another git hosting service. Why? Because it is hard to predict which recovery opportunities you may need in the future. 

Don’t overwrite repositories during the restore process

If you need to restore your repository from a copy, it is better to restore it as a new repo. Why not overwrite the original one? Because you may need the original in the future, for example, for tracking changes. Moreover, it improves your security and gives you full control over your data: you decide whether to keep or delete the repository, and when.
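One simple way to avoid overwriting is to restore under a name that is not taken yet. The Python sketch below shows the idea; the function `restore_target_name` and the "-restored" suffix are illustrative assumptions, not a naming convention of GitLab or GitProtect.io.

```python
def restore_target_name(original, existing_names, suffix="restored"):
    """Pick a repository name for the restore that doesn't collide with any existing one."""
    candidate = f"{original}-{suffix}"
    counter = 2
    while candidate in existing_names:
        candidate = f"{original}-{suffix}-{counter}"
        counter += 1
    return candidate


# Hypothetical project names already present in the account.
existing = {"payments-api", "payments-api-restored"}
print(restore_target_name("payments-api", existing))
```

The original repository stays untouched, and you can compare it with the restored one before deciding which to keep.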
