When you say “I have a backup of my DevOps data”, whether it’s GitHub, GitLab, Bitbucket, Jira, or another tool, we assume you mean that you can recover from those copies if the need arises. Right? Well, backup and recovery are two distinct concepts, and companies should always run them together. But do they?
Data Protection Basics
Backup and disaster recovery involve creating or updating one or more copies on a regular basis, storing them in one or more locations (preferably both cloud and on-prem), and using those copies to resume and continue business operations in the event of failure: human error, deletion, data corruption, service downtime, cyberattack, or natural disaster. By recovery here, we mean restoring all data in bulk, which allows you to get back to work immediately.
However, these two processes, backup and disaster recovery, are very often mistaken for each other or treated as a single process. Backup is the process of making the copies. Disaster recovery is the plan, set of measures, indicators (including RTO, the Recovery Time Objective, and RPO, the Recovery Point Objective), and processes for using the copies to quickly reestablish access to applications, data, and IT resources after an outage. That plan might involve switching over to a redundant set of servers, storage systems, or tools until your primary data center or service is functional again.
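To make the two indicators concrete, here is a minimal sketch of how they could be checked. The function names and the example values are illustrative assumptions, not part of any particular tool: RPO is treated as the maximum acceptable age of the newest copy, and RTO as the maximum acceptable time between the outage and the restored service.

```python
from datetime import datetime, timedelta, timezone


def rpo_violated(last_backup: datetime, now: datetime, rpo: timedelta) -> bool:
    """Return True if a failure at `now` would lose more data than the RPO allows."""
    return now - last_backup > rpo


def rto_met(outage_start: datetime, service_restored: datetime, rto: timedelta) -> bool:
    """Return True if the service was restored within the RTO."""
    return service_restored - outage_start <= rto


if __name__ == "__main__":
    # Hypothetical values: the newest copy is 30 hours old, the RPO is 24 hours.
    now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
    last_backup = datetime(2024, 1, 9, 6, 0, tzinfo=timezone.utc)
    print(rpo_violated(last_backup, now, rpo=timedelta(hours=24)))
```

A backup process alone can only influence the first check. Whether the second one passes depends on the recovery plan, which is exactly the part that backup does not provide.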
There is an even bigger concept behind disaster recovery – business continuity. A business continuity plan includes the measures regarding employees, sites, suppliers and everything else needed to continue core business functions.
Disaster recovery and backup are not interchangeable. Depending on needs and resources, a company may only use data backup for daily operations and accidental deletions or complement it with a comprehensive disaster recovery strategy and processes such as replication, cross-restore, etc.
And this brings us to our myth: that backup always comes with disaster recovery…
MYTH #5 – Backup always comes with Disaster Recovery
Scenario 1 – You have a GitHub/GitLab/Bitbucket/Jira backup script
Let’s take backup scripts as an example. Scripts for backup of repositories, metadata, or export automation in Jira are some of the most frequently used DIY solutions among users of DevOps tools.
Of course, you can build your own solution: a script that automatically clones the entire repository on a periodic basis. The main challenge in this approach is ensuring proper authorization, automation, and management, but for some basic needs you can overcome it with a few lines of code. True enough.
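A minimal sketch of such a DIY script might look like the following. It assumes Git is installed, that credentials for the repositories are already configured, and that scheduling is handled elsewhere (for example by cron). The function names and the backup directory layout are hypothetical.

```python
import subprocess
from pathlib import Path


def mirror_command(repo_url: str, backup_root: Path) -> list[str]:
    """Build the git command that creates or refreshes a mirror of one repository."""
    name = repo_url.rstrip("/").rsplit("/", 1)[-1]
    if not name.endswith(".git"):
        name += ".git"
    target = backup_root / name
    if target.exists():
        # An existing mirror only needs its refs refreshed.
        return ["git", "-C", str(target), "remote", "update", "--prune"]
    # First run: a mirror clone copies all refs (branches, tags, notes).
    return ["git", "clone", "--mirror", repo_url, str(target)]


def backup_all(repo_urls: list[str], backup_root: Path) -> list[str]:
    """Mirror every repository in the list; return the URLs that failed."""
    backup_root.mkdir(parents=True, exist_ok=True)
    failed = []
    for url in repo_urls:
        result = subprocess.run(mirror_command(url, backup_root))
        if result.returncode != 0:
            failed.append(url)
    return failed
```

Calling `backup_all` with a list of repository URLs and a target directory mirrors each one. Note what the sketch leaves out: the repository list is maintained by hand, and nothing here touches issues, pull requests, or other metadata.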
While a simple script for cloning repositories may be effective for a small number of repositories, it becomes less practical as the number of repositories grows or when new ones are added regularly. Additionally, such a script may not adequately protect metadata associated with the repositories, such as information contained in pull requests and issues.
Please bear in mind that a Git backup script only lets you make copies. Once you need to recover data from those copies, you need to write another script. And under no circumstances should you consider this approach a way to ensure smooth disaster recovery. Scripts will not give you the basic functionality your backup plan should include (e.g. retention, copy rotation, encryption, replication), let alone support for different data recovery scenarios. Just consider your answers to the questions below…
- Do your scripts protect all repositories and key metadata?
- Do your scripts allow you to test the backup and recovery process?
- Do you have clearly defined RTO and RPO times?
- Do you store copies in several independent storage locations?
- Do you have replication?
- How long do you keep your copies and from what point in time can you restore them?
- Is it possible to recover copies to a local device, a different deployment (C2C, P2C, C2P), or another tool (between GitHub/GitLab/Bitbucket)?
- Does your team know what to do in the event of a failure and who is responsible for restoring data and tools to continue operations?
- In short – have these solutions prepared you for every possible failure scenario?
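The retention question in the list above is a good example of logic a DIY script has to implement itself. The following is a minimal sketch of one possible rule, assuming each copy is identified by its creation time; the function name and the parameters `keep_days` and `keep_min` are illustrative choices, not a standard.

```python
from datetime import datetime, timedelta


def snapshots_to_delete(snapshots: list[datetime], now: datetime,
                        keep_days: int, keep_min: int = 1) -> list[datetime]:
    """Return snapshots older than the retention window.

    The newest `keep_min` snapshots are never returned, so a long pause in
    backups cannot cause the last remaining copy to be rotated out.
    """
    ordered = sorted(snapshots, reverse=True)  # newest first
    protected = set(ordered[:keep_min])
    cutoff = now - timedelta(days=keep_days)
    return [s for s in ordered if s < cutoff and s not in protected]
```

Even this small rule answers only one question from the list. Encryption, replication to a second location, and restore testing would each need their own code and their own maintenance.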
As you can see, scripts provide a certain level of security, but they cannot be treated as the basis for disaster recovery and business continuity. For that, it is worth relying on automated software that provides backup, granular restore for daily operations, and disaster recovery that covers every scenario. Only such a solution can ensure that you recover all your data in bulk and continue business operations after an attack or service downtime and, in the case of human error or accidental deletion, restore only the deleted items granularly. Such solutions also help ensure compliance with the most important security standards, industry regulations, and shared responsibility models.
Scenario 2 – Your DevOps backup vendor claims it has Disaster Recovery
Unfortunately, many vendors take advantage of the popular conflation of backup and disaster recovery, and they do not always offer backup software with disaster recovery technology in the proper sense. Sometimes what they describe this way is a granular restore option, which lets you browse the copies and select individual items to restore, but not an option to restore all data in bulk to many different locations.
What’s the difference? In the simplest terms, disaster recovery is the plan and processes for using the copies to quickly reestablish access to all data (in our case all repositories and metadata or all Jira data) that are crucial for business continuity in the moment of failure. Sometimes, it might involve switching over to a redundant set of storage systems, data centers, or even tools (between GitHub, GitLab, and Bitbucket) until the primary data center or service is functional again. It requires the ability to restore all data in bulk to many locations: local devices, other data stores, or even a different tool, such as restoring your GitHub data to a Bitbucket or GitLab account (and vice versa). A less obvious benefit of disaster recovery is data migration between deployments (for example, from on-premise to the cloud): the same restore capability lets you securely move from a local to a cloud deployment of your GitHub, GitLab, or Atlassian account, or between those tools.
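For the Git part of such a bulk, cross-tool restore, a minimal sketch could push every mirrored repository to a new host. It assumes the backups are bare mirror clones stored as `<name>.git` directories, that empty target repositories with matching names already exist, and that push credentials are configured. The function names and the target URL layout are hypothetical.

```python
import subprocess
from pathlib import Path


def restore_commands(backup_root: Path, target_base_url: str) -> list[list[str]]:
    """Build one `git push --mirror` command per mirrored repository under backup_root."""
    commands = []
    for mirror in sorted(backup_root.glob("*.git")):
        target = f"{target_base_url.rstrip('/')}/{mirror.name}"
        # --mirror pushes all refs (branches, tags) to the target repository.
        commands.append(["git", "-C", str(mirror), "push", "--mirror", target])
    return commands


def restore_all(backup_root: Path, target_base_url: str) -> list[str]:
    """Run every restore command; return the mirror paths whose push failed."""
    failed = []
    for command in restore_commands(backup_root, target_base_url):
        if subprocess.run(command).returncode != 0:
            failed.append(command[2])  # the mirror path passed to `git -C`
    return failed
```

This restores commits, branches, and tags only. Pull requests, issues, and other metadata live outside the Git repository and are not covered by this sketch.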
Unfortunately, some DevOps backup software vendors say that they have disaster recovery technology while limiting it to granular restore, that is, restoring individual, selected elements. Granular restore does not cover restoring an entire instance or all account data and should only be used for everyday operations, such as an accidental deletion that affects a limited amount of data. It cannot be equated with disaster recovery because it does not ensure the restoration of all critical data in bulk, which is necessary for business continuity.
When choosing backup software and considering its disaster recovery capabilities, it is also worth paying attention to where the data will be restored. If you want to be sure that your disaster recovery process can adapt to every possible scenario, make sure that you can restore your DevOps data to a local device, to the same or different accounts and deployments (e.g. between self-hosted GitLab and a cloud account), and between tools, for example restoring your GitHub data to your Bitbucket account. In the case of Jira, it is worth checking whether the provider allows you to restore all data except users, in case you need to restore to a new Jira account but do not want to incur double fees.
Also, pay attention to the solution's compatibility with external data storage (both cloud and local), the possibility of replication, the retention time it offers, and, of course, all security features that help ensure compliance with security requirements, shared responsibility models, legal and industry regulations, and security certifications (including SOC 2 or ISO 27001).
Find out more about Disaster Recovery Use Cases and ensure the continuity of your DevOps operations in every possible failure scenario with GitProtect.
And if you still think that your GitHub, GitLab, Bitbucket, or Jira data does not need disaster recovery, check out previous editions of MythBuster, in which we debunk the claims that nothing fails in the cloud and that vendors themselves guarantee your data security.