Bitbucket Backup Best Practices
Last Updated on July 19, 2024
The number of users of Git hosting services is growing rapidly as the demand for software grows. Bitbucket, built by Atlassian for teams, is one of the most widely used Git-based code hosting and collaboration tools. And wherever source code is discussed, its security and protection, by which we mean backup, should be discussed too. If you still have doubts about whether to back up your Bitbucket account, download this ebook.
Here we have collected the best practices for backing up your Bitbucket environment so that your source code is always accessible and recoverable, even in the case of the most severe Atlassian outage.
Backup Performance
Once you decide to back up your Bitbucket environment, you need to make sure that your backup covers all your repositories along with the related metadata. Your copies should include the following information (regardless of the package, Bitbucket or Bitbucket DC):
- Repository,
- Wiki,
- Issues,
- Issue comments,
- Downloads,
- Deployment keys,
- Pull requests,
- Pull request comments,
- Webhooks,
- Pipelines/Actions,
- Tags,
- LFS,
- Commits,
- Branches.
Your backup software should let you create as many custom backup plans as your data protection policy requires, so that they fit your organization’s needs, structure, and workflow and give your developers peace of mind.
A good practice is to create one backup plan that records daily (or even more frequent) changes to your principal repositories and metadata, for example with a Grandfather-Father-Son (GFS) rotation scheme, and separate backup plans for unused repositories that you have to keep for future reference. That way you can store your copies for as long as you want, even indefinitely. You should also be able to delete those unused repositories from your Bitbucket account and keep only the copy in storage, so that they no longer count against your Bitbucket limits.
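To make the GFS idea concrete, here is a minimal sketch of how such a rotation could decide which copies to keep. The function name `gfs_keep` and the default counts (7 daily, 4 weekly, 12 monthly) are assumptions chosen for illustration; they are not taken from any particular backup product.

```python
from datetime import date, timedelta


def gfs_keep(backup_dates, daily=7, weekly=4, monthly=12):
    """Return the set of backup dates to keep under a simple GFS scheme.

    Sons:         the most recent `daily` backups.
    Fathers:      the newest backup in each of the latest `weekly` ISO weeks.
    Grandfathers: the newest backup in each of the latest `monthly` months.
    """
    ordered = sorted(set(backup_dates), reverse=True)  # newest first
    keep = set(ordered[:daily])

    def newest_per_bucket(key, limit):
        seen, chosen = set(), []
        for day in ordered:
            bucket = key(day)
            if bucket not in seen:
                seen.add(bucket)
                chosen.append(day)  # first hit is the newest in its bucket
            if len(chosen) == limit:
                break
        return chosen

    keep.update(newest_per_bucket(lambda d: tuple(d.isocalendar())[:2], weekly))
    keep.update(newest_per_bucket(lambda d: (d.year, d.month), monthly))
    return keep


# Sixty consecutive daily backups ending on 19 July 2024.
history = [date(2024, 7, 19) - timedelta(days=i) for i in range(60)]
kept = gfs_keep(history)
```

Everything not in `kept` would be eligible for deletion, so storage use stays bounded while older restore points remain available at coarser granularity.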
You may ask whether Bitbucket DIY Backup could work for your organization and help you fulfill security obligations and compliance requirements. In general, snapshots can be a good option for simple backup needs, but they are not suitable for large-scale or enterprise deployments. In that scenario, the biggest drawback of Bitbucket DIY Backup is that it is a do-it-yourself solution: you are responsible for setting up regular backups and for restoring copies on your own if necessary.
Incremental and differential backups that save your storage space
If you need to reduce the size of backups on your storage, speed up your backup, or limit bandwidth use, make sure that your software copies only the blocks of Bitbucket data that have changed since the last copy. You should also be able to define separate retention and performance schemes for full, incremental, and differential copies.
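The difference between the copy types can be sketched in a few lines. This is an illustration only, assuming data is split into named blocks that are compared by SHA-256 digest; the helper names `fingerprint` and `changed_blocks` are hypothetical, and a real tool would also have to track deleted blocks.

```python
import hashlib


def fingerprint(blocks):
    """Map each block id to the SHA-256 digest of its bytes."""
    return {bid: hashlib.sha256(data).hexdigest() for bid, data in blocks.items()}


def changed_blocks(current, baseline):
    """Ids of blocks that are new or differ from the baseline fingerprint."""
    return sorted(bid for bid, digest in current.items() if baseline.get(bid) != digest)


full = fingerprint({"a": b"1", "b": b"2", "c": b"3"})    # state at the full backup
day1 = fingerprint({"a": b"1", "b": b"2x", "c": b"3"})   # block b changed
day2 = fingerprint({"a": b"1", "b": b"2x", "c": b"3y"})  # block c changed too

full_copy = changed_blocks(full, {})       # everything: ['a', 'b', 'c']
incremental = changed_blocks(day2, day1)   # since the previous copy: ['c']
differential = changed_blocks(day2, full)  # since the last full copy: ['b', 'c']
```

An incremental copy is the smallest but needs the whole chain of earlier copies to restore, while a differential copy is larger but needs only the last full copy.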
Deployment: SaaS or On-Premise
It is up to you how to run your backup solution: in the cloud or self-hosted on your private infrastructure. The main difference is where the backup service is installed and run.
If you choose a SaaS model, you need no additional device to act as a local server, because the service runs within the provider’s cloud infrastructure. The service provider should also guarantee the maintenance, administration, and continuity of operation.
With a local (on-premise) deployment, you install the software on a machine that you provide and control, and it runs locally in your environment. Ideally you can install it on any computer (Windows, Linux, macOS) or on popular NAS devices. Under this model, backups to local storage stay within your local network, which reduces dependence on external connectivity and can make the backup process faster and more efficient.
Note that the deployment model should be independent of storage compatibility: whichever model you choose, you should be able to use the storage you want.
With GitProtect.io Cloud PRO, Cloud Enterprise, and On-Premise Enterprise you can use GitProtect unlimited cloud storage, which is included in the license by default. With an Enterprise plan, you can also bring your own storage, cloud or local. GitProtect.io supports AWS S3, Wasabi Cloud, Backblaze B2, Google Cloud Storage, Azure Blob Storage, and any S3-compatible public cloud; on-premise storage (NFS, CIFS, SMB network shares, local disk resources); and hybrid and multi-cloud environments.
Fulfill the 3-2-1 backup rule and add multiple storage instances
Your Bitbucket backup solution should let you add multiple storage instances, cloud and local (preferably both, in a hybrid setup). This lets you replicate backups between storages and reduces the risk posed by an outage or disaster. It also helps you meet the 3-2-1 backup rule, according to which you should have at least 3 copies on 2 different storage instances, one of which is in the cloud.
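The rule as stated above is easy to check mechanically. The sketch below assumes a hypothetical `BackupCopy` record with a storage name and a flag for cloud storage; the storage names are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    storage: str    # hypothetical storage name, e.g. "local-nas"
    in_cloud: bool  # True when the storage is a cloud location


def meets_3_2_1(copies):
    """At least 3 copies, on at least 2 storages, at least 1 in the cloud."""
    storages = {copy.storage for copy in copies}
    return (
        len(copies) >= 3
        and len(storages) >= 2
        and any(copy.in_cloud for copy in copies)
    )


plan = [
    BackupCopy("local-nas", in_cloud=False),
    BackupCopy("local-nas", in_cloud=False),
    BackupCopy("s3-bucket", in_cloud=True),
]
compliant = meets_3_2_1(plan)  # True for this plan
```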
GitProtect.io, which is a multi-storage system, allows you to store your sensitive data:
- in the cloud, including GitProtect Cloud, AWS S3, Wasabi Cloud, Backblaze B2, Google Cloud Storage, Azure Blob Storage, and any S3-compatible public cloud,
- locally, including NFS, CIFS, SMB network shares, local disk resources,
- in a hybrid environment/multi-cloud.
With GitProtect.io you always get GitProtect Unlimited Cloud Storage for free, whatever type of license you have, so you can start protecting your repositories straight away.
To see how this multi-storage system works in practice, consider an example. You work for a company whose Security and Compliance department requires you to store your data on Google Cloud Storage, but you decide to send your copies to your local server as well. Then a major Google outage occurs, and you need to restore a copy from a month ago immediately. All you need to do is log in to your GitProtect.io account, select the backup plan assigned to your local server, choose the copy from a month ago, and restore it. Where you restore the data is up to you: the same or a new Bitbucket account, a local machine, or a cross-over recovery to another Git hosting platform such as GitLab or GitHub. You can be back at work in about 5 minutes, even though the Google outage is still ongoing.
Backup replication
Replication is an important aspect to pay attention to when you choose a backup solution. With backup replication, you can keep consistent copies in multiple locations in line with the 3-2-1 backup rule, which provides redundancy and business continuity. It is vital to be able to replicate from any data store to any other: cloud to cloud, cloud to local, or local to local, with no limitations.
With GitProtect.io this is easy: you can find the replication plan in the menu of the central management console. You only need to indicate the source and target storage, the agent, and a simple schedule, and your replication plan is ready.
Flexible retention – up to unlimited
Retention settings are a key question in choosing the proper Bitbucket backup solution, as they should meet your legal, compliance, and industry requirements. Depending on the kind of data, how long you must keep sensitive information, and how far back you must be able to restore after a failure, you may need to keep data for years.
Most vendors provide retention from 30 to 365 days by default. Unfortunately, that isn’t enough if your compliance requirements demand longer periods.
You should have an opportunity to set different retention for every backup plan by:
- indicating the number of copies you want to keep,
- indicating how long each copy is kept in storage (note that these parameters should be set separately for full, differential, and incremental backups),
- disabling rules and keeping copies infinitely (in most cases for repository archive purposes).
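The three options in the list can be sketched as a single pruning function applied once per backup type. The name `prune` and its parameters are assumptions for illustration; a real implementation would also have to keep any full copy that later incremental or differential copies depend on.

```python
from datetime import datetime, timedelta


def prune(copies, now, max_copies=None, max_age=None):
    """Split the copies of one backup type into (kept, expired).

    max_copies: keep at most this many of the newest copies.
    max_age:    keep only copies no older than this timedelta.
    Leaving both as None disables the rules, so copies are kept indefinitely.
    """
    kept, expired = [], []
    for index, taken_at in enumerate(sorted(copies, reverse=True)):  # newest first
        too_many = max_copies is not None and index >= max_copies
        too_old = max_age is not None and now - taken_at > max_age
        (expired if too_many or too_old else kept).append(taken_at)
    return kept, expired


now = datetime(2024, 7, 19)
incrementals = [now - timedelta(days=i) for i in range(10)]

by_count, _ = prune(incrementals, now, max_copies=3)
by_age, _ = prune(incrementals, now, max_age=timedelta(days=5))
forever, nothing_expired = prune(incrementals, now)
```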
Monitoring center – email and Slack notifications, tasks, advanced audit logs
You need a comprehensive, customizable monitoring center if you want to monitor backup performance, check statuses, and see who is responsible for which changes. And you will definitely want it!
Custom email notifications are one of the easiest ways to keep track of everything that happens to your repository backups. You should be able to configure:
- recipients, which will permit you to be informed about backup statuses without even having an account,
- backup plan summary details, which include successfully finished tasks, tasks finished with warnings, failed tasks, canceled tasks, and tasks not started,
- the language of the notifications, which is an advantage as well.
Ideally, notifications also reach the communication app that you and your team use every day. GitProtect offers Slack notifications, so your team won’t miss any important information.
Checking the status of ongoing tasks and historical events is also important. With a tasks section, you always have a clear view of the actions in progress and all their details.
Advanced audit logs deserve a mention too, as they record how the applications and services work and how backups are created and data is restored. They also let you see which actions each admin performs, which can help you prevent intentional malicious activity.
To make monitoring easier and less hands-on, you can feed those audit logs into your external monitoring systems and remote management software using webhooks and the API.
It is convenient when all the mentioned features are accessible through a single central management console. It can help you to manage backup, restore, monitoring, and all the system settings. Powerful visual statistics, a data-driven dashboard, and real-time actions can save your time.
GitProtect.io is the only Bitbucket backup and recovery software solution on the market that gives you the possibility to set data protection operations with a single management console.
Create a dedicated Bitbucket user account only for backup reasons to bypass throttling
The best practice for big companies is to create a dedicated Bitbucket user account, used only for backup (for example, one registered under a dedicated backup email address), and connect it to the Bitbucket backup solution. There are two reasons for this. The first is security: this user needs access only to the repositories it should protect. The second is throttling: each Bitbucket user has its own pool of requests to the Bitbucket API, so a dedicated account does not share its request quota with other applications or with your developers. The backup can then run smoothly, without delays caused by limits that other activity has used up.
If you manage a big organization with many repositories, it can be a good idea to have several Bitbucket users dedicated to backup within your Bitbucket account. The backup solution can then automatically switch to the next user when the current one exhausts its API requests, so that even the biggest Bitbucket environment is backed up without interruption.
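The switching logic could look roughly like the sketch below. Everything here is hypothetical: `RateLimited` and the caller-supplied `fetch(account, repo)` function stand in for whatever client a backup tool uses, and neither is part of the Bitbucket API or of GitProtect.io.

```python
class RateLimited(Exception):
    """Raised by the caller-supplied fetch function when an account is throttled."""


def fetch_with_rotation(repos, accounts, fetch):
    """Back up each repo, moving to the next account once one is rate limited.

    `fetch(account, repo)` is a hypothetical callable supplied by the caller;
    it must raise RateLimited when the account has no requests left.
    """
    results = {}
    current = 0
    for repo in repos:
        while True:
            if current >= len(accounts):
                raise RuntimeError("all backup accounts are rate limited")
            try:
                results[repo] = fetch(accounts[current], repo)
                break
            except RateLimited:
                current += 1  # this account is exhausted, try the next one
    return results


# Simulated API: every account may make two requests.
budget = {"backup-1": 2, "backup-2": 2}


def fake_fetch(account, repo):
    if budget[account] == 0:
        raise RateLimited(account)
    budget[account] -= 1
    return account


used = fetch_with_rotation(["r1", "r2", "r3"], ["backup-1", "backup-2"], fake_fetch)
```

In this simulation the first two repositories are fetched by `backup-1` and the third by `backup-2`. A production version would more likely wait for the limit window to reset than give up once every account is exhausted.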
Backup Security
Bitbucket backup software for SOC 2, ISO 27001 compliance
Security is a serious concern for most companies, and source code may be the most valuable data of any IT-related organization. Your repository and metadata backup should therefore include a range of security features that ensure data accessibility and recoverability. It should also improve your security posture and help you meet your duties under the shared responsibility model.
The software provider and the Data Center where your service is hosted should have up-to-date security measures, audits, and certificates in place.
Pay special attention to the following security aspects:
- setting AES encryption with your own encryption key,
- in-flight and at-rest encryption,
- long-term, unlimited, flexible retention,
- possibility to archive old, unused repositories due to legal requirements,
- easy monitoring center,
- multi-tenancy, the possibility to add additional admins and assign privileges,
- strict Data Center security measures,
- ransomware protection,
- Disaster Recovery technologies.
Could zero-downtime backup work for your organization? Keep in mind that even Atlassian refers to it as a “technique”. It is still script-level protection, which comes with some drawbacks. It could work for you if scale is not an issue, if you are not concerned with ransomware protection or disaster recovery, and if you don’t need to cover other tools, like Jira or Confluence.
Use AES encryption in-flight and at rest
Encryption is another important practice in data protection. It is worth protecting your sensitive data at every stage, in flight and at rest. In-flight encryption means that the data is encrypted on your device before it leaves your machine and stays encrypted during the transfer; at-rest encryption means that your data is encrypted in the backup target or Git repository. Using both means that even if your data is intercepted, it cannot be read without the key.
Support for the Advanced Encryption Standard (AES) is a must for your software. AES is a symmetric-key algorithm, which means it uses the same key to encrypt and decrypt data. No practical attack against properly used AES is publicly known, which is why many governments and organizations rely on it.
It is important that you have a choice of encryption strength:
- Low: AES works in OFB (Output Feedback) mode with a 128-bit encryption key.
- Medium: as with ‘Low’, AES runs in OFB mode, but the encryption key is twice as long, at 256 bits.
- High: AES works in CBC (Cipher Block Chaining) mode with a 256-bit encryption key.
All the levels use AES and are considered very strong. The choice still matters because, depending on the level you choose, the backup time will vary, and the load on the end device may increase or selected functionalities may be limited.
During encryption configuration you provide a string of characters, and your encryption key is built from that string. You should be the only person who knows this password, and it is a good idea to save it in a password vault or password manager.
If you want your encryption key to be really strong, you should provide your own. Unlike the majority of providers that offer encryption, GitProtect.io lets you create custom encryption keys, which strengthens your data security.
You can also use your own vault and provide us with your key only while the backup is being performed. That gives you and your developers much stronger control over your access and credentials.
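How a key can be built from a string of characters is worth a short illustration. The sketch below uses PBKDF2 from Python’s standard library as one common way to do it; the article does not say which derivation function GitProtect.io uses, so the function choice, the iteration count, and the name `derive_key` are all assumptions. The key sizes follow the Low, Medium, and High levels described earlier.

```python
import hashlib
import os

# Key length in bits for each strength level described in the article.
KEY_BITS = {"low": 128, "medium": 256, "high": 256}


def derive_key(passphrase, strength, salt=None, iterations=600_000):
    """Derive an AES key of the right length from a passphrase with PBKDF2.

    Returns (key, salt). The salt is not secret, but it must be stored with
    the backup so the same key can be derived again at restore time.
    """
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for a new key
    key = hashlib.pbkdf2_hmac(
        "sha256",
        passphrase.encode("utf-8"),
        salt,
        iterations,
        dklen=KEY_BITS[strength] // 8,
    )
    return key, salt


key, salt = derive_key("a long passphrase kept in your vault", "high")
same_key, _ = derive_key("a long passphrase kept in your vault", "high", salt=salt)
```

With the same passphrase and salt the function always returns the same key, which is why losing the passphrase means losing access to the encrypted copies.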
Zero-knowledge encryption
You, the key owner, should be the only one who knows the encryption key. Even your device shouldn’t store it; it should receive the key only while performing a backup. In the security industry this approach is called zero-knowledge encryption. So when you evaluate a backup solution, make sure that it offers AES data encryption, support for your own encryption key, and a zero-knowledge infrastructure.
Data Center region of choice
It is crucial for every security-oriented business to know how its data is stored and managed. That is why the Data Center location matters: it can affect coverage, application availability, and uptime. You should always be able to choose where to host your software and store your data. With GitProtect.io you get that choice: when signing up, you choose whether to deploy the service and store data in an EU-based or a US-based Data Center.
When choosing the Data Center, make sure that it complies with strict security guidelines and meets standards and certifications such as ISO 27001, EN 50600, EN 1047-2, SOC 2 Type II, SOC 3, FISMA, DOD, DCID, HIPAA, PCI DSS Level 1, ISO 50001, LEED Gold Certified, and SSAE 16.
Never forget about physical security, fire protection and suppression systems, regular audits, and constant technical and network support. All these factors are important as well.
Sharing the responsibility for managing the backup system
Whatever the business area, sharing responsibilities among employees makes work go smoother and faster. It also improves team morale and lets you focus on the bigger picture. So what should your Bitbucket backup solution let you do? It should let you add new accounts and set roles and privileges, so that you can delegate responsibilities to your team of developers and administrators and keep more control over access and data protection.
All of this requires a central management console and easy monitoring. It is essential to know which actions were performed in the system and who is responsible for them, and that is why you need access to insightful, advanced audit logs.
Ransomware protection
Note that your backup should be ransomware-proof. Let’s look at how a backup vendor processes your data. GitProtect.io compresses and encrypts your data, so it is stored in a non-executable form. Even if ransomware reaches your backed-up data, it cannot execute and spread on the storage.
You can keep the authorization data for storage and Bitbucket in the Secure Password Manager. If you have an on-premise instance, the agent receives those credentials only during a backup. Hence, even if the machine the agent runs on is hit by ransomware, nobody can get access to the authorization data or the storage.
Situations differ, though. Even if ransomware succeeds in encrypting your Bitbucket data, you should be able to restore a chosen copy from the point in time you need and get back to work immediately.
Also check whether the backup vendor offers immutable, WORM-compliant storage, a technology that writes each file once and lets it be read many times. Data stored this way cannot be modified or erased, which makes your backups resistant to ransomware.
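The write-once, read-many behavior can be illustrated with a toy in-memory store. The class `WormStore` is purely hypothetical; in practice immutability is enforced by the storage system itself (for example through object-lock features of cloud storage), not by application code like this.

```python
class WormStore:
    """Toy write-once, read-many store: an object can never be overwritten."""

    def __init__(self):
        self._objects = {}

    def write(self, name, data):
        if name in self._objects:
            raise PermissionError(f"{name!r} is immutable and already written")
        self._objects[name] = bytes(data)  # store an immutable copy

    def read(self, name):
        return self._objects[name]


store = WormStore()
store.write("repo-2024-07-19.bundle", b"backup bytes")

try:
    # Ransomware trying to replace the copy with encrypted garbage is refused.
    store.write("repo-2024-07-19.bundle", b"encrypted garbage")
except PermissionError:
    pass

original = store.read("repo-2024-07-19.bundle")  # still b"backup bytes"
```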
Disaster Recovery
Disaster Recovery – use cases & scenarios
When you choose a backup and recovery application for your Bitbucket repositories and metadata, check whether its Disaster Recovery technology covers every likely data loss scenario. The majority of vendors provide recoverability only in case Atlassian is down. But what about other dangerous situations? Here are the restore options GitProtect.io guarantees you:
- point-in-time restore,
- granular recovery of repositories and selected metadata,
- restore to the same or new repository/organization account,
- cross-over recovery to another Git hosting platform (e.g. from Bitbucket to GitLab or GitHub and vice versa),
- easy data migration between platforms,
- restore to your local device.
When you choose GitProtect.io to back up your data, you get the best backup and disaster recovery features as one package. You don’t need to install any additional applications. GitProtect.io is a complete backup & recovery software for your DevOps ecosystem with one central management console.
1. What if Atlassian is down and you can’t get access to your Bitbucket account?
If Atlassian is down, you need to act fast to restore your repository and continue coding. In these circumstances you have three options. First, you can instantly restore your entire Bitbucket environment, from the last copy or from a selected point in time, to your local machine as .git. Second, you can recover the copy to your local Bitbucket instance. Third, you can use crossover recovery to another Git hosting platform (GitHub or GitLab), which lets your developers work without interruption.
2. What if your infrastructure is down?
Today the 3-2-1 backup rule is a standard in data protection. Under this rule, you should have at least 3 copies stored on 2 different storage instances, with at least one of those copies kept in the cloud. With GitProtect.io, you can add an unlimited number of storage instances (on-premise, cloud, hybrid, or multi-cloud) and replicate backups among them. If you need a reliable second backup target, the solution also offers free cloud storage. So you need not worry if one backup storage is down: you can restore all the data you need, from any point in time, from your second storage.
3. What if GitProtect’s infrastructure is down?
Data protection is our everything, which is why we must be prepared for every potential outage scenario, including one that affects our own infrastructure. If our infrastructure is down, you will get the on-premise app installer. All you need to do is log in and assign the storage where your copies are kept. That gives you access to all your backed-up data and to the restore and Disaster Recovery options mentioned above.
Restore multiple Git repositories at a time
There are many situations, such as downtime or a service outage, in which you need to restore your Git repositories instantly. It helps to have a procedure for such situations, and Restore and Disaster Recovery technologies support it: if the backup exists, the data can be restored quickly. The easiest way is to restore multiple Bitbucket repositories at the same time. You choose the repositories you want to restore, pick the most recent copies or assign copies manually, and recover them. Where you recover them is up to you: you can restore them to your local machine or use cross-over recovery to another hosting service provider, making your Disaster Recovery plan as easy, fast, and efficient as possible.
Point-in-time restore – don’t limit yourself to the latest copy
One of the most common causes of data loss and cybersecurity risk is human error. Whether a deletion was intentional or not, all you should need to do is state the exact point in time you want to restore your Bitbucket backup from. Note, however, that most backup vendors let you restore only the latest copy or a copy from up to 30 days prior.
So what should you do if you notice a mistake in your code two months after it occurred? You should make sure that your Bitbucket backup solution offers both point-in-time restore and unlimited retention. Such software also helps you overcome Bitbucket storage limitations, ensure legal compliance, and keep your data recoverable at all times, so that you are ready for any threat.
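Selecting the right copy for a point-in-time restore comes down to finding the newest copy taken at or before the requested moment. A minimal sketch, assuming copies are identified by their timestamps (the function name `copy_for_point_in_time` is hypothetical):

```python
from bisect import bisect_right
from datetime import datetime


def copy_for_point_in_time(copy_times, target):
    """Return the newest copy taken at or before `target`, or None if none exists."""
    ordered = sorted(copy_times)
    index = bisect_right(ordered, target)  # number of copies taken up to target
    return ordered[index - 1] if index else None


copies = [datetime(2024, 5, 1), datetime(2024, 6, 1), datetime(2024, 7, 1)]

# The mistake was introduced on 15 June, so restore the state from 1 June.
chosen = copy_for_point_in_time(copies, datetime(2024, 6, 14, 23, 59))
```

With retention limited to 30 days, the June copy in this example would already be gone by mid-August, which is why point-in-time restore and long retention go together.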
Restore directly to your local machine
You may prefer to work with Bitbucket as SaaS. However, there can come a moment when you want to restore some copies to your local machine, for example because of cloud infrastructure downtime, a service outage, or a weak internet connection. That is why it is important that the restore options include restoring your entire Git environment to your local machine.
Options such as restoring to the same or a new Bitbucket repository, or crossover recovery, are essential as well, because nobody knows in advance which scenario will best meet your organization’s needs.
Restore without repository overwriting
When you restore a repository from a copy, it is better to restore it as a new repository than to overwrite the original one. You may need the original later, for example for reference, and, more importantly, keeping it matters from a security point of view. It also leaves you in charge of keeping or deleting your repositories, so you have full control of your data and make all the decisions.
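One simple way to restore without overwriting is to pick a target name that no existing repository uses. The sketch below assumes a `-restored` suffix as the naming convention; both the convention and the function name `restore_name` are illustrative choices, not a feature of any specific tool.

```python
def restore_name(original, existing):
    """Pick a repository name for the restored copy that collides with nothing."""
    candidate = f"{original}-restored"
    suffix = 2
    while candidate in existing:
        candidate = f"{original}-restored-{suffix}"
        suffix += 1
    return candidate


existing_repos = {"api", "api-restored", "web"}
target = restore_name("api", existing_repos)  # "api-restored-2"
```

The original `api` repository stays untouched, and the decision to keep or delete it remains with you.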