GitLab High Availability and GitLab Geo Options
Last Updated on March 6, 2024
“Time is money” is a trivial statement, but it reflects the reality of the market for modern IT services. We create products or services, make them available to customers, and they pay us to use them. In theory, we guarantee access to these systems, whether as a service provider or by offering our products directly to customers. But what happens during outages or breakdowns? Customers are simply unable to access the service, and this is only part of a much bigger problem. For every minute a user cannot access the service or data, their organization loses not only precious time but also money. Therefore, both the service provider and the user should put some countermeasures in place.
GitLab High Availability
High Availability is a feature that allows us to minimize service interruptions when our appliances run into technical problems. It may be an unexpected network issue, a software crash, or even a hardware failure. Actually, it doesn’t matter what the reason is. What really matters is the result. And by the result, I mean that other appliances (copies of the original one) can take over and keep our application up and running.
This allows us to be confident that our applications and systems can handle a failure and still be available to our customers, which can be crucial for many reasons. Of course, there is always a tough decision to be made. Every tool, feature, or enhancement costs money, so at the end of the day we have to ask: is it worth it? Sometimes certain security features may not be cost-effective from a particular point of view. Let’s take a look at the GitLab documentation to explain in detail what I mean here.
“For environments serving 3,000 or more users, we generally recommend that a HA strategy is used as at this level outages will have a bigger impact against more users.”
Thus, GitLab recommends using HA only above a certain number of users. This is understandable: with HA in place, each component has to be duplicated, which means additional maintenance costs. So what should we do when the number of users is small? Let me quote another passage from the same documentation:
“For a lot of our customers with fewer than 3,000 users, we’ve found a backup strategy is sufficient and even preferable. While this does have a slower recovery time, it also means you have a much smaller architecture and less maintenance costs as a result.”
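The two quotes boil down to a simple rule of thumb, which could be sketched in Python roughly as follows. The 3,000-user threshold comes from the quoted documentation. The function name and the returned labels are illustrative assumptions, and a real decision would also weigh budget and recovery-time requirements.

```python
# Threshold taken from the GitLab documentation quoted above.
HA_USER_THRESHOLD = 3000


def recommend_strategy(user_count: int) -> str:
    """Return a rough starting recommendation based on user count alone.

    Hypothetical helper for illustration only; it is not part of GitLab.
    """
    if user_count < 0:
        raise ValueError("user_count must not be negative")
    if user_count >= HA_USER_THRESHOLD:
        # Outages hit many users at once, so redundancy is generally
        # recommended. A backup is still needed alongside HA.
        return "high availability plus backup"
    # Smaller installations: a backup strategy is often sufficient
    # and cheaper to maintain, at the price of slower recovery.
    return "backup strategy"


if __name__ == "__main__":
    for users in (500, 3000, 10000):
        print(users, "->", recommend_strategy(users))
```

Treat the result as a starting point for discussion, not as a sizing tool.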
GitLab Geo Options
Now let’s check out another feature called GitLab Geo. Let’s imagine a situation in which our GitLab instance is hosted in a single location, somewhere in Europe. But our company is quite big and we have just opened two new offices: one in the United States and the other in Singapore. Do you see where I’m going with this? Especially today, in the post-pandemic world, there are a lot of distributed projects and distributed teams. This is both a challenge and an opportunity. Being able to hire specialists from all over the world is great, but having distributed teams involves some technical problems, which fortunately we can solve relatively easily.
With GitLab Geo, we can have read-only mirrors of our GitLab instance. For example, one instance per team, if needed. This configuration allows our teams to work faster by increasing the speed of fetching, cloning, or reading any data. All repositories, user accounts, issues, groups, and more are replicated from the primary instance to the secondary ones with read-only access. This feature is very easy to scale and, above all, noticeably increases the speed of work. For example, on the official site in the “about” section regarding GitLab Geo there is the following statement:
Is it a backup?
We have already learned what GitLab Geo and GitLab High Availability are… more or less, but how do they relate to the topic of this post? Well… they come close. Let’s start by explaining what a backup is and what it should be. In short, it is a copy of computer data taken and stored elsewhere so that it may be used to restore the original. But that alone is not enough. A good backup should include features like:
- automation
- encryption
- versioning
- data retention
- recovery process
- scalability
And these are just the most essential features. There are tools on the market that provide all of them and much more. For example, they may include easy integrations with various hosting platforms, a user-friendly UI, plenty of statistical data, or various types of notifications, such as emails.
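To make two items from the list, versioning and data retention, more concrete, here is a minimal Python sketch of a retention rule. It is an illustration under stated assumptions, not how any particular backup product works: the function name, the 30-day window, and the "always keep the three newest versions" safeguard are all hypothetical choices.

```python
from datetime import datetime, timedelta


def select_backups_to_delete(backup_times, now, keep_days=30, keep_min=3):
    """Return the timestamps of backup versions that may be pruned.

    A version is kept if it is younger than keep_days, or if it is among
    the keep_min most recent versions, so that a few restore points
    always survive even if backups stopped running for a while.
    """
    newest_first = sorted(backup_times, reverse=True)
    cutoff = now - timedelta(days=keep_days)
    always_keep = set(newest_first[:keep_min])
    return [t for t in newest_first if t < cutoff and t not in always_keep]


if __name__ == "__main__":
    now = datetime(2024, 3, 6)
    # Four versions taken 1, 10, 45, and 90 days ago.
    versions = [now - timedelta(days=d) for d in (1, 10, 45, 90)]
    for expired in select_backups_to_delete(versions, now):
        print("can be pruned:", expired.date())
```

The safeguard matters because a purely age-based rule would delete every version once backups silently stop, which is exactly when they are needed most.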
As you can see, neither of the previously described solutions has these features. They are useful tools, but they simply serve a different purpose. The authors themselves acknowledge this in the passage quoted above about customers with fewer than 3,000 users. Using a backup as intended even allows us to avoid the complexities that come with High Availability. To make it clear, let me add that a properly configured environment with HA and Geo can help us avoid having to restore from a backup, but it will not replace one.
GitLab Geo is a very nice tool, though it is still under development and many features and improvements are yet to come. Currently, it should not, and indeed must not, be considered a backup. Leaving aside the arguments already mentioned, there are also a few limitations. For example, real-time updates of issues do not work on the secondary site. The same goes for GitLab Runners: these cannot be registered on the secondary site. So what about consistency and a recovery plan?
HA vs. Backup and DR – what to choose instead
When we speak about DevOps backup, and GitLab backup in particular, we understand that backup is crucial because nobody wants to lose precious time and important data to a failure. GitProtect provides organizations with a holistic approach to data protection: you can set policy-based automatic backups, define the schedule, and perform backups with no limitations, additionally enhanced with ransomware protection and first-on-the-market REAL disaster recovery. All this comes with centralized management, detailed audit logs, webhooks, and API integration. GitProtect helps you keep your DevOps ecosystem safe for your compliance needs.
You can use your own scripts, but this approach is not recommended because it is difficult to maintain a correct backup chain and it raises many other security issues. You can also choose a professional backup solution, in which case backup management is much easier and takes less time and fewer resources. But remember, even with GitLab Geo and GitLab HA you still need a reliable and working backup in place.