But in business there is one thing you must not forget: security. Yes, we have to protect our data. Each of our applications, devices and repositories may be attacked by hackers. In fact, not ‘may’: sooner or later it will be. So the question is not IF, but WHEN. It is worth asking ourselves: are we prepared for this?
There are many types of attacks, including the popular DDoS, XSS, phishing, and more. It could even be plain email spam that waits for a user to click on a malicious link. Social engineering plays a part here too: how do you get users to install malware themselves, especially when the hacker targets someone who is tech-savvy? People working with Git usually have some technical knowledge. Either way, regardless of the source of the attack, the effect on us can be terrible.
What is ransomware?
I would like to focus on what we call ransomware. This is malware that blocks access to data or entire systems, usually by encrypting them, and then demands a ransom to restore that access. In 2019, Europol published a report listing ransomware as the most prominent threat in terms of prevalence and financial damage. When an organization like that says so, it tells us a lot about the scale of such attacks.
According to a 2017 Malwarebytes report, 35% of SMBs have been victims of a ransomware attack, and 90% of infections resulted in more than an hour of downtime! Interestingly, while the ransom demands weren’t always large, the impact on productivity caused a significant drop in revenue. This is the crux of the matter! After such an attack we measure our losses not by the amount of the ransom, but by the fact that our services are unavailable and our employees cannot work on developing them. By the way, the ransom is rarely actually paid, even if it is small.
Case study – Ransomware attack on GitHub, Bitbucket and GitLab
Consider the well-known attack that took place in May 2019. Ransomware hit hundreds of repositories on GitHub, GitLab, and Bitbucket. All source code disappeared from the affected repositories, and in its place there was a single file with information about the infection, the amount of the ransom and how to pay it. Let me quote one of the victims of this attack. This is part of a post at security.stackexchange.com:
“I’m at a bit of a loss just now as what to do, 2 factor has been turned on in github, the main server where the code was used I’ve removed unused scripts etc changed passwords, currently building a new server droplet and moving everything as a precaution in case the server was accessed.“
As seen above, the developer not only lost his code, but also had to spend time and money securing his current configuration and architecture to avoid future problems. A double loss that could potentially have been avoided by preparing for such an attack in advance. The hackers probably scanned for exposed Git config files, pulled credentials from them and then used those credentials to access the affected repositories.
This story has a fairly happy ending: most of the data was eventually recovered. But only after long hours or even days, and a lot of pain and tears. It is also a huge alarm signal, flashing bright red and sounding loudly: “Be careful! Your repository could be next!”
So how can we protect our repositories?
First of all, don’t keep your credentials public! This seems obvious, but as it turns out, not to everyone. In 2018, an experiment was conducted to search GitHub for commits that contained the words “removed password” in the message. The result? 350k! And I would bet that at least twice as many did exactly the same thing under a different commit message. This is a huge number that shows the scale of the phenomenon.
Besides carelessly publishing passwords, endpoints, API keys or other sensitive data, it is also worth considering less obvious topics. As an example, let’s take the ‘hidden’ file .DS_Store on macOS systems. It is a file that contains some metadata about the directory in which it is located. Why could this be a problem? Because when the directory is made public, this file lets a potential hacker obtain information that he should not have, such as the names of the files in that directory.
Without going into details, I will use the results of an experiment by the portal internetwache.org. Among the 1M most popular websites (according to Amazon), as many as 10k domains exposed such a file. Over 600 of them contained information about a file with the .bak extension, of which “only” 2 had the name “db.bak”. I wonder what this file could contain… Of course, what mattered here was authentication, or rather the lack of it. The scale may not be large, but it shows that seemingly meaningless information can be dangerous for us when it falls into the hands of hackers.
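To make this less abstract, here is a minimal sketch of a check for such files in a repository. It assumes the list of tracked paths comes from somewhere like the output of `git ls-files`; the function name `risky_paths` and the pattern list are made up for this illustration and are by no means complete.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Illustrative file name patterns only (an assumption, not a complete list).
RISKY_NAME_PATTERNS = (".DS_Store", "*.bak", ".env")


def risky_paths(paths):
    """Return the paths whose file name matches one of the risky patterns."""
    flagged = []
    for path in paths:
        # Compare only the last path component, e.g. "db.bak" in "dump/db.bak".
        name = PurePosixPath(path).name
        if any(fnmatchcase(name, pattern) for pattern in RISKY_NAME_PATTERNS):
            flagged.append(path)
    return flagged
```

Anything this flags is probably a candidate for .gitignore, and for removal from the repository if it was already committed.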
Let’s talk more about authentication in the context of our repositories. It is critical to use additional security features such as 2-factor authentication. GitHub strongly encourages this kind of access, or SSH keys, in place of the traditional login and password approach. But let’s assume for a moment that we have such security in place, and the hacker still managed to, for example, steal a personal token and log in successfully. How can we protect ourselves against that? By restricting access to the repository. By default, you should deny access everywhere and grant only the necessary minimum. It is also worth making sure that such access is revoked when an employee moves to another project or leaves the company.
As for passwords, I will only state the obvious: they should be strong and, above all, not used anywhere else!
You also have to be careful with your tokens. They are a good way to strengthen authentication, but if we inadvertently put our token in the repository URL (e.g. when running git clone), that token will be saved in the local Git config file. Coming back to the story of the May 2019 attack, such information in a configuration file can do us a lot of harm.
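As a rough illustration, the sketch below scans the text of a Git config file (for example the contents of .git/config) for remote URLs that carry a username, password or token. The function name `find_embedded_credentials` is hypothetical, and the check is deliberately simple: it flags any HTTP(S) remote with user information in it, even though a bare username is not always a secret.

```python
import re
from urllib.parse import urlsplit

# Matches lines such as "url = https://..." or "pushurl = https://...".
URL_LINE = re.compile(r"^\s*(?:url|pushurl)\s*=\s*(\S+)", re.IGNORECASE)


def find_embedded_credentials(config_text):
    """Return redacted remote URLs that carry user info (a possible token)."""
    leaky = []
    for line in config_text.splitlines():
        match = URL_LINE.match(line)
        if not match:
            continue
        parts = urlsplit(match.group(1))
        # scp-style SSH remotes (git@host:path) have no URL scheme and are skipped.
        if parts.scheme in ("http", "https") and (parts.username or parts.password):
            host = parts.netloc.rsplit("@", 1)[1]
            # Never echo the secret itself; report the URL with user info masked.
            leaky.append(f"{parts.scheme}://***@{host}{parts.path}")
    return leaky
```

If this reports anything, the usual remedy is to revoke that token, set the remote to a URL without credentials, and let a credential helper or SSH key handle authentication.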
Now think about how theory compares to practice. We all open a Pull Request before putting new code into our repositories, right? Formally, yes, but what matters is how the Pull Request is actually reviewed. This is a weak point, because the human factor is decisive here, and it often fails when it comes to security. Consider a tool for static analysis of each Pull Request. There are several available on the market, or you can prepare your own if you want to. Either way, it is worth equipping yourself with such a tool. How do you do it on your own? Git provides hooks, a mechanism that runs specific scripts when a given event occurs, for example when a new commit is created or before a push. A hook can run whatever checks you need. In this way, we can protect our projects against commits that contain passwords or other things that should not be public.
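As a minimal sketch of such a hook, the script below inspects the lines added in the staged changes and reports anything that looks like a secret. The three patterns are only illustrative assumptions, and the names `SECRET_PATTERNS`, `find_secrets` and `main` are made up for this example; a real scanner would need a far richer rule set.

```python
import re
import subprocess

# Illustrative patterns only (an assumption, far from exhaustive).
SECRET_PATTERNS = {
    "password assignment": re.compile(
        r"(?i)\b(password|passwd|secret|api_?key)\b\s*[:=]\s*['\"][^'\"]+['\"]"
    ),
    "private key header": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
    "AWS access key id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def find_secrets(diff_text):
    """Return (label, line) pairs for added diff lines that look like secrets."""
    findings = []
    for line in diff_text.splitlines():
        # Inspect added lines only; "+++ b/file" is a file header, not content.
        # (Simplification: an added line whose content starts with "++" is skipped.)
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((label, line[1:].strip()))
    return findings


def main():
    """Scan the staged changes; a non-zero return value should block the commit."""
    diff = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, check=True,
    ).stdout
    findings = find_secrets(diff)
    for label, line in findings:
        print(f"possible secret ({label}): {line}")
    return 1 if findings else 0
```

To try it as a hook, one option is to save it as the executable file .git/hooks/pre-commit with a Python shebang line at the top and a final line calling `sys.exit(main())` (plus `import sys`). Keep in mind that client-side hooks can be skipped with `git commit --no-verify`, so a check like this complements a server-side or CI scan instead of replacing it.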
Am I prepared for that?
Let’s go back to the effects of ransomware attacks on our repositories. We’ve covered a number of ways to increase security, but none of them gives us a 100% guarantee. The arms race in security will never allow us to rest completely easy. We can have an excellent and secure process for making changes to the repository, follow recommendations, rotate SSH keys regularly, and so on, and still fall victim to a ransomware attack, just like every third company. So it is necessary not only to take care of security, but also to prepare a plan for what we can and should do when an attack occurs. The obvious answer is to make proper, regular backups of our repositories and have a way to restore them easily if needed. So ask yourself: are you and your company prepared for this?
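One simple way to start, sketched below under a few assumptions, is a dated mirror clone of each repository. `git clone --mirror` copies all branches, tags and other refs, and `git push --mirror` can later push them to a fresh remote. The helper names (`snapshot_path`, `clone_command`, `restore_command`, `backup`) and the directory layout are hypothetical. A separate snapshot is created per day so that a later, already damaged state of the remote cannot overwrite an earlier good copy.

```python
import re
import subprocess
from datetime import date
from pathlib import Path


def snapshot_path(remote_url, backup_root, day):
    """Build a dated directory for the snapshot, e.g. <root>/repo-2019-05-03.git."""
    # Take the last component of both URL-style and scp-style (host:path) remotes.
    name = re.split(r"[/:]", remote_url.rstrip("/"))[-1]
    if name.endswith(".git"):
        name = name[:-4]
    return Path(backup_root) / f"{name}-{day.isoformat()}.git"


def clone_command(remote_url, target):
    """Command that copies every ref of the remote into a bare mirror."""
    return ["git", "clone", "--mirror", remote_url, str(target)]


def restore_command(snapshot, new_remote_url):
    """Command that pushes every ref of a snapshot to a new, empty remote."""
    return ["git", "-C", str(snapshot), "push", "--mirror", new_remote_url]


def backup(remote_url, backup_root):
    """Create today's snapshot; raises CalledProcessError if git fails."""
    target = snapshot_path(remote_url, backup_root, date.today())
    target.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(clone_command(remote_url, target), check=True)
    return target
```

A scheduled job could call `backup()` for every repository and store the snapshots somewhere the repository credentials cannot reach. Note that a mirror clone covers only the Git data; issues, pull requests and other platform metadata need a separate backup.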