Detect Secrets in Code with GitGuardian

Let’s begin with an undeniable truth: every programmer has, at some point in their coding career, hard-coded secrets into their code. Whether it’s to perform a quick API test or to store credentials temporarily, there’s no denying that it’s a convenient shortcut. However, it’s also becoming an increasingly dangerous threat to the security of codebases.

The Problem of Secrets Sprawl

The issue is one of scale: while most developers recognize that hard-coding secrets is bad practice, mistakes still happen often enough to put everyone at risk. Put simply, secrets in code, and even more so in version control systems such as git, are a reality that must be addressed.

If you are not yet convinced, please allow me to present the data: according to the largest study on the matter, the State of Secrets Sprawl 2023 report, in 2022, 10 million new secrets were discovered in public GitHub commits. This number, derived from analyzing more than 1 billion new commits, is 67% higher than the preceding year.

The report also uncovered that 1 out of 10 GitHub code authors exposed a secret in 2022, which indicates that such incidents are not uncommon and are not necessarily related to experience or seniority. Furthermore, what is visible on GitHub is only the tip of the iceberg, as secrets are more frequently exposed in private source code repositories.

With software development teams managing an ever-increasing number of credentials, there is a heightened risk of secrets being exposed in source code, CI/CD workflows, container image layers, runner logs, and other areas. This is not only a security issue, but it also affects productivity, as teams must rotate credentials quickly to address the vulnerability, which can disrupt CI/CD pipelines and bring teams to a standstill. To reduce these risks, organizations should consider shifting secrets scanning left and detecting hard-coded secrets earlier.

First, let’s review why secrets detection is harder than one might think. We will then look at how to use secret detection to develop an effective policy for protecting an organization’s secrets.

Why Are Hard-coded Secrets Different from other Types of Vulnerabilities?

It is important to note that hard-coded secrets differ from other types of vulnerabilities in that they are not execution vulnerabilities.

This means that they do not require the software to be running to be a vulnerability; rather, from the moment a secret is copied in cleartext into a digital asset, it becomes a security flaw. That’s a fundamental difference from the security flaws listed in well-known resources such as the Common Weakness Enumeration (CWE) Top 25 or the OWASP Top 10.

When a developer mistakenly commits a secret, the mistake is either noticed or it is not.

In the latter case, the secret will likely reach the remote version control system (VCS), at which point it must be considered leaked. In the best case it is caught at the code review stage, but by then it may already need to be rotated.

In the former case, one very common mistake would be to delete it and simply commit the change. The secret disappears from the current state of the source code, but it is still in the commit history!

It is not uncommon to find valid secrets hidden deep inside the codebase history. The bottom line is that, unlike other vulnerability scanning processes, secrets detection needs to take into account this attack surface and scan for incremental changes to the repository to prevent these kinds of leaks.
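The "delete it and commit" scenario is easy to reproduce. The following is a minimal sketch that assumes only that git is installed; the file name, the variable names, and the key value are made up for illustration. It builds a throwaway repository, commits a fake key, removes it in a second commit, and then compares the working tree with the history.

```shell
# Throwaway repository; the key below is a made-up placeholder, not a real credential.
repo="$(mktemp -d)"
cd "$repo" || exit 1
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"

# Commit 1: the secret is hard-coded.
echo 'API_KEY = "FAKE-KEY-0123456789"' > settings.py
git add settings.py
git commit -q -m "Add settings"

# Commit 2: the developer deletes the key and simply commits the change.
echo 'API_KEY = None  # loaded from the environment at runtime' > settings.py
git commit -q -a -m "Remove hard-coded key"

# The current state of the file no longer contains the key...
in_tree="$(grep -c 'FAKE-KEY' settings.py)"
# ...but the pickaxe search (-S) still lists every commit that added or removed it.
in_history="$(git log --all --oneline -S 'FAKE-KEY' | wc -l | tr -d ' ')"

echo "occurrences in working tree: $in_tree"
echo "commits touching the key in history: $in_history"
```

The working tree reports no occurrence, while the history search still returns the commits that introduced and removed the key. Anyone with read access to the repository can recover the value from those commits.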

How to Detect Secret Credentials in Your Repositories?

From a developer perspective, you want a tool that offers:

A command-line interface (CLI)
Historical scans
Pre-commit hooks
Continuous Integration support (we will use GitHub Actions as an example)

From a security analyst perspective, you would want:

A single pane of glass to monitor in one place dozens, hundreds or even thousands of repos spread over multiple SCM systems (GitHub, but also GitLab, Bitbucket or Azure Repos)
Generic detection to catch less obvious secrets, such as a JSON Web Token
Developer-driven remediation
The possibility to run the app on-premises
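To give an idea of what "generic detection" means, here is a deliberately naive sketch of a pattern-based check for JSON Web Tokens. It is not how GitGuardian's detection engine works; the pattern, the function name `find_jwt_candidates`, and the sample token are assumptions made for illustration. It relies on the fact that a JWT is three base64url segments separated by dots, and that a header starting with `{"alg"` or `{"typ"` encodes to a string beginning with `eyJ`.

```shell
# Naive JWT pattern: three dot-separated base64url segments, the first starting with "eyJ".
JWT_PATTERN='eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+'

# find_jwt_candidates FILE... : print file:line:match for every JWT-like string.
find_jwt_candidates() {
    grep -EHno "$JWT_PATTERN" "$@"
}

# Demo on a temporary file containing a fake, unsigned token.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
debug = true
token = "eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJkZW1vIn0.ZmFrZS1zaWduYXR1cmU"
EOF

matches="$(find_jwt_candidates "$sample" | wc -l | tr -d ' ')"
echo "JWT-like strings found: $matches"
```

A real detector has to go much further than a single regular expression (entropy checks, context, validation), which is why generic detection is worth looking for in a tool rather than rebuilding by hand.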

Let’s have a quick look at the GitGuardian solution.

GitGuardian Dashboard

GitGuardian monitors a set of shared repositories called the Perimeter, which is VCS-agnostic: no matter whether the repositories are hosted on GitHub, GitLab, Bitbucket or Azure Repos, they are all visible from a single place.

This single pane of glass allows you to check all your projects’ status rapidly. You can also re-initiate a full scan on selected repositories.

Then you have the incident view, which lets you filter incidents by status, source, severity, and other tags.

Detected secrets are always grouped by incidents: the same API key hard-coded in multiple files would appear as a single incident. GitGuardian has some powerful features to accelerate the triage of incidents.

First, to ensure high-precision alerts, GitGuardian checks the validity of secrets with non-intrusive API calls. Though not always possible (the interface clearly indicates when it is not), this check tells you whether the leaked secret is still valid, and therefore how urgent the alert is.

Second, GitGuardian uses rule sets to automatically assign a severity to each incident. These severity rules are invaluable for prioritizing the remediation work. By following GitGuardian’s best practices on prioritization, investigation and remediation, you should be able to decide what needs to be acted on immediately and what is less urgent.

Real-time detection with GitHub integration

Another important step in setting things up is integrating GitGuardian with your VCS for continuous monitoring.

We will focus on GitHub here, but feel free to check our documentation to integrate with GitLab, Bitbucket, or Azure DevOps.

GitGuardian integrates natively with GitHub via a GitHub app that you can install on your personal GitHub repositories and on the repositories of your GitHub organizations. Once set up, secrets scanning is fully integrated with each GitHub pull request through Check Runs.

This allows the individual developer to get notified when an incident is detected by GitGuardian, directly in the GitHub interface.

One important thing here is that the check will alert the developer before the commits are merged. This limits the incident to their branch and gives them a chance to fix it easily. As a result, secrets-free collaborative branches can be used for QA, staging, and production environments.

It is also essential to understand that scans are conducted on each commit within a pull request, not just on the final state reviewed in the pull request.

This deep scanning helps uncover cases where one commit adds a secret and a later commit in the same pull request removes it. This is a very common case, and one that a review of the final diff could not have identified.
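The difference between reviewing the final state and scanning each commit can be sketched with plain git. The example below is an illustration under stated assumptions, not GitGuardian's implementation: the file name, the variable names, and the fake token are hypothetical. It simulates a pull request made of two commits, one that adds a token and one that removes it.

```shell
# Throwaway repository standing in for a project with an open pull request.
repo="$(mktemp -d)"
cd "$repo" || exit 1
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"

echo 'print("hello")' > app.py
git add app.py
git commit -q -m "Initial commit"
base="$(git rev-parse HEAD)"   # the commit the pull request is based on

# Pull request, commit 1: a made-up token slips in along with real work.
printf '%s\n' 'TOKEN = "FAKE-TOKEN-abcdef"' 'print("calling API")' >> app.py
git commit -q -a -m "Call the API"

# Pull request, commit 2: the author notices and removes the token line.
grep -v 'FAKE-TOKEN' app.py > app.py.tmp && mv app.py.tmp app.py
git commit -q -a -m "Read token from environment"

# What a reviewer sees: the diff between the base and the final state.
final_hits="$(git diff "$base" HEAD | grep -c 'FAKE-TOKEN')"
# What a per-commit scan sees: the patch of every commit in the range.
percommit_hits="$(git log -p "$base"..HEAD | grep -c 'FAKE-TOKEN')"

echo "lines mentioning the token in the final diff: $final_hits"
echo "lines mentioning the token across per-commit patches: $percommit_hits"
```

The final diff contains no trace of the token, while the per-commit patches still do. ggshield offers a commit-range scan for this purpose; check its documentation for the exact invocation.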

Pre-commit Hook

The best way to protect a team’s or an organization’s sensitive credentials is to use pre-commit hooks. They are like a security seatbelt for developers.

A pre-commit hook is a short (or long!) snippet of code that is run, as you guessed, before anything is committed in git.

For our use case, this is actually the best moment to scan for credentials and secrets, since once one has created a git commit, it can be annoying to have to rewrite the history.

Pre-commit hooks, by contrast, are easy to set up globally on a developer’s machine and are a “set & forget” security mechanism.

How? For a single repository, you simply need to have the following in its .git/hooks/pre-commit file:

bash
#!/bin/bash
ggshield secret scan pre-commit
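One detail that is easy to miss: git only runs a hook file that is executable. The sketch below writes the hook shown above into a repository and sets the executable bit. The function name `install_ggshield_hook` and the demo repository are hypothetical, and the function refuses to overwrite an existing hook, which is a design choice of this example rather than anything required by git or ggshield.

```shell
# install_ggshield_hook REPO_DIR : write the pre-commit hook and make it executable.
install_ggshield_hook() {
    hook="$1/.git/hooks/pre-commit"
    mkdir -p "$1/.git/hooks"
    if [ -e "$hook" ]; then
        echo "refusing to overwrite existing hook: $hook" >&2
        return 1
    fi
    printf '%s\n' '#!/bin/bash' 'ggshield secret scan pre-commit' > "$hook"
    chmod +x "$hook"   # git skips hooks that are not executable
}

# Demo on a fresh repository.
demo="$(mktemp -d)"
git init -q "$demo"
install_ggshield_hook "$demo" && echo "hook installed in $demo"
```

For a machine-wide setup, git’s core.hooksPath setting can point every repository at one shared hooks directory. ggshield also ships an install command intended to set the hook up for you; refer to its documentation for the available modes.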

GitHub Action

Scanning for secrets in the CI is really your last line of defense. Imagine someone decided to bypass security checks and a credential is about to be merged with the main branch.

To counter this, we set up a GitGuardian action on our repository, so that we can at least catch these secrets during pull request checks. This is what a simple GitGuardian action check looks like. You can put this in .github/workflows/gitguardian.yml.

yaml
name: GitGuardian scan
on: [push, pull_request]
jobs:
  scanning:
    name: GitGuardian scan
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v2
        with:
          fetch-depth: 0 # fetch all history so multiple commits can be scanned
      - name: GitGuardian scan
        uses: GitGuardian/gg-shield-action@master
        env:
          GITHUB_PUSH_BEFORE_SHA: ${{ github.event.before }}
          GITHUB_PUSH_BASE_SHA: ${{ github.event.base }}
          GITHUB_PULL_BASE_SHA: ${{ github.event.pull_request.base.sha }}
          GITHUB_DEFAULT_BRANCH: ${{ github.event.repository.default_branch }}
          GITGUARDIAN_API_KEY: ${{ secrets.GITGUARDIAN_API_KEY }}

Make sure to create a GitGuardian personal access token in your account first. Then, store it as an encrypted secret named GITGUARDIAN_API_KEY in your GitHub repository.

Going Further: Developer-Driven Remediation

If you made it this far, congratulations! You can be sure that any secret committed to this repository would break the pipeline and be reported in the dashboard, along with all the other past incidents.

You can also configure real-time alerting and notifications to push incident alerts on your perimeter to the channel of your choice (Slack, Discord, Jira, etc.). Even if you are rolling your own tools, it’s fairly easy to integrate GitGuardian alerts thanks to event-based custom webhooks that any custom web service can consume.

Along with other features, these integrations are here to help pull all parties closer to the remediation process. Developers need to acknowledge the threats and consequences of secrets sprawl, and picture how collaborating with security teams can strengthen the overall security posture – without compromising speed and productivity.

The auto-healing playbook allows automatic incident-sharing with the involved developer to collect feedback more quickly or to allow them to resolve or ignore the incident.

You can read more about how to assign incidents, collaborate, and organize the cleaning of your repositories’ leaked secrets.

Try it Yourself!

You are now aware of how easily secrets can be leaked. Unlike runtime vulnerabilities, leaked secrets can persist in old commits and represent a real threat. That’s why using a secrets detector in your DevSecOps workflows is a must-have for code security.

This awareness is an essential first step toward building a culture of shared responsibility between security, operations, and developers for preventing production issues, keeping pipelines running, and remediating issues as soon as possible.

GitGuardian is free for individual developers, open-source projects, and teams of fewer than 25 members.

Install GitGuardian and start monitoring your repositories today.
