“Public GitHub is often a blind spot in the security team’s perimeter”

Jérémy Thomas, Co-founder and CEO of GitGuardian, talks about the exposure of secrets within public repositories on GitHub and how this threat is evolving year on year.


GitHub and the community surrounding it have changed the way the world uses and builds open source components and software. At present, more than 50 million developers use GitHub; 60 million repositories are created in a single year, with over two billion contributions. With such a vast body of data publicly available, a great deal of sensitive data is also unknowingly or accidentally pushed to the platform, namely secrets such as API keys, credentials, and other digital authentication strings. Attackers can use these secrets to gain access to infrastructure, systems, and personally identifiable information (PII).

Quantifying the problems that arise from public GitHub remains a mammoth task. To evaluate them, CISO MAG interviewed Jérémy Thomas, Co-founder and CEO of GitGuardian. Thomas is an engineer and an entrepreneur. A graduate of Ecole Centrale in Paris, he first worked in finance and then began his entrepreneurial journey by founding Quantiops, a consulting company specializing in the analysis of large amounts of data, followed by GitGuardian in 2017.

In this interview, Thomas talks about the exposure of secrets within public repositories on GitHub and how this threat is evolving year on year. He also talks about the responsibility of CISOs to ensure their developers do not accidentally leak secrets, intellectual property, or PII, and the best practices that need to be established.

Edited excerpts of the interview follow:

Recently, an unknown actor compromised the official PHP Git repository and pushed backdoored code under the guise of a minor edit. Such incidents are common. With organizations often taking most of their code from Git repositories, don’t you think this is a major cybersecurity issue?

Indeed, leveraging open source dependencies comes with both risks and opportunities for organizations. On one side, organizations don’t have to reinvent the wheel and can reduce their time-to-market by easily importing external open source software into their codebase in the form of dependencies. If carefully chosen (not all open source code is equal), these dependencies are battle-tested at unprecedented scales, scales that only open source can allow. There are 50 million developers on GitHub. These developers collaborate publicly, write code, test code, solve bugs, and deploy code in various environments. The more environments a piece of code is deployed in, the more it is tested, the more eyeballs are on it, and the safer it is. However, some dependencies do contain vulnerabilities (as all software does). In such cases, the huge scale of open source can become a downside, because the vulnerable code is instantly deployed in so many environments. Hence the need for Software Composition Analysis tools that give visibility into the components imported into the codebase, their versions, and their vulnerability status, so that they can be patched quickly when a vulnerability is discovered.
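
To make the idea of Software Composition Analysis concrete, the core check can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular SCA product works: the package names, the `Advisory` record, and the plain dotted-number version format are all hypothetical, and a real tool would draw on a maintained vulnerability database and handle full version-range semantics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advisory:
    """Hypothetical advisory: `package` is vulnerable in versions below `fixed_in`."""
    package: str
    fixed_in: str


def parse_version(text):
    # Assumes simple dotted numeric versions such as "1.2.5".
    return tuple(int(part) for part in text.split("."))


def find_vulnerable(dependencies, advisories):
    """Return (name, installed, fixed_in) for every dependency older than its fix."""
    findings = []
    for name, installed in dependencies.items():
        for advisory in advisories:
            if advisory.package != name:
                continue
            if parse_version(installed) < parse_version(advisory.fixed_in):
                findings.append((name, installed, advisory.fixed_in))
    return findings


# Hypothetical inventory and advisory feed, for illustration only.
dependencies = {"examplelib": "1.2.0", "otherlib": "2.0.0"}
advisories = [Advisory("examplelib", "1.2.5")]

for name, installed, fixed_in in find_vulnerable(dependencies, advisories):
    print(f"{name} {installed} is vulnerable; upgrade to {fixed_in} or later")
```

The point of the sketch is the inventory itself: a vulnerable component can only be patched quickly if the organization already knows which versions it has imported.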

PHP is thought to underpin almost 80% of websites. This includes all WordPress sites, which are built on PHP. With malicious actors pushing backdoors for remote code execution (RCE), as in the incident mentioned earlier, do you think this can result in a much larger attack surface? How does GitGuardian come into this picture?

Backdoors are a particularly serious cybersecurity issue, especially if they remain undetected for extended periods. This is only compounded when these backdoors are added to technology which, as you say, underpins many websites and web applications. Another recently discovered backdoor was in Codecov’s continuous integration (CI) tool. This backdoor went undetected for months and allowed the attackers to steal sensitive information from users’ CI environments.

Many think of security as building a wall around your assets and infrastructure, the core idea being to keep intruders out. While this wall is important, there are multiple ways an attacker can get past it; a backdoor, as we just discussed, is one example. It may be a backdoor in your application, in the technology underpinning your application, as in the PHP example, or in your application environment, as was the case with Codecov. This means we need a shift in how we approach security, one that considers what happens when the walls are breached. Solutions can help ensure sensitive information is not exposed to attackers even in the event of an intrusion. For example, GitGuardian’s Internal Monitoring solution scans for sensitive data within internal Version Control Systems and alerts users in real time if any is discovered. This means that even if an attacker gains access to these internal systems via a backdoor or any other method, we can prevent them from using secrets to move laterally into other systems.
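
As a rough illustration of what scanning code for secrets involves, the Python sketch below checks text line by line against a few regular expressions. It is a minimal sketch and not GitGuardian’s detection method: the three patterns, the `PATTERNS` table, and the `scan_text` helper are assumptions chosen for the example, and production detectors typically combine many more patterns with further checks to limit false positives.

```python
import re

# Illustrative patterns only; real detectors cover far more secret types.
PATTERNS = {
    # Common shape of an AWS access key ID: "AKIA" plus 16 uppercase letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # Header line of a PEM-encoded private key.
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    # A quoted literal of 8+ characters assigned to a name ending in "password".
    "hardcoded_password": re.compile(r"(?i)password\s*=\s*['\"][^'\"]{8,}['\"]"),
}


def scan_text(text):
    """Return a list of (line_number, pattern_name) for each suspected secret."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, name))
    return findings


# Hypothetical file contents with obviously fake values.
sample = 'db_password = "not-a-real-password"\nprint("hello")\n'
for line_number, name in scan_text(sample):
    print(f"line {line_number}: possible {name}")
```

Running a check like this on every commit, rather than only on the current state of a repository, matters because a secret removed in a later commit remains in the version history.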

According to a GitGuardian study, there has been a 20% year-on-year increase in the number of secrets – such as application programming interface (API) keys, private keys, certificates, usernames, and passwords – discovered in public repositories. What measures must CISOs take to ensure that their organization’s data is not among these secrets?

CISOs are starting to realize that even if their company has limited official activity on public GitHub, their developers most certainly use the platform regularly. The difficulty for security teams is that public GitHub is often a blind spot in their security perimeter. It is difficult for organizations to identify developers’ public activity on personal repositories without a proper solution. Our report indicates that many corporate credentials are found on developers’ personal repositories, where CISOs have no visibility and no authority to enforce any kind of preventive security measures. On top of this, most organizations underestimate the number of secrets exposed within their internal repositories, even when they have deployed secrets management solutions. As code is a very leaky asset, widely accessible within the organization, it is critical that CISOs secure their Software Development Lifecycle by implementing efficient secrets detection…

This interview first appeared in the June 2021 issue of CISO MAG.