Blackbeard's Ransom: Ransomware of Source Code is a Real Threat
DISCLAIMER: This article describes pure research. We do NOT support using the results or ideas of this research for illegal, immoral, or unethical goals.
Question: Can source code stored in an online service be ransomed away from the owner?
We want to share a recent story that we hope will help some of you. For more than 25 years, we have practiced cyber defense in the belief that if we understand how offensive cyber-attacks work and how attackers think, we can help defend others from them. Recently, the question stated above was posed to us in a technical conversation. After thinking about it carefully, we answered by quoting 'The Simpsons': "Short answer, 'yes with an if'. Long answer, 'no with a but'".
We want to provide an anonymized example. Consider organization X. This is an organization we have privately advised in the past. They once told us that they do not prioritize penetration tests or security assessments because they are primarily a consulting organization with no public points of presence, and they only produce documents and source code stored externally in GitHub. We did not challenge this assertion at the time, but while researching this post we wondered: are they as invulnerable as they believe?
No, they had a problem: their source code is at risk from an insider threat.
We recall that at the time, organization X maintained more than one hundred repositories in source code control from more than fifty different developers, many of whom had moved on to new jobs by the time we were advising them. Many of these repositories would have been labeled as important or critical but were only accessed by one (1) or two (2) people and even then, only occasionally. Thankfully, this organization used a decent permissions structure for controlling who could see which repositories in the web interface, but we are all but certain the permissions were out of date with respect to departing employees and contractors.
Describe the Nightmare
Let's describe the nightmare scenario for organization X:
- A developer or sysadmin decides they are underpaid or that they are no longer valued by organization X.
- They ask for compensation adjustments, change of work location, hours, or other items to improve their work/life experience, and are subsequently denied.
- They decide to resign but not before hurting the organization as much as possible.
- They leverage poor or weak permissions on their organization's GitHub account to ransom all projects away from the actual owner.
- In a two-step act, the ransomware also deletes the commit history, preventing anyone from rewinding to the repository state just before the ransom.
Now let's be clear:
- As specified in Congressional Research Service Report R46932, executing any such ransomware action with intent to cause financial harm to a victim can be investigated by the FBI and prosecuted under the Computer Fraud and Abuse Act (CFAA).
- For a person to successfully perform a ransom action, they must have substantial or total access to the repositories of an organization.
- Any active project that is 'checked-out' could overwrite the ransomed contents when pushed.
- This article is about prototype malware. We use GitHub only as an example, and even then only against a completely fictitious organization. We love GitHub and use it extensively.
- Unlike every single article we read preparing for this post, we are not selling you anything. We want to help you.
The Ransomware
We are going to describe how we were able to quickly construct a prototype product that leverages GitHub access to ransom repositories away from the owner organization. Professional ethics prohibit us from releasing the full copy of the source code.
When we were given the challenge to perform a ransomware-style attack on an organization in GitHub, we first started thinking of an iterative process whereby we would pull each repo, encrypt the files with a key, and push the result back up. We did a similar rolling encryption of AWS EC2 instances when we wrote the blog post on Sylphs in the Cloud. Then we realized that, in true engineer fashion, we were overcomplicating the issue and overshooting the problem. The only real characteristic of ransomware is making a user's data inaccessible, with the ability to restore it upon a transaction. The data did not need to be encrypted at all, merely removed from the organization's repositories. This simplified the issue.
On the other hand, Git by its nature keeps a change log, so it would not be enough to delete the files and push this as a commit. We could, however, simply clone a repository, delete the remote, create a brand-new repository with the same name, and push up a single README.md file containing the ransom information. To do this we needed two Python libraries: pygit2, which provides bindings to the libgit2 C library, and PyGithub, a third-party client for the GitHub REST API whose functionality overlaps with GitHub's own 'gh' CLI tool. The former allows for manipulation of local Git repos (think 'git add', 'git commit', etc.), and the latter allows for creation and deletion of remote repositories. PyGithub requires a developer API token to authenticate to the upstream GitHub account. To their credit, GitHub has been pushing for more granular tokens that do not carry a wide array of permissions, but having worked as penetration testers in a past life, we can assert with confidence that many organizations do not follow a least-privilege model.
The pattern becomes the following:
- Authenticate to GitHub.
- Create an object containing all repositories in the organization.
- Clone each repository.
- Delete the remote for the repository.
- Create a new repository with the same name, or a name indicating the stolen source. We used a random slug appended to the end just to avoid any potential collisions as the loop ran.
- Create and push a README, presumably with a bitcoin wallet ID or similar.
This leaves all the valuable source code safely on your local host, while bypassing any sort of backups or git reset functionality to restore previous versions.
A demonstration in screenshots follows. For clarity again, we use a 100% fake GitHub organization we reserve for this sort of testing. We named the tool volksfrei, after Hans Gruber's terrorist group from Die Hard. The broader ransomware framework we have under development is named BlackBeard.
Our demonstration starts with screenshots showing several repositories living on GitHub.com for our fictitious organization:
Next, we see one of the repositories in greater detail from the same fictitious organization:
Now imagine the nightmare scenario has occurred. We run our BlackBeard prototype ransomware against the previous example repository, holding its contents until we are paid, as shown in the next image:
To repeat what was stated earlier, this execution requires a GitHub access token and knowledge of the specific organization name. Without these two (2) items the ransomware act would fail. Finally, let's see what the resulting repository looks like in its ransomed state:
All the prior files from the first image are gone and replaced with a message designed to make it clear that this repository has been altered. Since this is a prototype, we skip the obligatory demonstration of the restoration of the original repository data.
So What?
Now let's turn our attention to the "so what". It is only natural to think: "You proved this is possible but unlikely, so why should we care?"
Neutralizing Git History
First, we address Git history. If you are not aware, Git maintains a log of the changes made to a repository and tracks which user made them. In addition, when you clone a repository, you pull the binary copies of every change made since the first commit, which allows you to rewind or fast-forward the repository to any point in time recorded in the Git history log. While it is not the main point of this blog post, a quick look at the Git binary data stored for each repository is shown in the following image:
As you can see in the previous image, the entire collection of history is found underneath the directory:
<REPOSITORY>/.git/objects
Each file is a separate zlib-compressed data object. For those not faint of heart, this is described in detail at the following URL:
https://git-scm.com/book/en/v2/Git-Internals-Git-Objects
In short, this URL gently explains these files with the following phrase: "Git is a content-addressable filesystem. Great. What does that mean? It means that at the core of Git is a simple key-value data store."
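To make the "key-value store" idea concrete, here is a minimal Python sketch of how a loose object can be built and read back. It assumes the default SHA-1 object format and loose (unpacked) objects only; the function names are ours, chosen for illustration, and are not part of Git.

```python
import hashlib
import zlib


def encode_loose_object(obj_type: str, content: bytes) -> tuple[str, bytes]:
    """Return (object id, compressed bytes) for a loose Git object.

    Git stores "<type> <size>\\0<content>", names the object by the SHA-1
    of that byte string, and zlib-compresses it on disk.
    """
    store = f"{obj_type} {len(content)}".encode() + b"\x00" + content
    object_id = hashlib.sha1(store).hexdigest()
    return object_id, zlib.compress(store)


def decode_loose_object(raw: bytes) -> tuple[str, int, bytes]:
    """Decompress a loose object and split it into (type, size, content)."""
    store = zlib.decompress(raw)
    header, _, content = store.partition(b"\x00")
    obj_type, size = header.decode().split(" ")
    return obj_type, int(size), content


if __name__ == "__main__":
    object_id, raw = encode_loose_object("blob", b"what is up, doc?")
    # On disk this would live at .git/objects/<first 2 hex chars>/<remaining 38>
    print(object_id[:2], object_id[2:])
    print(decode_loose_object(raw))
```

To inspect a real object, pass the bytes of any file under <REPOSITORY>/.git/objects/ to decode_loose_object. Objects that Git has already moved into pack files use a different layout and are not handled by this sketch.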
Now, how can you increase the damage of ransomware like this? If you can erase the Git history in all existing copies of the projects that live on other developer workstations, you can work towards preventing recovery of the source code. A Git hook could be used to erase all the history stored for the projects you are targeting for ransom. The following hook samples are placed in the .git/hooks directory of every Git repository when it is created or cloned:
Each of these files is a shell script that is run during the git phase(s) described by the filename. If you want to enable that hook you can rename the file to remove the .sample suffix. Consider the following simple pre-commit hook.
If you store the previous BASH script example in a file called <REPOSITORY>/.git/hooks/pre-commit, the next time you run a git commit, the entire history of a repository soon to be ransomed will be erased. Using a non-zero return code means Git will abandon the commit before completion. We chose this merely as an example. It would not be difficult to adapt this simple example to erase all files in the repository, but you would have to change the return code.
Truthfully, it is not easy to get this file into all copies of the repository, as the .git directory is excluded from source code control and must be set up using external mechanisms. As the attacker, you can't commit a file to Git targeting this directory and hope that it is propagated to all copies of the source code. Instead, we frequently use a Makefile to automate setup of a development environment. You could alter the Makefile to install the Git hook from an Internet resource, but then you must determine how to force users to run the Makefile target again so that it grabs the commit hook. We reviewed the following articles discussing mechanisms for deploying Git hooks:
- https://medium.com/@rishpandey/simplest-auto-deployment-using-git-hooks-4cd6d98e0fc6
- https://github.com/bahmutov/pre-git
- https://stackoverflow.com/questions/40156102/git-hook-automatic-installation
We will certainly think more on this topic as time goes on and will post updates if we discover any scary ways to push malicious git hooks to repositories.
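On the defensive side, a developer can check whether any hooks are actually enabled in a local clone. The following is a minimal sketch, assuming hooks live in the default .git/hooks directory (a repository configured with core.hooksPath would need that path instead). The function name active_hooks is ours. It treats every file without the .sample suffix as enabled and does not check the executable bit.

```python
from pathlib import Path


def active_hooks(repo_root: str) -> list[str]:
    """List hook files in .git/hooks that are not disabled .sample templates."""
    hooks_dir = Path(repo_root) / ".git" / "hooks"
    if not hooks_dir.is_dir():
        return []
    return sorted(
        entry.name
        for entry in hooks_dir.iterdir()
        if entry.is_file() and entry.suffix != ".sample"
    )


if __name__ == "__main__":
    # Report enabled hooks for the repository in the current directory.
    for name in active_hooks("."):
        print(f"enabled hook: {name}")
```

Any hook reported here that you did not install yourself deserves a close read before your next commit.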
GitHub Backup
The actual point of this blog post is to implore you to please, please implement a backup strategy for any cloud service such as GitHub and to not rely on the service provider completely. Secondary to this, we want to describe how you might be able to point the finger at the culprit if this does happen to you. GitHub takes it upon themselves to provide the following link for describing the backup problem:
https://docs.github.com/en/repositories/archiving-a-github-repository/backing-up-a-repository
Let us summarize this for you: You are on your own! If you do not address backup explicitly and a situation such as this were to occur, you would be in trouble. In our searching, we also found the following blog post:
https://blog.gitguardian.com/the-ultimate-guide-to-github-backups/
In this post, Greg Bak, the product development manager for GitProtect.io, provides very good detail on GitHub backup practices. We aren't selling GitHub backup, so please read this URL for more information.
Backup using GitHub Command Line Interface
Recently, we realized that we can use the GitHub Command Line Interface (CLI) to perform a backup with only a few lines of BASH syntax. First, you need to install the CLI, and the following instructions work effectively:
https://github.com/cli/cli#installation
Once you have the CLI installed, you must log in to GitHub to generate an authorization token. This is described at the following URL: https://cli.github.com/manual/gh_auth_login
The next screenshots show the login process on macOS:
In this example, we have already authenticated using the CLI, so the second message is unique to our computer, but the next steps will look like the following screenshots:
Once you press <ENTER>, your default web browser will open and look like the following screenshot from Firefox:
If you enter the proper one-time code your browser should update like the next screenshot:
Once you have a successful login and authorization token, the following SHELL syntax will clone all repositories associated with the organization:
You can adapt this into a cron job on Linux so that it runs repeatedly.
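For readers who prefer Python, the same idea (cloning every repository in an organization with the GitHub CLI) can be sketched as below. This is not the script we used. It assumes gh is installed and already authenticated, and the organization name example-org and the backups directory are placeholders. Each run writes mirror clones into a new date-stamped directory, so a remote that has since been ransomed cannot overwrite an earlier snapshot.

```python
import json
import subprocess
from datetime import date
from pathlib import Path


def list_org_repos(org: str, limit: int = 1000) -> list[str]:
    """Return 'owner/name' for each repository gh can see in the organization."""
    result = subprocess.run(
        ["gh", "repo", "list", org, "--limit", str(limit), "--json", "nameWithOwner"],
        check=True,
        capture_output=True,
        text=True,
    )
    return [entry["nameWithOwner"] for entry in json.loads(result.stdout)]


def snapshot_target(name_with_owner: str, snapshot_dir: Path) -> Path:
    """Directory that will hold the mirror clone of one repository."""
    repo_name = name_with_owner.split("/", 1)[1]
    return snapshot_dir / f"{repo_name}.git"


def clone_command(name_with_owner: str, snapshot_dir: Path) -> list[str]:
    """Build the gh command; flags after '--' are passed through to git clone."""
    target = snapshot_target(name_with_owner, snapshot_dir)
    return ["gh", "repo", "clone", name_with_owner, str(target), "--", "--mirror"]


def backup_org(org: str, dest_root: Path) -> None:
    """Mirror-clone every repository into dest_root/<today's date>/."""
    snapshot_dir = dest_root / date.today().isoformat()
    snapshot_dir.mkdir(parents=True, exist_ok=True)
    for name_with_owner in list_org_repos(org):
        if snapshot_target(name_with_owner, snapshot_dir).exists():
            continue  # already captured in today's snapshot
        subprocess.run(clone_command(name_with_owner, snapshot_dir), check=True)


if __name__ == "__main__":
    # Placeholder organization and destination; replace with your own.
    backup_org("example-org", Path("backups"))
```

A mirror clone keeps all branches and tags. Dated snapshots cost disk space, so you would likely want to prune old ones on a schedule that suits your retention needs.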
GitHub Audit Log
The principal method GitHub offers to provide accountability for who might ransom your repositories is the "Audit Log". This provides a useful record of all interactions for an organization in GitHub. Quite honestly, this method is woefully under-emphasized. Here is the rather dry documentation link for those brave enough to read:
In short, the steps to reach the GitHub audit log for your organization are:
- Log in to GitHub.
- Load your Organization Page.
- Load the Settings page for your Organization.
- Scroll down the menu on the left to "Archive".
- Click on "Logs" to expand drop-down.
- Click on "Audit Log".
The resulting page that loads shows the audit log for your organization. This is a record of the actions that registered users performed on repositories owned by your organization. You can browse the live log or export to comma separated value (CSV). If you download this archive to CSV and decompress to load in Excel, you will find there are one hundred ninety-four (194) fields tracked for each action while only a tiny fraction of these fields is shown in the live view as you can see in the following image:
As you can see from the previous image, the actor (user account) and focused repository for each event can be clearly seen in the live view but what we find striking is that the location where the action originated is not shown. If you export this content, there surely must be more detail, right?
Perhaps unironically, the answer is again, "Yes and No".
GitHub Audit CSV
Exporting this content to comma separated value (CSV) format is very easy. In the previous image you can click on the green export button in the top-right of the image and click on the "CSV" option from the drop-down as shown in the following image:
The resulting CSV file that downloads will have one hundred ninety-four (194) columns of which the following are useful for this conversation:
@timestamp
action
actor
actor_id
actor_location.country_code
created_at
operation_type
repo
user
user_agent
user_id
These columns are self-explanatory, and we won't waste your time trying to describe what you should do with this data as you may only ever read this content if you really, really are in trouble. What we find somewhat surprising is that the column actor_ip is present in this export but was not filled in on the exports that we did in testing when writing this post.
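If you do end up reading an export, a short script can narrow it to the events that matter for this scenario. The sketch below uses the column names listed above. The action names repo.destroy and repo.create are what we understand GitHub to record for repository deletion and creation, so treat them as assumptions and verify them against your own export. The sample rows are invented for illustration only.

```python
import csv
import io

# Assumed action names for repository deletion and creation.
WATCHED_ACTIONS = frozenset({"repo.destroy", "repo.create"})

# Columns from the export that are useful when tracing a ransom event.
WANTED_COLUMNS = ("@timestamp", "action", "actor", "repo", "actor_location.country_code")


def suspicious_events(csv_text: str, watched=WATCHED_ACTIONS) -> list[dict]:
    """Return the watched events from an audit log export, trimmed to key columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {column: row.get(column, "") for column in WANTED_COLUMNS}
        for row in reader
        if row.get("action") in watched
    ]


# Invented sample rows; a real export has many more columns.
SAMPLE = (
    "@timestamp,action,actor,repo,actor_location.country_code\n"
    "1700000000000,repo.destroy,mallory,example-org/widget,US\n"
    "1700000001000,git.clone,alice,example-org/widget,US\n"
    "1700000002000,repo.create,mallory,example-org/widget-x7f3,US\n"
)

if __name__ == "__main__":
    for event in suspicious_events(SAMPLE):
        print(event["@timestamp"], event["actor"], event["action"], event["repo"])
```

For a real file, read it with open(path, newline="", encoding="utf-8-sig") and pass the contents to suspicious_events. A deletion followed shortly by a creation of the same repository name by the same actor is the pattern described earlier in this post.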
During our research we found that the GitHub CLI does not (at this time) allow you to export the audit log, so we recommend you adopt a routine process of reviewing this data through a CSV export. For complete coverage, we will note that at the conclusion of our research we discovered the following GitHub project:
https://github.com/github/ghec-audit-log-cli
We have not yet tested this project, but it appears to be a JavaScript client that downloads the audit log for your organization for the purpose of sending this data to an external destination.
Summary
To summarize, it is possible to create ransomware for GitHub, but our research shows this comes with several serious qualifications:
- The ransomware author must have a token for authentication to GitHub.
- The ransomed project could be restored by overwriting the new repository with a separate copy that has already been checked out.
- Deliberate ransomware actions could be investigated by the FBI and prosecuted under the Computer Fraud and Abuse Act.
- The ransom actions will leave indicators in the GitHub audit log, but these actions will not indicate a source IP address.
Finally, backup of your organization repositories stored in GitHub is simple using the GitHub CLI. If nothing else, we STRONGLY recommend implementing backup of your GitHub data if you are a subscriber. Whether you use a commercial service or handle this on your own, backup, and backup NOW, PLEASE.