[This is a Guest Diary by Noah Pack, an ISC intern as part of the SANS.edu BACS program]
As a college freshman taking my first computer science class, I wanted to create a personal project that would test my abilities and maybe have some sort of return. I saw a video online of someone who created a python script that emailed colleges asking for free swag to be shipped to him. I liked the idea and adapted it. I created a script that emailed companies and asked for free swag, knowing that most conferences that year had been canceled due to the COVID-19 pandemic. I wrote my script, made a new email account for the script to use, created a list of ten companies it would email, and it worked flawlessly. To celebrate my achievement, I uploaded my code to GitHub. The next thing I knew, I was getting login attempts to the email address I set up for my script to use. I had hardcoded the email address and password into my code, and my computer science class didn’t teach us safe programming practices.
My situation had no ill consequences, but it could have if I had used my actual email for the script or if my project was bigger and I had used AWS or another cloud provider and hardcoded those credentials. In a later class I did learn how to safely pass credentials to my scripts without fear of leaking them on GitHub, but leaked credentials remained on my mind. This led me to the question “What happens when you leak your AWS API keys?”
In this article, I will share some research, resources, and real-world data related to leaked AWS API keys. I won’t get into scenarios where credentials are stored properly but stolen via a vulnerability, only where a developer or other AWS user hardcodes their credentials into a GitHub repository or a website.
Canary Tokens
To collect data, I used Canary Tokens. Canary Tokens are honeypots that, when opened or used, send an alert to their owner informing them of a breach. Canary Tokens can be a word document, QR code, AWS API key, or many other file types to suit various needs. The AWS API key token is a file that looks like this:
(This is an actual Canary Token)
It looks exactly the same as how a developer would store this information and contains everything needed to make a successful connection to the AWS API. Nothing beyond that works to prevent an attacker from actually abusing these keys.
I left a Canary Token on a decently trafficked e-commerce website I help maintain, hardcoded into the website’s source. I also posted one on my GitHub account in an obvious repository with a name that any researcher would recognize as a test.
All the Canary Tokens I created were used.
Research
The token I added to the source code of a website took three days before an attacker tested it, generating this alert:
The traffic came from a Proton VPN user. It is likely that they were using a crawler to scan websites for credentials or vulnerabilities but could have been testing the collected credentials manually. This was the only time this canary was tested. Because the person who tested it was using a VPN, it would be nearly impossible to find exactly where this attacker is from. The IP used to test this key has been seen doing other attacks, but because of the anonymity associated with a shared VPN IP address, it would not be possible to tie this to any other reported incidents involving this IP.
The user-agent information that the Canary Token includes is very interesting. We know that the attacker is using a Python script to check if the credentials are valid with the Boto3 library. We also know the script is running on the Windows Subsystem for Linux. This information helped me to create a script [2] that tests AWS API keys to see if they are valid.
My data is not large enough to say definitively that if you hardcode credentials into your decently trafficked e-commerce website you will have a couple days to fix them before they are used. In this case too, a crawler may have picked up the keys much earlier, and they were not tested until days later.
The AWS API keys I posted to GitHub were tested much sooner. Within minutes, I was receiving email alerts like the one pictured below:
I soon became overwhelmed with alerts and turned them off to preserve my email inbox. The interesting difference with these attempts to use my canary was that they were almost all coming from what turned out to be one company.
Clearly, if you post your AWS credentials, they will be picked up and used by someone, whether it is a security company, researcher, or attacker. So, what can you do to resolve this problem if you find yourself in it? The first thing you should do is generate new AWS API keys and deactivate the ones you leaked. There is no way to undo posting credentials when things like the wayback machine exist. The best solution is to prevent this from happening in the first place.
Luckily, there are tools like GitGuardian [3], GitLeaks, TruffleHog [4], and RepoSupervisor that can be integrated into your Continuous Integration and Continuous Deployment (CICD) pipeline and scan for hardcoded credentials before the code goes into production. Some of those tools require subscriptions, like GitGuardian, while others, like truffleHog, are free and open source. I created a script that can verify if an AWS API key works; you can find it at the end of this article in my GitHub account. My reasoning for creating my own script was that many of these tools include features that would not be useful if your goal is only to verify whether the keys work, while some tools that can do this are made for exploiting that access. I wanted to create a simple script that anyone in IT could look at and understand so QA, junior developers, interns, and new analysts who find an AWS API key can quickly verify it without putting it into a tool they do not fully understand.
Why does this matter?
Hardcoding credentials happen more often than you might think. There are lots of new developers, and in my experience, secure coding practices are not taught to university students until the upper-level classes. Even then, experienced developers make mistakes, unintended files get committed, and code left in place to test can sometimes make its way to production. There is a reason that entire companies exist to scan for these credentials.
Conclusion
If you are writing code, do your best not to hardcode credentials; someone will find them. The allure of free swag may distract you, but remediation is more time-consuming than doing it the right way in the first place. Implementing tools in your CICD pipeline to scan for these mistakes is a great preventative measure, but it is not perfect. Use IAM permissions in AWS to limit each API key to only the permissions it needs.
[1] Canary Tokens: https://docs.canarytokens.org/guide/
[2] My Script: https://github.com/npackt/Simple-AWS-API-Key-tester
[3] Git guardian: https://www.gitguardian.com/
[4] TruffleHog: https://trufflesecurity.com/trufflehog
[5] More on AWS API keys: https://aws.amazon.com/what-is/api-key/
[6] https://www.sans.edu/cyber-security-programs/bachelors-degree/
-----------
Guy Bruneau IPSS Inc.
My Handler Page
Twitter: GuyBruneau
gbruneau at isc dot sans dot edu