
Characterizing secret leakage in public GitHub repositories

117 points| feross | 7 years ago |blog.acolyer.org | reply

36 comments

[+] dguo|7 years ago|reply
I accidentally published[1] my AWS secret key last year because I pushed an old project from college. At the time, I was very new to using source control and had little idea how to distinguish between what should and shouldn't be committed. I hope colleges and code boot camps go over that sort of info nowadays. The ratio of usefulness to learning effort seems exceptionally high.

[1]: https://www.dannyguo.com/blog/i-published-my-aws-secret-key-...

[+] aiddun|7 years ago|reply
My sophomore year of high school, I was trying to write a Discord (chat platform) bot for a server I shared with friends and unknowingly included the private key in a public repo I hoped to show them. A crawler written specifically to find Discord keys found the key and started spamming the server with images of very, very undesirable things from the far corners of the internet at a rate of hundreds per second. Needless to say, I learned my lesson the hard way.
[+] Liveanimalcams|7 years ago|reply
When I attended Hack Reactor, they did tell us not to push secret keys. However, since they didn't teach us git (they expected us to know it), many still pushed them. You would know because they'd get an email from some random company/person letting them know that their secret keys had been found and that they should enroll in/buy that company's services if they didn't know what they were doing. Luckily no one from my class got hosed, but others in the past had.
[+] rsmolinski|7 years ago|reply
Looks like a great methodology and good results. Looking forward to reading the paper because I've been working around the GitHub API restrictions for the same purpose.

Specifically, I'm building a SaaS (https://www.locktower.com/) for organizations (or security teams) looking to have a managed solution for detecting leaked secrets in GitHub/BitBucket/etc. I'm in the process of building an on-prem version as well. Overall, I really hope to help drive down the number of unresolved leaks that the authors found.

[+] 7ewis|7 years ago|reply
I wrote a tool that scans all the new commits to our Org for passwords/secrets.

Webhook > AWS API Gateway > Lambda

The Lambda uses the new(ish) Layers feature so it can use Git. I then use the truffleHog[0] library to scan for entropy/regexes inside the commit.

If something is detected, it posts to an SNS topic, which is currently subscribed to by another Lambda that posts an alert to my team and the Security team's Slack channel.

It then calls the GitHub API to make the repo private to limit the exposure.

[0] - https://github.com/dxa4481/truffleHog
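
A minimal Python sketch of the entropy/regex idea described above. This is not truffleHog's code, and the names (`scan_text`, `shannon_entropy`, `CANDIDATE`), the 20-character minimum, and the 4.5 threshold are illustrative assumptions. The only provider-specific pattern is the commonly documented shape of an AWS access key ID ("AKIA" plus 16 uppercase letters or digits).

```python
import math
import re

# Commonly documented shape of an AWS access key ID.
AWS_ACCESS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")
# Hypothetical candidate pattern: long runs of base64-style characters.
CANDIDATE = re.compile(r"[A-Za-z0-9+/=_-]{20,}")


def shannon_entropy(s):
    """Shannon entropy of s, in bits per character."""
    if not s:
        return 0.0
    counts = {}
    for ch in s:
        counts[ch] = counts.get(ch, 0) + 1
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def scan_text(text, threshold=4.5):
    """Return (line_number, reason, token) tuples for suspicious tokens.

    The threshold is an assumed value; tune it against your own repos.
    """
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for m in AWS_ACCESS_KEY_ID.finditer(line):
            findings.append((lineno, "aws-access-key-id", m.group()))
        for m in CANDIDATE.finditer(line):
            token = m.group()
            if AWS_ACCESS_KEY_ID.fullmatch(token):
                continue  # already reported by the regex check
            if shannon_entropy(token) >= threshold:
                findings.append((lineno, "high-entropy", token))
    return findings
```

In a pipeline like the one above, the Lambda would presumably feed the added lines of each commit to something like `scan_text` and publish to SNS when the result is non-empty. Entropy checks alone tend to produce false positives (hashes, minified assets), so pairing them with provider-specific regexes is a common compromise.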

[+] semi-extrinsic|7 years ago|reply
Why not have a client-side pre-commit hook that runs truffleHog and, if successful, generates some form of file indicating it was run, then have a server-side hook check for that file? This should be doable even with plain GitHub/etc., no?
[+] pry_or|7 years ago|reply
I assume you saw the note on truffleHog in the article? The paper found it to be rather inaccurate outside of the basics (mainly AWS keys). Hopefully the authors open source their stuff.
[+] novaleaf|7 years ago|reply
This highlights the difficulty of sharing secrets with your production code. How can you get secrets into production in a secure way?

Cloud providers have proprietary solutions, but those don't work on other providers (or your local dev env).

Rolling your own secrets server seems like an expensive centralized disaster waiting to happen.

It seems like putting a secret into source code is one of the least risky options. Just make sure it's not in a public git repo.

[+] torbjorn|7 years ago|reply
The article quotes someone making this claim:

> we discovered that even if commit histories are rewritten, secrets can still be recovered…. we discovered we could recover the full contents of deleted commits from GitHub with only the commit’s SHA-1 ID.

Do repo cleaning tools such as https://rtyley.github.io/bfg-repo-cleaner/ leave the original commit's SHA-1 ID intact?

[+] aflag|7 years ago|reply
Shouldn't GitHub put in some sort of warning for potential leaks when they happen?
[+] Pawamoy|7 years ago|reply
I once pushed a GitHub token in a public repository. They immediately revoked the token and notified me by email.
[+] LyndsySimon|7 years ago|reply
I believe they have it - I've gotten notified in the past when I committed secrets on purpose, for test applications. I'm not 100% sure they were from GitHub, but I think they were.
[+] herohamp|7 years ago|reply
Wait, do people not use .env files? I've aliased "gitinit" to make a .env file and a .gitignore that ignores .env, node_modules, etc., and then run git init.
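
A rough Python sketch of what such a "gitinit" helper might do. The commenter's actual alias is not shown, so the function name `scaffold`, the `run_git` flag, and the exact ignore entries are assumptions.

```python
import subprocess
from pathlib import Path

# Assumed ignore entries, based on the files named in the comment.
IGNORED = [".env", "node_modules/"]


def scaffold(project_dir, run_git=True):
    """Create .env and .gitignore in project_dir, then optionally run git init."""
    root = Path(project_dir)
    root.mkdir(parents=True, exist_ok=True)

    # Create an empty .env without clobbering an existing one.
    env_file = root / ".env"
    if not env_file.exists():
        env_file.touch()

    # Add any missing ignore entries, keeping whatever is already listed.
    gitignore = root / ".gitignore"
    existing = gitignore.read_text().splitlines() if gitignore.exists() else []
    lines = existing + [p for p in IGNORED if p not in existing]
    gitignore.write_text("\n".join(lines) + "\n")

    if run_git:
        # Requires git on PATH; raises CalledProcessError on failure.
        subprocess.run(["git", "init"], cwd=root, check=True)
    return lines
```

Writing the .gitignore before the first `git add` is the point: once .env has been committed, ignoring it later does not remove it from history.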
[+] scarface74|7 years ago|reply
There is no excuse to ever have AWS secret keys anywhere in your code or your settings.

If you are running locally, you should be using your own secret keys that are configured in your user directory with

  aws configure

If you are running on anything within AWS, you should be using a role attached to your EC2 instance or Lambda, and the SDK can retrieve your keys automatically.

Unfortunately, every single third party code sample on the internet has you including the secret keys in your code.

[+] alain_gilbert|7 years ago|reply
One must be careful if they use docker.

I have seen people doing this in their Dockerfile

    ADD . /src

This adds all the sources to the image, and if the image is published, it inadvertently makes public all the secrets that were in the .env file.

Personally, I like to keep all my secrets very far from my repository.

[+] 33degrees|7 years ago|reply
Sometimes people use .env files and then share them with other developers using a publicly accessible paste service like gist (I'm not kidding).
[+] bifrost|7 years ago|reply
If it's harder than putting something in a file, people usually don't use it...
[+] sbov|7 years ago|reply
Since there's no standard, presumably different people use different methods, or sometimes none at all. Beyond that, people could still make a mistake and put something in code that belongs in an env file.
[+] lallysingh|7 years ago|reply
The title's a great example of Pun? Description. The pun gets the attention, the description tells you why you should click.