Good on them. GitHub secrets cause a lot of problems. They will always create a better idiot but this idiot trap is long past due.
I also can’t wait until people base64 their creds to get past this. Explaining to someone that base64 isn’t encryption tends to be hard, so I imagine people will feel safe just base64-encoding their creds and checking them in.
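For anyone who ends up having that conversation: base64 is a reversible encoding with no key, so "protecting" a credential this way protects nothing. A minimal demonstration (the key below is AWS's documented example access key ID, not a live credential):

```python
import base64

# base64 is a reversible transformation: no key, no secrecy.
cred = "AKIAIOSFODNN7EXAMPLE"  # AWS's published example access key ID
encoded = base64.b64encode(cred.encode()).decode()
decoded = base64.b64decode(encoded).decode()

print(encoded)          # looks opaque to a human, but...
print(decoded == cred)  # ...decodes right back to the original
```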
base64 is far too much work. A new dev turned `"AKIAIOSFODNN7EXAMPLE"` into `"AK" + "IAIOSFODNN7EXAMPLE"` to make the security alert go away.
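Splitting the literal works against any scanner that matches the whole token as one string. A toy illustration of why, using a naive regex for AWS-style access key IDs (real scanners are more sophisticated than this):

```python
import re

# Naive scanner: match an intact AWS-style access key ID.
PATTERN = re.compile(r"AKIA[0-9A-Z]{16}")

intact = 'key = "AKIAIOSFODNN7EXAMPLE"'
split = 'key = "AK" + "IAIOSFODNN7EXAMPLE"'

print(bool(PATTERN.search(intact)))  # True: full token present in one piece
print(bool(PATTERN.search(split)))   # False: concatenation hides it
```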
Thankfully, the alert was sent to enough people it was caught by someone else, and the key was destroyed before someone outside could have fun with it.
I legitimately recently had to argue with a PM and his developers that a base64-encoded user ID isn’t considered a security best practice for API authentication. Even when I showed them how I can produce the “secret” myself, they kept arguing that I was wrong.
Does secret scanning also apply to public GitHub Action logs and Issues (or more generally, Checks logs)?
We found Action logs to be a much bigger threat now that many folks have learned not to embed secrets directly into the code and to use secret managers instead. But even then, the secrets retrieved in a step can be printed in plaintext if someone, for example, runs that step in debug mode.
Issues can also accidentally leak secrets via, for example, third-party code builders that print their output in an issue.
GitHub PM here. Right now we scan code, commit metadata, issues, and issue comments. We're expanding to other content types over time, with support for pull request bodies and comments coming in early 2023. Actions logs are on our list too, but will take a little longer.
(It's worth noting that any secrets in your Actions secret store will already be redacted in any Actions logs, so those won't leak there.)
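That redaction is conceptually just string substitution over known values. A rough sketch of the idea (not GitHub's actual implementation):

```python
def redact(line: str, secrets: list[str]) -> str:
    """Replace every occurrence of a known secret with a placeholder,
    the way a CI runner masks registered secret values in its logs."""
    for s in secrets:
        if s:  # never substitute on an empty string
            line = line.replace(s, "***")
    return line

known_secrets = ["hunter2"]
print(redact("connecting with password hunter2", known_secrets))
# connecting with password ***
```

The catch is that the runner can only mask values it knows about, which is exactly why secrets fetched at runtime from an external secret manager can still end up in logs, as noted upthread.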
Searching for creds can be tricky if they can't be readily distinguished from other text.
Can anyone think of a problem with generating customer API keys that have a known prefix that makes them more detectable?
For example, a key like "FooSecret.ZTNiMGM0NDI5OGZjMWMxNDlhZmJmNGM4OTk2ZmI5". I wouldn't think that'd open up any new attacks, but I'm no expert on the matter.
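One sketch of such a format, borrowing the hypothetical `FooSecret.` prefix from above and adding a short checksum (GitHub's own tokens embed a CRC32 checksum so scanners can cheaply reject look-alike strings offline); everything here is illustrative, not a spec:

```python
import secrets
import string
import zlib

ALPHABET = string.ascii_letters + string.digits  # base62
PREFIX = "FooSecret."  # hypothetical prefix from the comment above

def generate_key(entropy_chars: int = 30) -> str:
    """Random base62 body plus a 6-character base62-encoded CRC32
    checksum, so a scanner can validate candidates before alerting."""
    body = "".join(secrets.choice(ALPHABET) for _ in range(entropy_chars))
    crc = zlib.crc32(body.encode())
    chk = ""
    for _ in range(6):  # 62**6 > 2**32, so 6 digits always suffice
        crc, rem = divmod(crc, 62)
        chk += ALPHABET[rem]
    return PREFIX + body + chk

print(generate_key())
```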
GitHub PM here. We switched our own token format to something similar to the above in April of last year [1] and have been encouraging other service providers to do the same.
The big benefit of highly identifiable tokens is not just that we can alert on them, but that we can scan for them at pre-receive time and prevent them from leaking (by rejecting the push). We already have that functionality as part of GitHub Advanced Security, and are planning to make it available (for free) on public repos in 2023.
[1] https://github.blog/2021-04-05-behind-githubs-new-authentica...
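A pre-receive check for a highly identifiable prefix can be as simple as a pattern match over the pushed content. A minimal sketch, reusing the hypothetical `FooSecret.` prefix from upthread:

```python
import re

# Reject pushes whose content contains a token with a known,
# highly identifiable prefix -- the pre-receive idea described above.
TOKEN = re.compile(r"FooSecret\.[A-Za-z0-9]{20,}")

def check_push(pushed_text: str) -> bool:
    """Return True if the push is allowed (no token found)."""
    return TOKEN.search(pushed_text) is None

print(check_push("const url = 'https://api.example.com'"))  # True
print(check_push(
    "const key = 'FooSecret.ZTNiMGM0NDI5OGZjMWMxNDlhZmJmNGM4OTk2ZmI5'"
))  # False: push would be rejected
```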
I argued for something like that previously on HN, like adding a domain prefix `myservice.com_secretkeyhere`. This would allow automatic discovery of the reporting/revocation endpoint from the key. Then someone pointed out that you could just use an actual URL as your secret key and have that be the URL you visit to revoke it, and I think that is genius. (edit: sudhirj is the genius: https://news.ycombinator.com/item?id=28299624)
Next service I make that has API keys, I will make them look like `https://secret.myservice.org/ZTNiMGM0NDI5OGZjMWM`. POSTing to that URL revokes the key, a GET shows a form explaining what it is and a button to revoke the key.
One issue is that some email services mangle URLs specifically, and that would be bad for keys.
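The key-generation side of that design is a few lines to sketch. Everything here is hypothetical: the domain comes from the comment above, and the token length is arbitrary:

```python
import secrets

BASE = "https://secret.myservice.org/"  # hypothetical domain from above

def generate_key() -> str:
    """The API key *is* its own revocation URL."""
    return BASE + secrets.token_urlsafe(24)  # 24 random bytes -> 32 chars

key = generate_key()
print(key)

# If the key ever leaks, anyone who finds it can neutralize it:
#   GET  <key>  -> a page explaining what this is, with a revoke button
#   POST <key>  -> revokes the key immediately
```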
Another approach is to identify likely secrets based on the entropy of the strings. I used a tool that did precisely this once and found some, but I can't find it anymore.
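The entropy heuristic is straightforward to implement: compute Shannon entropy per candidate string and flag outliers (tools like truffleHog have used exactly this approach). A rough sketch; any real scanner would tokenize files and tune a threshold:

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character of the string."""
    counts = Counter(s)
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

# English-ish identifiers score low; random-looking tokens score high.
for s in ["configuration", "ZTNiMGM0NDI5OGZjMWMxNDlhZmJmNGM4OTk2ZmI5"]:
    print(f"{shannon_entropy(s):.2f}  {s}")
```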
We use this at our company. Wildly successful at finding tokens for most of the usual suspects. If they're including secret blocking, it will prevent someone from doing the dumb as well.
One question/behavior - if the secret scanner found something and folks resolved it -> secret blocking is enabled -> and a developer does the dumb again, should it block the PR with the new secret? Wondering if we might have something misconfigured as I have seen new secrets get added after we enabled blocking.
Hello! I am an engineer on the Secret Scanning team, thanks for the kind words!
- "push protection" (as we call it) isn't available for free, and isn't part of this rollout.
- For folks who do pay, the flow may be: a developer tries to push, they bypass the block for that secret, and are now able to push. From there, an alert is created which they can resolve (maybe it is "used in tests").
- If the _same_ secret is pushed again, we won't block that push. We also won't create a new alert; however, a new location may be recorded within the resolved alert (if you click into it).
If you're seeing a push _not_ get blocked, what's most likely is that we just don't support that specific token as part of push protection (we have some much-needed improvements to make to the docs to clarify this). Since push protection sits in front of the developer, we try not to annoy them with high-false-positive tokens. There are a few other possibilities though, so it's hard to say.
Don't let the perfect be the enemy of the good - this will start out in a limited detection of course, but can easily be improved with other hashes and scanning over time.
What's the workflow where people accidentally commit secrets to their git repos? I'm not sure I've ever done it; do we count the "base_secret" type of things web frameworks put in their default app templates? Certainly the more common mistake I make is forgetting to add new files, so it's mildly amusing that other people apparently have the opposite problem.
People keep adding whole tmp/ directories or output binaries to repositories; accidents like this just happen. It's not a workflow, but here's a scenario: people trying to run some test against a real service to debug a weird issue will temporarily put credentials in the code and forget to remove them before committing the fix. Sure, someone will probably notice it in code review, but it's too late if the repo was public.
Lots of ways this happens, either accidentally or intentionally. I think the most common accident is due to forgetting to add a file to .gitignore and then using `git add .`. Intentionally, folks just embed secrets into code out of convenience while developing, and either never think twice, or forget to remove them before commit & push (which becomes kind of an accident).
Mostly accidental. You're working on a prototype, so to just get started you use a const at the top of your code with an API key. This gets checked in, and you then realise "oh shit", but by this point it's within git's tree. It can still be removed, but it's not a straightforward process.
Are a lot of "private-ish" repos (perhaps something that supports a real company) using GitHub and not self-hosting? I presume this is the case, but it seems dumb.
They have a list of supported secrets they can find via automated scans:
https://docs.github.com/en/code-security/secret-scanning/sec...
Really easy to just grep through something looking for that prefix
Also what about 2FA secrets like TOTP/WebAuthn?
Why not just self-host?