Exploiting CI / CD Pipelines for fun and profit

theamk|1 year ago

The real problem is keeping sensitive information in the .git directory. Like WTH would you put your password, in plaintext, in some general ini file? (or into a source file for that matter)?

When I see things like those, they look so wrong to me. But sadly that reaction is apparently uncommon nowadays: not only random bloggers, but even my coworkers see nothing wrong with putting passwords or tokens into general config or source code files. "It's just for a quick test," they say, and then they forget about it and the password gets checked in, or shown in a screenshare meeting.

Maybe that's why there are so many security problems in industry? /rant

(For those curious: for git specifically, use ssh with key auth. If for some reason you don't want this, you can set up git's credential helper to use your OS key store, or use plaintext git-credentials, or even just good old .netrc. For source code, something like `PASSWORD = open("/home/user/.config/mypass.txt").read().strip()` is barely longer than hardcoding it, but 100% eliminates chance of accidental secret checkin or upload)

OtherShrezzing|1 year ago

>The real problem is keeping sensitive information in the .git directory. Like WTH would you put your password, in plaintext, in some general ini file? (or into a source file for that matter)?

People & organisations tend to follow the path of least resistance. If it's easier to put passwords into a plaintext config file than not, passwords will invariably end up in plaintext config files in some projects. `PASSWORD = open("/home/user/.config/mypass.txt").read().strip()` will work right up until a colleague without `"/home/user/.config/mypass.txt"` attempts to run the project - at which point it'll be replaced with `PASSWORD = "the_password123"`.

The only pragmatic solution is to make it easier to handle passwords securely than to handle them insecurely.
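
One way to lower that resistance, as a minimal sketch: a small helper that looks for the secret in an environment variable first, then in a per-user file outside the repository, and otherwise fails with a message that says how to fix it. The names `load_secret`, the `MYAPP_` prefix, and the `~/.config/myapp/` directory are assumptions made up for illustration.

```python
import os
from pathlib import Path


def load_secret(name, env=None, config_dir=None):
    """Return a secret from the environment or a per-user file, never from the repo."""
    env = os.environ if env is None else env
    if config_dir is None:
        config_dir = Path.home() / ".config" / "myapp"  # hypothetical location
    config_dir = Path(config_dir)

    # 1. Environment variable, e.g. MYAPP_PASSWORD (convenient in CI and containers).
    value = env.get(f"MYAPP_{name.upper()}")
    if value:
        return value.strip()

    # 2. Per-user file outside the working tree, e.g. ~/.config/myapp/password.
    secret_file = config_dir / name.lower()
    if secret_file.is_file():
        return secret_file.read_text().strip()

    # 3. Fail with instructions, so the quickest fix is not a hardcoded string.
    raise RuntimeError(
        f"Secret {name!r} not found. Set MYAPP_{name.upper()} or create {secret_file}."
    )
```

A colleague without the file then gets an error telling them which variable or file to create, rather than a traceback that invites pasting the password into the source.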

voiceblue|1 year ago

> 100% eliminates chance of accidental secret checkin or upload

You've never worked with humans, have you?

throwaway8481|1 year ago

At my work, I often see these 2 things throughout the codebase:

- an identifier for an environment variable that gives us the Azure Key Vault scope (another identifier)
- an identifier for the token to pull from that scope

Then the scope name and token name are used to pull the token secret value using the secrets api.

I am not experienced in how this is "supposed to be". Would it make sense to make both of these environment variables so neither identifier appears directly in code? (scope name and token name)

Thank you for the insight :)
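
A rough sketch of what the question describes, with both identifiers taken from the environment. The variable names `SECRET_SCOPE` and `SECRET_TOKEN_NAME` and the `get_secret(scope, key)` callable are hypothetical stand-ins; the actual secrets API is not named in the comment.

```python
import os


def read_token(get_secret, env=None):
    """Resolve the scope and token names from the environment, then fetch the secret."""
    env = os.environ if env is None else env
    scope = env["SECRET_SCOPE"]       # hypothetical variable holding the scope name
    key = env["SECRET_TOKEN_NAME"]    # hypothetical variable holding the token name
    # get_secret stands in for whatever secrets API the codebase actually uses.
    return get_secret(scope, key)
```

Both names are identifiers rather than the secret value itself, so moving them into the environment mainly lets different deployments point at different scopes without a code change.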

lloeki|1 year ago

> The real problem is keeping sensitive information in the .git directory. Like WTH would you put your password, in plaintext, in some general ini file? (or into a source file for that matter)?

Sometimes it's not "you":

https://github.com/actions/checkout/issues/485

jbverschoor|1 year ago

You know, Sun fixed this almost 30 years ago in the J2EE standard.

brightball|1 year ago

Gitleaks is the easiest way to deal with this. I make a point to include it in my build pipelines and have dev teams set it up as a precommit hook to prevent the problem.
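
For illustration only, the core idea behind a pre-commit secret scan fits in a few lines of Python. This is not how Gitleaks is implemented or configured; the two patterns and the `find_secrets` name are assumptions, and a real scanner covers far more cases.

```python
import re

# Two deliberately simple, illustrative patterns; real scanners ship many more.
PATTERNS = {
    "hardcoded assignment": re.compile(
        r"""(?i)\b(password|passwd|secret|token)\b\s*[:=]\s*["'][^"']+["']"""
    ),
    "credentials in URL": re.compile(r"[a-z][a-z0-9+.-]*://[^/\s:@]+:[^/\s@]+@"),
}


def find_secrets(text):
    """Return (line_number, pattern_name) for each line that looks like a leaked secret."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((number, name))
    return hits
```

A hook built on this would refuse the commit whenever `find_secrets` returns anything for the staged content.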

paperplatter|1 year ago

Maybe they're using Google App Engine and don't want to deal with storing config secrets the right way for it.

TiddoLangerak|1 year ago

Am I missing something, or does the step in

> Pushing Malicious Changes to the Pipeline

mean that they already have full access to the repository in the first place? Normally I wouldn't expect an attacker to be able to push to master (or any branch for that matter). Without that, the exploit won't work. And with that access, there are so many other exploits one can do that it's really no longer about CI/CD vulns.

kolme|1 year ago

From TFA:

> A surprising number of websites still expose their .git directories to the public. When scanning for such exposures on a target, I noticed that the .git folder was publicly accessible.

[...]

> With access to .git/config, I found credentials, which opened the door to further exploitation. I could just clone the entire repository using the URL found inside the config file.

The URL with credentials was found in the `.git/config` file, defined in the `[remote "origin"]` section. This is how they gained full access to the repo.
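
To check your own repositories for the same mistake, a minimal sketch along these lines may help. It scans the text of a `.git/config` file for remotes whose URL embeds a password or token; the function name and the simple line-based parsing are assumptions, not a full git config parser.

```python
import re
from urllib.parse import urlsplit

SECTION_RE = re.compile(r'^\[remote "(?P<name>[^"]+)"\]$')


def remotes_with_credentials(config_text):
    """Return the names of remotes whose URL embeds a password or token."""
    found = []
    current = None  # name of the [remote "..."] section being read, if any
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            match = SECTION_RE.match(line)
            current = match.group("name") if match else None
        elif current and line.startswith("url"):
            key, _, value = line.partition("=")
            # urlsplit exposes the user:password part of https://user:password@host/...
            if key.strip() == "url" and urlsplit(value.strip()).password:
                found.append(current)
    return found
```

An SSH-style remote such as `git@github.com:team/site.git` has no password component and is not flagged.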

mukesh610|1 year ago

You're right, there are other avenues of exploitation. This particular approach was interesting to me because it is easily automatable (scour the internet for exposed credentials, clone the repo and detect if Pipelines are being used, profit).

Other exploits might need more targeted steps to achieve. For example, embedding a malware into the source code might require language / framework fingerprinting.

the_gipsy|1 year ago

I am not sure, but it sounds like the pipeline runs for any pushed branch/PR, and it runs the pipeline configuration of that branch (so you can run a pipeline configuration without having to merge to master).

I'm not saying that this is fine, just that access to master is probably protected, but it's still vulnerable.

ponytech|1 year ago

edit: Credentials for modifying the pipeline were found in the .git/config file

ransom1538|1 year ago

100% of the script kiddies moved to .env and .git. My logs are filled with `GET /.env` requests returning 404. The kiddies focus mainly on those two; I think they give the best return for their effort. The .env file is super trendy and used across languages now.

yabones|1 year ago

A super easy way to protect yourself is to just block any IP that hits `/.env` or `/wp-admin`. I've taken this as far as to ban any IP that hits my default vhost (hitting the IP instead of the actual hostname) more than ten times, and I get about 99% fewer scanners and spam as a result.

https://nbailey.ca/post/block-scanners/
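
The rule described above can be sketched as a small function over parsed access-log entries. This is an illustration of the logic only, not the setup from the linked post; the tuple format, the names, and the extra `/.git` prefix are assumptions.

```python
from collections import Counter

PROBE_PREFIXES = ("/.env", "/.git", "/wp-admin")  # /.git added here as an assumption
DEFAULT_VHOST_LIMIT = 10


def ips_to_block(requests):
    """requests: iterable of (ip, path, hit_default_vhost) tuples from access logs."""
    blocked = set()
    default_hits = Counter()
    for ip, path, hit_default_vhost in requests:
        if path.startswith(PROBE_PREFIXES):
            blocked.add(ip)  # a single probe for a sensitive path is enough
        if hit_default_vhost:
            default_hits[ip] += 1
            if default_hits[ip] > DEFAULT_VHOST_LIMIT:
                blocked.add(ip)  # more than ten hits on the bare IP
    return blocked
```

The resulting set would then be fed to whatever firewall or ban mechanism is already in place.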

sebazzz|1 year ago

I don’t understand why some authentication mechanisms, like GitHub tokens, don’t use a refresh token mechanism. The token could be handed in once to create a refresh token, and with that, expiring access tokens could be requested. Now we (as users) have to bother with constantly expiring long-term tokens, not knowing in which of the hundreds of places we might have put them.

ghxst|1 year ago

Does this actually occur with real or high-value targets? I'm genuinely curious, as I can only envision this happening with smaller side projects. However, I'd be interested to hear any stories of encountering this in the wild. It's a good reminder to stay mindful of what might accidentally be exposed.

mlhpdx|1 year ago

I’ve never deployed a .git folder and wonder what systems/approaches lead to such a thing. How does that happen?

mukesh610|1 year ago

It's pretty common in systems where the final output to be deployed is the same as the root of the source tree. More often than not, lazy developers tend to just git clone the repo and point their web server's document root to the cloned source folder. In default configurations, .git is happily served to anyone asking for it.

This seems to be automatically mitigated in systems which might have a "build" / "compilation" phase, because for the application to work in the first place, you only need the compiled output to be deployed. For instance, Apache Tomcat.
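
For deployments that do serve the source tree directly, a pre-deploy check can at least flag the problem. A minimal sketch, assuming the document root is a local directory; the `exposed_entries` name and the list of sensitive names are illustrative.

```python
from pathlib import Path

SENSITIVE_NAMES = {".git", ".env"}  # illustrative; extend as needed


def exposed_entries(document_root):
    """Return sorted relative paths under document_root that should not be web-served."""
    root = Path(document_root)
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")  # pathlib's "*" also matches dotfiles
        if path.name in SENSITIVE_NAMES
    )
```

A deploy script could abort when the returned list is non-empty. Denying those paths in the web server configuration is the complementary fix.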

JamesSwift|1 year ago

It's easy to miss that you need to duplicate your .gitignore into your .dockerignore

nurettin|1 year ago

For scripting languages, I sometimes clone a readonly repo to prod. Then I use `git pull && systemctl --user restart srv` to deploy. For compiled programs, I do it with rsync or docker pull.

jalict|1 year ago

index.html and the rest of the project are rooted in, well.. the root. And a simple push deploys your root git repo to your /var/www or whatever.

E.g. you use GitHub Pages and write homebrew HTML or whatever, rather than using a static generator that outputs to a subfolder.

faangguyindia|1 year ago

It's because the .dockerignore file does not take this into consideration.

ram_rattle|1 year ago

Naive question: wouldn't something like GitHub secret scanning catch this?

mukesh610|1 year ago

No, in my YAML example, you could see that there were no credentials directly hard-coded into the pipeline. The credentials are configured separately, and the Pipelines are free to use them to do whatever actions they want.

This is how all major players in the market recommend you set up your CI pipeline. The problem here lies in implicit trust of the pipeline configuration which is stored along with the code.

zarkenfrood|1 year ago

It was deployed using a Bitbucket pipeline which does have a secret scanner available. However the scanner would need to be manually configured to be fully effective.
