1. Hardcoded credentials are a plague. You should consider tagging all of your secrets so that they're easier to scan for. GitHub automatically scans for secrets, which is great.
2. Jenkins is particularly bad for security. I've seen it owned a million and one times.
3. Containers are overused as a security boundary and footguns like `--privileged` completely eliminate any boundary.
4. Environment variables are a dangerous place to store secrets - they're global to the process and therefore easy to leak. I've thought about this a lot lately, especially after log4j. I think one pattern that may help is clearing the variables after you've loaded them into memory.
Another pattern I've considered is encrypting the variables. A lot of the time what you have is something like this:
Secret Store -> Control Plane Agent -> Container -> Process
Where secrets flow from left to right. The control plane agent and container have full access to the credentials and they're "plaintext" in the Process's environment.
In theory you should be able to pin the secrets to that process with a key. During your CD phase you would embed a private key into the process's binary (or a file on the container) and then tell your Secret Manager to use the associated public key to transmit the secrets. The process could decrypt those secrets with its private key but they're E2E encrypted across any hops between the Secret Store and Process and they can't be leaked without explicitly decrypting them first.
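The clearing pattern from point 4 can be sketched in a few lines of Python. This is only an illustration - the variable name and value here are made up, and note the caveat in the comments about `/proc`:

```python
import os

def load_secret(name: str) -> str:
    """Read a secret from the environment, then drop it so that code
    running later (and any child process we spawn) can no longer see it."""
    value = os.environ[name]
    del os.environ[name]   # calls unsetenv(); gone for us and for children
    return value

# Stand-in for a CI-injected secret; name and value are hypothetical.
os.environ["DB_PASSWORD"] = "hunter2"
secret = load_secret("DB_PASSWORD")

# Caveat: on Linux, /proc/<pid>/environ reflects the environment as it
# was at exec() time, so this narrows the exposure window rather than
# fully scrubbing the secret from the machine.
print("DB_PASSWORD" in os.environ)   # False
```

This doesn't make leaks impossible, but it shrinks the window in which a log4j-style "dump the environment" gadget can find anything.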
> Environment variables are a dangerous place to store secrets - they're global to the process and therefore easy to leak.
The two real problems with environment variables are:
1. Environment variables are traditionally readable by any other process in the system. There are settings on modern kernels to turn this off, but how do you know that you will always run on such a system?
2. Environment variables are inherited by all subprocesses by default, unless you either unset them after you fork() (but before you exec()), or take special care to use execve() (or a similar function) to provide your own custom-made environment for the new process.
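The second problem is easy to demonstrate. A minimal Python sketch (the token name and value are made up), where passing an explicit `env` to the child is the execve()-style fix:

```python
import os
import subprocess
import sys

os.environ["API_TOKEN"] = "t0ps3cret"   # hypothetical secret for illustration

# A tiny child process that reports whether it can see the secret.
check = [sys.executable, "-c", "import os; print('API_TOKEN' in os.environ)"]

# Default behaviour: the child inherits our whole environment, secret included.
inherited = subprocess.run(check, capture_output=True, text=True).stdout.strip()

# The execve()-style fix: hand the child an explicit, minimal environment.
minimal_env = {"PATH": os.environ.get("PATH", "")}
scrubbed = subprocess.run(
    check, capture_output=True, text=True, env=minimal_env
).stdout.strip()

print(inherited, scrubbed)   # True False
```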
I think your agent idea is good. I'd want to add in a way for the agent to detect when a key is used twice (to catch other processes using the key) or when the code you wrote didn't get the key directly (to catch proxies), and then a way to kill or suspend the process for review. Would be pretty sweet.
Interesting to learn that credentials in environment variables are frowned upon. I mean, makes sense if your threat model includes people pushing malicious code to CI, but aren't you more or less done for at that point anyway? If "legitimate" code can do a certain thing, then malicious code can do too. I guess you'll want to limit the blast radius, but drawing these boundaries seems like a nightmare for everyone...
> makes sense if your threat model includes people pushing malicious code to CI, but aren't you more or less done for at that point anyway? If "legitimate" code can do a certain thing, then malicious code can do too.
The answer is very much 'it depends'. For one thing, developers can run whatever code in CI before it's been reviewed. I could just nab the env vars and post them wherever. If there are no sensitive env vars for me to nab and you have enforced code review, then I need a co-conspirator, and my change is probably going to leave a lot more of a paper trail.
Another risk is accidental disclosure - I have on at least two occasions accidentally logged sensitive environment variables in our CI environment. Now your threat model is not just a malicious developer pushing code - it's a developer making a mistake, plus anyone with read access to the CI system.
I don't know about your org, but at my job, the set of people who have read access to CI is a lot larger than the set who can push code, which is again a lot larger than the set of people who can merge code without a reviewer signing off.
> but drawing these boundaries seems like a nightmare for everyone...
As someone currently struggling with how to draw them, yup.
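The accidental-logging risk mentioned above is why CI systems let you register secrets for masking. A minimal sketch of that idea in Python - the logger name and the "hunter2" value are made up, and real CI masking is more thorough than this:

```python
import io
import logging
import re

class MaskingFilter(logging.Filter):
    """Redact known secret values before they reach any log line."""

    def __init__(self, secrets):
        super().__init__()
        self._pattern = re.compile("|".join(re.escape(s) for s in secrets))

    def filter(self, record):
        # Merge any %-style args first, then redact the rendered message.
        record.msg = self._pattern.sub("***", record.getMessage())
        record.args = ()
        return True

buf = io.StringIO()
logger = logging.getLogger("ci-job")
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler(buf))
logger.addFilter(MaskingFilter(["hunter2"]))

logger.info("connecting with password=%s", "hunter2")
print(buf.getvalue().strip())   # connecting with password=***
```

This only catches secrets you told it about, which is exactly why "every secret goes through the secret store" matters: unregistered secrets sail straight into the logs.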
The article is more specific than that: They shouldn't be shared with code run by people/jobs who shouldn't have access to it. I.e. don't have secrets used for deploys in the environment that runs automatically on every PR if deploys are gated behind review by a more limited list of users.
> I mean, makes sense if your threat model includes people pushing malicious code to CI, but aren't you more or less done for at that point anyway?
Maybe. Back in the old days if you had the commit bit your badge didn’t get you into the server room. I get the impression a lot of shops are effectively giving their devs root but in the cloud this time, which isn’t necessary.
We’ve been using Sysbox (https://github.com/nestybox/sysbox) for our Buildkite based CI/CD setup, allows docker-in-docker without privileged containers. Paired with careful IAM/STS design we’ve ended up with isolated job containers with their own IAM roles limited to least-privilege.
Never heard of Sysbox before. At a first glance, the comparison table in their GitHub repo and on their website[1] has a number of inaccuracies which makes me question the quality of their engineering:
— They claim that their solution has the same isolation level ("4 stars") as gVisor, unlike "standard containers", which are "2 stars" only (with Firecracker and Kubevirt being "5 stars"). This is very wrong - as far as I can tell, they use regular Linux namespaces with some light eBPF-based filesystem emulation, while the vast majority of syscalls are still handled by the host kernel. Sorry, but this is still "2 stars" and far away from the isolation guarantees provided by gVisor (fully emulating the kernel in userspace, which is at the same level or even better than Firecracker) and nowhere close to a VM.
— Somehow, regular VMs (Kubevirt) get a "speed" rating of only "2 stars" - worse than gVisor ("3 stars") and Firecracker ("4 stars"), even though Kubevirt and Firecracker rely on virtually the same virtualization technology. If anything, gVisor is the slowest but most efficient solution, while QEMU maintains some performance advantage over Firecracker[2]. These are basically random scores, and it's not a good first impression. If you do a detailed comparison like that, at least do a proper evaluation before giving your own product the best score!
— They claim that "standard containers" cannot run a full OS. This isn't true - while it's typically a bad idea, this works just fine with rootless podman and, more recently, rootless docker. Allowing this is the whole point of user namespaces, after all! Maybe their custom procfs does a better job of pretending to be a VM - but it's simply false that you can't do these things without. You can certainly run a full OS inside Kata/Firecracker, too, I've actually done that.
Nitpicking over rating scales aside, the claim that their solution offers large security improvements over any other solution with user namespaces isn't true and the whole thing seems very marketing-driven. The isolation offered by user namespaces is still very weak and not comparable to gVisor or Firecracker (both in production use by Google/AWS for untrusted workloads!). False marketing is a big red flag, especially for something as critical as a container runtime.
Anyone who wants unprivileged system containers might want to look into rootless docker or podman rather than this.
Many of these points are about running pipelines in privileged containers. Something I actually took extra time to resolve for my team. That's when I discovered kaniko first, and shortly after podman/buildah.
After that podman and buildah have gotten a lot of great reviews from people so I think they're awesome.
For an old time Unix sysadmin it just doesn't make sense to run something as root unless you absolutely have to.
Which also makes the client excuse in the article so strange, they had to run the container privileged to run static code analysis. wtf. Doesn't that just mean they run a tool against a binary artefact from a previous job? I fail to see how that requires privileges.
This is a great resource. I'd love to see more reports like it published. CI/CD pipelines often run with highly elevated permissions (access to source code, artifact repositories, and production environments), but they are traditionally neglected.
I suspect this is also an under-considered area even in organizations with lots of attention to security. So it would be good to also get more mindshare, as after we discovered some of our own CI/CD-related vulnerabilities[1], it feels like most approaches we looked at had similar problems, and it took a lot of research to find the rare solution that we could be confident in.
I wouldn't say they are traditionally neglected, precisely. CI/CD systems are often treated as a place where devs hold infinite power with developer convenience prioritized above all else. Developers, who are generally not security experts, often expect to wholly own their build and deployment processes.
I've seen few things get engineer pushback quite like trying to tell engineers that they need to rework how they build and deploy because someone outside their team said so. It's just dev, not production, so why should they be so paranoid about it? Sheesh, stop screwing up their perfectly good workflows...
Is it just my impression, or does security in Jenkins seem much more challenging and more time-consuming than in GitLab?
This post gives many examples where GitLab was attacked, so of course bad practices like privileged containers can lead to the compromise of a server independently of the technology used, but from my experience with Jenkins, I've seen passwords used in plaintext so many times, even in big companies.
Jenkins is security game over if you overlook a small crucial configuration option or if you install any plugin (and it's unusable without some plugins), as plugin development is a free-for-all and dependencies between plugins are many. We basically decided that one instance of Jenkins plus slaves was unfixable and unconfigurable to use securely across multiple teams with developers of differing trust levels (external contributors vs normal in-house devs) and started fresh with a different CI design.
Jenkins is a batteries excluded pattern in one of its worst possible incarnations.
Jenkins is basically a CI framework for trusted users only. Untrusted workloads must not have access to anything Jenkins.
I don’t really like either. Both have traditionally been bad and tied to on-prem legacy workloads: building for SVN apps or teams new to git. It’s usually a mess.
Jenkins was also affected by numerous Java serialization vulnerabilities. It also used to be that any worker could escalate to the main Jenkins server pretty much by design, not sure what the current situation is.
> but from my experience with Jenkins, I've seen using passwords in plaintext so many times, even in big companies
I reckon this has to do with how the CI tools are configured.
Everyone knows you shouldn't commit a secret to Git, so tools like GitLab CI which require all their config be in git naturally will see less of this specific issue.
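One way to catch slips before they reach Git is a pre-commit scan, which also illustrates the secret-tagging idea from point 1 upthread. This is a toy sketch: the `SEC_TAG_` convention is made up, and real scanners (GitHub secret scanning, gitleaks, etc.) carry far larger pattern sets:

```python
import re

# Patterns to flag. "SEC_TAG_" is a hypothetical in-house convention:
# tagging every secret with a known prefix makes them trivially greppable.
SECRET_PATTERNS = [
    re.compile(r"SEC_TAG_[A-Za-z0-9]+"),                      # our tagged secrets
    re.compile(r"AKIA[0-9A-Z]{16}"),                          # AWS access key ID shape
    re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),  # PEM private keys
]

def scan(text: str) -> list[str]:
    """Return a description of every line that looks like it holds a secret."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), 1):
        for pat in SECRET_PATTERNS:
            if pat.search(line):
                hits.append(f"line {lineno}: matches {pat.pattern}")
    return hits

sample = "password = 'SEC_TAG_dbmain'\naws_key = 'AKIAABCDEFGHIJKLMNOP'"
print(scan(sample))   # flags both lines
```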
You would think by now we would have better credential methods. I still see usernames and passwords for system credentials. I see tokens created by three-legged auth flows; I don’t get how that is an improvement. The problem is that most deployed code doesn’t have just one credential but a dozen. Multiply that by several environments and you get security fatigue and apathy.
The company I currently do contract work for decided it would be best to have one large team in Azure DevOps and subdivide all teams in repositories etc. with prefixes and homegrown "Governer" scripts, which are enforced in all pipelines.
Global find on some terms like "key", "password" etc were great fun. It really showed most people, our team included, struggled with getting the pipeline to work at all. Let alone doing it in a secure manner.
This is a 50k+ employee financial institution. I am honestly surprised these kinds of attacks are not much more widespread.
A recurring theme is that they obtain secret credentials from a service which needs to verify credentials, and then turn around and use those to impersonate the entity providing those credentials. For example getting Jenkins to run some Groovy discovers credentials Jenkins uses to verify who is accessing it, and then you can just use those credentials yourself.
To fix this - almost anywhere - stop using shared secrets. Every time you visit a (HTTPS) web site, you are provided with the credentials to verify its identity. But, you don't gain the ability to impersonate the site because they're not secret credentials, they're public. You can and should use this in a few places in typical CI / CD type infrastructure today, and we should be encouraging other services to enable it too ASAP.
In a few places they mention MFA. Again, most MFA involves secrets, for example TOTP Relying Parties need to know what code you should be typing in, so, they need the seed from which to generate that code, and attackers can steal that seed. WebAuthn doesn't involve secrets, so, attackers who steal WebAuthn credentials don't achieve anything. Unfortunately chances are you enabled one or more vulnerable credential types "just in case"...
A weakness of modern secret management is that it isn’t.
A secret value ought to be very carefully guarded even from the host machine itself.
.NET for example has SecureString, which is a good start — it can’t be accidentally printed or serialised insecurely. If it is serialised, then it is automatically encrypted by the host OS data protection API.
Windows even has TPM-hosted certificates! They’re essentially a smart card plugged into the motherboard.
A running app can use a TPM credential to sign requests but it can’t read or copy it.
These advancements are just completely ignored in the UNIX world, where everything is blindly copied into easily accessible locations in plain text…
> The credentials gave the NCC Group consultant access as a limited user to the Jenkins Master web login UI which was only accessible internally and not from the Internet. After a couple of clicks and looking around in the cluster they were able to switch to an administrator account.
These kinds of statements are giving major "draw the rest of the owl" vibes.
Ultimately most CI/CD setups are basically systems administrators with privileged access to everything, network connected and running 24/7. It's pretty dangerous stuff.
I don't have an answer, though, except maybe to keep the CI and CD in separate, isolated instances that require manual intervention to bridge the gap on a case-by-case basis. That doesn't scale very well.
I think in general we put too much logic into our CI/CD configurations.
There is an argument to be made for a minimalist CI/CD implementation that can handle task scheduling and dependencies, understands how to fetch and tag version control, count version numbers and not much else. Even extracting test result summaries, while handy, maybe should be handled another way.
For many of us, if CI is down you can't deploy anything to production, not even roll back to a previous build. Everything but the credentials should be under version control, and the right people should be able to fire off a one-liner from a runbook that has two to four sanity checked arguments in order to trigger a deployment.
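That runbook one-liner can be sketched as a small argument-checked trigger. Everything here is hypothetical - the service names, version format, and region default are placeholders for whatever an org actually deploys:

```python
import argparse
import re

# Hypothetical inventory; in practice this might come from version control.
KNOWN_SERVICES = {"api", "worker", "frontend"}
VERSION_RE = re.compile(r"^v\d+\.\d+\.\d+$")

def parse_deploy_args(argv: list[str]) -> argparse.Namespace:
    """Validate the two-to-four runbook arguments before anything happens."""
    p = argparse.ArgumentParser(description="trigger a deployment")
    p.add_argument("service", choices=sorted(KNOWN_SERVICES))
    p.add_argument("version", help="release tag, e.g. v1.4.2")
    p.add_argument("--region", default="us-east-1")
    args = p.parse_args(argv)
    if not VERSION_RE.match(args.version):
        p.error(f"version must look like v1.2.3, got {args.version!r}")
    return args

args = parse_deploy_args(["api", "v1.4.2"])
print(args.service, args.version, args.region)   # api v1.4.2 us-east-1
```

The point of the sanity checks is that the one-liner stays safe to run under pressure, when CI is down and someone is rolling back by hand.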
colek42 | 4 years ago
There is a really good article that explains a different way of securing these systems through sets of attestations.
https://grepory.substack.com/p/der-softwareherkunft-software...
[1]: https://www.nestybox.com
[2]: https://www.usenix.org/system/files/nsdi20-paper-agache.pdf
[1] - https://goteleport.com/blog/hack-via-pull-request/
contingencies | 4 years ago
... via https://github.com/globalcitizen/taoup
thomasmarcelis | 4 years ago
>>“Pretend you have compromised a developer’s laptop.”
Most companies will fail right here. Especially outside of the tech world, security hygiene on developers' laptops is very bad from what I have seen.
cerved | 4 years ago
cries in security
i_like_waiting | 4 years ago