> While running the exploit, CodeRabbit would still review our pull request and post a comment on the GitHub PR saying that it detected a critical security risk, yet the application would happily execute our code because it wouldn’t understand that this was actually running on their production system.
What a bizarre world we're living in, where computers can talk about how they're being hacked while it's happening.
Also, this is pretty worrisome:
> Being quick to respond and remediate, as the CodeRabbit team was, is a critical part of addressing vulnerabilities in modern, fast-moving environments. Other vendors we contacted never responded at all, and their products are still vulnerable. [emphasis mine]
Props to the CodeRabbit team, and, uh, watch yourself out there otherwise!
I cancelled my paid CodeRabbit subscription, because it always worries me when a post has to go viral on HN for a company to even acknowledge an issue occurred. Their blog has no mention of this vulnerability, and they don't have any new posts today either.
I understand mistakes happen, but lack of transparency when these happen makes them look bad.
Both articles were published today. It seems to me that the researchers and CodeRabbit agreed to publish on the same day. This is a common practice when the company decides to disclose at all (disclosure is not required unless customer data was leaked and there's evidence of that; they are choosing to disclose voluntarily here).
When the security researchers praise the response, it's a good sign tbh.
Most security bugs get fixed without any public notice. Unless there was a breach of customer information (and that can often be verified), there are typically no legal requirements, and there's no real benefit to doing it either. Why would you expect it to happen?
Yikes, this is a pretty bad vulnerability. It's good that they fixed it, but damning that it was ever a problem in the first place.
Rule #1 of building any cloud platform analyzing user code is that you must run analyzers in isolated environments. Even beyond analysis tools frequently allowing direct code injection through plugins, linters/analyzers/compilers are complex software artifacts with large surface areas for bugs. You should ~never assume it's safe to run a tool against arbitrary repos in a shared environment.
I also ran a code analysis platform, where we ran our own analyzer[1] against customer repos. Even though we developed the analyzer ourselves, and didn't include any access to environment variables or network requests, I still architected it so executions ran in a sandbox. It's the only safe way to analyze code.
[1] https://github.com/getgrit/gritql
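A minimal sketch of just the environment-scrubbing piece of that isolation, assuming a Node service shelling out to RuboCop; the paths, flags, and timeout are illustrative rather than anyone's actual setup, and a real sandbox adds process, filesystem, and network isolation on top:

```typescript
// Run a linter against untrusted code without handing it the parent's env.
// Illustrative only: a scrubbed env is one layer; real isolation (namespaces,
// seccomp, no network) belongs around it.
import { execFile } from "node:child_process";
import { mkdtemp } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";

async function lintUntrustedRepo(repoDir: string): Promise<string> {
  const workDir = await mkdtemp(join(tmpdir(), "lint-")); // throwaway working dir
  return new Promise<string>((resolve, reject) => {
    execFile(
      "rubocop",
      // --config /dev/null: ignore the repo's own .rubocop.yml, which can
      // `require:` arbitrary Ruby files.
      ["--config", "/dev/null", "--format", "json", repoDir],
      {
        cwd: workDir,
        // Allowlisted env only: no API keys, no DB credentials, no app private key.
        env: { PATH: "/usr/bin:/bin", HOME: workDir },
        timeout: 60_000, // untrusted input also means bounding runtime
      },
      // RuboCop exits non-zero when it finds offenses; the JSON on stdout is still the result.
      (err, stdout) => (err && !stdout ? reject(err) : resolve(stdout))
    );
  });
}
```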
This is a great read, but unfortunately it doesn't really surprise me; it was bound to happen given how blindly people add apps with wide permissions, and given GitHub's permissions model.
It amazes me how many people will install GitHub apps that have wide scopes, primarily write permissions to their repositories. Even with branch protection, people will often allow privileged access to their cloud from pull requests via GitHub Actions. To configure this properly, you need to change the GitHub OIDC audience, and that is not well documented.
When you enquire with the company that makes an app and ask them to provide a different app with less scope, disabling some features which require write access, they often have no interest whatsoever and don't understand the security concerns and potential implications.
I think GitHub needs to address this, in part by allowing more granular app access defined by the installer, but also by offering more granular permissions in general.
It is incredibly bad practice that their "become the GitHub App as you desire" keys-to-the-kingdom private key was just sitting in the environment variables. Anybody can get hacked, but that's just basic secrets management; that key doesn't have to be there. Github LITERALLY SAYS in their docs that storing it in an environment variable is a bad idea. Just day 1 stuff. https://docs.github.com/en/apps/creating-github-apps/authent...
If it’s not a secret that is used to sign something, then the secret has to get from the vault to the application at some point.
What mechanism are you suggesting where access to the production system doesn’t let you also access that secret?
Like I get in this specific case where you are running some untrusted code, that environment should have been isolated and these keys not passed in, but running untrusted code isn’t usually a common feature of most applications.
> On January 24, 2025, security researchers from Kudelski Security disclosed a vulnerability to us through our Vulnerability Disclosure Program (VDP). The researchers identified that Rubocop, one of our tools, was running outside our secure sandbox environment—a configuration that deviated from our standard security protocols.
Honestly, that last part sounds like a lie. Why would one task run in a drastically different architectural situation, and it happen to be the one exploited?
Yes, all the tools are fine and secure and sandboxed, just this one tool that was kind of randomly chosen by the security researcher because it is a tool that can execute Ruby code inside the environment - one could argue an especially dangerous tool to run - was not safe.
> Why would one task run in a drastically different architectural situation
Someone made a mistake. These things happen.
> and it happen to be the one exploited?
Why would the vulnerable service be the service that is exploited? It seems to me that's a far more likely scenario than the non-vulnerable service being exploited... no?
Oh my god. I haven't finished reading that yet, it became too much to comprehend. Too stressful to take in the scope. The part where he could have put malware into release files of 10s of thousands (or millions?) of open source tools/libraries/software. That could have been a worldwide catastrophe. And who knows what other similar vulnerabilities might still exist elsewhere.
I'm starting to think these 'Github Apps' are a bad idea. Even if CodeRabbit didn't have this vulnerability, what guarantee do we have that they will always be good actors? That their internal security measures will ensure that none of their employees may do any malicious things?
Taking care of private user data in a typical SaaS is one thing, but here you have the keys to make targeted supply chain attacks that could really wreak havoc.
I think that Security fuckups of this disastrous scale should get classified as "breaches" or "incidents" and be required to be publicly disclosed by the news media, in order to protect consumers.
Here is a tool with 7,000+ customers and access to 1 million code repositories, which was breached with an exploit a clever 11-year-old could have created. (edit: 1 million repos, not customers)
When the exploit is so simple, I find it likely that bots or Black Hats or APTs had already found a way in and established persistence before the White Hat researchers reported the issue. If this is the case, patching the issue might prevent NEW bad actors from penetrating CodeRabbit's environment, but it might not evict any bad actors which might now be lurking in their environment.
I know Security is hard, but come on guys
Code Rabbit is a vibe coder company, what would you expect? Then they try to hide the breach and instead post marketing fluff on the Google Cloud blog, not even mentioning they got hacked, and they can't even give any proof that there isn't still a backdoor running this whole time.
Being a mere user of web or other apps developed using such clever, flexible and powerful services, which can accidentally (due to sheer complexity) expose all and everything I might consider dear, makes me reconsider whether I want to use any of them. When I am granted a real choice, that is, which happens less and less as time progresses. Apps are everywhere, using other apps, mandated by organizations carrying out services outsourced by banks, governments, etc., with third parties' access granted by me accepting T&Cs; there is probably trouble hiding in the details, or probably not, I cannot be sure.
A reassuring line like this >>This is not meant to shame any particular vendor; it happens to everyone<< may calm providers, but it scares the shit out of me as a user providing my sensitive data in exchange for something I need or, worse, must do.
> No customer data was accessed
As far as I can tell this is a lie.
The real answer is that they have absolutely no clue whether customer data was accessed, and no way to tell. I'm not even sure GitHub could tell, since it's not clear that the exploit's way of using the private key to mint tokens for private repositories looks any different from what CodeRabbit does in normal operation.
One of the problems is that code analyzers, bundlers, and compilers (like the Rust compiler) allow running arbitrary code without any warning.
Imagine the following case: an attacker pretending to represent a company sends you a repository as a test task before an interview. You run something like "npm install" or run the Rust compiler, and your computer is now controlled by the attacker.
Or imagine how one coworker's machine gets hacked, the malicious code is written into a repository and whole G, F or A is now owned by foreign hackers. All thanks to npm and Rust compiler.
Maybe those tools should explicitly ask for confirmation before executing every external command (caching the list of allowed commands so they don't ask again). And maybe Linux should provide an easy-to-use, safe sandbox for developers. Currently I have to build sandboxes from scratch myself.
Also, in many cases you don't need the ability to run external code; for example, to install a JS package all you need to do is download files.
Also, this is an indication of why it is a bad idea to use environment variables for secrets and configuration. Whoever wrote the "twelve-factor app" doesn't know that there are command-line switches and configuration files for this.
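For the "easy to use sandbox" wished for above, one commonly available option is bubblewrap (bwrap). Below is a rough, hedged sketch of wrapping an untrusted install/lint step with a cleared environment, a read-only toolchain, and no network; the flags are standard bwrap options, but the mount layout and the npm example (note `--ignore-scripts`, which skips lifecycle scripts so installing really is just downloading files) are illustrative rather than a vetted hardening profile.

```typescript
// Rough sketch: run an untrusted build/lint step inside bubblewrap with a cleared
// environment, a read-only toolchain, and (by default) no network access.
import { spawnSync } from "node:child_process";

function runSandboxed(repoDir: string, cmd: string[], allowNetwork = false) {
  const args = [
    "--ro-bind", "/usr", "/usr",     // toolchain, read-only
    "--ro-bind", "/lib", "/lib",
    "--ro-bind", "/lib64", "/lib64", // present on most glibc distros; adjust per system
    "--proc", "/proc",
    "--dev", "/dev",
    "--tmpfs", "/tmp",
    "--bind", repoDir, "/work",      // only the repo checkout is writable
    "--chdir", "/work",
    "--clearenv",                    // nothing inherited from the parent process
    "--setenv", "PATH", "/usr/bin:/bin",
    "--die-with-parent",
    ...(allowNetwork ? [] : ["--unshare-net"]), // no egress unless explicitly granted
    ...cmd,
  ];
  return spawnSync("bwrap", args, { stdio: "inherit" });
}

// e.g. runSandboxed("/tmp/untrusted-repo", ["npm", "install", "--ignore-scripts"]);
```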
> compilers (like Rust compiler) allow running arbitrary code without any warning.
It's safe to assume that the Rust compiler (like any compiler built on top of LLVM) has arbitrary code execution vulnerabilities, but as an intended feature I think this only exists in cargo, the popular/official build system, not rustc, the compiler.
> Maybe those tools should explicitly confirm executing every external command
This wouldn't work - it's not external commands that's the problem, it's arbitrary code that's being executed. That code has access to all regular system APIs/syscalls, so there's no way of explicitly confirming external commands.
Python/pip suffers the same problem btw, so I think that ship has sailed.
When I read up to "One can use the Rubocop configuration file to specify the path to an extension Ruby file" my immediate thought was "oh no, they didn't allow a user-extendable tool to run in their prod environment..." - and yes, they did. Not that it'd be properly secure without this glaring hole - I don't think many linters are properly audited and fuzzed against hostile inputs - but this is like opening the front door and hanging a blinking neon sign "Please Hack Us!" over it.
Can someone explain how this is not GitHub's fault for not allowing the end-user to modify the permissions that all these services require? E.g., fine-grained permission control?
For example, why would a tool like this code analysis service need git write permission in the first place?
The only consolation here is that it'd be difficult to forge git repositories, because the SHA hashes wouldn't match any existing checkout. Although presumably even there the success rate would still be high enough, especially if they attacked front-end repositories where the maintainers may not understand what has happened, and simply move on with the replaced repo without checking what went on.
Oh, it really makes my day when we get a hacker post at the top here. This is so well written too: no mystique, just a simple sequence of logical steps, with pictures.
> After responsibly disclosing this critical vulnerability to the CodeRabbit team, we learned from them that they had an isolation mechanism in place, but Rubocop somehow was not running inside it.
Curious what this (isolation mechanism) means if anyone knows.
> Curious what this (isolation mechanism) means if anyone knows.
If they're anything like the typical web-startup "developing fast but failing faster", they probably are using docker containers for "security isolation".
I did not understand something: why did CodeRabbit run external tools on external code with its own set of environment variables? Why are these variables needed for this tooling at all?
> Why are these variables needed for this entire tooling?
They are not. The Github API secret key should never be exposed in the environment, period; you're supposed to keep the key in an HSM and only use it to sign the per-repo access token. Per the GH docs [0]:
> The private key is the single most valuable secret for a GitHub App. Consider storing the key in a key vault, such as Azure Key Vault, and making it sign-only. This helps ensure that you can't lose the private key. Once the private key is uploaded to the key vault, it can never be read from there. It can only be used to sign things, and access to the private key is determined by your infrastructure rules.
> Alternatively, you can store the key as an environment variable. This is not as strong as storing the key in a key vault. If an attacker gains access to the environment, they can read the private key and gain persistent authentication as the GitHub App.
[0]: https://docs.github.com/en/apps/creating-github-apps/authent...
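For context, here is a hedged sketch of the flow described above: a short-lived app JWT (the only thing the private key ever signs) exchanged for an installation token narrowed to a single repository with only the permissions a review needs. The endpoint and its `repositories`/`permissions` parameters are GitHub's documented App API; the chosen permission set and the inline signing are illustrative - in production the signing would happen inside a vault/KMS holding a sign-only key.

```typescript
// Sketch: GitHub App private key -> short-lived app JWT -> installation token
// scoped to one repo. The signing step is inlined only to keep the example
// self-contained; it is exactly the part that should live behind a KMS/vault.
import { createSign } from "node:crypto";

const b64url = (buf: Buffer) => buf.toString("base64url");

function appJwt(appId: string, privateKeyPem: string): string {
  const now = Math.floor(Date.now() / 1000);
  const header = b64url(Buffer.from(JSON.stringify({ alg: "RS256", typ: "JWT" })));
  const payload = b64url(
    Buffer.from(JSON.stringify({ iat: now - 60, exp: now + 540, iss: appId }))
  );
  const signature = createSign("RSA-SHA256").update(`${header}.${payload}`).sign(privateKeyPem);
  return `${header}.${payload}.${b64url(signature)}`;
}

async function scopedInstallationToken(jwt: string, installationId: number, repo: string) {
  const res = await fetch(
    `https://api.github.com/app/installations/${installationId}/access_tokens`,
    {
      method: "POST",
      headers: { Authorization: `Bearer ${jwt}`, Accept: "application/vnd.github+json" },
      body: JSON.stringify({
        repositories: [repo],                                      // only this repository
        permissions: { contents: "read", pull_requests: "write" }, // illustrative minimal set
      }),
    }
  );
  if (!res.ok) throw new Error(`token exchange failed: ${res.status}`);
  return (await res.json()).token as string; // expires after ~1 hour
}
```

Only the resulting short-lived installation token (never the app private key) would then be handed to whatever process actually touches the pull request.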
It sounds like they were putting these processes in a chroot jail or something and not allowing them to access the parent process env vars. There's a continuum of ways to isolate child processes in Linux that don't necessarily involve containers or docker.
They probably didn't know that rubocop could be configured to run arbitrary code. When I 'cat' or 'grep' a file from a repository I don't run 'cat' or 'grep' in a sandbox. They probably assumed the same was true of rubocop - that it just treats its input as input and not as instructions.
Their own tools would need the various API keys, of course, and they did build a method to filter out those variables and managed most user code through it, but it sounds like they forgot to put Rubocop through the special method.
So this researcher may have gotten lucky in choosing to dig into the tool that CodeRabbit got unlucky in forgetting.
if op is reading the comments here: the screenshot where CodeRabbit has discovered the security vulnerability in the PR contains the actual IP address the env vars were sent to. No big deal, just that you carefully used 1.2.3.4 in the rest of the article only to leak it in the screenshot. fyi.
This is very similar to a CVE I discovered in cdxgen (CVE-2024-50611), which is similar to another CVE in Snyk's plugin (CVE-2022-24441). tl;dr if you run a scanner on untrusted code, ensure it doesn't have a way of executing that code.
Some ways to prevent this from happening:
1. Don't let spawned processes have access to your env; all major languages have a way to pass an explicit allowlist of env vars to a subprocess
2. Don't store secrets in env vars; use a good secrets vault (with a cache) - see the sketch after this list
3. Tenant isolation as much as you can
4. And most obviously - don't run processes that can execute the code they are scanning, especially if that code is not your code (harder to tell, but always be paranoid)
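As a sketch of item 2 (hedged: this uses HashiCorp Vault's KV v2 HTTP API, but the mount path, secret names, and TTL are illustrative, and how the Vault token itself gets injected is deployment-specific):

```typescript
// Sketch: read secrets from a vault at point of use, with a short in-memory cache,
// instead of keeping them in process.env where any spawned tool can dump them.
const VAULT_ADDR = process.env.VAULT_ADDR ?? "http://127.0.0.1:8200";
const VAULT_TOKEN = process.env.VAULT_TOKEN ?? ""; // ideally a short-lived, per-workload identity

const cache = new Map<string, { value: Record<string, string>; expiresAt: number }>();

async function getSecret(path: string, ttlMs = 60_000): Promise<Record<string, string>> {
  const hit = cache.get(path);
  if (hit && hit.expiresAt > Date.now()) return hit.value;

  // KV v2 read: GET <addr>/v1/secret/data/<path>
  const res = await fetch(`${VAULT_ADDR}/v1/secret/data/${path}`, {
    headers: { "X-Vault-Token": VAULT_TOKEN },
  });
  if (!res.ok) throw new Error(`vault read failed: ${res.status}`);
  const value = (await res.json()).data.data as Record<string, string>;

  cache.set(path, { value, expiresAt: Date.now() + ttlMs });
  return value;
}

// e.g. const { apiKey } = await getSecret("review-bot/openai");
// The value stays in this process; it is never exported to spawned analyzers.
```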
hey, this is Howon from CodeRabbit here. we wish to note that this RCE was reported and fixed in January. it was entirely prospective and no customer data was affected. we have extensive sandboxing for basically any execution of anything now, including any and every tool and all generated code of any kind under the CodeRabbit umbrella.
Where can we find the blog post you made back in January about the RCE fix explaining what measures you took to check if any customer data had been affected?
how do you know that no customer data was affected? did you work with github and scan all uses of your keys? how do you know if a use of your github key was authentic or not? did you check with anthropic/openai/etc to scan logs usage?
It's really hard to trust a "hey we got this guys" statement after a fuckup this big
Reading this, it's not clear how your blog post relates:
1. You run git clone inside the GCR function, so, you have at the very least a user token for the git provider
2. RCE exploit basically used the external tools, like a static analysis checker, which again, is inside your GCR function
3. As a contrived example, if I could RCE `console.log(process.env)` then seemingly I could do `fetch(mywebsite....`
I get it, you can hand wave some amount of "VPC" and "sandbox" here. But, you're still executing code, explicitly labeling it "untrusted" and "sandboxed" doesn't excuse it.
While I fully understand that things sometimes get missed, it just seems really bizarre to me that somehow “sandboxing/isolation” was never considered prior to this incident. To me, it feels like the first thing to implement in a system that is explicitly built to run third party untrusted code?
> Sandboxing: All Cloud Run instances are sandboxed with two layers of sandboxing and can be configured to have minimal IAM permissions via dedicated service identity. In addition, CodeRabbit is leveraging Cloud Run's second generation execution environment, a microVM providing full Linux cgroup functionality. Within each Cloud Run instance, CodeRabbit uses Jailkit to create isolated processes and cgroups to further restrict the privileges of the jailed process.
I've ranted about this before and been downvoted, ignored as "not an issue" but, IMO, Github is majorly to blame for this. They under-invested in their permission system so 3rd party apps are effectively encouraged to ask for "root" permissions.
Effectively, many (most?) 3rd party github integrations basically ask you to type in your github ID. Then they use the github API and ask for maximal permissions. This lets them make it easy for you to use their services because they can do all the rest of the setup for you. But, NO ONE SHOULD EVER GIVE THIS KIND OF PERMISSION.
Any 3rd party service that said "give us root to your servers" would be laughed out of the market. But, that's what github has encouraged because their default workflow leaves it up to the developer to do the right thing.
Instead, github's auth UX should (1) require you to choose repos (2) not allow picking "all repos" (3) require you to select each and every permission (4) not have an option for "all permissions".
As an analogy (though a poor one): iOS and macOS don't say "this app wants all these permissions, yes/no" (Android used to do this). Instead, they ask one at a time (camera? mic? photos? network?) etc... I'm not suggesting that github ask one at a time. I am suggesting that github provide a default UI that lists all the permissions, per repo, and has no way to auto-populate it, so the user is required to choose.
Further, I would argue that github should show the integrations and permissions for any repo. The hope being if I see "lib X uses integration Y with write permission" then I know lib X is not trustworthy because it's open to supply chain attacks (more than lib Z which has no write integrations)
1. Allow poorly-vetted third-party tools to run in CodeRabbit's privileged environment. The exploit used a Ruby code analysis tool that was probably written 15 years ago and meant to be run locally by trusted developers, who already had access to /bin/sh.
2. Ask for coarse-grained permission to access and modify others' code without any checks.
Either of those by itself would be bad enough. The future looks bright for black or white hats who understand computers.
I can't say I'm surprised they didn't pay a bounty when they couldn't even own up to this on their own blog [1].
Instead they took it as an opportunity to market their new sandboxing on Google's blog [2] again with no mention of why their hand was forced into building the sandboxing they should have had before they rushed to onboard thousands of customers.
I have no idea what their plan was. They had to have known the researchers would eventually publish this. Perhaps they were hoping it wouldn't get the same amount of attention it would if they posted it on their own blog.
[1]: https://news.ycombinator.com/item?id=44954560
[2]: https://news.ycombinator.com/item?id=44954242
Developer tools really need to be more mindful of the fact that on developer machines, the current directory should not be trusted, and arbitrary code should not be executed from it. The git project has been learning this the hard way, and others should too.
For check-all-the-things (a meta-linter), we disable the rubocop default config file using the "--config /dev/null" option.
That’s why I’m worried about the growing centralization of things such as Chrome, Gmail, AWS, Cloudflare…
It’s very efficient to delegate something to one major actor but we are introducing single points of failure and are less resilient to vulnerabilities.
Critical systems should have defenses in depth, decentralized architectures and avoid trusting new providers with too many moving parts.
While GitHub needs to invest in finer grained permissioning, I do think there’s lots of lessons for companies building with and customers using GitHub App based deployments. Jotted down my thoughts here https://www.endorlabs.com/learn/when-coderabbit-became-pwned...
My nightmare is that one of those auto-updating vim/vscode/your-favorite-IDE plug-ins that many of us happily use on all the monorepos we work on at some point invokes a "linter" (or, as in this case, configures a linter maliciously) and we start leaking the precious IP to random attackers :-(
If I were a CodeRabbit customer, I'd still be pretty concerned after reading that.
How can CodeRabbit be certain that the GitHub App key was not exfiltrated and used to sign malicious tokens for customer repos (or even used for that in-situ)? I'm not sure if GitHub supports restricting the source IPs of API requests, but if it does, it'd be a trivial mitigation - and one that is absent from the blog post.
The claim that "no malicious activity occurred" implies that they audited the activities of every repo that used Rubocop (or any other potential unsandboxed tool) from the point that support was added for it until the point that the vulnerability was fixed. That's a big claim.
And why only publish this now, when the Kudelski article makes it to the top of HN, over six months after it was disclosed to them?
Unrelated to the article, but the first time I saw them was in a twitter ad with a completely comically bull** suggestion. I cannot take a company seriously that had something like that inside an ad that is supposed to show the best they're capable of.
So if their GH API token with access to a million-plus repos was this easy to compromise, isn't it plausible that their token could have been used to clone said repos? Is it possible to audit the clone history of a token?
Even with proper sandboxing, storing all sensitive credentials as environment variables is still a security anti-pattern. ENV vars are too easily accessible - any process can just run ENV.to_h and dump everything.
How are they getting access to the PostgreSQL database, unless this running code can communicate with it? That’s a big red flag, user provided code should always be sandboxed and isolated right?
Besides the fact that this was clearly a security f*ckup, in my mind it's almost equivalent to running those third-party linters in our Internet-connected editors and IDEs. Other than one banking project, I don't think I ever had to sandbox my editor in any way.
global scoped installations or keys always scare me for this reason
i believe the answer here was to exchange the token for something scoped to the specific repo coderabbit is running in, but alas, that doesn't remove the "RCE" _on_ the repo
They do that, this is how GH apps work. There is no reason to expose the app's private key in the environment for the code that actually runs on the PR.
The security researcher noticed that CodeRabbit runs linters against your code base and noticed that Rubocop was among the provided linters. Rubocop supports extensions that contain custom code, so he crafted an extension that exfiltrated the environment variables of the running Rubocop process when it linted the contents of his PR.
If you're a concerned user and you're looking for a solution founded by 2 people with a security background, with sandboxed and network-limited execution so stuff like this can't happen, you should check us out.
We even offer a self-hosted deployment which sidesteps this entirely (feel free to reach out).
www.bismuth.sh
> Instead, it would be best to assume that the user may be able to run untrusted code through these tools. So, running them in an isolated environment, with only the minimum information required to run the tools themselves, and not passing them any environment variables would be much better. Even if arbitrary code execution would be possible, the impact would be much less severe.
> For defense in depth, one should add a mechanism that prevents sending private information to an attacker-controlled server. For example, only allow outgoing traffic to whitelisted hosts, if possible. If the tool doesn’t require internet access, then all network traffic may even be disabled in that isolated environment. This way it would make it harder for an attacker to exfiltrate secrets.
I yearn to live in a world where this is the default or at least REALLY EASY to do, where you just fall into the pit of success.
And yet, we live in a borderline insane world where one key getting leaked can pwn a million repos - if nothing else, there should be one key per interaction with account/repo. Not to mention that Rubocop (and probably other tools, eventually) have arbitrary code execution as a feature.
I don't think that CodeRabbit messed up, as much as everything around them is already messed up.
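A small sketch of the quoted egress-allowlist idea at the application layer (hedged: real enforcement belongs in firewall/VPC egress rules, and the hostnames here are only plausible examples, not CodeRabbit's actual dependencies):

```typescript
// Only talk to hosts the review pipeline actually needs; anything else
// (e.g. an attacker-controlled exfiltration endpoint) is refused.
const ALLOWED_HOSTS = new Set(["api.github.com", "api.openai.com", "api.anthropic.com"]);

async function guardedFetch(url: string, init?: RequestInit): Promise<Response> {
  const { hostname } = new URL(url);
  if (!ALLOWED_HOSTS.has(hostname)) {
    throw new Error(`egress blocked: ${hostname} is not on the allowlist`);
  }
  return fetch(url, init);
}

// guardedFetch("https://1.2.3.4/exfil", { method: "POST" }) -> throws instead of leaking env vars
```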
I've noticed CodeRabbit at times does reviews that are super. It is able to catch bugs that even claude code misses on our Github PRs. Blows my mind at times tbh.
Based on the env vars seems like they're using anthropic, openai, etc. only?
Romario77|6 months ago
They don't write the details of how they got to this particular tool - you could also see from the article they tried a different approach first.
risyachka|6 months ago
It is absurd that anyone can mess up anything and have absolutely 0 consequences.
smarx007|6 months ago
https://en.m.wikipedia.org/wiki/Cyber_Resilience_Act
Lionga|6 months ago
What a piece of shit company.
jeremyjh|6 months ago
That would mean all those values are in the clear in the process table. You couldn’t do a “ps” without exposing them.
frankfrank13|6 months ago
> The researchers identified that Rubocop, one of our tools, was running outside our secure sandbox environment
I don't think that was the main problem lol
kachapopopow|6 months ago
(likely asked AI to implement x and ai completely disregarded the need to sandbox).
curuinor|6 months ago
if you want to learn how CodeRabbit does the isolation, here's a blog post about how: https://cloud.google.com/blog/products/ai-machine-learning/h...
progbits|6 months ago
Someone could have taken the private github key and cloned your customers' private repos.
You would need to audit every single access to github made via your app since the beginning and link it somehow to your side. Did you do this?
elpakal|6 months ago
In case you don't want to read through the PR
kmarc|6 months ago
In fact, I use rubocop every day lately LOL
curuinor|6 months ago
https://www.coderabbit.ai/blog/our-response-to-the-january-2...
jatins|6 months ago
How do they know this -- Do they have any audit logs confirming this? A malicious actor could have been using this for months for all they know
megamorf|6 months ago
```
"POSTGRESQL_DATABASE": "(CENSORED)",
"POSTGRESQL_HOST": "(CENSORED)",
"POSTGRESQL_PASSWORD": "(CENSORED)",
"POSTGRESQL_USER": "(CENSORED)",
```
pengaru|6 months ago
Why would you even grant it such permissions? this is ridiculous.
kmarc|6 months ago
Scary.
KingOfCoders|6 months ago
Or did I read the article wrong?
jacobsenscott|6 months ago
Is that good? I assume it just catches a different 10% of the bugs.