It's pretty interesting that they didn't just introduce an RCE that anyone can exploit; it requires the attacker's private key. It's, ironically, a very security-conscious vulnerability.
I suspect the original rationale is about preserving the longevity of the backdoor. If you blow a hole wide open that anyone can enter, it’s going to be found and shut down quickly.
If this hadn’t had the performance impact that brought it quickly to the surface, it’s possible that this would have lived quietly for a long time exactly because it’s not widely exploitable.
If you think of it as a state-sponsored attack, it makes a lot of sense to have a "secure" vulnerability in a system that your own citizens might use.
It looks like the whole contribution to xz was an effort to inject that backdoor. For example, the author created the whole test framework in which he could hide the malicious payload.
Before he started work on xz, he made a contribution to libarchive (as used by the BSDs) that introduced a vulnerability.
For real, it's almost like a state-sponsored exploit. It's crafted and executed incredibly well; the performance issue that got it found feels like pure luck.
Am I reading it correctly that the payload signature includes the target's SSH host key? So you can't just spray it around to servers; it's fairly expensive to craft a payload for each host.
It's a (failed) case study in "what if we backdoor it in a way only good guys can use but bad guys can't?"
Computers don't know who or what is good or bad. They're deterministic machines that respond to commands.
I don't know whether there'll be any clues about who did this, but this will be the poster child of the "you can't have backdoors and security in a single system" argument (which I strongly support).
IMHO it's not that surprising; asymmetric crypto has been common in ransomware for a long time, and of course ransomware in general is based on securing data from its owner.
This whole thing has been consuming me all weekend. The mechanisms are interesting and a collection of great obfuscations; the social engineering is a story that's shamefully all too familiar to open source maintainers.
What I find most interesting is the choice of attack vector: "bad" test data. It makes the rest of the steps incredibly easier when you take a good archive, manipulate it in a structured way (this should show up on a graph of the binary pattern, btw, for future reference), then use it as a bad-data fuzz test. It's great.
The rest of the techniques are banal enough, but the most brilliant move seems to be that they could add "patches", or even whole new backdoors, using the same pattern on a different test file, without being noticed.
Really, really interesting. GitHub shouldn't have hidden and removed the repo, though; it's not helpful at all for working through this whole drama.
Edit: I don’t mean to say this is banal in any way, but once the payload was decided and achieved through a super clever idea, the rest was just great obfuscation.
It's got me suspicious of a build-time dependency we have in an open source tool, where the dependency goes out of its way to prefer xz, and we even discovered that it installs xz on the host machine if it isn't already installed, as a convenience. Kinda weird, because it doesn't do that for any other dependency.
These long-games are kinda scary and until whatever “evil” is actually done you have no idea what is actually malicious or just weird.
A main culprit seems to be the addition of binary files to the repo to be used as test inputs, especially files of "binary garbage" meant to prove a test fails. That seems like an obvious place to hide malicious stuff.
Super impressed by how quickly the community, and in particular amlweems, was able to implement and document a PoC. Assuming the cryptographic and payload-loading functionality has no further vulnerabilities, at least this wouldn't have handed a security hole to every other attacker until the key is broken or something.
Edit: I think what's next for anyone is to figure out a way to probe for vulnerable deployments (which seems non-trivial), and perhaps also upstreaming a way to monitor whether someone actively probes SSH servers with the hardcoded key.
Well, it's a POC against a re-keyed version of the exploit; a POC against the original version would require the attacker's private key, which is undisclosed.
Probing for vulnerable deployments over the network (without the attacker's private key) seems impossible, not non-trivial.
The best one could do is more micro-benchmarking, but for an arbitrary Internet host you aren't going to know whether it's slow because it's vulnerable, or because it's far away, or because the computer's slow in general -- you don't have access to how long connection attempts to that host took historically. (And of course, there are also routing fluctuations.)
Has anyone tried the PoC against one of the anomalous-process-behavior tools? (Carbon Black, AWS GuardDuty, Sysdig, etc.) I'm curious how likely it is that someone would have noticed relatively quickly had this rolled forward; this seems like a perfect test case for that product category.
Depends how closely the exploit mirrors and/or masks itself within normal compression behavior imo.
I don't think GuardDuty would catch it, as it doesn't look at processes the way an EDR does (CrowdStrike, Carbon Black), and I don't think Sysdig would catch it, as it looks at containers and cloud infra. Handwaving some complexity here: GD and Sysdig could probably catch something odd via the privileges gained and the threat actor's follow-on efforts after using this exploit.
So IMO only EDRs (monitoring processes on endpoints) or software supply-chain evaluations (monitoring security problems in upstream FOSS) are likely to catch the exploit itself.
This leads into another fairly large security theme, interestingly: dev teams can dislike putting EDRs on boxes because of the hit on compute and the UX issues if a containment happens, and can dislike policy and limits around FOSS use. So this exploit hits at the heart of an org-driven "vulnerability" with plenty of arguments both for staying exposed and for fixing it, depending on where you sit. The security industry's problem set in a nutshell.
Sysdig released a blog post on Friday: "For runtime detection, one way to go about it is to watch for the loading of the malicious library by sshd. These shared libraries often include the version in their filename."
The post has the actual rule content, which I haven't seen from other security vendors.
Edit: I misunderstood what I was reading in the link below, my original comment is here for posterity. :)
> From down in the same mail thread: it looks like the individual who committed the backdoor has made some recent contributions to the kernel as well... Ouch.
No that patch series is from Lasse. He said himself that it's not urgent in any way and it won't be merged this merge window, but nobody (sane) is accusing Lasse of being the bad actor.
The parallels between this and the Audacity event a couple of years back are ridiculous.
Cookie guy claimed that he got stabbed and that the federal police were involved in the case, which kind of hints that the events were connected to much bigger actors than just 4chan. At the time a lot of people thought it was just Muse Group that was involved, but maybe it was a (Russian) state actor?
Because before that, he claimed that Audacity had lots of telemetry/backdoors, which were the reason he forked it and removed them in his first commits. Maybe Audacity is backdoored after all?
When liblzma is loaded, it patches the ELF GOT (global offset table) with the address of the malicious code. If it's loaded before libcrypto, it registers a symbol audit handler (a glibc-specific feature, IIUC) to get notified when libcrypto's symbols are resolved, so it can defer patching the GOT.
Do we know if this exploit only did something when an SSH connection was made? There's a list of strings from it on GitHub that includes "DISPLAY" and "WAYLAND_DISPLAY":
These don't have any obvious connection to SSH, so maybe it did things even if there was no connection. This could be important to people who ran the code but never exposed their SSH server to the Internet, which some people seem to be assuming was safe.
> Note: successful exploitation does not generate any log entries.
Does this mean, had this exploit gone unnoticed, the attacker could have executed arbitrary commands as root without even a single sshd log entry on the compromised host regarding the 'connection'?
I wonder if separating the test files out into their own repo, so that they would not have been available at build time, could have made this harder. The reasoning being that anything available and thus potentially involved in the build should be human-readable.
> Anything available and thus potentially involved in the build should be human-readable.
That's actually a good principle to adopt overall.
We should treat this attack like an airplane accident and adopt new rules that mitigate the chances of it being successfully carried out again. We might not be able to vet every single person who contributes, but we should be able to easily separate out noisy test data.
Is there anything actually illegal here? Like, is it a plausible "business" model for talented and morally compromised developers to do this and then sell the private key to state actors, without actually breaking in themselves or allowing anyone else to break in?
Edit: the MIT license provides a pretty broad disclaimer saying it isn't fit for any purpose, implied or otherwise.
One thing I notice about state-level espionage and backdoors: the USA seems to have an affinity for hardware interdiction as opposed to software backdoors. Hardware backdoors make sense, since much of the hardware passes through the USA.
Other countries, such as Israel, are playing the long con with very well-engineered, multi-year software backdoors. A much harder game to play.
This whole thing makes me wonder if AI could detect "anomalies" like the human who found the actual hack did: observe a system (lots of systems) and use that data to spot anomalous behavior from "new" versions of packages being added, then throw up a red flag like "this really doesn't act like it did in the past, because some parameter is unusual given previous versions".
Is there any progress on identifying the attacker? This would make it much easier to find out if this was really a state-sponsored attack.
If this backdoor can be classified as a crime, github logs can identify the IP/location/other details of the attacker which is more than enough to identify them, unless their OPSEC is perfect, which it almost never is (e.g. Ross Ulbricht).
"It's not only the good guys who have guns."
The Sysdig post with the detection rule: https://sysdig.com/blog/cve-2024-3094-detecting-the-sshd-bac...
The mail thread in question: https://www.openwall.com/lists/oss-security/2024/03/29/10
The OP is such a great analysis; I love reading this kind of stuff!
Have to check the audacity source code now.
How did the exploit do this at runtime?
I know the chain was:
sshd -> libsystemd (for notifications) -> xz/liblzma included as a transitive dependency
How did liblzma.so.5.6.1 hook/patch all the way back to openssh_RSA_verify when it was loaded into memory?
https://gist.github.com/q3k/af3d93b6a1f399de28fe194add452d01
It's almost never recommended these days, in favor of Curve25519.
I'm especially interested in how such exploits can be prevented in the future.
> Interestingly enough, "Jia Tan" is very close to 加蛋 in Mandarin, meaning "to add an egg". Unlikely to be a real name or a coincidence.