Xzbot: Notes, honeypot, and exploit demo for the xz backdoor

[+] asveikau|2 years ago|reply

It's pretty interesting that they didn't just introduce an RCE that anyone can exploit, it requires the attacker's private key. It's ironically a very security conscious vulnerability.

[+] haswell|2 years ago|reply

I suspect the original rationale is about preserving the longevity of the backdoor. If you blow a hole wide open that anyone can enter, it’s going to be found and shut down quickly.

If this hadn’t had the performance impact that brought it quickly to the surface, it’s possible that this would have lived quietly for a long time exactly because it’s not widely exploitable.

[+] takeda|2 years ago|reply

If you think of it as a state sponsored attack it makes a lot of sense to have a "secure" vulnerability in system that your own citizens might use.

It looks like the whole contribution to xz was an effort to just inject that backdoor. For example the author created the whole test framework where he could hide the malicious payload.

Before he started work on xz, he made contribution to libarchive in BSD which created a vulnerability.

[+] Alifatisk|2 years ago|reply

For real, it's almost like a state-sponsored exploit. It's crafted and executed incredibly well, the performance issue feels like pure luck it got found.

[+] cesarb|2 years ago|reply

This is called NOBUS: https://en.wikipedia.org/wiki/NOBUS

[+] linsomniac|2 years ago|reply

Am I reading it correctly that the payload signature includes the target SSH host key? So you can't just spray it around to servers, it's fairly computationally expensive to send it to a host.

[+] Taniwha|2 years ago|reply

I wonder if anyone has packet logs from the past few weeks that show attempts at sshd - might be some incriminating IP addresses to start hunting with

[+] bayindirh|2 years ago|reply

It's a (failed) case study in "what if we backdoor it in a way only good guys can use but bad guys can't?"

Computers don't know who's / what's good or bad. They're deterministic machines which responds to commands.

I don't know whether there'll be any clues about who did this, but this will be the poster child of "you can't have backdoors and security in a single system" argument (which I strongly support).

[+] userbinator|2 years ago|reply

IMHO it's not that surprising; asymmetric crypto has been common in ransomware for a long time, and of course ransomware in general is based on securing data from its owner.

"It's not only the good guys who have guns."

[+] nialv7|2 years ago|reply

how are you going to sell it if anyone can get in?

[+] whirlwin|2 years ago|reply

Imagine how much the private key is worth on the black market

[+] aaron695|2 years ago|reply

[deleted]

[+] bilekas|2 years ago|reply

This whole thing has been consuming me over the whole weekend. The mechanisms are interesting and a collection of great obfuscations, the social engineering is a story that’s shamefully all too familiar for open source maintainers.

I find most interesting how they chose their attack vector of using BAD test data, it makes the rest of the steps incredibly easier when you have a good archive, manipulate it in a structured method (this should show on a graph of the binary pattern btw for future reference) then use it for a fuzzy bad data test. It’s great.

The rest of the techniques are banal enough except the most brilliant move seems to be that they could add “patches” or even whole new back doors using the same pattern on a different test file. Without being noticed.

Really really interesting, GitHub shouldn’t have hidden and removed the repo though. It’s not helpful at all to work through this whole drama.

Edit: I don’t mean to say this is banal in any way, but once the payload was decided and achieved through a super clever idea, the rest was just great obfuscation.

[+] withinboredom|2 years ago|reply

It’s got me suspicious of build-time dependency we have in an open source tool, where the dependency goes out of its way to prefer xz and we even discovered that it installs xz on the host machine if it isn’t already installed — as a convenience. Kinda weird because it didn’t do that for any other dependencies.

These long-games are kinda scary and until whatever “evil” is actually done you have no idea what is actually malicious or just weird.

[+] mondrian|2 years ago|reply

A main culprit seems to be the addition of binary files to the repo, to be used as test inputs. Especially if these files are “binary garbage” to prove a test fails. Seems like an obvious place to hide malicious stuff.

[+] throw156754228|2 years ago|reply

They removed the repo so only the attackers had access to the code and know how.

[+] miduil|2 years ago|reply

Super impressed how quickly the community and in particular amlweems were able to implement and document a POC. If the cryptographic or payload loading functionality has no further vulnerabilities, this would have been also at least not introducing a security flaw to all the other attackers until the key is broken or something.

Edit: I think what's next for anyone is to figure out a way to probe for vulnerable deployments (which seems non-trivial) and also perhaps possibly ?upstreaming? a way to monitor if someone actively probes ssh servers with the hardcoded key.

Kudos!

[+] rst|2 years ago|reply

Well, it's a POC against a re-keyed version of the exploit; a POC against the original version would require the attacker's private key, which is undisclosed.

[+] cjbprime|2 years ago|reply

Probing for vulnerable deployments over the network (without the attacker's private key) seems impossible, not non-trivial.

The best one could do is more micro-benchmarking, but for an arbitrary Internet host you aren't going to know whether it's slow because it's vulnerable, or because it's far away, or because the computer's slow in general -- you don't have access to how long connection attempts to that host took historically. (And of course, there are also routing fluctuations.)

[+] acdha|2 years ago|reply

Has anyone tried the PoC against one of the anomalous process behavior tools? (Carbon Black, AWS GuardDuty, SysDig, etc.) I’m curious how likely it is that someone would have noticed relatively quickly had this rolled forward and this seems like a perfect test case for that product category.

[+] dogman144|2 years ago|reply

Depends how closely the exploit mirrors and/or masks itself within normal compression behavior imo.

I don’t think GuardDuty would catch it as it doesn’t look at processes like an EDR does (CrowdStrike, Carbon black), I don’t think sysdig would catch it as looks at containers and cloud infra. Handwaving some complexity here, as GD and sysdig could prob catch something odd via privileges gained and follow-on efforts by the threat actor via this exploit.

So imo means only EDRs (monitoring processes on endpoints) or software supply chain evaluations (monitoring sec problems in upstream FOSS) are most likely to catch the exploit itself.

Leads into another fairly large security theme interestingly - dev teams can dislike putting EDRs on boxes bc of the hit on compute and UX issues if a containment happens, and can dislike limits policy and limits around FOSS use. So this exploit hits at the heart of a org-driven “vulnerability” that has a lot of logic to stay exposed to or to fix, depending on where you sit. Security industry’s problem set in a nutshell.

[+] knoxa2511|2 years ago|reply

Sysdig released a blog on friday. "For runtime detection, one way to go about it is to watch for the loading of the malicious library by SSHD. These shared libraries often include the version in their filename."

The blog has the actual rule content which I haven't seen from other security vendors

https://sysdig.com/blog/cve-2024-3094-detecting-the-sshd-bac...

[+] saagarjha|2 years ago|reply

That entire product category is for the most part snake oil.

[+] bobby_the_whale|2 years ago|reply

[deleted]

[+] faxmeyourcode|2 years ago|reply

Edit: I misunderstood what I was reading in the link below, my original comment is here for posterity. :)

> From down in the same mail thread: it looks like the individual who committed the backdoor has made some recent contributions to the kernel as well... Ouch.

https://www.openwall.com/lists/oss-security/2024/03/29/10

The OP is such great analysis, I love reading this kind of stuff!

[+] ibotty|2 years ago|reply

No that patch series is from Lasse. He said himself that it's not urgent in any way and it won't be merged this merge window, but nobody (sane) is accusing Lasse of being the bad actor.

[+] davikr|2 years ago|reply

Lasse Collin is not Jia Tan until proven otherwise.

[+] Denvercoder9|2 years ago|reply

The referenced patch series had not made it into the kernel yet.

[+] wezdog1|2 years ago|reply

Also it may br a coincidence but JiaT75 looks a lot like Transponder 7500 which in aviation means hijacked...

[+] dxthrwy856|2 years ago|reply

The parallels in this one to the audacity event a couple years back are ridiculous.

Cookie guy claimed that he got stabbed and that the federal police was involved in the case, which kind of hints that the events were connected to much bigger actors than just 4chan. At the time a lot of people thought its just Muse Group that's involved, but maybe it was a (Russian) state actor?

Because before that he claimed that audacity had lots of telemetry/backdoors which were the reason he forked and removed in his first commits. Maybe audacity is backdoored after all?

Have to check the audacity source code now.

[+] MuffinFlavored|2 years ago|reply

Instead of needing the honeypot openssh.patch at compile-time https://github.com/amlweems/xzbot/blob/main/openssh.patch

How did the exploit do this at runtime?

I know the chain was:

opensshd -> systemd for notifications -> xz included as transient dependency

How did liblzma.so.5.6.1 hook/patch all the way back to openssh_RSA_verify when it was loaded into memory?

[+] tadfisher|2 years ago|reply

When loading liblzma, it patches the ELF GOT (global offset table) with the address of the malicious code. In case it's loaded before libcrypto, it registers a symbol audit handler (a glibc-specific feature, IIUC) to get notified when libcrypto's symbols are resolved so it can defer patching the GOT.

[+] unknown|2 years ago|reply

[deleted]

[+] unknown|2 years ago|reply

[deleted]

[+] jeffrallen|2 years ago|reply

ifunc

[+] mrob|2 years ago|reply

Do we know if this exploit only did something if a SSH connection was made? There's a list of strings from it on Github that includes "DISPLAY" and "WAYLAND_DISPLAY":

https://gist.github.com/q3k/af3d93b6a1f399de28fe194add452d01

These don't have any obvious connection to SSH, so maybe it did things even if there was no connection. This could be important to people who ran the code but never exposed their SSH server to the Internet, which some people seem to be assuming was safe.

[+] herpderperator|2 years ago|reply

> Note: successful exploitation does not generate any log entries.

Does this mean, had this exploit gone unnoticed, the attacker could have executed arbitrary commands as root without even a single sshd log entry on the compromised host regarding the 'connection'?

[+] sureglymop|2 years ago|reply

Yes.. The RCE happens at the connection stage before anything is logged.

[+] heeen2|2 years ago|reply

I wonder if separating the test files out into their own repo, so that they would not have been available at build time could have made this harder. The reasoning being that anything available and this potentially involved in the build should be human readable.

[+] Zuiii|2 years ago|reply

> Anything available and this potentially involved in the build should be human readable.

That's actually a good principle to adopt overall.

We should treat this attack like an air plane accident and adopt new rules that mitigate the chances of it being successfully carried out again. We might not be able to vet every single person who contributes, but we should be able to easily separate out noisy test data.

[+] gghffguhvc|2 years ago|reply

Is there anything actually illegal here? Like is it a plausible “business” model for talented and morally compromised developers to do this and then sell the private key to state actors without actually breaking in themselves or allowing anyone else to break in.

Edit: MIT license provides a pretty broad disclaimer to say it isn’t fit for any purpose implied or otherwise.

[+] declan_roberts|2 years ago|reply

One thing I notice about state-level espionage and backdoors. The USA seems to have an affinity for hardware interdiction as opposed to software backdoors. Hardware backdoors make sense since much of it passes through the USA.

Other countries such as Israel are playing the long-con with very well engineered, multi-year software backdoors. A much harder game to play.

[+] jobs_throwaway|2 years ago|reply

Imagine how frustrating it has to be for the attacker to meticulously plan and execute this and get foiled so late in the game, and so publicly

[+] aborsy|2 years ago|reply

Why ED448?

It’s almost never recommended, in favor of curve 25519.

[+] arnaudsm|2 years ago|reply

Have we seen exploitation in the wild yet?

[+] wolverine876|2 years ago|reply

Have the heads of the targeted projects - including xz (Lasse Collin?), OpenSSH (Theo?), and Linux (Linus) - commented on it?

I'm especially interested in how such exploits can be prevented in the future.

[+] EasyMark|2 years ago|reply

This whole thing makes me wonder if AI could detect "anomalies" like the human who found the actual hack did. Observe a system (lots of systems) and use that data to spot anomylous behavior from "new" versions of packages being added, throw up a red flag that is like "this really doesn't act like it did in the past because parameter(s) are unusual given previous versions"

[+] unknown|2 years ago|reply

[deleted]

[+] cgh|2 years ago|reply

Comment I saw on Ars:

>Interestingly enough, "Jia Tan" is very close to 加蛋 in Mandarin, meaning "to add an egg". Unlikely to be a real name or a coincidence.

[+] AlexanderTheGr8|2 years ago|reply

Is there any progress on identifying the attacker? This would make it much easier to find out it this was really a state-sponsored attack.

If this backdoor can be classified as a crime, github logs can identify the IP/location/other details of the attacker which is more than enough to identify them, unless their OPSEC is perfect, which it almost never is (e.g. Ross Ulbricht).

469 comments