What we know about the xz Utils backdoor that almost infected the world

[+] sandstrom|1 year ago|reply

    The goal is to use a standardized test framework to ease writing of tests in XZ. 
    Much of the functionality remains untested, so it will be helpful for long term project stability to have more tests
    
    -- Jia, 2022-06-17

This was a long time in the making.

[+] usrusr|1 year ago|reply

What I haven't seen discussed much is the linking mechanism that allowed the lib to hook into RSA_public_decrypt. Plenty of talk about what could or could not be achieved by even more process separation and the like, but little about that function call redirect. Could it be possible to establish a way to link critical components like the code for incoming ssh with libraries in some tiered trust way? "I trust you when and where I call you, but I won't allow you to introduce yourself to other call sites"?

This would surely fall into the category of "there would be ways around it, so why bother?" that triggers a "by obscurity" reflex in many, but I'd consider it reduced attack surface.

[+] supriyo-biswas|1 year ago|reply

An Erlang actor like model where a program’s sub components call each other by message passing with each having its own security context may work.

However, there are multiple security contexts at play in an operating system; with regard to the XZ backdoor it’s mostly about capability based security at the module level, but you also have capabilities at the program level, and isolation at the memory level (paging), isolation at the micro architectural level, and so on. Ensuring all of these elements work together while still delivering performance seems to be rather challenging, and it’s definitely not possible for the Unix-likes to move to such a model, because it would replace the concept of processes.

[+] sliken|1 year ago|reply

From what I can tell the problem is the use of glibc's IFUNC. This broke the testing for XZ, suddenly accounts appeared to lobby for disabling that testing, which enabled the exploit.

[+] MonkeeSage|1 year ago|reply

Essentially, the code patches ifunc resolvers to call into a supplied malicious object by surreptitiously modifying C files at build time (when ./configure && make is run). It then links the compromised object files with a `now` linker flag, which causes the ifuncs to be resolved immediately at library load time. That calls the patched resolvers while the procedure linkage table is still writable, which is the important part: it lets them hijack the RSA_public_decrypt function in memory when the library is loaded.

There's an excellent technical breakdown of the backdoor *injection* process here: https://research.swtch.com/xz-script
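
The ordering described above (resolvers run first, the table is locked afterwards) can be illustrated with a toy model. This is only a sketch in Python, not how a real ELF loader works: `ToyLoader`, `genuine`, `hooked`, and `malicious_resolver` are hypothetical names, and a plain dict stands in for the linkage table.

```python
from types import MappingProxyType
from typing import Callable, Dict, List

Table = Dict[str, Callable[[str], str]]


class ToyLoader:
    """Toy stand-in for a dynamic loader with eager binding."""

    def __init__(self, symbols: Table) -> None:
        self._table = dict(symbols)

    def load(self, resolvers: List[Callable[[Table], None]]):
        # Resolvers run at load time, while the table is still writable.
        for resolve in resolvers:
            resolve(self._table)
        # Only afterwards is the table frozen (a copy behind a read-only view).
        return MappingProxyType(dict(self._table))


def genuine(data: str) -> str:
    return "genuine:" + data


def hooked(data: str) -> str:
    return "hooked:" + data


def malicious_resolver(table: Table) -> None:
    # A resolver is supposed to pick an implementation for its own symbol;
    # here it overwrites an unrelated entry instead.
    table["RSA_public_decrypt"] = hooked


if __name__ == "__main__":
    table = ToyLoader({"RSA_public_decrypt": genuine}).load([malicious_resolver])
    print(table["RSA_public_decrypt"]("payload"))
```

Once `load` returns, writes to the table fail, but by then the entry already points at the hooked function, which is the window the comment above describes.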

[+] kzrdude|1 year ago|reply

systemd recently merged a change to use dlopen for compression libraries (https://github.com/systemd/systemd/pull/31550), which is a safer linking method in that sense.
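
The idea of loading on demand can be sketched like this. It is a Python analogy rather than the systemd change itself (which is C): on POSIX systems `ctypes.CDLL` calls dlopen under the hood, and `LazyLibrary` plus the library name used below are hypothetical.

```python
import ctypes
from typing import Any, Callable, Optional


class LazyLibrary:
    """Defer loading a shared library until something actually needs it."""

    def __init__(self, name: str, loader: Callable[[str], Any] = ctypes.CDLL) -> None:
        self._name = name
        self._loader = loader
        self._handle: Optional[Any] = None

    @property
    def loaded(self) -> bool:
        return self._handle is not None

    def get(self) -> Any:
        # The library's code is mapped into the process only on first use.
        if self._handle is None:
            self._handle = self._loader(self._name)
        return self._handle


if __name__ == "__main__":
    lzma = LazyLibrary("liblzma.so.5")  # assumed soname, for illustration only
    print("loaded at startup:", lzma.loaded)
```

A process that never touches compression never calls `get()`, so the library's load-time code never runs in that process.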

[+] cozzyd|1 year ago|reply

would it be sufficient to monitor the symbols called by ltrace, somehow?

[+] AshamedCaptain|1 year ago|reply

Why do they say "almost" infected the world? At least 3 quite popular Linux distributions (Arch, Gentoo, and openSUSE Tumbleweed) ended up shipping the backdoor _for weeks_, and it was most definitely working in at least Tumbleweed. For weeks! A backdoored ssh! Hardly "almost".

[+] christophilus|1 year ago|reply

The exploit didn’t actually run on those distros, though. Only on deb/rpm targets. So, effectively, it was stopped before it hit production.

[+] acdha|1 year ago|reply

Arch and Gentoo are fairly popular as hobbyist distributions but they’re far less common in professional use, especially for the servers running SSH which this attack targeted. That doesn’t mean what happened is in any way okay but if this hadn’t been noticed long enough to make it into RHEL or Debian/Ubuntu stable you would be hearing about it in notifications from your bank, healthcare providers, etc. A pre-auth RCE would mean anyone who doesn’t have a tightly-restricted network and robust flow logging would struggle to say that they hadn’t been affected.

[+] joveian|1 year ago|reply

It doesn't seem to have been actually included in the Arch (binary) package but only because the backdoor build system itself didn't include the backdoor for Arch. If you cmp -l liblzma.so.5.6.1 between xz-5.6.1-1 and xz-5.6.1-2 there are only tiny differences. I'm guessing they didn't notice this before writing the advisory.

https://github.com/QubesOS/qubes-issues/issues/9067#issuecom...
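
For anyone wanting to repeat that comparison programmatically, here is a rough Python equivalent of `cmp -l`, which lists the 1-based offset and the two byte values at each differing position. The function names are hypothetical, and the file paths are placeholders for wherever the two package versions were unpacked.

```python
from typing import Dict, List, Tuple


def differing_bytes(a: bytes, b: bytes) -> List[Tuple[int, int, int]]:
    """Return (1-based offset, byte in a, byte in b) for each mismatch
    within the common length of the two inputs."""
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


def summarize(a: bytes, b: bytes) -> Dict[str, int]:
    """Count mismatching bytes and report any difference in length."""
    return {
        "differing": len(differing_bytes(a, b)),
        "length_delta": len(b) - len(a),
    }


if __name__ == "__main__":
    # Placeholder paths: the same library extracted from two package builds.
    with open("old/liblzma.so.5.6.1", "rb") as f_old, open("new/liblzma.so.5.6.1", "rb") as f_new:
        print(summarize(f_old.read(), f_new.read()))
```

A handful of differing bytes and no change in length would match the "only tiny differences" observation; an injected object file would be expected to change far more.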

[+] nequo|1 year ago|reply

My guess is that most Linux boxes that run sshd use an LTS distro, not one with rolling release. But I don’t know how to get any data on this.

[+] markus_zhang|1 year ago|reply

I hope Open Source maintainers and the big companies get the message -- they need to change the financial outlook of open source maintaining.

[+] Denvercoder9|1 year ago|reply

Arch, Gentoo and openSUSE Tumbleweed are hardly the world.

[+] zouhair|1 year ago|reply

Also, what about those that are not yet discovered?

[+] stefanka|1 year ago|reply

I heard that sshd in Arch doesn’t link to xz

[+] nabla9|1 year ago|reply

I bet $101 that we find something similar in the wild in the next 12 months as the maintainers start to look at each other's past commits with suspicion.

[+] dartos|1 year ago|reply

I think attacks like this have been on the rise for the past decade.

Just look at any critical, yet largely unknown codebase with very few maintainers.

[+] bombcar|1 year ago|reply

I wonder if we'll find the cases that were done and used, because if I had something like this and it worked, afterwards I'd "find it" with another account and get it fixed ...

[+] EdiX|1 year ago|reply

My personal takeaways from this:

1. Source distribution tarballs that contain code different from what's in the source repository are bad, we should move away from them. The other big supply chain attack (event-stream) also took advantage of something similar.

1a. As a consequence of (1) autogenerated artifacts should always be committed.

2. Autogenerated artifacts that everyone pagedowns over during code reviews are a problem. If you have this type of stuff in your repository also have an automatic test that checks that nobody tampered with it (it will also keep you from having stale autogenerated files in your repository).

3. A corollary of (1) and (2) is that autotools is bad and the autotools culture is bad.

4. Libsystemd is a problem for the ecosystem. People get dismissed as systemd haters for pointing this out but it's big, complicated, has a lot of dependencies and most programs use a tiny fraction of it. Encouraging every service to depend on it for initialization notifications is insane.

5. In general there's a culture that code reuse is always good, that depending on large libraries for small amounts of functionality is good. This is not true: dependencies are a maintenance burden and a security risk, and this needs to be weighed against the functionality they bring in.

6. Distro maintainers applying substantial patches to packages is a problem, it creates widely used de facto forks for libraries and applications that do not have real maintainers looking at them.

7. We need to make OSS work from the financial point of view for developers. Liblzma and xz-utils probably have tens of millions of installs but a single maintainer with mental health problems.

8. This sucks to say, but code reviews and handing off maintainership, at the moment, need to take into account geopolitical considerations.
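
Points 1 and 2 lend themselves to an automated check. A minimal sketch, assuming you have the release tarball as bytes and the repository checkout as a path-to-content mapping with the same relative paths; the function names are hypothetical, and a real check would also need to regenerate legitimate build artifacts before comparing them.

```python
import hashlib
import io
import tarfile
from typing import Dict, List


def tarball_digests(data: bytes) -> Dict[str, str]:
    """Map each regular file in a tar archive to its SHA-256 digest."""
    digests: Dict[str, str] = {}
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:*") as tar:
        for member in tar:
            if member.isfile():
                content = tar.extractfile(member).read()
                digests[member.name] = hashlib.sha256(content).hexdigest()
    return digests


def tree_digests(files: Dict[str, bytes]) -> Dict[str, str]:
    """Same mapping for a checkout that is already in memory."""
    return {path: hashlib.sha256(content).hexdigest() for path, content in files.items()}


def unexplained_files(repo: Dict[str, str], release: Dict[str, str]) -> List[str]:
    """Paths in the release that are missing from the repo or differ from it."""
    return sorted(path for path, digest in release.items() if repo.get(path) != digest)
```

Every path this reports is either a generated file that a reviewer should be able to reproduce from the repository, or something that needs an explanation.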

[+] shp0ngle|1 year ago|reply

I didn't realize it was a Microsoft engineer that works on Azure Postgres that found the issue.

Thanks, Microsoft, I like Azure now.

[+] mapasj|1 year ago|reply

I’m guessing the original maintainer of xz handed responsibilities to Jia Tan without ever seeing him/her or at least sharing a phone call. Is that common to only communicate only through email/github? I guess some maintainers of open source projects will be more cautious after this story.

[+] CaptainOfCoit|1 year ago|reply

> Is that common to only communicate only through email/github?

Absolutely. I've both taken over libraries as a maintainer and given away the responsibility of maintaining a library after only communicating via text, and having no idea who the "real" person is.

> I guess some maintainers of open source projects will be more cautious after this story.

Which is completely the wrong takeaway. It's not the maintainer who is responsible for what people end up pulling into their project, it's up to the people who work on the project. Either you trust the maintainer, or you don't, and when you start to depend on a library, you're implicitly signing up for updating yourself on who you are trusting. For better or worse.

[+] Gigachad|1 year ago|reply

That’s basically how it is right now. Millions of companies freeloading off the work of unpaid open source developers. Unsurprisingly they sometimes leave and it causes problems.

[+] secondcoming|1 year ago|reply

What difference would a phone call have made? How would it have added any confidence as to the intentions of the person whatsoever?

[+] hk__2|1 year ago|reply

> Is that common to only communicate only through email/github?

Yes. I’ve joined half a dozen open-source projects of various sizes (from 100 to 30k stars on GitHub) without ever calling anyone; written communication is the standard.

[+] supriyo-biswas|1 year ago|reply

If a maintainer is being berated by multiple people about their speed of delivery, then it is not unexpected for them to become convinced that they are somehow the problem, and to transfer the project to whoever they feel at the time is the best choice without thinking the decision through.

However, knowing a person personally doesn’t necessarily solve the problem.

I used to work on an open source project a long time ago (under a pseudonym) that I do not wish to name here for reasons that’ll become clear shortly. The lead programmer had a co-maintainer who the lead seemed to have known quite well.

The co-maintainer constantly gaslit me, and later, other maintainers, belittled them, criticized them for the smallest of bugs etc. (and not in a Linus Torvalds way, where the rants are educational if you remove the insults) until they left; and was egged on by the lead maintainer as they agreed with the technical substance of these arguments.

Many years later, the co-maintainer attempted a hostile takeover of the project, which did not go as expected, and soon after, multiple private correspondences with other people became public where it became clear that the co-maintainer always wanted to do this, and gaslighting other maintainers was just part of this goal. All of this, despite the fact that the two of them knew each other.

[+] thinkingemote|1 year ago|reply

They did communicate off list and non publicly, that's as much as we know at the moment.

As an open source developer he might have received donations too from the adversary - it's reasonably common for devs to get donations to "say thanks". He might have had voice chats with them, who knows. The emails might be with LEO at the moment but I think it's in the public interest for all communications to be released.

[+] thiht|1 year ago|reply

What does it change? Assuming that either:

- Jia Tan was initially a trustworthy actor that subsequently became malicious (maybe they were paid or compromised somehow)

- Jia Tan was always malicious, but played the long game by starting with legitimate contributions/intent for 1-2 years

How would meeting them for real have any impact?

[+] 2OEH8eoCRo0|1 year ago|reply

Our goodwill is being used against us.

Suppose you have a chat with them and see that they're Chinese. What are your next actions? If you exclude them then that's racist right?

I don't have answers

[+] otherme123|1 year ago|reply

I guess the blame is on the people who decide to depend on a very small (by team size at least) project (https://xkcd.com/2347/) while having plenty of safer alternatives.

Let's suppose I create a personal hobby project. Suddenly RedHat, Debian, Amazon, Google... you name it, decide to make my project a fundamental dependency of their toolchain, without giving me at least some support in the form of trustworthy developers. The most cautious thing I could do would be to shut down the project entirely or abandon it, but more probably I would have fallen for Jia Tan's tricks.

Also, a phone call or even a face to face meeting wouldn't give you extra security. In what scenario would a phone conversation with Jia expose him, or make you suspicious enough to not delegate?

[+] wiredfool|1 year ago|reply

Yes, pretty much.

[+] supposemaybe|1 year ago|reply

So while everyone thinks this backdoor was caught early, its purpose might have been achieved already. Especially if those targets were developers who used rolling release distros, like Kali and Debian.

[+] nubinetwork|1 year ago|reply

This might be possible. I picked up some SSH traffic earlier in the week, and didn't think much of it at the time. Of course, this could also be a red herring. https://www.nubi-network.com/news.php?id=21

[+] GoblinSlayer|1 year ago|reply

>and argued that Lasse Collin, the longtime maintainer of xz Utils, hadn’t been updating the software often or fast enough.

This meme was a mistake.

[+] phkahler|1 year ago|reply

Why does SSH use xz? Should it? Is it really that important?

[+] Tuna-Fish|1 year ago|reply

It does not.

OpenSSH pulled in libsystemd to provide startup notification. Libsystemd pulled in liblzma. No code from liblzma normally ends up in OpenSSH. But because it is built as a dependency for libsystemd, its build scripts are run in the same environment as libsystemd, and OpenSSH.

The attack payload was hidden as an obfuscated binary blob in the liblzma tests directory, masquerading as a compression test case. When lzma was compiled from the git sources, generating the build scripts using autotools, nothing untoward was done. But lzma was also provided as a source tarball that was used by distro packagers, in which autotools had already been run. The attacker replaced the autogenerated, unreadable script output with one that checked if liblzma was being compiled in the same environment as OpenSSH and if it was being compiled so that it was going to end up as a .deb or .rpm package, and if both were true, embedded the attack payload into OpenSSH.

Then the attack payload started with a lot of checks, including testing whether OpenSSH was being started normally by init scripts or manually, and for the presence of usual debugging tools, and only attached the payload to the running process if it seemed like a "natural" bootup with no running debugging tools. When running, the payload hooked into private key verification, and if the correct private key attempted to log in, the payload would take the rest of the incoming packet and call system() with it, that is, provide remote code execution as root.

[+] jerrygenser|1 year ago|reply

openssh does not use xz but in some distros xz is used when it interfaces with systemd

[+] TacticalCoder|1 year ago|reply

> Malicious updates made to a ubiquitous tool were a few weeks away from going mainstream.

Imagine working, as an individual or as a group, for years and then getting caught mere weeks or months before most major distros were to incorporate your backdoor.

Someone or several people out there must be pissed off.

[+] 1vuio0pswjnm7|1 year ago|reply

"OpenSSH, the most popular sshd implementation, doesn't link the liblzma library, but Debian and many other Linux distributions add a patch to link sshd to systemd, a program that loads a variety of services during the system bootup. Systemd, in turn, links to liblzma, and this allows xz Utils to exert control over sshd."

Compare with:

"Xz is an open-source compression program, as well as a library that can be used to help you write your own program that deals with compressed data. It is used by a fairly large number of other programs, one of which is OpenSSH."

https://news.ycombinator.com/item?id=39881049

GNU's binutils links to liblzma. binutils is even more ubiquitous than OpenSSH; in most cases it's probably used in the compilation of OpenSSH, the operating systems on which sshd runs, and so on. The bad guys certainly picked a good project to potentially get deep into open source software.

[+] jbritton|1 year ago|reply

Does anyone know if this has been reported to the FBI and Homeland Security? Or can one just assume they will become aware?

[+] levi_n|1 year ago|reply

I'm curious as to why they picked the commit cadence they did. Why do this over the course of two years and not, say, 8 months or 15 months? After committing the first patch, why did they wait x days/weeks/months to commit the second? Were they timing the commits off of release schedules, following some predetermined schedule, or something else?

[+] yosito|1 year ago|reply

Are we ever going to figure out who Jia Tan is?

[+] andrewstuart|1 year ago|reply

I don’t suppose this will lead the US govt to invest heavily in the open source projects behind SSH?

[+] markus_zhang|1 year ago|reply

I have said this before and I'm saying it again: open source maintainers' first responsibility is to take care of themselves, mentally and financially. You guys should proactively seek payments from whoever uses your work commercially, and if the $$ is not good enough, you are on your own if you continue doing this voluntarily.

If you keep the front door open, eventually thieves will come and steal your stuff. By the same principle, if you do not rigorously request payments and donations, you should expect people to take advantage of you.

From this perspective, Tan, whoever he/she is, actually did you a great service -- those big companies got a rude awakening and are scrambling to double check whether they are impacted, over the weekend. It is a pity that this did NOT get into a stable release -- I know it's very rude to say so, but let me be honest here -- it would have been a greater help to all open source maintainers if this had actually gotten into a stable release and fucked over as many people as possible.

[+] xyst|1 year ago|reply

What can be learned here and improved?

- systemd and libsystemd are an absolute mess. What are the alternatives at this point?

- what processes can be improved for Linux distributions to catch malicious actors trying to push back doors into the ecosystem?

- is there a dependency graph that would show how many programs use a specific utility library? Might aid in finding other possible attempts to backdoor
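
On the last question, the lookup itself is simple once you have the edges. A minimal sketch, assuming a mapping from each program or library to its direct dependencies; `reverse_dependents` is a hypothetical name and the sample graph is illustrative, not real package metadata.

```python
from collections import deque
from typing import Dict, List, Set


def reverse_dependents(deps: Dict[str, List[str]], target: str) -> Set[str]:
    """Return everything that depends on `target`, directly or transitively."""
    # Invert the edges: library -> set of things that link against it.
    users: Dict[str, Set[str]] = {}
    for node, needed in deps.items():
        for lib in needed:
            users.setdefault(lib, set()).add(node)

    # Breadth-first walk up the inverted graph.
    seen: Set[str] = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for user in users.get(current, ()):
            if user not in seen:
                seen.add(user)
                queue.append(user)
    return seen


if __name__ == "__main__":
    sample = {
        "sshd": ["libsystemd", "libcrypto"],  # as on distros that patch sshd
        "libsystemd": ["liblzma"],
        "xz": ["liblzma"],
        "ls": ["libc"],
    }
    print(sorted(reverse_dependents(sample, "liblzma")))
```

The hard part is building `deps` accurately for a whole distribution, including patched-in and load-time dependencies, not the traversal.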

[+] mrkramer|1 year ago|reply

At the end of the day, backdoors in open-source software projects will be caught sooner or later; the problem is backdoors in closed source projects, à la NSA backdoors.

To this day, I'm not completely sure if my Windows machine is secure or not. I remember Gates once said that governments in the early days of Windows demanded that Microsoft show them the source code of Windows, but even that is not enough since Microsoft can hotpatch whatever they want on their machines.

320 comments