Even though the video is somewhat sensationalized in places, it is well worth a watch for people who are interested in computers but don't have a background in the field. There is a nice mixture of everything from history (e.g. the founding of the FSF) to a clear explanation of a compression algorithm (clear enough that one should be able to implement it). It also makes claims that should make some people stop and think about the industry as a whole (such as Linux being the most important contemporary operating system).
I'm not sure if it is HN-crowd type material since it is easy enough information for most of us to dig up, assuming we didn't already know it. Yet it does not simplify things to the point of, "technology is magic."
The Huffman tree, LZ77, and LZMA explanations are truly excellent for how concise they are.
The earlier Veritasium video on Markov chains is itself linked, in case you don't know what a Markov chain is.
I expected Veritasium to tank when it got sold to private equity & Derek went to Australia, but I've been surprised by the quality of the long-form stuff churned out by Casper, Petr, Henry & Greg.
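To give a flavour of what "clear enough to implement it" would involve here, a minimal Huffman-code construction in Python (the standard textbook greedy algorithm, not code from the video) might look like this:

```python
import heapq
from collections import Counter

def huffman_codes(text: str) -> dict[str, str]:
    """Build a Huffman code table: frequent symbols get shorter codes."""
    freq = Counter(text)
    if len(freq) == 1:  # degenerate case: only one distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (frequency, unique tiebreaker, {symbol: code-so-far})
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        n1, _, left = heapq.heappop(heap)
        n2, _, right = heapq.heappop(heap)
        # Merging two subtrees prepends one bit to every code inside them
        merged = {s: "0" + c for s, c in left.items()}
        merged |= {s: "1" + c for s, c in right.items()}
        heapq.heappush(heap, (n1 + n2, counter, merged))
        counter += 1
    return heap[0][2]

codes = huffman_codes("abracadabra")
# 'a' occurs most often, so its code is never longer than anyone else's
assert all(len(codes["a"]) <= len(codes[s]) for s in codes)
```

The greedy merge of the two least-frequent subtrees is the whole trick: rare symbols sink deep into the tree and pick up long codes, while common symbols stay near the root.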
This is IMO one of the coolest tech stories to ever happen, seriously amazing spycraft & hacking skills, but I haven't been keeping up with new developments since the story broke. Last I heard, the best guess at what happened was that some state-sponsored actor worked very hard to get this merged, and it was luckily caught at the last minute. But no one had any smoking gun as to who did it, or why, or who they were targeting. Any new developments since then? Are we still just totally in the dark about what was going on here?
This isn't correct at all. The changes were merged into xz and made it into testing branches of major Linux distros.
It was caught at T plus a few minutes only because a neurotic Microsoft employee, while debugging, noticed an obscure performance issue.
You can literally say Microsoft saved Linux that day. Imagine thinking this 25 years ago.
It's the difference between something really bad which happened, and something really, really, really, really bad: a malicious actor having RCE credentials to every new Debian and Red Hat box on planet Earth.
> A lot of the aliases, like Jia Tan, they sound like Asian names, and the published changes are all timestamped in UTC+8, Beijing time. So the signs point to China. And that's why it's probably not China. I mean, why would they make it that obvious? Every other part of the operation has been so meticulous, so cautious.
> And they also worked on Chinese New Year, but not on Christmas. And over the years, there were nine changes that fall outside of the Beijing time into UTC+2, which is a time zone that includes Israel and parts of Western Russia. That's why some experts have speculated that this could be the work of APT29, a Russian-state-backed hacker group also known as Cozy Bear. But again, do we know? No, of course we don't know who it is, and we likely will never know.
I don't think it's really a money thing. IIRC the regular xz creator/maintainer had a regular job and enough money already; it was more of a burnout thing from dealing with the usual hassles of OSS. Which means what it really needs is to be taken over by an actual business organization, with a team of developers, professional project managers, customer support people, etc., so that no one person gets too burnt out, and if anyone does, they have plenty of backup.
Something that's puzzling about the xz backdoor attempt is that the attacker had to hide the evil payload somewhere, and he hid it in test files. AIUI it was injected at build time through a modified build script, and that went unnoticed (it was a compiled, deployed version that got caught by someone and raised the alarm).
Why are build scripts not operating in a clean directory, stripping away all test related files?
Isn't this something we should begin to consider doing, seeing that it's all too easy to put arbitrary things in test files (you can just pretend stuff is "fuzzed" or "random" or "test vectors" and whatnot: there's always going to be room to hide mischief in test files)?
Like literally building, but only after having erased all test directories/files/data.
Or put it this way: how many backdoors are actually live but wouldn't be if every single build was only done after carefully deleting all the irrelevant files related to tests?
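As a sketch of what that could look like in practice, here is a hypothetical staging step in Python. The excluded directory names are illustrative, not xz's actual layout, and a real build system would need per-project configuration:

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Directory names commonly used for test data; adjust per project.
EXCLUDED = {"tests", "test", "testdata", "fuzz"}

def stage_without_tests(src: Path, dest: Path) -> Path:
    """Copy the source tree into dest, skipping test directories."""
    staged = dest / src.name
    shutil.copytree(src, staged, ignore=shutil.ignore_patterns(*EXCLUDED))
    return staged

def build_without_tests(src: Path) -> None:
    """Build from a scratch copy that physically lacks any test files."""
    with tempfile.TemporaryDirectory() as scratch:
        staged = stage_without_tests(src, Path(scratch))
        # The build script now cannot read anything under tests/,
        # so a payload hidden in "test vectors" is simply not there.
        subprocess.run(["make", "-C", str(staged)], check=True)
```

This doesn't stop a malicious build script outright, but it removes the favourite hiding place: the build can only see files that are plausibly needed to build.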
I had a question. People are claiming that the spike in ssh time, i.e. the performance degradation, was not much.
But in the video itself, they show that the normal ssh time was about 100 ms and that the new time was about 600 ms, almost six times as long. I would expect benchmark performance to drop significantly with times like these, and it should be obvious that something was wrong.
(I am taking nothing away from Andres here. I think he's a brilliant engineer to actually find the root cause of this himself. He is a hero. I am just pointing out that 500 ms is not some obscure time interval.)
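The underlying point is that a fixed ~500 ms penalty dwarfs normal run-to-run noise. A toy sketch of that reasoning in Python (the numbers below are the video's figures, not a real sshd benchmark):

```python
import time

def measure_ms(fn, repeats: int = 5) -> float:
    """Median wall-clock time of fn() in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000)
    return sorted(samples)[len(samples) // 2]

def is_regression(baseline_ms: float, current_ms: float,
                  threshold: float = 2.0) -> bool:
    """Flag anything slower than threshold times the baseline."""
    return current_ms > threshold * baseline_ms

# With the video's figures, the slowdown is unmissable:
assert is_regression(100.0, 600.0)      # 6x slower: flagged
assert not is_regression(100.0, 120.0)  # 20% noise: not flagged
```

Of course, nobody was routinely benchmarking sshd logins, which is why it took someone profiling for an unrelated reason to stumble onto it.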
...and yet, zero mention of systemd's recommendation for programs to link in the libsystemd kitchen sink just to call sd_notify() (which should really be its own library)
...and no mention of why systemd felt the need to preemptively load compression libraries, which it only needs for reading/writing compressed log files, even when you never touch compressed log files at all? Again, it's a whole independent subsystem that could be its own library.
The video showed that xz was a dependency of OpenSSH. It showed on screen, but never said aloud, that this was only because of systemd. Debian/Red Hat's sshd [0] was started with systemd, and they added in a call to the sd_notify() helper function (which simply sends a message to the $NOTIFY_SOCKET socket), just to inform systemd of the exact moment sshd is ready. This loads the whole of libsystemd, which in turn loads the whole of liblzma. Since the xz backdoor, OpenSSH no longer uses the sd_notify() function directly; it has its own code to connect to $NOTIFY_SOCKET. And the sd_notify manpage begrudgingly gives a listing of code you can use to avoid calling it, so if you're an independent program with no connection to systemd that just wants to notify it you've started, you don't need to pull in the libsystemd kitchen sink. As it should've been in the first place.
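For a sense of how small the protocol being described is: it's one datagram to the Unix socket named in $NOTIFY_SOCKET. A hedged Python sketch of a standalone readiness notification, modelled on the idea in the sd_notify(3) man page (an illustration, not OpenSSH's actual replacement code):

```python
import os
import socket

def notify_ready() -> bool:
    """Tell a notify-aware service manager we're ready, no libsystemd needed."""
    addr = os.environ.get("NOTIFY_SOCKET")
    if not addr:
        return False  # not running under a notify-aware service manager
    if addr.startswith("@"):
        addr = "\0" + addr[1:]  # Linux abstract-namespace socket
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.connect(addr)
        sock.sendall(b"READY=1")
    return True
```

A service run under Type=notify could call notify_ready() once it is actually listening; no libsystemd, and therefore no liblzma, ever gets loaded into the process.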
Is the real master hacker Lennart Poettering, for making sure his architectural choices didn't appear in this video?
[0]: as an aside, the systemd notification code is only in Debian, Red Hat et al because OpenSSH is OpenBSD's fork of Tatu Ylönen's SSH (which went on to become proprietary software). systemd is Linux-only and will never support OpenBSD, so likewise OpenBSD doesn't include any lines of code in OpenSSH to support systemd. Come to think of it, "BSD" is another thing they don't mention in the script, despite mentioning the AT&T lawsuit (https://en.wikipedia.org/wiki/USL_v._BSDi)
When I was being interviewed, we did talk about exactly this, including that libsystemd is a kitchen sink, and that eventually OpenSSH went with open-coding the equivalent of sd_notify instead of depending on libsystemd. (Also that, ahem, Red Hat added the dependency on libsystemd in a downstream patch. Oops.)
However the editors (correctly IMHO) took the decision to simplify the whole story of dependencies. In an early draft they simplified it too much, sort of implying that sshd depended directly on liblzma, but they corrected that (adding the illustration of dependencies) after I pointed out it was inaccurate.
I agree with everything you say, but you have to pick your battles when explaining very complicated topics like shared libraries to a lay audience.
In general I was impressed by their careful fact checking and attention to detail.
Sadly they missed the misspelling (UNRESOVLED) even though I pointed it out last week :-( But that's literally the only thing they didn't fix after my feedback.
It did get mentioned, in the context of the upstream change to dynamically load those libraries being a threat to the hack's viability, which may have caused "Jia Tan" to rush and accidentally make mistakes in the process.
From my vague memory of the xz backdoor, I don't even recall systemd being involved. Now I get what people were talking about when they said systemd was taking over everything, and why there was so much pushback against systemd when it was being added to distros. For me, as an end user/dev, it mattered little whether services were started by systemd, OpenRC, etc.
I actually watched this last night, and while I totally understand that criticism is easy and making things is hard (and the production quality here is great), I got a weird vibe from the video when it comes to who it is for.
The technical explanations are way too complex (even though they're "dumbed down" somewhat with the colour-mixing scenario): anyone who can follow them will already know how dependencies work and how Linux came to be.
It feels almost like it's made for people like my mum, but it will lose them almost immediately at the first mention of complex polynomials.
The actual weight of the situation kinda lands though, and that's important. It's really difficult to overstate how incredibly lucky we were to catch it, and how sophisticated the attack actually was.
I'm really sad that we will genuinely never know who was behind it, and anxious that such things are already in our systems.
My partner, who is an accountant (so intelligent but not technical), watched some Veritasium documentaries the other day.
Her comment was that she was really impressed that it didn't dumb anything down like normal documentaries do. She was able to follow along with more technical stuff than she anticipated, and that made her enjoy it even more.
I think we need to give people more credit when it comes to complex or technical explanations. If people are enjoying the context but don't understand the technical parts, they can just gloss over them if they prefer. I felt this was quite telling about how and why Veritasium is such a popular channel.
Veritasium started out as a physics channel, and they've covered a wide variety of physics, math and science topics. They are never afraid of showing you the math, but one of the things I think they are really good at is not losing the human part of the story even if you can't follow the numbers exactly. At the end of the day it's humans who came up with this stuff in the first place, so it must be possible to understand it.
They aren't really a technology channel though, at least as it relates to software/computers, so that's probably why the video starts out with a brief history of Linux.
With the enormous budgets we allocate in the name of "national security", this is exactly the kind of work I expect TLAs to do.
Instead we have come to expect them to cowardly sit on exploits, or actively introduce them, rather than working to secure the general public from adversaries.
ting0|3 days ago
This is the scariest part to me:
> A pull request (https://github.com/jamespfennell/xz/pull/2) to a go library by a 1Password employee is opened asking to upgrade the library to the vulnerable version
k2enemy|3 days ago
https://oxide-and-friends.transistor.fm/episodes/discovering...
forinti|3 days ago
Europe should have an equivalent scheme for programmers of important Open Source projects such as this one.
mc32|3 days ago
Also, today, as I understand it, much of OSS is done in-house by major companies (Red Hat, Ubuntu, IBM, Google, etc.)
bigbadfeline|3 days ago
systemd is doing what it was designed to do... Cute videos are doing what they were designed to do too: hiding that!
> OpenBSD don't include any lines of code in OpenSSH to support systemd. Come to think of it, "BSD" is another thing they don't mention in the script
And this!
mbauman|3 days ago
(But also, my conspiratorially-inclined mind is quite entertained by the thought of some sort of parallel construction or tip from a TLA.)
What a mess.
badocr|2 days ago
For sure you were/are not alone in this thinking. How fast the whole thing was exposed, in decent enough detail, was... surprising.