What concerns me most is that, looking at the fix, it is very difficult to see why it is correct. There also appears to be a lot of code without explicit bounds checks. That worries me because, while the logic may be safe, it is very complex. I wonder what the cost would be of adding an explicit, local bounds check at every array access. It would serve as a backup that is much easier to verify. I suspect the cost would be relatively small; small enough that I personally would be happy to pay it.
This is also a great reminder that fuzzing isn't a solution to memory unsafe languages and libraries. If anything the massive amount of bugs found via fuzzing should scare us as it is likely only scratching the surface of the vulnerabilities that still lie in the code, a couple too many branches away from being likely to be found by fuzzing.
> ...isn't a solution to memory unsafe languages and libraries. If anything the massive amount of bugs found via fuzzing should scare us as it is likely only scratching the surface of the vulnerabilities that still lie in the code
Yup. For example, the Linux code for its relatively new[1] io_uring subsystem was so memory-exploit-ridden that Google disabled it for apps on Android, entirely on ChromeOS, and on their servers[2]. It is insane how easy it has been to break into Linux this way.
[1] released with kernel version 5.1 which came out in May 2019
Fuzzing needs to cover all important bits of the code to be useful. The problem I see is that incomplete coverage creates a false sense of security. Projects have some minimal fuzzing coverage (e.g. in oss-fuzz) and care less about the quality of the code, thinking fuzzing will catch all security bugs.
Rust code needs proper fuzzing too. It takes a lot of effort to ensure everything is covered and stays covered as the code is developed. Crashing libraries or applications can be a denial of service. Sure, it's lower impact than an RCE due to a buffer overflow, but it is still a security issue.
This is the sort of thing where I am very curious to hear what happens if you fine-tune an uncensored version of GPT-4 on some memory exploits like this, along with the vulnerable code, and ask it for more.
It seems like it ought to be really good at this, and I suspect that people in the know are afraid to talk about it publicly because it's too good at it. Once people start weaponizing LLMs for this purpose, we just won't be able to use memory-unsafe code anymore.
> To put this in context: if this bug does affect Android, then it could potentially be turned into a remote exploit for apps like Signal and WhatsApp. I'd expect it to be fixed in the October bulletin.
Interesting quote from Ben Hawkes (former Project Zero manager) in the article. I regularly compile Signal-Android from source and happened to notice they vendored libwebp a few days ago:
Android is particularly troublesome here with the number of phones out there receiving no updates, and just a single download away from being exploited.
For me, personally, it's a race to see if Google can get this patched for my Pixel 5 before security updates stop in October.
My phone came out in 2017. I'm still on Android 8. There's an update to Android 9.1 available somewhere, but AT&T never got around to porting it to their crapware-riddled fork.
The point about Android is particularly important. I wouldn't like to estimate the proportion of Android phones that are in regular use that no longer receive security updates.
Android phones don't have an awful lot of attack surface for typical users, though. Messenger apps already refuse to display arbitrary images - WhatsApp, for example, will only display JPEGs and MP4s sent from other contacts.
Is there any site that shows what phones are still getting security updates? I'm worried that this will be the thing that makes me retire my son's old Moto G... 5 I think? It's probably out of security updates. Which kills me, he's a careful boy and it's a solid phone, this is unnecessary E-Waste.
I’m surprised, given the history of exploits, Google didn’t decide to start shipping the image decoders as an APEX system component in the Play Store, with the built-in ones serving as a fallback. They just might now.
I'm tired and cranky today so this will lack subtlety, but:
You don't have to use Rust but you **can't** use C.
There's no reason to be finding these bugs in 2023. Period. We can do better and we know how to do better; there's just no reason, apart from legacy code (and even then), that you should be using memory-unsafe languages in production.
This is pretty bad: there will be many vulnerable applications, and someone just posted how to exploit it. I just reproduced it in a VM:
SUMMARY: AddressSanitizer: heap-buffer-overflow (/home/<name>/webp_test/examples/dwebp+0xb24e9) in BuildHuffmanTable
Shadow bytes around the buggy address:
...
Also, here's the webp it generated as base64, but Chrome doesn't crash or anything. Other apps may handle it differently, but many just say invalid format.
Remember when the primary purpose of Flatpak's shared runtimes was that libraries like libwebp would not actually be bundled per application? The libwebp in the runtimes has had the fix for over a week.
Buffer overflows are bugs. You're asking how bugs can still be happening in this day and age? The answer is inadequate testing and broken engineering processes.
Additional timeline info, as I was curious myself. WebP is old enough that a memory safe language was not a feasible option when the project started.
Android 12 was the first version to support Rust code, and came out in 2021 [0, link talks about the first year of integration].
On the iOS side (which also was affected by this), Swift 1.0 came out in ~2014.
As far as I can tell, Chrome doesn't yet support a memory safe language, but it does have a bunch of other safety features built in (see MiraclePtr, sandboxing, etc). Since both WebP and Chrome are from Google, this would stop a possible transition.
WebP was announced in 2010, and had its first stable release in 2018 [1].
> Chrome doesn't yet support a memory safe language
In addition to the safety features you mentioned, Chrome supports Wuffs, a memory safe programming language that supports runtime correctness checks, designed for writing parsers for untrusted files. I don’t think it existed at the start of the webp project either, but that’s what I would expect the webp parser to be written in, over Rust or a garbage collected language.
Is Rust in practice a memory safe language when you're doing tricks like decoding Huffman-coded Huffman tables into buffers? It seems like once you optimize for performance this much, you're liable to turn off bounds checking here or there.
This is a needlessly aggressive take IMO, and hailing Rust as the be-all and end-all for secure software lacks nuance, giving the language and its adherents a bad reputation.
Observation: Uncompressed bitmaps, while bloated in terms of necessary bandwidth, still are provably the most secure form of bitmap -- just as uncompressed video (again, while super-bloaty and bandwidth intensive) would be...
That is, to abstract, our security issue exists because:
A) There is complex compression/decompression software/code;
B) To implement this compression/decompression -- there are one or more lookup tables in effect;
C) The software implementing those lookup tables and the decompression side of things was never properly fuzzed, bounds-checked, and/or mathematically proven not to create out-of-bounds errors -- that is, proven that every potential use of the lookup table is correct for any possible combination of data in an input stream.
Also -- Didn't stuff like this already happen in GIF and PNG formats? Weren't there past security vulnerabilities for those formats?
Isn't this just (I dare to say) -- computer history repeating itself ?
Point: Software Engineering Discipline:
If you as a Software Engineer implement a decompressor for whatever format (or hell, more broadly and generically something that uses lookup tables on incoming streams of data) -- then the "burden of proof" is on you, to prove (one way or another) that it does not have vulnerabilities.
Fuzzing can help, mathematics and inductive/deductive logic can help, testing can help, and paranoid coding (always bounds-checking array accesses, etc.) can help, running in virtual machines and other types of limited environments and sandboxes can help. Running as interpreted code (albeit slower) could help. Deferring decompression to other local network attached resources running in limited execution environments could help.
In short... a monumental challenge... summed up as:
"Prove that all code which uses lookup tables can not generate hidden/unwanted states."
Also... many posters suggest use of Rust may be a solution...
It may turn out to be... but I think a broader generalization of the language/compiler aspect of things may be:
Not so much to "use Rust", so much as "NOT to use C".
That is -- any library performing decompression of any sort (regardless of whether that decompression is related to visual images or not), if it uses C and lookup tables and implements decompression -- should at least be considered a potential source of future problems, and should (ideally) be migrated to a more memory-safe, bounds-checked language -- in the future.
Rust -- may or may not turn out to be this language...
Negatives for Rust -- large size and complexity of Rust's compiler source code.
Positives for Rust -- Rust's treatment of memory and bounds-checking.
Anyway, some thoughts on the language/compiler aspect of this...
Yes, but as a counterargument: there are battle-tested compression formats that offer pretty-damned-good compression. If anything, this raises questions about the risk-versus-reward calculus of embracing new compression standards. When was the last time there was a serious vulnerability in a major JPEG library? Or even in h.264, which is much newer? Yes, a .webp is about 2/3 the size of a comparable JPEG, but for most people, are JPEG file sizes a huge cause for concern? Once you're doing lossy compression you're already sacrificing fidelity for file size, so you don't see a lot of machine-crushingly-huge JPEGs.
kevincox|2 years ago
https://github.com/webmproject/libwebp/commit/902bc919033134...
winter_blue|2 years ago
[2] in June 2023: https://en.m.wikipedia.org/wiki/Io_uring#Security
jsnell|2 years ago
(Discussed in https://news.ycombinator.com/item?id=37600852)
londons_explore|2 years ago
We just keep track of the table size and make it bigger if necessary.
We additionally have a mode where we just calculate the necessary size, without writing any data structures.
So we have kinda added double safety against this particular bug.
aorth|2 years ago
https://github.com/signalapp/Signal-Android/commit/a7d9fd19d...
nikanj|2 years ago
*For the small fraction of Android phones that are new enough to get updates
izacus|2 years ago
Pixel 5 should still qualify for it.
changelink|2 years ago
The security bulletin doesn't reference this CVE specifically but does mention a critical vulnerability that could lead to RCE.
yellow_lead|2 years ago
The webp it generated, as base64:
UklGRukAAABXRUJQVlA4TN0AAAAvAAAAAPAAWgAAsKwlnZsEAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAALRt27Zt27Zt27Zt27Zt2/b92fUAWgAAsLTknJskSQAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAALw/23oALQAAWFpyzk2SJAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAN6f bT2AFgAALC055yZJEgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAO/Pth7A/xsAQoutz/f3 fwAwMzNvVVXt7p5zLw==
albntomat0|2 years ago
[0]: https://security.googleblog.com/2022/12/memory-safe-language...
[1]: https://en.wikipedia.org/wiki/WebP
MatthiasPortzel|2 years ago
https://github.com/google/wuffs
pjmlp|2 years ago
However, we are going in the right direction. Microsoft has also put this in place for all those codebases that aren't being rewritten any time soon:
https://learn.microsoft.com/en-us/cpp/code-quality/build-rel...
ocdtrekkie|2 years ago
"Oops"
Dwedit|2 years ago
https://cve.mitre.org/cgi-bin/cvename.cgi?name=can-2004-0566