A corollary to this is that if we had a simpler function for converting between UTF-8 and UTF-16LE, then I could remove all uses of iconv from my code, since I only use it to convert to/from MS Windows formats. (iconv's API is ugly and difficult to use correctly.)
Man, if English were the only human language in the world, who would need UTF-8?
The other encodings exist because they are more efficient for other languages, especially Chinese, Japanese, and Korean. For those scripts, UTF-8 takes about 50% more space than the legacy alternatives. Too bad modern Linux systems only support UTF-8 locales.
> Man, I wish everything was UTF-8 so iconv would not be needed anymore. Too bad it's defined in POSIX.
I wish nothing was in UTF-8 and Unicode was relegated to properties files. There are codebases out there with complete i18n and l10n in more languages than most here have ever worked with, where zero Unicode characters are allowed in source code files (with pre-commit hooks preventing such files from being committed).
Bruce Schneier was right all along when he said, back in 1998 or whenever it was: "Unicode is too complex to ever be secure".
We've seen countless exploits based on Unicode. The latest (re)posted here on HN was a few days ago: a Unicode-parsing issue affecting OpenSSL. Why was that code there at all? To support internationalized domain names and/or internationalized emails.
Something that should never have been authorized.
We don't need more of what brings countless security exploits: we need less of it.
Relegate Unicode to translation/properties files, where it belongs.
Sure, Unicode is great for documents, chat, etc.
But everything in UTF-8? emails? domain names? source code? This is madness.
I don't understand how anyone can regard the fact that HANGUL fillers are valid in source code as somehow a great win for our industry.
That's the other direction (legacy charset conversion to UCS-4 or UTF-8), and that direction is often reachable via the charset parameter in the Content-Type header and similar MIME contexts.
I seriously doubt you can make PHP convert anything to that exotic charset automatically, even with creative configuration, and I'm pretty sure it wouldn't do anything of the sort in a common configuration. What I suspect is going on is that the author is interested in exploiting the PHP engine, assumes PHP code that uses iconv(), and wants to talk about how to get from there to full-scale RCE. It is indeed a fascinating and non-trivial topic, though the relationship between this particular CVE and the PHP angle is rather coincidental: any buffer overflow would do, it's just that the author happened to have one in a reasonably common function.
My guess is that it's application specific: PHP applications that use the iconv function in some specific way, in some specific context, will be vulnerable.
Karellen | 1 year ago
Well, a conforming implementation could just return -1 with errno set to EINVAL from `iconv_open()` for any given pair of character codes.
https://manpages.debian.org/bookworm/manpages-dev/iconv_open...
fweimer | 1 year ago
HTTP theoretically supports Accept-Charset, but it's deprecated:
https://www.rfc-editor.org/rfc/rfc9110.html#name-accept-char...
But I think on-the-fly charset conversion in the web server is quite rare. Apache httpd does not seem to implement it: https://httpd.apache.org/docs/2.4/content-negotiation.html#m...
The charset in question does not have a locale associated with it (it's not even ASCII-transparent), so I don't think it's usable in a local context together with SUID/SGID/AT_SECURE programs.
lyu07282 | 1 year ago
https://www.php.net/manual/en/function.iconv.php
pengaru | 1 year ago
https://en.wikipedia.org/wiki/Magic_quotes
saagarjha | 1 year ago
Wonder what the story is here. Burned 0-day? Not worth exploiting? lolz?