top | item 45779832

(no title)

hannob | 4 months ago

Counterpoint: XML is a horrible format.

Why? Answer this question: how can you use XML in a way that does not create horrible security vulnerabilities?

I know the answer, but it is extremely nontrivial, and highly dependent on which programming language, library, and sometimes even which library function you use. The fact that there's no easy way to use XML without creating a security footgun is reason enough to avoid it.

discuss

order

Mikhail_Edoshin|4 months ago

I myself know only two "security vulnerabilities":

1. The entity bomb. An entity that expands to another, which expands to another, and so on so that the final result is enormous. This is an issue of the implementation: if it expands the entities eagerly then the bomb will work. But it it first examines them and checks how much space they require it can safely reject the document if it exceeds some configurable limit. As far as I know this has been fixed in all XML processors.

2. An entity can resolve to a local or remote file. First, this is a feature. Imagine a large collection of bibliographic records, each in a separate file. A publication can provide its list of references as a list of entities that refer to these files using entities. (There is an RFC that uses this as an example.) And, of course, we need both local and remote entities.

But, of course, if your XML comes from an untrusted source and you read it with this feature enabled this can lead to obvious disasters. Yet it is not a vulnerability of XML. Again, as far as I know all XML processors can disable access to local or remote entities.

rhdunn|4 months ago

You can say the same thing about HTML forms (see CORS et. al.), innerHTML, rendering user-submitted data, SQL, JSON, etc. That does not mean that you remove HTML forms or SQL databases.

If you removed support for anything that has/could have security vulnerabilities you would remove everything.

da_chicken|4 months ago

That's not any different than JSON, though. Injection, insecure deserialization , etc. can all exist in that format as well.

There's plenty of reasons to criticize XML, and plenty more to criticize XSLT. But security being the one you call out feels at least moderately disingenuous. It's a criticism of the library, not the standard or the format.

dtech|4 months ago

There's an extremely large difference in that a JSON deserialization vulnerability is almost always a bug in the library. JSON is not an inherently insecure format.

XML is so complex that a 100% bug-free compliant library is inherently insecure, and the vulnerability is a "user is holding it wrong" siutation, they should have disabled specific XML features etc. That means XML is an inherently much more insecure format.

There's a reason there's name for vulnerabilities like XML External Entity (XXE) injection [1] and they're named after XML, and not "bug in lib/software X". JSON and most other data formats don't have that.

[1] https://portswigger.net/web-security/xxe

nothrabannosir|4 months ago

Do those points not apply verbatim to HTML?

Let alone JavaScript…