top | item 46728793

(no title)

XML was a product of its time, when after almost 20 years of CPUs rapidly getting quicker, we contemplated that the size of data wouldn't matter, and data types won't matter (hence XML doesn't have them, but after that JSON got them back) -- we expected languages with weak type systems to dominate forever, and that we would be working and thinking levels above all this, abstractly, and so on.

I remember XML proponents back then argued that it allows semantics -- although, it was never clear how a non-human would understand it and process.

The funny thing about namespaces is that the prefix, by the XML docs, should be meaningless -- instead you should look at the URL of the namespace. It's like if we read a doc with snake:front-left-paw, and ask how come does a snake have paws? -- Because it's actually a bear -- see the definition of snake in the URL! It feels like mathematical concepts -- coordinate spaces, numeric spaces with different number 1 and base space vectors -- applied to HTML. It may be useful in rare cases. But few can wrap their heads around it, and right from the start, most tools worked only with exactly named prefixes, and everyone had to follow this way.

discuss

zzo38computer|1 month ago

> data types won't matter (hence XML doesn't have them, but after that JSON got them back)

JSON does not have very much or very good data types either, but (unlike XML) at least JSON has data types. ASN.1 has more data types (although standard ASN.1 lacks one data type that JSON has (key/value list), ASN.1X includes it), and if DER or another BER-related format is used then all types use the same framing, unlike JSON. One thing JSON lacks is octet string type, so instead you must use hex or base64, and must be converted after it has been read rather than during reading because it is not a proper binary data type.

> The funny thing about namespaces is that the prefix, by the XML docs, should be meaningless -- instead you should look at the URL of the namespace. It's like if we read a doc with snake:front-left-paw, and ask how come does a snake have paws? -- Because it's actually a bear -- see the definition of snake in the URL!

This is true of any format that you can import with your own names though, and since the names might otherwise conflict, it can also be necessary. This issue is not only XML (and JSON does not have namespaces at all, although some application formats that use it try to add them in some ways).

g-b-r|1 month ago

> right from the start, most tools worked only with exactly named prefixes, and everyone had to follow this way

What tools? Namespaces being defined by their urls is sure not the reason XML is complex, and the tools I remember running into supported it well

culebron21|1 month ago

Ok, I remember people complaining of this, so I have got it wrong.

Mikhail_Edoshin|1 month ago

Semantic in machine processing is actually very simple: if a machine has an instruction to process an element and we know what it does, then the element is semantic.

So, for example, <b> and <i> have perfect semantic, while <article> not so much. What does the browser do with an <article>? Or maybe it is there for an indexing engine? I myself have no idea (nor that I investigated that, I admit).

But all that was misunderstood, very much like XML itself.

zzo38computer|1 month ago

The <article> command in HTML can be useful, even if most implementations do not do much with it. For example, a browser could offer the possibility to print or display only the contents of a single <article> block, or to display marks in the scrollbar for which positions in the scrollbar correspond to the contents of the <article> block. It would also be true of <time>; although many implementations do not do much with it, they could do stuff with it. And, also of <h1>, <h2>, etc; although browsers have built-in styles for them, allowing the end user to customize them is helpful, and so is the possibility of using them to automatically display the table of contents in a separate menu. None of these behaviours should need to be standardized; they can be by the implementation and by the end user configuration etc; only the meaning of the commands will be standardized, not their behaviour.