Smells like an article from someone who didn’t really USE the XML ecosystem.
First, there is modeling ambiguity: too many ways to represent the same data structure. Which means you can’t parse into native structs but instead into a heavy DOM object, and it sucks to interact with it.
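To make the ambiguity concrete, here is a quick sketch (element and attribute names are made up) of three equally valid XML encodings of the same record:

```python
import xml.etree.ElementTree as ET

# Three equally valid XML encodings of the same record (names are made up):
docs = [
    '<user id="1" name="x"/>',                # everything as attributes
    '<user><id>1</id><name>x</name></user>',  # everything as child elements
    '<user id="1"><name>x</name></user>',     # a mix of both
]

def naive_to_dict(xml_text):
    root = ET.fromstring(xml_text)
    d = dict(root.attrib)                     # attributes...
    d.update({c.tag: c.text for c in root})   # ...plus child elements
    return d

# All three collapse to the same dict, but a generic parser cannot know
# which shape to expect without a schema.
print([naive_to_dict(d) for d in docs])
```

A schema-less decoder has to guess among these shapes, which is exactly why generic tooling falls back to a DOM.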
Then, schemas sound great, until you run into DTD, XSD, and RELAX NG. RELAX NG only exists because XSD is pretty much incomprehensible.
Then let’s talk about entity escaping and CDATA. And how you break entire parsers because CDATA is a separate incantation on the DOM.
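For what it's worth, CDATA is purely a lexical escape: once parsed, it is indistinguishable from entity-escaped text, as this little sketch shows. The pain only appears in DOM-style APIs that surface CDATA sections as a separate node type:

```python
import xml.etree.ElementTree as ET

# CDATA vs entity escaping: two spellings of the same text content.
a = ET.fromstring('<msg><![CDATA[a < b & c]]></msg>')
b = ET.fromstring('<msg>a &lt; b &amp; c</msg>')

print(a.text)  # a < b & c
print(b.text)  # a < b & c
```

ElementTree merges CDATA into ordinary text; DOM APIs that expose it as a distinct node type are where the "separate incantation" bites.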
And in practice, XML is always over-engineered. It’s the AbstractFactoryProxyBuilder of data formats. SOAP and WSDL are great examples of this, vs looking at a JSON response and simply understanding what it is.
I worked with XML and all the tooling around it for a long time. Zero interest in going back. It’s not the angle brackets or the serialization efficiency. It’s all of the above brain damage.
XML grew from SGML (like HTML did), and it brought from it a bunch of things that are useless outside a markup language. Attributes were a bad idea. Entities were a so-so idea, which became unapologetically terrible when URLs and file references were allowed. CDATA was an interesting idea but an error-prone one, and likely it just did not belong.
OTOH namespaces, XSD, XSLT were great, modulo the noisy tags. XSLT was the first purely functional language that enjoyed mass adoption in the industry. (It was also homoiconic, like Lisp, amenable to metaprogramming.) Namespaces were a lifesaver when multiple XML documents from different sources had to be combined. XPath was also quite nice for querying.
XML is noisy because of the closing tags, but it also guarantees a level of integrity, and LZ-type compressors, even gzip, are excellent at compacting repeated strings.
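The compression point is easy to check with a sketch (the record structure is hypothetical):

```python
import gzip

# Closing tags repeat heavily, and LZ-style compressors eat repetition:
# 1000 rows of the same hypothetical tag structure.
xml = "<rows>" + "<row><id>1</id><name>ab</name></row>" * 1000 + "</rows>"
packed = gzip.compress(xml.encode())

print(len(xml), "->", len(packed))
# The tag overhead all but disappears under gzip.
```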
Importantly, XML is a relatively human-friendly format. It has comments, requires no quoting, no commas between list items, etc.
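A tiny illustration of the comments point (the config keys are made up): XML parsers accept comments as part of the grammar, while strict JSON rejects them outright:

```python
import json
import xml.etree.ElementTree as ET

# XML comments are part of the grammar; parsers just skip them.
root = ET.fromstring('<config><!-- tune me later --><retries>3</retries></config>')
print(root.find('retries').text)  # 3

def json_accepts(text):
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

print(json_accepts('{"retries": 3}'))                      # True
print(json_accepts('{"retries": 3 /* tune me later */}'))  # False
```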
Complexity killed XML. JSON was stupid simple, and thus contained far fewer footguns, which was a very welcome change. It was devised as a serialization format, a bit human-hostile, but mapped ideally to bag-of-named-values structures found in basically any modern language.
Now we see XML tools adapted to JSON: JSON Schema, JSONPath, etc. JSON5 (as used in e.g. VSCode) allows for comments, trailing commas and other creature comforts. With tools like that, and dovetailing tools like Pydantic, XML lost any practical edge over JSON it might ever have had.
What's missing is a widespread replacement for XSLT. Could be a fun project.
And of course XML libraries haven't had any security issues (oh look, CVE-2025-49796) and certainly would not need to make random network requests for a DTD of "reasonable" complexity. I also dropped XML, and that's after having a website that used XML, XSLT rendering to different output forms, etc. There were discussions at the time (early to mid 2000s) of moving all the config files on unix over to XML. Various software probably still carries the scars of that era, and therefore an XML dependency; is that an embiggened attack surface? Also, namespaces are super annoying; pretty sure I documented the ughsauce necessary to deal with them somewhere. Thankfully, crickets serenade the faint cries of "Bueller".
The contrast with only JSON is far too simplistic; XML got dropped from places where JSON is uninvolved, like why use a relational database when you can have an XML database??? Or those config files on unix are for the most part still not-XML and not-JSON. Or there's various flavors of markdown which do not give you the semi-mythical semantic web but can be banged out easily enough in vi or whatever and don't require schemas and validation or libraries with far too many security problems and I wouldn't write my documentation (these days) using S-expressions anyhow.
This being said there probably are places where something that validates strictly is optimal, maybe financial transactions (EDIFACT and XML are different hells, I guess), at least until some cheeky git points out that data can be leaked by encoding with tabs and spaces between the elements. Hopefully your fancy and expensive XML security layer normalizes or removes that whitespace?
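A minimal sketch of such normalization, assuming that stripping whitespace-only text and tail nodes is acceptable for the format in question (element names are made up):

```python
import xml.etree.ElementTree as ET

# Inter-element whitespace is invisible to most consumers, so it can act
# as a covert channel. Strip it before signing or forwarding:
def normalize(root):
    for el in root.iter():
        if el.text is not None and not el.text.strip():
            el.text = None  # whitespace-only text between elements
        if el.tail is not None and not el.tail.strip():
            el.tail = None
    return root

leaky = ET.fromstring('<tx>\t <amt>10</amt>\t\t<cur>EUR</cur>  \t</tx>')
clean = ET.tostring(normalize(leaky), encoding='unicode')
print(clean)  # <tx><amt>10</amt><cur>EUR</cur></tx>
```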
I read the article and my first thought was that it entirely missed the complexity of XML. It started out relatively simple and easy to understand, and most people/programs wrote simple XML that looked a lot like HTML still does.
But it didn't take long before XML might as well be a binary format for all it matters to us humans looking at it, parsing it, dealing with it.
JSON came along and its simplicity was baked in. Anyone can argue it's not a great format, but it forcefully maintains the simplicity that XML lost quite quickly.
> JSON has no such mechanism built into the format. Yes, JSON Schema exists, but it is an afterthought, a third-party addition that never achieved universal adoption.
This really seems like it's written by someone who _did not_ use XML back in the day. XSD is no more built-in than JSON Schema is. XSD was first-party (it was promoted by the W3C), but it was never a "built-in" component of XML, and there were alternative schema formats. You can perfectly well write XML without XSD, and back in the heyday of XML in the 2000s, most XML documents did not have an XSD.
Nowadays most of the remaining XML usages in production rely heavily on XSD, but that's a bit of a survivorship bias. The projects that used ad-hoc XML as configuration files, simple document files or as an interchange format either died out, converted to another format or eventually adopted XSD. Since almost no new projects are choosing XML nowadays, you don't get an influx of new projects that skip the schema part to ship faster, like you get with JSON. When new developers encounter XML, they are generally interacting with long-established systems that have XSD schemas.
This situation is purely incidental. If you want to get the same result with JSON, you can just use JSON Schema. But if we somehow magically convince everybody on the planet to ditch JSON and return to XML (please not), we'll get the same situation we have had with JSON, only worse. We'll just get back to where we were in the early 2000s, and no, that wasn't good.
> I worked with XML and all the tooling around it for a long time. Zero interest in going back. It’s not the angle brackets or the serialization efficiency. It’s all of the above brain damage.
I remember a decade ago seeing job ads that explicitly requested XML skills. The fact that being able to do something with XML was considered a full time job requiring a specialist says everything there is to be said about XML.
Hence why in 2026, I still hang around programming stacks like Java and .NET, where XML tooling is great, instead of having to fight with YAML format errors, the Norway problem, or JSON without basic stuff like comments.
>First, there is modeling ambiguity, too many ways to represent the same data structure. Which means you can’t parse into native structs but instead into a heavy DOM object and it sucks to interact with it.
I don’t get this argument. There exist streaming APIs with convenient mapping. Yes, there can exist schemas with weird structure, but in practice they are uncommon. I have seen a lot of integration formats in XML, never had the need to parse to DOM first.
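Agreed; as a sketch (the feed structure is hypothetical), Python's stdlib `iterparse` goes straight from a stream to plain records, with no DOM in sight:

```python
import io
import xml.etree.ElementTree as ET

# Streaming straight into plain records (hypothetical order feed).
feed = io.StringIO(
    '<orders>'
    '<order><id>1</id><total>9.50</total></order>'
    '<order><id>2</id><total>3.20</total></order>'
    '</orders>'
)

orders = []
for event, el in ET.iterparse(feed, events=('end',)):
    if el.tag == 'order':
        orders.append({c.tag: c.text for c in el})
        el.clear()  # free memory as we go

print(orders)
```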
You managed to convey my thoughts exactly, and you only used the term "SOAP" once. Kudos!
SOAP was terrible everywhere, not just in Nigeria as OP insinuates. And while the idea of XML sounds good, the tools that developed on top of it were mostly atrocious. Good riddance.
SOAP is the biggest mindfuck I've ever interacted with. You have a complex and verbose XML based format where you repeat the tag name, but then you realize that the implementations don't care what the parameters are named and just read them in a fixed order. Coming from JSON based APIs this took me extremely long to realize.
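A hedged illustration of how that positional decoding can happen (this is not any particular SOAP stack, just a toy decoder that binds children by document order):

```python
import xml.etree.ElementTree as ET

# A positional decoder ignores tag names entirely and binds children
# to parameters by document order -- as some RPC-encoded SOAP stacks did.
def decode_by_position(xml_text, params):
    root = ET.fromstring(xml_text)
    return dict(zip(params, (c.text for c in root)))

# Tag names here are wrong on purpose; the decoder does not care.
req = '<transfer><foo>42</foo><bar>EUR</bar></transfer>'
print(decode_by_position(req, ['amount', 'currency']))
# {'amount': '42', 'currency': 'EUR'}
```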
I had great experiences with XSD as a contract in systems integration scenarios, particularly with big systems integrators. It's pretty clear whose fault it is when somebody's XML doesn't validate.
I would encourage anyone who thinks that XML is strictly inferior to attempt integration with certain banking vendors without use of their official XSD/WSDL sources. I've generated service references that are in the tens of megabytes. This stuff is not bloat. There are genuinely this many types and properties in some business systems. There is no way you could hand code this and still get everything else done.
The entire point of heavy-handed XML is to 1:1 the type system across the wire. Once I generate my service references, it is as if the service is on my local machine. The productivity gains around having strongly typed proxies of the remote services are impossible to overstate. I can wire up entirely new operations without looking at the documentation most of the time. Intellisense surfaces everything I need automatically as I drill into the type system.
JSON can work and provide much of the same, but XML has already proven to work in some of the nastiest environments. It's not the friendliest or most convenient technology, but it is an extremely effective technology. I am very confident that the vendors I work with will continue to use XML/WCF/SOAP into 2030.
UAE Central Bank has launched its digital transformation initiatives: AANI and Jaywan.
UAE's AEP (Al Etihad Payments) launched AANI as a digital payments platform (it is actually based on India's phenomenally successful "UPI"; its technology stack was licensed to the UAE).
Jaywan is UAE's domestic card scheme (in competition with Visa, MasterCard, etc.), actually based on India's successful RuPay technology stack, also licensed to the UAE.
And Jaywan uses XML for its files!
So these brand-new banking initiatives in the Middle East use XML as the primary file format, because those banks know that the thousands of fields/columns in the CBS (Core Banking System) and in upstream and downstream systems need a strict file format specification for file loading, processing, reconciliations, settlement, disputes/chargebacks, etc.
OpenAPI can do that too. But the real benefit is that it forces a simplification of the interface. XML has too many outs for architectural astronauts. JSON has close to none.
You must have been very lucky. Every SOAP service I had the (dis)pleasure to integrate with was a wholly different nightmare-ish can of worms. Even when we get to the very binding of WSDL, there are way too many variations on SOAP: RPC-Encoded? RPC-Literal? Document-Literal? Wrapped Document-Literal?
The problem is part of the same myth many people (like the OP author) have about XML and SOAP: that there was "One True Way™" from the beginning, that XML schemas were always XSD, that SOAP always required a WSDL service definition and the style was always wrapped document-literal, with everything following WS-I profiles and the rest of the WS-* suite like WS-Security, WS-Trust, etc. Oh, and of course we don't care about having a secure spec, avoiding easy-to-spoof digital signatures, and preventing XML bombs.
Banking systems are mature and I guess everybody already settled on and standardized the way they use SOAP, so you don't have to get into all this mess. (And security? Well, if most banks in the world were OK with mandatory maximum password lengths of 8 characters until recently, they probably never heard about XMLDSig issues or the billion laughs attack.)
But you know what also gives you auto-generated code that works perfectly without a hitch, with full schema validation? OpenAPI. Do you prefer RPC style? gRPC and Avro will give you RPC with 5% of the wire bloat of XML. Message size does matter sometimes, after all.
All of the things that you mentioned are not unique to XML and SOAP. Any well-specified system that combines an interchange format, a schema format, an RPC schema format and an RPC transport can achieve the same thing. Some standards had all of this settled from day one: I think Cap'n Proto, Avro and Thrift fit this description. Other systems like CORBA or Protocol Buffers missed some of the components or did not have a well-defined standard[1].
JSON is often criticized by XML enthusiasts for not having a built-in schema, but this seems like selective amnesia (or maybe all of these bloggers are zoomers or younger millennials?). When XML was first released, there was nothing. Yes, you could cheat and use DTD[2]. But DTD was hard to use, and most programmers eschewed writing XML schemas until XSD and RELAX NG came out. SOAP was also very basic (and lightweight!) when it first came out. XSD and WSDL quickly became the standard way to use SOAP, but it took at least a decade to standardize the WSDL binding style (or was it ever standardized?).
Doing RPC in JSON now is still as messy as SOAP has been, but if you want RPC instead of REST, you wouldn't be going to JSON in the first place.
---
[1] IIRC, Protocol Buffers 2 had a rudimentary RPC system which never gained traction outside of Google and has been entirely replaced by gRPC after version 3 was released.
[2] DTD wasn't really designed for XML, but since XML was a subset of SGML, you could use the SGML DTD. But DTD wasn't a good fit for XML, and it was quickly replaced by XSD (and, for a while, RELAX NG) for a reason.
XML lost because 1) the existence of attributes means a document cannot be automatically mapped to a basic language data structure like an array of strings, and 2) namespaces are an unmitigated hell to work with. Even just declaring a default namespace and doing nothing else immediately makes your day 10x harder.
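A small sketch of the default-namespace pain (the namespace URI is made up): the moment a default namespace is declared, every unqualified lookup stops matching:

```python
import xml.etree.ElementTree as ET

# Declaring a default namespace is enough to break unqualified lookups.
doc = ET.fromstring('<feed xmlns="http://example.com/ns"><title>hi</title></feed>')

print(doc.find('title'))  # None -- the name no longer matches
print(doc.find('{http://example.com/ns}title').text)  # hi
# or, equivalently, with a prefix map:
print(doc.find('ns:title', {'ns': 'http://example.com/ns'}).text)  # hi
```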
These items make XML deeply tedious and annoying to ingest and manipulate. Plus, some major XML libraries, like lxml in Python, are extremely unintuitive in their implementation of DOM structures and manipulation. If ingesting and manipulating your markup language feels like an endless trudge through a fiery wasteland then don't be surprised when a simpler, more ergonomic alternative wins, even if its feature set is strictly inferior. And that's exactly what happened.
I say this having spent the last 10 years struggling with lxml specifically, and my entire 25 year career dealing with XML in some shape or form. I still routinely throw up my hands in frustration when having to use Python tooling to do what feels like what should be even the most basic XML task.
> Plus, some major XML libraries, like lxml in Python, are extremely unintuitive in their implementation of DOM structures and manipulation.
Lxml, or more specifically its inspiration ElementTree, is specifically not a (W3C) DOM or DOM-style API. It was designed for what it called "data-style" XML documents, where elements hold either text or sub-elements but not both, which is why mixed-content interactions are a chore (lxml augments the API by adding more traversal axes, but ElementTree does not even have that; it's a literal tree of elements). effbot.org used to have a page explaining its simplified infoset before Fredrik passed and the registration lapsed; it can be accessed through archive.org.
That means lxml is, by design, not the right tool to interact with mixed-content documents. But of course the issue is there isn’t really a right tool for that, as to my knowledge nobody has bothered building a fast DOM-style library for Python.
If you approach lxml as what ElementTree was designed as it’s very intuitive: an element is a sequence of sub-elements, with a mapping of attributes. It’s a very straightforward model and works great for data documents, as well as fits great within the langage. But of course that breaks down for mixed content documents as your text nodes get relegated to `tail` attributes (and ElementTree straight up discards comments and PIs, though lxml reverted that).
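A minimal example of that model, including the `tail` quirk for mixed content:

```python
import xml.etree.ElementTree as ET

# In the ElementTree model, an element is a sequence of sub-elements plus
# an attribute mapping; text in mixed content is split between .text and .tail.
p = ET.fromstring('<p>one <b>two</b> three</p>')

print(repr(p.text))     # 'one '   -- text before the first child
print(repr(p[0].text))  # 'two'    -- text inside <b>
print(repr(p[0].tail))  # ' three' -- text AFTER </b>, attached to <b>!
```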
This is performance art, right? The very first bullet point it starts with is extolling the merits of XSD. Even back in the day when XML was huge, XSD was widely recognized as a monstrosity and a boondoggle -- the real XMLheads were trying to make RELAX NG happen, but XSD got jammed through because it was needed for all those monstrous WS-* specs.
XML did some good things for its day, but no, we abandoned it for very good reasons.
XSD was (is) not so easy to adopt, but I don't agree that it's a monstrosity.
Schemas are complicated. XSD is a response to that reality.
The XML ecosystem is messy. But people don't need to adopt everything. Ignore Relax-NG, ignore DTD, use namespaces sparingly, adopt conventions around NOT using attributes. It generally works quite well.
It's a challenge to get comfortable with XSD but once that happens, it's not a monstrosity. Similarly, XSLT. It requires a different way of thinking, and once you get that, you're productive.
Also, as someone else pointed out, the same complaint that JSON Schema "isn't in the standard, it's a separate standard" applies to XSD. It is still a different standard, even though during the height of XML mania it sometimes seemed like XSD was inseparable. XML did have DTD baked in, and maybe the author meant DTD in that section, but that was even worse than XSD (and again, both were why RELAX NG happened).
I remember spending hours just trying to properly define the XML schema I wanted to use.
Then if there were any problems in my XML, trying to decipher horrible errors determining what I did wrong.
The docs sucked and were "enterprise grade", the examples sucked (either too complicated or too simple), and the tooling sucked.
I suspect it would be fine nowadays with LLMs to help, but back in the day, XML was a huge hassle.
I once worked on a robotics project where a full 50% of the CPU was used for XML serialization and parsing. Made it hard to actually have the robot do anything. XML is violently wordy and parsing strings is expensive.
We do XML processing, albeit with XQuery, as a small business.
It is a very niche solution but actually very stable and quite handy for all kinds of data handling; web-based applications and APIs as it nicely integrates with all kinds of text-based formats such as JSON, CSV or XML.
Yet I can easily comprehend how people get lost in all kinds of standards, meta-standards, DTDs, schemas, namespaces, and modeling the whole enterprise in SOAP.
However, you can do simple things simply and keep them small; in my experience, though, most tools promised to solve problems with ever more layered complexity.
Little disclaimer: I am probably biased, as I am involved with BaseX, an open-source XQuery processor :-)
I am a BaseX user and I really appreciate it! I actually do not mind XML at all. XQuery and BaseX makes searching large numbers of XML file or just one large XML file really easy.
It's really bizarre: you talk about Rust or TypeScript and everyone understands how doing a little bit of extra planning up front yields great results, since everyone can work from solid foundations. But suggest they do the same for their data by using XML, and it's wailing and gnashing of teeth, bringing up anecdotes about SOAP and DTDs like we're all still living in 2003, concatenating strings together for our XML and trying to find answers to problems on forums or on ExpertSexChange.
The vast, vast majority of devs today have never known anything except JSON for their React frontends, but honestly, if they gave XML a try and weren't working from second-hand horror stories from 20 years ago, I think a lot more people would like it than you expect.
I tried using XML on a lark the other day and realized that XSDs are actually somewhat load bearing. It's difficult to map data in XML to objects in your favorite programming language without the schema being known beforehand as lists of a single element are hard to distinguish from just a property of the overall object.
Maybe this is okay if you know your schema beforehand and are willing to write an XSD. My usecase relied on not knowing the schema. Despite my excitement to use a SAX-style parser, I tucked my tail between my legs and switched back to JSONL. Was I missing something?
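You weren't missing much; a toy schema-less mapper shows the problem (tag names are made up). One occurrence collapses to a scalar, and only a schema could say it was meant to be a list:

```python
import xml.etree.ElementTree as ET

def to_value(el):
    # Naive schema-less mapping: children become dict entries,
    # repeated tags become a list.
    if len(el) == 0:
        return el.text
    out = {}
    for c in el:
        if c.tag in out:
            if not isinstance(out[c.tag], list):
                out[c.tag] = [out[c.tag]]
            out[c.tag].append(to_value(c))
        else:
            out[c.tag] = to_value(c)
    return out

two = to_value(ET.fromstring('<r><item>a</item><item>b</item></r>'))
one = to_value(ET.fromstring('<r><item>a</item></r>'))
print(two)  # {'item': ['a', 'b']} -- a list
print(one)  # {'item': 'a'}        -- silently a scalar
```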
This is a debate I've had many times. XML, and REST, are extremely useful for certain types of use cases that you quite often run into online.
The industry abandoned both in favor of JSON and RPC for speed and perceived DX improvements, and because for a period of time everyone was in fact building only against their own servers.
There are plenty of examples over the last two decades of us having to reinvent solutions to the same problems that REST solved way back then though. MCP is the latest iteration of trying to shoehorn schemas and self-documenting APIs into a sea of JSON RPC.
What I miss most about the XML ecosystem is the tooling. And I think this is what most people are sentimental about. There was a time when it was so easy to generate contracts using XSDs, and that made it easy to validate the data. OpenAPI is only slowly reaching parity with what I worked with in 2006.
But what I do not miss is the over-engineering that happened in the ecosystem, especially with everything SOAP. Yes, when it worked, it worked. But when it didn’t work, which was often the case when integrating different enterprise systems, then well… lord have mercy on me.
Sometimes I still use XSD to define a schema for clients, because in some areas there’s still better tooling for XML. And it gives me the safety of getting valid input data, if the XML couldn’t be validated.
And in the enterprise world, XML is far from being dead anyways.
> the various XML-based "standards" spawned by enterprise committees are monuments to over-engineering. But the core format (elements, attributes, schemas, namespaces) remains sound. We threw out the mechanism along with its abuses.
It's mostly arguing for using basic XML in place of basic JSON.
I largely agree with that, although I wouldn't consider the schemas among its core; go read the XML Schema specifications and tell me when you come out.
But I agree that a good part of XML's downfall was due to its enterprise committees: no iteration, and few incentives to make things lean and their specifications simple; a lot of the companies designing them had an interest in making them hard to implement.
XML was a product of its time, when, after almost 20 years of CPUs rapidly getting quicker, we assumed that the size of data wouldn't matter, and that data types wouldn't matter either (hence XML doesn't have them, though JSON later brought them back). We expected languages with weak type systems to dominate forever, and that we would be working and thinking levels above all this, abstractly, and so on.
I remember XML proponents back then arguing that it allows semantics -- although it was never clear how a non-human would understand and process it.
The funny thing about namespaces is that the prefix, according to the XML docs, should be meaningless -- instead you should look at the URL of the namespace. It's as if we read a doc with snake:front-left-paw and ask how come a snake has paws? -- Because it's actually a bear -- see the definition of snake in the URL! It feels like mathematical concepts -- coordinate spaces, numeric spaces with a different number 1 and basis vectors -- applied to HTML. It may be useful in rare cases. But few can wrap their heads around it, and right from the start most tools worked only with exactly named prefixes, so everyone had to follow along.
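A quick sketch of the prefix-is-just-an-alias rule (the URIs are made up): two documents with different prefixes but the same namespace URI produce identical expanded names:

```python
import xml.etree.ElementTree as ET

# The prefix is a local alias; identity lives in the URI.
# 'snake:' here resolves to the same name as 'bear:' below.
a = ET.fromstring('<snake:paw xmlns:snake="http://example.com/bear"/>')
b = ET.fromstring('<bear:paw xmlns:bear="http://example.com/bear"/>')

print(a.tag)  # {http://example.com/bear}paw
print(b.tag)  # {http://example.com/bear}paw
```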
> data types won't matter (hence XML doesn't have them, but after that JSON got them back)
JSON does not have many, or very good, data types either, but (unlike XML) at least JSON has data types. ASN.1 has more data types (although standard ASN.1 lacks one data type that JSON has, the key/value list; ASN.1X includes it), and if DER or another BER-related format is used, then all types use the same framing, unlike JSON. One thing JSON lacks is an octet string type, so instead you must use hex or base64, which must be converted after the document has been read rather than during reading, because it is not a proper binary data type.
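A sketch of that detour in Python: the bytes have to be re-encoded as text on the way in and decoded again, as a separate step, on the way out:

```python
import base64
import json

# No octet-string type in JSON: raw bytes must detour through base64
# (or hex) and be decoded again after parsing.
payload = b'\x00\xffbinary\x01'

wire = json.dumps({'blob': base64.b64encode(payload).decode('ascii')})
back = base64.b64decode(json.loads(wire)['blob'])

print(back == payload)  # True
```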
> The funny thing about namespaces is that the prefix, by the XML docs, should be meaningless -- instead you should look at the URL of the namespace. It's like if we read a doc with snake:front-left-paw, and ask how come does a snake have paws? -- Because it's actually a bear -- see the definition of snake in the URL!
This is true of any format that you can import with your own names though, and since the names might otherwise conflict, it can also be necessary. This issue is not only XML (and JSON does not have namespaces at all, although some application formats that use it try to add them in some ways).
Semantics in machine processing are actually very simple: if a machine has an instruction to process an element and we know what it does, then the element is semantic.
So, for example, <b> and <i> have perfectly good semantics, while <article> not so much. What does the browser do with an <article>? Or maybe it is there for an indexing engine? I myself have no idea (not that I have investigated, I admit).
But all that was misunderstood, very much like XML itself.
I think the industry settled on pretty good answers, using lots of XML-like syntax (HTML, JSX) but rarely using XML™.
1. Following Postel's law, don't reject "invalid" third-party input; instead, standardize how to interpret weird syntax. This is what we did with HTML.
2. Use declarative schema definitions sparingly, only for first-party testing and as reference documentation, never to automatically reject third-party input.
3. Use XML-like syntax (like JSX) in a Turing-complete language for defining nested UI components.
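A small illustration of point 1, using Python's stdlib HTML parser, which happily reports mis-nested and unclosed tags instead of rejecting them:

```python
from html.parser import HTMLParser

# Postel's law in practice: the parser does not reject unclosed or
# mis-nested tags, it just reports what it sees.
class TagCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

p = TagCollector()
p.feed('<ul><li>one<li>two</ul><p>unclosed')  # invalid as XML, fine as HTML
print(p.tags)  # ['ul', 'li', 'li', 'p']
```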
Think of UI components as if they're functions, accepting a number of named, optional arguments/parameters (attributes!) and an array of child components with their own nested children. (In many UI frameworks, components literally are functions with opaque return types, exactly like this.)
Closing tags like `</article>` make sense when you're going to nest components 10+ layers deep, and when the closing tag will appear hundreds of lines of code later.
Most code shouldn't look like that, but UI code almost always does, which is why JSX is popular.
Comparing the XML ecosystem to JSON is like comparing railroads to bicycles.
The main difference is that enterprise companies and consultancies pushed complex XML solutions that differentiated them and created a moat (involving developer tools and compliance). JSON has always just been a way to sling data around, with a modicum of sanity. Hence the overbuilt/underbuilt split.
XML saved our axx. We had both internal and external APIs with complex objects in JSON, which failed constantly with mismatched implementations, causing friction with clients. Switching both to XML with a schema solved that forever. But this was for complex B2B. We still used JSON for trivial web UI interactions.
Developers (even web developers!) were familiar with XML for many years before JSON was invented.
Also, "worse is better". Many developers still prefer to use something that is similar to notepad.exe, instead of actual tools that understand the formats on a deeper level.
XML and XSD were not meant to be edited by hand, by humans.
They thrived when we used proper XML/XSD editing tools.
Although, ironically, there are fewer production-time human mistakes when editing an XML file that is properly validated with an XSD than a YAML file, because Norway.
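For anyone who hasn't hit the Norway problem: YAML 1.1 resolves bare scalars like yes/no/on/off as booleans, so an unquoted country code `NO` turns into `false`. A minimal sketch of just that resolution rule (not a real YAML parser, just the relevant lookup table):

```python
# The YAML 1.1 scalar-resolution rule behind the "Norway problem"
# (a sketch of the lookup, not a YAML parser).
YAML11_BOOLS = {'yes': True, 'no': False, 'true': True, 'false': False,
                'on': True, 'off': False}

def resolve_scalar(token):
    return YAML11_BOOLS.get(token.lower(), token)

print([resolve_scalar(c) for c in ['SE', 'NO', 'DK']])
# ['SE', False, 'DK'] -- Norway silently becomes a boolean
```

Quoting the scalar (`"NO"`) is the usual fix; a schema-validated XML file simply has no equivalent failure mode.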
There were efforts to make XML 1. more ergonomic and 2. more performant, and while (2) was largely successful, (1) never got there, unfortunately -- but see https://github.com/yaml/sml-dev-archive for some history of just one of the discussions (the sml-dev mailing list).
> First, there is modeling ambiguity, too many ways to represent the same data structure.
Boy, are you telling me!
Boy are you a person, one of whose attributes is telling me!
Boy are you a person whose telling-me attribute is set to true!
Boy-who-is-telling-me, this space is left intentionally blank!
Out of all the key value pairs, you are the boy key and your adjacent sibling string type value is "Telling Me!"
Edit: fixed a CVE
[+] [-] nine_k|1 month ago|reply
OTOH namespaces, XSD, XSLT were great, modulo the noisy tags. XSLT was the first purely functional language that enjoyed mass adoption in the industry. (It was also homoiconic, like Lisp, amenable to metaprogramming.) Namespaces were a lifesaver when multiple XML documents from different sources had to be combined. XPath was also quite nice for querying.
XML is noisy because of the closing tags, but it also guarantees a level of integrity, and LZ-type compressors, even gzip, are excellent at compacting repeated strings.
Importantly, XML is a relatively human-friendly format. It has comments, requires no quoting, no commas between list items, etc.
Complexity killed XML. JSON was stupid simple, and thus contained far fewer footguns, which was a very welcome change. It was devised as a serialization format, a bit human-hostile, but mapped ideally to bag-of-named-values structures found in basically any modern language.
Now we see XML tools adopted to JSON: JSONSchema, JSONPath, etc. JSON5 (as used in e.g. VSCode) allows for comments, trailing commas and other creature comforts. With tools like that, and dovetailing tools like Pydantic, XML lost any practical edge over JSON it might ever have.
What's missing is a widespread replacement for XSLT. Could be a fun project.
[+] [-] tolciho|1 month ago|reply
The contrast with only JSON is far too simplistic; XML got dropped from places where JSON is uninvolved, like why use a relational database when you can have an XML database??? Or those config files on unix are for the most part still not-XML and not-JSON. Or there's various flavors of markdown which do not give you the semi-mythical semantic web but can be banged out easily enough in vi or whatever and don't require schemas and validation or libraries with far too many security problems and I wouldn't write my documentation (these days) using S-expressions anyhow.
This being said there probably are places where something that validates strictly is optimal, maybe financial transactions (EDIFACT and XML are different hells, I guess), at least until some cheeky git points out that data can be leaked by encoding with tabs and spaces between the elements. Hopefully your fancy and expensive XML security layer normalizes or removes that whitespace?
[+] [-] wvenable|1 month ago|reply
But it didn't take long before XML might well be a binary format for all it matters to us humans looking at it, parsing it, dealing with it.
JSON came along and it's simplicity was baked in. Anyone can argue it's not a great format but it forcefully maintains the simplicity that XML lost quite quickly.
[+] [-] unscaled|1 month ago|reply
This really seems like it's written by someone who _did not_ use XML back in the day. XSD is no more built-in than JSON Schema is. XSD was first-party (it was promoted by W3C), but it was never a "built-in" component of XML, and there were alternative schema formats. You can perfectly write XML without XSD and back in the heyday of XML in the 2000s, most XML documents did not have XSD.
Nowadays most of the remaining XML usages in production rely heavily on XSD, but that's a bit of a survivorship bias. The projects that used ad-hoc XML as configuration files, simple document files or as an interchange format either died out, converted to another format or eventually adopted XSD. Since almost no new projects are choosing XML nowadays, you don't get an influx of new projects that skip the schema part to ship faster, like you get with JSON. When new developers encounter XML, they are generally interacting with long-established systems that have XSD schemas.
This situation is purely incidental. If you want to get the same result with JSON, you can just use JSON Schema. But if we somehow magically convince everybody on the planet to ditch JSON and return to XML (please not), we'll get the same situation we have had with JSON, only worse. We'll just get back to where we were in the early 2000s, and no, that wasn't good.
[+] [-] locknitpicker|1 month ago|reply
I remember a decade ago seeing job ads that explicitly requested XML skills. The fact that being able to do something with XML was considered a full time job requiring a specialist says everything there is to be said about XML.
[+] [-] mkozlows|1 month ago|reply
[+] [-] pjmlp|1 month ago|reply
Hence why in 2026, I still hang around programming stacks like Java and .NET, where XML tooling is great, instead of having to fight with YAML formatting errors, the Norway problem, or JSON lacking basic stuff like comments.
[+] [-] ivan_gammel|1 month ago|reply
I don’t get this argument. There are streaming APIs with convenient mapping. Yes, schemas with weird structure can exist, but in practice they are uncommon. I have seen a lot of integration formats in XML and never had the need to parse to a DOM first.
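As a sketch of that streaming-with-mapping approach (the payload and field names here are hypothetical), Python's stdlib ElementTree can map elements to native dicts as they complete, without ever materializing a full DOM:

```python
import xml.etree.ElementTree as ET
from io import StringIO

# Hypothetical <orders> payload; iterparse streams "end" events as each
# element closes, so every <order> maps straight to a dict.
xml = ("<orders><order id='1'><total>9.50</total></order>"
       "<order id='2'><total>3.25</total></order></orders>")

orders = []
for event, elem in ET.iterparse(StringIO(xml), events=("end",)):
    if elem.tag == "order":
        orders.append({"id": elem.get("id"),
                       "total": float(elem.findtext("total"))})
        elem.clear()  # free the subtree once it has been mapped
```

The same pattern scales to multi-gigabyte documents, since at most one record's subtree is held in memory at a time.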
[+] [-] unknown|1 month ago|reply
[deleted]
[+] [-] bornfreddy|1 month ago|reply
SOAP was terrible everywhere, not just in Nigeria as OP insinuates. And while the idea of XML sounds good, the tools that developed on top of it were mostly atrocious. Good riddance.
[+] [-] imtringued|1 month ago|reply
[+] [-] pkphilip|1 month ago|reply
I agree with the author that XML is very similar to S-expressions, but with the brackets replaced by closing tags.
Parsing XML wasn't complex either. There have been many good libraries for it in pretty much every language.
[+] [-] badgersnake|1 month ago|reply
[+] [-] VerifiedReports|1 month ago|reply
[+] [-] hinkley|1 month ago|reply
- XML-DSIG survivor.
[+] [-] bob1029|1 month ago|reply
The entire point of heavy-handed XML is to 1:1 the type system across the wire. Once I generate my service references, it is as if the service is on my local machine. The productivity gains around having strongly typed proxies of the remote services are impossible to overstate. I can wire up entirely new operations without looking at the documentation most of the time. Intellisense surfaces everything I need automatically as I drill into the type system.
JSON can work and provide much of the same, but XML has already proven to work in some of the nastiest environments. It's not the friendliest or most convenient technology, but it is an extremely effective technology. I am very confident that the vendors I work with will continue to use XML/WCF/SOAP into 2030.
[+] [-] vee-kay|1 month ago|reply
UAE's AEP (Al Etihad Payments) launched AANI as a digital payments platform. (It is actually based on India's phenomenally successful UPI; its technology stack was licensed to the UAE.)
Jaywan is UAE's domestic card scheme (competing with Visa, MasterCard, etc.). (It is actually based on India's successful RuPay technology stack, licensed to the UAE.)
And Jaywan uses XML for its files!
So these brand-new banking initiatives in the Middle East use XML as the primary file format, because those banks know that the thousands of fields/columns in the CBS (Core Banking System) and the upstream and downstream systems need a strict file format specification for file loading, processing, reconciliation, settlement, disputes/chargebacks, etc.
[+] [-] tomjen3|1 month ago|reply
[+] [-] unscaled|1 month ago|reply
The problem is part of the same myth many people (like the OP author) have about XML and SOAP: There was "One True Way™" from the beginning, XML schemas were always XSD, SOAP always required WSDL service definition and the style was always wrapped document-literal, with everything following WS-I profiles with the rest of the WS-* suite like WS-Security, WS-Trust, etc. Oh, and of course we don't care about having a secure spec and avoiding easy-to-spoof digital signatures and preventing XML bombs.
Banking systems are mature and I guess everybody already settled and standardized the way they use SOAP, so you don't have to get into all this mess. (And security? Well, if most banks in the world were OK with mandatory maximum password lengths of 8 characters until recently, they probably never heard about XML-DSig issues or the billion laughs attack.)
But you know what also gives you auto-generated code that works perfectly without a hitch, with full schema validation? OpenAPI. Do you prefer RPC style? gRPC and Avro will give you RPC with 5% of the wire bloat that XML has. Message size does matter sometimes, after all.
All of the things that you mentioned are not unique to XML and SOAP. Any well-specified system that combines an interchange format, a schema format, an RPC schema format and an RPC transport can achieve the same thing. Some standards had all of this settled from day one: I think Cap'n Proto, Avro and Thrift fit this description. Other systems like CORBA or Protocol Buffers missed some of the components or did not have a well-defined standard[1].
JSON is often criticized by XML enthusiasts for not having a built-in schema, but this seems like selective amnesia (or maybe all of these bloggers are zoomers or younger millennials?). When XML was first released, there was nothing. Yes, you could cheat and use DTD[2]. But DTD was hard to use, and most programmers eschewed writing XML schemas until XSD and Relax-NG came out. SOAP was also very basic (and lightweight!) when it first came out. XSD and WSDL quickly became the standard way to use SOAP, but it took at least a decade to standardize the WSDL binding style (or was it ever standardized?). Doing RPC in JSON now is still as messy as SOAP has been, but if you want RPC instead of REST, you wouldn't be going to JSON in the first place.
---
[1] IIRC, Protocol Buffers 2 had a rudimentary RPC system which never gained traction outside of Google and has been entirely replaced by gRPC after version 3 was released.
[2] DTD wasn't really designed for XML, but since XML was a subset of SGML, you could use the SGML DTD. But DTD wasn't a good fit for XML, and it was quickly replaced by XSD (and for a while - Relax-NG) for a reason.
[+] [-] acabal|1 month ago|reply
These items make XML deeply tedious and annoying to ingest and manipulate. Plus, some major XML libraries, like lxml in Python, are extremely unintuitive in their implementation of DOM structures and manipulation. If ingesting and manipulating your markup language feels like an endless trudge through a fiery wasteland then don't be surprised when a simpler, more ergonomic alternative wins, even if its feature set is strictly inferior. And that's exactly what happened.
I say this having spent the last 10 years struggling with lxml specifically, and my entire 25 year career dealing with XML in some shape or form. I still routinely throw up my hands in frustration when having to use Python tooling to do what feels like what should be even the most basic XML task.
Though xpath is nice.
[+] [-] masklinn|1 month ago|reply
Lxml, or more specifically its inspiration ElementTree, is specifically not a (W3C) DOM or DOM-style API. It was designed for what it called “data-style” XML documents, where elements hold either text or sub-elements but not both, which is why mixed-content interactions are a chore (lxml augments the API by adding more traversal axes, but ElementTree does not even have that; it's a literal tree of elements). effbot.org used to have a page explaining its simplified infoset before Fredrik passed and the registration lapsed; it can be accessed through archive.org.
That means lxml is, by design, not the right tool to interact with mixed-content documents. But of course the issue is there isn’t really a right tool for that, as to my knowledge nobody has bothered building a fast DOM-style library for Python.
If you approach lxml as what ElementTree was designed as, it's very intuitive: an element is a sequence of sub-elements, with a mapping of attributes. It's a very straightforward model that works great for data documents and fits well within the language. But of course that breaks down for mixed-content documents, as your text nodes get relegated to `tail` attributes (and ElementTree straight up discards comments and PIs, though lxml reverted that).
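A minimal illustration of that `tail` quirk, using stdlib ElementTree on a tiny mixed-content fragment:

```python
import xml.etree.ElementTree as ET

# Mixed content: text interleaved with elements. ElementTree has no text
# nodes, so the text that follows </b> is stored on the <b> element itself,
# in its .tail attribute.
p = ET.fromstring("<p>one <b>two</b> three</p>")
b = p.find("b")
# p.text -> "one ", b.text -> "two", and " three" lives on b.tail
```

For data documents this model never bites, but for prose-like markup it means the text you want is scattered across `.text` and `.tail` of neighboring elements.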
[+] [-] matkoniecz|1 month ago|reply
and often having less bizarre and overly complex features is a feature by itself
[+] [-] mkozlows|1 month ago|reply
XML did some good things for its day, but no, we abandoned it for very good reasons.
[+] [-] PantaloonFlames|1 month ago|reply
Schema are complicated. XSD is a response to that reality.
The XML ecosystem is messy. But people don't need to adopt everything. Ignore Relax-NG, ignore DTD, use namespaces sparingly, adopt conventions around NOT using attributes. It generally works quite well.
It's a challenge to get comfortable with XSD but once that happens, it's not a monstrosity. Similarly, XSLT. It requires a different way of thinking, and once you get that, you're productive.
[+] [-] WorldMaker|1 month ago|reply
[+] [-] unknown|1 month ago|reply
[deleted]
[+] [-] froh|1 month ago|reply
dsssl was the scheme-based domain-specific "document style semantics and specification language"
the syntax change came in the era of general lisp syntax bashing.
but to xml syntax? really? that was so surreal to me.
[+] [-] kenforthewin|1 month ago|reply
> This is not engineering. This is fashion masquerading as technical judgment.
The boring explanation is that AI wrote this. The more interesting theory is that folks are beginning to adopt the writing quirks of AI en masse.
[+] [-] com2kid|1 month ago|reply
Then if there were any problems in my XML, I had to try to decipher horrible errors to determine what I did wrong.
The docs sucked and were "enterprise grade", the examples sucked (either too complicated or too simple), and the tooling sucked.
I suspect it would be fine nowadays with LLMs to help, but back then, XML was a huge hassle.
I once worked on a robotics project where a full 50% of the CPU was used for XML serialization and parsing. Made it hard to actually have the robot do anything. XML is violently wordy and parsing strings is expensive.
[+] [-] _micheee|1 month ago|reply
It is a very niche solution but actually very stable and quite handy for all kinds of data handling; web-based applications and APIs as it nicely integrates with all kinds of text-based formats such as JSON, CSV or XML.
Yet I can easily comprehend how people get lost in all kinds of standards, meta-standards, DTDs, schemas, namespaces, and modeling the whole enterprise in SOAP.
However, you can do simple things simply and keep them small; but in my experience, most tools promised to solve problems by layering on ever more complexity.
Little disclaimer, I am probably biased, as I am with BaseX, an open-source XQuery processor :-)
[+] [-] cyocum|1 month ago|reply
[+] [-] Devasta|1 month ago|reply
The vast, vast majority of devs today have never known anything except JSON for their React frontends, but honestly if they gave XML a try and weren't working from second hand horror stories from 20 years ago I think a lot more people would like it than you expect.
[+] [-] striking|1 month ago|reply
Maybe this is okay if you know your schema beforehand and are willing to write an XSD. My use case relied on not knowing the schema. Despite my excitement to use a SAX-style parser, I tucked my tail between my legs and switched back to JSONL. Was I missing something?
[+] [-] _heimdall|1 month ago|reply
The industry abandoned both in favor of JSON and RPC for speed and perceived DX improvements, and because for a period of time everyone was in fact building only against their own servers.
There are plenty of examples over the last two decades of us having to reinvent solutions to the same problems that REST solved way back then though. MCP is the latest iteration of trying to shoehorn schemas and self-documenting APIs into a sea of JSON RPC.
[+] [-] bytefish|1 month ago|reply
But what I do not miss is the over-engineering that happened in the ecosystem, especially with everything SOAP. Yes, when it worked, it worked. But when it didn’t work, which was often the case when integrating different enterprise systems, then well… lord have mercy on me.
Sometimes I still use XSD to define a schema for clients, because in some areas there’s still better tooling for XML. And it gives me the safety of getting valid input data, if the XML couldn’t be validated.
And in the enterprise world, XML is far from being dead anyways.
[+] [-] fsckboy|1 month ago|reply
yes, me too! when using XML, it rendered all the flexibility and power of the unix tools pretty useless, and I missed them.
[+] [-] g-b-r|1 month ago|reply
> the various XML-based "standards" spawned by enterprise committees are monuments to over-engineering. But the core format (elements, attributes, schemas, namespaces) remains sound. We threw out the mechanism along with its abuses.
It's mostly only arguing for using the basic XML in place of the basic JSON.
I largely agree to that, although I wouldn't consider the schemas among its core, go read the Schema specifications and tell me when you come out.
But I agree that a good part of XML's downfall was due to its enterprise committees: no iteration, and few incentives to make things lean and their specifications simple; a lot of the companies designing them had an interest in making them hard to implement.
[+] [-] kennethallen|1 month ago|reply
[+] [-] culebron21|1 month ago|reply
I remember XML proponents back then argued that it allows semantics, although it was never clear how a non-human would understand and process it.
The funny thing about namespaces is that the prefix, per the XML docs, should be meaningless; instead you should look at the URL of the namespace. It's like reading a doc with snake:front-left-paw and asking, how come a snake has paws? Because it's actually a bear; see the definition of snake in the URL! It feels like mathematical concepts (coordinate spaces, numeric spaces with a different number 1 and basis vectors) applied to HTML. It may be useful in rare cases. But few can wrap their heads around it, and right from the start most tools worked only with exactly named prefixes, so everyone had to follow suit.
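To illustrate the "prefix is meaningless" rule with Python's ElementTree (the namespace URL here is hypothetical): two documents binding different prefixes to the same URI parse to the identical universal name.

```python
import xml.etree.ElementTree as ET

# Same namespace URI, two arbitrary prefixes. Per the Namespaces in XML
# spec, only the URI matters; the prefix is local shorthand.
doc_a = ET.fromstring('<snake:paw xmlns:snake="http://example.com/bear"/>')
doc_b = ET.fromstring('<bear:paw xmlns:bear="http://example.com/bear"/>')

# Both roots carry the same Clark-notation tag:
# {http://example.com/bear}paw
```

Tools that match on the literal prefix string instead of the expanded name are the ones that forced everyone into fixed prefixes.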
[+] [-] zzo38computer|1 month ago|reply
JSON does not have very many or very good data types either, but (unlike XML) at least JSON has data types. ASN.1 has more data types (although standard ASN.1 lacks one data type that JSON has, the key/value list; ASN.1X includes it), and if DER or another BER-related format is used, then all types use the same framing, unlike JSON. One thing JSON lacks is an octet string type, so instead you must use hex or base64, and the value must be converted after it has been read rather than during reading, because it is not a proper binary data type.
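A small Python sketch of that workaround: bytes have to be transcoded to base64 text before JSON serialization, and decoded again in a separate post-parse step.

```python
import base64
import json

# JSON has no octet-string type, so raw bytes are smuggled through as
# base64 text on the way in...
payload = {"blob": base64.b64encode(b"\x00\xff\x10").decode("ascii")}
wire = json.dumps(payload)

# ...and must be decoded in an extra step after parsing, since the
# parser itself only ever hands back a string.
blob = base64.b64decode(json.loads(wire)["blob"])
```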
> The funny thing about namespaces is that the prefix, by the XML docs, should be meaningless -- instead you should look at the URL of the namespace. It's like if we read a doc with snake:front-left-paw, and ask how come does a snake have paws? -- Because it's actually a bear -- see the definition of snake in the URL!
This is true of any format that you can import with your own names though, and since the names might otherwise conflict, it can also be necessary. This issue is not only XML (and JSON does not have namespaces at all, although some application formats that use it try to add them in some ways).
[+] [-] g-b-r|1 month ago|reply
What tools? Namespaces being defined by their urls is sure not the reason XML is complex, and the tools I remember running into supported it well
[+] [-] Mikhail_Edoshin|1 month ago|reply
So, for example, <b> and <i> have perfect semantics, while <article> not so much. What does the browser do with an <article>? Or maybe it is there for an indexing engine? I myself have no idea (not that I investigated, I admit).
But all that was misunderstood, very much like XML itself.
[+] [-] dfabulich|1 month ago|reply
1. Following Postel's law, don't reject "invalid" third-party input; instead, standardize how to interpret weird syntax. This is what we did with HTML.
2. Use declarative schema definitions sparingly, only for first-party testing and as reference documentation, never to automatically reject third-party input.
3. Use XML-like syntax (like JSX) in a Turing-complete language for defining nested UI components.
Think of UI components as if they're functions, accepting a number of named, optional arguments/parameters (attributes!) and an array of child components with their own nested children. (In many UI frameworks, components literally are functions with opaque return types, exactly like this.)
Closing tags like `</article>` make sense when you're going to nest components 10+ layers deep, and when the closing tag will appear hundreds of lines of code later.
Most code shouldn't look like that, but UI code almost always does, which is why JSX is popular.
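A rough sketch of that components-as-functions idea, transliterated into plain Python (all names hypothetical): attributes become keyword arguments, and nested tags become a children list.

```python
# Hypothetical mini "components are just functions" sketch: what JSX
# compiles down to, minus the angle-bracket syntax.

def p(text):
    # leaf component: one attribute, no children
    return f"<p>{text}</p>"

def article(title, children):
    # container component: named attribute plus an array of children,
    # rendered by plain function composition
    body = "".join(children)
    return f"<article><h1>{title}</h1>{body}</article>"

html = article(title="XML redux", children=[p("first"), p("second")])
```

The closing-tag argument above falls out naturally here: once `children` nests 10+ levels deep, you want the "end" of each component explicitly labeled.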
[+] [-] w10-1|1 month ago|reply
The main difference is that with enterprise companies and consultancies pushed complex XML solutions that differentiated them and created a moat (involving developer tools and compliance). JSON has always just been a way to sling data around, with a modicum of sanity. Hence the overbuilt/underbuilt split.
XML saved our axx. We had both internal and external APIs with complex objects in JSON, which failed constantly with mismatched implementations, causing friction with clients. Switching both to XML with a schema solved that forever. But this was for complex B2B. We still used JSON for trivial web UI interactions.
[+] [-] bni|1 month ago|reply
Also "worse is better". Many developer still prefer to use something that is similar to notepad.exe, instead of actual tools that understand the formats on a deeper level.
[+] [-] brunoborges|1 month ago|reply
Although ironically there are fewer production-time human mistakes when editing an XML file that is properly validated with an XSD than a YAML file, because Norway.
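For reference, the Norway problem is YAML 1.1 resolving unquoted scalars like `NO` to booleans; a hypothetical config fragment showing the trap:

```yaml
# YAML 1.1 resolves unquoted no/NO/off/on etc. as booleans,
# so this country list silently corrupts:
countries:
  - DE
  - NO      # parsed as the boolean false, not the string "NO"
  - "SE"    # quoting is the only reliable defense
```

An XSD-validated XML equivalent would reject or preserve the value; it cannot silently retype it.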
[+] [-] hirvi74|1 month ago|reply
"XML is a lot like violence. If it's not getting the job done, then you aren't using enough of it."
[+] [-] stmw|1 month ago|reply