Why JSON will continue to push XML out of the picture

29 points| lucperkins | 13 years ago |blog.appfog.com

45 comments

clarkevans|13 years ago

I think this article has done a great job enumerating trends that show JSON is beating XML for data serialization applications. I think these points are evidence of a shift in thinking, but not the reason for the shift itself.

Why JSON over XML? Because people need a data serialization format and XML is a Markup Language. JSON is gaining widespread adoption for data serialization applications since it's the correct tool. XML isn't.

In a markup language there is an underlying text that you're annotating with machine readable tags. Most data interchange doesn't have an underlying text -- you can't strip away the tags and expect it to be understandable. If you're writing a web page, that has to be read by humans and interpreted by machines... you need a markup language.

By contrast, data interchange is about moving arbitrary data structures between processes and/or languages. JSON's information model fits this task perfectly: its nested map/list/scalar is simple & powerful. As for typing, it found a sweet spot with text/numeric/boolean.

JSON is the right tool for the data serialization problem.

tptacek|13 years ago

This makes sense, but from what I can tell, in virtually no major XML-based systems is the basis for XML files an underlying text extended with markup. Most XML systems, since the dawn of XML, have been top-to-bottom structured data.

6ren|13 years ago

ON vs ML, nice

But there's also the stack. XML has XSD for validation and documentation; XSLT and XQuery for transformation; and most people seem to like XPath. The overwhelming response to analogues for JSON is horror - don't pollute our simplicity! - and acknowledgement that while some tasks do indeed need these features, the XML stack already has them. The corruption of XML is what keeps JSON clean.

TazeTSchnitzel|13 years ago

It also sounds like the right tool for making something like SVG. The vast majority of SVG data isn't text.

portmanteaufu|13 years ago

For me the single greatest selling point of JSON is that it's just so danged easy to go from a JSON string to a usable map/list/dictionary in every language. Most of the time you can get from A to B in one or two lines of code.

XML always seemed like such a struggle by comparison. Figuring out which parser(s) you've got installed, figuring out their respective APIs -- it felt like total overkill. The only way I could be productive with XML was using Python's ElementTree API because it was so simple.
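
A minimal sketch of that "one or two lines" round trip, using Python's standard json module, with the ElementTree equivalent next to it for comparison. The sample payload and field names are made up for illustration.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical payload, invented for this example.
json_text = '{"user": {"name": "Ada", "langs": ["Python", "Lisp"]}}'

# One call turns the string into plain dicts and lists.
data = json.loads(json_text)
first_lang = data["user"]["langs"][0]

# The same data as XML, read through the ElementTree API.
xml_text = (
    "<user><name>Ada</name>"
    "<langs><lang>Python</lang><lang>Lisp</lang></langs></user>"
)
root = ET.fromstring(xml_text)
# find() returns the first matching element; .text is its character data.
xml_first_lang = root.find("langs/lang").text
```

Both routes end up at the same value here; the difference is that the JSON result is already made of native containers, while the XML result is a tree of elements you still have to walk.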

Some day I'll need my data to be checked against a complicated schema. But until that day arrives, I'm sticking with JSON.

Joeri|13 years ago

XML is a smooth fit on strongly typed languages. You can easily translate an exact type into a corresponding XML encoding and know the type of what you're getting out on the other end. JSON on the other hand is duck typing in web service form. You can shove any data structure in on one end, and get it back out the other end, without writing any custom code, and without actually knowing the type of the data you've sent. You could say that JSON itself is weakly typed.
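
A small sketch of what "without actually knowing the type" can look like in practice, using Python's standard json module. The payload is invented for the example. The structure survives the round trip, but some type information does not: the tuple comes back as a list and the integer key comes back as a string.

```python
import json

# Hypothetical payload mixing a few Python types.
payload = {
    "id": 7,
    "point": (1.5, 2.5),    # tuple
    "scores": {1: "low"},   # integer key
    "active": True,
    "owner": None,
}

# Serialize and parse again with no schema and no custom code.
round_tripped = json.loads(json.dumps(payload))

# The shape is preserved, but the tuple is now a list and
# the integer key is now the string "1".
point_type = type(round_tripped["point"]).__name__
score_keys = list(round_tripped["scores"].keys())
```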

The popularity of JSON is tied to the popularity of weak typing. You can more rapidly iterate your API design and codebase without those bothersome types getting in the way. The flip side of that is the end result isn't "done done". It lacks full validation of input and it lacks complete documentation. In short it's more difficult to use and more prone to bugs and security issues. I suspect that if you compare "done done" APIs, JSON and SOAP are probably equally productive.

Having said that, I use JSON myself. It's too easy to get going in.

mtrimpe|13 years ago

I don't understand why JSON schemas and the OJMs (Object-JSON-Mappers ;) it would enable aren't being developed more heavily.

I love JSON, but when working on APIs between large companies / departments the "we'll just send JSON like this and email you when we change stuff" really won't cut it.

Xcelerate|13 years ago

Could someone who knows a lot about these things tell me why JSON took such a long time to arrive?

JSON, at its core, is essentially a hierarchy of maps and lists -- which seems a very intuitive and useful way to store data.

XML on the other hand has always baffled me with its attributes and the redundant and verbose tags (why do I need <tag attr="data">data</tag>?). I'm sure there was a good reason at the time for this, so perhaps someone can enlighten me.

lmkg|13 years ago

What took the longest time was for a language to come out with key-value maps as the main core data structure, and a specialized literal syntax. Once that happened, it was relatively quick for that syntax to become a standardized interchange format for K-V data.

Lisp had assoc-lists, but those were a convention, not a specialized structure. Many languages had K-V maps as libraries, but not core structures, and most lacked literal syntax. Eventually most scripting languages started getting them as native, and even having literal syntax, but they weren't the "go-to" data structure for doing things. In Python, for example, all of its objects are really just hash maps, but when you're working with them you pretend that they're objects and not hash maps, and you use lists more than maps anyways.

JavaScript (and maybe Lua) was the first language to build itself around K-V maps, so it was the first language where idiomatic usage included a lot of map literals. Like Python, its objects were all really just maps, but unlike Python it encouraged taking advantage of that fact. Also, because it was on the web, there was a lot of need to be serializing data structures and passing them around. Eventually someone realized "this is much better than XML!" and gave it a name, and that's how we got where we are today.

XML's popularity is an accident of history, due in part to the rise of HTML, which is also an accident of history.

duaneb|13 years ago

Honestly, JSON isn't really much of an improvement over a technology that's been around since 1958, i.e. s-expressions. JSON is just the flavor of the month - I know people who dislike it because it loses some of the power of XML (XSLT, attributes, etc.)

In the end, I think it's just subjective. All of the above formats are equally capable of representing the same data.

jbooth|13 years ago

I don't know a lot, but:

XML looked like HTML at a time when the web was the "next big thing". Like, not a technology on the web, or social media, HTML itself was this big revelation. So a solution that looks like HTML has a leg up.

Then, once there were mature parsers in a lot of languages, server software configured by it, etc, XML had some inertia that takes time to displace.

quux|13 years ago

Formats very similar to JSON have been invented many times, for instance NeXT/Apple used to have a human readable format for their plist files that was basically JSON with different characters. http://code.google.com/p/networkpx/wiki/PlistSpec

For some reason they stopped using text plists and replaced them with an ugly XML serialization, but you still see the old format in Xcode's debugger if you log an NSArray or NSDictionary to the console.

untog|13 years ago

I've actually always liked the idea of XML at its core: attributes and so on often make data structures easier to understand (just look at HTML), but namespacing and all that junk ruined the whole thing.

The only reason I still use XML every now and then is XPath. There are third-party alternatives for JSON, but XPath is ubiquitous.
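
For a feel of what an XPath query buys you, here is a minimal sketch using Python's standard ElementTree, which implements only a limited subset of XPath (a full XPath engine supports far more). The document and element names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for this example.
doc = ET.fromstring(
    "<library>"
    '<book lang="en"><title>SICP</title></book>'
    '<book lang="de"><title>Faust</title></book>'
    '<book lang="en"><title>TAOCP</title></book>'
    "</library>"
)

# One path expression: every <title> under a <book> whose lang attribute is "en".
english_titles = [t.text for t in doc.findall(".//book[@lang='en']/title")]
```

The equivalent over parsed JSON would typically be a hand-written loop or comprehension, which is fine for small cases but is not a shared query language.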

TylerE|13 years ago

JSON conceptually has been around forever.

Lisp s-exps date to the original McCarthy paper in 1958, and could represent pretty much everything you can do with JSON, key-value pairs, lists, nested structure, etc.

ucee054|13 years ago

From a big data perspective, I'm pretty sure people were making do with CSV files before JSON came along. I think most practitioners would not subject themselves to stupid, stupid XML unless they really had to.

lenkite|13 years ago

Well-written XML that was designed for humans instead of machines is much, much easier to read than JSON. The primary reason is that, unlike s-expressions or XML, JSON has no block name. In JSON you lose valuable time figuring out the block context in a hierarchy since it isn't labelled.

The only kind of JSON that is readable is flat JSON that is nested to a maximum of 1 level.

digisign|13 years ago

Either format can be pretty printed. If you need signposts to figure out where you are, it is a simple matter to add dictionary key names to things.
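
A quick sketch of that idea in Python, with made-up data: the same information once as bare nested lists and once with dictionary keys acting as signposts, then pretty printed with the standard json module.

```python
import json

# Hypothetical data as unlabelled nested lists: hard to tell what each level means.
unlabelled = [["Ada", [["Python", 3], ["Lisp", 10]]]]

# The same data with key names at every level.
labelled = {
    "users": [
        {
            "name": name,
            "skills": [{"language": lang, "years": years} for lang, years in skills],
        }
        for name, skills in unlabelled
    ]
}

# indent=2 produces one key per line, so the nesting is visible at a glance.
pretty = json.dumps(labelled, indent=2)
```

The cost is a larger payload, since every key name is repeated for every record.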

nemetroid|13 years ago

I don't foresee JSON ever replacing XML as a "full-blown successor" in the contexts where XML actually is useful: for marking up documents.

As a general data storage format, XML is certainly going away.

TazeTSchnitzel|13 years ago

It might for non-text document structures, perhaps.

pumba_lt|13 years ago

JSON is not a silver bullet. Actually I think JSON-only APIs suck -- an API should have an equivalent XML alternative as well. Let me explain.

Web APIs are not only consumed by client-side Javascript-based AJAX apps -- they are also used by server-side (web)apps where Javascript is much less widespread. If the primary application language is not Javascript for which JSON is a native format, but PHP or Java for example, then its value is much lower.

There are established industries such as publishing that use complex XML workflows -- I don't think JSON will push them out.

The XML family so far has much better standard specifications and tool support. Some of the most useful are XPath and XSLT. There are also advanced features -- too complex for some, useful for others -- like namespaces and schemas. If JSON is to expand its use, it will have to go through the same interoperability issues XML addressed, and develop similar features with similar problems. That's why the idea of JSON schemas sounds funny to me.

Let me give an example. I've developed a semantic tool that lets me import 3rd-party API data as RDF. If it is available in XML, I can apply a GRDDL (basically XSLT) transformation to get RDF/XML -- and boom, it's there. RDF/XML serves as the bridge format between XML and RDF.

Now if the data is JSON-only, what do I do? I could download an API client, try to write some Java or PHP code, but that would be much less generic and extensible than XSLT. I could probably try a pivotal conversion via JSON-LD somehow, but oh, bummer -- there's no JSON transformation language? Or is there... Javascript? Thanks, I would prefer XSLT any day since it is designed specifically for this kind of task.

My point is, by offering JSON-only you cut off all the useful tools from the XML world, which is pretty well established. I see JSON as an alternative syntax to XML, which is easier to use with Javascript -- but by no means THE "right tool" to all data serialization problems.

nirvdrum|13 years ago

One of my biggest issues with JSON is it's a lot harder to generate valid JSON as a stream. Granted this may be an esoteric use case, but the quoting rules and type representations seem to require some amount of look ahead which isn't fun when generating that stream.
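
For reference, a minimal sketch of emitting a JSON array incrementally in Python. It assumes each element is small enough to serialize on its own with the standard json.dumps, so quoting and escaping are handled per element and the only state carried across elements is whether a comma is needed. It does not address streaming a single huge string or number, which may be the harder case meant here. The function name stream_json_array is hypothetical.

```python
import json
from typing import Iterable, Iterator


def stream_json_array(items: Iterable[object]) -> Iterator[str]:
    """Yield chunks that concatenate to one valid JSON array."""
    yield "["
    first = True
    for item in items:
        # JSON forbids a trailing comma, so the separator goes before
        # every element except the first.
        if not first:
            yield ","
        first = False
        # Each element is serialized (and escaped) independently.
        yield json.dumps(item)
    yield "]"


# Example: items come from a generator, so nothing is buffered up front.
chunks = stream_json_array({"id": i, "note": 'say "hi"'} for i in range(3))
streamed = "".join(chunks)
```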

antihero|13 years ago

For human readable stuff, I don't know why we don't use YAML more often. The serializer is utterly fantastic, though I don't think JavaScript supports parsing it quickly.

tmcw|13 years ago

Poorly implemented parsers, especially in Javascript.

TazeTSchnitzel|13 years ago

I think JSON is more popular than XML for a lot of things simply because it's so much simpler to interact with. No querying attributes, elements, elements inside elements, text inside elements, etc. You just look up the value attached to a key, or look up an index in an array, and that's it. It's simple every level down. And it's also simple to construct.

streptomycin|13 years ago

> [XML] enabled people to do previously unthinkable things, like exchange Microsoft Office documents across HTTP connections.

wat

Kilimanjaro|13 years ago

JSON is data. XML is markup. JSON won.

Move on.