The cost of parsing JSON

543 points | s9w | 6 years ago | v8.dev

293 comments

[+] bryanrasmussen|6 years ago|reply
I actually think the previous title of this article, which was something like JSON.parse being faster than object instantiation, was clearer, because in English the cost of something implies a negative, whereas here the performance cost is a benefit relative to another solution with a higher cost.

maybe I'm being picky though.

[+] tpurves|6 years ago|reply
I was expecting something about the extent to which JSON processing in the world contributes to global warming or some such
[+] leovailati|6 years ago|reply
I agree. I was expecting something about protocol buffers or a binary based representation of JSON.
[+] Tade0|6 years ago|reply
I think this fragment catches the spirit of this piece:

A good rule of thumb is to apply this technique for objects of 10 kB or larger — but as always with performance advice, measure the actual impact before making any changes.

Although it may still not be worth it. At work I have a hand-rolled utility for mocking the backend using a .har file (which is JSON). I use it to reproduce bugs found by the testers, who are kind enough to supply me both with such a file and a screencast.

On a MacBook Pro a 2.6MB .har file takes about 140ms to parse and process.

[+] Klathmon|6 years ago|reply
I find this really interesting, because at some point the absolute performance benefit of `JSON.parse` is overshadowed by the fact that it blocks the main thread.

I worked on an app a while ago which would have to parse 50mb+ JSON objects on mobile devices. In some cases (especially on mid-range and low-end devices) it would hang the main thread for a couple seconds!

So I ended up using a library called oboe.js [1] to incrementally parse the massive JSON blobs, putting liberal `setTimeout`s between each step to avoid hanging the main thread for more than about 200ms at a time.

This meant that it would often take 5x longer to fully parse the JSON blob than just using `JSON.parse`, but it was a much nicer UX: the UI never hung or froze during that process (at least perceptibly), and the user wasn't waiting on the parsing to use the app, since there was still more user input I needed from them at that point. So even though parsing now often took 15+ seconds, the user was typically spending 30+ seconds entering more information, and the UI stayed fluid the whole time.
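The yielding part of that approach can be sketched without oboe.js; the chunk size and names here are arbitrary, and oboe.js additionally streams the parse itself, which this sketch does not attempt:

```javascript
// Process a parsed array in slices, handing control back to the event
// loop between slices so the main thread never blocks for long. This
// only illustrates the "liberal setTimeout between steps" part.
function processInChunks(items, handleItem, chunkSize = 1000) {
  return new Promise((resolve) => {
    let i = 0;
    function step() {
      const end = Math.min(i + chunkSize, items.length);
      for (; i < end; i++) handleItem(items[i]);
      if (i < items.length) {
        setTimeout(step, 0); // yield so UI and input events can run
      } else {
        resolve();
      }
    }
    step();
  });
}
```

In a browser, the `setTimeout` gaps are what let input handling and rendering run between slices, which is why total time grows but perceived responsiveness improves.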

[+] stu_k|6 years ago|reply
You might be interested in a tool I wrote to serve .har files called server-replay: https://github.com/Stuk/server-replay

It also allows you to overlay local files, so you can change code while reusing server responses.

[+] 19ylram49|6 years ago|reply
I mean, I get it, but I think performance is overrated in this particular case; unless it’s a significant and/or very noticeable difference, stick to object literals, please. I’d probably fire someone if I started to see `JSON.parse(…)` everywhere in a codebase just for “performance reasons” … remember, code readability and maintainability are just as important (if not more).
[+] SirensOfTitan|6 years ago|reply
> I'd probably fire someone if I started to see `JSON.parse(...)`

I've had the privilege of working in organizations that consider mistakes to be the cornerstone of resilient systems. Because of that, comments like this scare me, even when intentionally hyperbolic. Moreover, if the product works well and is being maintained easily, why would you micromanage like that? It sounds like a minor conversation, only worth having if the technical decision is having a real impact.

Thomas J. Watson:

> Recently, I was asked if I was going to fire an employee who made a mistake that cost the company $600,000. No, I replied, I just spent $600,000 training him. Why would I want somebody else to hire his experience?

[+] flabbergast|6 years ago|reply
> I’d probably fire someone if I started to see `JSON.parse(…)` everywhere in a codebase just for “performance reasons” …

Yep, and I'd fire you for doing that! There are better ways to manage than showing off your authority. And by the way, do you really think a few JSON.parse statements added for performance would be the worst thing in your codebase(s)? I can't believe they would be. Also, if using JSON.parse to create big objects really helps performance, who cares? Instead of firing 'someone', maybe you could add a comment for readability (or, if that is beneath you, ask the developer to add one).

Sorry, but I hate people who misuse their authority by imposing their subjective opinions.

[+] Klathmon|6 years ago|reply
They say in the linked article that this should only be used for objects about 10kb and larger.

I'd argue that if you have 10kb or larger object literals in your codebase, you are already missing the mark on readability and maintainability in some ways.

[+] untog|6 years ago|reply
> remember, code readability and maintainability are just as important (if not more).

I don't know about that. Prioritising making your own job easier over the experience of all your end users feels like a much more fireable offense to me.

In this particular case I'm still a little wary of it, because it feels like it's optimising for a current implementation with no idea what the future performance implications might be (or the current implications in non-V8 engines). But this trend of prioritising developer experience over everything feels like a very bad one to me. It's the same reason given to justify making every web site a React app with no thought for the extra JS payload you're sending when it's not needed.

[+] DrJokepu|6 years ago|reply
I would fire middle managers for firing individual contributors for trivial, easily correctible issues like that.
[+] dahart|6 years ago|reply
It'd certainly be a good idea to understand exactly what the alternative is when you see JSON.parse() before deciding it's bad or firing anyone, right? There are definitely some legit cases for JSON.parse(). Not to mention that a full round of you setting clear expectations, giving examples of what's recommended and what's not, giving people a chance to learn & grow, and documenting repeat offenses, should all be done before booting someone...?

Deep-copying JSON objects using stringify+parse is not just faster, but less problematic and less code than writing a recursive object copy routine.
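The stringify+parse copy is a one-liner, with the caveat that only JSON-safe values survive the round trip (functions and `undefined` are dropped; `Date`s become strings):

```javascript
// Deep copy via serialization. Mutating the copy afterwards cannot
// affect the original, even for nested objects and arrays.
function deepCopy(obj) {
  return JSON.parse(JSON.stringify(obj));
}
```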

[+] 5trokerac3|6 years ago|reply
First paragraph...

> This knowledge can be applied to improve start-up performance for web apps that ship large JSON-like configuration object literals

Third paragraph...

> A good rule of thumb is to apply this technique for objects of 10 kB or larger — but as always with performance advice, measure the actual impact before making any changes.

I'd fire people who don't RTFM

[+] jackcodes|6 years ago|reply
I wouldn’t mind having this in my build step, as it’s all minified and unreadable anyway, so what do I care, but I agree with you fully.

Not only would you be missing out on readability, none of your linters will catch errors within that string any more, and if you use something like Prettier, well, god help you. Doing it manually, you're almost guaranteed to waste more time than you'll save.

[+] lacker|6 years ago|reply
Well, they are suggesting it for literals that are 10 kB or larger. That means they aren't really talking about code that's in your normal codebase - it's quite rare to have a literal that large. It is more likely this is relevant for backend tools that autogenerate JavaScript code to be sent to a client.
[+] tracker1|6 years ago|reply
For the main two apps I work on, there are some configurations that differ between client deployments: i18n strings, configuration settings/options, theme options and a couple of images (base64 encoded) for theming. Switching to JSON.parse made a pretty significant impact, from over 200ms to under 100ms for my specific use case (IIRC). Memory usage was also reduced.

I don't remember the specific numbers... it was an easy change in the server handler for the base.js file that injects a __BASE__ variable.

    var clientConfig = JSON.Stringify(base.Env.Settings.ToClient(null))
        .Replace("\\", "\\\\")   // escape backslashes first,
        .Replace("\"", "\\\"");  // then quotes, so the JS string literal stays valid
    // NOTE: JSON.parse is faster than direct JS object injection.
    ClientBase = $"{clientTest}\nwindow.__BASE__ = JSON.parse(\"{clientConfig}\")";
    ...
    return Content($"{ClientBase}\n__BASE__.acceptLanguage=\"{lang}\";", "application/javascript");
The top part is actually a static variable that gets reused for each request, the bottom is the response with the request language being set for localization in the browser app.
[+] eyelidlessness|6 years ago|reply
I totally agree that inlining `JSON.parse` of string literals in source is a bad idea and I would reject it in a code review except under the most extreme circumstances (and even then try to identify a better solution).

On the other hand, knowing the performance characteristics, this is something that compilers could do as an optimization. Who knows if that's worth the effort, but this kind of research is part of determining that.

[+] tzs|6 years ago|reply
The JSON.parse approach might also be useful if the same data needs to be used in non-JavaScript code too.

You could then use the same string in JSON.parse(...) in your JavaScript, json_decode(...) in your PHP, JSON::Parse's parse_json(...) in your Perl, json.loads(...) in Python, and so on.

If you do have constant data that needs to match across multiple programs, it will probably be better in many or even most applications to store the constant data in one place and have everything load it from there at run time, but for those cases where it really is best to hard code the data in each program, doing so as identical JSON strings might reduce mistakes.

[+] geddy|6 years ago|reply
> I’d probably fire someone if I started to see `JSON.parse(…)`

Guys - I think he was being hyperbolic. Ya know, like everyone does on the Internet. If he had said "if I had to look at JSON.parse(...) lines constantly, I'd jump off a building!" I doubt you all would be calling 911 over an attempted suicide.

Seriously, chill.

[+] quickthrower2|6 years ago|reply
If I used this one weird trick, I'd want it to be compile time checked.

I'd stick that JSON in a separate file, get typescript to compile it "just to check it's OK" then get the compiled code and include it as a string using something like https://webpack.js.org/loaders/raw-loader/, I guess (not used it before).

There might be a leaner way to do this (maybe the whole thing can be done as a webpack loader in one step), but something like this.

[+] beatgammit|6 years ago|reply
They mentioned that it should only be used for very large objects (say, 10 kB), so if you're seeing ~10 kB hard-coded objects throughout your code, you should probably fire someone. If it's in just a few places, there should be a comment describing it (e.g. "large object constructed from DB query, use JSON to make page load faster").
[+] thoughtpalette|6 years ago|reply
Believe you can use "Interceptors" or the Adapter pattern on the Front-end to easily use JSON.parse once for all your http calls instead of littering it throughout the code base.
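A minimal sketch of that idea, assuming a `fetch`-based client; the wrapper name and error handling are made up for illustration:

```javascript
// Centralize JSON.parse in one wrapper instead of scattering it
// across every call site that makes an HTTP request.
async function fetchJson(url, options) {
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`HTTP ${response.status} for ${url}`);
  }
  return JSON.parse(await response.text());
}
```

Axios-style interceptors accomplish the same thing declaratively; the point either way is that parsing happens in exactly one place.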
[+] iamleppert|6 years ago|reply
Why do you care? It's syntax, and it can be automated via build tools, so you need not hurt your eyes with syntax you consider unpleasant.

And that's the crux of the issue here: your opinion.

[+] AgentOrange1234|6 years ago|reply
TFA says this could make sense for objects over 10 kB. They clearly aren't advocating doing it everywhere in a code base.
[+] edf13|6 years ago|reply
No there’re not
[+] mumblemumble|6 years ago|reply
Deliberately provocative conversation piece:

If you're concerned enough about performance, or message passing costs are enough of an overall performance bottleneck, that parsing your messages even 1.7x as fast is worth changing the way you code, you probably shouldn't be using JSON as your message format in the first place.

[+] michaelmcmillan|6 years ago|reply
So I guess we should "transpile" static objects into strings that we call with JSON.parse?

Not sure if I should end this comment with a /s or not.

[+] IggleSniggle|6 years ago|reply
Commenting on the security issue from the end of the explainer, for visibility.

I’m having flashbacks to the Java serialize vulnerabilities from a couple years ago.

ECMAScript and JSON do not have the same set of escape characters:

> Note: It's crucially important to post-process user-controlled input to escape any special character sequences, depending on the context. In this particular case, we're injecting into a `<script>` tag, so we must (also) escape `</script`, `<script`, and `<!--`.
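One common way to satisfy that requirement when generating the embedded string is to escape every `<` in the produced JS string literal as `\u003C`, which covers all three dangerous sequences at once. This is an illustration with a made-up function name, not the article's exact recipe:

```javascript
// Produce a JS string literal of the JSON text, with every "<"
// escaped as \u003C so the HTML parser can never see </script,
// <script or <!-- inside an inline <script> block. JS unescapes
// \u003C back to "<" before JSON.parse ever runs.
function toInlineScriptJsonLiteral(value) {
  const literal = JSON.stringify(JSON.stringify(value));
  return literal.replace(/</g, "\\u003C");
}
// Usage: `window.__DATA__ = JSON.parse(${toInlineScriptJsonLiteral(data)});`
```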

[+] drinchev|6 years ago|reply
Well, I guess this means that if you have 1k+ lines of static JSON, you should consider converting it to a string and using JSON.parse instead.

I'm not sure I can find a use case of such a big object declaration. Usually what you do is to get it from somewhere ( file, db - with nodejs, xhr ) where it's been parsed with JSON.parse anyway.

[+] gqcwwjtg|6 years ago|reply
This makes total sense. It really just means that the time it takes to run JSON.parse on a string literal is offset by how much simpler and faster parsing JSON is than parsing a JS object.
[+] epx|6 years ago|reply
This is the XOR AX,AX of the 21st century.
[+] ijpoijpoihpiuoh|6 years ago|reply
Title correction: it's faster to parse and initialize using JSON than to parse and compile the code required to initialize Objects directly. If the code is already compiled, it's much faster to initialize in code. At least that's my understanding of the linked article.
[+] drtz|6 years ago|reply
Couldn't this same performance boost be achieved by adding an optional "strict mode"-like flag for object literals to v8? Adding JSON.parse(…) everywhere you need an object literal seems exceptionally kludgy, even for JS.
[+] xsmasher|6 years ago|reply
something like let map = {{ x:7, y:13 }};

where the double-brackets promise you're only going to do JSONish stuff in there

[+] no_gravity|6 years ago|reply

    Because the JSON grammar is much simpler than
    JavaScript’s grammar, JSON can be parsed more
    efficiently than JavaScript.
Hmm.. shouldn't that hold for most programming languages then?

Let's try it for PHP:

    time php -r 'for ($i=0;$i<10000000; $i++) $data = [1,2,3];'
    
    real 0m0,173s
    user 0m0,161s
    sys  0m0,012s

    time php -r 'for ($i=0;$i<10000000; $i++) $data = json_decode("[1,2,3]");'
    
    real 0m4,125s
    user 0m4,120s
    sys  0m0,005s
So for 10 million repetitions, a small PHP structure is about 20x faster than parsing JSON. But to test the point of the article, one should use the same data structure it uses (https://raw.githubusercontent.com/WebKit/webkit/ffdd2799d323...) and parse it only once.
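The single-parse variant of that test can be sketched in Node as follows; the object is illustrative (not the WebKit fixture the article uses), and `eval` stands in for compiling an object literal:

```javascript
// Build the same data once as JSON text and as a JS expression.
const data = {
  items: Array.from({ length: 100000 }, (_, i) => ({ id: i, name: "item" + i })),
};
const json = JSON.stringify(data);
const js = "(" + json + ")"; // identical text, but parsed as JavaScript

let t = process.hrtime.bigint();
const viaJson = JSON.parse(json);
const jsonNs = process.hrtime.bigint() - t;

t = process.hrtime.bigint();
const viaEval = eval(js); // stands in for an object literal in source
const evalNs = process.hrtime.bigint() - t;

console.log(`JSON.parse: ${jsonNs / 1000n} us, JS literal: ${evalNs / 1000n} us`);
```

As with any microbenchmark, run it several times and on representative data before drawing conclusions; warm-up and caching effects can swamp a single measurement.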
[+] nicoburns|6 years ago|reply
I wish there was an option for off the main thread ("async") parsing of JSON. It's easy to cause UI lag by parsing (or serializing) large objects, and this seems very unnecessary.
[+] adossi|6 years ago|reply
I see this sentiment echoed a few times in this discussion thread. I use JSON daily, parsing and serializing everywhere, and yet I've never been in a scenario where I needed to parse a very large JSON object on the client side. The biggest JSON objects I work with are in the neighborhood of ~80 MB, but the client's browser never sees those behemoths. At most I return a few KB to the client's browser via an API response where it is then parsed and consumed.

Is this notably a problem where the back-end is written in JavaScript?

[+] no_wizard|6 years ago|reply
Since a very common use of an object is simply as a Map, I wonder if this would apply to using Maps
[+] jedimastert|6 years ago|reply
AFAIK (I'm not a js expert) there's not a huge difference behind the scenes.
[+] xg15|6 years ago|reply
I'm somewhat surprised that parsing a JS object literal is slower than: tokenizing a string literal, resolving all escapes/unicode encodings/etc., resolving the JSON object, resolving the JSON.parse() function, invoking the function, context-switching to native, and then actually parsing the JSON.

(Though you can probably do some optimizations, such as treating "JSON.parse" as a keyword if you can be sure nothing tampered with it)

However, if that's the case, it sounds like a good candidate for an optimisation for V8: why not speculatively try to parse object literals as JSON and only fall back to JS if this causes an error?

Also, didn't their note about Chrome's bytecode cache kind of defeat their point? Yes, JSON would be faster on first load, but it should be slower on subsequent loads as the parsed object can be pulled from the bytecode cache while the JSON literal has to be parsed again on each load.

[+] anderslaub|6 years ago|reply
Dangerous advice if object values can potentially contain string terminators...

var obj = { "Name": "Joe`s sloppy place" };

will work well as plain object while

obj = JSON.parse('{ "Name": "Joe's sloppy place" }');

Will kill your site.. hard to control especially with several different unicode characters all being interpreted as plings by json parse..
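The hazard goes away if the string literal is generated rather than hand-written. As a sketch, applying JSON.stringify twice yields a correctly escaped JS string literal:

```javascript
// Double-stringify: the outer call turns the JSON text into a valid,
// fully escaped JavaScript string literal, so apostrophes, quotes and
// backslashes in the data cannot terminate the string early.
function toJsonParseCall(obj) {
  return "JSON.parse(" + JSON.stringify(JSON.stringify(obj)) + ")";
}
```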

[+] higherkinded|6 years ago|reply
Curious! Instantiation by means of the language is actually slower than parsing stuff and instantiating the objects that way. Didn't ever think of that possibility before. Anyway, I understand that JavaScript is much more complicated to parse than bare JSON, but still, that just feels odd. Good read!

[+] olliej|6 years ago|reply
Having worked on parsing + code generation for object literals, the performance is not just a matter of JSON being simpler (IMO the difference in parsing cost is negligible).

The problem with object literals is the cost of code generation, in cpu time and memory usage, and then subsequent execution. The difference is so monumental that JSC will try to parse any JS first as a JSONP-style object literal, because the cost of attempting the object literal parse is so small.

For small literals that are hit multiple times, the literal will be faster in the long term: JSON.parse() results in an opaque shape for the result, which hinders lowering, etc., and the implementation is somewhat generalized to the case of large object graphs, so many of the space improvements that happen for object literals don't happen.

[+] vardump|6 years ago|reply
Battery life.

People seem to love to dismiss small improvements, even when they're practically effortless. Remember, faster performance means lower battery consumption. Bad, slow JavaScript makes browsers waste energy.

Small things add up.