    # create a tag called "date" with a date value of Dec 5, 2005
    date 2005/12/05
    # a date-time literal without a timezone
    here 2005/12/05 14:12:23.345
    # a date-time literal with a timezone
    in_japan 2005/12/05 14:12:23.345-JST
I removed comments from JSON because I saw people were using them to hold parsing directives, a practice which would have destroyed interoperability. I know that the lack of comments makes some people sad, but it shouldn't.
Suppose you are using JSON to keep configuration files, which you would like to annotate. Go ahead and insert all the comments you like. Then pipe it through JSMin before handing it to your JSON parser.
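The "comment freely, then strip" workflow is easy to sketch. Below is a simplified stand-in for JSMin (the real tool does considerably more; the function name and sample document here are made up for illustration): it removes // and /* */ comments while copying string literals verbatim, so the result can be handed to an ordinary JSON parser.

```python
import json

def strip_comments(text: str) -> str:
    """Very simplified JSMin-style pass: drop // and /* */ comments,
    but copy string literals verbatim so a value like "a//b" survives."""
    out, i, n = [], 0, len(text)
    while i < n:
        if text[i] == '"':                 # string literal: copy as-is
            j = i + 1
            while j < n and text[j] != '"':
                j += 2 if text[j] == '\\' else 1
            out.append(text[i:j + 1])
            i = j + 1
        elif text.startswith('//', i):     # line comment: skip to newline
            nl = text.find('\n', i)
            i = n if nl == -1 else nl
        elif text.startswith('/*', i):     # block comment: skip past */
            end = text.find('*/', i + 2)
            i = n if end == -1 else end + 2
        else:
            out.append(text[i])
            i += 1
    return ''.join(out)

annotated = '{ /* config */ "port": 8080, "host": "a//b" // inline\n}'
print(json.loads(strip_comments(annotated)))  # {'port': 8080, 'host': 'a//b'}
```

Note the string-literal case: a naive regex would mangle `"a//b"`, which is exactly why people reach for a real minifier rather than rolling their own.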
>I removed comments from JSON because I saw people were using them to hold parsing directives, a practice which would have destroyed interoperability.
I call BS. If people wanted custom parsing directives they could send them out of band, encode them in the filename, or whatever. But they don't. And I've not seen this happening with most other serialisation formats either, so why would JSON be a particular target? After all, its value comes from being trivially parsable across languages, and custom parsing directives would kill exactly that. Anyone who wanted them would implement their own parsers anyway.
Addition: besides, reading comments to decide how to parse implies either "comments at the top of the file" or two-stage parsing.
With two-stage parsing, you could implement comments and whatever else yourself, even in pure JSON.
As for "comments at the top of the file": just disallow them (only allow comments after the first JSON object starts), and the "parsing directives" issue disappears...
Okay, so the reasoning is we remove a highly useful feature that most people who use JSON regularly want, because some people were abusing it and using terrible practices?
Ugh. Comments are also useful for disabling things without actually deleting them.
My hacky in-band work-around is for the persistence layer (and other code) to ignore any dictionary in an array that has the "//" key, so I can put a "//": "DISABLED" key at the top of a dict to disable it (and document that, and why it's disabled).
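A minimal sketch of that work-around (the function name and config contents here are hypothetical): the loader simply filters out any dict carrying the "//" marker key before use.

```python
def active_entries(entries):
    # Ignore any dict in the list that carries the "//" marker key;
    # its value doubles as documentation for why the entry is disabled.
    return [e for e in entries if not (isinstance(e, dict) and "//" in e)]

config = [
    {"name": "cache", "size": 256},
    {"//": "DISABLED: flaky on staging", "name": "metrics", "port": 9100},
]
print(active_entries(config))  # [{'name': 'cache', 'size': 256}]
```

The nice property is that the file stays 100% standard JSON, so every off-the-shelf parser still accepts it; only this one loader assigns the key a special meaning.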
Timestamps are so complicated once you factor in timezones and daylight saving time that they don't belong in JSON. Time zones are not static: they can change from country to country, or even between states within a country. Ditto for when daylight saving time starts during the year - even changing over the years. There is no rhyme or reason to any of this. The data has to be stored in tables, and time zone meanings can change retroactively. The only reliable timestamp is UTC without leap seconds. (Speaking of leap seconds, who thought seconds going from 0-60 rather than 0-59 was a good idea?)
Accurate time is one of the most difficult things to model in computer science.
Time is actually quite simple if you have a good mental model of what you're trying to represent and don't try to mix different concepts into a single value. This talk explains it VERY nicely: https://www.youtube.com/watch?v=2rnIHsqABfM
Basically, just decide whether you're trying to store an absolute time (a timestamp will do) or a civil time (year, month, day, etc.) and treat them as two separate data types.
(If you just use "civil time + offset from UTC" like RFC 3339 does, then you can convert it to an absolute time, but you can convert only that one specific value using that offset, and not any other - i.e. that offset is not a substitute for an actual timezone identifier.)
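A small Python illustration of that last point (dates and zone chosen arbitrarily; `zoneinfo` needs Python 3.9+ and system tzdata): the RFC 3339-style "civil time + offset" value converts to exactly one absolute instant, but the offset cannot answer questions like "same wall time tomorrow" across a DST boundary - only a real zone identifier can.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

# Civil time + fixed offset, as RFC 3339 carries it:
stamped = datetime(2016, 3, 12, 10, 0, tzinfo=timezone(timedelta(hours=-5)))
instant = stamped.astimezone(timezone.utc)   # one unambiguous instant - fine

# But the offset can't do timezone-aware arithmetic:
offset_tomorrow = stamped + timedelta(days=1)               # still -05:00
zoned = datetime(2016, 3, 12, 10, 0, tzinfo=ZoneInfo("America/New_York"))
zoned_tomorrow = zoned + timedelta(days=1)                  # DST began Mar 13, 2016

print(offset_tomorrow.utcoffset())   # stays at -5 hours
print(zoned_tomorrow.utcoffset())    # jumps to -4 hours (EDT)
```

The two "tomorrow" values name different instants, which is exactly why an offset is not a substitute for a timezone identifier.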
> The only reliable time stamp is UTC without leap seconds.
That doesn't make a lot of sense, as UTC does have leap seconds. It's like saying that the fastest car has no wheels, when you really mean that the fastest vehicle is a rocket.
TAI is the most reliable and the easiest to work with: it simply counts atomic-clock seconds at sea level. https://en.wikipedia.org/wiki/International_Atomic_Time
However, TAI already differs from UTC (and therefore civil time) by roughly 40 seconds, and dropping leap seconds from civil time will slowly shift midnight to later in the day.
But as the frequency of leap seconds increases, maintaining UTC will become harder. Dropping leap seconds from UTC will be considered in 2023. It is unlikely that people will care about the sun rising at midnight 30000 years from now.
I don't think # or // for comments is a very good idea, as it would also make newline characters significant. I find it useful to be able to store one JSON object per line.
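The one-object-per-line style the comment refers to (often called JSON Lines or NDJSON) works because a compactly serialized JSON value contains no raw newlines, so the newline can serve as a record separator; a minimal reader looks like this:

```python
import json
import io

# A stand-in for a file of newline-delimited JSON records.
stream = io.StringIO('{"a": 1}\n{"b": [2, 3]}\n')
records = [json.loads(line) for line in stream if line.strip()]
print(records)  # [{'a': 1}, {'b': [2, 3]}]
```

Newline-terminated comments would break this property: a record could no longer be round-tripped onto a single line without first stripping its comments.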
Personally, I would really like to see integer object keys (as opposed to only string keys). For simple numeric mappings, strings feel really heavy and require annoying conversions in most languages. E.g. {"10": 60, "42": 2}.
The flipside is that an integer-keyed map is similar in meaning to an array, which associates an integer (the index) with the value sitting at that position.
While it's possible to spec the format to forbid this interpretation, Lua has made it a language feature, and it would become impossible to construct an unambiguous parser/printer in Lua for this new format.
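The "annoying conversion" is easy to see in Python, where JSON object keys always come back as strings no matter what went in:

```python
import json

d = {10: 60, 42: 2}
s = json.dumps(d)          # json.dumps silently coerces int keys to strings
back = json.loads(s)
print(back)                # {'10': 60, '42': 2} - no longer equal to d
restored = {int(k): v for k, v in back.items()}  # the manual round-trip step
assert restored == d
```

So a numeric mapping doesn't survive a JSON round trip without a by-hand key conversion on the way back in, which is the friction the comment is pointing at.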
This guy is an idiot anyway. There's no way to "fix JSON". All you can do is create a new language; it doesn't matter if you call it JSON 2.0, it will still be incompatible with all the JSON parsers of today. I don't get why he is so mad at people suggesting he use one of the JSON supersets that exist today.
If // and /* are used for comments, then most of this new extended JSON will still be valid Javascript.
If # is used for comments, then documents stop being valid Javascript.
The post says that "don't eval() JSON ever", but that's like Crockford leaving out comments originally in order to stop them being abused as processor directives...
Like the post says, JSON is already not guaranteed to be valid JS, so this isn't really a problem. The fact that 99% of the time it works to just eval it is great, and granted, the "feature" that triggers this incompatibility is a bit obscure.
But if you just do the right thing from the start you'll never have a thing to worry about in the first place, # comments or not.
JSON (for the most part) is a nice format to work with, aside from the loosely defined datetimes as mentioned.
The two areas where I believe the format could be greatly improved: 1) a standard way to define the structure (sometimes schemas can be handy!); 2) a standard binary format. Yes, right now we have UBJSON (which doesn't have a date format - this is even worse in binary) and BSON (which contains some MongoDB-specific stuff).
I'm not saying they don't have their place, but... Protocol Buffers are more akin to .NET or Java serialization, in that they're quite fragile if used across different versions and/or different vendors.
> “Just use X”: For values of X including Hjson, Amazon Ion, edn, Transit, YAML, and TOML.
> Nah, most of them are way, way richer than JSON, often with fully-worked-out type systems and Conceptual Tutorials and so on.
What? MOST OF THEM? YAML is not, Hjson is not, TOML is not.
YAML... really? Looking at the examples in the Wikipedia article (https://en.wikipedia.org/wiki/YAML) gives me a headache. Fortunately, most actual YAML files I've seen are not that complicated.
> He summarized the most upvoted posts from the last thread [1] really well.
I feel like he glossed right past the objections to the biggest and (to my mind) most destructive proposed change, the commas-to-whitespace thing; in fact he doubles down on it (let's just declare that commas are whitespace! That surely won't confuse anyone!).
Zardoz84 | 9 years ago:
Full example: https://github.com/Abscissa/SDLang-D/wiki/Language-Guide#exa...
Examples: Creating a Tree; Creating a Matrix; A Tree of Nodes with Values and Attributes; Date and Date/Time Literals (and comments!)
RoryH | 9 years ago:
https://plus.google.com/+DouglasCrockfordEsq/posts/RK8qyGVaG...
moonshinefe | 9 years ago:
That's terrible reasoning.
thaumasiotes | 9 years ago:
Who thought February going from 1-29 instead of 1-28 was a good idea?
I don't understand why everyone seems to believe that the phenomena are so inherently different.
thymelord | 9 years ago:
http://msgpack.org/index.html
JSON Schema draft 4 is the de facto schema standard for JSON:
http://json-schema.org/latest/json-schema-core.html
Having used both XML Schema and JSON Schema, I much prefer the latter.
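To make the appeal concrete, here is a toy validator covering a tiny slice of what JSON Schema draft 4 specifies (`type`, `required`, and `properties` only); this is a sketch to show the idea, and real code should use a proper library such as the Python `jsonschema` package instead:

```python
def check(instance, schema):
    # Toy subset of JSON Schema draft 4: handles "type", "required",
    # and "properties". Everything else in the spec is ignored.
    types = {"object": dict, "array": list, "string": str,
             "number": (int, float), "integer": int, "boolean": bool}
    t = schema.get("type")
    if t and not isinstance(instance, types[t]):
        return False
    for key in schema.get("required", []):      # draft-4 style required list
        if key not in instance:
            return False
    for key, sub in schema.get("properties", {}).items():
        if key in instance and not check(instance[key], sub):
            return False
    return True

schema = {"type": "object", "required": ["name"],
          "properties": {"name": {"type": "string"},
                         "port": {"type": "integer"}}}
print(check({"name": "api", "port": 8080}, schema))   # valid document
print(check({"port": "8080"}, schema))                # missing "name", wrong type
```

Even this stub shows why schemas are handy: the shape of the data is declared once, in data, rather than re-checked ad hoc at every consumer.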
niftich | 9 years ago:
[1] https://news.ycombinator.com/item?id=12328088
Regarding datetimes, it's worth pointing out the conversation TOML had about it. It's a pretty long read [2][3][4][5] with lots of points raised for and against, but it also shows some of the process by which consensus was eventually forged: through trial and error, some enlightening realizations, expert opinions, and a willingness to leave some aspects of the behavior up to the parser, to avoid requiring all other languages to reimplement half of Java 8 Time.
[2] https://github.com/toml-lang/toml/pull/414
[3] https://github.com/toml-lang/toml/pull/362
[4] https://github.com/toml-lang/toml/issues/412
[5] https://github.com/toml-lang/toml/issues/263
The salient point being that RFC 3339 does not in truth describe exactly one datatype, so you can't just reference the spec and hope everyone reads it the same way. EDIT: Specifically, RFC 3339 says:
"Date and time expressions indicate an instant in time. Description of time periods, or intervals, is not covered here." But it then goes on to define [6] a number of different syntaxes in ABNF, to indicate the subsets of ISO 8601 that "SHOULD be used in new protocols on the Internet." It never defines what a 'valid' RFC 3339 object looks like, and it doesn't explicitly say which forms count as complete representations, so it's not clear whether, say, '2016' is a valid RFC 3339 object. But the forms towards the bottom contain more than one discrete term and can be presumed to be 'complete' representations. These are:
[A] partial-time: HH:MM:SS(.SSS)
[B] full-date: YYYY-MM-DD
[C] full-time: 'partial-time' +/- offsetFromUTC(HH:MM)
[D] date-time: 'full-date' "T" 'full-time'
Out of these, [D] is clearly a timestamp of an absolute instant in time, but the rest are debatable.
[6] https://tools.ietf.org/html/rfc3339#section-5.6
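For what it's worth, variant [D] is the form most standard libraries parse directly; a Python sketch (example values chosen arbitrarily) also shows why [B] alone pins down no instant:

```python
from datetime import datetime, timedelta

# [D] date-time: full-date "T" full-time, with a numeric UTC offset.
# This names exactly one absolute instant.
ts = datetime.strptime("2016-08-27T14:12:23-05:00", "%Y-%m-%dT%H:%M:%S%z")
print(ts.utcoffset())

# [B] full-date parses too, but yields only a calendar date:
# a 24-hour span, not an instant in time.
d = datetime.strptime("2016-08-27", "%Y-%m-%d").date()
print(d)
```

Which is the debate in a nutshell: a spec reference like "use RFC 3339" still leaves open whether [A]-[C] are acceptable values on their own.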