top | item 42049875

(no title)

tripple6 | 1 year ago

I love this.

> Having both begin and terminate arrays start with << is more consistent.

It hides context for humans. I am a human and I love to see what opens and what closes the context. Why would `<` open an array if `[` is astonishingly wide-spread practice? Why would `<<` close it just because you think it is more consistent? What if open/close balance is also consistency, especially for nested arrays?

Also just think how many key strokes you'd save if you'd use `]` instead of [Shift]+`,` [Shift]+`,` [Shift]+`4` [Shift]+`.` if you declare it as readable text.

> Using `{` and `}` would lead to more special characters.

Agree. Too many now.

> It is simpler to support graphs in the markup. The fact is that the data being serialized may be structured in a graph.

I can't understand why you call it native graph support. The only thing it does is declaring an identified element and references to the element. I can't see how different is that comparing to XML or JSON that semantically "have graph support" just because they also can declare something considered ids and references to the identified element.

> LaTex and PostScript both use % for comments.

Yes, just learnt that from your comment and https://news.ycombinator.com/item?id=42047634 by zzo38computer. Thank you.

> # matches the usage in ᴄꜱꜱ and ʜᴛᴍʟ, relating to an id/page location.

No. The # symbol is overloaded: it may be a comment start, especially for line-oriented and human-readable text formats or scripts; CSS uses it for IDs; HTML has nothing to do with it since browsers only use # as a part of a URL to reference a particular identified element for navigation purposes only (it's called anchor in URL syntax; formerly web-browsers used <a name="anchor"> to navigate to a part of the page; as of now in the HTML5 world any `id` attribute is considered an anchor which I find a design flaw since ids are something to be used to identify hence any id from the document is exposed for navigation navigation purposes, but <a name="anchor"> is semantically something for navigation).

> Having a space after the # differentiate between and id and comment would be a mistake.

Of course it would in its current perspective if the id declaration is `#`. Don't know what `#<NON_WHITESPACE_CHAR>` would do if it's legal.

> The Formats section is to facilitate interoperability between implementations, e.g. if you are encoding a ɢᴜɪᴅ [easy to say] then format it this way.

I agree that it may look better for consistency purposes, but what interoperability is all that about? Why would formatting even affect it? From the consumer application point of view, it must be handled from its context defined by its purpose and semantic type. If my element/attribute is formally declared as a GUID, then why would I care that much if it's conventionally formatted? Would it be still a GUID if I encode it using Base64? The dashes in GUIDs are for humans only and they are optional, and the application knows it's a GUID to process it even leniently if it can. The same goes for ISBN/ISSN for books and magazines, card numbers, phone numbers, etc -- none of them require dashes or spaces or parentheses to be processed.

This is why "Real numbers *should be stored* with commas for readability." is just hilarious. Why should? May I use underscores or dots or spaces to group digits (seriously, why comma)? Can I group digits after the period? If I need integers, why are they also limited to 32 bits and 64 bits? How would I present an arbitrary precision integer or non-integer number (say, I want the Pi number 197 digits after the 3)? If ∞ is allowed, but no mention on +Inf and -Inf, can be 4.2957×10^24 used instead of 4.2957e24? May I just have simple `D+(\.D+)?` for everything I need for true interoperability?

I agree consistent formatting is really beautiful, but it must never be the key to process data.

> It is more terse than ᴊꜱᴏɴ.

Sorry, it's not.

> Good

Could you please provide an example of minified (a single line, no new lines) array of timestamps from your page?

UPD: I've just seen https://news.ycombinator.com/item?id=42038508 by Oras . Well, you know.

----

In short, too many whys, weird syntax and design decisions, so I cannot see anything that makes it a "better alternative" to XML, JSON, or YAML.

discuss

GeneThomas|1 year ago

I don’t love this.

> I can't see how different is that comparing to XML or JSON that semantically "have graph support" just because they also can declare something considered ids and references to the identified element.

When serialising data with ᴊꜱᴏɴ one has to use special field names such as $id; hoping the programming language does not. It DOES have native graph support that xᴍʟ and ᴊꜱᴏɴ do not.

> # [..] it may be a comment start

No.

> but what interoperability is all that about?

Interoperability between implementations. If you were using Xᴇɴᴏɴ to communicate between two different languages, say the C# and a Python implementation, agreeing of what an integer IS is helpful. Both Xᴇɴᴏɴ libraries can provide support for encoding say ɢᴜɪᴅs. You have missed the point. A user is always free to encode data as arbitrary strings.

> commas [...] readability." is just hilarious. Why should?

Commas makes numbers faster to interpret. Something `ls` is missing. As I stated on another branch English is the global lingua franca so commas every three digits is the standard.

> ∞ is allowed, but no mention on +Inf and -Inf, can be 4.2957×10^24 ∞ is +Inf. 4.2957×10^24 is not the xᴇɴᴏɴ standard.

>> It is more terse than ᴊꜱᴏɴ.

>Sorry, it's not.

See https://news.ycombinator.com/item?id=42049033

<<Timestamps>2026-09-24T16\:45\:22.5383742<&>2026-10-04T18\:25\:12Z<&>2026-04-02<$>>

Better than ᴊꜱᴏɴ which does not do timestamps.

tripple6|1 year ago

> When serialising data with ᴊꜱᴏɴ one has to use special field names such as $id; hoping the programming language does not.

Unless a serialization/deserialization tool supports property name overriding which is trivial.

> It DOES have native graph support that xᴍʟ and ᴊꜱᴏɴ do not.

Again, how is this different from `xml:id` that is referenced from other XML document nodes and what makes it "native graph support"?

> Both Xᴇɴᴏɴ libraries can provide support for encoding say ɢᴜɪᴅs. > Better than ᴊꜱᴏɴ which does not do timestamps.

Better?

There is just no need. For what? These two can be controlled by optional schemas that may be extensible like types to validate in XML Schema or Relax NG. Schemas do not dictate format and you don't need your format to be a schema. I still can't get what makes timestamps (and GUIDs) so special so that they have special sections in your document.

I tend I think JSON also has a design flaw providing first-class support for booleans and numbers in terms of literals it took from JavaScript because the latter needs more complex syntax as a programming language. Ridiculously, XML seems to be perfect in this case unifying scalar values: whatever scalar it encodes, text representation can encode it in any efficient format regardless it is a boolean, number (integer, "real", complex, whatever special), a "human-text" string, timestamp or whatever else; HTML attribute values unlike XML don't even need to be quoted in some trivial cases and even may be omitted for boolean attributes. The application simply parses/decodes its data and manages how the data is deserialized. That's all it needs.

I would probably be happy if, say, there would be a format as simple/minimalistic as possible not even requiring delimiters like or quoted strings unless they are ambiguous. Say, `[foo 'bar baz' foo\ bar Zm9vYmFyCg== 2.415e10 ∞ +Inf -∞ -Infinity \[qux\] +1\ 123\ 456789 978012345678 {k1 v1 k2 v2} aa512e8ecf97445eac10cb5a5ea3ef63 c8a0ebbd 2026-09-24T16:45:22.5383742 P3Y6M4DT12H30M5S]` or similar, maybe with nodes metadata and comments support. The above dumb format covers arrays/lists/sets, strings `foo`, space-containing `bar baz`, `foo bar` strings in human and Base64 encoding, the `2.415e10` number from your document and both four infinity notations, a single string `[qux]` and not a nested array with a single element, a phone number (with space delimited country code, region code and local number), an ISBN, a simple map/object made of two pairs, a GUID, a CRC32 checksum, an ISO-8601 zoned date/time, and an ISO-8601 duration. What more scalar types it can be extended with? Since there is no type for scalars in this "format" does not dictate types or preferred scalar formats letting the application make decision how to interpret these on its own.

> Commas makes numbers faster to interpret. Something `ls` is missing. As I stated on another branch English is the global lingua franca so commas every three digits is the standard.

For whom? Humans? Why would data encoding obey region number|date/time notation standards at all? English, but US, UK, Canada, or any other English-speaking country? You've been told that in that thread too, especially if spaces or underscores are even more readable for monospace fonts. You don't need it.

> See https://news.ycombinator.com/item?id=42049033

Funny enough -- your format saves on key/value pairs syntax appealing to 4 vs 6 overhead (okay, cool), but your array elements delimited with `<&>`, and amazingly bad at keyboard typing ergonomics, loses to simple and regular JSON `,` syntax (3 vs 1 overhead). Isn't it blind or crazy?