top | item 39900007

nigeltao | 1 year ago

When I wrote my jsonptr tool a few years ago, I noticed that some JSON libraries (in both C++ and Rust) don't even do "parse a string of decimal digits as a float64" properly. I don't mean that in the "0.3 isn't exactly representable; have 0.30000000000000004 instead" sense.

I mean that rapidjson (C++) parsed the string "0.99999999999999999" as the number 1.0000000000000003. Apart from just looking weird, it's a different float64 bit-pattern: 0x3FF0000000000000 (the correctly rounded result, exactly 1.0) vs 0x3FF0000000000001 (what rapidjson produced).

Similarly, serde-json (Rust) parsed "122.416294033786585" as 122.4162940337866. This isn't as obvious a difference, but the bit-patterns differ by one: 0x405E9AA48FBB2888 (correctly rounded) vs 0x405E9AA48FBB2889 (what serde-json produced). Serde-json does have a "float_roundtrip" feature flag, but it's opt-in, not enabled by default.

For details, look for "rapidjson issue #1773" and "serde_json issue #707" at https://nigeltao.github.io/blog/2020/jsonptr.html
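
One way to check bit-patterns like these is to parse the string with a parser that is believed to round correctly and dump the resulting bits. Here is a minimal Python sketch, assuming CPython (whose float() rounds correctly on mainstream platforms). The helper name float64_bits is made up for illustration.

```python
import struct


def float64_bits(text: str) -> str:
    """Parse a decimal string as a float64 and return its bit-pattern as hex."""
    value = float(text)  # assumption: this build of Python rounds correctly
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", value))
    return f"0x{bits:016X}"


for text in ("0.99999999999999999", "122.416294033786585"):
    print(text, "->", float64_bits(text))
# 0.99999999999999999 -> 0x3FF0000000000000
# 122.416294033786585 -> 0x405E9AA48FBB2888
```

Running the same strings through another JSON library and comparing the hex output is enough to spot a one-bit difference.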

lofenfew|1 year ago

this requires multiple-precision arithmetic to do properly and isn't useful most of the time. it's odd to describe this as "not properly". you might say "with exact rounding", but that makes it clearer that this isn't that useful a feature, especially since we usually expect floats to be inexact in the first place.

int_19h|1 year ago

With JSON, there's essentially no such thing as "properly" when it comes to parsing numbers, since the spec doesn't limit the ability of the implementation to constrain width and precision. It only says that float64 is common and therefore "good interoperability can be achieved by implementations that expect no more precision or range than these provide", but note the complete absence of any guarantees in that wording.

The only sane thing with JSON is to avoid numbers altogether and just use decimal-encoded strings. This forces the person parsing it on the other end to at least look up the actual limits defined by your schema.
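
To illustrate the difference, here is a small Python sketch using only the standard library. The field name "amount" is a hypothetical example, and the value is borrowed from the parent comment. The bare number goes through float64, while the string keeps every digit until the receiver decides how to interpret it.

```python
import json
from decimal import Decimal

text = "122.416294033786585"

# Sent as a bare JSON number: the default decoder turns it into a float64.
as_number = json.loads('{"amount": %s}' % text)["amount"]

# Sent as a decimal-encoded string: the receiver chooses the numeric type.
as_string = Decimal(json.loads('{"amount": "%s"}' % text)["amount"])

print(type(as_number).__name__, repr(as_number) == text)  # float False
print(as_string)  # 122.416294033786585
```

Python's json.loads also accepts parse_float=Decimal, which keeps the digits of a bare number, but that only helps if the receiving side knows to opt in.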

Dylan16807|1 year ago

Rounding by more than an ULP is pretty bad. I don't think it's odd at all to describe rapidjson's behavior as improper.

At least 122.416294033786585 is between ...888 and ...889, though it's much closer to the former.
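
Both claims can be checked with exact rational arithmetic. This is a rough Python sketch, where error_in_ulps is a made-up helper and "ULP" is taken to mean the gap between the candidate float64 and the next one above it:

```python
import struct
from fractions import Fraction


def from_bits(bits: int) -> float:
    """Reinterpret a 64-bit integer as a float64."""
    (value,) = struct.unpack(">d", struct.pack(">Q", bits))
    return value


def error_in_ulps(text: str, bits: int) -> float:
    """Distance from the exact decimal value to a candidate float64, in ULPs."""
    exact = Fraction(text)                 # exact value of the decimal string
    candidate = Fraction(from_bits(bits))  # exact value of the float64
    ulp = Fraction(from_bits(bits + 1)) - candidate
    return float(abs(candidate - exact) / ulp)


# rapidjson's result is off by slightly more than one ULP.
print(error_in_ulps("0.99999999999999999", 0x3FF0000000000001))
# The two serde-json candidates: ...888 is much closer than ...889.
print(error_in_ulps("122.416294033786585", 0x405E9AA48FBB2888))
print(error_in_ulps("122.416294033786585", 0x405E9AA48FBB2889))
```

By this measure the first value comes out just above 1, and the ...888 candidate is only about 0.01 ULP away while ...889 is nearly a full ULP away.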