9999999999999999.0 – 9999999999999998.0

[+] twtw|7 years ago|reply

I don't understand all the crap that IEEE 754 gets. I appreciate that it may be surprising that 0.1 + 0.2 != 0.3 at first, or that many people are not educated about floating point, but I don't understand the people who "understand" floating point and continue to criticize it for the 0.1 + 0.2 "problem."

The fact is that IEEE 754 is an exceptionally good way to approximate the reals in computers with a minimum number of problems or surprises. People who don't appreciate this should try to do math in fixed point to gain some insight into how little you have to think about doing math in floating point.

This isn't to say there aren't issues with IEEE 754 - of course there are. Catastrophic cancellation and friends are not fun, and there are some criticisms to be made with how FP exceptions are usually exposed, but these are pretty small problems considering the problem is to fit the reals into 64/32/16 bits and have fast math.

[+] svat|7 years ago|reply

> considering the problem is to fit the reals into 64/32/16 bits and have fast math

Floating-point numbers (and IEEE-754 in particular) are a good solution to this problem, but is it the right problem?

I think the "minimum of surprises" part isn't true. Many programmers develop incorrect mental models when starting to program, and get no feedback to correct them until much later (when they get surprised).

It is true that for the problem you mentioned, IEEE 754 is a good tradeoff (though Gustafson has some interesting ideas with “unums”: https://web.stanford.edu/class/ee380/Abstracts/170201-slides... / http://johngustafson.net/unums.html / https://en.wikipedia.org/w/index.php?title=Unum_(number_form... ). But many programmers do not realize how they are approximating, and the "fixed number of bits" may not be a strict requirement in many cases. (For example, languages that have arbitrary precision integers by default don't seem to suffer for it overall, relative to those that have 32-bit or 64-bit integers.)

Even without moving away from the IEEE-754 standard, there are ways languages could be designed to minimize surprises. A couple of crazy ideas: Imagine if typing the literal 0.1 into a program gave an error or warning saying it cannot be represented exactly and has been approximated to 0.100000000000000005551, and one had to type "~0.1" or "nearest(0.1)" or add something at the top of the program to suppress such errors/warnings. At a very slight cost, one gives more feedback to the user to either fix their mental model or switch to a more appropriate type for their application. Similarly if the default print/to-string on a float showed ranges (e.g. printing the single-precision float corresponding to 0.1, namely 0.100000001490116119385, would show "between 0.09999999776482582 and 0.10000000521540642" or whatever) and one had to do an extra step or add something to the top of the program to get the shortest approximation ("0.1").

[+] lenticular|7 years ago|reply

Yeah, the limitations of FP are well-known to anyone who does much numerical work.

Floating point numbers are the optimal minimum message length method of representing reals with an improper Jeffery's prior distribution. A Jeffery's prior is a prior that is invariant under reparameterization, which is a mandatory property for approximating the reals.

In this case, it is where Prob(log(|x|)) is proportional to a constant.

Thus, we aren't going to ever do better than floats if we are programming on physical computers that exist in this universe. There is a reason why all numerical code uses them. Best to learn their limitations if you are going to use them, otherwise use arbitrary precision.

[+] chrisseaton|7 years ago|reply

People get upset that floating point can’t represent all infinite number of real numbers exactly - I can’t understand how they think that’s going to be possible in a finite 64 bits.

[+] yoz-y|7 years ago|reply

To me the only downside of IEEE 754 is that most languages including C and C++ do not provide a sensible canonical comparison methods. This leads to surprised beginners and then a ton of home made solutions which are often not appropriate.

[+] delhanty|7 years ago|reply

Exactly!

Very far from a floating point expert here, but what I do is to scale-down by a few odd prime-power factors as appropriate:

Scaling down by powers of 5 is obviously appropriate for decimals, currency etc.

Scaling down by powers of 3 is good for angles measured in the degrees, minutes, seconds system.

If one scales down a lot there is an increased risk of overflow, so one can compensate by scaling up some powers of 2.

The way I think of this is as using my own manual exponent bias [0].

>the exponent is stored in the range 1 .. 254 (0 and 255 have special meanings), and is interpreted by subtracting the bias for an 8-bit exponent (127) to get an exponent value in the range −126 .. +127.

So, for example, even single-precision number are always exact multiples of 1/(2^126), and I'm just changing the denominator to contain powers of 3, 5, 7, ... etc.

[0] https://en.wikipedia.org/wiki/Exponent_bias

[+] ummonk|7 years ago|reply

Presumably we could actually make decimal floating point computation the default and greatly reduce the amount of surprise. I don't think the performance difference would be an issue for most software.

[+] hedora|7 years ago|reply

I took the table to be a handy guide to where arbitrary precision is the default vs. hw accelerated math.

Filtered by languages I care about, I guess I have no choice but to learn perl 6 if I want correct (but presumably slow) floating point with elegant syntax (my taste might not match yours).

I’d be curious to know what the random GPU languages and new vector instruction sets do with this computation. I don’t think they’re all 754 compliant.

[+] etCeteraaa|7 years ago|reply

Because if there are obvious edge and corner cases, like overflow scenarios, a professional system will either ensure that expectations are lived up to, or flatly denied as errors.

No surprises.

[+] Gibbon1|7 years ago|reply

> exceptionally good way to approximate

You answered your question. 99% of the time being exact is a requirement and calculation speed is utterly unimportant, thus using IEEE 754 results in programs that are fundamentally broken.

[+] svat|7 years ago|reply

A useful website for these that I ran across recently: https://float.exposed/

For example, entering 9999999999999999.0 into "double" gives https://float.exposed/0x4341c37937e08000 and entering 9999999999999998.0 gives https://float.exposed/0x4341c37937e07fff

My wishlist for such a page would contain two additional features:

1. Allow entering expressions like "a OP b == c", so that one can enter "0.1 + 0.2 == 0.3" or "9999999999999999.0 - 9999999999999998.0 == 1.0" and see the terms on the left-hand side and right-hand side.

2. Show for each float the explicit actual range of real numbers that will be represented by that float. For example, show that every real number in the range [9999999999999999, 10000000000000001] is represented by 10000000000000000, and that every real number in the range (9999999999999997, 9999999999999999) is represented by 9999999999999998.

The author of this one has a blog post about it: https://ciechanow.ski/exposing-floating-point/ and I also like a shorter (unrelated) page that nicely explains the tradeoffs involved in floating-point representations and the IEEE 754 standard, by usefully starting with an 8-bit format: http://www.toves.org/books/float/

[+] tzs|7 years ago|reply

The IEEE 754 calculator at http://weitz.de/ieee/ does some of what you ask for. You can enter two numbers, see the details of their representation, and do plus, minus, times, or divide using them as operands and see the result.

[+] DougBTX|7 years ago|reply

> 9999999999999999.0 into "double" gives https://float.exposed/0x4341c37937e08000

Nice that it reformats the input to "10000000000000000.0", gets the point across that a 64 bit double float just doesn't have enough bits to exactly represent 9999999999999999.0, but that it does happen to be able to represent 9999999999999998.0.

[+] csours|7 years ago|reply

This is awesome, I tried to read and understand the 754 float spec before, and I didn't really get it.

Try playing around with half precision, it makes things a lot easier to understand.

[+] scrollaway|7 years ago|reply

OT: What a lovely tld .exposed is. I… really wonder about its majority userbase.

[+] unknown|7 years ago|reply

[deleted]

[+] al2o3cr|7 years ago|reply

The arithmetic is correct - the problem is that "9999999999999999.0" isn't representable exactly.

9999999999999998.0 in IEEE754 is 0x4341C37937E07FFF

"9999999999999999.0" in IEEE754 is 0x4341C37937E08000 - the significand is exactly one higher.

With an exponent of 53, the ULP is 2 - so parsing "9999999999999999.0" returns 1.0E16 because it's the next representable number.

    Using one of these workarounds requires a certain prescience of the
    data domain, so they were not generally considered for the table above.

Doing arithmetic reliably with fixed-precision arithmetic always requires understanding of the data domain. If you need arbitrary precision, you'll need to pay the overhead costs of arbitrary-precision: either by opting-in by using the right library, or by default in languages like Perl6 and Wolfram.

[+] alanfranz|7 years ago|reply

What is the "right answer"? Is the article claiming that such languages don't respect IEEE-754, or that IEEE-754 is shit?

If you want arbitrary precision, use an arbitrary precision datatype. If you use fixed precision, you'll need to know how those floats work.

Pointless article, imho.

[+] sbierwagen|7 years ago|reply

Note that the last example in the list, Soup, handles the expression "correctly", and also happens to be a programming language the author is working on.

[+] ken|7 years ago|reply

> Is the article claiming that such languages don't respect IEEE-754, or that IEEE-754 is shit?

No, I don't think so. Where does that come from? The page doesn't mention FP standards at all.

> If you want arbitrary precision, use an arbitrary precision datatype.

That's the point. Half of them don't offer this feature. The other half make it very awkward, and not the default.

We went through this exercise years ago with integers. These days, there are basically two types of languages. Languages which aim for usability first (like Python and Ruby), which use bigints by default, and languages which aim for performance first (like C++ and Swift), which use fixints by default. It's even somewhat similar with strings: the Rubys and Pythons of the world use Unicode everywhere, even though it's slower. No static limits.

With real numbers, we're in a weird middle ground where every language still uses fixnums by default, even those which aim for usability over performance, and which don't have any other static limits encoded in the language. It's a strange inconsistency.

I predict that in 10 years, we'll look back on this inconsistency the same way we now look back on early versions of today's languages where bigints needed special syntax.

[+] geocar|7 years ago|reply

> Pointless article, imho.

I'm sorry you thought so. It pops up pretty often and always seems to spark a lot of conversation, so I think most programmers that give it any thought can find it a very interesting area of study.

There's an incredible amount of creep: We have what starts with nice notation (like x-y) and have to trade a (massively increased) load in either our minds or in the heat our computer generates. I don't think that's right, and I think the language we use can help us do better.

> What is the "right answer"?

What do you think it is?

Everyone wants the punchline, but this isn't a riddle, and if this problem had a simple answer I suspect everyone would do it. Languages are trying different things here: Keeping access to that specialised subtraction hardware is valuable, but our brains are expensive too. We see source-code-characters, lexicographically similar but with wildly differing internals. We want the simplest possible notation and we want access to the fastest possible results. It doesn't seem like we can have it all, does it?

[+] mcguire|7 years ago|reply

I think the surprise was that Go uses arbitrary precision for constants.

[+] Retric|7 years ago|reply

There are several.

If you subtract two numbers close to each other with fixed precision you don’t know what the revealed digits are. (1000 +/- .5) - (999 +/- .5) = 1 +/- 1.

Thus 0, 1, and 2 are all within the correct range.

[+] mabbo|7 years ago|reply

The point is to illustrate a simple fact that most of us know- but maybe some don't.

https://m.xkcd.com/1053/

[+] dangerbird2|7 years ago|reply

The right answer is to convert to an integer or bignum. If the language reads 9999999999999999.0 as a 32 bit float, you will get 0.0. If it's a double, you'll get 2.0.

[+] bjourne|7 years ago|reply

Take a piece of pen and paper and subtract the two numbers. Whatever number you get for the difference is "the right answer."

[+] InclinedPlane|7 years ago|reply

The point is that this reveals a common weakness in most programming languages. Not that floating point math has limits, but that this isn't well communicated to the user. One of the hallmarks of good programming language design is the "principle of least surprise" which things like funky floating point problems definitely fall into. Not everyone who uses programming languages, in fact very few of them, have taken numerical analysis, and many devs are not well versed in the weaknesses of floating point math. So much so that a very common way for devs to become acquainted with those limits and weaknesses is by simply blundering into them, unknowingly writing bugs, and then finding the hard way the sharp corners in the dark. This is not ideal.

Consider a similar example, pointers. Some languages (like C and C++) use pointers heavily and it's expected that devs using those languages will be experienced with them. However, pointers are very "sharp" tools and have to be used exceedingly carefully to avoid creating programs with major defects (crashes, memory leaks, vulnerabilities, etc.) They are so hard to get right that even software written by the best coders in the world commonly has major defects in it related to pointer use. This problem is so troubling to some that there are many languages (java, javascript, python, C#, rust, etc.) which have been designed to avoid a lot of the most difficult to use aspects of languages like C and C++, they use garbage collection for memory management, they discourage you from using pointers directly, and so on. However, even those languages do very little to protect the user from blundering into a mindfield of floating point math.

Consider, for example, simply this statement:

x = 9999999999999999.0

Seems rather straightforward, right? But it's not, it's a lie. Because in many languages the value of x won't be as above, it'll be (to one decimal digit precision) 10000000000000000.0 instead. Whereas the value of ....98.0 is the same as the double precision float representation to one decimal digit precision (thus the difference between the two comes out as 2.0 instead of 1.0). Now, maybe in a "the handle is also a knife" language like C this is fine, but we have so many languages which go to such extremes everywhere else to protect the user from hurting themselves except when it comes to floating point math. And here's a perfect case where the compiler, runtime, or IDE could toss an error or a warning. Here you have a perfect example of trying to tell the language something you want which it can't do for you in the way you've written, that sounds like an error to me. The string representation of this number implies that you want a precision of at least the 1's place in the decimal representation, and possibly down to tenths. If that's not possible, then it would be helpful for the toolchain you're using for development to tell you that's impossible as close to you doing it as possible, so that you know what's actually going on under the hood and the limitations involved.

Something which would also drive developers towards actually learning the limitations of floating point numbers closer to when they start using them in potentially dangerous ways than instead of having to learn by fumbling around and finding all the sharp edges in the dark. The sharp edges are known already, tools should help you find and avoid them not help new developers run into them again and again.

[+] misterdoubt|7 years ago|reply

Using one of these workarounds requires a certain prescience of the data domain

I'm a little concerned if merely knowing the existence of floating point arithmetic constitutes "prescience."

[+] seanalltogether|7 years ago|reply

Are there any mainstream languages that consider a decimal number to be a primitive type? I feel like floating point numbers are far less meaningful in every day programs. Even 2d graphics would be easier with decimal numbers. Unless you're using numbers that scale from very small to very large, like 3d games or scientific calculations, you don't actually want to use floating point.

[+] mschuetz|7 years ago|reply

> Are there any mainstream languages that consider a decimal number to be a primitive type

Mathematica. But it's not particularely fast.

> Unless you're using numbers that scale from very small to very large, like 3d games or scientific calculations, you don't actually want to use floating point.

Unfortunately, we can sometimes only use floats in 3D graphics and floats aren't even good for semi-large to large 3D scenes. Unity is a particular bad offender. It's not even necessary for meshes but having double precision transformation matrices would make life so much easier. Could simply use double precision world and view matrices, then multiply them together and the large terms would cancel out in the resulting worldView matrix, which can then by cast back to single precision floats.

[+] misterdoubt|7 years ago|reply

Depends how you define 'primitive type.' A decimal number is built-in for C# and comes along with the standard libraries of Ruby, Python, Java, at least.

[+] jimhefferon|7 years ago|reply

Racket: https://docs.racket-lang.org/guide/numbers.html

[+] softawre|7 years ago|reply

C#

https://docs.microsoft.com/en-us/dotnet/csharp/language-refe...

[+] lozenge|7 years ago|reply

Cobol?

[+] twtw|7 years ago|reply

Julia has built in rationals (as do a few other languages).

I'm not aware of any language (other than Wolfram) that defaults to storing something like 0.1 as 1/10 - i.e. uses the decimal constant notation for rationals, rather than having some secondary syntax or library.

[+] kccqzy|7 years ago|reply

There are issues with arbitrary precision decimal numbers. For one, you can't deal with things like 1/3: these are repeating decimals so they need infinite memory to represent.

[+] garethrees|7 years ago|reply

The linked post is a bit poorly expressed, but I think there is a good point there: fixed-size binary floating-point numbers are a compromise, and they are a poor compromise for some applications, and difficult to use reliably without knowing about numerical analysis. (For example, suppose you have an array of floating-point numbers and you want to add them up, getting the closest representable approximation to the true sum. This is a very simple problem and ought to have a very simple solution, but with floating-point numbers it does not [1].)

Perhaps it is time for the developers of new programming languages to consider using a different approach to representing approximations to real numbers, for example something like the General Decimal Arithmetic Specification [2], and to relegate fixed-size binary floating-point numbers to a library for use by experts.

There is an analogy with integers: historically, languages like C provided fixed-size binary integers with wrap-around or undefined behaviour on overflow, but with experience we recognise that these are a poor compromise, responsible for many bugs, and suitable only for careful use by experts. Modern languages with arbitrary-precision integers are much easier to write reliable programs in.

[1] https://en.wikipedia.org/wiki/Kahan_summation_algorithm [2] http://speleotrove.com/decimal/decarith.html

[+] f2f|7 years ago|reply

Also worth checking: http://0.30000000000000004.com/

most popular previous discussion: https://news.ycombinator.com/item?id=10558871

[+] chaitanya|7 years ago|reply

There's an easier way to specify long floats in Common Lisp: use the exponent marker "L" e.g. 9999999999999999.0L0. No need to bind or set reader variables.

That said, even in Common Lisp I think its only CLISP (among the free implementations) that gives the correct answer for long floats.

CLISP:

    [1]> (- 9999999999999999.0L0 9999999999999998.0L0)
    1.0L0

SBCL, CMUCL and Clozure CL:

    * (- 9999999999999999.0L0 9999999999999998.0L0)
    2.0d0

The standard only mandates a minimum precision of 50 bits for both double and long floats, so there's no guarantee that using long floats will give the correct answer, as we can see.

http://www.lispworks.com/documentation/HyperSpec/Body/t_shor...

[+] mark-r|7 years ago|reply

Is floating point math broken?

https://stackoverflow.com/questions/588004/is-floating-point...

No, it's just that a lot of people don't understand its limitations.

[+] mbostock|7 years ago|reply

It’s nice that JavaScript has arbitrary-precision integers now. 9999999999999999n - 9999999999999998n === 1n

[+] dsalaj|7 years ago|reply

Google calculator gives answer 0 where as duckduckgo calculator answers with 2. xD

[+] nly|7 years ago|reply

This is particularly sucky to solve in C and C++ because you don't get arbitrary precision literals.

    #include <boost/multiprecision/cpp_dec_float.hpp>
    #include <boost/lexical_cast.hpp>
    #include <iostream>

    using fl50 = boost::multiprecision::cpp_dec_float_50;

    int main() {
        auto a = boost::lexical_cast<fl50>("9999999999999999.7");
        auto b = boost::lexical_cast<fl50>("9999999999999998.5");
        std::cout << (a - b) << "\n";
    }

works

    int main() {
        fl50 a = 9999999999999999.7;
        fl50 b = 9999999999999998.5;
        std::cout << (a - b) << "\n";
    }

doesn't, even if you change fl50 out for a quad precision binary float type.

[+] rg3|7 years ago|reply

Note in C you can get the correct result if you use long doubles, which normally go to 80 bits[1]:

printf("%Lf\n", 9999999999999999.0L - 9999999999999998.0L);

In my x86_64 computer it breaks when you add enough digits. At this point it started outputting 0.0 as the difference:

printf("%Lf\n", 99999999999999999999.0L - 99999999999999999998.0L);

With 63 bits for the fraction part you more or less get around 19 decimal digits of precision, and the expression above uses 20 significant digits.

[1] https://en.wikipedia.org/wiki/Extended_precision#x86_extende...

[+] thanatos_dem|7 years ago|reply

Interestingly, SQLite gets it wrong, returning 2.0, but MySQL, MariaDB, Postgres, and Cockroach all get it right at 1.0

I guess this comes down to most of them having implementations of arbitrary precision decimals.

[+] tzury|7 years ago|reply

With python, I get 2 even when using Decimals.

    Python 2.7.3 (default, Oct 26 2016, 21:01:49)
    [GCC 4.6.3] on linux2
    Type "help", "copyright", "credits" or "license" for more 
    information.
    >>> from decimal import *
    >>> getcontext().prec
    28
    >>> a=Decimal(9999999999999999.0)
    >>> b=Decimal(9999999999999998.0)
    >>> a-b
    Decimal('2')

That is unexpected.

[+] rscho|7 years ago|reply

There is interesting ongoing research on representing exact reals: https://youtu.be/pMDoNfKXYZg

[+] gpm|7 years ago|reply

2 with 64 bit floats, 0 with 32 bit floats.

[+] preinheimer|7 years ago|reply

$ php -v

PHP 7.2.10 (cli) (built: Oct 9 2018 14:56:43) ( NTS )

$ php -r "echo 9999999999999999.0 - 9999999999999998.0;"

2

$ php -r "echo bcsub('9999999999999999.0', '9999999999999998.0', 1);"

1.0

bcmath - http://php.net/manual/en/function.bcsub.php

[+] Steve44|7 years ago|reply

The accounting software we use has a built in calculator which has a similar problem.

5.55 * 1.5 = 8.3249999999999....

26.93 * 3 = 80.7899999999999....

I raised it with the supplier some time ago, they said it's just the calculator app and the main program isn't affected. Quite shocking that they are happy to leave it like this.

260 comments