17 Digits Gets You There, Once You’ve Found Your Way

http://www.exploringbinary.com/17-digits-gets-you-there-once-youve-found-your-way/


Every double-precision floating-point number can be specified with 17 significant decimal digits or less. A simple way to generate this 17-digit number is to round the full-precision decimal value of the double to 17 digits. For example, the double-precision value 0x1.6d4c11d09ffa1p-3, which in decimal is 1.783677474777478899614635565740172751247882843017578125 x 10-1, can be recovered from the decimal floating-point literal 1.7836774747774789e-1. The extra digits are unnecessary, since they will only take you to the same double.

On the other hand, an arbitrary, arbitrarily long decimal literal rounded or truncated to 17 digits may not convert to the double-precision value it’s supposed to. This is a subtle point, one that has even tripped up implementers of widely used decimal to floating-point conversion routines (glibc strtod() and Visual C++ strtod(), for example).

Deciding Which Floating-Point Number Is Closest

The job of a decimal to floating-point conversion routine is to pick the floating-point number that is closest to the decimal input. The hardest inputs to convert correctly are those that are halfway or near halfway between consecutive floating-point numbers. These are the very inputs for which more than 17 digits — for doubles, up to 768 of them! — may need to be parsed. The extra digits, despite how little they contribute, can put you just at or beyond the halfway point.

Here’s the interesting thing: we know every double has a 17-digit representative that maps to it (actually there are multiple, but for our purposes we’ll consider only one, the nearest), but it may take more than 17 digits of a decimal input to figure out which 17-digit representative to choose! But once we’ve processed our long decimal input, we could replace it with its simpler, 17-digit representative. We’ve still got the correct double, but now our “handle” to it is unambiguously close. It’s like we are using a more precise number to decide which less precise number to pick, although that extra precision is just an illusion.

Example 1: Just Above Halfway

3.08984926168550152811e-32, an example near-halfway case from Vern Paxson’s paper “A Program for Testing IEEE Decimal–Binary Conversion”, is a 21-digit number that converts to 0x1.40de48676653bp-105. Rounded (same as truncated in this case) to 17 digits it is 3.0898492616855015e-32. However, that converts to 0x1.40de48676653ap-105, which is one binary ULP below the correct answer. Even rounded (truncated) to 20 digits — 3.0898492616855015281e-32 — we still come up one ULP short. So we need all 21 digits.

The problem is that this input converts to a number very nearly halfway between two double-precision numbers. (It is slightly above halfway). Here are its first 130 digits in binary to illustrate this (bits 54 through 130 are highlighted):

1.0100000011011110010010000110011101100110010100111010100000000000000
00000000000000000000000000000000000000000000000000000000000001
…p-105

Being so close to halfway, those extra decimal digits are needed to decide which way to go. Ignore those digits, and the conversion will land below the halfway point.

A 17-digit stand-in

The full decimal value of the desired floating-point number is

3.08984926168550180180110631344083416478369964008307296403646973245891
422971698357636226306421889375997125171124935150146484375e-32

Rounded to 17 digits, it’s 3.0898492616855018e-32; that converts to the desired floating-point number.

Example 2: Halfway

1.00000000000000033306690738754696212708950042724609375 is an example I concocted using my decimal/binary converter; it is halfway between consecutive doubles:

1.00000000000000000000000000000000000000000000000000011

It is 54 digits, and all 54 are needed to decide that the correct conversion is 0x1.0000000000002p0 (round half to even) and not 0x1.0000000000001p0. Rounding (truncating) to 17 digits — 1.0000000000000003 — does not work.

A 17-digit stand-in

The full decimal value of the desired floating-point number is

1.000000000000000444089209850062616169452667236328125

Rounded to 17 digits, it’s 1.0000000000000004, which converts to the desired floating-point number.

Example 3: Just Below Halfway

8.36168422905420598437e-214, another 21-digit example from Vern Paxson’s paper, is a little less than halfway between two doubles. Here it is in binary:

1.001000000100000000111010011000101000101010011100101001111111111111
11111111111111111111111111111111111111111111111111111111111110
…p-708

It converts to 0x1.20403a628a9cap-708. But rounded to 17 digits — 8.361684229054206e-214 — it converts up to 0x1.20403a628a9cbp-708. (If you truncated instead of rounded, you would get the correct answer.) Again, assuming you are rounding and not truncating, you need to round it to 18 digits to convert it correctly: 8.36168422905420598e-214.

A 17-digit stand-in

The full decimal value of the desired floating-point number is

8.36168422905420515990295749156693…(509 digits not shown)…0625e-214

Rounded to 17 digits, it’s 8.3616842290542052e-214.

Example 4: sqrt(2)

In case you were thinking I could come up with only contrived examples, take a look at sqrt(2) ≈ 1.414213562373095048801688… . In binary, it’s

1.01101010000010011110011001100111111100111011110011001001… ,

which you can see is somewhat close to halfway. If you test it, you will find that you need 18 digits (rounded to 1.41421356237309505 or truncated to 1.41421356237309504) to get the correction conversion; 1.414213562373095 (rounded/truncated to 17 digits) does not work.

A 17-digit stand-in

The full decimal value of the desired floating-point number is

1.4142135623730951454746218587388284504413604736328125

Rounded to 17 digits, it’s 1.4142135623730951.

Example 5: pi/3

As another realistic example, consider pi/3 ≈ 1.047197551196597746154214… . In binary it’s

1.0000110000010101001000111000001011010111001101100101100001..

which is even closer to halfway than sqrt(2).

19 digits — 1.047197551196597746 — are required to convert it correctly; 1.0471975511965977 (rounded/truncated to 17 digits) does not work.

A 17-digit stand-in

The full decimal value of the desired floating-point number is

1.047197551196597853362391106202267110347747802734375

Rounded to 17 digits, it’s 1.0471975511965979.

Single-Precision

The same issue applies to floats of course. Whereas you can always find a 9-digit decimal stand-in for a float, you may need more digits than that to convert it correctly.

Summary

When entering decimal literals into a computer program, you need to be aware that you may need more than 17 (9) digits to get the correct conversion. Once you know what a given decimal input converts to, it’s easy to find its 17 (9) digit stand-in. But until you convert it, you must assume you need all the digits you have — or even more if your value represents an infinite decimal. Unless you are willing to do an analysis like mine, you won’t know how many digits you need.

(Please let me know if you know of any “underspecified” literals in real code.)

Further Reading

Please check out the related article by “carolomeetsbarolo”: “Mathematical Constants in Program Code”.

Dingbat

Leave a Comment

(To reduce spam, cookies must be enabled)


css.php