Incorrect Directed Conversions in David Gay’s strtod()

For correctly rounded decimal to floating-point conversions, many open source projects rely on David Gay’s strtod() function. In the default rounding mode, IEEE 754 round-to-nearest, this function is known to give correct results (notwithstanding recent bugs, which have been fixed). However, in the less frequently used IEEE 754 directed rounding modes — round toward positive infinity, round toward negative infinity, and round toward zero — strtod() gives incorrectly rounded results for some inputs.

(Update 7/11/10: dtoa.c is now fixed — see below.)

Discovering the Problem

As a follow up to my article “Visual C++ and GLIBC strtod() Ignore Rounding Mode,” I asked Mark Dickinson — the developer who maintains Python’s copy of dtoa.c — why Python does not enable what I call directed conversions. Out of that discussion came Mark’s conjecture: the directed rounding code was not correct. He soon followed up with an example to prove it.

An Automated Way to Find Examples

I wanted to find more examples, but I wanted a C program to do it for me. I needed a way to generate examples whose correct outcomes were known (in my evaluations of round-to-nearest conversions, David Gay’s strtod() served this purpose). Inspired by Mark’s example, this is the algorithm I devised:

Generate a random decimal string that represents a number that is not an integer and is not dyadic.
Convert the decimal string to a double (I used David Gay’s strtod(), but the built-in strtod() would have been fine — correct rounding does not matter here).
Convert the double to a decimal string, giving the exact decimal representation of a floating-point number (I used David Gay’s g_fmt() function, since I’m running on Windows and printf() in Visual C++ can’t print all the decimal digits of a floating-point number).
Change the last digit of the new decimal string — which by the way is always 5 (this article indirectly explains why) — to either 4 or 6, depending on whether you want a number just below or just above the floating-point number.

Except for Mark’s example, the examples below were found by my C program implementation of this algorithm. I ran the program against a copy of dtoa.c most recently changed on “Tue Feb 2 23:05:34 MST 2010,” which I downloaded on May 27, 2010.

Examples: Decimal Numbers Just Above a Floating-Point Number

This section shows two examples that David Gay’s strtod() (hereafter referred to only as strtod()) rounds incorrectly; they are very close to, but just above (in absolute value), a floating-point number.

Example 1

This is Mark’s example, and it shows a positive number rounding to nearest in round toward positive infinity mode:

1.100000000000000088817841970012523233890533447265626

It converts to this binary number, which has a non-terminating binary fraction portion:

1.00011001100110011001100110011001100110011001100110100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001…

Rounded toward positive infinity the result should be

1.0001100110011001100110011001100110011001100110011011

In hexadecimal “%a” notation, this equals 0x1.199999999999bp+0.

strtod() converts it incorrectly to 0x1.199999999999ap+0, which is one ULP below the correctly rounded value. 0x1.199999999999ap+0 is the floating-point number equal to decimal 1.100000000000000088817841970012523233890533447265625, which differs from the input decimal value only in the last decimal digit (5 vs 6).

Example 2

This example shows a negative number rounding to nearest in round toward negative infinity mode:

-0.91276999999999997026378650843980722129344940185546876

It converts to this non-terminating binary number:

-0.111010011010101101001011011100101100010100011001011110000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001…

Rounded toward negative infinity the result should be -0x1.d35696e58a330p-1; strtod() converts it incorrectly to -0x1.d35696e58a32fp-1.

Examples: Decimal Numbers Just Below a Floating-Point Number

This section shows two examples that strtod() rounds incorrectly; they are very close to, but just below (in absolute value), a floating-point number.

Example 3

This example shows a negative number rounding to nearest in round toward positive infinity mode:

-266.240000000000009094947017729282379150390624

Rounded toward positive infinity it should be -0x1.0a3d70a3d70a3p+8; strtod() rounds it incorrectly to -0x1.0a3d70a3d70a4p+8.

Example 4

This example shows a positive number rounding to nearest in both round toward negative infinity and round toward zero modes:

8.255628858767918002472043289952338102302250764062685473021474535926245152950286865234374e-17

Rounded toward negative infinity and rounded toward zero it should be 0x1.7cb9433617c9bp-54; strtod() rounds it incorrectly — in both cases — to 0x1.7cb9433617c9cp-54.

Discussion

In my article “Visual C++ and GLIBC strtod() Ignore Rounding Mode” I said

The percentage of incorrect conversions for both Visual C++ and glibc ranged between 49.00% and 49.06%, suggesting they were not reacting to the rounding mode at all. (Why it wasn’t almost exactly 50% I don’t know.)

It might not have been 50% because, although I didn’t know it at the time, strtod()’s conversions weren’t 100% correct! ~~(I’ll rerun those tests when dtoa.c is fixed.)~~ (Update: the percentages are still near 49%, even with the fixed dtoa.c, so scratch that theory.)

Do You Use Directed Rounding?

If your project uses dtoa.c, could you let me know whether it enables directed conversions? That is, does it define the “Honor_FLT_ROUNDS” flag?

You should be aware that strtod() reacts to the rounding mode, although incorrectly, when Honor_FLT_ROUNDS is not defined.

Bug Report

I reported this to David Gay; I’ll keep you posted.

Update: dtoa.c is Fixed

David Gay updated dtoa.c with a fix, and it appears to work: all four examples above now round correctly, as did the millions of random examples I checked.

Numbers Greater Than DBL_MAX Should Convert To DBL_MAX When Rounding Toward Zero

7 comments

Eric McIntosh says:

December 17, 2010 at 6:35 am

Dear Rick; this information is exactly what I was looking for.
I am currently working on the SixTrack (Fortran) program here
at CERN with a view to obtaining IDENTICAL (i.e. 0 ULP difference)
results on any IEEE 754 platform with any standard Fortran compiler.
I expect the issue of correctly rounded decimal/binary conversions to be
solved in Fortran 2003, whenever available 🙁 In the meantime I would
like to get my hands on the corrected version of dtoa.c and use it
explicitly. Could you please help? I found it at netlib but it is not clear
to me that it is the version to which you refer. The original paper stated
that only round to nearest was supported.

I believe my methodolgy is largely language independemt, but SixTrack is/was
Fortran 77 subset of Fortran 95. I cannot just take C conversion routines for
a particular system as my aim is to be completely portable across all
IEEE paltforms amd hopefully GPUs next year.

Any help would be greatly appreciated as I hope to publish my work (if
successfull) as early as possible next year.

Thanks in advance. Eric.
Rick Regan says:

December 17, 2010 at 9:10 am

Eric,

The latest version of dtoa.c has this fix (the source code does not flag the changes, but they are indicated in a file called “changes” — see the entry for July 7, 2010).

Gay’s original paper was from 1990, and it looks like rounding-mode support was added in 2000 (based on the entry in the changes file).

When you build dtoa.c, make sure to define the flag IEEE_8087. Also, you need to make sure it is run in 53-bit mode and not 64-bit mode (for example, by calling he equivalent of _control87(PC_53, MCW_PC)).

BTW, I’ve written extensively about decimal to floating-point conversions (for starters, check out my topics page.)

Good luck with your project!
Eric McIntosh says:

March 17, 2012 at 10:27 am

Thanks Rick; I am finally chnaging to dtoa.c and I indeed
disbale extended_precision anyway). I am almost there now
as I am getting 0 ULP difference with two different compilers
at different levels of optimisation. I am really hoping this
solution for data conversion will allow me to claim success!
Eric
Rick Regan says:

March 17, 2012 at 12:27 pm

Eric,

Sounds good. If you publish anything, let me know. I’m always interested in this stuff.
Aman Jiang says:

February 26, 2018 at 6:28 am

I found that David Gay’s strtod() will set errno to ERANGE while parsing “4.940656458412e-324”, this is the smallest double value so I don’t think it’s necessary to set errno.
Rick Regan says:

February 26, 2018 at 8:55 am

@Aman,

I assume it returns 4.940656458412e-324 and not 0? Does it set errno for other subnormal values? How about other strtod()s (Microsoft, glibc)? (I don’t have a system handy right now to test myself.)
Aman Jiang says:

February 26, 2018 at 9:45 pm

@Rick Regan

Sorry for my negligence. I did do my homework, the C11 standard says:

If the result underflows (7.12.1), the functions return a value whose magnitude is no greater
than the smallest normalized positive number in the return type; whether errno acquires
the value ERANGE is implementation-defined.

So it’s a implementation-defined behavior.

Actually, for other subnormal values, they do set errno – including David Gay’s strtod(), VC14 and glibc. Even so, I think it’s a little counterintuitive.

Comments are closed.