Last week, a reader of my blog, Geza Herman, told me about a bug he found in David Gay’s strtod() function. In random testing of decimal numbers nearly halfway between double-precision floating-point numbers, he discovered this 53-digit number, which converts incorrectly:
As Geza noted, the problem is in the bigcomp() function, an optimization that kicks in for long decimal inputs. I traced his example through bigcomp() — I’ll show you what’s going on.
(Geza has reported the bug, and David Gay has already fixed it.)
strtod() Returns a Value That Is One ULP Too High
In binary, 1.8254370818746402660437411213933955878019332885742187 is a number with a non-terminating binary fraction portion; here are its first 175 significant bits:
This is extremely close to — but less than — halfway between two double-precision numbers: it has a 0 at significant bit 54 (highlighted), followed by 120 1s. (A halfway case has a 1 at bit 54 followed by all 0s.)
Correctly rounded to the nearest double-precision value this is
or 0x1.d34fd8378ea83p+0 in hexadecimal “%a” notation.
In decimal, this equals
strtod() returns 0x1.d34fd8378ea84p+0, which is 1 binary ULP too high.
What strtod() Does
strtod() starts by computing its floating-point approximation as 0x1.d34fd8378ea83p+0 (which happens to be the correct result). Next, it uses big integer arithmetic to compare the approximation to the decimal input. With the bigcomp optimization enabled, which it is by default, only the first 18 digits of the decimal input are used for the comparison; this truncation, in a close case like this, means strtod() must call bigcomp() to determine if the approximation is correct.
bigcomp() decides which of two adjacent floating-point numbers is closest to the decimal input string. It generates the decimal digits of the number halfway between these two values, comparing them digit-by-digit to the input string. In this case, the halfway value is
which is the same as the input but with a ‘5’ appended as the 54th digit. bigcomp() quits after examining the 53 digits of its input — as it should — but it fails to account for the implied 0 at digit 54. Mistakenly, it deems the input greater than the halfway value, when it should see that the implied 0 makes the input less than the halfway value.
Here is the description of the fix, as logged by David Gay:
20120417 dtoa.c: augment a test in bigcomp() to correctly compare a long input string of digits with a computed string after all of the input string has otherwise been processed. This matters only rarely; a string for which it matters (reported by Herman Geza [sic]) is 1.8254370818746402660437411213933955878019332885742187 .
The fix changes one line of code; this line
if (b->x || b->wds > 1)
is changed to this:
if (b->x || b->wds > 1 || dig > 0)
In this example, dig = 5, so the ‘if’ branch is taken, resulting in the smaller of the two neighboring values to be chosen.
You can avoid the bug by disabling bigcomp, which you can do by “#defining” the variable “NO_STRTOD_BIGCOMP”.