Numbers are entered into computers as strings of text. These strings are converted to binary, into the numeric form understood by the computer’s hardware. Numbers with a decimal point — numbers we think of as real numbers — are converted into a format called binary floating-point. The procedure that converts decimal strings to binary floating-point — IEEE double-precision binary floating-point in particular — goes by the name strtod(), which stands for string to double.
Converting decimal strings to doubles correctly and efficiently is a surprisingly complex task; fortunately, David M. Gay did this for us when he wrote this paper and released this code over 20 years ago. (He maintains this reference copy of strtod() to this day.) His code appears in many places, including in the Python, PHP, and Java programming languages, and in the Firefox, Chrome, and Safari Web browsers.
I’ve spent considerable time reverse engineering strtod(); neither the paper nor the code are easy reads. I’ve written articles about how each of its major pieces work, and I’ve discovered bugs (as have a few of my readers) along the way. This article ties all of my strtod() research together.
strtod()’s Raison D'être
strtod() is designed to produce correctly rounded conversions, that is, convert any decimal string to its nearest double-precision binary floating-point number. This could be done quite simply if performance weren’t an issue: arbitrary-precision integer division would do the trick. But arbitrary-precision division is expensive. Enter David Gay’s strtod(). It uses IEEE floating-point arithmetic when it can, and uses arbitrary-precision arithmetic — not full-blown division though — only when necessary. (Some conversions will always require arbitrary-precision — there’s no getting around that. My article “Decimal to Floating-Point Needs Arbitrary Precision” will help you understand why.)
Paths Through strtod()
There are three basic paths that a conversion can take through strtod(): the fast path, for decimal strings that are short enough and have exponents that are small enough; the correction loop, for decimal strings that don’t qualify for the fast path; and bigcomp, for long decimal strings that can’t be resolved by the correction loop because they are nearly halfway between two floating-point numbers. The fast path requires only IEEE arithmetic; the correction loop and bigcomp require IEEE and arbitrary-precision arithmetic.
All paths require the decimal input to be parsed into two components: an integer, representing the significant digits, and a power of ten, representing the placement of the decimal point. For example, 1.23456789012345e36 is parsed into 123456789012345 and 1022, and 3.14159 is parsed into 314159 and 10-5.
Here’s an overview of the flow:
strtod() uses its own arbitrary-precision integer implementation. It manipulates its floating-point result by accessing its underlying representation through a union of a double and two 32-bit integers.
The fast path applies to decimal strings of 15 significant digits or less with a power of ten exponent that has an absolute value of 22 or less (all powers of ten are treated as nonnegative powers of ten). These types of inputs can be correctly rounded with a single IEEE floating-point multiplication or division. For example, 1.23456789012345e36 is converted with 123456789012345 * 10000000000000000000000, and 3.14159 is converted with 314159/100000. This works because those operands are exactly representable in floating-point.
(See my article “Fast Path Decimal to Floating-Point Conversion” for details.)
The correction loop is required for any input that does not qualify for the fast path; this occurs when either the significant digits integer or nonnegative power of ten or both are not exactly representable in floating-point. (An IEEE floating-point multiplication or division cannot guarantee a correctly rounded result if its operands are already rounded.) For example, 6.2187331579177550499956283 has too many digits, and 163.118762e+109 has a power of ten that’s too big.
For these inputs, strtod() computes an initial floating-point approximation and then uses the loop to check, and if necessary, correct it. IEEE arithmetic is used to calculate the approximation, arbitrary-precision arithmetic is used to check it, and IEEE arithmetic is used to correct it. The majority of inputs require only one pass of the loop.
(See my article “Properties of the Correction Loop in David Gay’s strtod()” for details.)
For the approximation, strtod() uses only the first 16 significant digits of the input, and computes the power of ten using binary exponentiation. The inaccuracies introduced here could make the approximation differ from the correct answer by 10 ULPs or so.
(See my article “strtod()’s Initial Decimal to Floating-Point Approximation” for details.)
For the check, strtod() compares the decimal input to the binary approximation to see if they are within one-half of a ULP of each other. The check is made in arbitrary-precision integer arithmetic, using scaled versions of those three values.
(See my article “Using Integers to Check a Floating-Point Approximation” for details.)
strtod() computes an adjustment which itself is an approximation to the distance between the approximation and the correct result. It adds it to or subtracts it from the approximation, as required. If the adjustment contains a fractional portion that is close to 0.5, the loop is repeated; the approximation is rechecked and adjusted again, if necessary.
(See my article “Adjusting the Floating-Point Approximation in strtod()” for details.)
Bigcomp is an optimization used to reduce arbitrary-precision integer overhead when checking long decimal inputs. It truncates long decimal strings and uses the truncated value in the correction loop. This truncation gives smaller integers, but it can lead to cases where the approximation check cannot decide between adjacent floating-point values. When that happens, the correction loop is exited, and the bigcomp() function is invoked to choose the correct value.
bigcomp(), interestingly, decides which of two neighboring floating-point values is correct by doing a floating-point to decimal conversion. It converts the midpoint of the two values to a string, and then compares that to the input string. From that it can decide to which side the decimal input lies.
(See my article “Bigcomp: Deciding Truncated, Near Halfway Conversions” for details.)
strtod() is written in C and lives in a file called dtoa.c. (This file also contains the dtoa() function — double to ASCII string.) In that file are flags that allow strtod() to be configured. strtod() supports three floating-point architectures (IEEE, IBM, and yes, VAX) and four rounding modes (round-to-nearest, round toward positive infinity, round toward negative infinity, and round toward zero). It can also convert hexadecimal floating-point constants.
Every copy of strtod() I’ve seen in use is configured for IEEE arithmetic and round-to-nearest rounding. To use IEEE arithmetic, you need to “#define” the flag IEEE_8087. In addition, on processors with an x87 FPU, you must set the precision to 53 bits (e.g. on Windows, _controlfp_s(&cur_cw,_PC_53,MCW_PC)), because extended-precision can produce incorrect results.
(There are many more flags; see the code for details.)
Bugs in strtod()
Here I’ll highlight some recent (within the last three years) bugs in strtod() and related decimal to floating-point conversion routines.
This is the bug that got me interested in strtod(). I had been studying the decimal to floating-point conversion routines of gcc and glibc when this security flaw became big news. I was struck by how many projects used it, and how such a well-worn piece of code could still have major bugs.
PHP’s and Java’s implementations of strtod() both had serious bugs that were discovered last year; in both cases, their correction loops never terminated! I discovered the PHP bug, which was due to the use of a very old copy of David Gay’s code. Java’s bug was discovered by a reader; it was due to a problem introduced when strtod() was “translated” from C to Java.
You can read about the PHP bug here:
- “PHP Hangs On Numeric Value 2.2250738585072011e-308”
- “Why “Volatile” Fixes the 2.2250738585072011e-308 Bug”
- “A Better Fix for the PHP 2.2250738585072011e-308 Bug”
You can read about the Java bug here:
- “Java Hangs When Converting 2.2250738585072012e-308”
- “A Closer Look at the Java 2.2250738585072012e-308 Bug”
These articles describe bugs that caused strtod() to do incorrect conversions:
- “Incorrect Directed Conversions in David Gay’s strtod()”
- “A Bug in the Bigcomp Function of David Gay’s strtod()”
- “Quick and Dirty Floating-Point to Decimal Conversion” (Scroll down to the section marked “Endnote” to read about the bug.)
These articles also describe bugs that caused incorrect conversions, but they are Java-specific, and not strtod() per se:
- “Incorrectly Rounded Subnormal Conversions in Java”
- “Nondeterministic Floating-Point Conversions in Java”
(I found some of the above bugs and readers found others.)
Suggested Improvements To strtod()
Here are ways strtod() could be improved (some of these are my ideas; some are my readers’ ideas):
- It could extend the fast path to include most 16-digit numbers, increasing the number of fast path cases eightfold (see my article “Fast Path Decimal to Floating-Point Conversion”).
- It could convert large integers from their big integer binary representation, avoiding the overhead and complexity of the correction loop (see my article “A Better Way to Convert Integers in David Gay’s strtod()”).
- It could improve its performance for long inputs if it raised the cutoff for bigcomp() from 40 digits to at least 80 digits (see my article “Bigcomp: Deciding Truncated, Near Halfway Conversions”).
- It could avoid potential infinite loops by limiting the number of times through the correction loop (see my article “Properties of the Correction Loop in David Gay’s strtod()”).
(This section was added 9/20/13.)
And For Those That Don’t Use strtod()?
Some programming environments don’t use David Gay’s strtod(); Visual C++, gcc, glibc, and SQLite are examples. I have found incorrect conversions in each:
- “Incorrectly Rounded Conversions in Visual C++”
- “Incorrectly Rounded Conversions in GCC and GLIBC”
- “Incorrect Decimal to Floating-Point Conversion In SQLite”
- “Double Rounding Errors in Floating-Point Conversions”
I want to end with a few additional points:
- My description of strtod() focused on normal numbers rounded under round-to-nearest rounding. I did not discuss details of special cases, like subnormal numbers, overflow, and underflow.
- If your project uses strtod(), make sure to keep it up-to-date.
- I wonder: who needs to specify a constant to thousands, hundreds, or even tens of digits? A large part of strtod()’s complexity is due to long inputs. Why do we allow them?
- Next time you type a floating-point constant into your code, be thankful you didn’t have to convert it yourself!
Thanks to Mark Dickinson for answering my questions and for his instructive comments inside Python’s copy of strtod(); thanks to whoever wrote Java’s FloatingDecimal class, also useful for its commentary; and thanks to Dave Gay, for answering my questions and for his quick turnaround on strtod() patches.