In general, to convert an arbitrary decimal number into a binary floating-point number, arbitrary-precision arithmetic is required. However, a subset of decimal numbers can be converted correctly with just ordinary limited-precision IEEE floating-point arithmetic, taking what I call the fast path to conversion. Fast path conversion is an optimization used in practice: it’s in David Gay’s strtod() function and in Java’s FloatingDecimal class. I will explain how fast path conversion works, and describe the set of numbers that qualify for it.
An Unnormalized Form of Decimal Scientific Notation
Any number can be expressed in decimal scientific notation, a form where it is written as a product of its significant digits and a power of ten. Conventionally, the significant digits are expressed in normalized form, where the first digit is between 1 and 9 and any subsequent digits follow a decimal point. For example, 0.0926 would be written as 9.26 × 10⁻², and 7390 would be written as 7.390 × 10³.
In the context of decimal to floating-point conversion, a nonstandard, unnormalized form of scientific notation is used, where the significant digits are written as an integer. For example, 0.0926 would be written as 926 × 10⁻⁴, and 7390 would be written as 739 × 10¹.
In this unnormalized form, there are two cases: one where the integer containing the significant digits is multiplied by a nonnegative power of ten, and one where it’s multiplied by a negative power of ten. It’s convenient to think of the latter case as a fraction, where the significant digits integer is divided by the corresponding positive power of ten. For example, 926 × 10⁻⁴ equals 926/10⁴. These two cases — multiplication or division of an integer by an integer power of ten — form the basis of fast path conversion.
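Both cases can be evaluated directly in double-precision arithmetic. A quick illustration (the variable names are mine, not from any particular implementation):

```java
public class UnnormalizedForm {
    public static void main(String[] args) {
        // 0.0926 = 926 x 10^-4: the negative-power case, done as a division.
        // Both 926 and 10^4 are exactly representable doubles.
        double divCase = 926.0 / 1e4;

        // 7390 = 739 x 10^1: the nonnegative-power case, done as a multiplication.
        double mulCase = 739.0 * 1e1;

        System.out.println(divCase == Double.parseDouble("0.0926")); // true
        System.out.println(mulCase == 7390.0);                       // true
    }
}
```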
The Fast Path
For the fast path conversion process, we take a decimal number, represented as a string, and convert it to the nearest IEEE double-precision binary floating-point number. You can view this process as having two main steps. In the first step, floating-point integers s and p are created: s represents the significant digits of the decimal number (positive or negative), and p represents the corresponding nonnegative power of ten. In the second step, an IEEE floating-point multiplication or division (the fmul or fdiv instruction on Intel processors) operates on s and p, producing the desired double-precision result.
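The two steps can be sketched as follows. This is a minimal sketch, not any particular library’s code; the names fastPathConvert and POW10 are my own, and the range checks are left to the caller:

```java
public class FastPath {
    // Powers of ten 10^0 .. 10^22; all are exactly representable as doubles.
    static final double[] POW10 = new double[23];
    static {
        double p = 1.0;
        for (int i = 0; i <= 22; i++) { POW10[i] = p; p *= 10.0; }
    }

    // s: the significant digits as a floating-point integer (|s| <= 2^53 - 1)
    // e: the decimal exponent, -22 <= e <= 22
    static double fastPathConvert(double s, int e) {
        if (e >= 0)
            return s * POW10[e];   // one correctly rounded multiplication
        else
            return s / POW10[-e];  // one correctly rounded division
    }

    public static void main(String[] args) {
        System.out.println(fastPathConvert(926.0, -4)); // 0.0926
        System.out.println(fastPathConvert(739.0, 1));  // 7390.0
    }
}
```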
This process is guaranteed to work only when s and p are exactly representable in double-precision; this happens when their binary representations have no 1 bits beyond bit 53. The values of p that meet this criterion are 10⁰ ≤ p ≤ 10²². The values of s that meet this criterion are, by definition, all floating-point integers. (Think of it this way: take any floating-point integer and print all the digits of its decimal equivalent — that would be a valid value for s. For example, 52882540679000003517910258026656900519026609372478984461456769024, which is the floating-point integer 0x1.0119c58f25bc6p+215.)
Practically speaking, there is no simple test to check all the valid values of s. But there is a simple test that identifies a subset of them: 0 ≤ s ≤ 2⁵³ − 1 = 9,007,199,254,740,991. I will consider only these 2⁵³ values as fast path cases.
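The simple test is just a range check on the significand. A sketch (the method name qualifies is mine):

```java
public class FastPathSignificand {
    static final long MAX_EXACT = (1L << 53) - 1; // 9,007,199,254,740,991

    // A significand passes the simple fast path test when it fits in 53
    // bits, i.e., it converts to a double without any rounding.
    static boolean qualifies(long s) {
        return s >= 0 && s <= MAX_EXACT;
    }

    public static void main(String[] args) {
        System.out.println(qualifies(9007199254740991L)); // true: 2^53 - 1
        // 2^53 itself is an exact double, but the simple test excludes it --
        // it is one of the valid values of s the test does not identify.
        System.out.println(qualifies(9007199254740992L)); // false
    }
}
```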
This process relies implicitly on two things — that base conversion between integers is exact, and that IEEE multiplication and division produce correctly rounded results. Under these conditions, a maximum of one rounding is done; the result is a correctly rounded decimal to floating-point conversion.
The power of ten can be looked up in a table, or it can be computed. If it’s computed — say, naively by repeated multiplication by ten — the result will be exact. Also, if p = 1, a further optimization applies: the multiplication can be skipped.
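To see that naive repeated multiplication by ten stays exact all the way through 10²², you can check each power against exact decimal arithmetic. A sketch, using BigDecimal only for verification:

```java
import java.math.BigDecimal;

public class ExactPowers {
    public static void main(String[] args) {
        double p = 1.0;
        for (int i = 0; i <= 22; i++) {
            // Every product so far fits in 53 significant bits
            // (5^i <= 2^53 for i <= 22), so no rounding has occurred.
            assert new BigDecimal(p).compareTo(BigDecimal.TEN.pow(i)) == 0;
            p *= 10.0;
        }
        System.out.println("10^0 through 10^22 are exact doubles");
        // 10^23 is the first power that rounds: 5^23 > 2^53.
    }
}
```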
Here are some examples that qualify for fast path conversion:
| Decimal Literal | s | p | Mult or Div? |
|---|---|---|---|
| 7390 | 739 | 10¹ | Mult |
| 0.0926 | 926 | 10⁴ | Div |
Fast Path Cases In Disguise
There is another set of decimal numbers that can be transformed easily into fast path multiplication cases: numbers with a p greater than 10²² but with an s that can be multiplied by those excess powers of ten and still remain exact. For example, take the number 123e34. As written, it has s = 123 and p = 10³⁴; p is too big. However, if you treat it as 123000000000000e22, it qualifies as a fast path case. You transform it by multiplying s by 10¹² and by using a p that’s 12 orders of magnitude smaller. In other words, if you can transfer the extra powers of ten to the significant digits, you have a fast path case.
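The transformation is a loop that moves powers of ten from the exponent into the significand, stopping if the significand would overflow 2⁵³ − 1. A sketch (the method name shiftExcess is mine):

```java
public class DisguisedFastPath {
    static final long MAX_EXACT = (1L << 53) - 1; // 9,007,199,254,740,991

    // Shift excess powers of ten from the exponent into the significand:
    // s * 10^e  ->  (s * 10^(e-22)) * 10^22.
    // Returns the adjusted significand, or -1 if it would not stay exact.
    static long shiftExcess(long s, int e) {
        for (int i = 22; i < e; i++) {
            if (s > MAX_EXACT / 10) return -1; // would exceed 2^53 - 1
            s *= 10;
        }
        return s;
    }

    public static void main(String[] args) {
        // 123e34 has p = 10^34 -- too big. Move 12 powers of ten into s:
        long s = shiftExcess(123, 34);
        System.out.println(s);                 // 123000000000000
        System.out.println((double) s * 1e22); // same double as 123e34
    }
}
```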
The Fast Path in David Gay’s strtod()
David Gay’s strtod() function implements a fast path for significant digits 0 ≤ s ≤ 10¹⁵ − 1 = 999,999,999,999,999 and powers of ten 10⁰ ≤ p ≤ 10²² (the powers of ten are precomputed and stored in a table). It also handles the “disguised” fast path cases, up to the same 15-digit limit.
Why is strtod()’s fast path limited to 15 digits? My guess is that it was the easiest check to make: all integers of 15 digits or fewer are exact. But what about the remaining fast path cases? At a minimum, we could easily allow any s up to 8,999,999,999,999,999 — an eightfold increase in the number of fast path cases. All we’d need to do is add a check for “number of digits = 16 and first digit ≤ 8.”
To cover the remaining cases, we could convert the entire 16-digit string to a double and then check whether its value is less than or equal to 2⁵³ − 1. The code currently converts the significant digits to a double anyhow, so why not use it for an additional fast path test? In fact, why not check the double value exclusively, skipping testing of the string altogether?
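The three levels of checking could look something like this. This is a sketch of the ideas above, not strtod()’s actual code; the method names are mine, and the digit strings are assumed to hold just the significant digits:

```java
public class ExtendedFastPathCheck {
    static final long MAX_EXACT = (1L << 53) - 1; // 9,007,199,254,740,991

    // strtod()-style check: 15 digits or fewer are always exact.
    static boolean fastPath15(String digits) {
        return digits.length() <= 15;
    }

    // Proposed extension: also accept 16-digit strings whose first digit
    // is at most 8 (so the value is at most 8,999,999,999,999,999).
    static boolean fastPath16(String digits) {
        return digits.length() <= 15
            || (digits.length() == 16 && digits.charAt(0) <= '8');
    }

    // Full check on the converted value itself: anything up to 2^53 - 1.
    static boolean fastPathExact(long s) {
        return s <= MAX_EXACT;
    }

    public static void main(String[] args) {
        System.out.println(fastPath15("8999999999999999"));   // false: 16 digits
        System.out.println(fastPath16("8999999999999999"));   // true
        System.out.println(fastPathExact(9007199254740991L)); // true
    }
}
```

Note that fastPath16 still rejects some valid 16-digit cases (those starting with 9 but not exceeding 2⁵³ − 1); only the value check catches them all.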
I have a note out to Dave Gay asking him these questions; I’ll update this article if he responds. (Update: Dave Gay agrees that the change to check for 16 digits/first digit ≤ 8 would work and would be simple enough, but he’s not sure it is worth making.)
The Fast Path in Java’s FloatingDecimal Class
Java does its decimal to floating-point conversion in the doubleValue() method of its FloatingDecimal class. It is modeled on David Gay’s strtod(), and it covers the same set of fast path cases.
Fast Path Vs. the Arbitrary-Precision Integer Algorithm
Fast path conversion is simple because most of the work is done in IEEE floating-point. Contrast this with the arbitrary-precision integer conversion algorithm, which:
- Scales the fraction to make rounding of the significand easier.
- Normalizes the result.
- Encodes the result in IEEE floating-point format.
In IEEE floating-point, scaling is unnecessary; correct rounding is handled automatically. Also, there is no need for explicit normalization, nor any need to encode the result — it’s already encoded! Additionally, IEEE floating-point handles the edge case where the significand would otherwise round up to 2⁵³.
The fast path reduces decimal to floating-point conversion to this: parsing of the decimal string, creation of two binary floating-point integers to represent it, and delegation of the hard work to a single floating-point instruction.
Processor Rounding Mode
Because the fast path algorithm uses IEEE arithmetic, it will be sensitive to the processor rounding mode. Typically, the rounding mode is set to round-to-nearest, but it could be set to one of three other modes. A conversion routine must ensure that both its fast path and arbitrary-precision path round the same way.