# Fast Path Decimal to Floating-Point Conversion

Copyright © 2008-2016 Exploring Binary

http://www.exploringbinary.com/fast-path-decimal-to-floating-point-conversion/

In general, to convert an arbitrary decimal number into a binary floating-point number, arbitrary-precision arithmetic is required. However, a subset of decimal numbers can be converted correctly with just ordinary limited-precision IEEE floating-point arithmetic, taking what I call the *fast path* to conversion. Fast path conversion is an optimization used in practice: it’s in David Gay’s strtod() function and in Java’s FloatingDecimal class. I will explain how fast path conversion works, and describe the set of numbers that qualify for it.

## An Unnormalized Form of Decimal Scientific Notation

Any number can be expressed in decimal scientific notation, a form where it is written as a product of its significant digits and a power of ten. Conventionally, the significant digits are expressed in normalized form, where the first digit is between 1 and 9, and any subsequent digits follow a decimal point. For example, 0.0926 would be written as 9.26 x 10^{-2}, and 7390 would be written as 7.390 x 10^{3}.

In the context of decimal to floating-point conversion, a nonstandard, *unnormalized* form of scientific notation is used, where the significant digits are written as an integer. For example, 0.0926 would be written as 926 x 10^{-4}, and 7390 would be written as 739 x 10^{1}.

In this unnormalized form, there are two cases: one where the integer containing the significant digits is multiplied by a nonnegative power of ten, and one where it’s multiplied by a *negative* power of ten. It’s convenient to think of the latter case as a fraction, where the significant digits integer is *divided* by the corresponding *positive* power of ten. For example, 926 x 10^{-4} equals 926/10^{4}. These two cases — multiplication or division of an integer by a power of ten that is an integer — form the basis of fast path conversion.

## The Fast Path

For the fast path conversion process, we take a decimal number, represented as a string, and convert it to the nearest IEEE double-precision binary floating-point number. You can view this process as having two main steps. In the first step, floating-point integers *s* and *p* are created: *s* represents the significant digits of the decimal number (positive or negative), and *p* represents the corresponding nonnegative power of ten. In the second step, an IEEE floating-point multiplication or division (the fmul or fdiv instruction on Intel processors) operates on *s* and *p*, producing the desired double-precision result.

This process is guaranteed to work only when *s* and *p* are exactly representable in double-precision; this happens when their binary representations have no 1 bits beyond bit 53. The values of *p* that meet this criteria are 10^{0} ≤ *p* ≤ 10^{22}. The values of *s* that meet this criteria are, by definition, all floating-point integers. (Think of it this way: Take any floating-point integer and print all the digits of its decimal equivalent — that would be a valid value for *s*. For example, 52882540679000003517910258026656900519026609372478984461456769024, which is the floating-point integer 0x1.0119c58f25bc6p+215.)

Practically speaking, there is no simple test to check all the valid values of *s*. But there is a simple test that identifies a *subset* of its values, **0 ≤ s ≤ 2^{53} – 1 = 9,007,199,254,740,991**. I will consider only these 2

^{53}values as fast path cases.

This process relies implicitly on two things — that base conversion between integers is exact, and that IEEE multiplication and division produce correctly rounded results. Under these conditions, a maximum of one rounding is done; the result is a correctly rounded decimal to floating-point conversion.

Regarding the power of ten, it can be looked up in a table, or it can be computed. If it’s computed — say naively by repeated multiplication by ten — it will be done exactly. Also, if *p* = 1, a further optimization applies — the multiplication can be skipped.

### Examples

Here are some examples that qualify for fast path conversion:

Decimal Literal | s |
p |
Mult or Div? |
---|---|---|---|

3.14159 | 314159 | 100000 | Div |

0.0001256789876643 | 1256789876643 | 10000000000000000 | Div |

9.11234e-17 | 911234 | 10000000000000000000000 | Div |

537.81e8 | 53781 | 1000000 | Mult |

9.007199254740991e37 | 9007199254740991 | 10000000000000000000000 | Mult |

299792458 | 299792458 | 1 | NA |

0 | 0 | 1 | NA |

### Fast Path Cases In Disguise

There is another set of decimal numbers that can be transformed easily into fast path multiplication cases: numbers with a *p* greater than 10^{22} but with an *s* that can be multiplied by those excess powers of ten and still remain exact. For example, take the number 123e34. As written, it has *s* = 123 and *p* = 10^{34}; *p* is too big. However, if you treat it as 123000000000000e22, it qualifies as a fast path case. You transform it by multiplying *s* by 10^{12} and by using a *p* that’s 12 orders of magnitude smaller. In other words, if you can transfer the extra powers of ten to the significant digits, you have a fast path case.

## The Fast Path in David Gay’s strtod()

David Gay’s strtod() function implements a fast path for significant digits 0 ≤ *s* ≤ 10^{15} – 1 = 999,999,999,999,999 and powers of ten 10^{0} ≤ *p* ≤ 10^{22} (the powers of ten are precomputed and stored in a table). It also handles the “disguised” fast path cases, up to the same 15-digit limit.

Why is strtod()’s fast path limited to 15 digits? My guess is because that was the easiest check to make; all integers of 15 digits or less are exact. But what about the remaining fast path cases? At a minimum, we could easily allow any *s* up to 8,999,999,999,999,999 — an eightfold increase in the number of fast path cases. All we’d need to do is add a check for “number of digits = 16 and first digit ≤ 8.”

To cover the remaining cases, we could convert the entire 16-digit string to a double and then check whether its value is less than or equal to 2^{53} – 1. The code currently converts the significant digits to a double anyhow, so why not use it for an additional fast path test? In fact, why not check the double value exclusively, skipping testing of the string altogether?

(~~I have a note out to Dave Gay asking him these questions; I’ll update this article if he responds.~~ ** Update**: Dave Gay agrees that the change to check for 16 digits/first digit ≤ 8 would work and would be simple enough, but he’s not sure if it is worth making.)

## The Fast Path in Java’s FloatingDecimal Class

Java does its decimal to floating-point conversion in the doubleValue() method of its FloatingDecimal class. It is modeled on David Gay’s strtod(), and covers the exact set of fast path cases.

## Discussion

### Fast Path Vs. the Arbitrary-Precision Integer Algorithm

Fast path conversion is simple because most of the work is done in IEEE floating-point. Contrast this with the arbitrary-precision integer conversion algorithm, which

- Scales the fraction to make rounding of the significand easier.
- Normalizes the result.
- Encodes the result in IEEE floating-point format.

In IEEE floating-point, scaling is unnecessary; correct rounding is handled automatically. Also, there is no need for explicit normalization, nor any need to encode the result — it’s already encoded! Additionally, IEEE floating-point handles the edge case, where the significand would otherwise round up to 2^{53}.

The fast path reduces decimal to floating-point conversion to this: parsing of the decimal string, creation of two binary floating-point integers to represent it, and delegation of the hard work to a single floating-point instruction.

### Processor Rounding Mode

Because the fast path algorithm uses IEEE arithmetic, it will be sensitive to the processor rounding mode. Typically, the rounding mode is set to round-to-nearest, but it could be set to one of three other modes. A conversion routine must ensure that both its fast path and arbitrary-precision path round the same way.

## Leave a Comment

(To reduce spam, cookies must be enabled)