## Maximum Number of Decimal Digits In Binary Floating-Point Numbers

I’ve written about the formulas used to compute the number of decimal digits in a binary integer and the number of decimal digits in a binary fraction. In this article, I’ll use those formulas to determine the maximum number of digits required by the double-precision (double), single-precision (float), and quadruple-precision (quad) IEEE binary floating-point formats.

The maximum digit counts are useful if you want to print the full decimal value of a floating-point number (worst case format specifier and buffer size) or if you are writing or trying to understand a decimal to floating-point conversion routine (worst case number of input digits that must be converted).

## Visual C++ strtod(): Still Broken

About a year ago Bruce Dawson informed me that Microsoft is fixing their decimal to floating-point conversion routine in the next release of Visual Studio; I finally made the time to test the new code. I installed Visual Studio Community 2015 Release Candidate and ran my old C++ testcases. The good news: all of the individual conversion errors that I wrote about are fixed. The bad news: many errors remain.

## Floating-Point Questions Are Endless on stackoverflow.com

For years I’ve followed, through RSS, floating-point related questions on stackoverflow.com. Every day it seems there is a question like “why does 19.24 plus 6.95 equal 26.189999999999998?” I decided to track these questions, to see if my sense of their frequency was correct. I found that, in the last 40 days, there were 18 such questions. That’s not one per day, but still — a lot!

## A Better Way to Convert Integers in David Gay’s strtod()

A reader of my blog, John Harrison, suggested a way to improve how David Gay’s strtod() converts large integers to doubles. Instead of approximating the conversion and going through the correction loop to check and correct it — the signature processes of strtod() — he proposed doing the conversion directly from a binary big integer representation of the decimal input. strtod() does lots of processing with big integers, so the facility to do this is already there.

I implemented John’s idea in a copy of strtod(). The path for large integers is so much simpler and faster that I can’t believe it never occurred to me to do it this way. It’s also surprising that strtod() never implemented it this way to begin with.

## Using Integers to Check a Floating-Point Approximation

For decimal inputs that don’t qualify for fast path conversion, David Gay’s strtod() function does three things: first, it uses IEEE double-precision floating-point arithmetic to calculate an approximation to the correct result; next, it uses arbitrary-precision integer arithmetic (AKA big integers) to check if the approximation is correct; finally, it adjusts the approximation, if necessary. In this article, I’ll explain the second step — how the check of the approximation is done.

## strtod()’s Initial Decimal to Floating-Point Approximation

David Gay’s strtod() function does decimal to floating-point conversion using both IEEE double-precision floating-point arithmetic and arbitrary-precision integer arithmetic. For some inputs, a simple IEEE floating-point calculation suffices to produce the correct result; for other inputs, a combination of IEEE arithmetic and arbitrary-precision arithmetic is required. In the latter case, IEEE arithmetic is used to calculate an approximation to the correct result, which is then refined using arbitrary-precision arithmetic. In this article, I’ll describe the approximation calculation, which is based on a form of binary exponentiation.

## Nondeterministic Floating-Point Conversions in Java

Recently I discovered that Java converts some very small decimal numbers to double-precision floating-point incorrectly. While investigating that bug, I stumbled upon something very strange: Java’s decimal to floating-point conversion routine, Double.parseDouble(), sometimes returns two different results for the same decimal string. The culprit appears to be just-in-time compilation of Double.parseDouble() into SSE instructions, which exposes an architecture-dependent bug in Java’s conversion algorithm — and another real-world example of a double rounding on underflow error. I’ll describe the problem, and take you through the detective work to find its cause.

## Fifteen Digits Don’t Round-Trip Through SQLite Reals

I’ve discovered that decimal floating-point numbers of 15 significant digits or less don’t always round-trip through SQLite. Consider this example, executed on version 3.7.3 of the pre-compiled SQLite command shell:

```sqlite> create table t1(d real);
sqlite> insert into t1 values(9.944932e+31);
sqlite> select * from t1;
9.94493200000001e+31
```

SQLite represents a decimal floating-point number that has real affinity as a double-precision binary floating-point number — a double. A decimal number of 15 significant digits or less is supposed to be recoverable from its double-precision representation. In SQLite, however, this guarantee is not met; this is because its floating-point to decimal conversion routine is implemented in limited-precision floating-point arithmetic.

## The Answer is One (Unless You Use Floating-Point)

What does this C function do?

```double f(double a)
{
double b, c;

b = 10*a - 10;
c = a - 0.1*b;

return (c);
}
```

Based solely on reading the code, you’ll conclude that it always returns 1: c = a – 0.1*(10*a – 10) = a – (a-1) = 1. But if you execute the code, you’ll find that it may or may not return 1, depending on the input. If you know anything about binary floating-point arithmetic, that won’t surprise you; what might surprise you is how far from 1 the answer can be — as far away as a large negative number!

## Quick and Dirty Floating-Point to Decimal Conversion

In my article “Quick and Dirty Decimal to Floating-Point Conversion” I presented a small C program that uses double-precision floating-point arithmetic to convert decimal strings to binary floating-point numbers. The program converts some numbers incorrectly, despite using an algorithm that’s mathematically correct; its limited precision calculations are to blame. I dubbed the program “quick and dirty” because it’s simple, and overall converts reasonably accurately.

For this article, I took a similar approach to the conversion in the opposite direction — from binary floating-point to decimal string. I wrote a small C program that combines two mathematically correct algorithms: the classic “repeated division by ten” algorithm to convert integer values, and the classic “repeated multiplication by ten” algorithm to convert fractional values. The program uses double-precision floating-point arithmetic, so like its quick and dirty decimal to floating-point counterpart, its conversions are not always correct — though reasonably accurate. I’ll present the program and analyze some example conversions, both correct and incorrect.

## Incorrect Floating-Point to Decimal Conversions

In my article “Inconsistent Rounding of Printed Floating-Point Numbers” I showed examples of incorrect floating-point to decimal conversions I stumbled upon — in Java, Visual Basic, JavaScript, VBScript, and OpenOffice.org Calc. In this article, I’ll explore floating-point to decimal conversions more deeply, by analyzing conversions done under four C compilers: Visual C++, MinGW GCC, Digital Mars C, and Linux GCC. I found that incorrect conversions occur in three of the four environments — in all but Linux GCC. I’ll show you some examples and explain how I found them.

## Inconsistent Rounding of Printed Floating-Point Numbers

What does this C program print?

```#include <stdio.h>
int main (void)
{
printf ("%.1f\n",0.25);
}
```

The answer depends on which compiler you use. If you compile the program with Visual C++ and run on it on Windows, it prints 0.3; if you compile it with gcc and run it on Linux, it prints 0.2.

The compilers — actually, their run time libraries — are using different rules to break decimal rounding ties. The two-digit number 0.25, which has an exact binary floating-point representation, is equally near two one-digit decimal numbers: 0.2 and 0.3; either is an acceptable answer. Visual C++ uses the round-half-away-from-zero rule, and gcc (actually, glibc) uses the round-half-to-even rule, also known as bankers’ rounding.

This inconsistency of printed output is not limited to C — it spans many programming environments. In all, I tested fixed-format printing in nineteen environments: in thirteen of them, round-half-away-from-zero was used; in the remaining six, round-half-to-even was used. I also discovered an anomaly in some environments: numbers like 0.15 — which look like halfway cases but are actually not when viewed in binary — may be rounded incorrectly. I’ll report my results in this article.