Incorrect Floating-Point to Decimal Conversions

In my article “Inconsistent Rounding of Printed Floating-Point Numbers” I showed examples of incorrect floating-point to decimal conversions I stumbled upon — in Java, Visual Basic, JavaScript, VBScript, and OpenOffice.org Calc. In this article, I’ll explore floating-point to decimal conversions more deeply, by analyzing conversions done under four C compilers: Visual C++, MinGW GCC, Digital Mars C, and Linux GCC. I found that incorrect conversions occur in three of the four environments — in all but Linux GCC. I’ll show you some examples and explain how I found them.

Correctly Rounded Conversions

I’ve written about how decimal numbers are sometimes rounded to floating-point numbers incorrectly. A decimal number, in general, can only be approximated in binary floating-point; as such, it needs to be rounded to one of two floating-point numbers surrounding it. Conventionally, it is rounded to the nearest floating-point number, with ties broken using the round-half-to-even rule. Alas, the nearest number is not always chosen, although fortunately — at least in the implementations I’ve tested — the second nearest is chosen. This results in a floating-point number that is one unit in the last place (ULP) away from the correct one.
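For example, the decimal number 0.1 lies between two double-precision values, and a correct conversion chooses the nearer one. Here is a small illustrative sketch of my own that prints the two candidates, assuming a C library that can print all of a double's digits (glibc can; see the Linux GCC section below):

#include <stdio.h>
#include <math.h>

int main(void)
{
    double chosen = 0.1;                     /* the double nearest to 0.1 */
    double below  = nextafter(chosen, 0.0);  /* the double just below it */

    /* 0.1 lies between these two values and is closer to the upper one,
       so round-to-nearest selects the upper one (chosen). */
    printf("below : %.60e\n", below);
    printf("chosen: %.60e\n", chosen);
    return 0;
}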

A similar problem exists in the other direction; that is, for floating-point to decimal conversions. Although every binary floating-point number has an exact decimal representation, rounding is required when printing to a fixed number of decimal digits. To be correctly rounded, the nearest of the two n-digit decimal numbers surrounding the floating-point number must be chosen, with ties broken according to a rule — typically round-half-away-from-zero or round-half-to-even. Sometimes the nearest number is not chosen, resulting in a decimal number that is one ULP away from the correct one.

To round floating-point numbers to decimal numbers correctly, two things must be done:

  1. The full-precision decimal equivalent of the floating-point number — or at least enough digits to make a correct rounding decision — must be generated.
  2. The full-precision decimal equivalent (or sufficiently long substring thereof) must be rounded properly to the specified number of digits.

Step 1 is the hard part, so I assume this is where things go wrong when conversions are done incorrectly — and things do go wrong, as demonstrated by the examples below.
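As a concrete illustration of both steps, take 0.1 (my own example, not one of the examples analyzed below). On an implementation whose library generates all the digits, a short program shows the full-precision decimal equivalent and a correctly rounded 17-digit result:

#include <stdio.h>

int main(void)
{
    /* Step 1: the full-precision decimal equivalent of the double nearest 0.1;
       on glibc this prints
       0.1000000000000000055511151231257827021181583404541015625 */
    printf("%.55f\n", 0.1);

    /* Step 2: round that expansion to 17 significant digits; the digits
       beyond the 17th are 55511..., more than one-half ULP, so the correct
       result is 1.0000000000000001e-01 */
    printf("%.16e\n", 0.1);
    return 0;
}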

Using sprintf() for Floating-Point to Decimal Conversions

In C, floating-point numbers are converted to decimal strings using the printf() family of functions, which are supplied by the run time library associated with a compiler. I used the sprintf() function, which allowed me to automate the search for incorrectly converted values; this is the form I used:

sprintf (decimalString,"%.*e",numDigits-1,floatingPointNumber);

The “%.*e” format specifier prints a floating-point number in normalized scientific notation, rounded to the selected number of digits after the decimal point (the digit before the decimal point, though significant, is not counted in the precision, which is why I pass numDigits-1 above).

For example, in Visual C++,

sprintf (decimalString,"%.*e",3,0.84375);

sets decimalString to 8.438e-001, and

sprintf (decimalString,"%.*e",1,0.84375);

sets decimalString to 8.4e-001.
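For reference, here is a complete program wrapping those two calls; the outputs in the comments are what Visual C++ produces (other libraries typically print a two-digit exponent rather than three):

#include <stdio.h>

int main(void)
{
    char decimalString[32];

    sprintf(decimalString, "%.*e", 3, 0.84375);
    printf("%s\n", decimalString);   /* 8.438e-001 under Visual C++ */

    sprintf(decimalString, "%.*e", 1, 0.84375);
    printf("%s\n", decimalString);   /* 8.4e-001 under Visual C++ */

    return 0;
}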

Finding Incorrect Floating-Point to Decimal Conversions

I generated example floating-point numbers to convert by randomly generating decimal numbers and then converting them using David Gay’s strtod() function. I formatted the floating-point numbers with sprintf() and compared its output to the correctly rounded output of David Gay’s dtoa() function. (I wrapped dtoa() in a version of David Gay’s g_fmt() function, modified to account for exponent and trailing zero formatting differences.) Of the many examples I found in Visual C++, MinGW, and Digital Mars, I selected a few for analysis and presentation.
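Here is a stripped-down sketch of that kind of search loop; it is not my actual test program. It generates random doubles directly rather than random decimal strings, and it assumes David Gay's dtoa.c is compiled and linked in (with the appropriate arithmetic define, e.g. IEEE_8087) to serve as the reference converter:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

/* David Gay's reference converter (dtoa.c must be linked in). */
extern char *dtoa(double dd, int mode, int ndigits,
                  int *decpt, int *sign, char **rve);
extern void freedtoa(char *s);

/* Collect just the significant digits of sprintf()'s "%.*e" output;
   for example, "8.438e-001" becomes "8438". */
static void digits_only(const char *s, char *out)
{
    for (; *s && *s != 'e' && *s != 'E'; s++)
        if (isdigit((unsigned char)*s))
            *out++ = *s;
    *out = '\0';
}

int main(void)
{
    char printed[64], printedDigits[64], correctDigits[64];
    char *ref, *refEnd;
    int trial, numDigits, decpt, sign;

    srand(1);
    for (trial = 0; trial < 1000000; trial++) {
        /* A random double in [0, 10); my real search started from random
           decimal strings converted with strtod() instead. */
        double d = 10.0 * rand() / ((double)RAND_MAX + 1.0);

        for (numDigits = 7; numDigits <= 17; numDigits++) {
            /* The conversion under test. */
            sprintf(printed, "%.*e", numDigits - 1, d);
            digits_only(printed, printedDigits);

            /* The reference: dtoa() mode 2 returns numDigits correctly
               rounded significant digits with trailing zeros stripped,
               so pad the result back out before comparing. */
            ref = dtoa(d, 2, numDigits, &decpt, &sign, &refEnd);
            strcpy(correctDigits, ref);
            freedtoa(ref);
            while ((int)strlen(correctDigits) < numDigits)
                strcat(correctDigits, "0");

            if (strcmp(printedDigits, correctDigits) != 0)
                printf("mismatch: %.17e -> %s (should be %s)\n",
                       d, printed, correctDigits);
        }
    }
    return 0;
}

Any mismatch it reports is worth checking by hand, since a value that falls exactly halfway between two n-digit decimal numbers could legitimately print differently under different tie-breaking rules.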

For each example I show five things:

  • The randomly generated input number.
  • The correctly rounded double-precision floating-point equivalent of the input number, written as a hexadecimal floating-point constant.
  • The correctly rounded double-precision floating-point equivalent of the input number, written in decimal (I computed this by converting the input number to binary, rounding it to 53 significant bits by hand, and then converting it back to decimal.)
  • The decimal equivalent of the double-precision floating-point number, rounded correctly to the specified number of digits.
  • The decimal equivalent of the double-precision floating-point number, as rounded incorrectly to the specified number of digits by sprintf().

I present the examples without ‘e’ notation, since I think it makes comparison of the rounded numbers easier.

Visual C++ (2010) / MinGW GCC C (4.5.0) on Windows

In this section, I’ll show three examples that are incorrectly rounded under Visual C++ and MinGW (Visual C++ and MinGW use the same run time library, so they get the same results).

Example 1

Input:                     1.0551955
Nearest Double (hex):      0x1.0e214ad362e90p+0
Nearest Double (decimal):  1.055195499999999952933649183250963687896728515625
Rounded to 7 Digits:       1.055195
Printed to 7 Digits:       1.055196

The input number 1.0551955 converts correctly to the double-precision floating-point number 1.055195499999999952933649183250963687896728515625 (which equals 0x1.0e214ad362e90p+0 as a hexadecimal floating-point constant). This double-precision value rounded correctly to seven digits is 1.055195, since the value of decimal place seven and beyond is less than one-half ULP (the last place being the sixth decimal place). Visual C++ and MinGW round it incorrectly to 1.055196.
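As a quick cross-check (a sketch, not part of my search program), printing that same double on an implementation that rounds correctly, such as Linux GCC/glibc per the results below, gives 1.055195e+00; the hexadecimal constant requires C99 support:

#include <stdio.h>

int main(void)
{
    double d = 0x1.0e214ad362e90p+0;   /* the double nearest 1.0551955 */

    /* Digits seven and beyond of its full decimal expansion are 499999...,
       less than one-half ULP, so a correct conversion to seven significant
       digits prints 1.055195e+00; Visual C++ and MinGW print 1.055196e+00. */
    printf("%.6e\n", d);
    return 0;
}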

(For the remaining examples, the analysis is similar; I will let the tables speak for themselves.)

Example 2

Input:                     8.330400913327153
Nearest Double (hex):      0x1.0a92a4efa9e08p+3
Nearest Double (decimal):  8.3304009133271534892628551460802555084228515625
Rounded to 16 Digits:      8.330400913327153
Printed to 16 Digits:      8.330400913327154

Example 3

Input:                     9.522938016739373
Nearest Double (hex):      0x1.30bbe881f761fp+3
Nearest Double (decimal):  9.5229380167393724576641034218482673168182373046875
Rounded to 16 Digits:      9.522938016739372
Printed to 16 Digits:      9.522938016739373

A Note About the “%a” Format Specifier in MinGW

Interestingly, MinGW printf() does not appear to support “%a”, the format specifier that prints hexadecimal floating-point constants. Using it caused my program to crash. This was unexpected — does it use the same run time library as Visual C++ or not?

To verify, without using “%a”, that the floating-point values tested in MinGW were as expected, I printed them using my function print_double_binsci().
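As an alternative sketch, a double's raw bits can also be dumped without relying on “%a”; note that older MinGW/msvcrt printf() may want “%I64x” in place of “%llx”:

#include <stdio.h>
#include <string.h>
#include <stdint.h>

/* Print the raw IEEE 754 bit pattern of a double, avoiding "%a". */
static void print_double_bits(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    printf("0x%016llx\n", (unsigned long long)bits);
}

int main(void)
{
    print_double_bits(0.84375);   /* prints 0x3feb000000000000 */
    return 0;
}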

Digital Mars C (v852) on Windows

In this section, I’ll show three examples that are incorrectly rounded under Digital Mars C.

Example 4

Input:                     9194.25055964485
Nearest Double (hex):      0x1.1f5201256a42ap+13
Nearest Double (decimal):  9194.25055964485000004060566425323486328125
Rounded to 14 Digits:      9194.2505596449
Printed to 14 Digits:      9194.2505596448

Example 5

Input:                     816.2665949149578
Nearest Double (hex):      0x1.98221fc83c830p+9
Nearest Double (decimal):  816.266594914957750006578862667083740234375
Rounded to 16 Digits:      816.2665949149578
Printed to 16 Digits:      816.2665949149577

Example 6

Input:                     95.47149571505499
Nearest Double (hex):      0x1.7de2cfc5d1761p+6
Nearest Double (decimal):  95.4714957150549849984599859453737735748291015625
Rounded to 16 Digits:      95.47149571505498
Printed to 16 Digits:      95.47149571505499

Linux GCC (4.4.3) / eglibc (2.11.1)

I found no incorrect floating-point to decimal conversions under Linux GCC. This makes sense, given that it is the only one of the four environments whose run time library can generate all of the significant decimal digits of a floating-point number.
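For instance, asking for more digits than Example 1's value contains reproduces the full decimal value from the table, followed only by trailing zeros (a sketch; per the results above, the other three environments do not produce all of these digits):

#include <stdio.h>

int main(void)
{
    /* On glibc this prints the full decimal expansion of the double
       nearest 1.0551955,
       1.055195499999999952933649183250963687896728515625e+00,
       padded with trailing zeros to the requested precision. */
    printf("%.60e\n", 1.0551955);
    return 0;
}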

On Round-Trip Conversions

This article is about floating-point to decimal conversions, not round-trip decimal to floating-point to decimal conversions. Nonetheless, I purposely selected four of my examples to bring up an interesting discussion about round-trip conversions.

Examples 3 and 6 show conversions that round-trip but shouldn’t, and examples 2 and 5 show conversions that don’t round-trip but should. You might get fooled into judging the correctness of these conversions based on whether the output matches the input. But this thinking is wrong; the output decimal number is determined solely by the floating-point number, which may only be an approximation to the input decimal number.
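A short sketch makes this concrete with Example 3: a correctly rounding library prints 9.522938016739372, which does not match the 16-digit input, yet the conversion is right.

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* Example 3's input, converted to the nearest double. */
    double d = strtod("9.522938016739373", NULL);

    char out[32];
    sprintf(out, "%.15e", d);   /* 16 significant digits */

    /* On a correctly rounding library (e.g., glibc), out is
       9.522938016739372e+00: correct, even though it does not
       round-trip back to the 16-digit input. */
    printf("in : 9.522938016739373\nout: %s\n", out);
    return 0;
}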


One comment

  1. To be fair with Visual C++ and MinGW GCC,
    they are pretty good if you ask for 17 sig. digits

    Bad conversion happens when it is VERY close to half-way cases.
    I believe the inaccuracy comes from DOUBLE ROUNDING

    your 2nd example 0x1.0a92a4efa9e08p+3
    8.3304009133271534892 …
    -> 17 digits = 8.3304009133271535
    -> 16 digits = 8.330400913327154

    But if you try 0x2.0a92a4efa9e08p+3 (fractional part unchanged)
    -> 17 digits = 16.330400913327153
