I’ve written about the formulas used to compute the number of decimal digits in a binary integer and the number of decimal digits in a binary fraction. In this article, I’ll use those formulas to determine the maximum number of digits required by the double-precision (double), single-precision (float), and quadruple-precision (quad) IEEE binary floating-point formats.
The maximum digit counts are useful if you want to print the full decimal value of a floating-point number (worst case format specifier and buffer size) or if you are writing or trying to understand a decimal to floating-point conversion routine (worst case number of input digits that must be converted).
Continue reading “Maximum Number of Decimal Digits In Binary Floating-Point Numbers”
For years I’ve followed, through RSS, floating-point related questions on stackoverflow.com. Every day it seems there is a question like “why does 19.24 plus 6.95 equal 26.189999999999998?” I decided to track these questions, to see if my sense of their frequency was correct. I found that, in the last 40 days, there were 18 such questions. That’s not one per day, but still — a lot!
Continue reading “Floating-Point Questions Are Endless on stackoverflow.com”
I read about an interesting method for decimal/binary conversion in chapter two of Gerald R. Rising’s book “Inside Your Calculator”. Unlike the standard string-oriented conversion algorithms, the algorithms in his book perform arithmetic on an encoding I’ve dubbed “deci-binary”. Using this encoding, binary numbers are input and output as decimal strings consisting of only ones and zeros, exploiting built-in language facilities for decimal input and output. I will demonstrate this conversion process with Java code I have written.
Continue reading “Decimal/Binary Conversion Using “Deci-Binary””
In general, to convert an arbitrary decimal number into a binary floating-point number, arbitrary-precision arithmetic is required. However, a subset of decimal numbers can be converted correctly with just ordinary limited-precision IEEE floating-point arithmetic, taking what I call the fast path to conversion. Fast path conversion is an optimization used in practice: it’s in David Gay’s strtod() function and in Java’s FloatingDecimal class. I will explain how fast path conversion works, and describe the set of numbers that qualify for it.
Continue reading “Fast Path Decimal to Floating-Point Conversion”
The infinite loop I discovered in PHP was caused by a bug in its decimal to floating-point conversion routine, which is based on David Gay’s widely used strtod() function. strtod() has a “correction loop,” the purpose of which is to refine an initial estimate of a converted double-precision value to its correctly rounded result. This got me thinking: infinite loops notwithstanding, how many times should the loop execute? Does it depend on the accuracy of the initial estimate? I instrumented strtod() and gathered some data to help answer these questions.
The most interesting thing I discovered was this: strtod()’s correction procedure can execute at most three times. So why was it coded as an infinite loop?
Continue reading “Properties of the Correction Loop in David Gay’s strtod()”
Recently I discovered that Java converts some very small decimal numbers to double-precision floating-point incorrectly. While investigating that bug, I stumbled upon something very strange: Java’s decimal to floating-point conversion routine, Double.parseDouble(), sometimes returns two different results for the same decimal string. The culprit appears to be just-in-time compilation of Double.parseDouble() into SSE instructions, which exposes an architecture-dependent bug in Java’s conversion algorithm — and another real-world example of a double rounding on underflow error. I’ll describe the problem, and take you through the detective work to find its cause.
Continue reading “Nondeterministic Floating-Point Conversions in Java”
While verifying the fix to the Java 2.2250738585072012e-308 bug I found an OpenJDK testcase for verifying conversions of edge case subnormal double-precision numbers. I ran the testcase, expecting it to work — but it failed! I determined it fails because Java converts some subnormal numbers incorrectly.
(By the way, this bug exists in prior versions of Java — it has nothing to do with the fix.)
Continue reading “Incorrectly Rounded Subnormal Conversions in Java”
Oracle has released a fix for security alert CVE-2010-4476 — the “Java Hangs on 2.2250738585072012e-308 bug.” The fix comes in the form of something called the FPUpdater Tool, which updates rt.jar. I tested it on my Windows XP system and it works.
Continue reading “FPUpdater Fixes the Java 2.2250738585072012e-308 Bug”
Java’s decimal to floating-point conversion routine, the doubleValue() method of its FloatingDecimal class, goes into an infinite loop when converting the decimal string 2.2250738585072012e-308 to double-precision binary floating-point. I took a closer look at the bug, by tracing the doubleValue() method in the Eclipse IDE for Java (thanks to Konstantin Preißer for helping me set that up). What I found was that our initial analysis of the bug was wrong; what actually happens is that doubleValue()’s correction loop oscillates between two values, 0x1p-1022 and 0x0.fffffffffffffp-1022.
Continue reading “A Closer Look at the Java 2.2250738585072012e-308 Bug”
Konstantin Preißer made an interesting discovery, after reading my article “PHP Hangs On Numeric Value 2.2250738585072011e-308”: Java — both its runtime and compiler — go into an infinite loop when converting the decimal number 2.2250738585072012e-308 to double-precision binary floating-point. This number is supposed to convert to 0x1p-1022, which is DBL_MIN; instead, Java gets stuck, oscillating between 0x1p-1022 and 0x0.fffffffffffffp-1022, the largest subnormal double-precision floating-point number.
Continue reading “Java Hangs When Converting 2.2250738585072012e-308”
What does this C program print?
int main (void)
The answer depends on which compiler you use. If you compile the program with Visual C++ and run on it on Windows, it prints 0.3; if you compile it with gcc and run it on Linux, it prints 0.2.
The compilers — actually, their run time libraries — are using different rules to break decimal rounding ties. The two-digit number 0.25, which has an exact binary floating-point representation, is equally near two one-digit decimal numbers: 0.2 and 0.3; either is an acceptable answer. Visual C++ uses the round-half-away-from-zero rule, and gcc (actually, glibc) uses the round-half-to-even rule, also known as bankers’ rounding.
This inconsistency of printed output is not limited to C — it spans many programming environments. In all, I tested fixed-format printing in nineteen environments: in thirteen of them, round-half-away-from-zero was used; in the remaining six, round-half-to-even was used. I also discovered an anomaly in some environments: numbers like 0.15 — which look like halfway cases but are actually not when viewed in binary — may be rounded incorrectly. I’ll report my results in this article.
Continue reading “Inconsistent Rounding of Printed Floating-Point Numbers”