Recently I discovered that Java converts some very small decimal numbers to double-precision floating-point incorrectly. While investigating that bug, I stumbled upon something very strange: Java’s decimal to floating-point conversion routine, Double.parseDouble(), sometimes returns two different results for the same decimal string. The culprit appears to be just-in-time compilation of Double.parseDouble() into SSE instructions, which exposes an architecture-dependent bug in Java’s conversion algorithm — and another real-world example of a double rounding on underflow error. I’ll describe the problem, and take you through the detective work to find its cause.
Nondeterministic Conversion
This program converts the decimal string representing the value 2^-1047 + 2^-1075 repeatedly, in a loop:
class diffconvert {
   public static void main(String[] args) {
      double conversion, conversion_prev = 0;

      //2^-1047 + 2^-1075
      String decimal = "6.631236871469758276785396630275967243399099947355303144249971758736286630139265439618068200788048744105960420552601852889715006376325666595539603330361800519107591783233358492337208057849499360899425128640718856616503093444922854759159988160304439909868291973931426625698663157749836252274523485312442358651207051292453083278116143932569727918709786004497872322193856150225415211997283078496319412124640111777216148110752815101775295719811974338451936095907419622417538473679495148632480391435931767981122396703443803335529756003353209830071832230689201383015598792184172909927924176339315507402234836120730914783168400715462440053817592702766213559042115986763819482654128770595766806872783349146967171293949598850675682115696218943412532098591327667236328125E-316";

      for (int i = 1; i <= 10000; i++) {
         conversion = Double.parseDouble(decimal);
         if (i != 1 && conversion != conversion_prev) {
            System.out.printf("Iteration %d converts as %a%n", i-1, conversion_prev);
            System.out.printf("Iteration %d converts as %a%n", i, conversion);
         }
         conversion_prev = conversion;
      }
   }
}
When the program runs, it should print nothing, because the conversion should be the same every time. However, at some point during the loop — a point which varies — the result of the conversion changes, from correct (0x0.0000008p-1022) to incorrect (0x0.0000008000001p-1022). And once the conversion changes, it never changes back (for as long as I’ve cared to extend the loop that is).
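As a cross-check on which result is the correctly rounded conversion, here is a small sketch of my own (not part of the test case) that uses exact BigDecimal arithmetic. It verifies that 2^-1047 + 2^-1075 lies exactly halfway between the two candidate doubles, so round-half-to-even selects the candidate with the even significand, 0x0.0000008p-1022:

import java.math.BigDecimal;

class checkconversion {
   public static void main(String[] args) {
      BigDecimal two = BigDecimal.valueOf(2);

      // 2^-1047 + 2^-1075, exactly (negative powers of two have terminating decimal expansions)
      BigDecimal value = BigDecimal.ONE.divide(two.pow(1047))
                                       .add(BigDecimal.ONE.divide(two.pow(1075)));

      // The two results seen in the loop; the BigDecimal(double) constructor is exact
      BigDecimal lo = new BigDecimal(0x0.0000008p-1022);        // 2^-1047
      BigDecimal hi = new BigDecimal(0x0.0000008000001p-1022);  // 2^-1047 + 2^-1074

      // Prints 0: the value is equidistant from lo and hi, so round-half-to-even
      // picks lo, the candidate whose significand ends in 0
      System.out.println(value.subtract(lo).compareTo(hi.subtract(value)));
   }
}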
Here’s an example of three consecutive runs of the program:
First Run:
Iteration 103 converts as 0x0.0000008p-1022
Iteration 104 converts as 0x0.0000008000001p-1022
Second Run:
Iteration 105 converts as 0x0.0000008p-1022
Iteration 106 converts as 0x0.0000008000001p-1022
Third Run:
Iteration 90 converts as 0x0.0000008p-1022
Iteration 91 converts as 0x0.0000008000001p-1022
Environment
I ran this on a 32-bit Intel machine. The problem occurs under both Windows and Linux. These are the versions of Java I’m using:
Windows
java -version java version "1.6.0_24" Java(TM) SE Runtime Environment (build 1.6.0_24-b07) Java HotSpot(TM) Client VM (build 19.1-b02, mixed mode, sharing) javac -version javac 1.6.0_24
Linux
java -version java version "1.6.0_20" OpenJDK Runtime Environment (IcedTea6 1.9.4) (6b20-1.9.4-0ubuntu1) OpenJDK Client VM (build 19.0-b09, mixed mode, sharing) javac -version javac 1.6.0_20
The Failing Code In doubleValue()
I traced the execution of the FloatingDecimal class with Eclipse and zeroed in on this section of code in doubleValue():
double t = dValue * tiny10pow[j];
if ( t == 0.0 ){
   /*
    * It did underflow.
    * Look more closely at the result.
    * If the exponent is just one too small,
    * then use the minimum finite as our estimate
    * value. Else call the result 0.0
    * and punt it.
    * ( I presume this could happen because
    * rounding forces the result here to be
    * an ULP or two less than
    * Double.MIN_VALUE ).
    */
   t = dValue * 2.0;
   t *= tiny10pow[j];
   if ( t == 0.0 ){
      return (isNegative)? -0.0 : 0.0;
   }
   t = Double.MIN_VALUE;
}
dValue = t;
Here are the values of the relevant variables after the first line of code (the multiplication) is executed, for both a correct conversion and an incorrect conversion:
For a correct conversion:
dValue: 0x1.54fdd80c8bd14p-197
tiny10pow[j]: 0x1.8062864ac6f43p-851
t: 0x0.0000008p-1022
For an incorrect conversion:
dValue: 0x1.54fdd80c8bd14p-197
tiny10pow[j]: 0x1.8062864ac6f43p-851
t: 0x0.0000008000001p-1022
dValue and tiny10pow[j] are the same in each case — so how could t be different?
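To poke at the multiplication outside of FloatingDecimal, here is a minimal sketch of my own (not part of the JDK) that repeats just dValue * tiny10pow[j] with these two operands. Whether it reproduces the switch depends on when, and whether, the JIT compiles the loop, so treat it only as an experiment:

class isolatedmultiply {
   public static void main(String[] args) {
      double dValue = 0x1.54fdd80c8bd14p-197;     // the dValue seen in doubleValue()
      double tiny10pow = 0x1.8062864ac6f43p-851;  // the tiny10pow[j] seen in doubleValue()
      double t, t_prev = 0;

      for (int i = 1; i <= 10000; i++) {
         t = dValue * tiny10pow;                  // underflows to a subnormal
         if (i != 1 && t != t_prev) {
            System.out.printf("Iteration %d multiplies as %a%n", i-1, t_prev);
            System.out.printf("Iteration %d multiplies as %a%n", i, t);
         }
         t_prev = t;
      }
   }
}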
My x87 Theory
My first theory was that x87 extended-precision arithmetic was coming into play. I thought that the processor had switched from 53-bit to 64-bit precision mode (although I had no theory as to how).
Calculating dValue * tiny10pow[j] in Extended-Precision Registers
Let’s take the double-precision values of dValue and tiny10pow[j] and multiply them by hand:
dValue = 0x1.54fdd80c8bd14p-197 =
1.01010100111111011101100000001100100010111101000101 x 2^-197
tiny10pow[j] = 0x1.8062864ac6f43p-851 =
1.1000000001100010100001100100101011000110111101000011 x 2^-851
dValue * tiny10pow[j] =
1.00000000000000000000000000010000000000000000000000000000111100101101
01010100101111111010100101000001111 x 2^-1047
Depending on which precision mode the processor is in, this result has to be rounded, to either 53 or 64 bits (bit 54 and bit 65 are the rounding bits for the two cases, respectively). Let’s round it both ways.
53-bit mode
If the processor is in 53-bit mode, the full-precision result would round to
1.0000000000000000000000000001 x 2^-1047
Since this is subnormal relative to double-precision, it has to be rounded again — to 52 bits. To see better where bit 53 (the rounding bit) is, let’s rewrite the 53-bit result in unnormalized binary scientific notation:
0.00000000000000000000000010000000000000000000000000001 x 2^-1022
Because of round-half-to-even rounding — the mode the processor is most certainly in — this would round to
0.0000000000000000000000001 x 2^-1022
This is the correct answer, so I concluded that 53-bit mode is the normal mode of operation.
64-bit mode
If the processor is in 64-bit mode, the full-precision result would round to
1.000000000000000000000000000100000000000000000000000000001111001 x 2^-1047
In unnormalized binary scientific notation this is
0.00000000000000000000000010000000000000000000000000001000000000000000
00000000000001111001 x 2^-1022
Because bits 53 and beyond are greater than 1/2 ULP, this would round to 52 bits as
0.0000000000000000000000001000000000000000000000000001 x 2^-1022
This matches the incorrect answer, so I concluded that the processor had switched from 53-bit mode to 64-bit mode sometime during the loop.
Other Pieces to The Puzzle
32-Bit vs. 64-Bit, x87 vs. SSE
I asked Konstantin Preißer to run my testcase. He confirmed seeing the same problem on a 32-bit JRE. He thinks that at some point during the loop, just-in-time compilation kicks in, causing native instructions to be executed. I took that as a plausible explanation for how the processor could switch modes.
In subsequent correspondence, Konstantin told me that on a 64-bit JRE, the conversion never changes. That’s what I expected, since I assumed the 64-bit JRE uses SSE instructions. But what I didn’t expect was that the conversion is always incorrect: 0x0.0000008000001p-1022.
Konstantin further told me that he ran my testcase on a 32-bit JRE without SSE:
java -XX:UseSSE=0 diffconvert
The conversion was always correct and never changed.
The 53-bit Mode Answer is Correct — But For the Wrong Reason
Looking closer at my computation of the 53-bit mode answer, 0.0000000000000000000000001 x 2^-1022, I realized it was the result of a double rounding on underflow error. If the full-precision result were rounded directly to 52 subnormal bits, it would have rounded up to the incorrect answer:
0.0000000000000000000000001000000000000000000000000001 x 2^-1022.
This directly rounded result is the same as the result computed with SSE instructions.
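To double-check that, here is another sketch of my own that computes the exact, full-precision product with BigDecimal and compares its distance to the two candidate doubles. It prints 1, meaning the exact product is nearer to 0x0.0000008000001p-1022; a single, correctly rounded multiplication therefore has to return that value:

import java.math.BigDecimal;

class checkproduct {
   public static void main(String[] args) {
      // The operands from doubleValue(); the BigDecimal(double) constructor is exact
      BigDecimal dValue = new BigDecimal(0x1.54fdd80c8bd14p-197);
      BigDecimal tiny10pow = new BigDecimal(0x1.8062864ac6f43p-851);

      BigDecimal product = dValue.multiply(tiny10pow);   // the exact product, no rounding

      BigDecimal lo = new BigDecimal(0x0.0000008p-1022);        // the x87 53-bit mode result
      BigDecimal hi = new BigDecimal(0x0.0000008000001p-1022);  // the SSE result

      // Prints 1: the product is farther from lo than from hi, so rounding it
      // directly to a subnormal double gives hi
      System.out.println(product.subtract(lo).compareTo(hi.subtract(product)));
   }
}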
My x87/SSE Theory
So now my theory is that both x87 and SSE come into play, with 53-bit mode x87 computations being done before a switchover to SSE “hot spot” computations. (I no longer think that 64-bit mode is involved; the 64-bit mode result is just coincidentally the same as the SSE result — it avoids the double rounding error as well.)
Further Support for My New Theory
If there were a way to step through assembler code and inspect registers with Eclipse for Java, I could verify my theory directly. Not being able to do that, I wrote two small C programs to do the example computation of dValue * tiny10pow[j]: one to do it with x87 instructions in 53-bit mode, and the other to do it with SSE instructions. I ran both programs on Linux, and traced them using the gdb debugger.
Computation In x87 53-bit Mode (conv_x87_53.c)
This program, when compiled with this command
gcc -o conv_x87_53 conv_x87_53.c
produces the correct conversion, 0x0.0000008p-1022:
#include <stdio.h>

int main (void)
{
 double dValue, tiny10pow, t;
 volatile short oldcw, newcw;

 /* Set 53-bit mode */
 asm ("fstcw %0\n" : :"m"(oldcw));
 newcw = (oldcw & 0xCFF) | 0x200;
 asm ("fldcw %0\n" : :"m"(newcw));

 dValue = 0x1.54fdd80c8bd14p-197;
 tiny10pow = 0x1.8062864ac6f43p-851;
 t = dValue * tiny10pow;
 printf("%a\n",t);
}
The result of the x87 multiplication, 0x3be88000000800000000, is put in FPU register st0. This is the IEEE 754 extended-precision encoding of the 53-bit rounded result: sign 0, biased exponent 0x3be8 (that is, -1047), and significand 0x8000000800000000, which is 1 + 2^-28, the 53-bit result computed above. It gets rounded again, to 52 bits, when stored to the variable t in memory. This results in a double rounding on underflow error.
Computation With SSE (conv_sse.c)
This program, when compiled with this command
gcc -msse -mfpmath=sse -march=pentium4 -o conv_sse conv_sse.c
produces the incorrect conversion, 0x0.0000008000001p-1022:
#include <stdio.h>

int main (void)
{
 double dValue, tiny10pow, t;

 dValue = 0x1.54fdd80c8bd14p-197;
 tiny10pow = 0x1.8062864ac6f43p-851;
 t = dValue * tiny10pow;
 printf("%a\n",t);
}
(This is the same program as above, but without the code to set the precision mode.)
The result of the multiplication is put in SSE register xmm0. It represents the final result, as no further rounding can occur when it’s stored to variable t in memory.
Summary
There is a bug in Java’s decimal to floating-point conversion algorithm, as demonstrated by my example. Interestingly, the bug is masked when the calculation is done with x87 instructions; the correct result comes about accidentally, through a double rounding error in hardware. With SSE instructions, the bug surfaces; the hardware faithfully does the calculation it was asked to do.
This bug makes Java’s decimal to floating-point conversion algorithm sensitive to the floating-point architecture on which it runs. And with just-in-time compilation on a 32-bit processor, both architectures come into play in one program, giving two conversion results for the same decimal string.
Discussion
I described an anomaly in the conversion of one tiny decimal number; no one will lose sleep over that. But are there implications for Java’s floating-point calculations in general?
How I Found This
In my article “Incorrectly Rounded Subnormal Conversions in Java” I used 2^-1047 + 2^-1075 as an example of incorrect conversion. I found it using an OpenJDK testcase, which I modified to generate conversions randomly. When I looked back at the unmodified testcase’s output, I realized that the conversion of 2^-1047 + 2^-1075 was not listed as incorrect.
To see what was going on, I hardcoded the decimal string in a small test program and ran it; it converted correctly as well. Then I put the conversion in a loop, to see if it changed — and it did!
Bug Reports
To address the nondeterministic conversion, I wrote this bug report, calling for the use of the ‘strictfp’ keyword in the FloatingDecimal class (see the discussion in the comments below): Bug ID 7021568: Double.parseDouble() returns architecture dependent results.
As for the incorrect conversion, I think it is covered by the bug report I wrote previously (I added a comment singling out this case): Bug ID 7019078: Double.parseDouble() converts some subnormal numbers incorrectly.
I guess this is what the “strictfp” keyword is for?
@Daniel Cheng,
I have tried strictfp, in the example above and in the FloatingDecimal class. It makes no difference (which makes me wonder: what’s it really supposed to do?)
Hi,
I also tried declaring the class FloatingDecimal as strictfp. It actually made a difference: on a 32-bit JRE, the value now always converts to “0x0.0000008000001p-1022”, independently of the UseSSE setting (however, with UseSSE=2, the program takes a bit longer to execute the conversions).
This is the same result that I get on a 64-bit JRE, where the conversion is also always “0x0.0000008000001p-1022”, independent of the UseSSE setting or strictfp declaration.
From what I read about the strictfp keyword, I think it is supposed to guarantee that all floating-point arithmetic (inside the class or method which is declared strictfp) is done in 32-bit or 64-bit (IEEE 754) mode, not extended 80-bit precision.
Konstantin,
You are right. I just tried it again and it DOES work when added in FloatingDecimal.java, whether on the class or on the doubleValue() method. (My notes say it didn’t work; now I know what happened: that note was in reference to the conversion being incorrect. I didn’t test whether strictfp fixed the nondeterminism.)
Is the fix simply for Java to add this keyword? It’s odd that it’s not there already! (I should say: is the fix for the nondeterminism that simple; the fact remains that the conversion is incorrect.)
I wonder: does strictfp simply force the use of SSE instructions? I don’t know how it could eliminate double rounding on underflow errors otherwise.
FWIW, I tried this on Mac OSX 10.6.6 with Apple’s Java 6 and no problems.
Also, using OpenJDK 1.7.0 build 20.0-b06 on same OS is problem free.
AIX IBM JDKs are the other way round: the non-JIT code gives the wrong result, but once the JIT kicks in, I get the correct result (using a Power5 CPU).
With AIX JDK 6 64 bits:
java -version
java version “1.6.0”
Java(TM) SE Runtime Environment (build pap6460sr6-20090925_01(SR6))
IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 AIX ppc64-64 jvmap6460sr6-20090923_42924 (JIT enabled, AOT enabled)
J9VM – 20090923_042924
JIT – r9_20090902_1330ifx1
GC – 20090817_AA)
JCL – 20090924_01
javac -version
javac 1.6.0-internal
Iteration 15 converts as 0x0.0000008000001p-1022
Iteration 16 converts as 0x0.0000008p-1022
With AIX JDK 6 32 bits, I get the same result (the iteration number is roughly the same). The number of iterations can be different when I run it several times, but it is always 0x0.0000008000001p-1022 first, then 0x0.0000008p-1022.
Using JDK 5 (both 32 bits and 64 bits), the change occurs later (around iteration 700 or 300). I guess the code which decides when to use the JIT has been changed between JDK 5 and JDK 6 to use the JIT sooner.
Using JDK 6 (64 bits and 32 bits) without the JIT compilation (i.e. running “java -Xint”) makes the conversion always the same (i.e. no output). Adding a printf outside of the loop, I see that it converts to 0x0.0000008000001p-1022.
Adding “strictfp” does not change anything with JDK 6 (64 bits and 32 bits), with or without JIT.
Does it mean I should add a loop around every floating-point computation in my code to get the correct result? 😉
@Tim,
Interesting. I’m not too familiar with the Power floating-point architecture — what instruction-level mechanism could cause this? It’d be interesting to see the assembler (I don’t know if your compiler accepts hexadecimal floating-point literals, but maybe you could try my C program — with appropriate compiler options of course).
As for strictfp, I think we’ve determined it only has effect when added to Java’s FloatingDecimal class.
You don’t need to use SSE to get correct strictfp behavior.
But, one has to be careful, see:
http://www.open-std.org/JTC1/SC22/JSG/docs/m3/docs/jsgn326.pdf