Nondeterministic Floating-Point Conversions in Java

http://www.exploringbinary.com/nondeterministic-floating-point-conversions-in-java/


Recently I discovered that Java converts some very small decimal numbers to double-precision floating-point incorrectly. While investigating that bug, I stumbled upon something very strange: Java’s decimal to floating-point conversion routine, Double.parseDouble(), sometimes returns two different results for the same decimal string. The culprit appears to be just-in-time compilation of Double.parseDouble() into SSE instructions, which exposes an architecture-dependent bug in Java’s conversion algorithm — and another real-world example of a double rounding on underflow error. I’ll describe the problem, and take you through the detective work to find its cause.

Nondeterministic Conversion

This program converts the decimal string representing the value 2-1047 + 2-1075 repeatedly, in a loop (I inserted 11 newlines in the string so it would fit here; you’ll have to delete them if you want to run this):

class diffconvert {
 public static void main(String[] args) {
   double conversion, conversion_prev = 0;

   //2^-1047 + 2^-1075
   String decimal = "6.6312368714697582767853966302759672433990999
473553031442499717587362866301392654396180682007880487441059604205
526018528897150063763256665955396033303618005191075917832333584923
372080578494993608994251286407188566165030934449228547591599881603
044399098682919739314266256986631577498362522745234853124423586512
070512924530832781161439325697279187097860044978723221938561502254
152119972830784963194121246401117772161481107528151017752957198119
743384519360959074196224175384736794951486324803914359317679811223
967034438033355297560033532098300718322306892013830155987921841729
099279241763393155074022348361207309147831684007154624400538175927
027662135590421159867638194826541287705957668068727833491469671712
93949598850675682115696218943412532098591327667236328125E-316";

   for(int i = 1; i <= 10000; i++) {
     conversion = Double.parseDouble(decimal);
     if (i != 1 && conversion != conversion_prev) {
        System.out.printf("Iteration %d converts as %a%n",
                          i-1,conversion_prev);
        System.out.printf("Iteration %d converts as %a%n",
                          i,conversion);
     }
     conversion_prev = conversion;
   }
 }
}

When the program runs, it should print nothing, because the conversion should be the same every time. However, at some point during the loop — a point which varies — the result of the conversion changes, from correct (0x0.0000008p-1022) to incorrect (0x0.0000008000001p-1022). And once the conversion changes, it never changes back (for as long as I’ve cared to extend the loop that is).

Here’s an example of three consecutive runs of the program:

First Run:

Iteration 103 converts as 0x0.0000008p-1022
Iteration 104 converts as 0x0.0000008000001p-1022

Second Run:

Iteration 105 converts as 0x0.0000008p-1022
Iteration 106 converts as 0x0.0000008000001p-1022

Third Run:

Iteration 90 converts as 0x0.0000008p-1022
Iteration 91 converts as 0x0.0000008000001p-1022

Environment

I ran this on a 32-bit Intel machine. The problem occurs under both Windows and Linux. These are the versions of Java I’m using:

Windows

java -version
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
Java HotSpot(TM) Client VM (build 19.1-b02, mixed mode, sharing)

javac -version
javac 1.6.0_24

Linux

java -version
java version "1.6.0_20"
OpenJDK Runtime Environment (IcedTea6 1.9.4) (6b20-1.9.4-0ubuntu1)
OpenJDK Client VM (build 19.0-b09, mixed mode, sharing)

javac -version
javac 1.6.0_20

The Failing Code In doubleValue()

I traced the execution of the FloatingDecimal class with Eclipse and zeroed in on this section of code in doubleValue():

double t = dValue * tiny10pow[j];
if ( t == 0.0 ){
  /*
   * It did underflow.
   * Look more closely at the result.
   * If the exponent is just one too small,
   * then use the minimum finite as our estimate
   * value. Else call the result 0.0
   * and punt it.
   * ( I presume this could happen because
   * rounding forces the result here to be
   * an ULP or two less than
   * Double.MIN_VALUE ).
   */
  t = dValue * 2.0;
  t *= tiny10pow[j];
  if ( t == 0.0 ){
    return (isNegative)? -0.0 : 0.0;
  }
  t = Double.MIN_VALUE;
}
dValue = t;

Here are the values of the relevant variables after the highlighted line of code is executed, for both a correct conversion and an incorrect conversion:

For a correct conversion:

dValue:       0x1.54fdd80c8bd14p-197
tiny10pow[j]: 0x1.8062864ac6f43p-851
t:            0x0.0000008p-1022

For an incorrect conversion:

dValue:       0x1.54fdd80c8bd14p-197
tiny10pow[j]: 0x1.8062864ac6f43p-851
t:            0x0.0000008000001p-1022

dValue and tiny10pow[j] are the same in each case — so how could t be different?

My x87 Theory

My first theory was that x87 extended-precision arithmetic was coming into play. I thought that the processor had switched from 53-bit to 64-bit precision mode (although I had no theory as to how).

Calculating dValue * tiny10pow[j] in Extended-Precision Registers

Let’s take the double-precision values of dValue and tiny10pow[j] and multiply them by hand:

dValue = 0x1.54fdd80c8bd14p-197 =

1.01010100111111011101100000001100100010111101000101 x 2-197

tiny10pow[j] = 0x1.8062864ac6f43p-851 =

1.1000000001100010100001100100101011000110111101000011 x 2-851

dValue * tiny10pow[j] =

1.00000000000000000000000000010000000000000000000000000000111100101101
01010100101111111010100101000001111 x 2-1047

Depending on which precision mode the processor is in, this result has to be rounded, to either 53 or 64 bits (I’ve highlighted bits 54 and 65, the rounding bits for each case). Let’s round it both ways.

53-bit mode

If the processor is in 53-bit mode, the full-precision result would round to

1.0000000000000000000000000001 x 2-1047

Since this is subnormal relative to double-precision, it has to be rounded again — to 52 bits. To see better where bit 53 (the rounding bit) is, let’s rewrite the 53-bit result in unnormalized binary scientific notation:

0.00000000000000000000000010000000000000000000000000001 x 2-1022

Because of round-half-to-even rounding — the mode the processor is most certainly in — this would round to

0.0000000000000000000000001 x 2-1022

This is the correct answer, so I concluded that 53-bit mode is the normal mode of operation.

64-bit mode

If the processor is in 64-bit mode, the full-precision result would round to

1.000000000000000000000000000100000000000000000000000000001111001 x 2-1047

In unnormalized binary scientific notation this is

0.00000000000000000000000010000000000000000000000000001000000000000000
00000000000001111001 x 2-1022

Because bits 53 and beyond are greater than 1/2 ULP, this would round to 52 bits as

0.0000000000000000000000001000000000000000000000000001 x 2-1022

This matches the incorrect answer, so I concluded that the processor had switched from 53-bit mode to 64-bit mode sometime during the loop.

Other Pieces to The Puzzle

32-Bit vs. 64-Bit, x87 vs. SSE

I asked Konstantin Preißer to run my testcase. He confirmed seeing the same problem on a 32-Bit JRE. He thinks that at some point during the loop, just-in-time compilation kicks in, causing native instructions to be executed. I took that as a plausible explanation for how the processor could switch modes.

In subsequent correspondence, Konstantin told me that on a 64-bit JRE, the conversion never changes. That’s what I expected, since I assumed the 64-bit JRE uses SSE instructions. But what I didn’t expect was that the conversion is always incorrect: 0x0.0000008000001p-1022.

Konstantin further told me that he ran my testcase on a 32-bit JRE without SSE:

java -XX:UseSSE=0 diffconvert

The conversion was always correct and never changed.

The 53-bit Mode Answer is Correct — But For the Wrong Reason

Looking closer at my computation of the 53-bit mode answer, 0.0000000000000000000000001 x 2-1022, I realized it was the result of a double rounding on underflow error. If the full-precision result were rounded directly to 52 subnormal bits, it would have rounded up to the incorrect answer:

0.0000000000000000000000001000000000000000000000000001 x 2-1022.

This directly rounded result is the same as the result computed with SSE instructions.

My x87/SSE Theory

So now my theory is that both x87 and SSE come into play, with 53-bit mode x87 computations being done before a switchover to SSE “hot spot” computations. (I no longer think that 64-bit mode is involved; the 64-bit mode result is just coincidentally the same as the SSE result — it avoids the double rounding error as well.)

Further Support for My New Theory

If there were a way to step through assembler code and inspect registers with Eclipse for Java, I could verify my theory directly. Not being able to do that, I wrote two small C programs to do the example computation of dValue * tiny10pow[j]: one to do it with x87 instructions in 53-bit mode, and the other to do it with SSE instructions. I ran both programs on Linux, and traced them using the gdb debugger.

Computation In x87 53-bit Mode (conv_x87_53.c)

This program, when compiled with this command

gcc -o conv_x87_53 conv_x87_53.c

produces the correct conversion, 0x0.0000008p-1022:

#include <stdio.h>
int main (void)
{
 double dValue, tiny10pow, t;
 volatile short oldcw, newcw;

 /* Set 53-bit mode */
 asm ("fstcw %0\n" : :"m"(oldcw));
 newcw = (oldcw & 0xCFF) | 0x200;
 asm ("fldcw %0\n" : :"m"(newcw));

 dValue = 0x1.54fdd80c8bd14p-197;
 tiny10pow = 0x1.8062864ac6f43p-851;
 t = dValue * tiny10pow;

 printf("%a\n",t);
}

The result of the x87 multiplication, 0x3be88000000800000000, is put in FPU register st0. This is the IEEE 754 extended-precision encoding of the 53-bit rounded result. It gets rounded again, to 52 bits, when stored to the variable t in memory. This results in a double rounding on underflow error.

Computation With SSE (conv_sse.c)

This program, when compiled with this command

gcc -msse -mfpmath=sse -march=pentium4 -o conv_sse conv_sse.c

produces the incorrect conversion, 0x0.0000008000001p-1022:

#include <stdio.h>
int main (void)
{
 double dValue, tiny10pow, t;

 dValue = 0x1.54fdd80c8bd14p-197;
 tiny10pow = 0x1.8062864ac6f43p-851;
 t = dValue * tiny10pow;

 printf("%a\n",t);
}

(This is the same program as above, but without the code to set the precision mode.)

The result of the multiplication is put in SSE register xmm0. It represents the final result, as no further rounding can occur when it’s stored to variable t in memory.

Summary

There is a bug in Java’s decimal to floating-point conversion algorithm, as demonstrated by my example. Interestingly, the bug is masked when done with x87 instructions; the correct result comes accidentally from a double rounding error in hardware. Using SSE instructions, the bug surfaces; the hardware does the calculation it was asked to do — faithfully.

This bug makes Java’s decimal to floating-point conversion algorithm sensitive to the floating-point architecture on which it runs. And with just-in-time compilation on a 32-bit processor, both architectures come into play in one program, giving two conversion results for the same decimal string.

Discussion

I described an anomaly in the conversion of one tiny decimal number; no one will lose sleep over that. But are there implications for Java’s floating-point calculations in general?

How I Found This

In my article “Incorrectly Rounded Subnormal Conversions in Java” I used 2-1047 + 2-1075 as an example of incorrect conversion. I found it using an OpenJDK testcase, which I modified to generate conversions randomly. When I looked back at the unmodified testcase’s output, I realized that the conversion of 2-1047 + 2-1075 was not listed as incorrect.

To see what was going on, I hardcoded the decimal string in a small test program and ran it; it converted correctly as well. Then I put the conversion in a loop, to see if it changed — and it did!

Bug Reports

To address the nondeterministic conversion, I wrote this bug report, calling for the use of the ‘strictfp’ keyword in the FloatingDecimal class (see the discussion in the comments below): Bug ID 7021568: Double.parseDouble() returns architecture dependent results.

As for the incorrect conversion, I think it is covered by the bug report I wrote previously (I added a comment singling out this case): Bug ID 7019078: Double.parseDouble() converts some subnormal numbers incorrectly.

Dingbat

8 Responses to “Nondeterministic Floating-Point Conversions in Java”

  1. Daniel Cheng Says:

    I guess this is what the “strictfp” keyword is for?

  2. Rick Regan Says:

    @Daniel Cheng,

    I have tried strictfp, in the example above and in the FloatingDecimal class. It makes no difference (which makes me wonder: what’s it really supposed to do?)

  3. Konstantin Preißer Says:

    Hi,

    I also tried to declare the class FloatingDecimal as strictfp. It actually made a difference: On a 32-Bit JRE, the value now always converts to “0x0.0000008000001p-1022″, independently from the UseSSE-Setting (however with UseSSE=2, the program takes a bit longer to execute the conversions).

    This is the same result that I get on a 64-Bit JRE, but there the conversions is also always “0x0.0000008000001p-1022″, independent from UseSSE setting or strictfp declaration.

    From what I read about strictfp keyword, I think it is supposed to guarantee that all floating point arithmetics (inside the class or method which is declared strictfp) are done in 32-bit or 64-bit (IEEE 754) mode, not extended 80-bit presicion.

  4. Rick Regan Says:

    Konstantin,

    You are right. I just tried it again and it DOES work when added in FloatingDecimal.java, whether on the class or doubleValue() method. (My notes say it didn’t work — I don’t know what happened I know what happened: the note was in reference to the conversion being incorrect. I didn’t test whether strictfp fixed the nondeterminism.)

    Is the fix simply for Java to add this keyword? It’s odd that it’s not there already! (I should say is the fix for the nondeterminism that simple; the fact remains that the conversion is incorrect.)

    I wonder: does strictfp simply force the use of SSE instructions? I don’t know how it could eliminate double rounding on underflow errors otherwise.

  5. Adrian Says:

    FWIW, I tried this on Mac OSX 10.6.6 with Apple’s Java 6 and no problems.
    Also, using OpenJDK 1.7.0 build 20.0-b06 on same OS is problem free.

  6. Tim Says:

    AIX IBM JDKs are the other way round : the non-JIT code gives the wrong result, but once JIT kicks in, I get the correct result (using Power5 CPU).

    With AIX JDK 6 64bits :

    java -version
    javac java version “1.6.0”
    Java(TM) SE Runtime Environment (build pap6460sr6-20090925_01(SR6))
    IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 AIX ppc64-64 jvmap6460sr6-20090923_42924 (JIT enabled, AOT enabled)
    J9VM – 20090923_042924
    JIT – r9_20090902_1330ifx1
    GC – 20090817_AA)
    JCL – 20090924_01

    javac -version
    javac 1.6.0-internal

    Iteration 15 converts as 0x0.0000008000001p-1022
    Iteration 16 converts as 0x0.0000008p-1022

    With AIX JDK 6 32 bits, I get the same result (the iteration number is roughly the same). The number of iteration can be different when I run it several times, but it is always first 0x0.0000008000001p-1022, then 0x0.0000008p-1022.

    Using JDK 5 (both 32 bits and 64 bits), the change occurs later (around iteration 700 or 300). I guess the code which decides when to use JIT as been changed between JDK 5 and JDK 6 to use JIT sooner.

    Using JDK 6 (64 bits and 32 bits) without the JIT compilation (i.e. running “java -Xint”) makes the conversion always the same (i.e. no output). Adding a printf outside of the loop, I see that it converts to 0x0.0000008000001p-1022.

    Adding “strictfp” does not change anything with JDK 6 (64 bits and 32 bits), with or without JIT.

    Does it mean I should add a loop around every floating point computation in my code to have the correct result ? ;)

  7. Rick Regan Says:

    @Tim,

    Interesting. I’m not too familiar with the Power floating-point architecture — what instruction-level mechanism could cause this? It’d be interesting to see the assembler (I don’t know if your compiler accepts hexadecimal floating-point literals, but maybe you could try my C program — with appropriate compiler options of course).

    As for strictfp, I think we’ve determined it only has effect when added to Java’s FloatingDecimal class.

  8. Roger Golliver Says:

    You don’t need to use SSE to get correct strictfp behavior.
    But, one has to be careful, see:
    http://www.open-std.org/JTC1/SC22/JSG/docs/m3/docs/jsgn326.pdf

Leave a Comment

(To reduce spam, cookies must be enabled)


css.php