GCC Conversions Are Incorrect, Architecture or Otherwise

Recently I wrote about my retesting of the gcc C compiler’s string to double conversions and how it appeared that its incorrect conversions were due to an architecture-dependent bug. My examples converted incorrectly on 32-bit systems, but worked on 64-bit systems — at least most of them. I decided to dig into gcc’s source code and trace its execution, and I found the architecture dependency I was looking for. But I found more than that: due to limited precision, gcc will do incorrect conversions on any system. I’ve constructed an example to demonstrate this.

(Update 12/3/13: GCC now does correct conversions.)

Architecture Dependency AND Limited Precision

gcc’s conversion routine lives in gcc/real.c and real.h. In real.h, there’s this line:

#define SIGNIFICAND_BITS	(128 + HOST_BITS_PER_LONG)

SIGNIFICAND_BITS defines the precision used in the conversion; there are two problems with its definition:

  • Architecture Dependency. On 32-bit systems, or at least on my Intel Core i5 running the 32-bit version of Ubuntu Linux 12.04.3 with gcc 4.6.3, HOST_BITS_PER_LONG is 32; on 64-bit systems, or at least on my Intel Core i5 running the 64-bit version of Ubuntu Linux 12.04.3 with gcc 4.6.3, HOST_BITS_PER_LONG is 64. That makes SIGNIFICAND_BITS equal to 160 or 192 bits, respectively.
  • Limited Precision. 160 or 192 bits is not enough precision to correctly round every decimal string to a double-precision value.

I don’t know the rationale for choosing this constant, but I know that if I increase it and recompile gcc, it converts the example below correctly.
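
To get a feel for the numbers, here is a small sketch of my own, not gcc’s code: it approximates HOST_BITS_PER_LONG as the host’s long size and prints the resulting SIGNIFICAND_BITS. gcc’s build actually defines HOST_BITS_PER_LONG from its configuration, but the values work out the same on typical 32-bit and 64-bit Linux hosts.

// gcc -o significandBits significandBits.c
// ./significandBits
#include <limits.h>
#include <stdio.h>

int main (void)
{
 /* Approximation: gcc's configure determines this, but it matches
    the host's long size on ordinary Linux systems */
 unsigned host_bits_per_long = CHAR_BIT * sizeof(long);
 unsigned significand_bits = 128 + host_bits_per_long;

 printf("HOST_BITS_PER_LONG (approx.) = %u\n", host_bits_per_long);
 printf("SIGNIFICAND_BITS             = %u\n", significand_bits);
 return 0;
}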

Example

My previous testing of gcc had not uncovered any incorrect conversions on my 64-bit system. (Such are the pitfalls of random testing — missed corner cases.) But after looking at the code, it was easy to construct an example:

5.0216813883093451685872615018317116712748411717802652598273e58

That 59-digit decimal number converts to this 196-bit binary number:

1000000000000000000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001

Bit 54, the first bit beyond a double’s 53-bit significand, is a 1, but we need to see through bit 196 in order to round properly (up). The correctly rounded result, as a hexadecimal floating-point constant, is 0x1.0000000000001p+195.

This is how gcc represents the number in its limited precision:

1000000000000000000000000000000000000000000000000000001111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111110000

It thinks bit 54 is 0, so it rounds down to 0x1p+195.
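
The two candidate results are adjacent doubles; the correctly rounded value is exactly one unit in the last place above gcc’s value. This short sketch of mine (not part of the article’s test) verifies that with nextafter():

// gcc -o oneUlpApart oneUlpApart.c -lm
// ./oneUlpApart
#include <math.h>
#include <stdio.h>

int main (void)
{
 double gcc_result = 0x1p+195;                   /* rounded down (incorrect) */
 double correct_result = 0x1.0000000000001p+195; /* rounded up (correct) */

 /* The next representable double above gcc's result is the correct one */
 printf("next double above 0x1p+195 = %a\n",
        nextafter(gcc_result, INFINITY));
 printf("difference                 = %a\n",
        correct_result - gcc_result);
 return 0;
}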

Try It Out

Here’s a C program you can use to confirm incorrect rounding on your system (it is formatted this way to fit within this article):

// gcc -o gccIncorrect gccIncorrect.c
// ./gccIncorrect
#include <stdio.h>

int main (void)
{
 char* decimal =
 "5.0216813883093451685872615018317116712748411717802652598273e58";
 double d =
 5.0216813883093451685872615018317116712748411717802652598273e58;

 printf("%s\n",decimal);
 printf(" Correct = 0x1.0000000000001p+195\n");
 printf(" gcc =     %a\n",d);
}

This is the output on my system (the last line shows the incorrect gcc conversion):

5.0216813883093451685872615018317116712748411717802652598273e58
 Correct = 0x1.0000000000001p+195
 gcc =     0x1p+195
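
A related cross-check, and my own sketch rather than part of the original test: convert the same string at run time with the C library’s strtod() and compare it to the compile-time constant. If your strtod() is correctly rounded (glibc’s conversion is the subject of the article mentioned in the comments below), a mismatch points at the compiler:

// gcc -o gccVsStrtod gccVsStrtod.c
// ./gccVsStrtod
#include <stdio.h>
#include <stdlib.h>

int main (void)
{
 const char* decimal =
 "5.0216813883093451685872615018317116712748411717802652598273e58";

 double compile_time =  /* converted by gcc at compile time */
 5.0216813883093451685872615018317116712748411717802652598273e58;

 double run_time = strtod(decimal, NULL); /* converted by the C library */

 printf(" compile time (gcc)  = %a\n", compile_time);
 printf(" run time (strtod)   = %a\n", run_time);
 printf(" %s\n", compile_time == run_time ? "match" : "mismatch");
 return 0;
}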

Bug Report

This is an open problem; here is the existing bug report, which I have amended.


6 comments

  1. Pascal,

    Yes, it’s been in the back of my mind since I wrote the first article, but I just never had the time to investigate. I was puzzled how there was an architecture dependency, yet floating-point was not involved. Who knew it would boil down to integer size!

  2. Hi Rick, it seems to me that not only is the bit buffer chosen too small, but that there is also a fundamental flaw in gcc’s conversion algorithm. I have created some test data composed of an exact tie (1.5 + 2^(-53)) plus a single negative power of two, with exponents from -70 down to -1100 in steps of 2. From your investigations I had expected the converted doubles to be correct down to some exponent, and then be wrong for every more negative exponent. For gcc 4.7.1 on Windows (MinGW…), this is true: tie plus 2^(-154) is converted to 3FF8000000000001, but for all more negative exponents the result is 3FF8000000000000. On the contrary, for gcc 4.7.2 on Ubuntu, the results are correctly converted down to 2^(-610); for -612 the result is wrong, but more negative exponents do not necessarily yield incorrect results. For example, the strings corresponding to the exponents -614 to -626, -630, -634 to -636, and -640 to -656 are correctly converted, whereas those corresponding to -628, -632, -638, and -658 to -688 are wrong. But then, the result for -690 is correct again…
    BTW, the JavaScript engines of all the main web browsers except IE get all values right, including FF24 on Ubuntu! So those interested in algorithms perhaps might want to have a look into their routines…

  3. Georg,

    Remind me — is your MinGW system 32-bit, and your Ubuntu system 64-bit?

    Good timing — I just published my article about glibc’s strtod() and just started looking at gcc’s algorithm more closely to write a similar article about it. I will look at the code with your examples in mind and let you know what I find.

    Regarding conversion in the Javascript engines, I think most of them use David Gay’s strtod(). I think Chrome uses something else though (code by Florian Loitsch? I have that on my todo list).

  4. Georg,

    I confirm this behavior on my 64-bit Ubuntu system (“+2^-610” is right; “+2^-612” is wrong; “+2^-614” is right; “+2^-628” is wrong).

  5. Georg,

    I’m looking at real.c and so far I see nothing wrong other than the lack of precision. I think that could still explain the bad conversions, although it’d be nice to understand why it happens according to the “pattern” you see.

    To test this I tried to increase the precision in real.h to match, but I broke the build.

    I’ll keep my eyes open as I study the code further.
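
For anyone reproducing the checks in these comments: the values Georg quotes, 3FF8000000000000 and 3FF8000000000001, are the raw IEEE 754 bit patterns of the converted doubles. Here is a minimal sketch of mine (not from the thread) for dumping them, assuming 64-bit IEEE doubles:

// gcc -o printBits printBits.c
// ./printBits
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

static void print_bits (const char* label, double d)
{
 uint64_t bits;
 memcpy(&bits, &d, sizeof bits); /* view the double's bytes as an integer */
 printf("%s = %016" PRIX64 "\n", label, bits);
}

int main (void)
{
 print_bits("1.5 + 2^-53 rounded up  ", 0x1.8000000000001p+0);
 print_bits("1.5 + 2^-53 rounded down", 0x1.8p+0);
 return 0;
}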
