Bigcomp: Deciding Truncated, Near Halfway Conversions

http://www.exploringbinary.com/bigcomp-deciding-truncated-near-halfway-conversions/


In my article “Using Integers to Check a Floating-Point Approximation,” I briefly mentioned “bigcomp,” an optimization strtod() uses to reduce big integer overhead when checking long decimal inputs. bigcomp does a floating-point to decimal conversion — right in the middle of a decimal to floating-point conversion mind you — to generate the decimal expansion of the number halfway between two target floating-point numbers. This decimal expansion is compared to the input decimal string, and the result of the comparison dictates which of the two target numbers is the correctly rounded result.

In this article, I’ll explain how bigcomp works, and when it applies. Also, I’ll talk briefly about its performance; my informal testing shows that, under the default setting, bigcomp actually worsens performance for some inputs.

When Bigcomp Applies

Bigcomp is enabled by default in strtod() (it can be disabled by “#defining” the variable “NO_STRTOD_BIGCOMP”). It applies to decimal strings longer than STRTOD_DIGLIM significant digits, which by default is “#defined” to 40. For decimal inputs dStr of 40 significant digits or less, the normal approximation check is done, using all significant digits of dStr for its big integer representation d; for inputs longer than 40 digits, only the first 18 digits are used for d. This truncation of d leads to smaller big integers, but it can lead to cases where the approximation check cannot decide between adjacent floating-point values. When that happens, the bigcomp procedure kicks in, through a call to the bigcomp() function.

Specifically, bigcomp() is called when the approximation check cannot decide between the approximation b and its successor, b + u. (As with all my articles in this series, I will only be discussing how strtod() works for round-to-nearest/round-half-to-even rounding.) This happens when the truncated value of d falls between b and b + h; that is, between b and one-half ULP above b.

(In this article, I distinguish between the input dStr and its big integer representation d, which may or may not be a truncation of dStr.)

Overview of the Algorithm

bigcomp() compares the significant digits of dStr to a string representation of the significant digits of b + h. It goes left-to-right, digit by significant digit, stopping when the two strings differ. (In the bulk of cases, the difference occurs between the 16th and 18th digits, although in the worst case the difference can occur at the 54th digit; it depends on how close the decimal number is to b + h.) If dStr is less than b + h, bigcomp() returns b; if dStr is greater than b + h, bigcomp() returns b + u; if dStr equals b + h, either b or b + u is chosen, according to round-half-to-even rounding.

bigcomp() generates the significant digits of b + h using essentially the same algorithm as the dtoa() function, the floating-point to decimal conversion routine co-located in David Gay’s dtoa.c. It uses big integer arithmetic, including big integer division.

bigcomp() starts by creating a fraction that represents the significant digits of b + h (not b + h itself). In this form, it generates a digit by multiplying the numerator of the fraction by ten, and then using integer division to form a quotient and remainder. The quotient represents the next digit; the remainder represents the remaining digits, which are accessed one at a time by repeating this process. Digits are generated until there is a mismatch between dStr and b + h.

The Algorithm, By Example

I’ll break the algorithm into five steps, and illustrate it with two examples. I set STRTOD_DIGLIM to 19 so I could generate shorter examples (both are 20 digits).

Example 1: Check 1.3694713649464322631e-11

In this example, we want to check if dStr = 1.3694713649464322631e-11 is closer to b = 0x1.e1d703bb5749cp-37 or b + u = 0x1.e1d703bb5749dp-37. We do this by comparing dStr to b + h = 0x1.e1d703bb5749c8p-37. (I describe b + h using a hexadecimal floating-point constant, even though it won’t be represented as an IEEE floating-point number. This number has 54 bits, with bit 54 being a 1. It is b with a hexadecimal ‘8’ tacked on at the end.)

1. Represent b + h as an integer times a power of two

We start by representing b as a 53-bit integer times a power of two, as we do in the basic approximation check:

b = bInt x 2bExp 
  = 8476617176609948 x 2-89

Since bInt is 53-bit integer, u = 2bExp, and h is 2bExp-1. Therefore,

b + h = bInt x 2bExp + 2bExp-1
      = (bInt x 2 + 1) x 2bExp-1
      = (8476617176609948 x 2 + 1) x 2-90
      = 16953234353219897 x 2-90

In code: We create bInt and bExp from b’s underlying floating-point representation: we treat its significand as a 53-bit integer, and subtract 52 from its power of two exponent. We create a big integer from bInt, and then we make it the integer portion of b + h by shifting it left and ‘ORing’ a 1 into its least significant bit. We maintain the exponent of the power of two as a native integer, setting it to bExp-1.

2. Scale b + h so its first significant digit is just to the left of the decimal point

If you were to compute b + h from the expression above, you’d see its value is

0.0000000000136947136494643226044416541235590733258456475063269408565
24743139743804931640625

(We’re not going to do that; I just wanted to show you where we’re headed. By the way, I computed it using PARI/GP.)

What we want to do is scale b + h so that it looks like

1.3694713649464322604441654123559073325845647506326940856524743139743
804931640625

In other words, we want to multiply it by a power of ten so that, when viewed as a decimal number, its significant digits look normalized; 1011 does the trick, effectively shifting b + h left by 11 decimal places:

Scaled b + h = 16953234353219897 x 2-90 x 1011

This “normalization” sets up the digit generation algorithm.

In code: A full-blown floating-point to decimal conversion would need to use logarithms to compute the power of ten; but in this case, we can obtain the power of ten exponent directly from dStr.

3. Combine powers of two

If we decompose the power of ten into a power of two and a power of five, we get

Scaled b + h = 16953234353219897 x 2-90 x 211 x 511

We can now combine powers of two to reduce the size of the resulting big integers.

Scaled b + h = 16953234353219897 x 2-79 x 511

In code: The code is simple: just add the exponents of the power of two from b + h and the power of two from the power of ten scaling factor.

4. Express scaled b + h as a fraction

The negative power of two, 2-79, makes this a fraction:

Scaled b + h = (16953234353219897 x 511)/279
             = 827794646153315283203125/604462909807314587353088

In code: Compute the powers as big integers, using the absolute values of the exponents; then, multiply the terms out, as appropriate.

5. Generate and check digits

Scaled b + h has a numerator num and denominator den:

num = 827794646153315283203125
den = 604462909807314587353088

The significant digits are generated from these integers using the “repeated multiplication by ten” algorithm; this is what’s done at each iteration (dig stands for ‘digit’, and rem stands for ‘remainder’):

  1. dig = num/den (just the integer part of the quotient)
  2. rem = numdig x den
  3. num = rem x 10

In other words, at each iteration, a digit is peeled off and the rest of the number is shifted left one decimal place. This step is repeated until a corresponding digit from dStr and b + h don’t match; here are the first three iterations:

1. 827794646153315283203125/den  = 1 rem 223331736346000695850037
2. 2233317363460006958500370/den = 3 rem 419928634038063196441106
3. 4199286340380631964411060/den = 6 rem 572508881536744440292532
4. ...

This stops after step 19; that is, dStr and b + h differ at the 19th digit. Here are the digits of dStr and b + h aligned, to show where they differ:

Digits of dStr  = 13694713649464322631
Digits of b + h = 1369471364946432260444165412355907332584564750632
6940856524743139743804931640625

Since 3 is greater than 0, dStr is closer to b + u than it is to b, so b + u is the answer.

Example 2: Check 9.3170532238714134438e+16

In this example, we want to check if dStr = 9.3170532238714134438e+16 is closer to b = 0x1.4b021afd9f651p+56 or b + u = 0x1.4b021afd9f652p+56. We do this by comparing dStr to b + h = 0x1.4b021afd9f6518p+56.

1. Represent b + h as an integer times a power of two

b = bInt x 2bExp 
  = 5823158264919633 x 24

b + h = (bInt x 2 + 1) x 2bExp-1
      = (5823158264919633 x 2 + 1) x 23
      = 11646316529839267 x 23

2. Scale b + h so its first significant digit is just to the left of the decimal point

If you were to compute b + h from the expression above, you’d see its value is

93170532238714136

We want to shift it right 16 decimal places so it looks like

9.3170532238714136

Scaled b + h = 11646316529839267 x 23 x 10-16

3. Combine powers of two

If we decompose the power of ten into a power of two and a power of five, we get

Scaled b + h = 11646316529839267 x 23 x 2-16 x 5-16

Combining powers of two we get

Scaled b + h = 11646316529839267 x 2-13 x 5-16

4. Express scaled b + h as a fraction

The negative powers of two and five make this a fraction:

Scaled b + h = 11646316529839267/(213 x 516)
             = 11646316529839267/1250000000000000

5. Generate and check digits

Scaled b + h has a numerator num and denominator den:

num = 11646316529839267
den = 1250000000000000

Here are the first three iterations of the repeated multiplication by ten algorithm:

1. 11646316529839267/den = 9 rem 396316529839267
2. 3963165298392670/den  = 3 rem 213165298392670
3. 2131652983926700/den  = 1 rem 881652983926700
4. ...

This stops after step 17; that is, dStr and b + h differ at the 17th digit. Here are the digits of dStr and b + h aligned, to show where they differ:

Digits of dStr  = 93170532238714134438
Digits of b + h = 93170532238714136

Since 4 is less than 6, dStr is closer to b than it is to b + u, so b is the answer.

Discussion

strtod() has an extra step, which I’ve not described. It scales the numerator and denominator by a common power of two, which is necessary for its ‘quorem()’ function to compute the quotient and remainder properly. The power of two exponent is determined by a function called ‘dshift().’

If dStr represents a fractional value, step 2 (scale by power of ten) and step 3 (combine powers of two) are technically unnecessary. Without them, step 5 (generate digits) would generate leading zeros, which could be ignored. (Of course, the pre-scaling is more elegant, as it reduces everything to one form.)

The 40-Digit Cutoff Is Too Low

bigcomp() does a lot of work with big integers, which made me wonder why it performed better than the basic approximation check. I ran some tests, and found that it doesn’t perform better — not at the default STRTOD_DIGLIM value of 40! My tests indicate that bigcomp() only performs better starting at somewhere between 80 and 83 digits. (I don’t know why 40 was chosen as the default.) bigcomp() outperforms the basic check for really long inputs; for example, the difference in performance is great at 500 digits, and substantial at 1000 digits.

Let’s look at a 1000-digit input to see the difference in the big integers involved:

dStr = 5.39393324717760434968487301635560570642059608762708229108908
24537754965408622951477281124864324524316631524646435056101989215441
13243584680626358335990970896315873327586087827333915843584231417344
18113410301502801068079096892041478634628161830670245841932386407800
12745803879948980583567722824049594060350336856210415679895612943123
50098271611832693583464221850190462604938332826437767915224673375308
74570398083131101089815004834148500912320646940469604931721018872877
13566715387976682236017863055187831755129573243931855002598816011501
73130862094578091926124182802567489337171671008075773833668718543916
59331687129078431660248055444327878892028873933579030052796118406453
59419797286387142281725295340184928613290361074947899276429780250293
85401861377815243939607988608126354001619571808095966555370152710852
57007714188421395421027623995592231545518458559072989336658831420578
08271261804364513313549623924640140003927894134813865873451090057841
21193123838848522389829029095139051820951352412966707477e+16

Without bigcomp, using the basic approximation check, these are the variables involved:

dS = 539393324717760434968487301635560570642059608762708229108908245
37754965408622951477281124864324524316631524646435056101989215441132
43584680626358335990970896315873327586087827333915843584231417344181
13410301502801068079096892041478634628161830670245841932386407800127
45803879948980583567722824049594060350336856210415679895612943123500
98271611832693583464221850190462604938332826437767915224673375308745
70398083131101089815004834148500912320646940469604931721018872877135
66715387976682236017863055187831755129573243931855002598816011501731
30862094578091926124182802567489337171671008075773833668718543916593
31687129078431660248055444327878892028873933579030052796118406453594
19797286387142281725295340184928613290361074947899276429780250293854
01861377815243939607988608126354001619571808095966555370152710852570
07714188421395421027623995592231545518458559072989336658831420578082
71261804364513313549623924640140003927894134813865873451090057841211
93123838848522389829029095139051820951352412966707477

bS = 539393324717760400000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000

hS = 400000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000

|dS-bS| = 3496848730163556057064205960876270822910890824537754965408
62295147728112486432452431663152464643505610198921544113243584680626
35833599097089631587332758608782733391584358423141734418113410301502
80106807909689204147863462816183067024584193238640780012745803879948
98058356772282404959406035033685621041567989561294312350098271611832
69358346422185019046260493833282643776791522467337530874570398083131
10108981500483414850091232064694046960493172101887287713566715387976
68223601786305518783175512957324393185500259881601150173130862094578
09192612418280256748933717167100807577383366871854391659331687129078
43166024805544432787889202887393357903005279611840645359419797286387
14228172529534018492861329036107494789927642978025029385401861377815
24393960798860812635400161957180809596655537015271085257007714188421
39542102762399559223154551845855907298933665883142057808271261804364
51331354962392464014000392789413481386587345109005784121193123838848
522389829029095139051820951352412966707477

|dS-bS| is less than hS, so b is chosen.

With bigcomp, there are two sets of big integers involved: one set for the basic approximation check (on the truncated value), and one set for bigcomp. Here are the big integers used for the basic approximation check:

dS = 539393324717760434
bS = 539393324717760400
hS = 40
|dS-bS| = 34

Here are the big integers used for bigcomp (not shown is the extra factor of 28 in the numerator and denominator due to dshift()):

num = 13484833117944011
den = 2500000000000000

bigcomp() finds that digit 17 of dStr (3) is less than digit 17 of b + h (4), so it chooses b.

Acknowledgement

Thanks to Mark Dickinson for his commentary on bigcomp, both in Python’s version of David Gay’s dtoa.c, and in private communication.

Dingbat

Leave a Comment

(To reduce spam, cookies must be enabled)