Number of Bits in a Decimal Integer

Every integer has an equivalent representation in decimal and binary. Except for 0 and 1, the binary representation of an integer has more digits than its decimal counterpart. To find the number of binary digits (bits) corresponding to any given decimal integer, you could convert the decimal number to binary and count the bits. For example, the two-digit decimal integer 29 converts to the five-digit binary integer 11101. But there’s a way to compute the number of bits directly, without the conversion.

Sometimes you want to know, not how many bits are required for a specific integer, but how many are required for a d-digit integer — a range of integers. A range of integers has a range of bit counts. For example, four-digit decimal integers require between 10 and 14 bits. For any d-digit range, you might want to know its minimum, maximum, or average number of bits. Those values can be computed directly as well.

Continue reading “Number of Bits in a Decimal Integer”

Using Integers to Check a Floating-Point Approximation

For decimal inputs that don’t qualify for fast path conversion, David Gay’s strtod() function does three things: first, it uses IEEE double-precision floating-point arithmetic to calculate an approximation to the correct result; next, it uses arbitrary-precision integer arithmetic (AKA big integers) to check if the approximation is correct; finally, it adjusts the approximation, if necessary. In this article, I’ll explain the second step — how the check of the approximation is done.

Continue reading “Using Integers to Check a Floating-Point Approximation”

strtod()’s Initial Decimal to Floating-Point Approximation

David Gay’s strtod() function does decimal to floating-point conversion using both IEEE double-precision floating-point arithmetic and arbitrary-precision integer arithmetic. For some inputs, a simple IEEE floating-point calculation suffices to produce the correct result; for other inputs, a combination of IEEE arithmetic and arbitrary-precision arithmetic is required. In the latter case, IEEE arithmetic is used to calculate an approximation to the correct result, which is then refined using arbitrary-precision arithmetic. In this article, I’ll describe the approximation calculation, which is based on a form of binary exponentiation.

Continue reading “strtod()’s Initial Decimal to Floating-Point Approximation”

Nondeterministic Floating-Point Conversions in Java

Recently I discovered that Java converts some very small decimal numbers to double-precision floating-point incorrectly. While investigating that bug, I stumbled upon something very strange: Java’s decimal to floating-point conversion routine, Double.parseDouble(), sometimes returns two different results for the same decimal string. The culprit appears to be just-in-time compilation of Double.parseDouble() into SSE instructions, which exposes an architecture-dependent bug in Java’s conversion algorithm — and another real-world example of a double rounding on underflow error. I’ll describe the problem, and take you through the detective work to find its cause.

Continue reading “Nondeterministic Floating-Point Conversions in Java”

A Closer Look at the Java 2.2250738585072012e-308 Bug

Java’s decimal to floating-point conversion routine, the doubleValue() method of its FloatingDecimal class, goes into an infinite loop when converting the decimal string 2.2250738585072012e-308 to double-precision binary floating-point. I took a closer look at the bug, by tracing the doubleValue() method in the Eclipse IDE for Java (thanks to Konstantin Preißer for helping me set that up). What I found was that our initial analysis of the bug was wrong; what actually happens is that doubleValue()’s correction loop oscillates between two values, 0x1p-1022 and 0x0.fffffffffffffp-1022.

Continue reading “A Closer Look at the Java 2.2250738585072012e-308 Bug”

Java Hangs When Converting 2.2250738585072012e-308

Konstantin Preißer made an interesting discovery, after reading my article “PHP Hangs On Numeric Value 2.2250738585072011e-308”: Java — both its runtime and compiler — go into an infinite loop when converting the decimal number 2.2250738585072012e-308 to double-precision binary floating-point. This number is supposed to convert to 0x1p-1022, which is DBL_MIN; instead, Java gets stuck, oscillating between 0x1p-1022 and 0x0.fffffffffffffp-1022, the largest subnormal double-precision floating-point number.

Continue reading “Java Hangs When Converting 2.2250738585072012e-308”

Why “Volatile” Fixes the 2.2250738585072011e-308 Bug

Recently I discovered a serious bug in x87 builds of PHP: PHP’s decimal to floating-point conversion routine, zend_strtod(), went into an infinite loop when converting the decimal string 2.2250738585072011e-308 to double-precision binary floating-point. This problem was fixed with a simple one line of code change to zend_strtod.c:

This line

double aadj, aadj1, adj;

was changed to

volatile double aadj, aadj1, adj;

Why does this fix the problem? I uncovered the very specific reason: it prevents a double rounding on underflow error.

Continue reading “Why “Volatile” Fixes the 2.2250738585072011e-308 Bug”

PHP Hangs On Numeric Value 2.2250738585072011e-308

I stumbled upon a very strange bug in PHP; this statement sends it into an infinite loop:

<?php $d = 2.2250738585072011e-308; ?>

(The same thing happens if you write the number without scientific notation — 324 decimal places.)

I hit this bug in the two places I tested for it: on Windows (PHP 5.3.1 under XAMPP 1.7.3), and on Linux (PHP Version 5.3.2-1ubuntu4.5) — both on an Intel Core Duo processor. I’ve written a bug report.

Continue reading “PHP Hangs On Numeric Value 2.2250738585072011e-308”

Fifteen Digits Don’t Round-Trip Through SQLite Reals

I’ve discovered that decimal floating-point numbers of 15 significant digits or less don’t always round-trip through SQLite. Consider this example, executed on version 3.7.3 of the pre-compiled SQLite command shell:

sqlite> create table t1(d real);
sqlite> insert into t1 values(9.944932e+31);
sqlite> select * from t1;
9.94493200000001e+31

SQLite represents a decimal floating-point number that has real affinity as a double-precision binary floating-point number — a double. A decimal number of 15 significant digits or less is supposed to be recoverable from its double-precision representation. In SQLite, however, this guarantee is not met; this is because its floating-point to decimal conversion routine is implemented in limited-precision floating-point arithmetic.

Continue reading “Fifteen Digits Don’t Round-Trip Through SQLite Reals”

The Answer is One (Unless You Use Floating-Point)

What does this C function do?

double f(double a)
{
 double b, c;

 b = 10*a - 10;
 c = a - 0.1*b;

 return (c);
}

Based solely on reading the code, you’ll conclude that it always returns 1: c = a – 0.1*(10*a – 10) = a – (a-1) = 1. But if you execute the code, you’ll find that it may or may not return 1, depending on the input. If you know anything about binary floating-point arithmetic, that won’t surprise you; what might surprise you is how far from 1 the answer can be — as far away as a large negative number!

Continue reading “The Answer is One (Unless You Use Floating-Point)”

Quick and Dirty Floating-Point to Decimal Conversion

In my article “Quick and Dirty Decimal to Floating-Point Conversion” I presented a small C program that uses double-precision floating-point arithmetic to convert decimal strings to binary floating-point numbers. The program converts some numbers incorrectly, despite using an algorithm that’s mathematically correct; its limited precision calculations are to blame. I dubbed the program “quick and dirty” because it’s simple, and overall converts reasonably accurately.

For this article, I took a similar approach to the conversion in the opposite direction — from binary floating-point to decimal string. I wrote a small C program that combines two mathematically correct algorithms: the classic “repeated division by ten” algorithm to convert integer values, and the classic “repeated multiplication by ten” algorithm to convert fractional values. The program uses double-precision floating-point arithmetic, so like its quick and dirty decimal to floating-point counterpart, its conversions are not always correct — though reasonably accurate. I’ll present the program and analyze some example conversions, both correct and incorrect.

Continue reading “Quick and Dirty Floating-Point to Decimal Conversion”

Inconsistent Rounding of Printed Floating-Point Numbers

What does this C program print?

#include <stdio.h>
int main (void)
{
 printf ("%.1f\n",0.25);
}

The answer depends on which compiler you use. If you compile the program with Visual C++ and run on it on Windows, it prints 0.3; if you compile it with gcc and run it on Linux, it prints 0.2.

The compilers — actually, their run time libraries — are using different rules to break decimal rounding ties. The two-digit number 0.25, which has an exact binary floating-point representation, is equally near two one-digit decimal numbers: 0.2 and 0.3; either is an acceptable answer. Visual C++ uses the round-half-away-from-zero rule, and gcc (actually, glibc) uses the round-half-to-even rule, also known as bankers’ rounding.

This inconsistency of printed output is not limited to C — it spans many programming environments. In all, I tested fixed-format printing in nineteen environments: in thirteen of them, round-half-away-from-zero was used; in the remaining six, round-half-to-even was used. I also discovered an anomaly in some environments: numbers like 0.15 — which look like halfway cases but are actually not when viewed in binary — may be rounded incorrectly. I’ll report my results in this article.

Continue reading “Inconsistent Rounding of Printed Floating-Point Numbers”

Copyright © 2008-2022 Exploring Binary

Privacy policy

Powered by WordPress

css.php