Articles with the ‘Convert to binary’ Tag - Exploring Binary

Bigcomp: Deciding Truncated, Near Halfway Conversions

By Rick Regan December 2nd, 2011

In my article “Using Integers to Check a Floating-Point Approximation,” I briefly mentioned “bigcomp,” an optimization strtod() uses to reduce big integer overhead when checking long decimal inputs. bigcomp does a floating-point to decimal conversion — right in the middle of a decimal to floating-point conversion mind you — to generate the decimal expansion of the number halfway between two target floating-point numbers. This decimal expansion is compared to the input decimal string, and the result of the comparison dictates which of the two target numbers is the correctly rounded result.

In this article, I’ll explain how bigcomp works, and when it applies. Also, I’ll talk briefly about its performance; my informal testing shows that, under the default setting, bigcomp actually worsens performance for some inputs.

Continue reading “Bigcomp: Deciding Truncated, Near Halfway Conversions”

Using Integers to Check a Floating-Point Approximation

By Rick Regan October 31st, 2011

For decimal inputs that don’t qualify for fast path conversion, David Gay’s strtod() function does three things: first, it uses IEEE double-precision floating-point arithmetic to calculate an approximation to the correct result; next, it uses arbitrary-precision integer arithmetic (AKA big integers) to check if the approximation is correct; finally, it adjusts the approximation, if necessary. In this article, I’ll explain the second step — how the check of the approximation is done.

Continue reading “Using Integers to Check a Floating-Point Approximation”

strtod()’s Initial Decimal to Floating-Point Approximation

By Rick Regan September 27th, 2011

David Gay’s strtod() function does decimal to floating-point conversion using both IEEE double-precision floating-point arithmetic and arbitrary-precision integer arithmetic. For some inputs, a simple IEEE floating-point calculation suffices to produce the correct result; for other inputs, a combination of IEEE arithmetic and arbitrary-precision arithmetic is required. In the latter case, IEEE arithmetic is used to calculate an approximation to the correct result, which is then refined using arbitrary-precision arithmetic. In this article, I’ll describe the approximation calculation, which is based on a form of binary exponentiation.

Continue reading “strtod()’s Initial Decimal to Floating-Point Approximation”

Fast Path Decimal to Floating-Point Conversion

By Rick Regan September 1st, 2011

In general, to convert an arbitrary decimal number into a binary floating-point number, arbitrary-precision arithmetic is required. However, a subset of decimal numbers can be converted correctly with just ordinary limited-precision IEEE floating-point arithmetic, taking what I call the fast path to conversion. Fast path conversion is an optimization used in practice: it’s in David Gay’s strtod() function and in Java’s FloatingDecimal class. I will explain how fast path conversion works, and describe the set of numbers that qualify for it.

Continue reading “Fast Path Decimal to Floating-Point Conversion”

Why Powers of Ten Up to 10²² Are Exact As Doubles

By Rick Regan August 11th, 2011

The fast path in David Gay’s decimal to floating-point conversion routine relies on this property of the first twenty-three nonnegative powers of ten: they have exact representations in double-precision floating-point. While it’s easy to see why powers of ten up to 10¹⁵ are exact, it’s less clear why the powers of ten from 10¹⁶ to 10²² are. To see why, you have to look at their binary representations.

Continue reading “Why Powers of Ten Up to 10²² Are Exact As Doubles”

Correct Decimal To Floating-Point Using Big Integers

By Rick Regan August 3rd, 2011

Producing correctly rounded decimal to floating-point conversions is hard, but only because it is made to be done efficiently. There is a simple algorithm that produces correct conversions, but it’s too slow — it’s based entirely on arbitrary-precision integer arithmetic. Nonetheless, you should know this algorithm, because it will help you understand the highly-optimized conversion routines used in practice, like David Gay’s strtod() function. I will outline the algorithm, which is easily implemented in a language like C, using a “big integer” library like GMP.

Ratio of Big Integers (2^119/10^20) Producing the 53-Bit Significand of 1e-20 — Ratio of Big Integers (2¹¹⁹/10²⁰) Producing the 53-Bit Significand of 1e-20

Continue reading “Correct Decimal To Floating-Point Using Big Integers”

How I Taught Third Graders Binary Numbers

By Rick Regan June 8th, 2011

Last week I introduced my son’s third grade class to binary numbers. I wanted to build on my prior visit, where I introduced them to the powers of two. By teaching them binary, I showed them that place value is not limited to base ten, and that there is a difference between numbers and numerals.

My presentation was based on base-ten-block-like imagery, since I knew the students were comfortable expressing numbers with base ten blocks. I thought extending the block model to other bases would work well. I think it did.

The Number Twenty-Seven, Broken Into Powers of Two — The Number Twenty-Seven in Tape Flags, Broken Into Powers of Two

Continue reading “How I Taught Third Graders Binary Numbers”

Pi and e In Binary

By Rick Regan May 18th, 2011

Some people are curious about the binary representations of the mathematical constants pi and e. Mathematically, they’re like every other irrational number — infinite strings of 0s and 1s (with no discernible pattern). In a computer, they’re finite, making them only approximations to their true values. I will show you what their approximations look like in five different levels of binary floating-point precision.

Continue reading “Pi and e In Binary”

The Four Stages of Floating-Point Competence

By Rick Regan April 20th, 2011

The four stages of competence model describes the phases you go through when acquiring a skill:

Unconscious incompetence: You don’t know what you don’t know.
Conscious incompetence: You know what you don’t know.
Conscious competence: You know what you know.
Unconscious competence: You don’t know what you know.

I’ve applied this model to assess competence in binary floating-point arithmetic; let’s see where you stand:

Continue reading “The Four Stages of Floating-Point Competence”

Properties of the Correction Loop in David Gay’s strtod()

By Rick Regan March 23rd, 2011

The infinite loop I discovered in PHP was caused by a bug in its decimal to floating-point conversion routine, which is based on David Gay’s widely used strtod() function. strtod() has a “correction loop,” the purpose of which is to refine an initial estimate of a converted double-precision value to its correctly rounded result. This got me thinking: infinite loops notwithstanding, how many times should the loop execute? Does it depend on the accuracy of the initial estimate? I instrumented strtod() and gathered some data to help answer these questions.

The most interesting thing I discovered was this: strtod()’s correction procedure can execute at most three times. So why was it coded as an infinite loop?

Continue reading “Properties of the Correction Loop in David Gay’s strtod()”

A Better Fix for the PHP 2.2250738585072011e-308 Bug

By Rick Regan February 24th, 2011

Recently I discovered a bug in PHP’s decimal to floating-point conversion routine, zend_strtod(): it went into an infinite loop trying to convert the decimal string 2.2250738585072011e-308 to floating-point. zend_strtod() is based on David Gay’s strtod() function in dtoa.c, as are the decimal to floating-point conversion routines of many other open source projects. So why hasn’t this bug affected these other projects?

zend_strtod() is based on a very old copy of dtoa.c. The current version of dtoa.c is immune to the 2.2250738585072011e-308 bug — and has been since 1997 by my reckoning. So while the ‘volatile’ keyword fixes the PHP problem, I think there’s a better solution: upgrade zend_strtod() to the latest dtoa.c.

Continue reading “A Better Fix for the PHP 2.2250738585072011e-308 Bug”

Nondeterministic Floating-Point Conversions in Java

By Rick Regan February 21st, 2011

Recently I discovered that Java converts some very small decimal numbers to double-precision floating-point incorrectly. While investigating that bug, I stumbled upon something very strange: Java’s decimal to floating-point conversion routine, Double.parseDouble(), sometimes returns two different results for the same decimal string. The culprit appears to be just-in-time compilation of Double.parseDouble() into SSE instructions, which exposes an architecture-dependent bug in Java’s conversion algorithm — and another real-world example of a double rounding on underflow error. I’ll describe the problem, and take you through the detective work to find its cause.

Continue reading “Nondeterministic Floating-Point Conversions in Java”