Articles with the ‘Floating-point’ Tag

Numbers Greater Than DBL_MAX Should Convert To DBL_MAX When Rounding Toward Zero

By Rick Regan April 21st, 2024

I was testing David Gay’s most recent fixes to strtod() with different rounding modes and discovered that Apple Clang C++ (Xcode) and Microsoft Visual Studio C++ produce incorrect results for round towards zero and round down modes: their strtod()s convert numbers greater than DBL_MAX to infinity, not DBL_MAX. At first I thought Gay’s strtod() was wrong, but Dave pointed out that the IEEE 754 spec requires such conversions to be monotonic.

Continue reading “Numbers Greater Than DBL_MAX Should Convert To DBL_MAX When Rounding Toward Zero”

NaNs, Infinities, and Negative Numbers In Loan Calculators

By Rick Regan February 6th, 2024

I’ve encountered several NaNs over the years in the normal course of using various websites and apps. I’ve only documented two of them: one in a media player, and one in a podcast app. I recently ran into another one using a loan calculator website. Rather than reporting on just that one, I decided to look for more and report on anything I found all at once.

I found many more errors — NaNs, but also infinites, negative numbers, and one called “incomplete data”, whatever that means — all on websites within the top Google search results for “loan calculator”. All I had to do to elicit these errors was to enter large numbers. (And in one case, simply including a dollar sign.) All of the errors arise from the use of floating-point arithmetic combined with unconstrained input values. Some sites even let me enter numbers in scientific notation, like 1e308, or even displayed them as results.

Floating point error in loan calculator

Continue reading “NaNs, Infinities, and Negative Numbers In Loan Calculators”

Incorrect Hexadecimal to Floating-Point Conversions in David Gay’s strtod()

By Rick Regan January 30th, 2024

I wrote about Visual C++ incorrectly converting hexadecimal constants at the normal/subnormal double-precision floating-point boundary. It turns out that David Gay’s strtod() also has a problem with the same inputs, converting them all to 0 instead of 0x1p-1022.

I have emailed Dave Gay to report the problem; I will update this post when he responds.

Update 2/28/24

The bug is now fixed (after several iterations) and is available in netlib. Thanks to Dave Gay for the quick turnaround.

Typically, I use Gay’s strtod() to check the outputs of other conversion routines (examples: Visual C++, GCC and GLIBC, PHP). In this case, the roles were reversed: I used another C implementation, Apple Clang C++ , to check Gay’s strtod().

Incorrect Hexadecimal to Floating-Point Conversions in Visual C++

By Rick Regan January 26th, 2024

Martin Brown, through a referral on his Stack Overflow question, contacted me about incorrect hexadecimal to floating-point conversions he found in Visual C++, specifically conversions using strtod() at the normal/subnormal double-precision floating-point boundary. I confirmed his examples, and also found an existing problem report for the issue. It is not your typical “off by one ULP due to rounding” conversion error; it is a conversion returning 0 for a non-zero input or returning numbers with exponents off by binary orders of magnitude.

Continue reading “Incorrect Hexadecimal to Floating-Point Conversions in Visual C++”

ChatGPT Writes Decent Code For Binary to Decimal Conversion

By Rick Regan February 14th, 2023

OK, enough just playing around with ChatGPT; let’s see if it can write some code:

Rick tells ChatGPT ‘Write Kotlin code to convert a binary number represented as a string to base 10’ and ChatGPT answers
Part two of Chat GPT's answer

Continue reading “ChatGPT Writes Decent Code For Binary to Decimal Conversion”

Anomalies In IntelliJ Kotlin Floating-Point Literal Inspection

By Rick Regan July 19th, 2022

IntelliJ IDEA has a code inspection for Kotlin that will warn you if a decimal floating-point literal exceeds the precision of its type (Float or Double). It will suggest an equivalent literal (one that maps to the same binary floating-point number) that has fewer digits, or has the same number of digits but is closer to the floating-point number.

Screenshot in IntelliJ IDEA of hovering over a flagged 17-digit literal with a suggested 10-digit replacement — Hovering over a flagged 17-digit literal suggests a 10-digit replacement.

For Doubles for example, every literal over 17-digits should be flagged, since it never takes more than 17 digits to specify any double-precision binary floating-point value. Literals with 16 or 17 digits should be flagged if there is a replacement that is shorter or closer. And no literal with 15 digits or fewer should ever be flagged, since doubles have of 15-digits of precision.

But IntelliJ doesn’t always adhere to that, like when it suggests an 18-digit replacement for a 13-digit literal!

Screenshot of IntelliJ IDEA suggesting an 18-digit replacement for a 13-digit literal — An 18-digit replacement suggested for a 13-digit literal!

Continue reading “Anomalies In IntelliJ Kotlin Floating-Point Literal Inspection”

Another NaN In the Wild

By Rick Regan March 1st, 2021

I see these from time to time, but I don’t always capture them; here’s one I saw recently while playing a podcast:

A NaN in an ad in the app Castbox (partial image) — A NaN in an ad in the app Castbox (click for full image).

(According to Castbox, this is an error in the ad and is out of their control.)

Direct Generation of Double Rounding Error Conversions in Kotlin

By Rick Regan November 10th, 2020

For my recent search for short examples of double rounding errors in decimal to double to float conversions I wrote a Kotlin program to generate and test random decimal strings. While this was sufficient to find examples, I realized I could do a more direct search by generating only decimal strings with the underlying double rounding error bit patterns. I’ll show you the Java BigDecimal based Kotlin program I wrote for this purpose.

Continue reading “Direct Generation of Double Rounding Error Conversions in Kotlin”

Double Rounding Errors in Decimal to Double to Float Conversions

By Rick Regan November 5th, 2020

In my previous exploration of double rounding errors in decimal to float conversions I showed two decimal numbers that experienced a double rounding error when converted to float (single-precision) through an intermediate double (double-precision). I generated the examples indirectly by setting bit combinations that forced the error, using their corresponding exact decimal representations. As a result, the decimal numbers were long (55 digits each). Mark Dickinson derived a much shorter 17 digit example, but I hadn’t contemplated how to generate even shorter numbers — or whether they existed at all — until Per Vognsen wrote me recently to ask.

The easiest way for me to approach Per’s question was to search for examples, rather than try to find a way to construct them. As such, I wrote a simple Kotlin¹ program to generate decimal strings and check them. I tested all float-range (including subnormal) decimal numbers of 9 digits or fewer, and tens of billions of random 10 to 17 digit float-range (normal only) numbers. I found example 7 to 17 digit numbers that, when converted to float through a double, suffer a double rounding error.

Continue reading “Double Rounding Errors in Decimal to Double to Float Conversions”

Maximum Number of Decimal Digits In Binary Floating-Point Numbers

By Rick Regan December 5th, 2016

I’ve written about the formulas used to compute the number of decimal digits in a binary integer and the number of decimal digits in a binary fraction. In this article, I’ll use those formulas to determine the maximum number of digits required by the double-precision (double), single-precision (float), and quadruple-precision (quad) IEEE binary floating-point formats.

The maximum digit counts are useful if you want to print the full decimal value of a floating-point number (worst case format specifier and buffer size) or if you are writing or trying to understand a decimal to floating-point conversion routine (worst case number of input digits that must be converted).

Continue reading “Maximum Number of Decimal Digits In Binary Floating-Point Numbers”

17 Digits Gets You There, Once You’ve Found Your Way

By Rick Regan June 27th, 2016

Every double-precision floating-point number can be specified with 17 significant decimal digits or less. A simple way to generate this 17-digit number is to round the full-precision decimal value of the double to 17 digits. For example, the double-precision value 0x1.6d4c11d09ffa1p-3, which in decimal is 1.783677474777478899614635565740172751247882843017578125 x 10^-1, can be recovered from the decimal floating-point literal 1.7836774747774789e-1. The extra digits are unnecessary, since they will only take you to the same double.

On the other hand, an arbitrary, arbitrarily long decimal literal rounded or truncated to 17 digits may not convert to the double-precision value it’s supposed to. This is a subtle point, one that has even tripped up implementers of widely used decimal to floating-point conversion routines (glibc strtod() and Visual C++ strtod(), for example).

Continue reading “17 Digits Gets You There, Once You’ve Found Your Way”

Decimal Precision of Binary Floating-Point Numbers

By Rick Regan June 9th, 2016

How many decimal digits of precision does a binary floating-point number have?

For example, does an IEEE single-precision binary floating-point number, or float as it’s known, have 6-8 digits? 7-8 digits? 6-9 digits? 6 digits? 7 digits? 7.22 digits? 6-112 digits? (I’ve seen all those answers on the Web.)

Continue reading “Decimal Precision of Binary Floating-Point Numbers”