I have been writing about the spacing of decimal and binary floating-point numbers, and about how their relative spacing determines whether numbers round-trip between those two bases. I’ve stated an inequality that captures the required spacing, and from it I have derived formulas that specify the number of digits required for round-trip conversions. I have not proven that this inequality holds, but I will prove “half” of it here. (I’m looking for help to prove the other half.)
Base Conversion Theorem
David W. Matula states and proves this theorem, which governs round-trip conversions: Given n base B digits and m base v digits, conversions from base B round-trip through base v if and only if Bn < vm-1.
The inequality specifies the precision required of both bases to make the gaps in the original base (B) greater than or equal to the gaps in the intermediate base (v), the required condition for round-trip conversions.
Proof of the theorem requires two steps:
- If Bn < vm-1, then conversions from B round-trip through v.
- If conversions from B round-trip through v, then Bn < vm-1.
The first step shows that the inequality is a sufficient condition for round-tripping, and the second step shows that the inequality is a necessary condition for round-tripping. In other words: if the inequality holds, numbers round-trip; if numbers round-trip, the inequality holds. The full proof can be found here.
(Matula states his theorem in different terms and symbols than I’ve used. For example, he uses the phrase one-to-one conversion mapping instead of round-trip, and he defines gap in a relative, not absolute way. Also, he covers a second class of conversions, called truncation conversions.)
For our purposes, we want to apply this theorem to decimal/binary conversions, so I will restate it concretely as two separate theorems:
- Decimal-binary-decimal: d-digit decimal numbers round-trip through b-bit binary numbers if and only if 10d < 2b-1.
- Binary-decimal-binary: b-bit binary numbers round-trip through d-digit decimal numbers if and only if 2b < 10d-1.
Existing Proofs of The Sufficient Condition
Several authors have done partial proofs of the theorem, for the sufficient condition only: I. Goldberg, Kahan, D. Goldberg, and Muller et al. (p. 41-3). I. Goldberg does it for decimal-binary-decimal round-trips, and Kahan, D. Goldberg, and Muller et al. do it for binary-decimal-binary round-trips. D. Goldberg’s proof is limited to IEEE single-precision binary floating-point. Muller et al., though they state that the inequality is “a necessary and sufficient condition”, only prove the sufficient condition.
I like the proofs of D. Goldberg and Muller et al; they use a simple gap-based argument. (Frankly, I could not follow the other proofs in detail.) I will give my own proof, based on D. Goldberg’s and Muller et al.’s technique, for both decimal-binary-decimal and binary-decimal-binary round-trips, and for any precision.
Gap Size Determines Round-Trip
A number is said to round-trip if it converts back to itself after being converted to an intermediate base. This can only happen when the gaps in the original base are greater than or equal to the gaps in the intermediate base. (Gap size can only be equal for integers, where the gaps line up exactly and thus conversion is exact; I won’t consider this special case of equal gap size any further.) To see why, consider what happens during a conversion; let’s use decimal-binary-decimal conversion as an example.
Let gapd be the size of the gap between decimal numbers, and let gapb be the size of the gap between binary numbers. Conversion from decimal to binary will incur a maximum error of gapb/2. If gapb/2 > gapd/2, the closest binary number, in the worst case, will be beyond the midpoints sandwiching the original decimal number; that means it would convert back to a different decimal number. If gapd/2 > gapb/2, then the binary number has to round back to the original decimal number. (Note that in this latter case, the decimal number can never convert to a decimal midpoint; even if a binary number lines up with the decimal midpoint, the binary number at the other end of the binary gap would be closer. This prevents ties, and thus keeps tie-breaking rounding mode out of the picture, which could send the returning conversion back to the wrong number.)
What Matula Has To Say About Gap Size and Round-Trips
In his paper Base Conversion Mappings Matula says
“…a sufficient condition for a base conversion mapping to be one-to-one is that the minimum gap in the old base be at least as large as the maximum gap in the new base. …if this condition does not hold … there will exist some range of numbers where the gap function in the new base is larger than the gap function in the old base. If this region is sufficiently long, a conversion mapping of the old base into the new base could not be one-to-one over this region.”
My Proof of The Sufficient Condition: Decimal-Binary-Decimal
Let d be the number of significant decimal digits, and b be the number of significant bits.
Consider the interval [10e,10e+1). In this interval, there are three or four powers of two, so there are four or five different binary gap sizes. Let 2f be the largest power of two in the interval. Since gap size increases with exponent, the binary gaps in the power of ten interval are at their largest in [2f,10e+1). Thus all we need to check is that that this maximum binary gap size is smaller than the decimal gap size in the power of ten interval.
Starting with 10d < 2b-1, and given that 2f < 10e+1, we can multiply as follows, preserving the inequality: 10d·2f < 2b-1·10e+1.
Combining powers, this simplifies to 2f/2b-1 < 10e+1/10d, or 2f+1-b < 10e+1-d. This proves the sufficient condition.
My Proof of The Sufficient Condition: Binary-Decimal-Binary
This proof is symmetric to the one above, so will read similarly.
Consider the interval [10e,10e+1). In this interval, there are three or four powers of two, so there are four or five different binary gap sizes. Let 2f be the smallest power of two in the interval. Since gap size increases with exponent, the binary gaps in the power of ten interval are at their smallest in [10e,2f). Thus all we need to check is that that this minimum binary gap size is larger than the decimal gap size in the power of ten interval.
In [10e,10e+1), the decimal gap size is 10e+1-d. In [10e,2f), which is a subinterval of [2f-1,2f), the binary gap size is 2f-b. We must show that if 2b < 10d-1, then 2f-b > 10e+1-d.
Starting with 2b < 10d-1, and given that 2f > 10e, we can multiply as follows, preserving the inequality: 2b·10e < 10d-1·2f.
Combining powers, this simplifies to 10e/10d-1 < 2f/2b, or 10e+1-d < 2f-b. This proves the sufficient condition.
Open Question: Proving The Necessary Condition
I would like to see a proof of the necessary condition in the style of the sufficient condition proofs above. For example, for decimal-binary-decimal conversions, I’d like to prove that if 10e+1-d > 2f+1-b, then 10d < 2b-1. I attempted to prove it by reversing the steps of the sufficient condition proof, but I did not see a way to complete it. I also tried proving the contrapositive: if 10d > 2b-1, then 10e+1-d < 2f+1-b. Still no luck. (Technically the contrapositive would make the signs ≥ and ≤, respectively, but d is greater than zero, and we are not considering the equal gap size case.)
D. Goldberg uses a counterexample to prove the necessary condition for binary-decimal-binary conversion from IEEE single-precision binary floating-point, but I am looking for a general proof (Matula’s proof is general but not in the above style). What seems to make this direction harder is that there are some ranges where less precision than called for by the inequality will work.