Converting Floating-Point Numbers to Binary Strings in C

http://www.exploringbinary.com/converting-floating-point-numbers-to-binary-strings-in-c/


If you want to print a floating-point number in binary using C code, you can’t use printf() — it has no format specifier for it. That’s why I wrote a program to do it, a program I describe in this article.

(If you’re wondering why you’d want to print a floating-point number in binary, I’ll tell you that too.)

Binary to Binary Conversion

You could print a floating-point number in binary by parsing and interpreting its IEEE representation, or you could do it more elegantly by casting it as a base conversion problem — a binary to binary conversion; specifically, a conversion from a binary number to a binary string.

To illustrate the process of converting a number to a string, let’s “convert” the decimal integer 352 to decimal using the classic base conversion algorithm for integers:

  • 352/10 = 35 remainder 2
  • 35/10 = 3 remainder 5
  • 3/10 = 0 remainder 3

Because we use a divisor of 10, this process simply isolates the digits of the number. If we string them together, we get the original number back: 352.
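
The division steps above can be sketched in C as follows. This is a minimal illustration using plain integer arithmetic, not code from the program described later; the name int_to_digits() and its parameters are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of fp2bin): writes the digits of n in the
   given base (2 to 10) to out, most significant digit first, and returns
   the number of digits written (excluding the terminator). */
size_t int_to_digits(unsigned long n, unsigned base, char *out, size_t out_size)
{
    char tmp[64];               /* enough for a 64-bit value in base 2 */
    size_t count = 0, i;

    assert(base >= 2 && base <= 10);

    do {
        tmp[count++] = (char)('0' + n % base);  /* remainder is the next digit */
        n /= base;                              /* quotient feeds the next step */
    } while (n > 0 && count < sizeof tmp);

    assert(count < out_size);   /* caller must leave room for the terminator */

    /* Remainders come out least significant first, so reverse them */
    for (i = 0; i < count; i++)
        out[i] = tmp[count - 1 - i];
    out[count] = '\0';

    return count;
}
```

Calling int_to_digits(352, 10, buf, sizeof buf) isolates the digits 2, 5 and 3 in that order and stores them reversed, as the string "352".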

We can illustrate a similar process for fractional values, for example 0.5943, using the classic base conversion algorithm for fractionals:

  • 0.5943 * 10 = 5.943
  • 0.943 * 10 = 9.43
  • 0.43 * 10 = 4.3
  • 0.3 * 10 = 3.0

Because we use a multiplier of 10, this process isolates the digits of the number. If we string them together, we get the original number back: 0.5943.
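
The multiplication steps can be sketched the same way. To keep the decimal example exact, this sketch represents the fraction as a numerator and denominator (5943/10000) instead of a double; the name frac_to_digits() and its parameters are assumptions made for this example, and it is only meant for small inputs where num * base cannot overflow.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of fp2bin): writes the digits of the
   fraction num/den (with 0 <= num < den) in the given base (2 to 10) to out
   and returns the number of digits written. Stops early if out fills up,
   which is what happens for non-terminating expansions. */
size_t frac_to_digits(unsigned long num, unsigned long den, unsigned base,
                      char *out, size_t out_size)
{
    size_t count = 0;

    assert(den > 0 && num < den);
    assert(base >= 2 && base <= 10);
    assert(out_size > 0);

    while (num > 0 && count + 1 < out_size) {
        num *= base;                             /* shift one digit left of the point */
        out[count++] = (char)('0' + num / den);  /* integer part is the next digit */
        num %= den;                              /* keep only the fractional part */
    }
    out[count] = '\0';

    return count;
}
```

With num = 5943, den = 10000 and base = 10 this produces "5943". With num = 5, den = 8 and base = 2 it produces "101", since 5/8 is 0.101 in binary.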

The same two-part algorithm works for binary to binary conversion, if instead you divide and multiply by 2 and use binary arithmetic.

On paper, this is not too exciting. But in a computer, it allows us to convert binary numbers to binary strings. A floating-point binary value is a number, whereas a printed binary value is a string. We can use the binary to binary conversion algorithm to isolate the digits of the number and convert them to ASCII numerals in a string. That’s what I do in the C code below.

The Code

The function fp2bin() converts a number from IEEE double format to an equivalent character string made up of 0s and 1s. It breaks the double into integer and fractional parts and then converts each separately using routines fp2bin_i() and fp2bin_f(), respectively.

fp2bin_i() and fp2bin_f() use the algorithms described above, which are the same algorithms used in the dec2bin_i() and dec2bin_f() routines in my article Base Conversion in PHP Using BCMath. The algorithms are the same because in each case, the base of the number being converted is the same as the base of the arithmetic used to convert. For the dec2bin* routines, the base is decimal; for the fp2bin* routines, the base is binary.

fp2bin.h

/***********************************************************/
/* fp2bin.h: Convert IEEE double to binary string          */
/*                                                         */
/* Rick Regan, http://www.exploringbinary.com              */
/*                                                         */
/***********************************************************/
/* FP2BIN_STRING_MAX covers the longest binary string
   (2^-1074 plus "0." and string terminator) */
#define FP2BIN_STRING_MAX 1077

void fp2bin(double fp, char* binString);

fp2bin.c

/***********************************************************/
/* fp2bin.c: Convert IEEE double to binary string          */
/*                                                         */
/* Rick Regan, http://www.exploringbinary.com              */
/*                                                         */
/***********************************************************/
#include <string.h>
#include <math.h>
#include "fp2bin.h"

void fp2bin_i(double fp_int, char* binString)
{
 int bitCount = 0;
 int i;
 char binString_temp[FP2BIN_STRING_MAX];

 do
   {
    binString_temp[bitCount++] = '0' + (int)fmod(fp_int,2);
    fp_int = floor(fp_int/2);
   } while (fp_int > 0);

 /* Reverse the binary string */
 for (i=0; i<bitCount; i++)
   binString[i] = binString_temp[bitCount-i-1];

 binString[bitCount] = 0; //Null terminator
}

void fp2bin_f(double fp_frac, char* binString)
{
 int bitCount = 0;
 double fp_int;

 while (fp_frac > 0)
   {
    fp_frac*=2;
    fp_frac = modf(fp_frac,&fp_int);
    binString[bitCount++] = '0' + (int)fp_int;
   }
  binString[bitCount] = 0; //Null terminator
}

void fp2bin(double fp, char* binString)
{
 double fp_int, fp_frac;

 /* Separate integer and fractional parts */
 fp_frac = modf(fp,&fp_int);

 /* Convert integer part, if any */
 if (fp_int != 0)
   fp2bin_i(fp_int,binString);
 else
   strcpy(binString,"0");

 strcat(binString,"."); // Radix point

 /* Convert fractional part, if any */
 if (fp_frac != 0)
   fp2bin_f(fp_frac,binString+strlen(binString)); //Append
 else
   strcpy(binString+strlen(binString),"0");
}

Notes

  • fp2bin() prints binary numbers in their entirety, with no scientific notation.
  • fp2bin() only works with positive numbers.
  • fp2bin() doesn’t handle the special IEEE values for not-a-number (NaN) and infinity.
  • FP2BIN_STRING_MAX can be reduced if you know a priori that you will be converting numbers within a limited range.
  • fp2bin_f() terminates, and gives the exact binary fraction, because multiplication by 2 is essentially bit shifting; bits are shifted left out of the number, one at a time, until the number is 0.
  • fp2bin_i() can be called independently of fp2bin(), but fp2bin_f() would need modification to run standalone (it doesn’t add the radix point or handle 0 properly).
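
One possible way to work around the sign, NaN and infinity limitations is to screen the input before calling fp2bin(). The sketch below is an assumption about how a caller might do that, not part of the code above; the names input_kind, KIND_FINITE, KIND_NAN, KIND_INF and fp2bin_prepare() are hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical classification of an input value (not part of fp2bin) */
typedef enum { KIND_FINITE, KIND_NAN, KIND_INF } input_kind;

/* Hypothetical pre-check: reports whether fp is finite, NaN or infinite,
   stores its sign in *negative (1 if the sign bit is set, else 0) and its
   absolute value in *magnitude. */
input_kind fp2bin_prepare(double fp, double *magnitude, int *negative)
{
    assert(magnitude != NULL && negative != NULL);

    *negative = signbit(fp) ? 1 : 0;    /* also catches -0.0 */
    *magnitude = *negative ? -fp : fp;  /* negation only flips the sign bit */

    if (isnan(fp))
        return KIND_NAN;
    if (isinf(fp))
        return KIND_INF;
    return KIND_FINITE;
}
```

A caller could then write a leading "-" when *negative is set, pass *magnitude to fp2bin() when the result is KIND_FINITE, and print a fixed string such as "nan" or "inf" otherwise. If a minus sign is added, the buffer needs one more character than FP2BIN_STRING_MAX.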

Compiling and Running

I compiled and ran this code on both Windows and Linux:

  • On Windows, I built a project in Visual C++ and compiled and ran it in there.
  • On Linux, I compiled with “gcc fp2binTest.c fp2bin.c -lm -o fp2bin” and then ran with “./fp2bin”.

Examples

The following program uses fp2bin() to convert five floating-point numbers to binary strings:

/***********************************************************/
/* fp2binTest.c: Test double to binary string conversion   */
/*                                                         */
/* Rick Regan, http://www.exploringbinary.com              */
/*                                                         */
/***********************************************************/
#include "fp2bin.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
 char binString[FP2BIN_STRING_MAX];

 fp2bin(16,binString);
 printf("2^4 is %s\n",binString);

 fp2bin(0.00390625,binString);
 printf("2^-8 is %s\n",binString);

 fp2bin(25,binString);
 printf("25 is %s\n",binString);

 fp2bin(0.1,binString);
 printf("0.1 is %s\n",binString);

 fp2bin(0.6,binString);
 printf("0.6 is %s\n",binString);

 return (0);
}

Here is the output from the program:

2^4 is 10000.0
2^-8 is 0.00000001
25 is 11001.0
0.1 is 0.0001100110011001100110011001100110011001100110011001101
0.6 is 0.10011001100110011001100110011001100110011001100110011

0.1 and 0.6 are interesting because they have terminating expansions in decimal but infinite expansions in binary: decimal 0.1 is binary 0.0(0011), and decimal 0.6 is binary 0.(1001), where the digits in parentheses repeat forever. Both are rounded to 53 significant bits, as shown in the values printed above (trailing 0s are not printed).
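
To check the bit counts in the output above, the following sketch walks the fraction with the same multiply-by-2 process and records where the first and last 1 bits fall. The name frac_bit_span() is an assumption made for this example; it is not part of fp2bin.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of fp2bin): for 0 < frac < 1, reports the
   1-based positions, counted from the radix point, of the first and last
   1 bits of frac. */
void frac_bit_span(double frac, int *first_one, int *last_one)
{
    int pos = 0;

    assert(frac > 0.0 && frac < 1.0);
    assert(first_one != NULL && last_one != NULL);

    *first_one = 0;
    *last_one = 0;

    while (frac > 0.0) {
        frac *= 2.0;            /* exact: shifts the next bit past the point */
        pos++;
        if (frac >= 1.0) {
            frac -= 1.0;        /* exact: removes the bit just shifted out */
            if (*first_one == 0)
                *first_one = pos;
            *last_one = pos;
        }
    }
}
```

For 0.6 the span is bits 1 through 53, which is all 53 significant bits. For 0.1 it is bits 4 through 55: the 53rd significant bit (bit 56) is 0 after rounding, so it is one of the trailing 0s that are not printed.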

Using fp2bin() to Study Binary Numbers

You can use fp2bin() in conjunction with my decimal/binary converter to study how decimal values are approximated with IEEE floating-point. There are two aspects of the approximation in particular you can look at:

  1. How the exact binary equivalent of a decimal number compares to its IEEE double representation; that is, how it is rounded to fit into 53 significant bits.
  2. What the exact decimal value of the approximation is.

For example, let’s look at the IEEE approximation of 0.1:

  1. The decimal/binary converter tells you that 0.1 in binary is 0.000110011001100110011001100110011001100110011001100110011001

    fp2bin(0.1) tells you that 0.1 in double-precision floating-point is 0.0001100110011001100110011001100110011001100110011001101 .

    Comparing the two values, you’ll see the IEEE number is the binary number rounded to 53 significant bits. The value of the binary number beyond its 53rd significant bit (its 56th bit overall) is greater than 2^-57, which is half the value of that last bit (2^-56); therefore, it is rounded up (the rounding results in a carry to the 52nd significant bit).

  2. The decimal/binary converter tells you that the IEEE approximation to 0.1, 0.0001100110011001100110011001100110011001100110011001101,

    is exactly
    0.1000000000000000055511151231257827021181583404541015625
    in decimal.

    (If you’re using GCC with the GNU C library, as on Linux, you can print this value with the %f format specifier of printf() instead of using the converter, provided you ask for enough decimal places, for example %.55f.)

This shows that the IEEE double approximation of 0.1 is accurate only to 17 significant decimal digits.
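
As a rough sketch of the printf() approach, the function below prints a double with 1074 decimal places (enough for any double, since every double is a multiple of 2^-1074) and then trims the trailing zeros. It assumes a C library whose printf() produces exact decimal digits, as glibc does; other libraries may round or pad instead. The name exact_decimal() is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the decimal expansion of x to out, trimmed of
   trailing zeros (keeping at least one digit after the point).
   Assumes printf() prints exact digits (true for glibc, not guaranteed by
   the C standard). Returns 0 on success, -1 if out is too small. */
int exact_decimal(double x, char *out, size_t out_size)
{
    int n;
    size_t len;

    assert(out != NULL && out_size > 0);

    n = snprintf(out, out_size, "%.1074f", x);
    if (n < 0 || (size_t)n >= out_size)
        return -1;

    /* Strip trailing zeros, but leave one digit after the decimal point */
    len = strlen(out);
    while (len > 1 && out[len - 1] == '0' && out[len - 2] != '.')
        out[--len] = '\0';

    return 0;
}
```

On a system with exact printf() output, exact_decimal(0.1, buf, sizeof buf) with a buffer of about 1400 characters should yield the same 55-digit value shown above.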

Printing floats

fp2bin() will print single-precision floating-point values (floats) as well. Your C compiler will “promote” the float to a double before the call. The resulting double will have the same value, only with extra trailing zeros — which fp2bin() will not print.
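
To see the effect of promotion, this small sketch counts how many bits the multiply-by-2 process produces for a fraction. The name fraction_bit_count() is an assumption made for this example; it is not part of fp2bin.

```c
#include <assert.h>

/* Hypothetical helper (not part of fp2bin): returns the number of bits
   after the radix point needed to write frac exactly, for 0 <= frac < 1. */
int fraction_bit_count(double frac)
{
    int count = 0;

    assert(frac >= 0.0 && frac < 1.0);

    while (frac > 0.0) {
        frac *= 2.0;            /* exact: shift one bit past the point */
        if (frac >= 1.0)
            frac -= 1.0;        /* exact: drop the bit just shifted out */
        count++;
    }

    return count;
}
```

fraction_bit_count(0.1f) returns 27, because the float nearest 0.1 has 24 significant bits starting at the 4th bit after the radix point, while fraction_bit_count(0.1) returns 55 for the double. The promoted float is therefore a different (shorter) value than the double 0.1, not a truncated printout of it.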


15 Responses to “Converting Floating-Point Numbers to Binary Strings in C”

  1. ikilobo Says:

    i want to generate randomly n bits number that divide into W word, where W=m/n
    and then i want operate it by adding/xor, multiplying, the process that i mean look below:

    input m, n1, n2; m=4, n1=12, n2=9;
    assumed: n1>=n2;

    n1=12
    A[]=[[1001],[0011],[1010]]
    n2=9
    B[]=[[0001],[0011],[1010]]
    three ZERO digit MSB in B[] automatic generate..

    so the calculate follow the rule:
    A[]=[[1001],[0011],[1010]]
    B[]=[[0001],[1010],[0010]]
    _______________________xor
    C[]=[[1000],[1001],[1000]]

    xor per block, how can i implementing it in C, where n1 and n2 up to 100 bit or more?

  2. Rick Regan Says:

    ikilobo,

    This doesn’t seem to have anything to do with converting floating-point numbers to binary strings (and I can’t say I understand the question anyhow). Sorry.

  3. Bangon Kali Says:

    Great article! Thanks!

  4. Rick Regan Says:

    @Bangon Kali,

    I’m glad you liked it. Thanks for the feedback.

  5. anshuman dhuliya Says:

    nicely programmed – it simplified the concept with a good coding style. Thanks a lot!

  6. Rick Regan Says:

    NOTE: My code depends on the fmod() function, and one reader reports that the MINIX version of fmod() produces incorrect results.

  7. Herman Says:

    It does not work for negative numbers :)

    Still, great contribution. I appreciate it. Thanks!

    H.

  8. Rick Regan Says:

    @Herman,

    Yes, that’s a known limitation (it is stated in the “Notes” section). Thanks for the feedback.

  9. Eric Dawson Says:

    thanks for the article! :) helped me a lot!! Could you please include IEEE 754 numbers as well?

  10. Rick Regan Says:

    @Eric,

    I’m not sure what you’re asking for — these are IEEE 754 numbers I am talking about.

  11. Mauricio Says:

    What if I want the results to consider only 32 bits? Is there an easy way to do it? Thanks

  12. Rick Regan Says:

    @Mauricio,

    I’m sorry, I do not understand your question.

  13. Mauricio Says:

    The results are presented this way

    0.1 is 0.0001100110011001100110011001100110011001100110011001101

    it’s double precision,

    i need single precision results, that means, the converted vector with a 32-bit precision

  14. Rick Regan Says:

    @Mauricio,

    Do you mean you want to print floats? Calling fp2bin() with a float should work (see section “Printing floats”).

  15. Mauricio Says:

    Hello Rick,

    thanks, I hadn’t seen the printing floats section before. Sorry. I needed the single precision because I am generating values to be used in a hardware implementation of a LUT. I will develop a “rounding” function to always generate a 32-bit string with fixed-size integer and fractional parts. Your functions helped a lot. Thank you.
