Section 16.3 Fixed Point Fractional Values

In a fixed point format, the storage area is divided into the integral part and the fractional part. The programmer must keep track of where the binary point is located. For example, we may decide to divide a 32-bit int in half and use the high-order 16 bits for the integral part and the low-order 16 bits for the fractional part.

An example of fixed point notation is the printed deposit slips that my bank provides with my checks. There are seven boxes for numerals, with a decimal point printed just to the left of the rightmost two boxes. Note that the decimal point does not occupy a box; that is, no digit position is allocated for it. So the bank assumes up to five decimal digits for the integral dollars part (rather optimistic), and the rightmost two decimal digits represent the fractional dollars (cents) part. The bank's printing tells me how they have allocated the digits, but it is my responsibility to keep track of the location of the decimal point when filling in the digits.

One advantage of a fixed point format is that integer instructions can be used for arithmetic computations. Of course, the programmer must be very careful to keep track of which bits are allocated for the integral part and which for the fractional part. And the range of possible values is restricted by the number of bits.

Listing 16.3.1 shows a program that adds ruler measurements (in the “English” system) that are specified to the nearest \(\sfrac{1}{16}\) inch.

/*
 * rulerAdd.c
 * Adds two ruler measurements, to nearest 1/16th inch.
 * 2017-09-29: Bob Plantz
 */
#include <stdio.h>

int main(void)
{
  int x, y, fraction_part, sum;
   
  printf("Enter first measurement, inches: ");
  scanf("%d", &x);
  x = x << 4;         /* shift to integral part of variable */
  printf("                     sixteenths: ");
  scanf("%d", &fraction_part);
  x = x | (0xf & fraction_part);  /* add in fractional part */

  printf("Enter second measurement, inches: ");
  scanf("%d", &y);
  y = y << 4;         /* shift to integral part of variable */
  printf("                      sixteenths: ");
  scanf("%d", &fraction_part);
  y = y | (0xf & fraction_part);  /* add in fractional part */

  sum = x + y;
  printf("Their sum is %d and %d/16 inches\n",
         (sum >> 4), (sum & 0xf));

  return 0;
}

Listing 16.3.1. Fixed point addition. The high-order 28 bits are used for the integral part, the low-order 4 for the fractional part. (C)

The numbers are input to the nearest \(\sfrac{1}{16}\) inch, so I have allocated four bits for the fractional part. With a 32-bit int, this leaves \(28\) bits for the integral part. After the integral part is read, the stored number must be shifted four bit positions to the left to put it in the high-order \(28\) bits. Then the fractional part (in number of sixteenths) is placed into the low-order four bits with a simple bitwise OR operation; the mask 0xf keeps only the low-order four bits of the number that was entered. Printing the answer requires a right shift to recover the integral part and a mask to isolate the fractional part.

This is clearly a contrived example. A program using floats would work just as well and be somewhat easier to write (not to mention that we should stop using the “English” system). However, the program in Listing 16.3.1 uses only integer instructions, which have traditionally executed faster than floating point instructions. This hardware difference has become less significant in recent times. Modern CPUs use various parallelization schemes such that a mix of floating point and integer instructions may actually execute faster than only integer instructions. Fixed point arithmetic is more apt to be used in embedded applications where the CPU is small and may not have floating point capabilities.