Section 16.1 Fractional Values in Binary

Before discussing the storage formats, we need to consider how fractional values are represented in binary. The concept is simple: we extend Equation (2.5.1) to include negative powers of \(2\text{:}\)

\begin{equation} N.F = d_{n-1} \times 2^{n-1} + d_{n-2} \times 2^{n-2} + \ldots + d_{0} \times 2^{0} + d_{-1} \times 2^{-1} + d_{-2} \times 2^{-2} + \ldots\label{eq-binfrac}\tag{16.1.1} \end{equation}

For example,

\begin{equation*} 123.6875_{10} = \texttt{1111011.1011}_{2} \end{equation*}

because

\begin{align*} d_{-1} \times 2^{-1} &= \texttt{1} \times 0.5\\ d_{-2} \times 2^{-2} &= \texttt{0} \times 0.25\\ d_{-3} \times 2^{-3} &= \texttt{1} \times 0.125\\ d_{-4} \times 2^{-4} &= \texttt{1} \times 0.0625 \end{align*}

and thus

\begin{align*} \binary{0.1011}_{2} &= 0.5_{10} + 0.125_{10} + 0.0625_{10}\\ &= 0.6875_{10} \end{align*}
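The positional sum in Equation (16.1.1) can be checked mechanically. The following is a minimal Python sketch, not part of the original text; the function name `binary_to_fraction` is a hypothetical helper, and `fractions.Fraction` is used so that every power of two is held exactly.

```python
from fractions import Fraction

def binary_to_fraction(bits: str) -> Fraction:
    """Evaluate a binary string such as '1111011.1011' exactly (hypothetical helper)."""
    int_part, _, frac_part = bits.partition(".")
    value = Fraction(0)
    # Integer digits d_{n-1} ... d_0 carry weights 2^{n-1} ... 2^0.
    for position, digit in enumerate(reversed(int_part)):
        value += int(digit) * Fraction(2) ** position
    # Fractional digits d_{-1}, d_{-2}, ... carry weights 2^{-1}, 2^{-2}, ...
    for position, digit in enumerate(frac_part, start=1):
        value += int(digit) * Fraction(1, 2 ** position)
    return value

print(float(binary_to_fraction("1111011.1011")))  # 123.6875
print(float(binary_to_fraction("0.1011")))        # 0.6875
```

Both results agree with the hand computation above.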

Although any integer can be represented exactly as a finite sum of powers of two, a fractional value has an exact binary representation only if it is a finite sum of negative powers of two. For example, consider an 8-bit representation of the fractional value \(0.9\text{.}\) From

\begin{align*} \binary{0.11100110}_{2} &= 0.89843750_{10}\\ \binary{0.11100111}_{2} &= 0.90234375_{10} \end{align*}

we can see that

\begin{gather*} \binary{0.11100110}_{2} < 0.9_{10} < \binary{0.11100111}_{2} \end{gather*}

In fact,

\begin{gather*} 0.9_{10} = \binary{0.11100} \overline{\binary{1100}}_{2} \end{gather*}

where \(\overline{\binary{1100}}\) means that this bit pattern repeats forever.
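One way to see the repeating pattern is to generate the bits one at a time by repeated doubling: each doubling moves the next bit to the left of the binary point, where it can be read off and removed. The sketch below assumes a value in the range \(0 \le x \lt 1\) and uses a hypothetical helper name, `fraction_bits`; exact rational arithmetic avoids introducing the very rounding error being illustrated.

```python
from fractions import Fraction

def fraction_bits(value: Fraction, count: int) -> str:
    """Return the first `count` bits to the right of the binary point.

    Hypothetical helper; assumes 0 <= value < 1.
    """
    bits = []
    for _ in range(count):
        value *= 2                # shift the next bit left of the binary point
        bit = int(value >= 1)     # that bit is 1 exactly when the result reaches 1
        bits.append(str(bit))
        value -= bit              # keep only the remaining fractional part
    return "".join(bits)

print(fraction_bits(Fraction(9, 10), 13))  # 1110011001100
```

After the leading `1`, the group `1100` recurs, which matches the repeating pattern shown above.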

Rounding off fractional values in binary is simple. If the first bit to the right of the last bit being kept is one, add one at the last kept bit position; otherwise, simply discard the extra bits. Let us round off \(0.9\) to eight bits. From above we see that the ninth bit to the right of the binary point is zero, so we do not add one in the eighth bit position. Thus, we use

\begin{gather*} 0.9_{10} \approx \binary{0.11100110}_{2} \end{gather*}

which gives a round-off error of

\begin{align*} 0.9_{10} - \binary{0.11100110}_{2} &= 0.9_{10} - 0.8984375_{10}\\ &= 0.0015625_{10} \end{align*}
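The rounding rule and the resulting error can be reproduced with a short Python sketch. The name `round_to_bits` is a hypothetical helper, and the code assumes a non-negative value less than one; it is an illustration of the rule described here, not a description of how any particular hardware rounds.

```python
from fractions import Fraction

def round_to_bits(value: Fraction, count: int) -> Fraction:
    """Round a fraction in [0, 1) to `count` bits after the binary point.

    Hypothetical helper: adds one at the last kept position when the
    next bit to the right is one.
    """
    scaled = value * 2 ** count               # move the kept bits left of the point
    kept = int(scaled)                        # the first `count` bits as an integer
    next_bit = int((scaled - kept) * 2 >= 1)  # the bit just right of the kept bits
    return Fraction(kept + next_bit, 2 ** count)

approx = round_to_bits(Fraction(9, 10), 8)
print(format(int(approx * 2 ** 8), "08b"))    # 11100110
print(float(approx))                          # 0.8984375
print(float(Fraction(9, 10) - approx))        # 0.0015625
```

Rounding the same value to six bits instead would add one, because the seventh bit of \(0.9\) is one.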