Section 2.1 Bits and Groups of Bits

Since nearly everything that takes place in a computer—from the instructions that make up a program to the data these instructions act upon—depends upon two-state switches, we need a good notation to use when talking about the states of the switches. It is clearly very cumbersome to say something like:

The first switch is on, the second one is also on, but the third is off, while the fourth is on.

We need a more concise notation, which leads us to use numbers. When dealing with numbers, you are most familiar with the decimal system, which is based on ten, and thus uses ten digits. Two number systems are useful when talking about the states of switches—the binary system, which is based on two, and the hexadecimal system, which is based on sixteen. A less commonly used number system is octal, which is based on eight.

Decimal digits: \(0, 1, 2, 3, 4, 5, 6, 7, 8, 9\)
Binary digits: \(\binary{0}, \binary{1}\)
Hexadecimal digits: \(\hex{0}, \hex{1}, \hex{2}, \hex{3}, \hex{4}, \hex{5}, \hex{6}, \hex{7}, \hex{8}, \hex{9}, \hex{a}, \hex{b}, \hex{c}, \hex{d}, \hex{e}, \hex{f}\)
Octal digits: \(\octal{0}, \octal{1}, \octal{2}, \octal{3}, \octal{4}, \octal{5}, \octal{6}, \octal{7}\)

“Binary digit” is commonly shortened to bit. It is common to bypass the fact that a bit represents the state of a switch, and simply call the switches “bits.” Using bits (binary digits), we can greatly simplify the previous statement about switches as

\begin{equation*} \binary{1101} \end{equation*}

which you can think of as representing “on, on, off, on.” It does not matter whether we use 1 to represent “on” and 0 to represent “off,” or the reverse, as long as we are consistent. In practice the choice follows naturally from the context, so it will not be an issue.

Hexadecimal is commonly used as a shorthand notation to specify bit patterns. Since there are sixteen hexadecimal digits, each one can be used to specify uniquely a group of four bits. Table 2.1.1 shows the correspondence between each possible group of four bits and one hexadecimal digit.

Table 2.1.1. Hexadecimal representation of four bits.

One hexadecimal digit	Four binary digits (bits)
\(\hex{0}\)	\(\binary{0000}\)
\(\hex{1}\)	\(\binary{0001}\)
\(\hex{2}\)	\(\binary{0010}\)
\(\hex{3}\)	\(\binary{0011}\)
\(\hex{4}\)	\(\binary{0100}\)
\(\hex{5}\)	\(\binary{0101}\)
\(\hex{6}\)	\(\binary{0110}\)
\(\hex{7}\)	\(\binary{0111}\)
\(\hex{8}\)	\(\binary{1000}\)
\(\hex{9}\)	\(\binary{1001}\)
\(\hex{a}\)	\(\binary{1010}\)
\(\hex{b}\)	\(\binary{1011}\)
\(\hex{c}\)	\(\binary{1100}\)
\(\hex{d}\)	\(\binary{1101}\)
\(\hex{e}\)	\(\binary{1110}\)
\(\hex{f}\)	\(\binary{1111}\)

Thus, the above English statement specifying the state of four switches can be written with a single hexadecimal digit,

\begin{equation*} \hex{d}_{16} = \binary{1101}_{2} \end{equation*}
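As a minimal sketch of this correspondence in C, the two functions below pack four switch states into a group of four bits and look up the matching hexadecimal digit from Table 2.1.1. The names `pack_switches` and `hex_digit_of` are made up for this illustration, and the sketch assumes 1 means “on.”

```c
#include <assert.h>

/* Build a four-bit group from four switch states (nonzero = on).
   The first switch becomes the leftmost (most significant) bit.
   pack_switches is a hypothetical name used only in this sketch. */
unsigned int pack_switches(int s1, int s2, int s3, int s4)
{
    return (unsigned int)((s1 != 0) << 3 | (s2 != 0) << 2 |
                          (s3 != 0) << 1 | (s4 != 0));
}

/* Map a group of four bits (0..15) to its hexadecimal digit,
   following Table 2.1.1. hex_digit_of is also a hypothetical name. */
char hex_digit_of(unsigned int four_bits)
{
    static const char digits[] = "0123456789abcdef";
    assert(four_bits < 16);   /* one hex digit covers exactly four bits */
    return digits[four_bits];
}
```

Calling `hex_digit_of(pack_switches(1, 1, 0, 1))` corresponds to the “on, on, off, on” statement above.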

When it is not clear from the context, we will indicate the base of a number in this text with a subscript. For example, \(100_{10}\) is written in decimal, \(100_{16}\) is written in hexadecimal, and \(100_{2}\) is written in binary.
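To see that the same digits name different values in different bases, here is a small sketch that uses the standard C library function `strtol`, which takes the base as its third argument. The wrapper name `value_in_base` is an assumption made for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Interpret a string of digits in the given base (2 through 36).
   value_in_base is a hypothetical helper; strtol is the standard
   library function that does the actual conversion. */
long value_in_base(const char *digits, int base)
{
    assert(base >= 2 && base <= 36);
    return strtol(digits, NULL, base);
}
```

With this helper, the string `"100"` yields 100 in base 10, 256 in base 16, and 4 in base 2.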

Although octal is less commonly used, I encountered it in Raspbian while working on the material in Section 10.1. There are eight octal digits, each one representing a group of three bits. Table 2.1.2 shows the correspondence between each possible group of three bits and one octal digit.

Table 2.1.2. Octal representation of three bits.

One octal digit	Three binary digits (bits)
\(\octal{0}\)	\(\binary{000}\)
\(\octal{1}\)	\(\binary{001}\)
\(\octal{2}\)	\(\binary{010}\)
\(\octal{3}\)	\(\binary{011}\)
\(\octal{4}\)	\(\binary{100}\)
\(\octal{5}\)	\(\binary{101}\)
\(\octal{6}\)	\(\binary{110}\)
\(\octal{7}\)	\(\binary{111}\)
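The grouping in Table 2.1.2 can be sketched in C by shifting a value right in steps of three bits and keeping the low three bits each time. The function name `octal_digit_at` is hypothetical, and the sketch assumes a 32-bit pattern.

```c
#include <assert.h>
#include <stdint.h>

/* Return the octal digit at the given position of a 32-bit pattern,
   where position 0 is the rightmost group of three bits.
   octal_digit_at is a name invented for this sketch. */
unsigned int octal_digit_at(uint32_t value, int position)
{
    assert(position >= 0 && position <= 10);  /* 11 groups cover 32 bits */
    return (unsigned int)((value >> (3 * position)) & 07u);
}
```

For the bit pattern 001 010 011, positions 2, 1, and 0 give the octal digits 1, 2, and 3.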

Hexadecimal digits are especially convenient when we need to specify the state of a group of, say, 16 or 32 switches. In place of each group of four bits, we can write one hexadecimal digit. For example,

\begin{equation*} \hex{6c2a}_{16} = \binary{0110} \; \binary{1100} \; \binary{0010} \; \binary{1010}_{2} \end{equation*}

and

\begin{equation*} \hex{0123} \; \hex{abcd}_{16} = \binary{0000} \; \binary{0001} \; \binary{0010} \; \binary{0011} \; \binary{1010} \; \binary{1011} \; \binary{1100} \; \binary{1101}_{2} \end{equation*}
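The following sketch reproduces the first of these expansions in C: it writes the 16 bits of a value as four groups of four, one group per hexadecimal digit. The function name `bits_grouped` and the fixed 20-character buffer are assumptions of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write the 16 bits of value into out as four groups of four bits,
   separated by single spaces, leftmost (most significant) bit first.
   out needs room for 20 characters: 16 bits, 3 spaces, and '\0'.
   bits_grouped is a hypothetical name used only in this sketch. */
void bits_grouped(uint16_t value, char out[20])
{
    size_t pos = 0;
    for (int bit = 15; bit >= 0; bit--) {
        out[pos++] = ((value >> bit) & 1u) ? '1' : '0';
        if (bit % 4 == 0 && bit != 0) {
            out[pos++] = ' ';   /* boundary between two hex digits */
        }
    }
    assert(pos == 19);
    out[pos] = '\0';
}
```

Passing `0x6c2a` fills the buffer with the string `0110 1100 0010 1010`.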

A single bit has limited usefulness when we want to store data. On most modern computers, the smallest unit of memory that a program can access directly is a byte, which is a contiguous group of eight bits. Historically, the number of bits in a byte has varied depending on the hardware and the operating system. For example, the CDC 6000 series of scientific mainframe computers used a six-bit byte. Today, nearly everyone uses “byte” to mean “eight bits.”
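C reports the number of bits in its smallest addressable unit through the macro `CHAR_BIT` in `<limits.h>`. The sketch below simply returns it; the function name `bits_per_byte` is made up for this example. The C standard requires the value to be at least 8, and it is exactly 8 on the systems most readers will use.

```c
#include <assert.h>
#include <limits.h>

/* Return the number of bits in a byte as the C implementation sees it.
   bits_per_byte is a hypothetical name; CHAR_BIT is the standard macro. */
int bits_per_byte(void)
{
    assert(CHAR_BIT >= 8);   /* minimum required by the C standard */
    return CHAR_BIT;
}
```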

Another important reason to learn hexadecimal is that the programming language may not allow you to specify a value in binary. Prefixing a number with “0x” (zero, lower-case ex) in C/C++ means that the number is expressed in hexadecimal. For many years there was no standard C/C++ syntax for writing a number in binary, although some compilers offered a “0b” prefix as an extension. That prefix was standardized in C++14 and in C23, so it is available only with sufficiently recent compilers and language settings. The syntax for specifying bit patterns that works in every version of C/C++ is shown in Table 2.1.3. (The 32-bit pattern for the decimal value \(123\) will become clear after you read Sections 2.3 and 2.5.) Although the GNU assembler, as, includes a notation for specifying bit patterns in binary, it is usually more convenient to use the C/C++ notation for hexadecimal.

Table 2.1.3. C/C++ syntax for specifying literal numbers. In the octal row, the bits are grouped by three for readability.

	Prefix	Example	32-bit pattern (binary)
Decimal	(none)	`123`	\(\binary{0000} \; \binary{0000} \; \binary{0000} \; \binary{0000} \; \binary{0000} \; \binary{0000} \; \binary{0111} \; \binary{1011}\)
Hexadecimal	`0x`	`0x123`	\(\binary{0000} \; \binary{0000} \; \binary{0000} \; \binary{0000} \; \binary{0000} \; \binary{0001} \; \binary{0010} \; \binary{0011}\)
Octal	`0`	`0123`	\(\binary{00} \; \binary{000} \; \binary{000} \; \binary{000} \; \binary{000} \; \binary{000} \; \binary{000} \; \binary{000} \; \binary{001} \; \binary{010} \; \binary{011}\)
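A short sketch can confirm that the three examples in Table 2.1.3 are different values even though they share the digits 1, 2, 3. The enumerator names and the function `check_literals` are invented for this illustration; the arithmetic in the assertions just weights each digit by a power of the base.

```c
#include <assert.h>

/* The digits 1 2 3 written with each prefix from Table 2.1.3.
   These enumerator names are hypothetical. */
enum {
    AS_DECIMAL = 123,    /* no prefix: decimal */
    AS_HEX     = 0x123,  /* 0x prefix: hexadecimal */
    AS_OCTAL   = 0123    /* leading 0: octal */
};

/* Check each literal against its value computed digit by digit. */
void check_literals(void)
{
    assert(AS_DECIMAL == 1 * 100 + 2 * 10 + 3);
    assert(AS_HEX     == 1 * 256 + 2 * 16 + 3);   /* 291 in decimal */
    assert(AS_OCTAL   == 1 * 64  + 2 * 8  + 3);   /* 83 in decimal */
}
```

The octal case is worth remembering: a leading zero on an integer literal in C/C++ silently changes its base.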