
Section 4.1 C/C++ Basic Data Types

The ARM architecture defines several data types for both the 32- and 64-bit implementations. Table 4.1.1 shows the names for both implementations for completeness. As of this writing (September 2017) Raspbian only runs the Raspberry Pi in 32-bit mode.

Table 4.1.1. ARM architecture data sizes.

Name	Size (bits)	Architecture (32/64 bit)
Byte	8	both
Halfword	16	both
Word	32	both
Doubleword	64	both
Quadword	128	64-bit only

C and C++ provide several data types. Their sizes for the 32-bit ARM architecture are shown in Table 4.1.2.

Table 4.1.2. Sizes of some C/C++ data types in 32-bit ARM architecture.

Data type	Size	Description
`int`	Word	Integer
`long`	Word	Integer
`short`	Halfword	Integer
`char`	Byte	Byte
`long long`	Doubleword	Integer
`float`	Word	Single-precision IEEE floating-point
`double`	Doubleword	Double-precision IEEE floating-point
`bool`	Byte	Boolean
`void*`	Word	Address of data or code

In the 64-bit version (not the current Raspbian) most of the sizes are the same, but under the LP64 data model used by 64-bit Linux both the long and void* data types are Doublewords (64 bits).

A given “real world” value can usually be represented in more than one data type. For example, most people would think of “123” as representing “one hundred twenty-three.” This value could be stored in a computer in int format or char format, with each char holding one of the characters in this number. An int in our C/C++ environment is stored in 32 bits, and the bit pattern would be

\begin{gather*} \hex{0000007b} \end{gather*}

As a C-style text string it would also require four bytes of memory, but their bit patterns would be

\begin{gather*} \hex{31\quad 32\quad 33\quad 00} \end{gather*}

(Recall that a C-style string is terminated with a NUL character.)

The int format is easier to use in arithmetic and logical expressions, but the interface with the outside world through the screen and the keyboard uses the char format. If a user entered “\(123\)” from the keyboard, the operating system would read the individual characters, each in char format, and store them as a text string. The text string must be converted to int format. After the numbers are manipulated, the result must be converted from the int format to char format for display on the screen.

On the other hand, if the program primarily manipulates the numbers as text, rather than in arithmetic computations, it may be more efficient to keep them stored as text strings.

C programmers use functions in the stdio library and C++ programmers use functions in the iostream library to do these conversions between the int and char formats. For example, the C code sequence:

scanf("%d", &x);
x += 100;
printf("%d", x);

or the C++ code sequence:

cin >> x;
x += 100;
cout << x;

does the following:

1. Reads characters, each as a separate char, from the keyboard and converts the char sequence into the corresponding int format.
2. Adds 100 to the int.
3. Converts the resulting int into a char sequence and displays it on the screen.

The C or C++ I/O library functions in the code segments above do the necessary conversions between char sequences and the int storage format. To transfer the bytes themselves, they ultimately call the read system call function to read bytes from the keyboard and the write system call function to write bytes to the screen. As shown in Figure 4.1.3, an application program can call the read and write functions directly to transfer bytes.

Figure 4.1.3.

When using the read and write system call functions for I/O, it is the programmer's responsibility to do the conversions between the char type used for I/O and the storage formats used within the program. We will soon be writing our own functions in assembly language to convert between the character format used for screen display and keyboard input, and the internal storage format of integers in the binary number system. The purpose of writing our own functions is to gain a thorough understanding of how data is represented internally in the computer.