Section 2.9 Using C Programs to Explore Data Formats
Before writing any programs, I urge you to read Appendix A on writing Makefiles, even if you are familiar with them. Many of the problems I have helped students solve are due to errors in their Makefile. And many of the Makefile errors go undetected due to the default behavior of the make program.
We will use the C programming language to illustrate these concepts because it takes care of the memory allocation problem, yet still allows us to get reasonably close to the hardware.
You probably learned to program in the higher-level, object-oriented paradigm using something like C++, Java, or Python. C does not support the object-oriented paradigm. It is a procedural programming language. The program is divided into functions. Since there are no classes in C, there is no such thing as a member function. The programmer focuses on the algorithms used in each function, and data items are explicitly passed to functions or accessed globally.
Most programming books start with a “Hello, World.” program, but since you already know about integers, we will start with a program that prints both an integer and a float.
Compiling and running the program in Listing 2.9.1 on my Raspberry Pi gave:
pi@rpi3:~/chp02 $ gcc -Wall -o intAndFloat intAndFloat.c
pi@rpi3:~/chp02 $ ./intAndFloat
The integer is 19088743 and the float is 19088.742188
pi@rpi3:~/chp02 $
Yes, the float really is that far off. This will be explained in Chapter 16.
The program source code starts with some documentation that gives the name of the file, a very brief description of what the program does, the author's name, and the date it was written. Everything between the /* and */ is considered a comment. It is there for the human reader and has no effect on the program itself.
The first operation that actually affects the program is to include another file, the stdio.h header file. This header file declares the interface to many of the functions in the C Standard Library, which tells the compiler what to do when it encounters any of these functions in our source code.
Next you see the definition of a C function. All C programs are made up of functions, which have the general format:
return-data-type function-name( parameter-list )
{
  function-body
}
When a C program is executed, the operating system first sets up a C Runtime Environment, which sets up the resources on your computer to run the program. The C Runtime Environment then calls the main function, so the program you write must include a function whose function-name is main. This is where execution of your code begins. The main function can call other functions, but program control normally ends up back in the main function, which then returns to the C Runtime Environment. When the program has completed execution, it should return a single integer to the C Runtime Environment, so the return-data-type is int. In our example main function here, no parameters are passed to it, so the parameter-list is void.
Our main function has two variables, an integer value (int) and a floating point value (float), which are declared at the beginning of the function-body. Older versions of the C standard require that variables be declared at the beginning of a block, before any other statements. C99 and later versions allow you to introduce new variables anywhere in the code, as most modern programming languages do, but listing them at the beginning, as this program does, is still a common convention. Think of it as being like listing the ingredients for a cooking recipe before the instructions on how to use them. This program provides a value for each of the variables.
This program only prints the values of the two variables on the screen. It does this by calling a function in the C Standard Library, printf. The printf function allows for a great variety of formatting, but our use will be quite simple.
The first argument passed to the printf function is essentially a template, surrounded by quotes, for what will be printed on the screen. It is simply the text that you want printed, with the % character, immediately followed by some conversion code characters, at each place that you want the value of a variable substituted.
This text string is followed by a comma-separated list of the names of the variables to be substituted in the same order that their respective % codes appear in the template. You should enter the program above, compile it, and then run it. I am convinced that you will see how printf works in this simple program.
Common conversion codes (used after the % character) are:

| Conversion code | Meaning |
|---|---|
| u | unsigned decimal integer |
| d or i | signed decimal integer |
| f | float |
| x | hexadecimal |
The conversion codes may include other characters to specify properties like the field width of the display, whether the value is left or right justified within the field, etc. We will not cover the details here. You can read the printf entry in section 3 of the man pages (man 3 printf) to learn more.
Our next program illustrates how to read numbers from the keyboard.
The primary new concept introduced in Listing 2.9.2 is the scanf function from the C Standard Library. After the program calls printf to write a message on the screen, it calls scanf to read what the user types on the keyboard. Since the user is asked to type an unsigned decimal integer, the conversion code is %u. But with scanf we need to prefix the variable name, unsignedInteger, with the & character. In C the & is the “address of” operator, so the memory address where the variable is stored is passed as an argument to scanf. The address of the variable is needed because scanf will read the keystrokes from the keyboard, convert the number to its equivalent binary representation, and store this bit pattern in memory at this address. You will get a chance to do this conversion in Chapter 14.
The second call to printf writes a message on the screen asking the user to enter a bit pattern in hexadecimal, so scanf needs the %x conversion code. After both numbers are read from the keyboard, printf is called to write each of them on the screen twice. Enter the program, compile and run it. Now you have your own decimal-to-hexadecimal and hexadecimal-to-decimal conversion program.
The %u conversion code that is used when reading the integers with scanf and writing them with printf indicates that these are unsigned integers. The %#010x conversion code shows some of the more complex formatting that can be done with printf. The # together with the x tells printf to write the number in hexadecimal and prefix it with 0x. The first 0 tells printf to zero pad the displayed number so that it is the full width of the integer, that is, \(32\) bits. Thirty-two bits require eight hexadecimal digits to display, and the prefix is two more characters, so the 10 tells printf to write \(10\) characters on the screen. Feel free to play around with this conversion code in your program to see the effects of changing it. You won't break anything.
The program in Listing 2.9.2 demonstrates a very important concept: hexadecimal is used as a human convenience for stating bit patterns. A number is not inherently binary, decimal, or hexadecimal. A particular value can be expressed in a precisely equivalent way in each of these three number bases. For that matter, it can be expressed equivalently in any number base. But since a computer is made up of binary switches, all numbers are stored in binary in the computer.