Section 2.9 Using C Programs to Explore Data Formats

Before writing any programs, I urge you to read Appendix A on writing Makefiles, even if you are familiar with them. Many of the problems I have helped students solve were due to errors in their Makefiles, and many of those errors went undetected because of the default behavior of the make program.

We will use the C programming language to illustrate these concepts because the compiler takes care of allocating memory for our variables, yet the language still allows us to get reasonably close to the hardware.

You probably learned to program in the higher-level, object-oriented paradigm using something like C++, Java, or Python. C does not support the object-oriented paradigm. It is a procedural programming language. The program is divided into functions. Since there are no classes in C, there is no such thing as a member function. The programmer focuses on the algorithms used in each function, and data items are explicitly passed to functions or accessed globally.

Most programming books start with a “Hello, World.” program, but since you already know about integers, we will start with a program that prints both an integer and a float.

/* intAndFloat.c
 * Using printf to display an integer and a float.
 * 2017-09-29: Bob Plantz
 */
#include <stdio.h>

int main(void)
{
  int anInt = 19088743;
  float aFloat = 19088.743;

  printf("The integer is %d and the float is %f\n", anInt, aFloat);

  return 0;
}

Listing 2.9.1.

Compiling and running the program in Listing 2.9.1 on my Raspberry Pi gave:

pi@rpi3:~/chp02 $ gcc -Wall -o intAndFloat intAndFloat.c
pi@rpi3:~/chp02 $ ./intAndFloat
The integer is 19088743 and the float is 19088.742188
pi@rpi3:~/chp02 $

Yes, the float really is that far off. This will be explained in Chapter 16.

The program source code starts with some documentation that gives the name of the file, a very brief description of what the program does, the author's name, and the date it was written. Everything between the /* and */ is considered a comment. It is there for the human reader and has no effect on the program itself.

The first operation that actually affects the program is to include another file, the stdio.h header file. This header file declares the interface to many of the functions in the C Standard Library, which tells the compiler how to handle calls to these functions when it encounters them in our source code.

Next you see the definition of a C function. All C programs are made up of functions which have the general format:

return-data-type function-name( parameter-list ) {
  function-body
}

When a C program is executed, the operating system first sets up a C Runtime Environment, which sets up the resources on your computer to run the program. The C Runtime Environment then calls the main function, so the program you write must begin with a function whose function-name is main. The main function can call other functions, but program control normally ends up back in the main function, which then returns to the C Runtime Environment. When the program has completed execution, it should return a single integer to the C Runtime Environment, so the return-data-type is int. In our example main function here, no parameters are passed to it, so the parameter-list is void.

Our main function has two variables, an integer value (int) and a floating point value (float), which are defined at the beginning of the function-body. Older versions of the C standard (C89/C90) required that variables be declared at the beginning of a block, before any other statements. C99 and later versions allow you to introduce new variables anywhere in the code, as most modern programming languages do, but the programs in this section list them at the beginning. Think of it as being like listing the ingredients for a cooking recipe before the instructions on how to use them. This program also gives each of the variables an initial value.

This program only prints the values of the two variables on the screen. It does this by calling a function in the C Standard Library, printf. The printf function allows for a great variety of formatting, but our use will be quite simple.

The first argument passed to the printf function is essentially a template, surrounded by quotes, for what will be printed on the screen. It is simply the text that you want printed, with the % character, immediately followed by some conversion code characters, at each place that you want the value of a variable substituted.

This text string is followed by a comma-separated list of the names of the variables to be substituted in the same order that their respective % codes appear in the template. You should enter the program above, compile it, and then run it. I am convinced that you will see how printf works in this simple program.

Common conversion codes (used after the % character) are:

`u`	unsigned decimal integer
`d` or `i`	signed decimal integer
`f`	floating point number
`x`	unsigned integer in hexadecimal

The conversion codes may include other characters to specify properties like the field width of the display, whether the value is left or right justified within the field, etc. We will not cover the details here. You can read the printf entry in section 3 of the man pages (man 3 printf) to learn more.

Our next program illustrates how to read numbers from the keyboard.

/* echoDecHex.c
 * Asks user to enter an unsigned integer in decimal and
 * one in hexadecimal then echoes both in both bases.
 * 2017-09-29: Bob Plantz
 */

#include <stdio.h>

int main(void)
{
  unsigned int unsignedInteger;
  unsigned int bitPattern;

  printf("Enter an unsigned decimal integer: ");
  scanf("%u", &unsignedInteger);

  printf("Enter a bit pattern in hexadecimal: ");
  scanf("%x", &bitPattern);

  printf("%u is stored as %#010x, and\n", unsignedInteger, unsignedInteger);
  printf("%#010x represents the unsigned decimal integer %u\n",
             bitPattern, bitPattern);

  return 0;
}

Listing 2.9.2.

The primary new concept introduced in Listing 2.9.2 is the scanf function from the C Standard Library. After the program calls printf to write a message on the screen, it calls scanf to read what the user types on the keyboard. Since the user is asked to type an unsigned decimal integer, the conversion code is %u. But with scanf we need to prefix the variable name, unsignedInteger, with the & character. In C the & is the “address of” operator, so the address where the variable is stored in memory is passed as an argument to scanf. The address is needed because scanf will read the keystrokes from the keyboard, convert the number to its equivalent binary representation, and store this bit pattern in memory at this address. You will get a chance to do this conversion in Chapter 14.

The second call to printf writes a message on the screen asking the user to enter a bit pattern in hexadecimal, so scanf needs the %x conversion code. After both numbers are read from the keyboard, printf is called to write each of them on the screen twice, once in each base. Enter the program, compile it, and run it. Now you have your own decimal-to-hexadecimal and hexadecimal-to-decimal conversion program.

The %u conversion code that is used when reading the integers with scanf and writing them with printf indicates that these are unsigned integers. The %#010x conversion code shows some of the more complex formatting that can be done with printf. The # together with the x tells printf to write the number in hexadecimal and prefix it with 0x. The first 0 tells printf to zero pad the displayed number so that it is the full width of the integer, that is, \(32\) bits. Thirty-two bits require eight hexadecimal digits to display, and the prefix is two more characters, so the 10 tells printf to write \(10\) characters on the screen. Feel free to play around with this conversion code in your program to see the effects of changing it. You won't break anything.

The program in Listing 2.9.2 demonstrates a very important concept—hexadecimal is used as a human convenience for stating bit patterns. A number is not inherently binary, decimal, or hexadecimal. A particular value can be expressed in a precisely equivalent way in each of these three number bases. For that matter, it can be expressed equivalently in any number base. But since a computer is made up of binary switches, all numbers are stored in binary in the computer.