Section 2.11 Examining Memory With a Debugger

Now that we have started writing programs, you need to learn how to use the GNU debugger, gdb. It may seem premature at this point. The programs are so simple, they hardly require debugging. Well, it is better to learn how to use the debugger on a simple example than on a complicated program that does not work. In other words, tackle one problem at a time.

There is a better reason for learning how to use gdb now. You will find that it is a very valuable tool for learning the material in this book, even when you write bug-free programs.

gdb has a large number of commands. The few here will be sufficient to get you started. You will see more in Section 8.5.

br source-filename:line-number — Set a breakpoint at the specified line-number in the source file, source-filename. Control will return to gdb when the line number is encountered.
cont — Continue program execution from the current location.
help command — Help on how to use command.
i r — Show the contents of the registers (“info registers”).
li LineNumber — List ten lines of the source code, centered at the line number specified by LineNumber.
print Expression — Evaluate Expression and print the value.
printf "format", var1, var2,… — Display the values of var1, var2,…. The "format" string follows the same rules as the printf in the C Standard Library.
r — Begin execution of a program that has been loaded under control of gdb.
x/nfs MemoryAddress — Display (examine) n values in memory in format f of size s starting at MemoryAddress.

Let us walk through an example of using gdb to explore the concepts covered in this book. I will use the program in Listing 2.9.1. I recommend that you get on your Raspberry Pi and follow along as you read this. I will describe the purpose of each command as we walk through this terminal interaction. The addresses you see on your Raspberry Pi will probably be different from those in this example.

pi@rpi3:~/chp02 $ gcc -g -Wall -o intAndFloat intAndFloat.c

The “-g” option is required to tell the compiler to include debugger information in the executable program.

pi@rpi3:~/chp02 $ gdb ./intAndFloat
GNU gdb (Raspbian 7.12-6) 7.12.0.20161007-git
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabihf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help". Type "apropos word" to search for commands related to "word"...
Reading symbols from ./intAndFloat...done.
(gdb) li
1	/* intAndFloat.c
2	 * Using printf to display an integer and a float.
3	 * Bob Plantz - 18 Nov 2015
4	 */
5	#include <stdio.h>
6	
7	int main(void)
8	{
9	  int anInt = 19088743;
10	  float aFloat = 19088.743;
(gdb) 
11	
12	  printf("The integer is %d and the float is %f\n", anInt, aFloat);
13	
14	  return 0;
15	}
16

The li command lists ten lines of source code. Control returns to the gdb program as shown by the (gdb) prompt. Simply pushing the return key will repeat the previous command, and li is smart enough to display the next (up to) ten lines.

(gdb) br 12
Breakpoint 1 at 0x1045c: file intAndFloat.c, line 12.

I set a breakpoint at line 12. When the program is executing, if it ever gets to this statement, execution will pause before the statement is executed, and control will return to gdb.

(gdb) run
Starting program: /home/pi/chp02/intAndFloat 

Breakpoint 1, main () at intAndFloat.c:12
12	   printf("The integer is %i and the float is %f\n", anInt, aFloat);

The run command causes the program to start execution from the beginning. When it reaches our breakpoint, control returns to gdb, which displays the program statement that is ready to be executed.

(gdb) print anInt
$1 = 19088743
(gdb) print aFloat
$2 = 19088.7422

The print command displays the value currently stored in the named variable. There is a round-off error in the float value. As mentioned above, this will be explained in Chapter 16.

(gdb) printf "anInt = %i and aFloat = %f\n", anInt, aFloat
anInt = 19088743 and aFloat = 19088.742188

The printf command can be used to format the displayed values. The formatting string is essentially the same as for the printf function in the C Standard Library.

(gdb) printf "anInt = %#010x and aFloat = %#010x\n", anInt, aFloat
anInt = 0x01234567 and aFloat = 0x00004a90

Take a moment and convert the hexadecimal values to decimal. The value of anInt is correct, but the value of aFloat is \(19088_{10}\text{.}\) The reason for this odd behavior is that the x formatting character causes the printf command to first convert the value to an int, then display that int in hexadecimal. In C/C++, conversion from float to int truncates the fractional part, so the \(.743\) is lost.

Fortunately, gdb provides another command for examining the contents of memory directly—that is, the actual bit patterns. In order to use this command, we need to determine the actual memory addresses where the anInt and aFloat variables are stored.

(gdb) print &anInt
$3 = (int *) 0x7efff194
(gdb) print &aFloat
$4 = (float *) 0x7efff190

The address-of operator (&) can be used to print the address of a variable. Notice that the addresses are very large.

(gdb) help x
Examine memory: x/FMT ADDRESS.
ADDRESS is an expression for the memory address to examine.
FMT is a repeat count followed by a format letter and a size letter.
Format letters are o(octal), x(hex), d(decimal), u(unsigned decimal),
  t(binary), f(float), a(address), i(instruction), c(char), s(string)
  and z(hex, zero padded on the left).
Size letters are b(byte), h(halfword), w(word), g(giant, 8 bytes).
The specified number of objects of the specified size are printed
according to the format.

Defaults for format and size letters are those previously used.
Default count is 1. Default address is following last thing printed
with this command or "print".

The x command is used to examine memory. Its help message is very brief, but it tells you everything you need to know.

(gdb) x/1dw 0x7efff194
0x7efff194:	19088743

The x command can be used to display the values in their stored data type. Read the help message carefully, and you will see that I commanded gdb to display one decimal (integer) word at the address 0x7efff194.

(gdb) x/1fw 0x7efff190
0x7efff190:	19088.7422

And this time I commanded gdb to display one float (in decimal) word at the address 0x7efff190. Be careful not to confuse “decimal” and “float” here.

(gdb) x/1xw 0x7efff194
0x7efff194:	0x01234567
(gdb) x/4xb 0x7efff194
0x7efff194:	0x67	0x45	0x23	0x01

The hexadecimal display of the anInt variable, which is located at memory address \(\hex{7efff194}\text{,}\) looks good.

The second command displays the bytes separately. The first byte (the one that contains \(\hex{67}\)) is located at the address shown on the left of the row. The next byte in the row is at the subsequent address (\(\hex{7efff195}\)). So this row displays each of the bytes stored at the four memory addresses \(\hex{7efff194}\text{,}\) \(\hex{7efff195}\text{,}\) \(\hex{7efff196}\text{,}\) and \(\hex{7efff197}\text{,}\) reading from left to right, that make up the variable, anInt.

Notice, however, that when these same four bytes are displayed separately, the least significant byte appears first in memory, at the lowest address. This is called little endian storage order, which will be explained below.

(gdb) x/1xw 0x7efff190
0x7efff190:	0x4695217c
(gdb) x/4xb 0x7efff190
0x7efff190:	0x7c	0x21	0x95	0x46

The display of the aFloat variable in hexadecimal simply looks wrong. This is due to the storage format of floats, which is very different from ints. It will be explained in Chapter 16.

The byte-by-byte display of the aFloat variable in hexadecimal also shows that it is stored in little endian order.

(gdb) cont
Continuing.
The integer is 19088743 and the float is 19088.742188
[Inferior 1 (process 1274) exited normally]
(gdb) q
pi@rpi3:~/chp02 $

Finally, I continue to the end of the program. Notice that gdb is still running and I have to quit it.

This example illustrates a concept known as endianness:

Little Endian: Data is stored in memory with the least significant byte in the lowest-numbered address. That is, the “littlest” byte (in the sense of significance) comes first in memory.
Big Endian: Data is stored in memory with the most significant byte in the lowest-numbered address. That is, the “biggest” byte (in the sense of significance) comes first in memory.

Look again at the display of the four bytes beginning at \(\hex{7efff194}\) above. We can rearrange this display to show the bit patterns at each of the four locations:

\(\hex{7efff194}\text{:}\)	\(\hex{67}\)
\(\hex{7efff195}\text{:}\)	\(\hex{45}\)
\(\hex{7efff196}\text{:}\)	\(\hex{23}\)
\(\hex{7efff197}\text{:}\)	\(\hex{01}\)

Yet when we look at the entire 32-bit value in hexadecimal the bytes seem to be arranged in the proper order:

\(\hex{7efff194}\text{:}\)

\(\hex{01234567}\)

When we examine memory one byte at a time, the bytes are displayed in order of ascending address. At first glance, the value appears to be stored backwards, but it is correct for the endianness of our environment.

Our environment is configured as little endian, but the ARM can be configured as either little or big endian. In the vast majority of programming situations, endianness is not an issue. However, you need to know about it because:

It can be confusing when examining individual locations in memory.
Using the contents of a variable with a data type other than the one used to store the contents often causes a programming error.
Mixing environments, for example, reading a file that was created in another environment, may cause errors.