The program in Listing 10.5 uses a while loop to write “Hello World” on the screen one character at a time. A common programming error is to create an “infinite” loop. It would be nice to have a tool that allows us to stop such a program in the middle of the loop so we can observe the state of registers and memory locations. That can help us to determine such things as whether the loop control variable is being changed as we planned.
Fortunately, the gnu program development environment includes a debugger, gdb (see [29]), that allows us to do just that. The gdb debugger allows you to load another program into memory and use gdb commands to control the execution of the other program — the target program — and to observe the states of its variables.
There is another, very important, reason for learning how to use gdb. This book describes how registers and memory are controlled by computer instructions. The gdb program is a very valuable learning tool, since it allows you to observe the behavior of each instruction, one step at a time.
gdb has a large number of commands, but the following are the most common ones that will be used in this book:
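As a quick reference, the commands that appear in the session described below can be summarized roughly as follows. This is only a sketch. The spellings are standard gdb abbreviations, but the br and delete forms are my assumption, since the text names the actions rather than the exact commands, and N stands for a placeholder number.

```text
# li            list ten lines of source code
# li N          list ten lines centered around line N
# br N          set a breakpoint at source line N (assumed spelling of "break")
# run           start the target program; it runs until a breakpoint is reached
# i r           display all the registers ("info registers")
# i r rsi       display only the rsi register
# help x        brief reminder of the format codes for the x command
# x/10c $rsi    examine ten bytes, as characters, starting where rsi points
# si            execute a single instruction ("stepi")
# cont          resume execution until the next breakpoint ("continue")
# delete N      remove breakpoint number N (assumed command)
# q             exit from gdb ("quit")
```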
Here is a screen shot of how I assembled, linked, and then used gdb to control the execution of the program and observe its behavior. User input is boldface and the session is annotated in italics.
After assembling and linking the program, we start the gdb program and load the helloworld program (the one we want to observe) into memory. This leaves me in gdb. The target program is not yet running.
The li command lists ten lines of the source code.
We are trying to observe the while loop. Providing an argument to the li command causes it to list ten lines centered around the value of the argument. We still do not see the entire loop. Pressing the Enter key tells gdb to repeat the immediately previous command. The li command is smart enough to list the next ten lines (only eight in this example since that takes us to the end of the source code in this file).
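The listing steps just described might be typed at the (gdb) prompt roughly as follows. The argument 28 is a hypothetical choice for illustration; the text does not say which line number was actually used.

```text
# List the first ten lines of the source file.
li
# List ten lines centered around a chosen line (28 is a hypothetical value).
li 28
# Pressing Enter on an empty line at this point repeats li,
# which continues with the next lines of the source file.
```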
From the listed source code, we can see that the decision to exit the loop is made on line 29 in the source code. The jump to the allDone label will occur if the cmpb instruction on line 28 shows that the rsi register is pointing to a byte that contains zero, the ASCII NUL character. I set a breakpoint at line 29 so we can see what rsi is pointing to.
I also set a breakpoint at line 37, the target of the jump. This second breakpoint serves as a sort of “safety net” in case I did not read the code correctly. If the program does not reach the breakpoint within the loop, perhaps I can work backwards and figure out my error from examining the registers and memory at this point.
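A minimal sketch of setting these two breakpoints is shown here. The line numbers come from the text. The br abbreviation of gdb's break command is assumed, since the exact spelling used in the session is not shown.

```text
# Breakpoint inside the loop, at the conditional jump.
br 29
# "Safety net" breakpoint at the target of the jump.
br 37
# Optional check: list the breakpoints and the numbers gdb assigned to them.
info breakpoints
```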
The run command causes the target program, helloworld, to execute until it reaches a breakpoint. Control then returns to the gdb program.
IMPORTANT: The instruction at the breakpoint is not executed when the break occurs. It will be the first instruction to be executed when we command gdb to resume execution of the target program.
The i r command (notice the space between “i” and “r”) is used to display all the registers. For each register, the first value shown is its contents in hexadecimal, and the second is the same contents in decimal. Addresses are usually stated in hexadecimal, so the contents of registers that are supposed to hold only addresses are not converted to decimal.
Since our primary interest is the rsi register, we can simplify the display by explicitly specifying which register(s) to display.
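The sequence from starting the target program to displaying a single register can be sketched as follows. These are the commands only; the values gdb prints depend on the machine and are not reproduced here.

```text
# Start the target program; it runs until it reaches a breakpoint.
run
# Display all the registers.
i r
# Display only the register of interest.
i r rsi
```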
We should examine the byte that rsi is pointing to because that determines whether this jump instruction transfers control or not. The help x command provides a very brief reminder of the codes to use. The character display (c) shows two values for each byte — first in decimal, then the equivalent ASCII letter. We can see that rsi is pointing to the beginning of the text string. I chose to display ten characters to confirm that this is the correct text string.
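One plausible way to type this examination is sketched below. The count of ten and the c format come from the text. The exact form x/10c $rsi is my assumption about how they were combined.

```text
# Brief reminder of the format codes accepted by the x command.
help x
# Examine ten bytes, displayed as characters, starting at the
# address currently held in rsi.
x/10c $rsi
```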
We use the si command to single-step through a portion of the program. Recall that simply pushing the Enter key repeats the immediately previous gdb command.
The last step in this sequence gave an odd result. It caused the program to execute the call instruction, which took us into the write function. Since write is a library function, gdb does not have access to its source code. Hence, it cannot display the source code for us.
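Single-stepping as described might look like this at the (gdb) prompt. How many times Enter must be pressed before the call instruction is executed depends on the code in Listing 10.5, so no count is given.

```text
# Execute exactly one machine instruction, then return to gdb.
si
# Pressing Enter on an empty line at this point repeats si.
# Once the call instruction is executed, gdb is stopped inside
# the write function and has no source code to display.
```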
Not wanting to single-step through the write function, I use the cont command. The program displays the first letter of the string, “H”, on the screen, then loops back and breaks again at line 29. I display register rsi and examine the memory it is pointing to. We can see that the pointer variable, aString, is marching through the text string one character at a time.
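A sketch of this step follows. The x/10c form is assumed; any count that shows enough of the string would serve the same purpose.

```text
# Resume execution; the program stops again at the breakpoint
# inside the loop.
cont
# See where rsi is pointing now.
i r rsi
# Examine the bytes at that address as characters.
x/10c $rsi
```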
Continuing the program shows that it will break back into gdb each time through the loop. We are reasonably confident that the loop is executing properly, so we remove the breakpoint in the loop.
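Removing the breakpoint in the loop could be done as sketched here. I am assuming it is breakpoint number 1, which is the number gdb gives to the first breakpoint set in a session; info breakpoints shows the actual numbers.

```text
# Show the breakpoints and their numbers.
info breakpoints
# Remove the breakpoint inside the loop (assumed to be number 1).
delete 1
```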
With the breakpoint inside the loop removed, continuing the program displays the remainder of the text. Then it breaks at the breakpoint we set outside the loop. Recall that I set the breakpoint at line 37, but the program breaks at line 38. The reason is that there is no instruction on line 37, just a label. The first instruction following the label is on line 38.
I then look at the address in rsi. By examining two bytes previous to where it is currently pointing, we can easily see the last two characters that the program displayed before reaching the NUL character. And it is the NUL character that caused the loop to terminate.
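Looking backwards from the address in rsi can be sketched as follows. The count of three (the two preceding characters plus the byte rsi points to) is a hypothetical choice; the text does not give the exact command.

```text
# Show the address currently held in rsi.
i r rsi
# Examine three bytes as characters, starting two bytes before
# the address in rsi. The last one should be the NUL character.
x/3c $rsi - 2
```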
When we continue the program, it completes normally. Notice that even though our target program has completed, we are still in gdb. We need to use the q command to exit from gdb.
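The final two commands of such a session are sketched here. gdb's report on how the program exited is not reproduced.

```text
# Resume execution; the target program runs to completion.
cont
# The target program has ended, but gdb is still running. Exit from gdb.
q
```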