Appendix C
Using the gdb Debugger for Assembly Language

The program in Listing 10.5 uses a while loop to write “Hello World” on the screen one character at a time. A common programming error is to create an “infinite” loop. It would be nice to have a tool that allows us to stop such a program in the middle of the loop so we can observe the state of registers and memory locations. That can help us to determine such things as whether the loop control variable is being changed as we planned.

Fortunately, the GNU program development environment includes a debugger, gdb (see [29]), that allows us to do just that. The gdb debugger allows you to load another program, the target program, into memory and use gdb commands to control its execution and to observe the states of its variables.

There is another, very important, reason for learning how to use gdb. This book describes how registers and memory are controlled by computer instructions. The gdb program is a very valuable learning tool, since it allows you to observe the behavior of each instruction, one step at a time.

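gdb has a large number of commands, but only a few are needed for the session that follows: li (list source lines), br (set a breakpoint), run (start the target program), i r (display registers), x (examine memory), si (execute one instruction), cont (continue execution), clear (remove a breakpoint), help, and q (quit).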

Here is a screen shot of how I assembled, linked, and then used gdb to control the execution of the program and observe its behavior. User input is boldface and the session is annotated in italics.

bob@ubuntu:~$ as --gstabs -o helloWorld3.o helloWorld3.s
bob@ubuntu:~$ gcc -o helloWorld3 helloWorld3.o
bob@ubuntu:~$ gdb helloWorld3
GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu2) 7.4-2012.04
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.launchpad.net/gdb-linaro/>...
Reading symbols from /home/bob/progs/chap10/helloWorld3...done.

After assembling and linking the program, we start the gdb program and load the helloWorld3 program, the one we want to observe, into memory. This leaves me in gdb. The target program is not running.

(gdb) li
1 # helloWorld3.s
2 # "hello world" program using the write() system call
3 # one character at a time.
4 # Bob Plantz - 12 June 2009
5
6 # Useful constants
7         .equ    STDOUT,1
8 # Stack frame
9         .equ    aString,-8
10        .equ    localSize,-16

The li command lists ten lines of the source code.

(gdb) li 29
24         movl    $theString, %esi
25         movl    %esi, aString(%rbp) # *aString = "Hello World.\n";
26 whileLoop:
27         movl    aString(%rbp), %esi # current char in string
28         cmpb    $0, (%esi)  # null character?
29         je      allDone     # yes, all done
30
31         movl    $1, %edx    # one character
32         movl    $STDOUT, %edi  # standard out
33         call    write       # invoke write function
(gdb)
34
35         incl    aString(%rbp)  # aString++;
36         jmp     whileLoop   # back to top
37 allDone:
38         movl    $0, %eax    # return 0;
39         movq    %rbp, %rsp  # restore stack pointer
40         popq    %rbp        # restore base pointer
41         ret

We are trying to observe the while loop. Providing an argument to the li command causes it to list ten lines centered around the value of the argument. We still do not see the entire loop. Pressing the Enter key tells gdb to repeat the immediately previous command. The li command is smart enough to list the next ten lines (only eight in this example since that takes us to the end of the source code in this file).

(gdb) br 29
Breakpoint 1 at 0x40050b: file helloWorld3.s, line 29.
(gdb) br 37
Breakpoint 2 at 0x400521: file helloWorld3.s, line 37.

From the listed source code, we can see that the decision to exit the loop is made on line 29. The jump to the allDone label will occur if the cmpb instruction on line 28 shows that the rsi register (the code uses esi, its low-order 32 bits) is pointing to a byte that contains zero, the ASCII NUL character. I set a breakpoint at line 29 so we can see what rsi is pointing to.

I also set a breakpoint at line 37, the target of the jump. This second breakpoint serves as a sort of “safety net” in case I did not read the code correctly. If the program does not reach the breakpoint within the loop, perhaps I can work backwards and figure out my error from examining the registers and memory at this point.
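
For readers who find the loop easier to follow in C, the comments in the listing suggest an equivalent along the following lines. This is only a sketch, not the book's code: the function name write_one_at_a_time and its return value are assumptions added for illustration.

```c
#include <stddef.h>
#include <unistd.h>

#define STDOUT 1   /* same value as the .equ in the listing */

/* Write a string one character at a time, as the loop on
 * lines 26-36 of the listing does. The return value (number of
 * characters written) is a hypothetical addition so that the
 * result can be checked. */
size_t write_one_at_a_time(const char *aString)
{
    size_t count = 0;

    while (*aString != '\0') {               /* cmpb $0, (%esi) ; je allDone */
        if (write(STDOUT, aString, 1) != 1)  /* call write, one character    */
            break;                           /* give up on a write error     */
        aString++;                           /* incl aString(%rbp)           */
        count++;
    }
    return count;
}
```

If the aString++ statement were omitted, the test at the top of the loop would see the same character on every pass and the loop would never end. That is the kind of error the breakpoint at line 29 is meant to expose.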

(gdb) run
Starting program: /home/bob/progs/chap10//helloWorld3

Breakpoint 1, whileLoop () at helloWorld3.s:29
29         je      allDone     # yes, all done

The run command causes the target program, helloWorld3, to execute until it reaches a breakpoint. Control then returns to the gdb program.

IMPORTANT: The instruction at the breakpoint is not executed when the break occurs. It will be the first instruction to be executed when we command gdb to resume execution of the target program.

(gdb) i r
rax            0x7ffff7dd6568 140737351869800
rbx            0x0 0
rcx            0x400530 4195632
rdx            0x7fffffffe128 140737488347432
rsi            0x40061c 4195868
rdi            0x1 1
rbp            0x7fffffffe030 0x7fffffffe030
rsp            0x7fffffffe020 0x7fffffffe020
r8             0x4005c0 4195776
r9             0x7ffff7de9740 140737351948096
r10            0x7fffffffde90 140737488346768
r11            0x7ffff7a3e680 140737348101760
r12            0x400410 4195344
r13            0x7fffffffe110 140737488347408
r14            0x0 0
r15            0x0 0
rip            0x40050b 0x40050b <whileLoop+7>
eflags         0x206 [ PF IF ]
cs             0x33 51
ss             0x2b 43
ds             0x0 0
es             0x0 0
fs             0x0 0
gs             0x0 0
(gdb) i r rsi
rsi            0x40061c 4195868

The i r command (notice the space between “i” and “r”) displays all the registers. After each register name, the first value is the contents of the register in hexadecimal, and the second is the same contents in decimal. Addresses are usually stated in hexadecimal, so the contents of registers that are supposed to hold only addresses (rbp, rsp, and rip here) are not converted to decimal.

Since our primary interest is the rsi register, we can simplify the display by explicitly specifying which register(s) to display.
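
To confirm that the two values shown for a register are the same number in different bases, a few lines of C can reproduce one row of the display. This is a sketch under assumptions: format_register is a hypothetical helper, not part of gdb, and the 15-column padding is inferred from the output above.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format one register roughly the way the i r display does:
 * name padded to 15 columns, then the value in hexadecimal,
 * then the same value in decimal.
 * Returns the number of characters produced, as snprintf does. */
int format_register(char *buf, size_t size, const char *name, uint64_t value)
{
    return snprintf(buf, size, "%-15s0x%" PRIx64 " %" PRIu64,
                    name, value, value);
}
```

Calling format_register with the name "rsi" and the value 0x40061c fills the buffer with the same text as the rsi row in the session, showing that 0x40061c and 4195868 are the same value.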

(gdb) help x
Examine memory: x/FMT ADDRESS.
ADDRESS is an expression for the memory address to examine.
FMT is a repeat count followed by a format letter and a size letter.
Format letters are o(octal), x(hex), d(decimal), u(unsigned decimal),
  t(binary), f(float), a(address), i(instruction), c(char) and s(string).
Size letters are b(byte), h(halfword), w(word), g(giant, 8 bytes).
The specified number of objects of the specified size are printed
according to the format.

Defaults for format and size letters are those previously used.
Default count is 1.  Default address is following last thing printed
with this command or "print".
(gdb) x/10cb 0x40061c
0x40061c <theString>: 72 'H'  101 'e'  108 'l'  108 'l'  111 'o'  32 ' '  119 'w'  111 'o'
0x400624 <theString+8>: 114 'r'  108 'l'

We should examine the byte that rsi is pointing to because that determines whether this jump instruction transfers control or not. The help x command provides a very brief reminder of the codes to use. The character display (c) shows two values for each byte: first the decimal value, then the equivalent ASCII character. We can see that rsi is pointing to the beginning of the text string. I chose to display ten characters to confirm that this is the correct text string.
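
The two-value display is easy to imitate in C, which may help when reading it. The following is a rough sketch, not gdb's actual code: format_char_byte is a hypothetical helper, and it shows every non-printing byte as a three-digit octal escape, whereas gdb uses short escapes such as '\n' for some of them.

```c
#include <ctype.h>
#include <stdio.h>

/* Format one byte approximately as the x command's c format does:
 * the decimal value, then the character in single quotes.
 * Non-printing bytes appear as a backslash and three octal digits.
 * Returns the number of characters produced, as snprintf does. */
int format_char_byte(char *buf, size_t size, unsigned char byte)
{
    if (isprint(byte))
        return snprintf(buf, size, "%u '%c'", (unsigned)byte, byte);
    return snprintf(buf, size, "%u '\\%03o'", (unsigned)byte, (unsigned)byte);
}
```

For the first byte of the string, 72, this produces 72 'H'. For a byte containing zero it produces 0 '\000', which is how the NUL character at the end of the string will appear.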

(gdb) si
31         movl    $1, %edx    # one character
(gdb)
32         movl    $STDOUT, %edi  # standard out
(gdb)
33         call    write       # invoke write function
(gdb)
0x00000000004003f0 in write@plt ()

We use the si command to single-step through a portion of the program. Recall that simply pushing the Enter key repeats the immediately previous gdb command.

The last step in this sequence gave an odd result. It caused the program to execute the call instruction, which took us into the write function. Since write is a library function, gdb does not have access to its source code. Hence, it cannot display the source code for us.

(gdb) cont
Continuing.
H
Breakpoint 1, whileLoop () at helloWorld3.s:29
29         je      allDone     # yes, all done
(gdb) i r rsi
rsi            0x40061d 4195869
(gdb) x/10cb 0x40061d
0x40061d <theString+1>: 101 'e'  108 'l'  108 'l'  111 'o'  32 ' '  119 'w'  111 'o'  114 'r'
0x400625 <theString+9>: 108 'l'  100 'd'

Not wanting to single-step through the write function, I use the cont command. The program displays the first letter of the string, “H”, on the screen, then loops back and breaks again at line 29. I display register rsi and examine the memory it is pointing to. We can see that the pointer variable, aString, is marching through the text string one character at a time.

(gdb) cont
Continuing.
e
Breakpoint 1, whileLoop () at helloWorld3.s:29
29         je      allDone     # yes, all done
(gdb) clear 29
Deleted breakpoint 1

Continuing the program shows that it will break back into gdb each time through the loop. We are reasonably confident that the loop is executing properly, so we remove the breakpoint in the loop.

(gdb) cont
Continuing.
llo world.

Breakpoint 2, allDone () at helloWorld3.s:38
38         movl    $0, %eax    # return 0;
(gdb) i r rsi
rsi            0x400629 4195881
(gdb) x/10cb 0x400629
0x400629 <theString+13>: 0 '\000'  0 '\000'  0 '\000'  1 '\001'  27 '\033'  3 '\003'  59 ';'  32 ' '
0x400631: 0 '\000'  0 '\000'

With the breakpoint inside the loop removed, continuing the program displays the remainder of the text. Then it breaks at the breakpoint we set outside the loop. Recall that I set the breakpoint at line 37, but the program breaks at line 38. The reason is that there is no instruction on line 37, just a label. The first instruction following the label is on line 38.

I then look at the address in rsi and examine the memory it is pointing to. The first byte displayed, at theString+13, contains zero. This is the NUL character that follows the last character of the text string, and the bytes after it lie beyond the end of the string. And it is the NUL character that caused the loop to terminate.
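
The labels that gdb prints in front of each row, such as <theString+1>, are simply the address minus the address of the symbol. A short C sketch makes the arithmetic explicit. The helper format_symbolic is hypothetical and only imitates the label format seen in this session; it is not a gdb function.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Build a label such as "0x400629 <theString+13>" from an address
 * and the address of the symbol that precedes it.
 * Assumes addr >= base. Returns the number of characters
 * produced, as snprintf does. */
int format_symbolic(char *buf, size_t size, const char *symbol,
                    uint64_t base, uint64_t addr)
{
    uint64_t offset = addr - base;

    if (offset == 0)   /* exactly at the symbol: no "+offset" part */
        return snprintf(buf, size, "0x%" PRIx64 " <%s>", addr, symbol);
    return snprintf(buf, size, "0x%" PRIx64 " <%s+%" PRIu64 ">",
                    addr, symbol, offset);
}
```

With base 0x40061c and address 0x400629 the offset is 13, which is the number of characters in "Hello world.\n". In other words, rsi has stopped exactly on the byte that follows the last character of the string.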

(gdb) cont
Continuing.
[Inferior 1 (process 2612) exited normally]
(gdb) q
bob@ubuntu:~$

When we continue the program, it completes normally. Notice that even though our target program has completed, we are still in gdb. We need to use the q command to exit from gdb.