8 Program Data – Input, Store, Output

Most programs follow a similar pattern:

Read data from an input device, such as the keyboard, a disk ﬁle, the internet, etc., into main memory.
Load data from main memory into CPU registers.
Perform arithmetic/logic operations on the data.
Store the results in main memory.
Write the results to an output device, such as the screen, a disk ﬁle, audio speakers, etc.

In this chapter you will learn how to call functions that can read input from the keyboard, allocate memory for storing data, and write output to the screen.

8.1 Calling write in 64-bit Mode

We start with a program that has no input. It simply writes constant data to the screen — the “Hello World” program.

We will use the C system call function write to display the text on the screen and show how to call it in assembly language. As we saw in Section 2.8 (page 53) the write function requires three arguments. Reading the argument list from left to right in Listing 8.1:

STDOUT_FILENO is the ﬁle descriptor of standard out, normally the screen. This symbolic name is deﬁned in the unistd.h header ﬁle.
Although the C syntax allows a programmer to place the text string here, only its address is passed to write, not the entire string.
The programmer has counted the number of characters in the text string to write to STDOUT_FILENO.

 1 /*
 2  * helloWorld2.c
 3  *
 4  * "hello world" program using the write() system call.
 5  * Bob Plantz - 8 June 2009
 6  */
 7 #include <unistd.h>
 8
 9 int main(void)
10 {
11
12     write(STDOUT_FILENO, "Hello world.\n", 13);
13
14     return 0;
15 }

Listing 8.1: “Hello world” program using the write system call function (C).

This program uses only constant data — the text string “Hello world.” Constant data used by a program is part of the program itself and is not changed by the program.

Looking at the compiler-generated assembly language in Listing 8.2, the constant data appears on line 4, as indicated by the comment added on that line. Comments have also been added on lines 11 – 14 to explain the argument set up for the call to write.

 1         .file   "helloWorld2.c"
 2         .section        .rodata
 3 .LC0:
 4         .string "Hello world.\n"  # constant data
 5         .text
 6         .globl  main
 7         .type   main, @function
 8 main:
 9         pushq   %rbp
10         movq    %rsp, %rbp
11         movl    $13, %edx     # third argument
12         movl    $.LC0, %esi   # second argument
13         movl    $1, %edi      # first argument
14         call    write
15         movl    $0, %eax
16         popq    %rbp
17         ret
18         .size   main, .-main
19         .ident  "GCC: (Ubuntu/Linaro 4.7.0-7ubuntu3) 4.7.0"
20         .section        .note.GNU-stack,"",@progbits

Listing 8.2: “Hello world” program using the write system call function (gcc assembly language).

Data can only be located in one of two places in a computer:

in memory, or
in a CPU register.

(We are ignoring the case of reading from an input device or writing to an output device here.) Recall from the discussion of memory segments on page 502 that the Linux kernel uses diﬀerent memory segments for the various parts of a program. The directive on line 2,

2 .section .rodata

uses the .section assembler directive to direct the assembler to store the data that follows in a “read-only data” section in the object file. Even though it begins with a ‘.’ character, .rodata is not an assembler directive but the name of a section in an ELF file.

Your ﬁrst thought is probably that the .rodata section should be loaded into a data segment in memory, but recall that data memory segments are read/write. Thus .rodata sections are mapped into a text segment, which is a read-only memory segment.

The .string directive on line 4,

3 .LC0:
4         .string "Hello world.\n"  # constant data

allocates enough bytes in memory to hold each of the characters in the text string, plus one for the NUL character at the end. The ﬁrst byte contains the ASCII code for the character ’H’, the second the ASCII code for ’e’, etc. Notice that the last character in this string is ’\n’, the newline character; it occupies only one byte of memory. So fourteen bytes of memory are allocated in the .rodata section in this program, and each byte is set to the corresponding ASCII code for each character in the text string. The label on line 3 provides a symbolic name for the beginning address of the text string so that the program can refer to this memory location.
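The byte counts above can be checked from C. The following is a minimal sketch, not part of the original program: the array name helloText and the three function names are assumptions made for illustration. It relies only on the facts that sizeof counts the terminating NUL and strlen does not.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The same constant text as in Listing 8.1. The name helloText is
 * invented for this sketch. */
static const char helloText[] = "Hello world.\n";

/* Bytes set aside for the string: every character plus the NUL. */
size_t bytesAllocated(void)
{
    return sizeof helloText;
}

/* Characters before the NUL: the count that is passed to write(). */
size_t charactersToWrite(void)
{
    return strlen(helloText);
}

/* ASCII code stored at a given byte offset within the string. */
int codeAt(size_t offset)
{
    assert(offset < sizeof helloText);  /* stay inside the allocated bytes */
    return helloText[offset];
}
```

Here bytesAllocated() corresponds to the fourteen bytes reserved by .string, while charactersToWrite() corresponds to the 13 that the program passes to write. The newline at offset 12 is a single byte, and the NUL follows it at offset 13.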

The most common directives for allocating memory for data are shown in Table 8.1.


Label	Directive	Operand	Effect
[label]	.space	expression	evaluates expression and allocates that many bytes, each set to zero unless a fill value is supplied
[label]	.string	"text"	initializes memory to null-terminated string
[label]	.asciz	"text"	same as .string
[label]	.ascii	"text"	initializes memory to the string without null
[label]	.byte	expression	allocates one byte and initializes it to the value of expression
[label]	.word	expression	allocates two bytes and initializes them to the value of expression
[label]	.long	expression	allocates four bytes and initializes them to the value of expression
[label]	.quad	expression	allocates eight bytes and initializes them to the value of expression

Table 8.1: Common assembler directives for allocating memory. The label is optional.
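For readers who think in C, the directives in Table 8.1 have rough counterparts in C definitions. The sketch below is only an analogy under that assumption; the function and array names are invented for illustration, and a compiler is free to lay out these objects differently from hand-written assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes allocated by one of the fixed-size directives, using the C
 * type that has the same size. Returns 0 for any other directive. */
size_t directiveBytes(const char *directive)
{
    assert(directive != NULL);
    if (strcmp(directive, ".byte") == 0) return sizeof(uint8_t);
    if (strcmp(directive, ".word") == 0) return sizeof(uint16_t);
    if (strcmp(directive, ".long") == 0) return sizeof(uint32_t);
    if (strcmp(directive, ".quad") == 0) return sizeof(uint64_t);
    return 0;
}

static const char withNul[]    = "text";                 /* like .string "text" */
static const char withoutNul[] = { 't', 'e', 'x', 't' }; /* like .ascii  "text" */
static uint8_t reserved[24];                             /* like .space  24     */

size_t stringBytes(void) { return sizeof withNul; }      /* includes the NUL */
size_t asciiBytes(void)  { return sizeof withoutNul; }   /* no NUL stored    */

/* Value of one byte in the reserved area. */
int reservedByteAt(size_t index)
{
    assert(index < sizeof reserved);
    return reserved[index];
}
```

The one-byte difference between stringBytes() and asciiBytes() is the NUL that .string and .asciz append and .ascii omits.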

If these are used in the .rodata section, the values can only be used as constants in the program.

The assembly language instruction used to call a function is

		call	functionName

where functionName is the name of the function being called. The call instruction does two things:

The address in the rip register is pushed onto the call stack. (The call stack is described in Section 8.2.) Recall that the rip register is incremented immediately after the instruction is fetched. Thus, when the call instruction is executed, the value that gets pushed onto the stack is the address of the instruction immediately following the call instruction. That is, the return address gets pushed onto the stack in this ﬁrst step.
The address that functionName resolves to is placed in the rip register. This is the address of the function that is being called, so the next instruction to be fetched is the ﬁrst instruction in the called function.
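The two steps performed by call can be modeled in a few lines of C. This is a toy model rather than a description of the hardware: the ToyCpu structure, its small array standing in for the call stack, and the function names are all assumptions made for illustration. The ret instruction is included only to show where the saved address goes.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

/* A toy model of the state involved in a function call. */
typedef struct {
    uint64_t rip;                 /* address of the instruction being fetched */
    uint64_t stack[STACK_WORDS];  /* the call stack, eight bytes per slot */
    int top;                      /* index of the top slot; STACK_WORDS when empty */
} ToyCpu;

void toyInit(ToyCpu *cpu, uint64_t startAddress)
{
    cpu->rip = startAddress;
    cpu->top = STACK_WORDS;
}

/* Model "call target". On entry rip holds the address of the call
 * instruction itself; callLength is the size of that instruction. */
void toyCall(ToyCpu *cpu, uint64_t target, uint64_t callLength)
{
    assert(cpu->top > 0);              /* room for the return address */
    cpu->rip += callLength;            /* fetch: rip now points past the call */
    cpu->top--;                        /* step 1: push the return address */
    cpu->stack[cpu->top] = cpu->rip;
    cpu->rip = target;                 /* step 2: continue in the called function */
}

/* Model "ret": pop the return address back into rip. */
void toyRet(ToyCpu *cpu)
{
    assert(cpu->top < STACK_WORDS);    /* something must have been pushed */
    cpu->rip = cpu->stack[cpu->top];
    cpu->top++;
}
```

With a call instruction at address 0x1000 that is five bytes long, toyCall leaves 0x1005 on the stack, and a later toyRet resumes execution there.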

The call of the write function is made on line 14.

14 call write

Before the call is made, any arguments to a function must be stored in their proper locations, as speciﬁed in the ABI [25]. Up to six arguments are passed in the general purpose registers. Reading the argument list from left to right in the C code, the order of using the registers is given in Table 8.2.


Argument	Register
ﬁrst	rdi
second	rsi
third	rdx
fourth	rcx
ﬁfth	r8
sixth	r9

Table 8.2: Order of passing arguments in general purpose registers.

If there are more than six arguments, the additional ones are pushed onto the call stack, but in right-to-left order. This will be described in Section 11.2.
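The register assignment in Table 8.2 amounts to a simple lookup. The sketch below assumes a hypothetical helper named argumentRegister and covers only integer and pointer arguments, which is all this chapter uses.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns the name of the register that carries argument number
 * position (1 = leftmost in the C argument list), or NULL when the
 * argument is passed on the call stack instead.
 */
const char *argumentRegister(int position)
{
    static const char *const registers[] = {
        "rdi", "rsi", "rdx", "rcx", "r8", "r9"
    };

    assert(position >= 1);
    if (position > 6) {
        return NULL;                /* seventh and later: on the stack */
    }
    return registers[position - 1];
}
```

For the call to write, positions 1, 2, and 3 give rdi, rsi, and rdx, which matches the edi, esi, and edx destinations used in Listing 8.2 (the 32-bit names refer to the low-order halves of those registers).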

Each of the three arguments to write in this program — the ﬁle descriptor, the address of the text string, and the number of bytes in the text string — is also a constant whose value is known when the program is ﬁrst loaded into memory and is not changed by the program. The locations of these constants on lines 11 – 13,

11        movl  $13, %edx     # third argument
12        movl  $.LC0, %esi   # second argument
13        movl  $1, %edi      # first argument

are not as obvious. The location of the data that an instruction operates on must be speciﬁed in the instruction and its operands. The manner in which the instruction uses an operand to locate the data is called the addressing mode. Assembly language includes a syntax that the programmer uses to specify the addressing mode for each operand. When the assembler translates the assembly language into machine code it sets the bit pattern in the instruction to the corresponding addressing mode for each operand. Then when the CPU decodes the instruction during program execution it knows where to locate the data represented by that operand.

The simplest addressing mode is register direct. The syntax is to simply use the name of a register, and the data is located in the register itself.

Register direct: The data value is located in a CPU register.
syntax: the name of the register with a “%” prefix
example: movl %eax, %ebx

The instructions on lines 9 – 10,

9 pushq %rbp
10 movq %rsp, %rbp

use the register direct addressing mode for their operands. The pushq instruction has only one operand, and the movq has two.

Each of the instructions on lines 11 – 13 uses the register direct addressing mode for the destination, but the source operand is the data itself. So all three instructions employ the immediate data addressing mode for the source.

Immediate data: The data value is located in memory immediately after the instruction. This addressing mode can only be used for a source operand.
syntax: the data value with a “$” prefix
example: movq $0x123456789abcd, %rbx

Although the register direct addressing mode can be used to specify either a source or destination operand, or both, the immediate data addressing mode is valid only for a source operand.

Let us consider the mechanism by which the control unit accesses the data in the immediate data addressing mode. First, we should say a few words about how a control unit executes an instruction. Although a programmer thinks of each instruction as being executed atomically, it is actually done in discrete steps by the control unit. In addition to the registers used by a programmer, the CPU contains many registers that cannot be used directly. The control unit uses these registers as “scratch paper” for temporary storage of intermediate values as it progresses through the steps of executing an instruction.

Now, recall that when the control unit fetches an instruction from memory, it automatically increments the instruction pointer (rip) to the next memory location immediately following the instruction it just fetched. Usually, the instruction pointer would now be pointing to the next instruction in the program. But in the case of the immediate data addressing mode, the “$” symbol tells the assembler to store the operand at this location.

As the control unit decodes the just fetched instruction, it detects that the immediate data addressing mode has been used for the source operand. Since the instruction pointer is currently pointing to the data, it is a simple matter for the control unit to fetch it. Of course, when it does this fetch, the control unit increments the instruction pointer by the size of the data it just fetched.

Now the control unit has the source data, so it can continue executing the instruction. And when it has completed the current instruction, the instruction pointer is already pointing to the next instruction in the program.
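The fetch sequence just described can be imitated in C. The sketch below is a toy model: the ToyCpu structure and the function name are invented, and it handles only one instruction form, movl with a 32-bit immediate source and a register destination. It assumes the usual encoding for that form, a one-byte opcode equal to 0xB8 plus the register number, followed by the four bytes of immediate data with the low-order byte first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t rip;       /* offset of the next byte to fetch */
    uint32_t reg[8];    /* eax, ecx, edx, ebx, esp, ebp, esi, edi */
} ToyCpu;

/*
 * Execute one "movl $immediate, %register" instruction taken from
 * the byte array code. Returns 0 on success, or -1 if the bytes at
 * rip are not a complete instruction of that form.
 */
int stepMovImmediate(ToyCpu *cpu, const uint8_t *code, size_t codeSize)
{
    assert(cpu != NULL && code != NULL);
    if (cpu->rip + 5 > codeSize) {
        return -1;                          /* not enough bytes left */
    }

    uint8_t opcode = code[cpu->rip];        /* fetch the instruction */
    if (opcode < 0xB8 || opcode > 0xBF) {
        return -1;                          /* some other instruction */
    }
    cpu->rip += 1;                          /* rip now points at the data */

    uint32_t immediate = 0;
    for (int i = 0; i < 4; i++) {           /* fetch the immediate data */
        immediate |= (uint32_t)code[cpu->rip + i] << (8 * i);
    }
    cpu->rip += 4;                          /* rip now points at the next instruction */

    cpu->reg[opcode - 0xB8] = immediate;    /* copy into the destination */
    return 0;
}
```

Under that encoding, movl $13, %edx is the five bytes BA 0D 00 00 00. After one step the model has 13 in its edx slot and rip has advanced by five, so it is already pointing to the next instruction.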

The constants in the instructions on lines 11 and 13 are obvious. (The symbolic name “STDOUT_FILENO” is defined in unistd.h as 1.) The constant on line 12 is the label .LC0, which resolves to the address of this memory location. As explained above, this address will be in the .rodata section when the program is loaded into memory. The address is not known when the file is first compiled and assembled. The assembler leaves space for it immediately after the instruction (immediate addressing mode). Then when the address is determined during the linking phase, it is plugged into the space left for it. The net result is that the address becomes immediate data when the program is executed.

So the following code sequence:

11        movl  $13, %edx     # third argument
12        movl  $.LC0, %esi   # second argument
13        movl  $1, %edi      # first argument
14        call    write

implements the C statement

12 write(STDOUT_FILENO, "Hello world.\n", 13);

in the original C program (Listing 8.1, page 542).

Some notes about the write function call:

The characters written to the screen must be stored in memory.
The number of bytes actually written to the screen is returned in the rax register (the return type, ssize_t, is 64 bits wide, and eax is the low-order half of rax). So if the current function is using rax or eax, the value will be changed by the call to write.
The write function is a C wrapper that sets up the registers for the syscall instruction. Unfortunately, there is no guarantee that it restores the values that were in the registers when it was called.
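These notes suggest two habits when calling write from C: let the program count the bytes rather than counting by hand, and look at the value that comes back. A minimal sketch follows, assuming a helper named write_text; the name is hypothetical and is not part of any library.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/*
 * write_text is a hypothetical helper, not a library function.
 * It passes the address of the text and its length to write().
 * The length comes from strlen, so the terminating NUL is not
 * written. Returns the number of bytes written, or -1 on error.
 */
ssize_t write_text(int fileDescriptor, const char *text)
{
    assert(text != NULL);             /* the text must be stored in memory */
    size_t length = strlen(text);     /* let the program do the counting */
    return write(fileDescriptor, text, length);
}
```

Calling write_text(STDOUT_FILENO, "Hello world.\n") should display the same text as Listing 8.1. A return value smaller than the length of the string would indicate that only part of the text was written.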

8.2 Introduction to the Call Stack

Most variables are stored on the call stack. Before describing how this is done, we need to understand what stacks are and how they are used.

A stack is an area of memory for storing data items together with a pointer to the “top” of the stack. Informally, you can think of a stack as being organized very much like a stack of dinner plates on a shelf. We can only access the one item at the top of the stack. There are only two fundamental operations on a stack:

push data-item causes the data-item to be placed on the top of the stack and moves the stack pointer to point to this latest item.
pop location causes the data item on the top of the stack to be removed and placed at location and moves the stack pointer to point to the next item left on the stack.

Notice that a stack is a “last in, ﬁrst out” (LIFO) data structure. That is, the last thing to be pushed onto the stack is the ﬁrst thing to be popped oﬀ.

To illustrate the stack concept let us use our dinner plate example. Say we have three diﬀerently colored dinner plates, a red one on the dining table, a green one on the kitchen counter, and a blue one on the bedside table. Now we will stack them on the shelf in the following way:

push dining-table-plate
push kitchen-counter-plate
push bedside-table-plate

At this point, our stack looks like this, listed from top to bottom:

blue plate (top of stack)
green plate
red plate

Now if we perform the operation:

pop kitchen-counter

we will have a blue dinner plate on our kitchen counter, and our stack will look like this, listed from top to bottom:

green plate (top of stack)
red plate

A stack must be used according to a very strict discipline:

Always push an item onto the stack before popping anything oﬀ.
Never pop more things oﬀ than you have pushed on.
Always pop everything oﬀ the stack.
If you have no use for the item(s) to be popped oﬀ, you may simply adjust the stack pointer. This is equivalent to discarding the items that are popped oﬀ. (Our dinner plate analogy breaks down here.)

A good way to maintain this discipline is to think of the use of parentheses in an algebraic expression. A push is analogous to a left parenthesis, and a pop is analogous to a right parenthesis. An attempt to push too many items onto a stack causes stack overﬂow. And an attempt to pop items oﬀ the stack beyond the “bottom” causes stack underﬂow.

Next we will explore how we might implement a stack in C. Our program will allocate space in memory for storing data elements and provide both a push operation and a pop operation. A simple program is shown in Listing 8.3.

/*
 * stack.c
 * implementation of push and pop stack operations in C
 * Bob Plantz - 7 June 2012
 *
 */

#include <stdio.h>

int theStack[500];
int *stackPointer = &theStack[500];

/*
 * precondition:
 *     stackPointer points to data element at top of stack
 * postcondition:
 *     address in stackPointer is decremented by four
 *     dataValue is stored at top of stack
 */
void push(int dataValue)
{
    stackPointer--;
    *stackPointer = dataValue;
}

/*
 * precondition:
 *     stackPointer points to data element at top of stack
 * postcondition:
 *     data element at top of stack is copied to *dataLocation
 *     address in stackPointer is incremented by four
 */
void pop(int *dataLocation)
{
    *dataLocation = *stackPointer;
    stackPointer++;
}

int main(void)
{
    int x = 12;
    int y = 34;
    int z = 56;
    printf("Start with the stack pointer at %p\n",
            (void *)stackPointer);
    printf("x = %i, y = %i, and z = %i\n", x, y, z);

    push(x);
    push(y);
    push(z);
    x = 100;
    y = 200;
    z = 300;
    printf("push x\npush y\npush z\n");
    printf("Now the stack pointer is at %p\n",
            (void *)stackPointer);
    printf("Change x, y, and z:\n");
    printf("x = %i, y = %i, and z = %i\n", x, y, z);
    pop(&z);
    pop(&y);
    pop(&x);

    printf("pop z\npop y\npop x\n");
    printf("And we end with the stack pointer at %p\n",
            (void *)stackPointer);
    printf("x = %i, y = %i, and z = %i\n", x, y, z);

    return 0;
}

Listing 8.3: A C implementation of a stack.

Read the code in Listing 8.3 and note the following:

The program uses a pointer, stackPointer, to keep track of the data value that is currently at the top of the stack.
The stack pointer is initialized to point to one beyond the highest array element in the array that is allocated for the stack. Thus the stack must “grow” from high-numbered elements to low-numbered elements as items are pushed onto the stack.
A push operation pre-decrements the stack pointer before storing an item on the stack.
A pop operation post-increments the stack pointer after retrieving an item from the stack.

The states of the variables from the program in Listing 8.3 are shown just after the stack is initialized in Figure 8.1. Notice that the stack pointer is pointing beyond the end of the array as a result of the C statement,

int *stackPointer = &theStack[500];

The stack is “empty” at this point.

Figure 8.1: The stack in Listing 8.3 when it is ﬁrst initialized. “????” means that the value in the array element is undeﬁned.

After pushing one value onto the stack

push(x);

the stack appears as shown in Figure 8.2. Here you can see that since the push operation pre-decrements the stack pointer, the ﬁrst data item to be placed on the stack is stored in a valid portion of the array.

Figure 8.2: The stack with one data item on it.

After all three data items — x, y, and z — are pushed onto the stack, it appears as shown in Figure 8.3. The stack pointer always points to the data item that is at the top of the stack. Notice that this stack is “growing” toward lower numbered elements in the array.

Figure 8.3: The stack with three data items on it.

After changing the values in the variables, the program in Listing 8.3 restores the original values by popping from the stack in reverse order. The state of the stack after all three pops is shown in Figure 8.4. Even though we know that the values are still stored in the array, the permissible stack operations — push and pop — will not allow us to access these values. Thus, from a programming point of view, the values are gone.

Figure 8.4: The stack after all three data items have been popped oﬀ. Even though the values are still stored in the array, it is considered a programming error to access them. The stack must be considered as “empty” when it is in this state.

Our very simple stack in this program does not protect against stack overﬂow or stack underﬂow. Most software stack implementations also include operations to check for an empty stack and for a full stack. And many implementations include an operation for looking at, but not removing, the top element. But these are not the main features of a stack data structure, so we will not be concerned with them here.
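For comparison, the checks mentioned above might look like the following. This is one possible sketch, not the way any particular library does it: the isEmpty, isFull, and peek names and the choice to report failure through a bool return value are assumptions made for illustration. The array and pointer mirror those in Listing 8.3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define STACK_SIZE 500

static int theStack[STACK_SIZE];
static int *stackPointer = &theStack[STACK_SIZE];   /* empty stack */

bool isEmpty(void) { return stackPointer == &theStack[STACK_SIZE]; }
bool isFull(void)  { return stackPointer == &theStack[0]; }

/* Returns false, and stores nothing, on stack overflow. */
bool push(int dataValue)
{
    if (isFull()) {
        return false;
    }
    stackPointer--;                 /* pre-decrement, then store */
    *stackPointer = dataValue;
    return true;
}

/* Returns false, and leaves *dataLocation alone, on stack underflow. */
bool pop(int *dataLocation)
{
    assert(dataLocation != NULL);
    if (isEmpty()) {
        return false;
    }
    *dataLocation = *stackPointer;  /* retrieve, then post-increment */
    stackPointer++;
    return true;
}

/* Copies the top element without removing it. */
bool peek(int *dataLocation)
{
    assert(dataLocation != NULL);
    if (isEmpty()) {
        return false;
    }
    *dataLocation = *stackPointer;
    return true;
}
```

A caller would test each return value, for example if (!pop(&x)) to detect an attempt to pop from an empty stack.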

In GNU/Linux, as with most operating systems, the call stack has already been set up for us. We do not need to worry about allocating the memory or initializing a stack pointer. When the operating system transfers control to our program, the stack is ready for us to use.

The x86-64 architecture uses the rsp register for the call stack pointer. Although you could create your own stack and stack pointer, several instructions use the rsp register implicitly. And all these instructions cause the stack to grow from high memory addresses to low (see Exercise 8-2). Although this may seem a bit odd at ﬁrst, there are some good reasons for doing it this way.

In particular, think about how you might organize things in memory. Recall that the instruction pointer (the rip register) is automatically incremented by the control unit as your program is executed. Programs come in vastly diﬀerent sizes, so it makes sense to store the program instructions at low memory addresses. This allows maximum ﬂexibility with respect to program size.

The stack is a dynamic structure. You do not know ahead of time how much stack space will be required by any given program as it executes. It is impossible to know how much space to allocate for the stack. So you would like to allocate as much space as possible, and to keep it as far away from the programs as possible. The solution is to start the stack at the highest address and have it grow toward lower addresses.

This is a highly simpliﬁed rationalization for implementing stacks such that they grow “downward” in memory. The organization of various program elements in memory is much more complex than the simple description given here. But this may help you to understand that there are some good reasons for what may seem to be a rather odd implementation.

The assembly language push instruction is:

		pushq	source

The pushq instruction causes two actions:

The value in the rsp register is decremented by eight. That is, eight is subtracted from the stack pointer.
The eight bytes of the source operand are copied into memory at the new location pointed to by the (now decremented) stack pointer. The state of the operand is not changed.

The assembly language pop instruction is:

		popq	destination

The popq instruction causes two actions:

The eight bytes in the memory location pointed to by the stack pointer are copied to the destination operand. The previous state of the operand is replaced by the value from memory.
The value in the rsp register is incremented by eight. That is, eight is added to the stack pointer.
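The actions of pushq and popq can be traced with a toy model in C. In this sketch the ToyStack structure, its 64-byte memory, and the function names are assumptions made for illustration; rsp is simply an index into the array rather than a real address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEMORY_BYTES 64

typedef struct {
    uint8_t  memory[MEMORY_BYTES];  /* toy memory, addresses 0 through 63 */
    uint64_t rsp;                   /* stack pointer: an index into memory */
} ToyStack;

void toyStackInit(ToyStack *s)
{
    memset(s->memory, 0, sizeof s->memory);
    s->rsp = MEMORY_BYTES;          /* empty: one past the highest address */
}

/* Model "pushq source". */
void toyPushq(ToyStack *s, uint64_t source)
{
    assert(s->rsp >= 8);                        /* room for eight more bytes */
    s->rsp -= 8;                                /* subtract eight from rsp */
    memcpy(&s->memory[s->rsp], &source, 8);     /* copy source to the new top */
}

/* Model "popq destination". */
void toyPopq(ToyStack *s, uint64_t *destination)
{
    assert(s->rsp + 8 <= MEMORY_BYTES);         /* something is on the stack */
    memcpy(destination, &s->memory[s->rsp], 8); /* copy the top to destination */
    s->rsp += 8;                                /* add eight to rsp */
}
```

Two pushes move rsp from 64 to 48 in this model, and the two matching pops return the values in the opposite order and bring rsp back to 64.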

In the Intel syntax the “q” is not appended to the instruction.

Intel® Syntax:
		push	source
		pop	destination

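The size of the operand, eight bytes, is determined by the execution mode of the CPU, which the operating system establishes before it runs the program. When executing in 64-bit mode, pushes and pops normally operate on 64-bit values. Unlike the mov instruction, you cannot push or pop 8- or 32-bit values, and the 16-bit forms are best avoided because they would leave the stack pointer misaligned. As long as every push and pop moves eight bytes, the address in the stack pointer (rsp register) remains an integral multiple of eight.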

A good example of using a stack is saving registers within a function. Recall that there is only one set of registers in the CPU. When one function calls another, the called function has no way of knowing which registers are being used by the calling function. The ABI [25] speciﬁes that the values in registers rbx, rbp, rsp, and r12 – r15 be preserved by the called function (see Table 6.4 on page 469).

The program in Listing 8.4 shows how to save and restore the values in these registers. Notice that since a stack is a LIFO structure, it is necessary to pop the values oﬀ the top of the stack in the reverse order from how they were pushed on.

 1 # saveRegisters.s
 2 # The rbx and r12 - r15 registers must be preserved by called function.
 3 # Sets a bit pattern in these registers, but restores original values
 4 # in the registers before returning to the OS.
 5 # Bob Plantz - 8 June 2009
 6
 7         .text
 8         .globl  main
 9         .type   main, @function
10 main:
11         pushq   %rbp        # save caller's frame pointer
12         movq    %rsp, %rbp  # establish our frame pointer
13
14         pushq   %rbx        # "must-save" registers
15         pushq   %r12
16         pushq   %r13
17         pushq   %r14
18         pushq   %r15
19
20         movb    $0x12, %bl  # "use" the registers
21         movw    $0xabcd, %r12w
22         movl    $0x1234abcd, %r13d
23         movq    $0xdcba, %r14
24         movq    $0x9876, %r15
25
26         popq    %r15        # restore registers
27         popq    %r14
28         popq    %r13
29         popq    %r12
30         popq    %rbx
31
32         movl    $0, %eax    # return 0
33         popq    %rbp        # restore caller's frame pointer
34         ret                 # back to caller

Listing 8.4: Save and restore the contents of the rbx and r12 – r15 registers. See Table 6.4, page 469, for the registers that should be saved/restored in a function if they are used in the function.

The problem with this technique is maintaining the address in the stack pointer at a 16-byte boundary. Another way to save/restore the registers will be given in Section 11.2.
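The alignment concern comes down to simple arithmetic: each pushq subtracts eight, so the stack pointer alternates between addresses that are and are not multiples of 16. The sketch below uses invented function names, and the starting address in the usage note is just a sample value for rsp on entry to main.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stack pointer value after pushCount eight-byte pushes. */
uint64_t rspAfterPushes(uint64_t rsp, unsigned pushCount)
{
    assert((uint64_t)pushCount * 8 <= rsp);   /* do not wrap below zero */
    return rsp - 8 * (uint64_t)pushCount;
}

/* True when the address is on a 16-byte boundary. */
bool isAligned16(uint64_t address)
{
    return (address % 16) == 0;
}
```

Starting from 0x7fffffffe068, one push (rbp) gives 0x7fffffffe060, which is on a 16-byte boundary. The five further pushes in Listing 8.4 give 0x7fffffffe038, which is not, so this function would have to adjust rsp before it could safely call another function.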

8.3 Viewing the Call Stack

This seems like a good place to use gdb to see how a stack frame is created and used. We will use the program in Listing 8.4. I used an editor to enter the code then assembled and linked it.

When using gdb to examine programs written in assembly language, another variant of the break command may be helpful. The version of gdb I used for this book skips over the function prologue. To cause gdb to break at the ﬁrst instruction of a function, the following form should be used:

break *label — sets a breakpoint at the speciﬁed label in the current source ﬁle. Notice that you cannot specify a label in another source ﬁle. Control will return to gdb when the label is encountered.

My typing follows the $ and (gdb) prompts.
$ gdb ./saveRegisters

GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu2) 7.4-2012.04
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.launchpad.net/gdb-linaro/>...
Reading symbols from /home/bob/my_book_working/progs/chap08/saveRegisters...done.

(gdb) li

(gdb)

11         pushq   %rbp        # save caller’s frame pointer
12         movq    %rsp, %rbp  # establish our frame pointer
13
14         pushq   %rbx        # "must-save" registers
15         pushq   %r12
16         pushq   %r13
17         pushq   %r14
18         pushq   %r15
19
20         movb    $0x12, %bl  # "use" the registers

I use the li command to list enough of the program to see where I should set the ﬁrst breakpoints.

(gdb) br *main

Breakpoint 1 at 0x4004cc: file saveRegisters.s, line 11.

(gdb) br 20

Breakpoint 2 at 0x4004d9: file saveRegisters.s, line 20.

I set the ﬁrst breakpoint on the ﬁrst instruction in the function. Notice that the label is on line 10, but it applies to the instruction on line 11. The second breakpoint is after all the registers have been saved on the call stack.

(gdb) run

Starting program: /home/bob/my_book_working/progs/chap08/saveRegisters

Breakpoint 1, main () at saveRegisters.s:11
11 pushq %rbp # save caller’s frame pointer

I run the program, it breaks at the ﬁrst breakpoint, and I can display the registers.

(gdb) i r rsp rbp rbx r12 r13 r14 r15 rip

rsp            0x7fffffffe068 0x7fffffffe068
rbp            0x0 0x0
rbx            0x0 0
r12            0x4003c0 4195264
r13            0x7fffffffe140 140737488347456
r14            0x0 0
r15            0x0 0
rip            0x4004cc 0x4004cc <main>

I use the i r (info registers) command to display the contents of the registers that are used in this program. The numbers in the right-hand column show the decimal equivalent of the bit patterns for some of the registers. If you replicate this example (a good thing to do) you will probably get diﬀerent values in your registers.

Next I want to follow how the stack changes as the program executes. This is a little tricky. The stack grows toward lower addresses, but gdb displays memory from low to high addresses. So I need to display the area of memory that the stack will grow into in order to see how this program changes it.

(gdb) x/7xg 0x7fffffffe068-6*8

0x7fffffffe038: 0x0000000000400510 0x0000000000000000
0x7fffffffe048: 0x00000000004003c0 0x00007fffffffe140
0x7fffffffe058: 0x0000000000000000 0x0000000000000000
0x7fffffffe068: 0x00007ffff7a3e76d

Six 64-bit registers will be pushed onto the stack. Since there are eight bytes in each register, I start the display of the stack memory at the current stack pointer (in the rsp register) minus 6*8 bytes. (Use the help x command if you forget the syntax for the examine memory command.) By displaying seven 64-bit values, I can see the value that was pushed onto the stack just before this function was called.

(gdb) cont

Continuing.

Breakpoint 2, main () at saveRegisters.s:20
20 movb $0x12, %bl # "use" the registers

(gdb) x/7xg 0x7fffffffe068-6*8

0x7fffffffe038: 0x0000000000000000 0x0000000000000000
0x7fffffffe048: 0x00007fffffffe140 0x00000000004003c0
0x7fffffffe058: 0x0000000000000000 0x0000000000000000
0x7fffffffe068: 0x00007ffff7a3e76d

When I continue, the program stops at the next breakpoint. I examine the memory that the stack is growing into and can see that the register contents were saved on the stack.

(gdb) i r rsp rbp rbx r12 r13 r14 r15 rip

rsp            0x7fffffffe038 0x7fffffffe038
rbp            0x7fffffffe060 0x7fffffffe060
rbx            0x0 0
r12            0x4003c0 4195264
r13            0x7fffffffe140 140737488347456
r14            0x0 0
r15            0x0 0
rip            0x4004d9 0x4004d9 <main+13>

A display of the registers shows that the contents of rbx and r12 – r15 have not changed. The stack pointer (rsp), the frame pointer (rbp), and the instruction pointer (rip) have. Note that the stack pointer is now pointing to the top of the stack area used by this function.

(gdb) li

15         pushq   %r12
16         pushq   %r13
17         pushq   %r14
18         pushq   %r15
19
20         movb    $0x12, %bl  # "use" the registers
21         movw    $0xabcd, %r12w
22         movl    $0x1234abcd, %r13d
23         movq    $0xdcba, %r14
24         movq    $0x9876, %r15

(gdb)

25
26         popq   %r15         # restore registers
27         popq   %r14
28         popq   %r13
29         popq   %r12
30         popq   %rbx
31
32         movl   $0, %eax     # return 0
33         popq   %rbp         # restore caller’s frame pointer
34         ret                 # back to caller

(gdb) br 26

Breakpoint 3 at 0x4004f4: file saveRegisters.s, line 26.

(gdb) br 32

Breakpoint 4 at 0x4004fd: file saveRegisters.s, line 32.

(gdb) cont

Continuing.

Breakpoint 3, main () at saveRegisters.s:26
26 popq %r15 # restore registers

The li command helps me decide where to set my next two breakpoints, and I continue to the next breakpoint.

(gdb) i r rsp rbp rbx r12 r13 r14 r15 rip

rsp            0x7fffffffe038 0x7fffffffe038
rbp            0x7fffffffe060 0x7fffffffe060
rbx            0x12 18
r12            0x40abcd 4238285
r13            0x1234abcd 305441741
r14            0xdcba 56506
r15            0x9876 39030
rip            0x4004f4 0x4004f4 <main+40>

Now we can see that the registers that were saved on the stack have been changed.
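The new values deserve a closer look, because the instructions on lines 20 – 22 write only part of each register. An 8-bit or 16-bit destination leaves the remaining high-order bits as they were, while a 32-bit destination clears bits 63 – 32. The C sketch below models that rule; the function name writeRegister is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns the 64-bit register contents after writing newBits to a
 * destination that is width bits wide (8, 16, 32, or 64).
 */
uint64_t writeRegister(uint64_t oldValue, uint64_t newBits, unsigned width)
{
    assert(width == 8 || width == 16 || width == 32 || width == 64);

    if (width == 64) {
        return newBits;                     /* whole register replaced */
    }
    uint64_t mask = (UINT64_C(1) << width) - 1;
    if (width == 32) {
        return newBits & mask;              /* high-order 32 bits cleared */
    }
    return (oldValue & ~mask) | (newBits & mask);   /* high-order bits kept */
}
```

Applied to the earlier display, writing 0xabcd to the 16-bit r12w turns 0x4003c0 into 0x40abcd, and writing 0x1234abcd to the 32-bit r13d turns 0x7fffffffe140 into 0x1234abcd, in agreement with the values gdb shows.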

(gdb) cont

Continuing.

Breakpoint 4, main () at saveRegisters.s:32
32 movl $0, %eax # return 0

(gdb) i r rsp rbp rbx r12 r13 r14 r15 rip

rsp            0x7fffffffe060 0x7fffffffe060
rbp            0x7fffffffe060 0x7fffffffe060
rbx            0x0 0
r12            0x4003c0 4195264
r13            0x7fffffffe140 140737488347456
r14            0x0 0
r15            0x0 0
rip            0x4004fd 0x4004fd <main+49>

Continuing on to the next breakpoint and displaying the registers shows that the general purpose registers that were used — rbx, r12, r13, r14, and r15 — have been restored.

(gdb) si

33 popq %rbp # restore caller’s frame pointer

(gdb)

main () at saveRegisters.s:34
34 ret # back to caller

(gdb) i r rsp rbp rbx r12 r13 r14 r15 rip

rsp            0x7fffffffe068 0x7fffffffe068
rbp            0x0 0x0
rbx            0x0 0
r12            0x4003c0 4195264
r13            0x7fffffffe140 140737488347456
r14            0x0 0
r15            0x0 0
rip            0x400503 0x400503 <main+55>

Two single steps bring us to the last instruction, ret, which will return to the operating system. A display of the registers shows that they have been restored to the values they had when this function first started. Of course, the instruction pointer (rip) has changed.

(gdb) cont

Continuing.
[Inferior 1 (process 3102) exited normally]

Finally, I use the continue command (cont) to run the program out to its end. Note: If you use the si command to single step beyond the ret instruction at the end of the main function, gdb will dutifully take you through the system libraries. At best, this is a waste of time.

(gdb) q
$

And, of course, I have to tell gdb to quit.

8.4 Local Variables on the Call Stack

Now we see that we can store values on the stack by pushing them, and that the push operation decreases the value in the stack pointer register, rsp. In other words, allocating variables on the call stack involves subtracting a value from the stack pointer. Similarly, deallocating variables from the call stack involves adding a value to the stack pointer.

From this it follows that we can create local variables on the call stack by simply subtracting the number of bytes required by each variable from the stack pointer. This does not store any data in the variables, it simply sets aside memory that we can use. (Perhaps you have experienced the error of forgetting to initialize a local variable in C!)

Next, we have to ﬁgure out a way to access this reserved data area on the call stack. Notice that there are no labels in this area of memory. So we cannot directly use a name like we did when accessing memory in the .data segment.

We could use the popq and pushq instructions to store data in this area. For example,

        popq   %rax
        movq   $0, %rax
        pushq  %rax

could be used to store zero in a variable. But this technique would obviously be very tedious, and any changes made to your code would almost certainly lead to a great deal of debugging. For example, can you figure out the reason I had to do a pop before pushing the value onto the stack? (Recall that the eight bytes have already been reserved on the stack.)

At ﬁrst, it may seem tempting to use the stack pointer, rsp, as the reference pointer. But this creates complications if we wish to use the stack within the function.

A better technique would be to maintain another pointer to the local variable area on the stack. If we do not change this pointer throughout the function, we can always use the base register plus oﬀset addressing mode to directly access any of the local variables. The syntax is:

		oﬀset(register_name)

Intel® Syntax		[register_name + oﬀset]

When it is zero, the oﬀset is not required.

base register plus oﬀset:

The data value is located in memory. The address of the memory location is the sum of a value in a register plus an oﬀset value, which can be an 8-, 16- or 32-bit signed integer.

syntax: place parentheses around the register name with the oﬀset value immediately before the left parenthesis. examples: -8(%rbp); (%rsi); 12(%rax)

Intel® Syntax	[rbp - 8]; [rsi]; [rax + 12]

The appropriate register for implementing this is the frame pointer, rbp.

When a function is called, the calling function begins the process of creating an area on the stack, called the stack frame. Any arguments that need to be passed on the call stack are ﬁrst pushed onto it, as described in Section 11.2. Then the call instruction pushes the return address onto the call stack (page 546).

The ﬁrst thing that the called function must do is to complete the creation of the stack frame. The function prologue, ﬁrst introduced in Section 7.2 (page 498), performs the following actions at the very beginning of each function:

Save the caller’s value in the frame pointer on the stack.
Copy the current value in the stack pointer to the frame pointer.
Subtract a value from the stack pointer to allow for the local variables.

Once the function prologue has completed the stack frame, we observe that:

The local variables are located in an area of the call stack – between the addresses in the rsp and rbp registers.
The rbp register is a pointer to the bottom (the numerically highest address) of the local variable area.
The remaining area of the stack can be accessed using the stack pointer (rsp) as always.

Notice that each local variable is located at some ﬁxed oﬀset from the base register, rbp. In fact, it’s a negative oﬀset.

Listing 8.5 is the compiler-generated assembly language for the program in Listing 2.4 (page 53). Comments have been added to explain the parts of the code being discussed here.

1        .file  "echoChar1.c"
2        .section      .rodata
3 .LC0:
4        .string "Enter one character: "
5 .LC1:
6        .string "You entered: "
7        .text
8        .globl main
9        .type  main, @function
10 main:
11        pushq  %rbp            # save caller’s frame pointer
12        movq  %rsp, %rbp      # establish our frame pointer
13        subq  $16, %rsp       # space for local variable
14        movl  $21, %edx       # 21 characters
15        movl  $.LC0, %esi     # address of "Enter ... "
16        movl  $1, %edi        # STDOUT_FILENO
17        call  write
18        leaq  -1(%rbp), %rax  # address of aLetter var.
19        movl  $1, %edx        # 1 character
20        movq  %rax, %rsi      # address in correct reg.
21        movl  $0, %edi        # STDIN_FILENO
22        call  read
23        movl  $13, %edx       # 13 characters
24        movl  $.LC1, %esi     # address of "You ... "
25        movl  $1, %edi        # STDOUT_FILENO
26        call  write
27        leaq  -1(%rbp), %rax  # address of aLetter var
28        movl  $1, %edx        # 1 character
29        movq  %rax, %rsi      # address in correct reg.
30        movl  $1, %edi        # STDOUT_FILENO
31        call  write
32        movl  $0, %eax        # return 0;
33        leave                   # undo stack frame
34        ret                     # back to caller
35        .size  main, .-main
36        .ident "GCC: (Ubuntu/Linaro 4.7.0-7ubuntu3) 4.7.0"
37        .section      .note.GNU-stack,"",@progbits

Listing 8.5: Echoing characters entered from the keyboard (gcc assembly language). Comments added. Refer to Listing 2.4 for the original C version.

The function begins by pushing a copy of the caller’s frame pointer (in the rbp register) onto the call stack, thus saving it. Next it sets the frame pointer for this function at the current top of the stack. These two actions establish a reference point to the stack frame for this function.

Next the program allocates sixteen bytes on the stack for the local variable, thus growing the stack frame by sixteen bytes. It may seem wasteful to set aside so much memory since the only variable in this program requires only one byte of memory, but the ABI [25] speciﬁes that the stack pointer (rsp) should be on a sixteen-byte address boundary before calling another function. The easiest way to comply with this speciﬁcation is to allocate memory for local variables in multiples of sixteen.

Figure 8.5 shows the state of the stack just after the prologue has been executed.

Figure 8.5: Local variables in the program from Listing 8.5 are allocated on the stack. Numbers on the left are oﬀsets from the address in the frame pointer (rbp register).

The return address to the calling function is safely stored on the stack, followed by the caller’s frame pointer value. The stack pointer (rsp) has been moved up the stack to allow memory for the local variable. If this function needs to push data onto the stack, such activity will not interfere with the local variable, the caller’s frame pointer value, nor the return address. The frame pointer (rbp) provides a reference point for accessing the local variable.

IMPORTANT: The space for the local variables must be allocated immediately after establishing the frame pointer. Any other use of the stack within the function, e.g., saving registers, must be done after allocating space for local variables.

Most of the code in the body of the function is already familiar to you, but the instruction that loads the address of the local variable, aLetter, into the rax register:

18 leaq -1(%rbp), %rax # address of aLetter var.

is new. It uses the base register plus offset addressing mode for the source. We can see from the instruction on line 18 that the aLetter variable is located one byte below the address in the rbp register. Since the call stack grows toward lower addresses, this is the next available byte in this function’s stack frame.

As with the write function, the second argument to the read function must be the address of a variable. However, the address of aLetter cannot be known when the program is compiled and linked because it is the address of a variable that exists in the stack frame. There is no way for the compiler or linker to know where this function’s stack frame will be in memory when it is called. The address of the variable must be computed at run time.

Each instruction that accesses a stack frame variable must compute the variable’s address, which is called the eﬀective address. The instruction for computing addresses is load eﬀective address — leal for 32-bit and leaq for 64-bit addresses. The syntax of the lea instruction is

		leaw	source, %register

where w = l for 32-bit, q for 64-bit.

Intel® Syntax		lea	register, source

The source operand must be a memory location. The lea instruction computes the eﬀective address of the source operand and stores that address in the destination register. So the instruction

leaq -1(%rbp), %rax

takes the value in rbp (the base address of this function’s stack frame), adds -1 to it, and stores this sum in rax. Now rax contains the address of the variable aLetter. (The address still needs to be moved to rsi for the call to the read function.)

So the following code sequence:

18        leaq  -1(%rbp), %rax  # address of aLetter var.
19        movl  $1, %edx        # 1 character
20        movq  %rax, %rsi      # address in correct reg.
21        movl  $0, %edi        # STDIN_FILENO
22        call  read

implements the C statement

14 read(STDIN_FILENO, &aLetter, 1); // one character

in the original C program (Listing 2.4, page 53). (Yes, it would have been more eﬃcient to use rsi as the destination for the leaq instruction. Recall that this program was compiled with the -O0 option, no optimization. You can also expect this to vary across diﬀerent versions of the compiler.)

Some notes about the read function call:

The characters read from the keyboard must be stored in memory. You cannot pass the name of a CPU register to the read function.
The number of bytes actually read from the keyboard is returned in the eax register. So if the current function is using eax, the value will be changed by the call to read.
The read function is a C wrapper that sets up the registers for the syscall instruction. Unfortunately, there is no guarantee that it restores the values that were in the registers when it was called.

IMPORTANT: Since neither the write nor the read system call functions are guaranteed to restore the values in the registers, your program must save any required register values before calling either of these functions.

There is also a new instruction on line 33:

33 leave # undo stack frame

Just before this function exits, the portion of the stack frame allocated by this function must be released and the value in the rbp register restored. The leave instruction performs the actions:

movq %rbp, %rsp
popq %rbp

which eﬀectively

deletes the local variables
restores the caller’s frame pointer value

After the epilogue has been executed, the stack is in the state shown in Figure 8.6.

Figure 8.6: Local variable stack area in the program from Listing 8.5. Although the values in the gray area may remain they are invalid; using them at this point is a programming error.

The stack pointer (rsp) points to the address that will return program ﬂow back to the instruction immediately after the call instruction that called this function. Although the data that was stored in the memory which is now above the stack pointer is still there, it is a violation of stack protocol to access it.

One more step remains in completing execution of this function — returning to the calling function. Since the return address is at the top of the call stack, this is a simple matter of popping the address from the top of the stack into the rip register. This requires a special instruction,

		ret

which does not require any arguments.

Recall that there are two classes of local variables in C:

Automatic: variables are created when the function is ﬁrst entered. They are deleted upon exit from the function, so any value stored in them during execution of the function is lost.
Static: variables are created when the program is ﬁrst started. Any values stored in them persist throughout the lifetime of the program.

Most local variables in a function are automatic variables. General purpose registers are used for local variables whenever possible. Since there is only one set of general purpose registers, a function that is using one for a variable must be careful to save the value in the register before calling another function. Register usage is speciﬁed by the ABI [25] as shown in Table 6.4 on page 469. But you should not write code that depends upon everyone else following these recommendations, and there are only a small number of registers available for use as variables. In C/C++, most of the automatic variables are typically allocated on the call stack. As you have seen in the discussion above, they are created (automatically) in the prologue when the function ﬁrst starts and are deleted in the epilogue just as it ends. Static variables must be stored in the data segment.

We are now in a position to write the echoChar program in assembly language. The program is shown in Listing 8.6.

1# echoChar2.s
2# Prompts user to enter a character, then echoes the response
3# Bob Plantz - 8 June 2009
4
5# Useful constants
6        .equ    STDIN,0
7        .equ    STDOUT,1
8# Stack frame
9        .equ    aLetter,-1
10        .equ    localSize,-16
11# Read only data
12        .section  .rodata
13prompt:
14        .string "Enter one character: "
15        .equ    promptSz,.-prompt-1
16msg:
17        .string "You entered: "
18        .equ    msgSz,.-msg-1
19# Code
20        .text                  # switch to text section
21        .globl  main
22        .type   main, @function
23main:
24        pushq   %rbp           # save caller’s frame pointer
25        movq    %rsp, %rbp     # establish our frame pointer
26        addq    $localSize, %rsp  # for local variable
27
28        movl    $promptSz, %edx # prompt size
29        movl    $prompt, %esi  # address of prompt text string
30        movl    $STDOUT, %edi  # standard out
31        call    write          # invoke write function
32
33        movl    $1, %edx       # 1 character
34        leaq    aLetter(%rbp), %rsi # place to store character
35        movl    $STDIN, %edi   # standard in
36        call    read           # invoke read function
37
38        movl    $msgSz, %edx   # message size
39        movl    $msg, %esi     # address of message text string
40        movl    $STDOUT, %edi  # standard out
41        call    write          # invoke write function
42
43        movl    $1, %edx       # 1 character
44        leaq    aLetter(%rbp), %rsi # place where character stored
45        movl    $STDOUT, %edi  # standard out
46        call    write          # invoke write function
47
48        movl    $0, %eax       # return 0
49        movq    %rbp, %rsp     # delete local variables
50        popq    %rbp           # restore caller’s frame pointer
51        ret                    # back to calling function

Listing 8.6: Echoing characters entered from the keyboard (programmer assembly language).

This program introduces another assembler directive (lines 6,7,9,10,15,18):

		.equ	name, expression

The .equ directive evaluates the expression and sets the name equivalent to it. Note that the expression is evaluated during assembly, not during program execution. In essence, the name and its value are placed on the symbol table during the ﬁrst pass of the assembler. During the second pass, wherever the programmer has used “name” the assembler substitutes the number that the expression evaluated to during the ﬁrst pass.

You see an example on line 9 of Listing 8.6:

9 .equ aLetter,-1

In this case the expression is simply -1. Then when the symbol is used on line 34:

34 leaq aLetter(%rbp), %rsi # place to store character

the assembler substitutes -1 during the second pass, and it is exactly the same as if the programmer had written:

leaq -1(%rbp), %rsi # place to store character

Of course, using .equ to provide a symbolic name makes the code much easier to read.

An example of a more complex expression is shown on lines 13 – 15:

13prompt:
14 .string "Enter one character: "
15 .equ promptSz,.-prompt-1

The “.” means “this address”. Recall that the .string directive allocates one byte for each character in the text string, plus one for the NUL character. So it has allocated 22 bytes here. The expression computes the diﬀerence between the beginning and the end of the memory allocated by .string, minus 1. Thus, promptSz is entered on the symbol table as being equivalent to 21. And on line 28 the programmer can use this symbolic name,

28 movl $promptSz, %edx # prompt size

which is much easier than counting each of the characters by hand and writing:

movl $21, %edx # prompt size

More importantly, the programmer can change the text string and the assembler will compute the new length and change the number in the instruction automatically. This is obviously much less prone to error.

Be careful not to mistake the .equ directive as creating a variable. It does not allocate any memory. It simply gives a symbolic name to a number you wish to use in your program, thus making your code easier to read.

Notice that the amount of memory allocated for local variables is a multiple of 16 in order to preserve optimal stack pointer alignment:

10 .equ localSize,-16

8.4.1 Calling printf and scanf in 64-bit Mode

The printf function can be used to format data and write it to the screen, and the scanf function can be used to read formatted input from the keyboard. In order to see how to call these two functions in assembly language we begin with the C program in Listing 8.7.

1/*
2 * echoInt1.c
3 * Reads an integer from the keyboard and echoes it.
4 * Bob Plantz - 11 June 2009
5 */
6
7#include <stdio.h>
8
9int main(void)
10{
11    int anInt;
12
13    printf("Enter an integer number: ");
14    scanf("%i", &anInt);
15    printf("You entered: %i\n", anInt);
16
17    return 0;
18}

Listing 8.7: Calling printf and scanf to write and read formatted I/O (C).

The assembly language generated by the gcc compiler is shown in Listing 8.8. Comments have been added to explain the printf and scanf calls.

1        .file  "echoInt1.c"
2        .section      .rodata
3 .LC0:
4        .string "Enter an integer number: "
5 .LC1:
6        .string "%i"
7 .LC2:
8        .string "You entered: %i\n"
9        .text
10        .globl main
11        .type  main, @function
12 main:
13        pushq  %rbp
14        movq  %rsp, %rbp
15        subq  $16, %rsp
16        movl  $.LC0, %edi     # address of message
17        movl  $0, %eax        # no floats
18        call  printf
19        leaq  -4(%rbp), %rax  # address of anInt
20        movq  %rax, %rsi      # move to correct reg.
21        movl  $.LC1, %edi     # address of format string
22        movl  $0, %eax        # no floats
23        call  __isoc99_scanf
24        movl  -4(%rbp), %eax  # copy of anInt value
25        movl  %eax, %esi      # move to correct register
26        movl  $.LC2, %edi     # address of format string
27        movl  $0, %eax        # no floats
28        call  printf
29        movl  $0, %eax
30        leave
31        ret
32        .size  main, .-main
33        .ident "GCC: (Ubuntu/Linaro 4.7.0-7ubuntu3) 4.7.0"
34        .section      .note.GNU-stack,"",@progbits

Listing 8.8: Calling printf and scanf to write and read formatted I/O (gcc assembly language).

The first call to printf passes only one argument. However, on line 17 in Listing 8.8 the value 0 is also passed in eax:

16        movl  $.LC0, %edi     # address of message
17        movl  $0, %eax        # no floats
18        call  printf

The eax register is not listed as being used for passing arguments (see Section 8.1).

Both printf and scanf can take a variable number of arguments. The ABI [25] speciﬁes that the total number of arguments passed in SSE registers must be passed in rax. As you will learn in Section 14.5, the SSE registers are used for passing ﬂoats in 64-bit mode. Since no ﬂoat arguments are being passed in this call, rax must be set to 0. Recall that setting eax to 0 also sets the high-order bits of rax to 0 (Table 7.1, page 508).

The call to scanf on line 14 in the C version passes two arguments:

scanf("%i", &anInt);

That call is implemented in assembly language on lines 19 – 23 in Listing 8.8:

19        leaq  -4(%rbp), %rax  # address of anInt
20        movq  %rax, %rsi      # move to correct reg.
21        movl  $.LC1, %edi     # address of format string
22        movl  $0, %eax        # no floats
23        call  __isoc99_scanf

Again, we see that the eax register must be set to 0 because there are no ﬂoat arguments.

The program written in assembly language (Listing 8.9) is easier to read because the programmer has used symbolic names for the constants and the stack variable.

1# echoInt2.s
2# Prompts user to enter an integer, then echoes the response
3# Bob Plantz -- 11 June 2009
4
5# Stack frame
6        .equ    anInt,-4
7        .equ    localSize,-16
8# Read only data
9        .section  .rodata
10prompt:
11        .string "Enter an integer number: "
12scanFormat:
13        .string "%i"
14printFormat:
15        .string "You entered: %i\n"
16# Code
17        .text                  # switch to text section
18        .globl  main
19        .type   main, @function
20main:
21        pushq   %rbp           # save caller’s frame pointer
22        movq    %rsp, %rbp     # establish our frame pointer
23        addq    $localSize, %rsp  # for local variable
24
25        movl    $prompt, %edi  # address of prompt text string
26        movq    $0, %rax       # no floating point args.
27        call    printf         # invoke printf function
28
29        leaq    anInt(%rbp), %rsi  # place to store integer
30        movl    $scanFormat, %edi  # address of scanf format string
31        movq    $0, %rax       # no floating point args.
32        call    scanf          # invoke scanf function
33
34        movl    anInt(%rbp), %esi   # the integer
35        movl    $printFormat, %edi  # address of printf text string
36        movq    $0, %rax       # no floating point args.
37        call    printf         # invoke printf function
38
39        movl    $0, %eax       # return 0
40        movq    %rbp, %rsp     # delete local variables
41        popq    %rbp           # restore caller’s frame pointer
42        ret                    # back to calling function

Listing 8.9: Calling printf and scanf to write and read formatted I/O (programmer assembly language).

8.5 Designing the Local Variable Portion of the Call Stack

When designing a function in assembly language, you need to determine where each local variable will be located in the memory that is allocated on the call stack. The ABI [25] speciﬁes that:

Each variable should be aligned on an address that is a multiple of its size.
The address in the stack pointer (rsp) should be a multiple of 16 immediately before another function is called.

These rules are best illustrated by considering the program in Listing 8.10.

1/*
2 * varAlign1.c
3 * Allocates some local variables to illustrate their
4 * alignment on the call stack.
5 * Bob Plantz - 11 June 2009
6 */
7
8#include <stdio.h>
9
10int main(void)
11{
12    char alpha, beta, gamma;
13    char *letterPtr;
14    int number;
15    int *numPtr;
16
17    alpha = 'A';
18    beta = 'B';
19    gamma = 'C';
20    number = 123;
21    letterPtr = &alpha;
22    numPtr = &number;
23
24    printf("%c %c %c %i\n", *letterPtr,
25           beta, gamma, *numPtr);
26
27    return 0;
28}

Listing 8.10: Some local variables (C).

The assembly language generated by the compiler is shown in Listing 8.11 with comments added for explanation.

1        .file  "varAlign1.c"
2        .section      .rodata
3 .LC0:
4        .string "%c %c %c %i\n"
5        .text
6        .globl main
7        .type  main, @function
8 main:
9        pushq  %rbp
10        movq  %rsp, %rbp
11        subq  $32, %rsp        # 2 * 16
12        movb  $65, -23(%rbp)   # alpha = 'A'
13        movb  $66, -22(%rbp)   # beta = 'B';
14        movb  $67, -21(%rbp)   # gamma = 'C';
15        movl  $123, -20(%rbp)  # number = 123;
16        leaq  -23(%rbp), %rax
17        movq  %rax, -16(%rbp)  # letterPtr = &alpha;
18        leaq  -20(%rbp), %rax
19        movq  %rax, -8(%rbp)   # numPtr = &number;
20        movq  -8(%rbp), %rax
21        movl  (%rax), %esi
22        movsbl -21(%rbp), %ecx
23        movsbl -22(%rbp), %edx
24        movq  -16(%rbp), %rax
25        movzbl (%rax), %eax
26        movsbl %al, %eax
27        movl  %esi, %r8d
28        movl  %eax, %esi
29        movl  $.LC0, %edi
30        movl  $0, %eax
31        call  printf
32        movl  $0, %eax
33        leave
34        ret
35        .size  main, .-main
36        .ident "GCC: (Ubuntu/Linaro 4.7.0-7ubuntu3) 4.7.0"
37        .section      .note.GNU-stack,"",@progbits

Listing 8.11: Some local variables (gcc assembly language).

Twenty-three bytes are required for storing these variables:

three bytes for the three char variables.
four bytes for the int variable.
sixteen bytes for the two pointer variables.

However, the ABI states that the stack pointer must be on a 16-byte address boundary, so we need to allocate 32 bytes for the local variables:

11 subq $32, %rsp # 2 * 16

Listing 8.12 shows how an assembly language programmer uses symbolic names to write code that is easier to read. In this program, each variable is at the same relative location in the stack frame as in the gcc-generated version.

1# varAlign2.s
2# Allocates some local variables to illustrate their
3# alignment on the call stack.
4# Bob Plantz - 11 June 2009
5# Stack frame
6        .equ    alpha,-23
7        .equ    beta,-22
8        .equ    gamma,-21
9        .equ    number,-20
10        .equ    letterPtr,-16
11        .equ    numPtr,-8
12        .equ    localSize,-32
13# Read only data
14        .section  .rodata
15format:
16        .string "%c %c %c %i\n"
17# Code
18        .text
19        .globl  main
20        .type   main, @function
21main:
22        pushq   %rbp           # save caller’s frame pointer
23        movq    %rsp, %rbp     # establish our frame pointer
24        addq    $localSize, %rsp    # for local vars
25
26        movb    $'A', alpha(%rbp)   # initialize variables
27        movb    $'B', beta(%rbp)
28        movb    $'C', gamma(%rbp)
29        movl    $123, number(%rbp)
30
31        leaq    alpha(%rbp), %rax   # initialize pointers
32        movq    %rax, letterPtr(%rbp)
33        leaq    number(%rbp), %rax
34        movq    %rax, numPtr(%rbp)
35
36        movq    numPtr(%rbp), %rax  # load pointer
37        movl    (%rax), %r8d        # for dereference
38        movb    gamma(%rbp), %cl
39        movb    beta(%rbp), %dl
40        movq    letterPtr(%rbp), %rax
41        movb    (%rax), %sil
42        movl    $format, %edi
43        movq    $0, %rax
44        call    printf
45
46        movl    $0, %eax       # return 0 to OS
47        movq    %rbp, %rsp     # restore stack pointer
48        popq    %rbp           # restore caller’s frame pointer
49        ret

Listing 8.12: Some local variables (programmer assembly language).

Notice the assembly language syntax for single character constants on lines 26 – 28:

26        movb    $'A', alpha(%rbp)  # initialize variables
27        movb    $'B', beta(%rbp)
28        movb    $'C', gamma(%rbp)

The GNU assembly language info documentation specifies that only the first single quote, 'A, is required. But the C syntax, 'A', also works, so we have used that because it is generally easier to read.¹

We can summarize the proper sequence of instructions for establishing a local variable environment in a function:

Push the calling function’s frame pointer onto the stack.
Copy the value in the stack pointer register (rsp) into the frame pointer register (rbp) to establish the frame pointer for the current function.
Allocate space for the local variables by moving the stack pointer to a lower address.

Just before ending this function, these three steps need to be undone. Since the frame pointer is pointing to where the top of the stack was before we allocated memory for local variables, the local variable memory can be deleted by simply copying the value in the frame pointer to the stack pointer. Now the calling function’s frame pointer value is at the top of the stack. The ending sequence is:

Copy the value in the frame pointer register (rbp) to the stack pointer register (rsp).
Pop the value at the top of the stack into the frame pointer register (rbp).

Listing 8.13 shows the general format that must be followed when writing a function. If you follow this format and do everything in the order that is given for all your functions, you will have many fewer problems getting them to work properly. If you do not, I guarantee that you will have many problems.

1# general.s
2        .text
3        .globl  general
4        .type   general, @function
5general:
6        pushq   %rbp       # save calling function’s frame pointer
7        movq    %rsp, %rbp # establish our frame pointer
8
9# 1. Allocate memory for local variables and saving registers,
10#    ensuring that the address in rsp is a multiple of 16.
11# 2. Save the contents of general purpose registers that must be
12#    preserved and are used in this function.
13
14# 3. The code that implements the function goes here.
15
16# 4. Restore the contents of the general purpose registers that
17#    were saved in step 2.
18# 5. Place the return value, if any, in the eax register.
19
20       movq    %rbp, %rsp  # delete local variables and reg. save area
21       popq    %rbp        # restore calling function’s frame
22                           #      pointer
23       ret

Listing 8.13: General format of a function written in assembly language.

8.6 Using syscall to Perform I/O

The printf and scanf functions discussed in Section 2.5 (page 37) are C library functions that convert program data to and from text formats for interacting with users via the screen and keyboard. The write and read functions discussed in Section 2.8 (page 53) are C wrapper functions that only pass bytes to output devices and from input devices, relying on the program to perform the conversions so that the bytes are meaningful to the I/O device. Ultimately, each of these functions calls upon the services of the operating system to perform the actual byte transfers to and from I/O devices.

In assembly language, you do not need to use the C environment. The convention is to begin program execution at the __start label. (Note that there are two underscore characters.) The assembler is used as before, but instead of using gcc to link in the C libraries, use ld directly. You need to specify the entry point of your program. For example, the command for the program in Listing 8.14 is:

bob$ ld -e __start -o echoChar3 echoChar3.o

When performing I/O you invoke the Linux operations yourself. The technique involves moving the arguments to speciﬁc registers, placing a special code in the eax register, and then using the syscall instruction to call a function in the operating system. (The way this works is described in Section 15.6 on page 875.) The operating system will perform the action speciﬁed by the code in the eax register, using the arguments passed in the other registers. The values required for reading from and writing to ﬁles are given in Table 8.3.

system call	eax	edi	rsi	edx
read	0	file descriptor	pointer to place to store bytes	number of bytes to read
write	1	file descriptor	pointer to first byte to write	number of bytes to write
exit	60

Table 8.3: Register set up for using syscall instruction to read, write, or exit.
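The register setup in Table 8.3 can also be tried from C before writing a complete assembly language program. The following is a minimal sketch, assuming GCC or Clang extended inline assembly on x86-64 Linux. The names raw_syscall3 and write_stdout are hypothetical helpers invented for this illustration, not library functions. The constraint letters a, D, S, and d ask the compiler to place the values in the rax, rdi, rsi, and rdx registers, whose low halves are the eax, edi, and edx registers named in the table.

```c
/* Sketch only: assumes x86-64 Linux and GCC/Clang inline assembly. */

/* raw_syscall3 is a hypothetical helper, not a library function.
 * It loads the operation code and three arguments into the
 * registers listed in Table 8.3, then executes syscall. */
long raw_syscall3(long code, long arg1, long arg2, long arg3)
{
    long result;

    __asm__ volatile ("syscall"
                      : "=a" (result)   /* kernel returns its result in rax */
                      : "a" (code),     /* operation code in rax (eax)      */
                        "D" (arg1),     /* first argument in rdi (edi)      */
                        "S" (arg2),     /* second argument in rsi           */
                        "d" (arg3)      /* third argument in rdx (edx)      */
                      : "rcx", "r11", "memory"); /* syscall overwrites rcx, r11 */
    return result;
}

/* Hypothetical helper: write n bytes starting at text to standard out,
 * using operation code 1 (write) and file descriptor 1. */
long write_stdout(const char *text, long n)
{
    return raw_syscall3(1, 1, (long) text, n);
}
```

If standard out is open, a call such as write_stdout("Hello\n", 6) should return the number of bytes written, just as the write wrapper function does.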

In Listing 8.14 we have rewritten the program of Listing 8.6 without using the C environment.

# echoChar3.s
# Prompts user to enter a character, then echoes the response
# Does not use C libraries
# Bob Plantz -- 11 June 2009

# Useful constants
        .equ    STDIN,0
        .equ    STDOUT,1
        .equ    READ,0
        .equ    WRITE,1
        .equ    EXIT,60
# Stack frame
        .equ    aLetter,-16
        .equ    localSize,-16
# Read only data
        .section .rodata       # the read-only data section
prompt:
        .string "Enter one character: "
        .equ    promptSz,.-prompt-1
msg:
        .string "You entered: "
        .equ    msgSz,.-msg-1
# Code
        .text                  # switch to text section
        .globl  __start

__start:
        pushq   %rbp           # save caller's frame pointer
        movq    %rsp, %rbp     # establish our frame pointer
        addq    $localSize, %rsp  # for local variable

        movl    $promptSz, %edx # prompt size
        movl    $prompt, %esi  # address of prompt text string
        movl    $STDOUT, %edi  # standard out
        movl    $WRITE, %eax
        syscall                # request kernel service

        movl    $2, %edx       # 1 character, plus newline
        leaq    aLetter(%rbp), %rsi # place to store character
        movl    $STDIN, %edi   # standard in
        movl    $READ, %eax
        syscall                # request kernel service

        movl    $msgSz, %edx   # message size
        movl    $msg, %esi     # address of message text string
        movl    $STDOUT, %edi  # standard out
        movl    $WRITE, %eax
        syscall                # request kernel service

        movl    $2, %edx       # 1 character, plus newline
        leaq    aLetter(%rbp), %rsi # place where character stored
        movl    $STDOUT, %edi  # standard out
        movl    $WRITE, %eax
        syscall                # request kernel service

        movq    %rbp, %rsp     # delete local variables
        popq    %rbp           # restore caller's frame pointer
        movl    $EXIT, %eax    # exit from this process
        syscall

Listing 8.14: Echo character program using the syscall instruction.

Comparing this program with the one in Listing 8.6, the arguments are the same and are passed in the same registers. The only difference when using the syscall instruction is that you have to provide a code for the operation to be performed in the eax register. The complete list of system operations that can be performed is in the system file /usr/include/asm-x86_64/unistd.h. (The path on your system may be different.)
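The operation codes can also be reached from C without opening the header file by hand. The sketch below assumes glibc on x86-64 Linux, where <sys/syscall.h> provides SYS_ names for the codes and <unistd.h> declares a generic syscall function that takes the code as its first argument. The name write_by_code is a hypothetical helper used only for illustration.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE      /* assumption: glibc; makes syscall() visible */
#endif
#include <unistd.h>          /* syscall(), STDOUT_FILENO */
#include <sys/syscall.h>     /* SYS_read, SYS_write, SYS_exit */

/* write_by_code is a hypothetical helper.  It requests the write
 * operation by its numeric code instead of calling write(). */
long write_by_code(int fd, const void *buf, unsigned long count)
{
    return syscall(SYS_write, fd, buf, count);
}
```

On x86-64 Linux the names SYS_read, SYS_write, and SYS_exit expand to 0, 1, and 60, the same codes that appear in Table 8.3. Other architectures use different numbers, so the symbolic names are the safer choice in C code.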

To determine the arguments that must be passed to each system operation read section 2 of the man page for that operation. For example, the arguments for the write system call can be seen by using

bob$ man 2 write

Then follow the rules in Section 8.1 for placing the arguments in the proper registers.

8.7 Calling Functions, 32-Bit Mode

In 32-bit mode all the arguments are pushed onto the call stack in right-to-left order. Listing 8.15 shows how to call the write() system call function.

# fourChars_32.s
# displays four characters on the screen using the write() system call.
# (32-bit version.)
# Bob Plantz - 19 March 2008

# Read only data
        .section  .rodata
Chars:
        .byte   'A'
        .byte   '-'
        .byte   'Z'
        .byte   '\n'
# Code
        .text
        .globl  main
        .type   main, @function
main:
        pushl   %ebp           # save frame pointer
        movl    %esp, %ebp     # set new frame pointer

        pushl   $4             # send four bytes
        pushl   $Chars         #    at this location
        pushl   $1             #        to screen.
        call    write
        addl    $12, %esp      # remove the three arguments

        movl    $0, %eax       # return 0;
        movl    %ebp, %esp     # restore stack pointer
        popl    %ebp           # restore frame pointer
        ret

Listing 8.15: Displaying four characters on the screen using the write system call function in assembly language.

After all three arguments have been pushed onto the call stack, it looks like:

address	contents
(esp):	1 (file descriptor)
(esp) + 4:	address of Chars
(esp) + 8:	4 (number of bytes)

where the notation (esp) + n means “the address in the esp register plus n.” The stack pointer, the esp register, points to the last item pushed onto the call stack. The other two arguments are stored on the stack below the top item. Don’t forget that “below” on the call stack is at numerically higher addresses because the stack grows toward lower addresses.

When the call instruction is executed, the return address is pushed onto the call stack as shown here:

address	contents
(esp):	return
(esp) + 4:	1 (file descriptor)
(esp) + 8:	address of Chars
(esp) + 12:	4 (number of bytes)

where “return” is the address where the called function is supposed to return to at the end of its execution. So the arguments are readily available inside the called function; you will learn how to access them in Chapter 8. And as long as the called function does not change the return address, and restores the stack pointer to the position it was in when the function was called, it can easily return to the calling function.
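The order of the pushes and the resulting offsets from the stack pointer can be checked with a small model written in C. This is a sketch under stated assumptions, not how the hardware is implemented: the stack is modeled as an array of 4-byte items, and every name here (toy_stack, toy_push, toy_peek, toy_call_write) is hypothetical. The address of Chars and the return address are made-up stand-in values supplied by the caller.

```c
#include <stdint.h>

#define STACK_WORDS 16

/* A toy model of the 32-bit call stack; all names are hypothetical. */
struct toy_stack {
    uint32_t mem[STACK_WORDS];  /* each element models 4 bytes of stack */
    uint32_t esp;               /* byte offset of the top item in mem   */
};

/* Start with an empty stack: esp just past the highest address. */
void toy_init(struct toy_stack *s)
{
    s->esp = STACK_WORDS * 4;
}

/* pushl: move esp down 4 bytes, then store the value there. */
void toy_push(struct toy_stack *s, uint32_t value)
{
    s->esp -= 4;
    s->mem[s->esp / 4] = value;
}

/* Read the 4-byte item at (esp) + n. */
uint32_t toy_peek(const struct toy_stack *s, uint32_t n)
{
    return s->mem[(s->esp + n) / 4];
}

/* Model "pushl $4; pushl $Chars; pushl $1; call write".
 * chars_addr and return_addr are stand-in values. */
void toy_call_write(struct toy_stack *s, uint32_t chars_addr,
                    uint32_t return_addr)
{
    toy_push(s, 4);            /* third argument: byte count       */
    toy_push(s, chars_addr);   /* second argument: address of text */
    toy_push(s, 1);            /* first argument: file descriptor  */
    toy_push(s, return_addr);  /* pushed by the call instruction   */
}
```

After toy_call_write runs, toy_peek(&s, 0) gives the return address and toy_peek(&s, 4), toy_peek(&s, 8), and toy_peek(&s, 12) give the three arguments in left-to-right order, which is why the arguments are pushed right to left.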

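Now, let’s look at what happens to the stack memory area when a function also allocates local variables. The main function in Listing 8.15 has none, so suppose that its prologue (pushl %ebp followed by movl %esp, %ebp) were followed by subl $8, %esp to reserve eight bytes. Assume that the value in the esp register when the main function is called is 0xbffffc5c and that the value in the ebp register is 0xbffffc6a. Immediately after the subl $8, %esp instruction is executed, the stack looks like: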

address	contents
bffffc50:	????????
bffffc54:	????????
bffffc58:	bffffc6a
bffffc5c:	important information

the value in the esp register is 0xbffffc50, and the value in the ebp register is 0xbffffc58. The “?” indicates that the states of the bits in the indicated memory locations are irrelevant to us. That is, the memory between locations 0xbffffc50 and 0xbffffc57 is “garbage.”

We have to assume that the values in bytes number 0xbffffc5c, 5d, 5e, and 5f were placed there by the function that called this function and have some meaning to that function. So we have to be careful to preserve the value there.

Since the esp register contains 0xbffffc50, we can continue using the stack — pushing and popping — without disturbing the eight bytes between locations 0xbffffc50 and 0xbffffc57. These eight bytes are the ones we will use for storing the local variables. And if we take care not to change the value in the ebp register throughout the function, we can easily access the local variables.
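The register values quoted above follow from simple address arithmetic, which can be checked with a short C sketch. It models only the two register values, not the memory contents, and assumes the three-instruction prologue pushl %ebp, movl %esp, %ebp, subl $8, %esp. The names regs32 and model_prologue are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of the two 32-bit registers involved. */
struct regs32 {
    uint32_t esp;
    uint32_t ebp;
};

/* Apply the address arithmetic of
 *     pushl %ebp
 *     movl  %esp, %ebp
 *     subl  $local_bytes, %esp
 * Memory contents are not modeled, only the register values. */
struct regs32 model_prologue(struct regs32 r, uint32_t local_bytes)
{
    r.esp -= 4;            /* pushl %ebp: old frame pointer goes at new esp */
    r.ebp = r.esp;         /* movl %esp, %ebp: establish our frame pointer  */
    r.esp -= local_bytes;  /* subl: reserve space for local variables       */
    return r;
}
```

Starting from esp = 0xbffffc5c and calling model_prologue with 8 bytes of local storage yields esp = 0xbffffc50 and ebp = 0xbffffc58, the values used in the discussion above.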

8.8 Instructions Introduced Thus Far

This summary shows the assembly language instructions introduced thus far in the book. The page number where the instruction is explained in more detail, which may be in a subsequent chapter, is also given. This book provides only an introduction to the usage of each instruction. You need to consult the manuals ([2] – [6], [14] – [18]) in order to learn all the possible uses of the instructions.

8.8.1 Instructions

data movement:
opcode	source	destination	action	page
movs	$imm/%reg	%reg/mem	move	506
movs	mem	%reg	move	506
movsss	%reg/mem	%reg	move, sign extend	693
movzss	%reg/mem	%reg	move, zero extend	693
popw		%reg/mem	pop from stack	566
pushw	$imm/%reg/mem		push onto stack	566

In the opcode column, each s appended to an opcode stands for one of the size suffixes b, w, l, or q, and the w appended to pop and push stands for l or q.
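The difference between the two extending moves in the table can be previewed in C. This is a minimal sketch, assuming a compiler such as GCC or Clang where converting an out-of-range value to int8_t wraps in two's complement. The helper names are hypothetical, and the byte-to-long case is only one of the size combinations the instructions allow.

```c
#include <stdint.h>

/* Hypothetical helpers showing what the two extending moves compute
 * when an 8-bit value is copied into a 32-bit destination. */

/* Like a sign-extending move: the sign bit of the byte is copied
 * into the upper 24 bits.  Assumes two's complement wrapping when
 * converting to int8_t, as GCC and Clang provide. */
int32_t sign_extend_byte(uint8_t b)
{
    return (int32_t)(int8_t) b;
}

/* Like a zero-extending move: the upper 24 bits are filled with zeros. */
uint32_t zero_extend_byte(uint8_t b)
{
    return (uint32_t) b;
}
```

For the byte 0x80, sign extension produces the bit pattern 0xffffff80 (the value -128), while zero extension produces 0x00000080 (the value 128). For bytes below 0x80 the two results are the same.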

arithmetic/logic:
opcode	source	destination	action	page
cmps	$imm/%reg	%reg/mem	compare	676
cmps	mem	%reg	compare	676
incs	%reg/mem		increment	698
leaw	mem	%reg	load effective address	579
subs	$imm/%reg	%reg/mem	subtract	612
subs	mem	%reg	subtract	612

In the opcode column, the s appended to an opcode stands for one of the size suffixes b, w, l, or q, and the w appended to lea stands for l or q.

program flow control:
opcode	location	action	page
call	label	call function	546
je	label	jump equal	679
jmp	label	jump	691
jne	label	jump not equal	679
leave		undo stack frame	580
ret		return from function	583
syscall		call kernel function	587

8.8.2 Addressing Modes

register direct:	The data value is located in a CPU register.
	syntax: name of the register with a “%” prefix.
	example: movl %eax, %ebx

immediate data:	The data value is located immediately after the instruction. Source operand only.
	syntax: data value with a “$” prefix.
	example: movl $0xabcd1234, %ebx

base register plus offset:	The data value is located in memory. The address of the memory location is the sum of a value in a base register plus an offset value.
	syntax: use the name of the register with parentheses around the name and the offset value immediately before the left parenthesis.
	example: movl $0xaabbccdd, 12(%eax)
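The base register plus offset mode corresponds closely to pointer arithmetic in C. The sketch below assumes a byte array standing in for memory and a pointer standing in for the base register. The helper names store_long_at and load_long_at are hypothetical. memcpy is used for the 4-byte transfers so that the code does not depend on alignment.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper modeling "movl $value, offset(%base)":
 * the address used is the base address plus the offset. */
void store_long_at(unsigned char *base, long offset, uint32_t value)
{
    memcpy(base + offset, &value, sizeof value);  /* 4-byte store */
}

/* Hypothetical helper modeling "movl offset(%base), %reg". */
uint32_t load_long_at(const unsigned char *base, long offset)
{
    uint32_t value;

    memcpy(&value, base + offset, sizeof value);  /* 4-byte load */
    return value;
}
```

With a 32-byte array named area, store_long_at(area, 12, 0xaabbccdd) plays the role of the example instruction above, and load_long_at(area, 12) reads the same four bytes back.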

8.9 Exercises

8-1

(§8.1) Enter the C program in Listing 8.1 and get it to work correctly. Run the program under gdb, setting a breakpoint at the call to write. When the program breaks, use the si (Step one instruction exactly) command to execute the instructions that load registers with the arguments. As you do this, keep track of the contents in the appropriate argument registers and the rip register. What is the address where the text string is stored? If you single step into the write function, use the cont command to continue through it.

8-2

(§8.2) Modify the program in Listing 8.3 so that the stack grows from lower numbered array elements to higher numbered ones.

8-3

(§8.2) Enter the assembly language program in Listing 8.4 and show that the rbp and rsp registers are also saved and restored by this function.

8-4

(§8.4) Enter the C program in Listing 2.4 (page 53) and compile it with the debugging option, -g. Run the program under gdb, setting a breakpoint at each of the calls to write and read. Each time the program breaks, use the si (Step one instruction exactly) command to execute the instructions that load registers with the arguments. As you do this, keep track of the contents in the appropriate argument registers and the rip register. What are the addresses where the text strings are stored? What is the address of the aLetter variable? If you single step into either the write or read function, use the cont command to continue through it.

8-5

(§8.4) Modify the assembly language program in Listing 8.6 such that it also reads the newline character when the user enters a single character. Run the program with gdb. Set a breakpoint at the first instruction, then run the program. When it breaks, write down the values in the rsp and rbp registers. Write down the changes in these two registers as you single step (si command) through the first three instructions.

Set breakpoints at the instruction that calls the read function and at the next instruction immediately after that one. Examine the values in the argument-passing registers.

From the addresses you wrote down above, determine where the two characters (user’s character plus newline) that are read from the keyboard will be stored, and examine that area of memory.

Use the cont command to continue execution through the read function. Enter a character. When the program breaks back into gdb, examine the area of memory again to make sure the two characters got stored there.

8-6

(§8.4) Write a program in assembly language that prompts the user to enter an integer, then displays its hexadecimal equivalent.

8-7

(§8.4) Write a program in assembly language that “declares” four char variables and four int variables, and initializes all eight variables with appropriate values. Then call printf to display the values of all eight variables with only one call.
