Section 10.1 Passing Arguments in Registers
If there were no other programs running on your Raspberry Pi, you could use most of the registers any way you like. But an operating system is a collection of programs that coordinate all the activities of the hardware, and they expect that register usage will follow specific rules. To make sure all the programs work together, ARM Inc. publishes manuals describing the rules. We will be following the rules described in Procedure Call Standard for the ARM Architecture[3].
A C function must make use of the general-purpose registers as shown in Table 10.1.1. The column labeled “Restore Contents?” shows whether the function needs to ensure that the value in the register is the same when it returns to the calling function as it was when this function was called. This may seem to be very limiting, but you will see below how to save the contents of a register and restore the contents later in the function. This allows you to use the register within a function without disturbing the calling function's use of it.
Register | Synonym | Restore Contents? | Purpose |
r0 | | N | argument/results |
r1 | | N | argument/results |
r2 | | N | argument/results |
r3 | | N | argument/results |
r4 | | Y | local variable |
r5 | | Y | local variable |
r6 | | Y | local variable |
r7 | | Y | local variable |
r8 | | Y | local variable |
r9 | | Y | depends on platform standard |
r10 | | Y | local variable |
r11 | fp | Y | frame pointer/local variable |
r12 | ip | N | intra-procedure-call scratch |
r13 | sp | Y | stack pointer |
r14 | lr | N | link register |
r15 | pc | N | program counter |
When one C function calls another, only the first four arguments to the called function are passed in registers. Reading a C argument list from left to right, Table 10.1.2 shows the order in which the arguments are stored in the registers. If more arguments need to be passed, they are stored on the stack, which will be explained below.
Argument | Register |
arg1 | r0 |
arg2 | r1 |
arg3 | r2 |
arg4 | r3 |
We start with a program that takes no input from the user—the “Hello World” program. It simply writes constant data to the screen.
In Section 2.15 you learned how to call the write function in C. We will now learn how to call it in assembly language. The C program in Listing 10.1.3 uses the write system call function to display “Hello world.” in your terminal window.
Reading the argument list from left to right:
STDOUT_FILENO is the file descriptor of standard out, normally the screen. This symbolic name is defined in the unistd.h header file.
Although the C syntax allows a programmer to place the text string here, only its address is passed to write, not the entire string.
This is the number of characters in the text string to write to STDOUT_FILENO, which the programmer has counted. (If you think that counting characters is a good job for a computer to do, you are on the right track.)
This program uses only constant data—the file descriptor number, the text string “Hello world.”, and the number of bytes that make up the text string. Constant data used by a program is part of the program itself and is not changed by the program.
The compiler-generated assembly language is shown in Listing 10.1.4. I have added some comments, using “@@”, to help explain what this code is doing.
Storing the constant data with the function introduces four new assembler directives. The first two,
.section .rodata
direct the assembler to place what follows in the “Read Only” data section. The operating system will prevent the program from changing these memory contents when the program is executing.
The .ascii directive,
.ascii "Hello world.\012\000"
directs the assembler to store the ASCII value of each character in memory. Comparing the text string here with the original C source code, we see that the compiler has replaced ‘\n’ with ‘\012’. The ‘\’ is used as an escape character to indicate that this is the bit pattern of this byte. The as assembler is using C notation here, so the leading ‘0’ shows that ‘12’ is the octal value of the newline character. (Type “man ascii” in a terminal window.) Recall that C-style text strings are terminated with a ‘NUL’ character, ‘\000’.
Be sure to notice that once the constant data has been taken care of, the assembler is directed to change to the .text segment.
Near the end of the function there is another assembler directive, which is labeled:
.L3: .word .LC0 @@ address of text string
.word directs the assembler to allocate a word of memory and store its argument there. The argument here is .LC0, which is the address of the text string. Addresses are 32 bits. Using the mov instruction to load constants into a register has some restrictions (Section 11.3.3), so we have to store the 32-bit value in memory and use an ldr instruction to load the address into a register.
Common assembler directives for allocating memory for data are shown in Table 10.1.5. All but the .space directive initialize memory to the value(s) indicated in the table. If these are used in the .rodata section, the values cannot be changed under program control.
[label] | .space | expression | Evaluates expression and allocates that many bytes; memory not changed. |
[label] | .string | “text” | Allocates number of bytes in text string, plus NUL byte at end; initializes them to the ASCII codes for the string. |
[label] | .asciz | “text” | Same as .string. |
[label] | .ascii | “text” | Same as .asciz, but without NUL byte. |
[label] | .byte | expression | Allocates one byte; initializes it to value of expression. |
[label] | .hword | expression | Allocates two bytes; initializes them to value of expression. |
[label] | .word | expression | Allocates four bytes; initializes them to value of expression. |
[label] | .quad | expression | Allocates eight bytes; initializes them to value of expression. |
When the compiler needs to generate labels, it starts them with the ‘.’ character. This helps to avoid potential conflicts with global labels that a programmer has used, for example the name of another function. You can tell that these are labels and not assembler directives because they end with the ‘:’ character.
The arguments to be passed to the write function are loaded into the appropriate registers with the three instructions:
mov r0, #1 @@ STDOUT_FILENO is 1
ldr r1, .L3 @@ address of text string
mov r2, #14 @@ number of bytes to write
Listing 10.1.4 introduces three more instructions:
bl is used to call functions.
pop is used to load registers with values from the stack.
push is used to save register contents on the stack.
BL
Calls a function at a pc-relative address.
BL{<c>} <label>
<c> is the condition code, Table 9.2.1.
<label> is a labeled memory address.
The address of the instruction immediately following this bl instruction is loaded into the lr register, and the address of <label> is moved to the pc, thus causing program execution to branch to that location.
The push and pop instructions use the stack. Before writing our own assembly language version of this program, I will explain how the stack is used in Sections 10.2–10.3.
POP
Loads multiple words from the top of stack memory into a list of registers, starting with the lowest numbered register and continuing in ascending register number.
POP<c> <registers> % use {reg1, reg2,...}
LDMIA<c> <Rn>{!}, <registers> % equivalent when Rn = sp!
LDMFD<c> <Rn>{!}, <registers> % equivalent when Rn = sp!
<c> is the condition code, Table 9.2.1.
<Rn> is the base register, which contains the memory address where the load begins. The most common use is the stack pointer, sp.
<registers> is a comma-separated list of the destination registers. The list is enclosed in curly braces, as shown in Listing 10.1.4.
If ‘!’ is used, <Rn> is updated to the address just past the last word loaded.
These three instructions are the same when sp is the base register and the ‘!’ is used.
PUSH
Stores multiple words from a list of registers into the top of stack memory, starting with the highest numbered register and continuing in descending register number.
PUSH<c> <registers> % use {reg1, reg2,...}
STMDB<c> <Rn>{!}, <registers> % equivalent when Rn = sp!
STMFD<c> <Rn>{!}, <registers> % equivalent when Rn = sp!
<c> is the condition code, Table 9.2.1.
<Rn> is the base register, which contains the memory address where the store begins. The most common use is the stack pointer, sp.
<registers> is a comma-separated list of the source registers. The list is enclosed in curly braces, as shown in Listing 10.1.4.
If ‘!’ is used, <Rn> is updated to the last address stored to.
These three instructions are the same when sp is the base register and the ‘!’ is used.
The push and pop instructions should be used in a complementary way. If you push register values onto the stack, the complementary pop instruction would have the same register list. Since it starts with the lowest numbered register, it would restore the original values into the registers on the list as it pops each value off the stack.
You may wonder why the very simple functions in Chapter 9 used the str/ldr instructions to save/restore the stack pointer without creating a frame pointer, while the main function here uses the push/pop instructions so that a frame pointer can be created. The reason is that this main function calls another function, while the functions in Chapter 9 do not. A frame pointer is a technique to “save our place” on the stack when calling other functions.