
Section 10.1 Passing Arguments in Registers

If there were no other programs running on your Raspberry Pi, you could use most of the registers any way you like. But an operating system is a collection of programs that coordinate all the activities of the hardware, and they expect that register usage will follow specific rules. To make sure all the programs work together, ARM Inc. publishes manuals describing the rules. We will be following the rules described in Procedure Call Standard for the ARM Architecture[A.1.3].

A C function must make use of the general purpose registers as shown in Table 10.1.1. The column labeled “Restore Contents?” shows whether the function needs to ensure that the value in the register when it returns to the calling function is the same as the value it contained when this function was called. This may seem to be very limiting, but you will see below how to save the contents of a register and restore the contents later in the function. This allows you to use the register within a function without disturbing the calling function's use of it.

Register  Synonym  Restore Contents?  Purpose
r0                 N                  argument/results
r1                 N                  argument/results
r2                 N                  argument/results
r3                 N                  argument/results
r4                 Y                  local variable
r5                 Y                  local variable
r6                 Y                  local variable
r7                 Y                  local variable
r8                 Y                  local variable
r9                 Y                  depends on platform standard
r10                Y                  local variable
r11       fp       Y                  frame pointer/local variable
r12       ip       N                  intra-procedure-call scratch
r13       sp       Y                  stack pointer
r14       lr       N                  link register
r15       pc       N                  program counter
Table 10.1.1 Register usage by a called function.

When one C function calls another, only the first four arguments to the called function are passed in registers. Reading a C argument list from left to right, Table 10.1.2 shows the order in which the arguments are stored in the registers. If more arguments need to be passed, they are stored on the stack, which will be explained below.

Argument  Register
arg1      r0
arg2      r1
arg3      r2
arg4      r3
Table 10.1.2 Order of storing arguments in registers before calling another C function, reading the C argument list from left to right. Only the first four arguments are passed in registers.
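
As a small illustration of Table 10.1.2, the C sketch below annotates where each argument would be placed when the code is compiled for 32-bit ARM under the procedure call standard. The function names sumFour and sumFive are hypothetical, the register assignments appear only in comments, and they assume every argument is a 32-bit int.

```c
/* Hypothetical functions; the register comments assume 32-bit ARM
 * and the procedure call standard used in this book. */

/* All four arguments fit in registers. */
int sumFour(int a,   /* arg1: r0 */
            int b,   /* arg2: r1 */
            int c,   /* arg3: r2 */
            int d)   /* arg4: r3 */
{
  return a + b + c + d;   /* the result is returned in r0 */
}

/* The fifth argument does not fit in r0-r3, so the caller
 * passes it on the stack. */
int sumFive(int a, int b, int c, int d,
            int e)   /* arg5: on the stack */
{
  return a + b + c + d + e;
}
```

Compiling a file like this with gcc -S on a Raspberry Pi should let you compare the generated code against the table.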

We start with a program that takes no input from the user—the “Hello World” program. It simply writes constant data to the screen.

In Section 2.8 you learned how to call the write function in C. We will now learn how to call it in assembly language. The C program in Listing 10.1.3 uses the write system call function to display “Hello, World!” in your terminal window.

/*
 * helloWorld1.c
 * "Hello World" program using the write() system call.
 * Bob Plantz - 26 July 2016
 */
#include <unistd.h>

int main(void)
{
  write(STDOUT_FILENO, "Hello, World!\n", 14);
  
  return 0;
}
Listing 10.1.3 “Hello World” program using the write system call function (C).

Reading the argument list from left to right:

  1. STDOUT_FILENO is the file descriptor of standard out, normally the screen. This symbolic name is defined in the unistd.h header file.

  2. Although the C syntax allows a programmer to place the text string here, only its address is passed to write, not the entire string.

  3. This is the number of characters in the text string to write to STDOUT_FILENO, which the programmer has counted. (If you think that counting characters is a good job for a computer to do, you are on the right track.)
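
Following up on the remark about counting characters, here is a minimal sketch of letting the compiler do the counting with sizeof. The names message, messageLength, and writeMessage are hypothetical and are not part of the original listing.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical name; the array holds the text plus a terminating NUL. */
static const char message[] = "Hello, World!\n";

/* Number of bytes to write: the array size minus the NUL byte. */
size_t messageLength(void)
{
  return sizeof(message) - 1;
}

/* Writes the message to standard out; returns whatever write() returns. */
ssize_t writeMessage(void)
{
  return write(STDOUT_FILENO, message, messageLength());
}
```

If the text is edited later, the byte count passed to write follows automatically, which avoids a mismatch between the string and a hand-counted length.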

This program uses only constant data: the file descriptor number, the text string “Hello, World!”, and the number of bytes that make up the text string. Constant data used by a program is part of the program itself and is not changed by the program.

The compiler-generated assembly language is shown in Listing 10.1.4. I have added some comments—using “@@”—to help explain what this code is doing.

        .arch armv6
        .fpu vfp
        .file   "helloWorld1.c"
        .section   .rodata
        .align  2
.LC0:
        .ascii  "Hello, World!\012\000"
        .text
        .align  2
        .global main
        .type   main, %function
main:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 1, uses_anonymous_args = 0
        stmfd   sp!, {fp, lr}
        add     fp, sp, #4
        mov     r0, #1           @@ STDOUT_FILENO is 1
        ldr     r1, .L3          @@ address of text string
        mov     r2, #14          @@ number of bytes to write
        bl      write
        mov     r3, #0
        mov     r0, r3
        ldmfd   sp!, {fp, pc}
.L4:
        .align  2
.L3:
        .word   .LC0
        .ident  "GCC: (Raspbian 4.9.2-10) 4.9.2"
Listing 10.1.4 “Hello World” program using the write system call function (gcc asm).

Note: In this and subsequent “gcc asm” listings I will not show the .eabi_attribute and .fpu assembler directives because they do not apply to the programs we are writing in this book. Although the .arch, .file, and .ident directives are not required for the programming in this book, I will leave them in to help document the parameters of the compilation.

Storing the constant data with the function introduces four new assembler directives. The first two,

.section  .rodata

direct the assembler to place what follows in the “Read Only” data section. The operating system will prevent the program from changing these memory contents when the program is executing.

The .ascii directive,

.ascii  "Hello, World!\012\000"

directs the assembler to store the ASCII value of each character in memory. Comparing the text string here with the original C source code, we see that the compiler has replaced ‘\n’ with ‘\012’. The ‘\’ is an escape character; here it indicates that the digits that follow give the bit pattern of the byte. The as assembler uses C notation, in which a ‘\’ followed by up to three digits specifies the value in octal, so ‘\012’ is the octal value of the newline character, which is 10 in decimal. (Type “man ascii” in a terminal window.) Recall that C-style text strings are terminated with a ‘NUL’ character, ‘\000’.
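
A quick way to check these byte values is to ask the C compiler. The sketch below is only an illustration, and the names text, newlineIsOctal12, storedBytes, and lastByteIsNul are hypothetical. It checks that ‘\012’ and ‘\n’ are the same byte and that the compiler appends the NUL byte to a string literal.

```c
#include <stddef.h>

/* Hypothetical name; the same bytes as the string in Listing 10.1.4. */
static const char text[] = "Hello, World!\012";

/* Returns 1 if '\012' and '\n' are the same byte value. */
int newlineIsOctal12(void)
{
  return '\012' == '\n';
}

/* Total bytes stored: 14 characters plus the NUL byte. */
size_t storedBytes(void)
{
  return sizeof(text);
}

/* Returns 1 if the last byte stored is the NUL character, '\000'. */
int lastByteIsNul(void)
{
  return text[sizeof(text) - 1] == '\000';
}
```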

Be sure to notice that once the constant data has been taken care of, the assembler is directed to change to the .text segment.

Near the end of the function there is another assembler directive, which is labeled.

.L3:
        .word   .LC0           @@ address of text string

.word directs the assembler to allocate a word of memory and store its argument there. The argument here is .LC0, which is the address of the text string. Addresses are 32 bits. Using the mov instruction to load constants into a register has some restrictions (Section 11.3.3) so we have to store the 32-bit value in memory and use an ldr instruction to load the address into a register.

Common assembler directives for allocating memory for data are shown in Table 10.1.5. Each directive initializes memory to the value(s) indicated in the table; the .space directive fills the bytes it allocates with zeros. If these are used in the .rodata section, the values cannot be changed under program control.

[label] .space expression   Evaluates expression and allocates that many bytes, each initialized to zero.
[label] .string "text"      Allocates number of bytes in text string, plus NUL byte at end, and initializes memory to the ASCII codes for the string.
[label] .asciz "text"       Same as .string.
[label] .ascii "text"       Same as .asciz, but without NUL byte.
[label] .byte expression    Allocates one byte and initializes it to the value of expression.
[label] .hword expression   Allocates two bytes and initializes them to the value of expression.
[label] .word expression    Allocates four bytes and initializes them to the value of expression.
[label] .quad expression    Allocates eight bytes and initializes them to the value of expression.
Table 10.1.5 Common assembler directives for allocating memory. The label is optional.
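
For readers more at home in C, the sizes in Table 10.1.5 can be related to C's fixed-width types. This is only a rough analogy, and the variable names are hypothetical. The correspondence in the comments assumes the 32-bit ARM target used in this book, where a word is four bytes.

```c
#include <stdint.h>

/* Hypothetical read-only data, one item per directive in the table. */
const uint8_t  oneByte   = 0x12;                   /* like .byte  */
const uint16_t halfWord  = 0x1234;                 /* like .hword */
const uint32_t fullWord  = 0x12345678;             /* like .word  */
const uint64_t quadWord  = 0x123456789abcdef0ULL;  /* like .quad  */

/* A string literal with room for the NUL byte: like .asciz or .string. */
const char withNul[] = "abc";

/* An array sized exactly to the text stores no NUL byte: like .ascii. */
const char noNul[3] = "abc";
```

Applying sizeof to each item gives 1, 2, 4, and 8 bytes for the integer types, 4 bytes for withNul, and 3 bytes for noNul.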

When the compiler needs to generate labels, it starts them with the ‘.’ character. This helps to avoid potential conflicts with global labels that a programmer has used, for example the name of another function. You can tell that these are labels and not assembler directives because they end with the ‘:’ character.

The arguments to be passed to the write function are loaded into the appropriate registers with these three instructions:

mov     r0, #1      @@ STDOUT_FILENO is 1
ldr     r1, .L3     @@ address of text string
mov     r2, #14     @@ number of bytes to write

Listing 10.1.4 introduces three more instructions:

  • bl is used to call functions.

  • ldmfd is used to load registers with values from the stack.

  • stmfd is used to save register contents on the stack.

The stack is described in Section 10.2.

BL

Calls a function at a pc-relative address.

BL{<c>}    <label>
  • <c> is the condition code, Table 9.2.1.

  • <label> is a labeled memory address.

The address of the instruction immediately following this bl instruction is loaded into the lr register, and the address of <label> is moved to the pc, thus causing program execution to branch to that location.
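
The effect of bl on the two registers can be modeled in a few lines of C. This is a toy model for illustration only: the Cpu structure and the branchWithLink function are hypothetical, and the model assumes each instruction is four bytes long, as in the ARM instruction set used in Listing 10.1.4.

```c
#include <stdint.h>

/* Hypothetical model of the two registers that bl changes. */
typedef struct {
  uint32_t lr;   /* link register   */
  uint32_t pc;   /* program counter */
} Cpu;

/* Models "bl label", where blAddress is the address of the bl
 * instruction itself and labelAddress is the address of label. */
void branchWithLink(Cpu *cpu, uint32_t blAddress, uint32_t labelAddress)
{
  cpu->lr = blAddress + 4;   /* return address: the next instruction */
  cpu->pc = labelAddress;    /* execution continues at the label     */
}
```

When the called function later copies lr back into pc, execution resumes at the instruction after the bl.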

The ldmfd and stmfd instructions use the stack. Before writing our own assembly language version of this program, I will explain how the stack is used in the next two sections.

LDMFD

Loads multiple words from memory into a list of registers, starting with the lowest numbered register and continuing in ascending register number.

LDMFD<c>  <Rn>{!}, <registers>   % use {reg1, reg2,...}
  • <c> is the condition code, Table 9.2.1.

  • <Rn> is the base register, which contains the memory address where the load begins. A common use is the stack pointer, sp.

  • <registers> is a comma-separated list of the destination registers. The list is enclosed in curly braces, as shown in Listing 10.1.4.

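  • If ‘!’ is used, <Rn> is updated to the address immediately following the last word loaded, that is, its original value plus four times the number of registers in the list.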

STMFD

Stores multiple words from a list of registers into a sequential area of memory, starting with the highest numbered register and continuing in descending register number.

STMFD<c>  <Rn>{!}, <registers>   % use {reg1, reg2,...}
  • <c> is the condition code, Table 9.2.1.

  • <Rn> is the base register, which contains the memory address just above where the store begins; the address is decremented by four before each word is stored. A common use is the stack pointer, sp.

  • <registers> is a comma-separated list of the source registers. The list is enclosed in curly braces, as shown in Listing 10.1.4.

  • If ‘!’ is used, <Rn> is updated to the last address stored to.

The ldmfd and stmfd instructions are typically used in a complementary way. If you use the stack pointer, sp, as the base register, stmfd acts to push register values onto the stack, starting with the highest numbered register in the list. The complementary ldmfd instruction would have the same register list. Since it starts with the lowest numbered register, it would restore the original values into the registers on the list as it pops each value off the stack. Many assemblers provide “push” and “pop” macros, but using them tends to mask the action that is actually taking place, so we will use the actual instructions in this book.
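
The complementary behavior can be sketched with a small C model. Everything here is hypothetical and simplified: the Machine structure, the function names, and the use of a word index rather than a byte address for sp are assumptions made for illustration, and the model assumes sp itself is not in the register list.

```c
#include <stdint.h>

#define STACK_WORDS 16

/* Hypothetical model: 16 registers and a small full-descending stack. */
typedef struct {
  uint32_t r[16];                 /* r[11] = fp, r[13] = sp, r[14] = lr */
  uint32_t memory[STACK_WORDS];   /* stack memory, indexed by word      */
} Machine;

/* Models "stmfd sp!, {registers}". Bit n of mask selects register n.
 * In this model sp holds a word index, not a byte address. */
void storeMultiple(Machine *m, uint16_t mask)
{
  for (int n = 15; n >= 0; n--) {     /* highest numbered register first */
    if (mask & (1u << n)) {
      m->r[13] -= 1;                  /* decrement before storing */
      m->memory[m->r[13]] = m->r[n];
    }
  }
}

/* Models "ldmfd sp!, {registers}" with the same mask convention. */
void loadMultiple(Machine *m, uint16_t mask)
{
  uint32_t sp = m->r[13];
  for (int n = 0; n <= 15; n++) {     /* lowest numbered register first */
    if (mask & (1u << n)) {
      m->r[n] = m->memory[sp];
      sp += 1;                        /* increment after loading */
    }
  }
  m->r[13] = sp;                      /* write the updated sp back */
}
```

Storing with the list {fp, lr} and then loading with the same list leaves fp, lr, and sp as they were before the store. In Listing 10.1.4 the ldmfd list names pc where the stmfd list named lr, so the saved return address is loaded directly into the program counter.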

You may wonder why the very simple functions in Chapter 9 used the str/ldr instructions to save/restore the stack pointer without creating a frame pointer, while the main function here uses the stmfd/ldmfd instructions so that a frame pointer can be created. The reason is that this main function calls another function, while the functions in Chapter 9 do not. A frame pointer is a technique to “save our place” on the stack when calling other functions.