Registers r0–r3 and a portion of the call stack are used for the activation record. The area of the stack used for the activation record is called a stack frame. A function sets up its own stack frame and usually stores the following information in it:
The return address back to the calling function.
The calling function's frame pointer.
Register values that must be saved for the calling function.
Local variables for the current function.
Arguments (or their addresses) beyond those that fit within registers r0–r3 are placed on the stack by the calling function before branching to the called function. In these cases the calling function begins the creation of the stack frame. An argument that is less than 32 bits, for example a char, is passed in a full 32-bit word.
Listings 13.2.1–13.2.3 show a program that passes nine arguments to a function, sumNine.
The subfunction here, defined in sumNine1.c, is accompanied by a header file. The header file contains the function declaration without the function definition. Including the header file in the calling function's file provides the compiler with a prototype of how the subfunction is called. Thus the compiler knows how to generate the assembly language code to call the subfunction.
Note that we have also included the header file in the file that defines the subfunction. This provides a double check that the function declaration in the header file matches the header of the function definition.
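As a rough sketch of the arrangement just described, the declaration and the definition might look like the following. The listings themselves are not reproduced here, so the header file name sumNine.h, the include guard, and the int parameter types are assumptions; the parameter names follow the call sumNine(a, b, c, d, e, f, g, h, i). Both pieces are shown in one block, with comments marking the file each would normally live in.

```c
/* ---- sumNine.h (assumed file name): declaration only ---- */
#ifndef SUMNINE_H
#define SUMNINE_H
int sumNine(int a, int b, int c, int d, int e,
            int f, int g, int h, int i);
#endif

/* ---- sumNine1.c: would begin with #include "sumNine.h", so the
 *      compiler checks this definition against the declaration ---- */
int sumNine(int a, int b, int c, int d, int e,
            int f, int g, int h, int i)
{
    /* On 32-bit ARM, a..d are passed in r0-r3 and e..i on the stack. */
    return a + b + c + d + e + f + g + h + i;
}
```

If the parameter list in the definition drifted out of step with the one in the header, the compiler would report conflicting types, which is the double check the text refers to.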
When writing in assembly language, you do not use header files to provide declarations for the functions you are calling. Since there is no compilation phase, you need to understand the calling sequence and write the correct code yourself. On the other hand, if you write a function in assembly language that you wish to make callable from C code, you need to supply a header file with the function declaration so the compiler knows how to call your assembly language function.
Listings 13.2.6–13.2.7 show my assembly language version of the program in Listings 13.2.1–13.2.3. I have simplified the sumNine function by directly using the arguments passed in registers. The compiler version (Listing 13.2.4) first saved these arguments in the local stack frame, thus freeing up the registers for local use.
Because only four registers are available for passing arguments, the main function places the remaining five arguments on the stack, as shown in Figure 13.2.8. When the subfunction is called, the stack pointer points to the top of this block of arguments, which is its lowest address.
The gcc compiler chose to allocate space on the stack for both the local variables and the stack arguments at the same time, but I chose to separate the two operations. Placing the stack argument allocation near the call to the function:
@ Function call: sumNine(a, b, c, d, e, f, g, h, i);
sub sp, sp, #argSz @ space for args
helps to prevent programming errors if the call to sumNine is changed. This technique also eliminates the need to read through the entire function before computing the overall amount of memory required on the stack. Also notice that I deallocate the space on the stack used for passing arguments immediately upon return from the function:
bl sumNine
add sp, sp, argSz @ restore sp
This technique treats each function call more like a single function call statement in a high-level language. It would be highly unusual for the additional stack pointer operations to have a measurable effect on program performance.
The prologue in the subfunction:
sub sp, sp, 8 @ space for fp, lr
str fp, [sp, 0] @ save fp
str lr, [sp, 4] @ and lr
add fp, sp, 4 @ set our frame pointer
saves the caller's frame pointer and the address in the link register (the return address) on the stack, and then sets the frame pointer for the subfunction. This places the stack in the state shown in Figure 13.2.9. Within the called function, we access the arguments relative to the frame pointer.
The overall pattern of a stack frame is shown in Figure 13.2.10. The r11 register serves as the frame pointer to the stack frame. Once the frame pointer address has been established in a function, its value must never be changed within that function. The frame pointer points to the return address, and the calling function's frame pointer is located at an offset of \(-4\) bytes from the frame pointer. Arguments passed on the stack are at positive offsets from the frame pointer, and local variables are at negative offsets from the frame pointer.
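The offsets described above can be checked with a small model written in C. This is only a sketch of the layout, not code from the listings: the Machine type and the helper names are hypothetical, and the 24-byte argument area assumes that the five argument words are rounded up to a multiple of 8 bytes for stack alignment (the text only names this size argSz).

```c
#include <stdint.h>

#define STACK_BYTES 256

/* Toy machine: a small memory addressed in bytes, plus sp, fp, and lr. */
typedef struct {
    uint32_t mem[STACK_BYTES / 4];
    uint32_t sp, fp, lr;
} Machine;

void store_word(Machine *m, uint32_t addr, uint32_t v) { m->mem[addr / 4] = v; }
uint32_t load_word(const Machine *m, uint32_t addr) { return m->mem[addr / 4]; }

/* Caller: reserve the argument area, then store e..i with e at the
 * lowest address, as if they had been pushed in right-to-left order. */
void push_stack_args(Machine *m, const uint32_t args[5])
{
    m->sp -= 24;   /* 5 words rounded up to a multiple of 8 (assumption) */
    for (int k = 0; k < 5; k++)
        store_word(m, m->sp + 4 * k, args[k]);
}

/* Callee prologue, mirroring the four instructions in the text. */
void prologue(Machine *m)
{
    m->sp -= 8;                        /* sub sp, sp, 8   */
    store_word(m, m->sp + 0, m->fp);   /* str fp, [sp, 0] */
    store_word(m, m->sp + 4, m->lr);   /* str lr, [sp, 4] */
    m->fp = m->sp + 4;                 /* add fp, sp, 4   */
}

/* k-th stack argument; k = 0 is the fifth argument overall. */
uint32_t stack_arg(const Machine *m, int k)
{
    return load_word(m, m->fp + 4 + 4 * k);
}
```

In this model, after push_stack_args and prologue have run, the word at fp holds the return address, the word at fp - 4 holds the caller's frame pointer, and the fifth through ninth arguments sit at fp + 4 through fp + 20.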
It is essential that you follow the register usage and argument passing disciplines precisely. Any deviation can cause errors that are very difficult to debug.
In the calling function:
Assume that the values in the r0, r1, r2, and r3 registers will be changed by the called function.
The first four arguments are passed in the r0, r1, r2, and r3 registers in left-to-right order.
Arguments beyond four are stored on the stack as though they had been pushed onto the stack in right-to-left order. Yes, the order is opposite.
Use the bl instruction to invoke the function you wish to call.
Upon entering the called function:
Save on the stack the value in the link register (r14), the caller's frame pointer (in register r11), and the contents of any additional registers that must be restored before returning.
Establish a new frame pointer address by adding \(4 \times (n - 1)\), where \(n\) is the number of registers saved, to the address in the stack pointer (r13).
Allocate space on the stack for all the local variables, plus any required register save space, by subtracting the number of bytes required from sp, observing stack alignment restrictions.
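The arithmetic in these steps can be sketched with two small C helpers. The function names are hypothetical, and the alignment rule is an assumption: the text only says "stack alignment restrictions", and this sketch takes that to mean sp is a multiple of 8 on entry and must remain a multiple of 8 once the frame is built.

```c
/* Offset added to sp to form the frame pointer after n_saved registers
 * (counting fp and lr, so n_saved >= 1) have been stored: 4 * (n - 1). */
unsigned fp_offset(unsigned n_saved)
{
    return 4u * (n_saved - 1u);
}

/* Bytes to subtract from sp for local variables so that the whole frame
 * (saved registers plus locals) is a multiple of 8 bytes.
 * Assumes sp was 8-byte aligned when the function was entered. */
unsigned local_alloc_bytes(unsigned n_saved, unsigned local_bytes)
{
    unsigned frame = 4u * n_saved + local_bytes;
    unsigned aligned = (frame + 7u) & ~7u;   /* round up to a multiple of 8 */
    return aligned - 4u * n_saved;
}
```

For the prologue shown earlier, only fp and lr are saved, so n is 2 and fp_offset(2) gives 4, which matches the add fp, sp, 4 instruction. Under the stated alignment assumption, a function that saves those two registers and needs 12 bytes of local variables would subtract 16 bytes from sp rather than 12.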
Within the called function:
Do not use the stack pointer to access arguments or local variables. sp is pointing to the current bottom of the portion of the stack that is accessible to this function, observing the usual stack discipline.
Never change the value in the frame pointer, r11.
Local variables are on the stack and are accessed through negative offsets from the frame pointer.
Arguments passed on the stack to the function are accessed through positive offsets from the frame pointer.
When leaving the called function:
Place the return value, if any, in r0.
Deallocate the local variables by adding the same amount to sp that was subtracted at the beginning of the function.
Restore the saved register values from the stack into the proper registers.
Use the bx lr instruction to return control to the calling function.