Section 13.2 Accessing Arguments in a Function
Registers r0–r3 and a portion of the call stack are used for the activation record. The area of the stack used for the activation record is called a stack frame. A function sets up its own stack frame and usually stores the following information in it:
The return address back to the calling function.
The calling function's frame pointer.
Register values that must be saved for the calling function.
Local variables for the current function.
Arguments (or their addresses) beyond those that fit within registers r0–r3 are placed on the stack by the calling function before branching to the called function. In these cases the calling function begins the creation of the stack frame. An argument that is less than 32 bits, for example a char, is passed in a full 32-bit word.
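To make the placement rule concrete, here is a minimal C sketch that reports where each argument would go, assuming every argument occupies a single 32-bit word. The struct and function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Where one word-sized argument is placed under the rule described above.
   These names are hypothetical, introduced only for this sketch. */
struct arg_location {
    int in_register;   /* 1 if passed in r0-r3, 0 if passed on the stack */
    int reg;           /* register number 0-3, or -1 when on the stack */
    int stack_offset;  /* byte offset from sp at the call, or -1 when in a register */
};

/* index is zero-based: 0 is the first (leftmost) argument. */
struct arg_location locate_arg(int index)
{
    struct arg_location loc;

    assert(index >= 0);
    if (index < 4) {
        /* The first four arguments travel in r0, r1, r2, r3. */
        loc.in_register = 1;
        loc.reg = index;
        loc.stack_offset = -1;
    } else {
        /* Each later argument takes one full 32-bit word on the stack. */
        loc.in_register = 0;
        loc.reg = -1;
        loc.stack_offset = 4 * (index - 4);
    }
    return loc;
}
```

For a call with nine int arguments, this sketch puts the fifth argument at offset 0 from the stack pointer and the ninth at offset 16.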
Listings 13.2.1–13.2.3 show a program that passes nine arguments to a function, sumNine.
/* nineInts1.c
 * Declares and adds nine integers.
 * 2017-09-29: Bob Plantz
 */

#include <stdio.h>
#include "sumNine1.h"

int main(void)
{
  int total;
  int a = 1;
  int b = 2;
  int c = 3;
  int d = 4;
  int e = 5;
  int f = 6;
  int g = 7;
  int h = 8;
  int i = 9;

  total = sumNine(a, b, c, d, e, f, g, h, i);
  printf("The sum is %i\n", total);
  return 0;
}
Listing 13.2.1. Calling sumNine, showing the passing of nine arguments in C. Link with the file in Listing 13.2.3. (C)

/* sumNine1.h
 * Computes sum of nine integers.
 * 2017-09-29: Bob Plantz
 */

#ifndef SUMNINE_H
#define SUMNINE_H
int sumNine(int one, int two, int three, int four, int five,
            int six, int seven, int eight, int nine);
#endif
Listing 13.2.2. Header file for the sumNine function, which computes the integer sum of its nine arguments. (C)

/* sumNine1.c
 * Computes sum of nine integers.
 * 2017-09-29: Bob Plantz
 */

#include <stdio.h>
#include "sumNine1.h"

int sumNine(int one, int two, int three, int four, int five,
            int six, int seven, int eight, int nine)
{
  int x;

  x = one + two + three + four + five + six + seven + eight + nine;
  return x;
}
Listing 13.2.3. The sumNine function, which computes the integer sum of its nine arguments. (C)

The subfunction here, sumNine1.c, is accompanied by a header file. The header file contains the function declaration without the function definition. Including the header file in the calling function's file provides the compiler with a prototype of how the subfunction is called. Thus the compiler knows how to generate the assembly language code to call the subfunction.
Note that we have also included the header file in the file that defines the subfunction. This provides a double check that the function declaration in the header file matches the function definition header in its definition.
When writing in assembly language, you do not use header files to provide declarations for the functions you are calling. Since there is no compilation phase, you need to understand the calling sequence and write the correct code yourself. On the other hand, if you write a function in assembly language that you wish to make callable from C code, you need to supply a header file with the function declaration so the compiler knows how to call your assembly language function.
Listings 13.2.4–13.2.5 show the assembly language generated by the compiler for the functions in Listings 13.2.1–13.2.3.
        .arch armv6
        .section  .rodata
        .align  2
.LC0:
        .ascii  "The sum is %i\012\000"
        .text
        .align  2
        .global main
        .syntax unified
        .arm
        .fpu vfp
        .type   main, %function
main:
        @ args = 0, pretend = 0, frame = 40
        @ frame_needed = 1, uses_anonymous_args = 0
        push    {fp, lr}
        add     fp, sp, #4
        sub     sp, sp, #64      @@ space for locals and args
        mov     r3, #1
        str     r3, [fp, #-8]
        mov     r3, #2
        str     r3, [fp, #-12]
        mov     r3, #3
        str     r3, [fp, #-16]
        mov     r3, #4
        str     r3, [fp, #-20]
        mov     r3, #5
        str     r3, [fp, #-24]
        mov     r3, #6
        str     r3, [fp, #-28]
        mov     r3, #7
        str     r3, [fp, #-32]
        mov     r3, #8
        str     r3, [fp, #-36]
        mov     r3, #9
        str     r3, [fp, #-40]
        ldr     r3, [fp, #-40]
        str     r3, [sp, #16]    @@ arg i
        ldr     r3, [fp, #-36]
        str     r3, [sp, #12]    @@ arg h
        ldr     r3, [fp, #-32]
        str     r3, [sp, #8]     @@ arg g
        ldr     r3, [fp, #-28]
        str     r3, [sp, #4]     @@ arg f
        ldr     r3, [fp, #-24]
        str     r3, [sp]         @@ arg e
        ldr     r3, [fp, #-20]   @@ arg d
        ldr     r2, [fp, #-16]   @@ arg c
        ldr     r1, [fp, #-12]   @@ arg b
        ldr     r0, [fp, #-8]    @@ arg a
        bl      sumNine
        str     r0, [fp, #-44]
        ldr     r1, [fp, #-44]
        ldr     r0, .L3
        bl      printf
        mov     r3, #0
        mov     r0, r3
        sub     sp, fp, #4
        @ sp needed
        pop     {fp, pc}
.L4:
        .align  2
.L3:
        .word   .LC0
        .ident  "GCC: (Raspbian 6.3.0-18+rpi1) 6.3.0 20170516"
Listing 13.2.4. Calling sumNine, showing the assembly language for passing nine arguments in C. (gcc asm)

        .arch armv6
        .file   "sumNine1.c"
        .text
        .align  2
        .global sumNine
        .syntax unified
        .arm
        .fpu vfp
        .type   sumNine, %function
sumNine:
        @ args = 20, pretend = 0, frame = 24
        @ frame_needed = 1, uses_anonymous_args = 0
        @ link register save eliminated.
        str     fp, [sp, #-4]!
        add     fp, sp, #0
        sub     sp, sp, #28
        str     r0, [fp, #-16]   @@ save the arguments
        str     r1, [fp, #-20]   @@   passed in
        str     r2, [fp, #-24]   @@   registers
        str     r3, [fp, #-28]   @@   in local area
        ldr     r2, [fp, #-16]   @@ load arg one
        ldr     r3, [fp, #-20]   @@   and arg two
        add     r2, r2, r3       @@ r2 is subtotal
        ldr     r3, [fp, #-24]   @@ load arg three
        add     r2, r2, r3       @@ add to subtotal
        ldr     r3, [fp, #-28]   @@ etc...
        add     r2, r2, r3
        ldr     r3, [fp, #4]
        add     r2, r2, r3
        ldr     r3, [fp, #8]
        add     r2, r2, r3
        ldr     r3, [fp, #12]
        add     r2, r2, r3
        ldr     r3, [fp, #16]
        add     r2, r2, r3
        ldr     r3, [fp, #20]    @@ load arg nine
        add     r3, r2, r3       @@ add to subtotal
        str     r3, [fp, #-8]    @@ store in x
        ldr     r3, [fp, #-8]
        mov     r0, r3           @@ return x;
        add     sp, fp, #0
        @ sp needed
        ldr     fp, [sp], #4
        bx      lr
        .ident  "GCC: (Raspbian 6.3.0-18+rpi1) 6.3.0 20170516"
Listing 13.2.5. The sumNine function, which computes the integer sum of its nine integer arguments. (gcc asm)

Listings 13.2.6–13.2.7 show my assembly language version of the program in Listings 13.2.1–13.2.3. I have simplified the sumNine function by directly using the arguments passed in registers. The compiler version (Listing 13.2.5) first saved these arguments in the local stack frame, thus freeing up the registers for local use.
@ nineInts2.s
@ Sums 1 through 9
@ 2017-09-29: Bob Plantz

@ Define my Raspberry Pi
        .cpu    cortex-a53
        .fpu    neon-fp-armv8
        .syntax unified         @ modern syntax

@ Useful source code constants
        .equ    a,-8
        .equ    b,-12
        .equ    c,-16
        .equ    d,-20
        .equ    e,-24
        .equ    f,-28
        .equ    g,-32
        .equ    h,-36
        .equ    i,-40
        .equ    total,-44
        .equ    locals,40
@ Need 5 args on stack for sumNine function
        .equ    arg5,0
        .equ    arg6,4
        .equ    arg7,8
        .equ    arg8,12
        .equ    arg9,16
        .equ    argSz,24        @ 5x4, 8-byte aligned

@ Program constant data
        .section .rodata
        .align  2
resultMsg:
        .asciz  "The sum is %i\n"

@ The code
        .text
        .align  2
        .global main
        .type   main, %function
main:
        sub     sp, sp, 8       @ space for fp, lr
        str     fp, [sp, 0]     @ save fp
        str     lr, [sp, 4]     @   and lr
        add     fp, sp, 4       @ set our frame pointer
        sub     sp, sp, locals  @ space for locals

        mov     r3, 1           @ initialize local vars
        str     r3, [fp, a]     @ a = 1;
        mov     r3, 2
        str     r3, [fp, b]     @ b = 2;
        mov     r3, 3
        str     r3, [fp, c]     @ etc...
        mov     r3, 4
        str     r3, [fp, d]
        mov     r3, 5
        str     r3, [fp, e]
        mov     r3, 6
        str     r3, [fp, f]
        mov     r3, 7
        str     r3, [fp, g]
        mov     r3, 8
        str     r3, [fp, h]
        mov     r3, 9
        str     r3, [fp, i]     @ i = 9;

@ Function call: sumNine(a, b, c, d, e, f, g, h, i);
        sub     sp, sp, argSz   @ space for args
        ldr     r3, [fp, e]     @ set up args for call
        str     r3, [sp, arg5]  @ e is 5th arg
        ldr     r3, [fp, f]
        str     r3, [sp, arg6]  @ f is 6th arg
        ldr     r3, [fp, g]
        str     r3, [sp, arg7]  @ etc...
        ldr     r3, [fp, h]
        str     r3, [sp, arg8]
        ldr     r3, [fp, i]
        str     r3, [sp, arg9]
        ldr     r0, [fp, a]     @ args 1 - 4
        ldr     r1, [fp, b]     @   go in
        ldr     r2, [fp, c]     @   regs
        ldr     r3, [fp, d]     @   0 - 3
        bl      sumNine
        add     sp, sp, argSz   @ restore sp

        str     r0, [fp, total] @ total returned in r0
        ldr     r0, resultMsgAddr  @ print result
        ldr     r1, [fp, total]
        bl      printf

        mov     r0, 0           @ return 0;
        add     sp, sp, locals  @ deallocate local var
        ldr     fp, [sp, 0]     @ restore caller fp
        ldr     lr, [sp, 4]     @       lr
        add     sp, sp, 8       @   and sp
        bx      lr              @ return

        .align  2
resultMsgAddr:
        .word   resultMsg
Listing 13.2.6. Calling sumNine, showing the assembly language rules for passing nine arguments. Link with the file in Listing 13.2.7. (prog asm)

@ sumNine2.s
@ Computes sum of nine integers.
@ Calling sequence:
@       Four ints in r0 - r3
@       Five ints pushed onto stack
@       bl sumNine
@       Sum returned in r0
@ 2017-09-29: Bob Plantz

@ Define my Raspberry Pi
        .cpu    cortex-a53
        .fpu    neon-fp-armv8
        .syntax unified         @ modern syntax

@ Useful source code constants
        .equ    five,4          @ args 5 - 9
        .equ    six,8
        .equ    seven,12
        .equ    eight,16
        .equ    nine,20

        .text
        .align  2
        .global sumNine
        .type   sumNine, %function
sumNine:
        sub     sp, sp, 8       @ space for fp, lr
        str     fp, [sp, 0]     @ save fp
        str     lr, [sp, 4]     @   and lr
        add     fp, sp, 4       @ set our frame pointer

@ Sum four register arguments
        add     r0, r0, r1      @ subtotal = one + two
        add     r0, r0, r2      @ subtotal += three
        add     r0, r0, r3      @ subtotal += four
@ Add in five arguments from stack
        ldr     r3, [fp, five]  @ load five
        add     r0, r0, r3      @ subtotal += five
        ldr     r3, [fp, six]   @ load six
        add     r0, r0, r3      @ subtotal += six
        ldr     r3, [fp, seven] @ load seven
        add     r0, r0, r3      @ subtotal += seven
        ldr     r3, [fp, eight] @ load eight
        add     r0, r0, r3      @ subtotal += eight
        ldr     r3, [fp, nine]  @ load nine
        add     r0, r0, r3      @ total += nine
                                @ ready for return
        ldr     fp, [sp, 0]     @ restore caller fp
        ldr     lr, [sp, 4]     @       lr
        add     sp, sp, 8       @   and sp
        bx      lr              @ return
Because only four registers are available for passing arguments, the main function places the remaining five arguments on the stack, as shown in Figure 13.2.8. The stack pointer is pointing to the top of the stack of these arguments when the subfunction is called.
The gcc compiler chose to allocate space on the stack for both the local variables and the stack arguments at the same time, but I chose to separate the two operations. Placing the stack argument allocation near the call to the function:

@ Function call: sumNine(a, b, c, d, e, f, g, h, i);
        sub     sp, sp, #argSz  @ space for args

helps to prevent programming errors if the call to sumNine is changed. This technique also eliminates the need to read through the entire function before computing the overall amount of memory required on the stack. Also notice that I deallocate the space on the stack used for passing arguments immediately upon return from the function:

        bl      sumNine
        add     sp, sp, argSz   @ restore sp
This technique treats each function call more like a single function call statement in a high-level language. It would be highly unusual that the additional stack pointer operations would have a measurable effect on program performance.
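The value of argSz in Listing 13.2.6 comes from a small calculation: five stack arguments at four bytes each, rounded up to a multiple of eight. The following C sketch shows that arithmetic, assuming word-sized arguments and the 8-byte stack alignment noted in the listing's comment. The function name is hypothetical.

```c
#include <assert.h>

/* Bytes to subtract from sp before a call that passes nargs word-sized
   arguments. Hypothetical helper: assumes four register arguments and
   8-byte stack alignment, as in the argSz comment in the listing. */
unsigned stack_arg_bytes(unsigned nargs)
{
    unsigned on_stack = (nargs > 4) ? nargs - 4 : 0; /* args beyond r0-r3 */
    unsigned bytes = 4 * on_stack;                   /* one word per argument */
    unsigned rounded = (bytes + 7u) & ~7u;           /* round up to multiple of 8 */

    assert(rounded % 8 == 0);
    return rounded;
}
```

With nine arguments the sketch yields 24, the value used for argSz. With four or fewer it yields 0, so no stack adjustment is needed around the call.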
The prologue in the subfunction:
        sub     sp, sp, 8       @ space for fp, lr
        str     fp, [sp, 0]     @ save fp
        str     lr, [sp, 4]     @   and lr
        add     fp, sp, 4       @ set our frame pointer
saves the caller's frame pointer and the address in the link register (the return address) on the stack, and then sets the frame pointer for the subfunction. This places the stack in the state shown in Figure 13.2.9. Within the called function, we access the arguments relative to the frame pointer.
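One way to check the offsets is to trace the call by hand. The C sketch below is a toy model, not ARM code: it treats the stack as an array of 32-bit words and follows the same steps as the hand-written main and sumNine. The array size, the stand-in values for the saved fp and lr, and all function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: the stack is an array of 32-bit words, and sp and fp are
   byte addresses into it. Every name here is hypothetical. */
#define STACK_BYTES 256u

static int32_t stack_words[STACK_BYTES / 4];

static void store_word(uint32_t addr, int32_t value)
{
    assert(addr % 4 == 0 && addr < STACK_BYTES);
    stack_words[addr / 4] = value;
}

static int32_t load_word(uint32_t addr)
{
    assert(addr % 4 == 0 && addr < STACK_BYTES);
    return stack_words[addr / 4];
}

int32_t model_sum_nine(const int32_t args[9])
{
    uint32_t sp = STACK_BYTES;          /* empty descending stack */
    const uint32_t sp_at_start = sp;

    /* Caller: sub sp, sp, argSz, then store arguments 5-9 at [sp, 0..16]. */
    sp -= 24;
    for (uint32_t k = 4; k < 9; k++) {
        store_word(sp + 4 * (k - 4), args[k]);
    }
    int32_t r0 = args[0], r1 = args[1], r2 = args[2], r3 = args[3];

    /* Callee prologue: save fp and lr, then point fp at the saved lr. */
    sp -= 8;
    store_word(sp + 0, 0);              /* stand-in for the caller's fp */
    store_word(sp + 4, 0);              /* stand-in for the return address */
    uint32_t fp = sp + 4;

    /* Callee body: register arguments first, then [fp, 4] through [fp, 20]. */
    r0 = r0 + r1 + r2 + r3;
    for (uint32_t offset = 4; offset <= 20; offset += 4) {
        r0 += load_word(fp + offset);
    }

    /* Callee epilogue, then the caller's add sp, sp, argSz. */
    sp += 8;
    sp += 24;
    assert(sp == sp_at_start);
    return r0;
}

/* Mirrors the call in main: sumNine(1, 2, ..., 9). */
int32_t model_sum_one_to_nine(void)
{
    const int32_t args[9] = {1, 2, 3, 4, 5, 6, 7, 8, 9};
    return model_sum_nine(args);
}
```

In this model the fifth argument sits at fp + 4 because the frame pointer addresses the saved return address, which lies one word below the first stack argument.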
The overall pattern of a stack frame is shown in Figure 13.2.10. The r11 register serves as the frame pointer to the stack frame. Once the frame pointer address has been established in a function, its value must never be changed. The frame pointer points to the return address, and the calling function's frame pointer is located \(-4\) bytes offset from the frame pointer. Arguments to the function are positive offsets from the frame pointer, and local variables are negative offsets from the frame pointer.
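The .equ constants in the hand-written listings follow directly from this layout. As a rough sketch, the two hypothetical C helpers below compute the same offsets, assuming that only fp and lr are saved, that the frame pointer addresses the saved return address, and that every local variable and argument is one 32-bit word.

```c
#include <assert.h>

enum { WORD_BYTES = 4 };

/* Offset from fp of the k-th word-sized local variable (k = 0 is the first).
   The saved caller fp occupies offset -4, so locals start at -8.
   Hypothetical helper for illustration only. */
int local_offset(int k)
{
    assert(k >= 0);
    return -8 - WORD_BYTES * k;
}

/* Offset from fp of the n-th argument, counting from 1, for n >= 5.
   Arguments 1-4 are in registers and have no stack offset.
   Hypothetical helper for illustration only. */
int stack_arg_offset(int n)
{
    assert(n >= 5);
    return WORD_BYTES * (n - 4);
}
```

These reproduce the constants in the listings: a is at -8, i is at -40, and total is at -44 in main, while five is at 4 and nine is at 20 in sumNine.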
It is essential that you follow the register usage and argument passing disciplines precisely. Any deviation can cause errors that are very difficult to debug.
- In the calling function:
  - Assume that the values in the r0, r1, r2, and r3 registers will be changed by the called function.
  - The first four arguments are passed in the r0, r1, r2, and r3 registers in left-to-right order.
  - Arguments beyond four are stored on the stack as though they had been pushed onto the stack in right-to-left order. Yes, the order is opposite.
  - Use the bl instruction to invoke the function you wish to call.
- Upon entering the called function:
  - Save on the stack the value in the link register (r14), the caller's frame pointer (in register r11), and the contents of any additional registers that must be restored.
  - Establish a new frame pointer address by adding \(4 \times (n - 1)\text{,}\) where \(n =\) number of registers saved, to the address in the stack pointer (r13).
  - Allocate space on the stack for all the local variables, plus any required register save space, by subtracting the number of bytes required from sp, observing stack alignment restrictions.
- Within the called function:
  - Do not use the stack pointer to access arguments or local variables. sp is pointing to the current bottom of the portion of the stack that is accessible to this function, observing the usual stack discipline.
  - Never change the value in the frame pointer, r11.
  - Local variables are on the stack and are accessed through negative offsets from the frame pointer.
  - Arguments passed on the stack to the function are accessed through positive offsets from the frame pointer.
- When leaving the called function:
  - Place the return value, if any, in r0.
  - Deallocate the local variables by adding the same amount to sp that was subtracted at the beginning of the function.
  - Restore the saved register values from the stack into the proper registers.
  - The bx lr instruction returns control to the calling function.
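The frame pointer rule in the list above can be checked against both versions of sumNine. The C sketch below evaluates \(4 \times (n - 1)\) for a given stack pointer. The function name and the example address are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* New frame pointer, given the stack pointer after n_saved registers have
   been stored on the stack. Hypothetical helper for illustration only. */
uint32_t new_frame_pointer(uint32_t sp_after_save, unsigned n_saved)
{
    assert(n_saved >= 1);
    return sp_after_save + 4u * (n_saved - 1u);
}
```

With two saved registers (fp and lr) the result is sp + 4, matching add fp, sp, 4 in the hand-written prologue. With one saved register the result is sp + 0, matching add fp, sp, #0 in the compiler's sumNine, which saves only fp.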