Section 13.2 Accessing Arguments in a Function
Registers r0–r3 and a portion of the call stack are used for the activation record. The area of the stack used for the activation record is called a stack frame. A function sets up its own stack frame and usually stores the following information in it:
The return address back to the calling function.
The calling function's frame pointer.
Register values that must be saved for the calling function.
Local variables for the current function.
Arguments (or their addresses) beyond those that fit within registers r0–r3 are placed on the stack by the calling function before branching to the called function. In these cases the calling function begins the creation of the stack frame. An argument that is less than 32 bits, for example a char, is passed in a full 32-bit word.
Listings 13.2.1–13.2.3 show a program that passes nine arguments to a function, sumNine.
/* nineInts1.c
* Declares and adds nine integers.
* 2017-09-29: Bob Plantz
*/
#include <stdio.h>
#include "sumNine1.h"
int main(void)
{
    int total;
    int a = 1;
    int b = 2;
    int c = 3;
    int d = 4;
    int e = 5;
    int f = 6;
    int g = 7;
    int h = 8;
    int i = 9;

    total = sumNine(a, b, c, d, e, f, g, h, i);
    printf("The sum is %i\n", total);

    return 0;
}
sumNine, showing the passing of nine arguments in C. Link with the file in Listing 13.2.3. (C)
/* sumNine1.h
* Computes sum of nine integers.
* 2017-09-29: Bob Plantz
*/
#ifndef SUMNINE_H
#define SUMNINE_H
int sumNine(int one, int two, int three, int four, int five,
            int six, int seven, int eight, int nine);
#endif
sumNine function, which computes the integer sum of its nine arguments. (C)
/* sumNine1.c
* Computes sum of nine integers.
* 2017-09-29: Bob Plantz
*/
#include <stdio.h>
#include "sumNine1.h"
int sumNine(int one, int two, int three, int four, int five,
            int six, int seven, int eight, int nine)
{
    int x;

    x = one + two + three + four + five + six
        + seven + eight + nine;
    return x;
}
sumNine function, which computes the integer sum of its nine arguments. (C)
The subfunction here, sumNine1.c, is accompanied by a header file. The header file contains the function declaration without the function definition. Including the header file in the calling function's file provides the compiler with a prototype of how the subfunction is called. Thus the compiler knows how to generate the assembly language code to call the subfunction.
Note that we have also included the header file in the file that defines the subfunction. This provides a double check that the function declaration in the header file matches the function definition header in its definition.
When writing in assembly language, you do not use header files to provide declarations for the functions you are calling. Since there is no compilation phase, you need to understand the calling sequence and write the correct code yourself. On the other hand, if you write a function in assembly language that you wish to make callable from C code, you need to supply a header file with the function declaration so the compiler knows how to call your assembly language function.
Listings 13.2.4–13.2.5 show the assembly language generated by the compiler for the functions in Listings 13.2.1–13.2.3.
.arch armv6
.section .rodata
.align 2
.LC0:
.ascii "The sum is %i\012\000"
.text
.align 2
.global main
.syntax unified
.arm
.fpu vfp
.type main, %function
main:
@ args = 0, pretend = 0, frame = 40
@ frame_needed = 1, uses_anonymous_args = 0
push {fp, lr}
add fp, sp, #4
sub sp, sp, #64 @@ space for locals and args
mov r3, #1
str r3, [fp, #-8]
mov r3, #2
str r3, [fp, #-12]
mov r3, #3
str r3, [fp, #-16]
mov r3, #4
str r3, [fp, #-20]
mov r3, #5
str r3, [fp, #-24]
mov r3, #6
str r3, [fp, #-28]
mov r3, #7
str r3, [fp, #-32]
mov r3, #8
str r3, [fp, #-36]
mov r3, #9
str r3, [fp, #-40]
ldr r3, [fp, #-40]
str r3, [sp, #16] @@ arg i
ldr r3, [fp, #-36]
str r3, [sp, #12] @@ arg h
ldr r3, [fp, #-32]
str r3, [sp, #8] @@ arg g
ldr r3, [fp, #-28]
str r3, [sp, #4] @@ arg f
ldr r3, [fp, #-24]
str r3, [sp] @@ arg e
ldr r3, [fp, #-20] @@ arg d
ldr r2, [fp, #-16] @@ arg c
ldr r1, [fp, #-12] @@ arg b
ldr r0, [fp, #-8] @@ arg a
bl sumNine
str r0, [fp, #-44]
ldr r1, [fp, #-44]
ldr r0, .L3
bl printf
mov r3, #0
mov r0, r3
sub sp, fp, #4
@ sp needed
pop {fp, pc}
.L4:
.align 2
.L3:
.word .LC0
.ident "GCC: (Raspbian 6.3.0-18+rpi1) 6.3.0 20170516"
sumNine, showing the assembly language for passing nine arguments in C. (gcc asm)
.arch armv6
.file "sumNine1.c"
.text
.align 2
.global sumNine
.syntax unified
.arm
.fpu vfp
.type sumNine, %function
sumNine:
@ args = 20, pretend = 0, frame = 24
@ frame_needed = 1, uses_anonymous_args = 0
@ link register save eliminated.
str fp, [sp, #-4]!
add fp, sp, #0
sub sp, sp, #28
str r0, [fp, #-16] @@ save the arguments
str r1, [fp, #-20] @@ passed in
str r2, [fp, #-24] @@ registers
str r3, [fp, #-28] @@ in local area
ldr r2, [fp, #-16] @@ load arg one
ldr r3, [fp, #-20] @@ and arg two
add r2, r2, r3 @@ r2 is subtotal
ldr r3, [fp, #-24] @@ load arg three
add r2, r2, r3 @@ add to subtotal
ldr r3, [fp, #-28] @@ etc...
add r2, r2, r3
ldr r3, [fp, #4]
add r2, r2, r3
ldr r3, [fp, #8]
add r2, r2, r3
ldr r3, [fp, #12]
add r2, r2, r3
ldr r3, [fp, #16]
add r2, r2, r3
ldr r3, [fp, #20] @@ load arg nine
add r3, r2, r3 @@ add to subtotal
str r3, [fp, #-8] @@ store in x
ldr r3, [fp, #-8]
mov r0, r3 @@ return x;
add sp, fp, #0
@ sp needed
ldr fp, [sp], #4
bx lr
.ident "GCC: (Raspbian 6.3.0-18+rpi1) 6.3.0 20170516"
sumNine function, which computes the integer sum of its nine integer arguments. (gcc asm)
Listings 13.2.6–13.2.7 show my assembly language version of the program in Listings 13.2.1–13.2.3. I have simplified the sumNine function by directly using the arguments passed in registers. The compiler version (Listing 13.2.5) first saves these arguments in the local stack frame, thus freeing up the registers for local use.
@ nineInts2.s
@ Sums 1 through 9
@ 2017-09-29: Bob Plantz
@ Define my Raspberry Pi
.cpu cortex-a53
.fpu neon-fp-armv8
.syntax unified @ modern syntax
@ Useful source code constants
.equ a,-8
.equ b,-12
.equ c,-16
.equ d,-20
.equ e,-24
.equ f,-28
.equ g,-32
.equ h,-36
.equ i,-40
.equ total,-44
.equ locals,40
@ Need 5 args on stack for sumNine function
.equ arg5,0
.equ arg6,4
.equ arg7,8
.equ arg8,12
.equ arg9,16
.equ argSz,24 @ 5x4, 8-byte aligned
@ Program constant data
.section .rodata
.align 2
resultMsg:
.asciz "The sum is %i\n"
@ The code
.text
.align 2
.global main
.type main, %function
main:
sub sp, sp, 8 @ space for fp, lr
str fp, [sp, 0] @ save fp
str lr, [sp, 4] @ and lr
add fp, sp, 4 @ set our frame pointer
sub sp, sp, locals @ space for locals
mov r3, 1 @ initialize local vars
str r3, [fp, a] @ a = 1;
mov r3, 2
str r3, [fp, b] @ b = 2;
mov r3, 3
str r3, [fp, c] @ etc...
mov r3, 4
str r3, [fp, d]
mov r3, 5
str r3, [fp, e]
mov r3, 6
str r3, [fp, f]
mov r3, 7
str r3, [fp, g]
mov r3, 8
str r3, [fp, h]
mov r3, 9
str r3, [fp, i] @ i = 9;
@ Function call: sumNine(a, b, c, d, e, f, g, h, i);
sub sp, sp, argSz @ space for args
ldr r3, [fp, e] @ set up args for call
str r3, [sp, arg5] @ e is 5th arg
ldr r3, [fp, f]
str r3, [sp, arg6] @ f is 6th arg
ldr r3, [fp, g]
str r3, [sp, arg7] @ etc...
ldr r3, [fp, h]
str r3, [sp, arg8]
ldr r3, [fp, i]
str r3, [sp, arg9]
ldr r0, [fp, a] @ args 1 - 4
ldr r1, [fp, b] @ go in
ldr r2, [fp, c] @ regs
ldr r3, [fp, d] @ 0 - 3
bl sumNine
add sp, sp, argSz @ restore sp
str r0, [fp, total] @ total returned in r0
ldr r0, resultMsgAddr @ print result
ldr r1, [fp, total]
bl printf
mov r0, 0 @ return 0;
add sp, sp, locals @ deallocate local var
ldr fp, [sp, 0] @ restore caller fp
ldr lr, [sp, 4] @ lr
add sp, sp, 8 @ and sp
bx lr @ return
.align 2
resultMsgAddr:
.word resultMsg
sumNine, showing the assembly language rules for passing nine arguments. Link with the file in Listing 13.2.7. (prog asm)
@ sumNine2.s
@ Computes sum of nine integers.
@ Calling sequence:
@ Four ints in r0 - r3
@ Five ints pushed onto stack
@ bl sumNine
@ Sum returned in r0
@ 2017-09-29: Bob Plantz
@ Define my Raspberry Pi
.cpu cortex-a53
.fpu neon-fp-armv8
.syntax unified @ modern syntax
@ Useful source code constants
.equ five,4 @ args 5 - 9
.equ six,8
.equ seven,12
.equ eight,16
.equ nine,20
.text
.align 2
.global sumNine
.type sumNine, %function
sumNine:
sub sp, sp, 8 @ space for fp, lr
str fp, [sp, 0] @ save fp
str lr, [sp, 4] @ and lr
add fp, sp, 4 @ set our frame pointer
@ Sum four register arguments
add r0, r0, r1 @ subtotal = one + two
add r0, r0, r2 @ subtotal += three
add r0, r0, r3 @ subtotal += four
@ Add in five arguments from stack
ldr r3, [fp, five] @ load five
add r0, r0, r3 @ subtotal += five
ldr r3, [fp, six] @ load six
add r0, r0, r3 @ subtotal += six
ldr r3, [fp, seven] @ load seven
add r0, r0, r3 @ subtotal += seven
ldr r3, [fp, eight] @ load eight
add r0, r0, r3 @ subtotal += eight
ldr r3, [fp, nine] @ load nine
add r0, r0, r3 @ total += nine
@ ready for return
ldr fp, [sp, 0] @ restore caller fp
ldr lr, [sp, 4] @ lr
add sp, sp, 8 @ and sp
bx lr @ return
Because only four registers are available for passing arguments, the main function places the remaining five arguments on the stack, as shown in Figure 13.2.8. The stack pointer is pointing to the top of the stack of these arguments when the subfunction is called.
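The placement rule can be modeled in a few lines of C. The sketch below is only an illustration and is not part of the program in this section: the names arg_register, arg_stack_offset, and describe_call are made up for this example, and it assumes that every argument occupies exactly one 32-bit word. For the k-th argument, counted from the left, it reports either the register that carries the argument or the byte offset from sp at the time of the bl instruction.

```c
#include <assert.h>
#include <stdio.h>

enum { NUM_ARG_REGS = 4, WORD_BYTES = 4 };

/* Register number (0-3) that carries the argument at the given
 * 1-based position, or -1 if the argument goes on the stack.
 * Assumption: every argument is one 32-bit word. */
int arg_register(int position)
{
    assert(position >= 1);
    return position <= NUM_ARG_REGS ? position - 1 : -1;
}

/* Byte offset from sp (at the time of the bl) of a stack-passed
 * argument, or -1 if the argument is passed in a register. */
int arg_stack_offset(int position)
{
    assert(position >= 1);
    if (position <= NUM_ARG_REGS)
        return -1;
    return (position - NUM_ARG_REGS - 1) * WORD_BYTES;
}

/* Prints where each argument of an n-argument call is placed. */
void describe_call(int n)
{
    for (int k = 1; k <= n; k++) {
        if (arg_register(k) >= 0)
            printf("arg %d: r%d\n", k, arg_register(k));
        else
            printf("arg %d: [sp, #%d]\n", k, arg_stack_offset(k));
    }
}
```

For the nine-argument call, arg_stack_offset(5) is 0 and arg_stack_offset(9) is 16, which agrees with the arg5 through arg9 constants in nineInts2.s.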
The gcc compiler chose to allocate space on the stack for both the local variables and the stack arguments at the same time, but I chose to separate the two operations. Placing the stack argument allocation near the call to the function:
@ Function call: sumNine(a, b, c, d, e, f, g, h, i);
sub sp, sp, argSz @ space for args
helps to prevent programming errors if the call to sumNine is changed. This technique also eliminates the need to read through the entire function before computing the overall amount of memory required on the stack. Also notice that I deallocate the space on the stack used for passing arguments immediately upon return from the function:
bl sumNine
add sp, sp, argSz @ restore sp
This technique treats each function call more like a single function call statement in a high-level language. It would be highly unusual for the additional stack pointer operations to have a measurable effect on program performance.
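The value of argSz in nineInts2.s comes from rounding the space for five 4-byte arguments up to a multiple of eight bytes. A minimal sketch of that arithmetic in C follows. The helper names round_up and stack_arg_bytes are hypothetical, and the sketch again assumes that every argument is one 32-bit word.

```c
#include <assert.h>

/* Rounds size up to the next multiple of alignment.
 * alignment must be a nonzero power of two. */
unsigned round_up(unsigned size, unsigned alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}

/* Bytes to reserve on the stack for a call that passes total_args
 * word-sized arguments: the first four travel in r0-r3, and the
 * space for the rest is kept 8-byte aligned. */
unsigned stack_arg_bytes(unsigned total_args)
{
    unsigned on_stack = total_args > 4 ? total_args - 4 : 0;
    return round_up(on_stack * 4, 8);
}
```

With nine arguments, stack_arg_bytes(9) gives 24, the value assigned to argSz. A call with four or fewer arguments needs no stack space for arguments.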
The prologue in the subfunction:
sub sp, sp, 8 @ space for fp, lr
str fp, [sp, 0] @ save fp
str lr, [sp, 4] @ and lr
add fp, sp, 4 @ set our frame pointer
saves the caller's frame pointer and the address in the link register (the return address) on the stack, and then sets the frame pointer for the subfunction. This places the stack in the state shown in Figure 13.2.9. Within the called function, we access the arguments relative to the frame pointer.
The overall pattern of a stack frame is shown in Figure 13.2.10. The r11 register serves as the frame pointer to the stack frame. Once the frame pointer address has been established in a function, its value must never be changed. The frame pointer points to the return address, and the calling function's frame pointer is located \(-4\) bytes offset from the frame pointer. Arguments to the function are positive offsets from the frame pointer, and local variables are negative offsets from the frame pointer.
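The offsets in this pattern can be computed mechanically. The following C sketch uses two hypothetical helpers, stack_arg_fp_offset and local_fp_offset, and rests on two assumptions: every argument and local variable is one 32-bit word, and the local variables are allocated contiguously just below the saved registers, as in my main function in nineInts2.s. Because the prologue leaves the frame pointer four bytes below the value sp had on entry, the offset of a stack argument does not depend on how many registers were saved.

```c
#include <assert.h>

/* Offset from fp of the stack-passed argument at the given 1-based
 * position (position >= 5). The fifth argument sits at [fp, 4]. */
int stack_arg_fp_offset(int position)
{
    assert(position >= 5);
    return 4 * (position - 4);
}

/* Offset from fp of the index-th word-sized local variable (1-based),
 * assuming saved_regs registers were saved and the locals start
 * immediately below them. */
int local_fp_offset(int index, int saved_regs)
{
    assert(index >= 1 && saved_regs >= 1);
    return -4 * (saved_regs - 1 + index);
}
```

For sumNine2.s these helpers give 4 through 20 for arguments five through nine, matching the five through nine constants. For nineInts2.s, with two saved registers, local_fp_offset(1, 2) is -8 (the variable a) and local_fp_offset(10, 2) is -44 (the variable total).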
It is essential that you follow the register usage and argument passing disciplines precisely. Any deviation can cause errors that are very difficult to debug.
- In the calling function:
  - Assume that the values in the r0, r1, r2, and r3 registers will be changed by the called function.
  - The first four arguments are passed in the r0, r1, r2, and r3 registers in left-to-right order.
  - Arguments beyond four are stored on the stack as though they had been pushed onto the stack in right-to-left order. Yes, the order is opposite.
  - Use the bl instruction to invoke the function you wish to call.
- Upon entering the called function:
  - Save the value in the link register (r14), the caller's frame pointer (in register r11), and the contents of any additional registers that must be restored on the stack.
  - Establish a new frame pointer address by adding \(4 \times (n - 1)\), where \(n\) is the number of registers saved, to the address in the stack pointer (r13).
  - Allocate space on the stack for all the local variables, plus any required register save space, by subtracting the number of bytes required from sp, observing stack alignment restrictions.
- Within the called function:
  - Do not use the stack pointer to access arguments or local variables.
  - sp is pointing to the current bottom of the portion of the stack that is accessible to this function, observing the usual stack discipline.
  - Never change the value in the frame pointer, r11.
  - Local variables are on the stack and are accessed through negative offsets from the frame pointer.
  - Arguments passed on the stack to the function are accessed through positive offsets from the frame pointer.
- When leaving the called function:
  - Place the return value, if any, in r0.
  - Deallocate the local variables by adding the same amount to sp that was subtracted at the beginning of the function.
  - Restore the saved register values from the stack into the proper registers.
  - The bx lr instruction returns control to the calling function.
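The entry and exit rules above must mirror each other exactly, which can be checked with a small model. The C sketch below is an illustration only: the FrameModel type and the model_prologue and model_epilogue functions are invented for this example, the registers are modeled as plain integers, and a single saved_fp field stands in for the stack slot that holds the caller's frame pointer, so the model covers one call level rather than nested calls.

```c
#include <assert.h>
#include <stdint.h>

/* A simplified model of the two registers that define a stack frame. */
typedef struct {
    uint32_t sp;        /* models the stack pointer, r13 */
    uint32_t fp;        /* models the frame pointer, r11 */
    uint32_t saved_fp;  /* stands in for the caller's fp saved on the stack */
} FrameModel;

/* Entry: save the registers, set the new frame pointer, allocate locals. */
void model_prologue(FrameModel *m, unsigned saved_regs, unsigned local_bytes)
{
    assert(saved_regs >= 1 && local_bytes % 4 == 0);
    m->saved_fp = m->fp;                    /* caller's fp goes on the stack */
    m->sp -= 4 * saved_regs;                /* space for the saved registers */
    m->fp = m->sp + 4 * (saved_regs - 1);   /* fp = sp + 4 * (n - 1) */
    m->sp -= local_bytes;                   /* space for local variables */
}

/* Exit: undo each prologue step in the reverse order. */
void model_epilogue(FrameModel *m, unsigned saved_regs, unsigned local_bytes)
{
    m->sp += local_bytes;                   /* deallocate local variables */
    m->fp = m->saved_fp;                    /* restore the caller's fp */
    m->sp += 4 * saved_regs;                /* release the register save area */
}
```

Starting from any word-aligned value of sp, calling model_prologue with two saved registers and 40 bytes of locals leaves fp four bytes below the entry value of sp and sp 48 bytes below it, which is the layout used by main in nineInts2.s. The matching model_epilogue call returns both registers to their entry values.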
