Section 12.1 Repetition
The algorithms we choose when programming interact closely with the data storage structure. As you probably know, a string of characters is stored in an array. Each element of the array is of type char, and in C the end of the data is signified with a sentinel value, the NUL character (see Table 2.13.1).
Array processing is usually a repetitive task. The processing of a character string is a good example of repetition. Consider the C program in Listing 12.1.1.
/* helloLoop1.c
* "hello world" program using the write() system call
* one character at a time.
* 2017-09-29: Bob Plantz
*/
#include <unistd.h>
int main(void)
{
char *aString = "Hello World.\n";
while (*aString != '\0') {
write(STDOUT_FILENO, aString, 1);
aString++;
}
return 0;
}
The while statement:
while (*aString != '\0')
{
...
}
controls the execution of the statements within the {...} block:
It evaluates the Boolean expression
*aString != '\0'.If the Boolean expression evaluates to false, program flow jumps to the statement immediately following the
{...}block.If the Boolean expression evaluates to true, program flow enters the
{...}block and executes the statements there in sequence.At the end of the
{...}block program flow jumps back up to the evaluation of the Boolean expression.
The pointer variable is incremented with the
aString++;
statement. Notice that this variable must be changed inside the {...} block. Otherwise, the Boolean expression will always evaluate to true, giving an “infinite” loop.
It is important that you identify the variable the while construct uses to control program flow—the Loop Control Variable (LCV). Make sure that the value of the LCV is changed within the {...} block. There may be more than one LCV.
The way that the while construct controls program flow can be seen in the flow chart in Figure 12.1.2.
while loop. The large diamond represents a binary decision that leads to two possible paths, “true” or “false.” Program flow returns to the top of the while loop after the body has been executed for a reevaluation of the Boolean expression.This flow chart shows that we need the following assembly language tools to construct a while loop:
Instruction(s) to evaluate Boolean expressions.
An instruction that conditionally transfers control (branches) to another location in the program. This is represented by the large diamond, which shows two possible paths.
An instruction that unconditionally transfers control to another location in the program. This is represented by the line that leads from “Execute body of
whileloop” back to the top.
We will explore instructions that provide these tools in the next three subsections.
Subsection 12.1.1 Comparison Instructions
Although most ARM instructions can optionally affect the condition flags, there are two instructions specfically intended to compare two values. They always set the condition flags, but do not change either of the compared values.
CMP-
Does an arithmetic comparison of two values and sets condition flags according to the result.
CMP{<c>} <Rn>, #<const> % immediate CMP{<c>} <Rn>, <Rm> {,<shift>} % register CMP{<c>} <Rn>, <Rm>, <type> <Rs> % register-shifted register<c>is the condition code, Table 9.2.1.<Rn>and<Rm>and<Rn>are registers to compare.<Rs>contains the shift amount in the “register-shifted register” form.\(-257 \le const \le +256\text{,}\) or \(const = +256, +260, +264, \ldots, +65280\text{,}\) or \(const = -261, -265, \ldots, -65281\text{.}\) This odd sequence of values will be explained in Section 11.3.3
<shift>and<type>are explained in Section 9.2.3
The values in
<Rm>,<Rn>, and<Rs>are unchanged. Only the condition codes in theCPSRregister are changed to show the result of a subtraction. In the “immediate” form,<const>is subtracted from the value in<Rn>. In the “register” and “register-shifted register” forms, the value in<Rm>is subtracted from the value in<Rn>. If a shift is specified, the value in<Rm>is shifted by the specified amount before the subtraction is performed. TST-
Does a logical comparison of two values and sets condition flags according to the result.
TST{<c>} <Rn>, #<const> % immediate TST{<c>} <Rn>, <Rm> {,<shift>} % register TST{<c>} <Rn>, <Rm>, <type> <Rs> % register-shifted register<c>is the condition code, Table 9.2.1.<Rn>and<Rm>and<Rn>are registers to compare.<Rs>contains the shift amount in the “register-shifted register” form.\(-257 \le const \le +256\text{,}\) or \(const = +256, +260, +264, \ldots, +65280\text{,}\) or \(const = -261, -265, \ldots, -65281\text{.}\) This odd sequence of values will be explained in Section 11.3.3
<shift>and<type>are explained in Section 9.2.3
The values in
<Rm>,<Rn>, and<Rs>are unchanged. Only the condition codes in theCPSRregister are changed to show the result of a bitwise AND. In the “immediate” form, a bitwise AND is performed between<const>and the value in<Rn>. In the “register” and “register-shifted register” forms, the bitwise AND is performed between the value in<Rm>and the value in<Rn>. If a shift is specified, the value in<Rm>is shifted by the specified amount before the AND is performed.
Subsection 12.1.2 while Loop
We are now prepared to look at how a while loop is constructed at the assembly language level. As usual, we begin with the assembly language generated by the gcc compiler for the program in Listing 12.1.1, which is shown in Listing 12.1.3 with comments added.
.arch armv6
.file "helloLoop1.c"
.section .rodata
.align 2
.LC0:
.ascii "Hello World.\012\000"
.text
.align 2
.global main
.syntax unified
.arm
.fpu vfp
.type main, %function
main:
@ args = 0, pretend = 0, frame = 8
@ frame_needed = 1, uses_anonymous_args = 0
push {fp, lr}
add fp, sp, #4
sub sp, sp, #8
ldr r3, .L5 @@ address of text string
str r3, [fp, #-8] @@ store in pointer variable
b .L2 @@ gcc starts at bottom of loop
.L3:
mov r2, #1 @@ one byte
ldr r1, [fp, #-8] @@ current address into string array
mov r0, #1 @@ STDOUT_FILENO
bl write
ldr r3, [fp, #-8] @@ current address into string array
add r3, r3, #1 @@ next byte position
str r3, [fp, #-8] @@ update pointer variable
.L2:
ldr r3, [fp, #-8] @@ current address into string array
ldrb r3, [r3] @@ get byte, zero extend to 32 bits
cmp r3, #0 @@ at end of string array?
bne .L3 @@ no, back to top of loop
mov r3, #0 @@ yes, return 0;
mov r0, r3
sub sp, fp, #4
@ sp needed
pop {fp, pc}
.L6:
.align 2
.L5:
.word .LC0
.size main, .-main
.ident "GCC: (Raspbian 6.3.0-18+rpi1) 6.3.0 20170516"
We see another new instruction here, ldrb, which you can probably guess operates like ldr (Section 9.2) but loads a byte from memory instead of an entire word.
The compiler has created code that starts at the bottom of the while loop instead of the top:
b .L2
This improves the efficiency of the algorithm, perhaps at the expense of readability.
My assembly language solution checks the loop control variable at the top of the loop, as shown in Listing 12.1.4.
@ helloLoop2.s
@ "Hello world" one char at a time.
@ 2017-09-29: Bob Plantz
@ Define my Raspberry Pi
.cpu cortex-a53
.fpu neon-fp-armv8
.syntax unified @ modern syntax
@ Useful source code constants
.equ STDOUT,1
.equ NUL,0
@ Constant program data
.section .rodata
.align 2
theString:
.asciz "Hello World.\n"
.text
.align 2
.global main
.type main, %function
main:
sub sp, sp, 16 @ space for saving regs
@ (keeping 8-byte sp align)
str r4, [sp, 4] @ save r4
str fp, [sp, 8] @ fp
str lr, [sp, 12] @ lr
add fp, sp, 12 @ set our frame pointer
ldr r4, theStringAddr @ use r4 as pointer variable
whileLoop:
ldrb r3, [r4] @ get a char
cmp r3, NUL @ end of string?
beq allDone @ yes, all done
mov r0, STDOUT @ no, write to screen
mov r1, r4 @ address of current char
mov r2, #1 @ write 1 byte
bl write
add r4, r4, 1 @ increment pointer var
b whileLoop @ back to top
allDone:
mov r0, 0 @ return 0;
ldr r4, [sp, 4] @ restore r4
ldr fp, [sp, 8] @ fp
ldr lr, [sp, 12] @ lr
add sp, sp, 16 @ restore sp
bx lr @ return
theStringAddr:
.word theString
Subsection 12.1.3 for Loop
A while seems natural for processing an array that is terminated with a sentinel value, but many programmers prefer a for loop for processing a known number of array elements. A C version of a “Hello world” program using a for loop is shown in Listing 12.1.5.
/* forLoop1.c
* "hello world" program using the write() system call
* one character at a time.
* 2017-09-29: Bob Plantz
*/
#include <unistd.h>
int main(void)
{
register char *aString = "Hello World.\n";
register int i;
for (i = 0; i <= 13; i++) {
write(STDOUT_FILENO, aString, 1);
aString++;
}
return 0;
}
for loop (C).Comparing the assembly language for the for loop in Listing 12.1.6 with the assembly language for the while loop in Listing 12.1.5, we see that there is very little difference. The main difference is that we asked that the compiler use registers for local variables in the for loop version.
.arch armv6
.fpu vfp
.file "forLoop1.c"
.section .rodata
.align 2
.LC0:
.ascii "Hello World.\012\000"
.text
.align 2
.global main
.type main, %function
main:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 1, uses_anonymous_args = 0
stmfd sp!, {r4, r5, fp, lr}
add fp, sp, #12
ldr r5, .L5 @@ address of string
mov r4, #0 @@ i = 0'
b .L2
.L3:
mov r0, #1 @@ STDOUT_FILENO
mov r1, r5 @@ address of char
mov r2, #1 @@ one byte
bl write
add r5, r5, #1 @@ increment pointer variable
add r4, r4, #1 @@ i++
.L2:
cmp r4, #13 @@ i = 13?
ble .L3 @@ <=, continue loop
mov r3, #0 @@ yes, return 0;
mov r0, r3
ldmfd sp!, {r4, r5, fp, pc}
.L6:
.align 2
.L5:
.word .LC0
.ident "GCC: (Raspbian 4.9.2-10) 4.9.2"
for loop (gcc asm).Listing 12.1.7 shows my version of the for loop.
@ forLoop2.s
@ "Hello world" one char at a time.
@ 2017-09-29: Bob Plantz
@ Define my Raspberry Pi
.cpu cortex-a53
.fpu neon-fp-armv8
.syntax unified @ modern syntax
@ Useful source code constants
.equ STDOUT,1
@ Constant program data
.section .rodata
.align 2
theString:
.asciz "Hello World.\n"
.equ strngLngth,.-theString
@ The program
.text
.align 2
.global main
.type main, %function
main:
sub sp, sp, 16 @ space for saving regs
str r4, [sp, 0] @ save r4
str r5, [sp, 4] @ r5
str fp, [sp, 8] @ fp
str lr, [sp, 12] @ lr
add fp, sp, 12 @ set our frame pointer
ldr r5, theStringAddr @ address of string
mov r4, 0 @ i = 0;
for:
mov r0, STDOUT @ write to screen
mov r1, r5 @ current char
mov r2, 1 @ one byte
bl write
add r5, r5, 1 @ next char in array
add r4, r4, 1 @ i++;
cmp r4, strngLngth @ all chars?
ble for @ no, keep going
mov r0, 0 @ yes, return 0;
ldr r4, [sp, 0] @ restore r4
ldr r5, [sp, 4] @ r5
ldr fp, [sp, 8] @ fp
ldr lr, [sp, 12] @ lr
add sp, sp, 16 @ restore sp
bx lr @ return
.align 2
theStringAddr:
.word theString
for loop (prog asm).The message here is that there is no difference between a for loop and a while loop at the machine code level. You should use the construct in your high level language that is more appropriate for your specific application. In general, use a while loop for a sentinel-controlled loop, and a for loop for a count-controlled loop.
