Section 12.1 Repetition
The algorithms we choose when programming interact closely with the data storage structure. As you probably know, a string of characters is stored in an array. Each element of the array is of type char
, and in C the end of the data is signified with a sentinel value, the NUL
character (see Table 2.13.1).
Array processing is usually a repetitive task. The processing of a character string is a good example of repetition. Consider the C program in Listing 12.1.1.
/* helloLoop1.c * "hello world" program using the write() system call * one character at a time. * 2017-09-29: Bob Plantz */ #include <unistd.h> int main(void) { char *aString = "Hello World.\n"; while (*aString != '\0') { write(STDOUT_FILENO, aString, 1); aString++; } return 0; }
The while
statement:
while (*aString != '\0') { ... }
controls the execution of the statements within the {...}
block:
It evaluates the Boolean expression
*aString != '\0'
.If the Boolean expression evaluates to false, program flow jumps to the statement immediately following the
{...}
block.If the Boolean expression evaluates to true, program flow enters the
{...}
block and executes the statements there in sequence.At the end of the
{...}
block program flow jumps back up to the evaluation of the Boolean expression.
The pointer variable is incremented with the
aString++;
statement. Notice that this variable must be changed inside the {...}
block. Otherwise, the Boolean expression will always evaluate to true, giving an “infinite” loop.
It is important that you identify the variable the while
construct uses to control program flow—the Loop Control Variable (LCV). Make sure that the value of the LCV is changed within the {...}
block. There may be more than one LCV.
The way that the while
construct controls program flow can be seen in the flow chart in Figure 12.1.2.
while
loop. The large diamond represents a binary decision that leads to two possible paths, “true” or “false.” Program flow returns to the top of the while
loop after the body has been executed for a reevaluation of the Boolean expression.This flow chart shows that we need the following assembly language tools to construct a while
loop:
Instruction(s) to evaluate Boolean expressions.
An instruction that conditionally transfers control (branches) to another location in the program. This is represented by the large diamond, which shows two possible paths.
An instruction that unconditionally transfers control to another location in the program. This is represented by the line that leads from “Execute body of
while
loop” back to the top.
We will explore instructions that provide these tools in the next three subsections.
Subsection 12.1.1 Comparison Instructions
Although most ARM instructions can optionally affect the condition flags, there are two instructions specfically intended to compare two values. They always set the condition flags, but do not change either of the compared values.
CMP
-
Does an arithmetic comparison of two values and sets condition flags according to the result.
CMP{<c>} <Rn>, #<const> % immediate CMP{<c>} <Rn>, <Rm> {,<shift>} % register CMP{<c>} <Rn>, <Rm>, <type> <Rs> % register-shifted register
<c>
is the condition code, Table 9.2.1.<Rn>
and<Rm>
and<Rn>
are registers to compare.<Rs>
contains the shift amount in the “register-shifted register” form.\(-257 \le const \le +256\text{,}\) or \(const = +256, +260, +264, \ldots, +65280\text{,}\) or \(const = -261, -265, \ldots, -65281\text{.}\) This odd sequence of values will be explained in Section 11.3.3
<shift>
and<type>
are explained in Section 9.2.3
The values in
<Rm>
,<Rn>
, and<Rs>
are unchanged. Only the condition codes in theCPSR
register are changed to show the result of a subtraction. In the “immediate” form,<const>
is subtracted from the value in<Rn>
. In the “register” and “register-shifted register” forms, the value in<Rm>
is subtracted from the value in<Rn>
. If a shift is specified, the value in<Rm>
is shifted by the specified amount before the subtraction is performed. TST
-
Does a logical comparison of two values and sets condition flags according to the result.
TST{<c>} <Rn>, #<const> % immediate TST{<c>} <Rn>, <Rm> {,<shift>} % register TST{<c>} <Rn>, <Rm>, <type> <Rs> % register-shifted register
<c>
is the condition code, Table 9.2.1.<Rn>
and<Rm>
and<Rn>
are registers to compare.<Rs>
contains the shift amount in the “register-shifted register” form.\(-257 \le const \le +256\text{,}\) or \(const = +256, +260, +264, \ldots, +65280\text{,}\) or \(const = -261, -265, \ldots, -65281\text{.}\) This odd sequence of values will be explained in Section 11.3.3
<shift>
and<type>
are explained in Section 9.2.3
The values in
<Rm>
,<Rn>
, and<Rs>
are unchanged. Only the condition codes in theCPSR
register are changed to show the result of a bitwise AND. In the “immediate” form, a bitwise AND is performed between<const>
and the value in<Rn>
. In the “register” and “register-shifted register” forms, the bitwise AND is performed between the value in<Rm>
and the value in<Rn>
. If a shift is specified, the value in<Rm>
is shifted by the specified amount before the AND is performed.
Subsection 12.1.2 while
Loop
We are now prepared to look at how a while
loop is constructed at the assembly language level. As usual, we begin with the assembly language generated by the gcc
compiler for the program in Listing 12.1.1, which is shown in Listing 12.1.3 with comments added.
.arch armv6 .file "helloLoop1.c" .section .rodata .align 2 .LC0: .ascii "Hello World.\012\000" .text .align 2 .global main .syntax unified .arm .fpu vfp .type main, %function main: @ args = 0, pretend = 0, frame = 8 @ frame_needed = 1, uses_anonymous_args = 0 push {fp, lr} add fp, sp, #4 sub sp, sp, #8 ldr r3, .L5 @@ address of text string str r3, [fp, #-8] @@ store in pointer variable b .L2 @@ gcc starts at bottom of loop .L3: mov r2, #1 @@ one byte ldr r1, [fp, #-8] @@ current address into string array mov r0, #1 @@ STDOUT_FILENO bl write ldr r3, [fp, #-8] @@ current address into string array add r3, r3, #1 @@ next byte position str r3, [fp, #-8] @@ update pointer variable .L2: ldr r3, [fp, #-8] @@ current address into string array ldrb r3, [r3] @@ get byte, zero extend to 32 bits cmp r3, #0 @@ at end of string array? bne .L3 @@ no, back to top of loop mov r3, #0 @@ yes, return 0; mov r0, r3 sub sp, fp, #4 @ sp needed pop {fp, pc} .L6: .align 2 .L5: .word .LC0 .size main, .-main .ident "GCC: (Raspbian 6.3.0-18+rpi1) 6.3.0 20170516"
We see another new instruction here, ldrb
, which you can probably guess operates like ldr
(Section 9.2) but loads a byte from memory instead of an entire word.
The compiler has created code that starts at the bottom of the while
loop instead of the top:
b .L2
This improves the efficiency of the algorithm, perhaps at the expense of readability.
My assembly language solution checks the loop control variable at the top of the loop, as shown in Listing 12.1.4.
@ helloLoop2.s @ "Hello world" one char at a time. @ 2017-09-29: Bob Plantz @ Define my Raspberry Pi .cpu cortex-a53 .fpu neon-fp-armv8 .syntax unified @ modern syntax @ Useful source code constants .equ STDOUT,1 .equ NUL,0 @ Constant program data .section .rodata .align 2 theString: .asciz "Hello World.\n" .text .align 2 .global main .type main, %function main: sub sp, sp, 16 @ space for saving regs @ (keeping 8-byte sp align) str r4, [sp, 4] @ save r4 str fp, [sp, 8] @ fp str lr, [sp, 12] @ lr add fp, sp, 12 @ set our frame pointer ldr r4, theStringAddr @ use r4 as pointer variable whileLoop: ldrb r3, [r4] @ get a char cmp r3, NUL @ end of string? beq allDone @ yes, all done mov r0, STDOUT @ no, write to screen mov r1, r4 @ address of current char mov r2, #1 @ write 1 byte bl write add r4, r4, 1 @ increment pointer var b whileLoop @ back to top allDone: mov r0, 0 @ return 0; ldr r4, [sp, 4] @ restore r4 ldr fp, [sp, 8] @ fp ldr lr, [sp, 12] @ lr add sp, sp, 16 @ restore sp bx lr @ return theStringAddr: .word theString
Subsection 12.1.3 for
Loop
A while
seems natural for processing an array that is terminated with a sentinel value, but many programmers prefer a for
loop for processing a known number of array elements. A C version of a “Hello world” program using a for
loop is shown in Listing 12.1.5.
/* forLoop1.c * "hello world" program using the write() system call * one character at a time. * 2017-09-29: Bob Plantz */ #include <unistd.h> int main(void) { register char *aString = "Hello World.\n"; register int i; for (i = 0; i <= 13; i++) { write(STDOUT_FILENO, aString, 1); aString++; } return 0; }
for
loop (C).Comparing the assembly language for the for
loop in Listing 12.1.6 with the assembly language for the while
loop in Listing 12.1.5, we see that there is very little difference. The main difference is that we asked that the compiler use registers for local variables in the for
loop version.
.arch armv6 .fpu vfp .file "forLoop1.c" .section .rodata .align 2 .LC0: .ascii "Hello World.\012\000" .text .align 2 .global main .type main, %function main: @ args = 0, pretend = 0, frame = 0 @ frame_needed = 1, uses_anonymous_args = 0 stmfd sp!, {r4, r5, fp, lr} add fp, sp, #12 ldr r5, .L5 @@ address of string mov r4, #0 @@ i = 0' b .L2 .L3: mov r0, #1 @@ STDOUT_FILENO mov r1, r5 @@ address of char mov r2, #1 @@ one byte bl write add r5, r5, #1 @@ increment pointer variable add r4, r4, #1 @@ i++ .L2: cmp r4, #13 @@ i = 13? ble .L3 @@ <=, continue loop mov r3, #0 @@ yes, return 0; mov r0, r3 ldmfd sp!, {r4, r5, fp, pc} .L6: .align 2 .L5: .word .LC0 .ident "GCC: (Raspbian 4.9.2-10) 4.9.2"
for
loop (gcc asm).Listing 12.1.7 shows my version of the for
loop.
@ forLoop2.s @ "Hello world" one char at a time. @ 2017-09-29: Bob Plantz @ Define my Raspberry Pi .cpu cortex-a53 .fpu neon-fp-armv8 .syntax unified @ modern syntax @ Useful source code constants .equ STDOUT,1 @ Constant program data .section .rodata .align 2 theString: .asciz "Hello World.\n" .equ strngLngth,.-theString @ The program .text .align 2 .global main .type main, %function main: sub sp, sp, 16 @ space for saving regs str r4, [sp, 0] @ save r4 str r5, [sp, 4] @ r5 str fp, [sp, 8] @ fp str lr, [sp, 12] @ lr add fp, sp, 12 @ set our frame pointer ldr r5, theStringAddr @ address of string mov r4, 0 @ i = 0; for: mov r0, STDOUT @ write to screen mov r1, r5 @ current char mov r2, 1 @ one byte bl write add r5, r5, 1 @ next char in array add r4, r4, 1 @ i++; cmp r4, strngLngth @ all chars? ble for @ no, keep going mov r0, 0 @ yes, return 0; ldr r4, [sp, 0] @ restore r4 ldr r5, [sp, 4] @ r5 ldr fp, [sp, 8] @ fp ldr lr, [sp, 12] @ lr add sp, sp, 16 @ restore sp bx lr @ return .align 2 theStringAddr: .word theString
for
loop (prog asm).The message here is that there is no difference between a for
loop and a while
loop at the machine code level. You should use the construct in your high level language that is more appropriate for your specific application. In general, use a while
loop for a sentinel-controlled loop, and a for
loop for a count-controlled loop.