Section 14.6 Division
Algorithm 2.5.1 shows how we can compute the decimal equivalent of an int stored in binary format. It repeatedly divides the int by \(10\). The remainder after each integer division is the equivalent decimal digit, starting with the low-order digits.
Many programming languages use “modulo” (‘%’ in C) and “remainder” interchangeably. The definitions of “modulo” vary in the literature, and the differences arise only when negative numbers are involved. We work only with non-negative numbers here, so this will not be an issue.
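To make the difference concrete, here is a minimal C sketch. It assumes C99 or later, where integer division truncates toward zero, so the result of ‘%’ takes the sign of the dividend. The function names c_remainder and floored_mod are hypothetical, introduced only for this illustration.

```c
#include <assert.h>

/* C's % operator (C99 and later): the result has the sign of the
   dividend because integer division truncates toward zero. */
int c_remainder(int a, int n)
{
    assert(n > 0);              /* keep the example to positive divisors */
    return a % n;
}

/* A "floored" modulo (hypothetical helper): for n > 0 the result
   is always in the range 0..n-1, even when a is negative. */
int floored_mod(int a, int n)
{
    assert(n > 0);
    int r = a % n;
    return (r < 0) ? r + n : r;
}
```

With these definitions, c_remainder(-7, 3) gives -1 while floored_mod(-7, 3) gives 2. For non-negative operands the two functions agree, which is why the distinction does not matter in this section.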
There are two simple divide instructions, sdiv and udiv.
SDIV-
Divides one signed 32-bit value by another signed 32-bit value, producing a 32-bit signed result.
SDIV{<c>} <Rd>, <Rn>, <Rm>
The condition flags are not changed.
<c> is the condition code, Table 9.2.1. <Rd> specifies the destination register. <Rm> contains the divisor and <Rn> the dividend.
The value in <Rn> is divided by the value in <Rm> and the result is stored in <Rd>. All values are treated as signed values. The remainder is lost.
UDIV-
Divides one unsigned 32-bit value by another unsigned 32-bit value, producing a 32-bit unsigned result.
UDIV{<c>} <Rd>, <Rn>, <Rm>
The condition flags are not changed.
<c> is the condition code, Table 9.2.1. <Rd> specifies the destination register. <Rm> contains the divisor and <Rn> the dividend.
The value in <Rn> is divided by the value in <Rm> and the result is stored in <Rd>. All values are treated as unsigned values. The remainder is lost.
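The choice between the two instructions matters because the same 32-bit pattern means different numbers when read as signed or unsigned. The following C sketch illustrates this. The function names signed_div and unsigned_div are hypothetical, and it is only an assumption about the toolchain that a compiler targeting a core with hardware divide would translate them into sdiv and udiv, respectively.

```c
#include <assert.h>
#include <stdint.h>

/* Signed 32-bit division, the kind of operation sdiv performs.
   The quotient is truncated toward zero and the remainder is lost. */
int32_t signed_div(int32_t dividend, int32_t divisor)
{
    assert(divisor != 0);
    assert(!(dividend == INT32_MIN && divisor == -1)); /* would overflow */
    return dividend / divisor;
}

/* Unsigned 32-bit division, the kind of operation udiv performs. */
uint32_t unsigned_div(uint32_t dividend, uint32_t divisor)
{
    assert(divisor != 0);
    return dividend / divisor;
}
```

The bit pattern 0xfffffff6 is -10 when treated as signed and 4294967286 when treated as unsigned. So signed_div(-10, 3) gives -3, but unsigned_div((uint32_t)-10, 3) gives 1431655762.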
Since the ARM divide instructions discard the remainder, we need to compute it ourselves in order to use Algorithm 2.5.1. The sequence of instructions:
        udiv    r0, r6, r7      @ div to get quotient
        mul     r1, r0, r7      @ need for computing remainder
        sub     r2, r6, r1      @ the mod (remainder)
uses the udiv instruction to compute the quotient. The quotient is then multiplied by the divisor. Subtracting this product from the original dividend yields the remainder. Listing 14.6.1 uses this sequence in a function that converts an int to its unsigned decimal text string.
@ uIntToDec.s
@ Converts an int to the corresponding unsigned
@ decimal text string.
@ Calling sequence:
@ r0 <- address of place to store string
@ r1 <- int to convert
@ bl uIntToDec
@ 2017-09-29: Bob Plantz
@ Define my Raspberry Pi
        .cpu    cortex-a53
        .fpu    neon-fp-armv8
        .syntax unified         @ modern syntax
@ Constant for assembler
        .equ    tempString,-32  @ for temp string
        .equ    locals,16       @ space for local vars
        .equ    zero,0x30       @ ascii 0
        .equ    NUL,0
@ The program
        .text
        .align  2
        .global uIntToDec
        .type   uIntToDec, %function
uIntToDec:
        sub     sp, sp, 24      @ space for saving regs
        str     r4, [sp, 0]     @ save r4
        str     r5, [sp, 4]     @      r5
        str     r6, [sp, 8]     @      r6
        str     r7, [sp, 12]    @      r7
        str     fp, [sp, 16]    @      fp
        str     lr, [sp, 20]    @      lr
        add     fp, sp, 20      @ set our frame pointer
        sub     sp, sp, locals  @ for local vars
        mov     r4, r0          @ caller's string pointer
        add     r5, fp, tempString  @ temp string
        mov     r7, 10          @ decimal constant
        mov     r0, NUL         @ end of C string
        strb    r0, [r5]
        add     r5, r5, 1       @ move to char storage
        mov     r0, zero        @ assume the int is 0
        strb    r0, [r5]
        movs    r6, r1          @ int to convert
        beq     copyLoop        @ zero is special case
convertLoop:
        cmp     r6, 0           @ end of int?
        beq     copy            @ yes, copy for caller
        udiv    r0, r6, r7      @ no, div to get quotient
        mul     r1, r0, r7      @ need for computing remainder
        sub     r2, r6, r1      @ the mod (remainder)
        mov     r6, r0          @ the quotient
        orr     r2, r2, zero    @ convert to numeral
        strb    r2, [r5]
        add     r5, r5, 1       @ next char position
        b       convertLoop
copy:
        sub     r5, r5, 1       @ last char stored locally
copyLoop:
        ldrb    r0, [r5]        @ get local char
        strb    r0, [r4]        @ store the char for caller
        cmp     r0, NUL         @ end of local string?
        beq     allDone         @ yes, we're done
        add     r4, r4, 1       @ no, next caller location
        sub     r5, r5, 1       @ next local char
        b       copyLoop
allDone:
        strb    r0, [r4]        @ end C string
        add     sp, sp, locals  @ deallocate local var
        ldr     r4, [sp, 0]     @ restore r4
        ldr     r5, [sp, 4]     @         r5
        ldr     r6, [sp, 8]     @         r6
        ldr     r7, [sp, 12]    @         r7
        ldr     fp, [sp, 16]    @         fp
        ldr     lr, [sp, 20]    @         lr
        add     sp, sp, 24      @         sp
        bx      lr              @ return
This algorithm produces the decimal numeral characters starting with the low-order digits. So the function stores the characters backwards in a local char array. It then copies the characters to the address passed by the calling function, thus reversing the string.
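For readers who find it easier to follow the control flow in a high-level language, the following is a rough C sketch of the same approach. It is not taken from the book: the name uint_to_dec and the 12-byte temporary array are assumptions made for this illustration, and the caller is assumed to supply a buffer of at least 11 bytes (ten digits plus the terminating NUL).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C counterpart of uIntToDec: writes the decimal text
   for value at dest, which must have room for at least 11 bytes. */
void uint_to_dec(char *dest, uint32_t value)
{
    char temp[12];              /* NUL plus up to 10 digits, stored backwards */
    char *p = temp;

    assert(dest != 0);
    *p = '\0';                  /* marks where the reverse copy stops */
    p++;
    *p = '0';                   /* assume the value is 0 */

    if (value != 0) {
        while (value != 0) {
            uint32_t quotient = value / 10;
            /* remainder computed as dividend - quotient * divisor */
            uint32_t remainder = value - quotient * 10;
            value = quotient;
            *p = (char)('0' | remainder);   /* convert to numeral */
            p++;
        }
        p--;                    /* back up to the last digit stored */
    }

    while (*p != '\0') {        /* copy to the caller in reverse order */
        *dest = *p;
        dest++;
        p--;
    }
    *dest = '\0';               /* end the C string */
}
```

For example, after declaring char buf[11], the call uint_to_dec(buf, 1234) leaves the string "1234" in buf. The local array is filled low-order digit first, and the second loop walks it backwards until it reaches the NUL that was stored at the start.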
As discussed in the Preface, we consider only a small subset of the ARM instruction set architecture in this book. But there are many instructions that can be very useful for improving the efficiency of some computations. One such instruction is mls, which performs the multiply and subtract in one operation, thus simplifying the computation of the remainder.
MLS-
Multiplies two 32-bit values in registers, subtracts the result from the value in a third register, and stores that in a fourth register.
MLS{<c>} <Rd>, <Rn>, <Rm>, <Ra>
The condition flags are not changed.
<c> is the condition code, Table 9.2.1. <Rd> specifies the destination register. <Rm> and <Rn> contain the multiplier and multiplicand. <Ra> contains the minuend.
The values in <Rm> and <Rn> are multiplied, the result is subtracted from the value in <Ra>, and the result is stored in <Rd>. Only the low-order 32 bits are retained.
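Going by the operand order above, the mul and sub pair that follows udiv in Listing 14.6.1 could presumably be replaced by the single instruction mls r2, r0, r7, r6, which computes r6 - r0 * r7. The C sketch below models that arithmetic. The names mls_model and remainder_via_mls are hypothetical, and the model relies on uint32_t arithmetic wrapping modulo \(2^{32}\), which mirrors keeping only the low-order 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Models mls Rd, Rn, Rm, Ra: Rd = Ra - Rn * Rm.
   Unsigned arithmetic keeps only the low-order 32 bits. */
uint32_t mls_model(uint32_t rn, uint32_t rm, uint32_t ra)
{
    return ra - rn * rm;
}

/* Remainder of an unsigned division: a divide (as udiv would do)
   followed by one multiply-and-subtract (as mls would do). */
uint32_t remainder_via_mls(uint32_t dividend, uint32_t divisor)
{
    assert(divisor != 0);
    uint32_t quotient = dividend / divisor;
    return mls_model(quotient, divisor, dividend);
}
```

For instance, remainder_via_mls(1234, 10) gives 4, the low-order decimal digit of 1234.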
