Section 14.6 Division
Algorithm 2.5.1 shows how we can compute the decimal equivalent of an int stored in binary format. It repeatedly divides the int by \(10\). The remainder after each integer division is the equivalent decimal digit, starting with the low-order digits.
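The repeated division can be sketched in C. This is only an illustration of the idea, not the text of Algorithm 2.5.1; the function name low_order_digits and its array parameter are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes the decimal digits of n into digits[],
   low-order digit first, and returns how many digits were written.
   digits[] must have room for 10 entries, the most a 32-bit value needs. */
size_t low_order_digits(uint32_t n, unsigned char digits[])
{
    size_t count = 0;
    do {
        digits[count++] = (unsigned char)(n % 10); /* remainder is the next digit */
        n /= 10;                                   /* quotient feeds the next step */
    } while (n != 0);
    return count;
}
```

For the value 123 the array receives 3, 2, 1 in that order, so a caller that wants text has to reverse the digits.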
Many programming languages use “modulo” (‘%’ in C) and “remainder” interchangeably. The definitions of “modulo” vary in the literature, and the differences arise when dealing with negative numbers. We use only non-negative numbers here, so this will not be an issue.
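To make the difference concrete, here is a small C sketch that contrasts the C remainder with one common "floored" definition of modulo. Both function names are hypothetical, and the floored version is just one of the definitions found in the literature.

```c
/* Remainder as C (since C99) defines it: the quotient is truncated
   toward zero, so the result takes the sign of the dividend. */
int c_remainder(int a, int b)
{
    return a % b;
}

/* One common "modulo" definition (floored): the result takes the sign
   of the divisor. Hypothetical helper, shown only for comparison. */
int floored_mod(int a, int b)
{
    int r = a % b;
    if (r != 0 && ((r < 0) != (b < 0)))
        r += b;          /* shift the result to the divisor's sign */
    return r;
}
```

With these definitions, c_remainder(-7, 3) gives -1 while floored_mod(-7, 3) gives 2. For non-negative operands the two always agree, which is why the distinction does not matter in this section.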
There are two simple divide instructions, sdiv and udiv.
SDIV
-
Divides one signed 32-bit value by another signed 32-bit value, producing a signed 32-bit quotient.
SDIV{<c>} <Rd>, <Rn>, <Rm>
The condition flags are not changed.
<c> is the condition code, Table 9.2.1. <Rd> specifies the destination register. <Rm> contains the divisor and <Rn> the dividend.
The value in <Rn> is divided by the value in <Rm> and the quotient is stored in <Rd>. All values are treated as signed values. The remainder is lost.
UDIV
-
Divides one unsigned 32-bit value by another unsigned 32-bit value, producing an unsigned 32-bit quotient.
UDIV{<c>} <Rd>, <Rn>, <Rm>
The condition flags are not changed.
<c> is the condition code, Table 9.2.1. <Rd> specifies the destination register. <Rm> contains the divisor and <Rn> the dividend.
The value in <Rn> is divided by the value in <Rm> and the quotient is stored in <Rd>. All values are treated as unsigned values. The remainder is lost.
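The practical difference between the two instructions is how the same 32 bits are interpreted. The C sketch below models that with the fixed-width types from stdint.h. The function names are hypothetical, and the sketch assumes a nonzero divisor (and, for the signed case, that the quotient fits in 32 bits), because C leaves the other cases undefined.

```c
#include <stdint.h>

/* Models sdiv: both operands are read as signed 32-bit values. */
int32_t signed_quotient(int32_t dividend, int32_t divisor)
{
    return dividend / divisor;   /* quotient truncated toward zero */
}

/* Models udiv: both operands are read as unsigned 32-bit values. */
uint32_t unsigned_quotient(uint32_t dividend, uint32_t divisor)
{
    return dividend / divisor;
}
```

The bit pattern 0xFFFFFFF6 is -10 when read as a two's complement signed value and 4294967286 when read as unsigned. Dividing by 3 gives -3 in the signed case and 1431655762 in the unsigned case, so choosing the wrong instruction silently produces a very different quotient.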
Since the divide instructions in the ARM ignore the remainder, we will need to compute it on our own in order to use Algorithm 2.5.1. The sequence of instructions:
        udiv    r0, r6, r7      @ no, div to get quotient
        mul     r1, r0, r7      @ need for computing remainder
        sub     r2, r6, r1      @ the mod (remainder)
uses the udiv instruction to compute the quotient. The quotient is then multiplied by the divisor. Subtracting this result from the original dividend yields the remainder. Listing 14.6.1 shows how this can be done.
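The same three steps can be written in C, which makes it easy to check the arithmetic by hand. This is a sketch that assumes a nonzero divisor; the function name remainder_from_quotient is hypothetical, and the comments show which instruction each line corresponds to.

```c
#include <stdint.h>

/* Computes dividend mod divisor without using the % operator. */
uint32_t remainder_from_quotient(uint32_t dividend, uint32_t divisor)
{
    uint32_t quotient = dividend / divisor;   /* udiv r0, r6, r7 */
    uint32_t product  = quotient * divisor;   /* mul  r1, r0, r7 */
    return dividend - product;                /* sub  r2, r6, r1 */
}
```

For example, with a dividend of 1234 and a divisor of 10, the quotient is 123, the product is 1230, and the difference is the remainder 4.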
@ uIntToDec.s
@ Converts an int to the corresponding unsigned
@ decimal text string.
@ Calling sequence:
@       r0 <- address of place to store string
@       r1 <- int to convert
@       bl uIntToDec
@ 2017-09-29: Bob Plantz

@ Define my Raspberry Pi
        .cpu    cortex-a53
        .fpu    neon-fp-armv8
        .syntax unified         @ modern syntax

@ Constant for assembler
        .equ    tempString,-32  @ for temp string
        .equ    locals,16       @ space for local vars
        .equ    zero,0x30       @ ascii 0
        .equ    NUL,0

@ The program
        .text
        .align  2
        .global uIntToDec
        .type   uIntToDec, %function
uIntToDec:
        sub     sp, sp, 24      @ space for saving regs
        str     r4, [sp, 0]     @ save r4
        str     r5, [sp, 4]     @      r5
        str     r6, [sp, 8]     @      r6
        str     r7, [sp, 12]    @      r7
        str     fp, [sp, 16]    @      fp
        str     lr, [sp, 20]    @      lr
        add     fp, sp, 20      @ set our frame pointer
        sub     sp, sp, locals  @ for local vars

        mov     r4, r0          @ caller's string pointer
        add     r5, fp, tempString  @ temp string
        mov     r7, 10          @ decimal constant

        mov     r0, NUL         @ end of C string
        strb    r0, [r5]
        add     r5, r5, 1       @ move to char storage

        mov     r0, zero        @ assume the int is 0
        strb    r0, [r5]
        movs    r6, r1          @ int to convert
        beq     copyLoop        @ zero is special case
convertLoop:
        cmp     r6, 0           @ end of int?
        beq     copy            @ yes, copy for caller
        udiv    r0, r6, r7      @ no, div to get quotient
        mul     r1, r0, r7      @ need for computing remainder
        sub     r2, r6, r1      @ the mod (remainder)
        mov     r6, r0          @ the quotient
        orr     r2, r2, zero    @ convert to numeral
        strb    r2, [r5]
        add     r5, r5, 1       @ next char position
        b       convertLoop
copy:
        sub     r5, r5, 1       @ last char stored locally
copyLoop:
        ldrb    r0, [r5]        @ get local char
        strb    r0, [r4]        @ store the char for caller
        cmp     r0, NUL         @ end of local string?
        beq     allDone         @ yes, we're done
        add     r4, r4, 1       @ no, next caller location
        sub     r5, r5, 1       @ next local char
        b       copyLoop

allDone:
        strb    r0, [r4]        @ end C string
        add     sp, sp, locals  @ deallocate local var
        ldr     r4, [sp, 0]     @ restore r4
        ldr     r5, [sp, 4]     @         r5
        ldr     r6, [sp, 8]     @         r6
        ldr     r7, [sp, 12]    @         r7
        ldr     fp, [sp, 16]    @         fp
        ldr     lr, [sp, 20]    @         lr
        add     sp, sp, 24      @         sp
        bx      lr              @ return
This algorithm produces the decimal numeral characters starting with the low-order digits. So the function stores the characters backwards in a local char array. It then copies the characters to the address passed by the calling function, thus reversing the string.
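The store-backwards-then-copy approach may be easier to follow in C. The sketch below is a rough equivalent of the assembly function, not a translation the book provides; the name uint_to_dec and the buffer size are assumptions, and the caller is assumed to supply at least 11 bytes.

```c
#include <stdint.h>

/* Hypothetical C counterpart of uIntToDec: writes the decimal text of
   value, followed by a NUL, to dest (which must hold at least 11 chars). */
void uint_to_dec(char *dest, uint32_t value)
{
    char temp[11];             /* NUL plus up to 10 digits */
    char *p = temp;

    *p++ = '\0';               /* stored first, so it is copied last */
    do {
        uint32_t quotient = value / 10;
        *p++ = (char)('0' + (value - quotient * 10)); /* low-order digit */
        value = quotient;
    } while (value != 0);

    do {                       /* walk backwards, reversing the digits */
        --p;
        *dest++ = *p;
    } while (*p != '\0');
}
```

Placing the NUL at the start of the local array lets the backwards copy stop on the terminator, just as the assembly version does. The do-while loop also handles a value of zero without a special case, whereas the assembly stores the '0' character up front.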
As discussed in the Preface, we consider only a small subset of the ARM instruction set architecture in this book. But there are many instructions that can be very useful for improving the efficiency of some computations. One such instruction is mls, which performs the multiply and subtract in one operation, thus simplifying the computation of the remainder.
MLS
-
Multiplies two 32-bit values in registers, subtracts the result from the value in a third register, and stores that in a fourth register.
MLS{<c>} <Rd>, <Rn>, <Rm>, <Ra>
The condition flags are not changed.
<c> is the condition code, Table 9.2.1. <Rd> specifies the destination register. <Rm> and <Rn> contain the multiplier and multiplicand. <Ra> contains the minuend.
The values in <Rm> and <Rn> are multiplied, the result is subtracted from the value in <Ra>, and the result is stored in <Rd>. Only the low-order 32 bits are retained.
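As a rough model of what mls computes, the C sketch below uses unsigned 32-bit arithmetic, which wraps modulo \(2^{32}\) and so keeps only the low-order 32 bits. The names mls_model and remainder_with_mls are hypothetical, and the divisor is assumed to be nonzero. Under the register assignments used in the listing, the mul and sub pair could presumably be replaced by the single instruction mls r2, r0, r7, r6.

```c
#include <stdint.h>

/* Models mls Rd, Rn, Rm, Ra: returns Ra - Rn * Rm, low 32 bits only. */
uint32_t mls_model(uint32_t rn, uint32_t rm, uint32_t ra)
{
    return ra - rn * rm;   /* unsigned arithmetic wraps modulo 2^32 */
}

/* Remainder computed as a divide followed by one multiply-subtract. */
uint32_t remainder_with_mls(uint32_t dividend, uint32_t divisor)
{
    uint32_t quotient = dividend / divisor;         /* udiv */
    return mls_model(quotient, divisor, dividend);  /* mls  */
}
```

Note the operand order: the minuend is the last operand, so the dividend goes in the position of <Ra> and the quotient and divisor go in <Rn> and <Rm>.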