Chapter 13. Assembly: Subroutines
Programmers need to break their programs into pieces. In Java, the pieces are called methods; in C, they are called functions; and in assembly language, they are called subroutines. In this chapter we look at how to write subroutines; and through learning about subroutines, we'll also learn about how methods/functions in higher-level languages are actually executed.
13.1. The link register
We've already seen that each processor has a register called its
program counter to track the address of the instruction it is
about to execute. The ARM processor uses R15 for this purpose. In fact,
assembly programs would typically write PC rather than R15, though the
two are synonymous.
In invoking a subroutine, a program must store where the processor
should return after completing the subroutine. The ARM processor uses
R14 for this purpose; it is called the link register,
and is usually referenced as LR in programs.
Below is an example illustrating how this works, using our strcpy example from the previous chapter. Note that this is
not how you should typically call a subroutine; we'll soon see
the right
way. For now, we just want to illustrate how one
could write subroutines using the instructions we've already
seen.
; Note: This way of calling subroutines is stylistically poor.
        MOV   LR, PC          ; place into LR the address to return to (reading PC
                              ; yields this instruction's address plus 8, which
                              ; is the address of the instruction at after)
        B     strcpy
after                         ; main code continues its work here after subroutine completes

strcpy  LDRB  R2, [R1], #1
        STRB  R2, [R0], #1
        TST   R2, R2          ; repeat if R2 is nonzero
        BNE   strcpy
        MOV   PC, LR          ; return back into the code calling strcpy
This fragment first loads into LR the address of the instruction
following the call to strcpy, labeled after in the
above program. In the next instruction, it branches into
strcpy, and the subroutine starts its work. Once complete,
the subroutine copies LR back into PC, which leads the processor
to pick up at the after label.
Calling subroutines happens often enough that the designers of the ARM
processor felt that the two-instruction sequence in the above
illustration was too cumbersome. So they created a new instruction,
called BL (branch with link), which places the return address into LR
and branches in a single step. Thus, rather than the first two
instructions above, we'd write simply BL strcpy.
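As a rough sketch, a call site using BL might look like the fragment below. It assumes the caller has already placed the destination address in R0 and the source address in R1, which is how the strcpy code above uses those registers.

```asm
        ; Assumed on entry: R0 = destination address, R1 = source address
        BL    strcpy          ; LR <- address of the next instruction; PC <- strcpy
        ; execution resumes here when strcpy executes MOV PC, LR;
        ; by then strcpy has changed R0, R1, and R2
```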
13.2. The program stack
Often we want subroutines that themselves call subroutines. Our subroutines will of course use some registers to perform their work. But that brings up a problem: How should we save registers that contain useful information, so that subroutines we call can use those registers for their own purposes?
The solution is naturally to save the registers' values in memory. In fact, we'll use a stack — called the program stack — to save information that a subroutine needs. When it starts, the subroutine will allocate a new block of memory on the top of the stack; and when it returns, the subroutine will release the block from the top, leaving on the stack's top the block of memory for the subroutine that called it.
We implement the stack within our programs by using a large block of memory, and we use a register called the stack pointer to point to the location of the top of the stack. In most processors, including the ARM processor, the stack pointer typically starts at a high address and decreases as we push more values onto it.
In the ARM processor, R13 is conventionally
used for the stack pointer; in assembly programs, we'd typically
reference it as SP. We'll push things
onto the stack using the STMDB instruction, and we'll pop
things off the stack using LDMIA.
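To make the push and pop concrete, the sketch below spells out what each instruction does to SP and to memory when two registers are involved. The STR and LDR forms at the end are equivalent ways of pushing and popping a single register, shown only for comparison; they assume the pre-indexed and post-indexed addressing modes with write-back.

```asm
        STMDB SP!, { R4, R5 } ; SP <- SP - 8; then [SP] <- R4, [SP + 4] <- R5
        ; ... R4 and R5 may now be used freely ...
        LDMIA SP!, { R4, R5 } ; R4 <- [SP], R5 <- [SP + 4]; then SP <- SP + 8

        ; Pushing and popping a single register, written with STR and LDR:
        STR   R4, [SP, #-4]!  ; same effect as STMDB SP!, { R4 }
        LDR   R4, [SP], #4    ; same effect as LDMIA SP!, { R4 }
```

The DB suffix stands for "decrement before" and IA for "increment after", and the exclamation point after SP asks the processor to write the updated address back into SP. Together these give a stack that grows toward lower addresses, with SP always pointing at the most recently pushed word.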
An example of such a subroutine is below. This subroutine, an adaptation of our earlier fragment for adding the numbers of an array, needs two additional registers beyond the registers containing the parameters. Since a subroutine that calls this subroutine may not be able to afford those registers, we instead opt to write the subroutine so that it saves both registers onto the stack as soon as it is called; and just before returning, it restores the registers to their previous values.
; sumArray: Places sum of entries in array into R0. On entry, R0
; should be address of first array element, R1 should be array length.
sumArray STMDB SP!, { R4, R5 } ; push R4 and R5 onto stack
         MOV   R4, #0          ; R4 holds the running sum
sumLoop  LDR   R5, [R0], #4    ; load next entry (assumed to be a 32-bit word), advance R0
         ADD   R4, R4, R5
         SUBS  R1, R1, #1
         BNE   sumLoop
         MOV   R0, R4
         LDMIA SP!, { R4, R5 } ; restore R4 and R5 from stack
         MOV   PC, LR          ; return back to after sumArray call
One of the most common registers a subroutine will want to save is
the link register, since the subroutine will often want to modify the
link register itself as it calls other subroutines. There is a handy
trick involving this: When we restore the registers at the subroutine's
end, we can easily restore the link register's saved value into the
program counter instead. As a result, we won't need the
MOV PC, LR instruction.
subName STMDB SP!, { R4-R5,LR }
        ; code within subroutine goes here, with perhaps some calls
        ; to other subroutines (thus changing LR)
        LDMIA SP!, { R4-R5,PC } ; loading into PC returns out of subroutine
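As an illustration of this pattern, here is a sketch of a subroutine that itself calls sumArray twice. The name sumBoth and its choice of parameter registers are hypothetical. The sketch relies on sumArray restoring R4 and R5 before it returns and never touching R6, as in the listing above.

```asm
; sumBoth (hypothetical): Places into R0 the combined sum of two arrays.
; On entry, R0 = address of first array, R1 = its length,
;           R2 = address of second array, R3 = its length.
sumBoth  STMDB SP!, { R4-R6, LR } ; save registers we use, plus return address
         MOV   R4, R2             ; keep second array's address across the call
         MOV   R5, R3             ; keep second array's length across the call
         BL    sumArray           ; R0 <- sum of first array (overwrites LR)
         MOV   R6, R0             ; remember the first sum
         MOV   R0, R4             ; set up parameters for the second call
         MOV   R1, R5
         BL    sumArray           ; R0 <- sum of second array
         ADD   R0, R0, R6         ; R0 <- total of both sums
         LDMIA SP!, { R4-R6, PC } ; restore registers and return
```

Without saving LR on entry, the first BL would destroy the address that sumBoth needs in order to return to its own caller.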
13.3. Calling conventions
To write large assembly programs, we need a standard system for passing parameters, returning values, and allocating registers between subroutines. After all, if each subroutine created its own system, things would quickly get very confusing as we try to remember for each subroutine how to hand parameters to it and which registers it uses. Such a standard system is called a calling convention, and often there's a standard calling convention associated with the processor.
For the ARM processor, we'll follow the standard calling convention
that parameters are passed by placing the parameter values into
registers R0 through R3 before calling the subroutine, and a subroutine
returns a value by placing it into R0 before returning.
In the rare situation that a subroutine wants more than four parameters,
we'd place any additional parameters onto the stack before entering the
subroutine, with the earlier parameters pushed last, so
that the fifth parameter is on the stack's top (referenced by
SP).
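A small sketch may help with the five-parameter case. The subroutine sum5 below is hypothetical, and the sketch assumes that the caller removes the stacked parameter after the call, so that the subroutine returns with SP unchanged.

```asm
        ; Caller: compute sum5(1, 2, 3, 4, 5)
        MOV   R0, #5
        STMDB SP!, { R0 }     ; push fifth parameter; it is now at [SP]
        MOV   R0, #1          ; first four parameters go in R0 through R3
        MOV   R1, #2
        MOV   R2, #3
        MOV   R3, #4
        BL    sum5            ; on return, R0 holds 15
        ADD   SP, SP, #4      ; discard the stacked parameter
        ; ... caller continues its work here ...

; sum5 (hypothetical): Places into R0 the sum of its five parameters.
sum5    ADD   R0, R0, R1
        ADD   R0, R0, R2
        ADD   R0, R0, R3
        LDR   R1, [SP]        ; fetch fifth parameter from the stack's top
        ADD   R0, R0, R1
        MOV   PC, LR          ; return with SP unchanged
```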
Each subroutine is allowed to alter R0 through R3 as it wishes; but if it uses R4 through R12, it must restore them to their previous values. It must also restore the stack pointer R13, effectively removing everything it pushed onto the stack. It may change the link register R14.
Assembly programmers divide the registers into caller-save registers and callee-save registers. Caller-save registers are those that the subroutine may change, such as R0 through R3 in the ARM convention described above: They are caller-save because a caller of a subroutine must save the registers' values if it wants the values after the subroutine completes. Callee-save registers are those that a subroutine must leave unchanged, like R4 through R12 in the convention described above: Upon being called, the subroutine (the callee) must save the registers' values if it wishes to use them.
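The difference shows up at a call site. In the sketch below, the caller needs the values in both R1 and R4 after the call; helper is a hypothetical subroutine that is assumed to follow the convention above.

```asm
        ; R1 and R4 both hold values the caller needs after the call.
        STMDB SP!, { R1 }     ; caller-save: helper is allowed to change R1
        BL    helper          ; hypothetical subroutine following the convention
        LDMIA SP!, { R1 }     ; R1 is restored; R4 was preserved by helper itself
```

Only R1 costs the caller a push and a pop. If helper needs R4 for its own work, the burden of saving and restoring it falls on helper instead.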
It's beneficial for a calling convention to designate both caller-save registers and callee-save registers. If the convention designated all registers as callee-save, then subroutines would not be able to use any registers at all without saving them onto the stack first — which would be a waste, since some of the saved registers would be transient values that the calling subroutine did not care about long-term. And if the convention designated all registers as caller-save, then programmers would be forced to save many registers before every call to a subroutine and to restore them afterwards, lengthening the amount of time to call a subroutine.
