last updated: 2022-05-10
Song of this chapter: Yosemite Sam > The Looney Tunes Show Blow The Stack Song Boomerang > Blow My Stack!
The stack is a part of memory that is always needed when our program uses subroutines (functions) or interrupts (interrupt service routines, ISR).
To explain how the stack works, we will first look at how the hardware executes a program.
The Program Counter PC (or instruction pointer IP) in our ATmega328 microcontroller is a 14 bit processor address register. It indicates where the computer is in its program sequence. The PC holds (stores) the address of the next instruction to be executed (it points to the next instruction). The PC is not directly accessible.
Normally the PC is incremented after fetching an instruction. But if we have a jump (jmp), relative jump (rjmp) or conditional jump instruction (called a branch instruction, e.g. brne (branch if not equal)), the PC is loaded with the address of the label we want to jump to. Let's look at an example:
In our previous Assembly language example, we used an rcall instruction to jump to our DELAY subroutine (function). Why didn't we use a jmp instruction instead?
Let's look at a modified example:
This cannot work! The first jump works because the subroutine brings us back to the main program, one instruction after the initial jump (red arrows). The second jump (green) does not work, because the subroutine does not bring us back to the instruction after the second jump instruction (dashed green arrow) but to the label BACK, causing a faulty endless loop.
So why does it work with a call instruction?
The rcall (relative call) or call instruction and the ret (return) instruction are super instructions that do several things at once.
The rcall works only in combination with the ret instruction.
The rcall instruction first saves the return address (the address of the instruction following the rcall) in SRAM memory (on the stack). Only after doing this is the jump executed. The ret instruction automatically fetches the return address from SRAM and jumps to this address.
The part in SRAM where the return addresses are stored is called the stack.
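The difference between a plain jump and a call can be sketched on a PC in a few lines of C++. This is only a toy model under simplifying assumptions: the names MiniCpu and continueAfterCall are hypothetical, the return address is kept as one 16 bit value instead of two bytes in SRAM, and pc holds the address of the call instruction itself.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical toy model, not AVR code.
struct MiniCpu {
    uint16_t pc = 0;               // address of the instruction being executed
    std::vector<uint16_t> stack;   // stands in for the stack area in SRAM

    // Plain jump: only the program counter changes, nothing is remembered.
    void jmp(uint16_t target) { pc = target; }

    // Call: remember where to continue, then jump.
    // rcall occupies one program word, so the next instruction is at pc + 1.
    void rcall(uint16_t target) {
        stack.push_back(static_cast<uint16_t>(pc + 1));
        pc = target;
    }

    // Return: fetch the most recently saved address and continue there.
    void ret() {
        assert(!stack.empty());    // ret without a matching rcall is a bug
        pc = stack.back();
        stack.pop_back();
    }
};

// Calls the subroutine at `subroutine` from the address `caller` and
// returns the address where execution continues after ret.
inline uint16_t continueAfterCall(uint16_t caller, uint16_t subroutine) {
    MiniCpu cpu;
    cpu.pc = caller;
    cpu.rcall(subroutine);
    cpu.ret();
    return cpu.pc;
}
```

In this model continueAfterCall(0x0030, 0x0033) and continueAfterCall(0x0040, 0x0033) continue at different addresses although the subroutine is the same. A subroutine that ends with a fixed jmp cannot do this, because it always continues at the same label.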
The stack is largely managed by the hardware. There is a 2 byte SPR register called the stack pointer. This register holds the address where e.g. a return address or a variable is stored in SRAM. The stack pointer is decremented and incremented automatically by the hardware, but the initialisation of the stack pointer must be done by the programmer! This is obligatory if we use rcall and ret, reti (return from interrupt), or the push and pop instructions (see below).
So it is possible to locate the stack anywhere in SRAM. But we have to consider an important fact. The stack pointer is decremented when storing data on the stack (red arrow). For normal SRAM data it is the opposite: addresses are normally incremented when storing data to SRAM (green-blue arrow). So the only logical start address for the stack is the highest SRAM address (named RAMEND in the .inc files; files with defines for each controller).
An initialisation of the stack pointer in assembler would look like this:
; initialise the stack pointer
ldi r16,HIGH(RAMEND) ;RAMEND = 0x08FF for ATmega328
out SPH,r16
ldi r16,LOW(RAMEND)
out SPL,r16
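The stack pointer is 16 bits wide, but it is written as two separate bytes (SPH and SPL). What HIGH() and LOW() compute for RAMEND can be checked on a PC with a small C++ sketch. The helper names highOf, lowOf and joinBytes are hypothetical and only mimic the assembler operators.

```cpp
#include <cstdint>

constexpr uint16_t kRamEnd = 0x08FF;  // RAMEND of the ATmega328, as in the comment above

// Equivalent of the assembler operator HIGH(): the upper 8 bits.
constexpr uint8_t highOf(uint16_t value) {
    return static_cast<uint8_t>(value >> 8);
}

// Equivalent of the assembler operator LOW(): the lower 8 bits.
constexpr uint8_t lowOf(uint16_t value) {
    return static_cast<uint8_t>(value & 0xFF);
}

// Rebuild the 16 bit stack pointer value from its two halves (SPH:SPL).
constexpr uint16_t joinBytes(uint8_t high, uint8_t low) {
    return static_cast<uint16_t>((high << 8) | low);
}

// Checked at compile time: SPH receives 0x08 and SPL receives 0xFF.
static_assert(highOf(kRamEnd) == 0x08, "value written to SPH");
static_assert(lowOf(kRamEnd) == 0xFF, "value written to SPL");
static_assert(joinBytes(highOf(kRamEnd), lowOf(kRamEnd)) == kRamEnd,
              "both halves together give RAMEND again");
```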
The rcall instruction saves the return address (2 bytes) on the stack. The stack pointer is automatically decremented by the instruction each time a byte is saved, so an rcall decrements the stack pointer by two. The ret instruction retrieves the return address and increments the stack pointer by two.
After the call and execution of a subroutine (function, method), the stack is empty again, even though the last return address is still stored in the stack memory. The stack pointer is back at its initial position.
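The bookkeeping of the stack pointer can be followed with a small host-side C++ model. It is a sketch, not AVR code: StackModel and its member names are hypothetical, and it assumes that a byte is stored at the address in SP before SP is decremented and that the low byte of the return address is saved first.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr uint16_t kRamEnd = 0x08FF;  // highest SRAM address of the ATmega328

// Hypothetical toy model of the stack area in SRAM.
struct StackModel {
    std::array<uint8_t, kRamEnd + 1> sram{};  // addresses 0x0000..RAMEND, zeroed
    uint16_t sp = kRamEnd;                    // stack pointer initialised to RAMEND

    // Store one byte at SP, then decrement SP.
    void pushByte(uint8_t value) {
        sram[sp] = value;
        --sp;
    }

    // Increment SP, then read the byte it now points to.
    uint8_t popByte() {
        assert(sp < kRamEnd);  // popping an empty stack would leave SRAM
        ++sp;
        return sram[sp];
    }

    // What rcall does with the return address: two bytes, low byte first.
    void pushReturnAddress(uint16_t address) {
        pushByte(static_cast<uint8_t>(address & 0xFF));
        pushByte(static_cast<uint8_t>(address >> 8));
    }

    // What ret does: read the two bytes back in reverse order.
    uint16_t popReturnAddress() {
        uint16_t high = popByte();
        uint16_t low = popByte();
        return static_cast<uint16_t>((high << 8) | low);
    }
};
```

After pushReturnAddress the model's sp is two lower than before, and after popReturnAddress it is back at RAMEND. The two bytes are still present in sram, which is why the stack counts as empty although the old return address can still be read from memory.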
Besides rcall and ret we have two other instructions to work with the stack. With the push and pop instructions we can save working registers to the stack. The push instruction saves the content of a working register to the stack and decrements the stack pointer. The pop instruction retrieves the content from the stack, saves it to a working register and increments the stack pointer.
The name "stack" comes from the analogy to a set of physical items stacked on top of each other. We can easily take an item off the top of the stack. To get an item deeper in the stack we have to take off multiple other items first.
So the stack works with the LIFO method. LIFO stands for Last In, First Out.
This can be easily seen in our Assembly DELAY subroutine from the previous chapter. We needed 4 working registers r26-r29 (XL, XH, YL, YH) inside our DELAY subroutine to kill time. But the same registers could be needed by our main program. So we save the content of these registers to the stack when entering the subroutine. Inside the subroutine we can now use the registers as local variables at will. Before getting back to the main loop, we restore the 4 registers, and the main loop does not notice that the registers were used inside the subroutine.
The registers must be popped in the reverse order of pushing!
DELAY: push XL ;save the 4 used (global) register to stack, so
push XH ;that they are not changed by the subroutine
push YL ;the 2 double reg. X and Y (16 bit) are now free
push YH ;for local use
...
pop YH ;recover 4 used (global) register from stack
pop YL
pop XH
pop XL
ret ;return to main loop (return address on stack)
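Why the pop order has to be reversed can be tried out on a PC with a short C++ sketch. It is only an illustration: the function saveAndRestore and the type Regs are hypothetical, and a std::vector stands in for the stack.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Four register values in the order XL, XH, YL, YH.
using Regs = std::array<uint8_t, 4>;

// Pushes all four registers, overwrites them as a subroutine body would,
// then pops them either in reverse order (correct) or in push order (wrong).
inline Regs saveAndRestore(Regs regs, bool popInReverseOrder) {
    std::vector<uint8_t> stack;        // back() is the top of the stack
    for (uint8_t value : regs) {
        stack.push_back(value);        // push XL, XH, YL, YH
    }

    regs.fill(0);                      // the subroutine uses the registers itself

    for (std::size_t i = 0; i < regs.size(); ++i) {
        // Reverse order fills YH first, push order wrongly fills XL first.
        std::size_t target = popInReverseOrder ? regs.size() - 1 - i : i;
        assert(!stack.empty());
        regs[target] = stack.back();   // pop always delivers the newest value
        stack.pop_back();
    }
    return regs;
}
```

With popInReverseOrder set to true every register gets its own value back. With false the values come back mirrored, so XL ends up with the old content of YH and so on.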
Here is a little example of how the stack behaves with one subroutine (SR1) and 3 local variables (ZH, ZL and r16):
A local variable is only valid inside a piece of code (e.g. a subroutine) and can't be accessed by the main program, other subroutines or interrupt service routines. A variable can also be local to the main program only!
A global variable can be accessed from everywhere.
This is a real danger when programming in Assembly language. If we forget a single push or pop, the stack pointer can end up wandering through the complete SRAM, overwriting the data memory, the SPR and the GPR and causing the program to crash. This is called a stack overflow.
Also, if we don't take care of our data memory, the data could overwrite the stack and so corrupt e.g. return addresses, causing the program to crash.
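What a single forgotten pop does to the return address can be sketched with a small host-side C++ function. This is a simplified model, not AVR code: addressUsedByRet is a hypothetical name, a std::vector stands in for the stack, and the byte order (low byte of the return address saved first) is an assumption of the model.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns the address ret would jump to after a subroutine that pushes one
// register and pops it again only when restoreRegister is true.
inline uint16_t addressUsedByRet(uint16_t returnAddress,
                                 uint8_t savedRegister,
                                 bool restoreRegister) {
    std::vector<uint8_t> stack;  // back() is the top of the stack

    // rcall: save the return address, low byte first.
    stack.push_back(static_cast<uint8_t>(returnAddress & 0xFF));
    stack.push_back(static_cast<uint8_t>(returnAddress >> 8));

    // push at the start of the subroutine.
    stack.push_back(savedRegister);

    // pop at the end of the subroutine (possibly forgotten).
    if (restoreRegister) {
        stack.pop_back();
    }

    // ret: takes the two topmost bytes as return address, whatever they are.
    assert(stack.size() >= 2);
    uint16_t high = stack.back();
    stack.pop_back();
    uint16_t low = stack.back();
    stack.pop_back();
    return static_cast<uint16_t>((high << 8) | low);
}
```

With the pop in place the function returns the original return address. Without it, the saved register value is taken as one half of the return address, so in this model the program continues at an address that was never intended.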
The stack may also be a security risk because it stores return addresses. If a virus program gets hold of a return address on the stack, it can inject its own code and afterwards run the original program, so the execution of the virus code goes unnoticed.
0x11) of the register and return addresses step by step. Mark the position of the stack pointer and fill in the stack pointer addresses as shown above.
;-----------------------------------------------------------------
; main loop
;-----------------------------------------------------------------
MAIN: 002E E111 ldi r17,0x11 ; initialise r17
002F E222 ldi r18,0x22 ; initialise r18
0030 D002 rcall SR1 ; call subroutine SR1
0031 0000 nop ; no operation
0032 CFFB rjmp MAIN ; endless loop
;-----------------------------------------------------------------
; subroutine SR1
;-----------------------------------------------------------------
SR1: 0033 931F push r17 ; save main variables
0034 932F push r18
0035 E313 ldi r17,0x33 ; use r17 locally in SR1
0036 D003 rcall SR2 ; call subroutine SR2
0037 912F pop r18 ; restore main register
0038 911F pop r17
0039 9508 ret ; return to main
SR2: 003A 931F push r17 ; save SR1 variables
003B E414 ldi r17,0x44 ; use r17 locally in SR2
003C 911F pop r17 ; restore SR1 register
003D 9508 ret ; return to SR1
Test the following program. Then remove the two lines with the pop instruction and call the subroutine one thousand times in a for loop. Document and comment on the result.
// view stack pointer (Arduino Uno)
void setup() {
Serial.begin(115200);
delay(1000);
Serial.print("stack_pointer before:\t");
Serial.println(SP,HEX);
mysub();
Serial.print("stack_pointer after:\t");
Serial.println(SP,HEX);
}
void loop() {}
void mysub() {
asm("push r16");
asm("push r17");
Serial.print("stack_pointer inside:\t");
Serial.println(SP,HEX);
asm("pop r17");
asm("pop r16");
}