c.f. sending decimal using voltages.... long cables... interference.
DIGITAL CODE - 2
groups of bits can represent numbers, characters, computer instructions.
How we interpret a 32-bit word depends on what we expect! (Exercise)
Hardware has to be able to manipulate groups of bits differently depending on what they represent.
Amazingly, we can use combinations of Gates to do this
8-bit example
Possible interpretations:
Character ‘1/2’ assuming extended ASCII/ISO
The number 171
The number -85
NUMBER SYSTEMS
We use a "positional number system" #big-endian
in decimal numbers, the further left a digit is, the greater the power of 10 it is multiplied by.
e.g., 348 = 3 x 10^2 + 4 x 10^1 + 8 x 10^0
This system applied to other Radix (or Bases)
eg. Decimal radix = 10 digits 0..9
Binary radix = 2 digits 0..1
Octal radix = 8 digits 0..7
Hexadecimal radix = 16 digits 0..F
Binary numbers
Computers work in binary number
Convert to decimal exercise:
notation: in maths use 01110_2 to indicate a binary number
some systems use other ways such as %01110 or b01110
counting table
Decimal
Binary
0
0000
1
0001
2
0010
3
0011
4
0100
5
0101
6
0110
7
0111
8
1000
9
1001
10
1010
11
1011
12
1100
13
1101
14
1110
15
1111
16
10000
17
10001
Note: In binary, each digit represents an increasing power of 2, starting from the rightmost digit which is \( 2^0 \). The leftmost digit filled in for 16 and 17 represents \( 2^4 \), or 16 in decimal, which is why we move to a new column for these numbers.
HEXADECIMAL NUMBERS.
Hex is another abbreviation for binary numbers.
Each byte represented by 2 hex digits; each hex digit represents 4 bits in an 8-bit byte.
Hex to binary can be done one digit at a time: e.g., what is
in binary?
It then shows the conversion of hexadecimal 4E to its decimal equivalent:
HEXADECIMAL NUMBERS
0=0, 1=1, 2=2, 3=3, 4=4, 5=5, 6=6, 7=7, 8=8, 9=9
Memorise: A=10, B=11, C=12, D=13, E=14, F=15
mathematically use
to indicate a hexadecimal number
some systems use other ways such as 0x1110 or 01110h
Or sometimes hex is default (not decimal)
leading 0x means hex
HEX - BINARY (4 BITS TO A HEX DIGIT)
BIT-WISE OPERATIONS
Can base decisions on a single bit #button state
e.g. if (bit represents True) then...
Usually 1 means True, 0 False.
Unary operation: NOT
Complement ("invert", "flip") the value
Binary* operations: AND, NAND, OR, NOR, XOR
N-bit operations: e.g. 4-input AND, OR
All can be implemented using binary building blocks ("2-input gates")
THE GATE SYMBOLS REPRESENT BOOLEAN ARITHMETIC:
The symbol for AND gate followed by "> 1 AND 1" with the truth table "1.1=1, 1.0=0, 0.0=0, 0.1=0" and a red dot followed by "AND, &"
The symbol for OR gate followed by "> 1 OR 1" with the truth table "1+1=1, 1+0=1, 0+0=0, 0+1=1" and a red plus sign followed by "OR, |"
The symbol for XOR gate followed by "> 1 XOR 1" with the truth table "1^1=0, 1^0=1, 0^0=0, 0^1=1" and a red circled plus sign followed by "XOR, ^"
INPUTS, OUTPUTS
Wires (just drag from one gate pin to another) can connect pins on the gates to each other and to switches and lights.
The power supply is universal and invisible. We don't have to worry about it.
In real life there will be a +ve rail (Vcc) and a -ve rail (GND) with a connection of each to every chip on the board.
COMBINATIONAL CIRCUIT
A combinational circuit is a connected arrangement of logic gates with the set of inputs and outputs.
A combinational circuit can be described by a truth table showing the binary relationship between n input variables and m output variables.
Two examples: half adder, full adder
These circuits serve as basic building blocks for the construction of more complicated arithmetic circuits.
HALF-ADDER
The most basic arithmetic circuit is the addition of two binary digits.
A combinational circuit that performs the arithmetic addition of two bits is called a half-adder.
It consists of two inputs and two outputs:
Two inputs get summed
Two outputs: sum and carry bits
It is necessary to have two output bits because the sum 1 + 1 is binary 10, which has two digits.
FULL-ADDER
A full-adder is a combination circuit that forms the arithmetic sum of three input bits.
It consists of three inputs and two outputs.
Two of the inputs represent the two bits to be added.
The third input represents the carry from the previous lower significant position.
One of the outputs will be the carry into the next higher significant position, i.e. into the next full adder.
Of course two output bits are necessary because the arithmetic sum of three binary digits ranges from 0 to 3, and 2 and 3 need two bits.
Gates are not instantaneous
The image contains the following text:
Ci and the output of X XOR Y arrive at different times!!!
The circuit is unstable.
If only we could synchronise things !!
Clock
Could be something simple like a 555 timer (astable multivibrator using an RC timing element), a crystal oscillator, a phase-locked loop or an atomic clock. Probably just a chip
Clock feeds into the ALU
Choose which circuit forms the output
At the top, it says "(two logic controlled switches)"
There are two input labels "X" and "Y" connected to the inputs of the gates.
There is a label "Ci" circled in red, likely indicating the "Carry in" input.
The text near the bottom reads "AND output selected if Ci is low, OR output selected if Ci is high."
There is a label "Co" circled in blue, likely indicating the "Carry out" output.
Building a Computer? Things we need:
memory within the CPU = banks of flip-flops
Instruction Pointer_ (counter)
full adder + shift operations
Building a computer? Things we need:
Fast memory (RAM)
program and data
Central processing unit (CPU): registers, IP, ALU
Input / Output Channels
accel cards
Persistent memory (Disks)
Motherboard
Screen
Keyboard
Mouse
D-TYPE FLIP-FLOP
> Set data to high - Q goes high on next clock pulse. Stays high.
Set presets Q to high when pulled high
Reset clears Q when pulled high
D Flip Flop
clock data
D-FLIP-FLOPS AS A REGISTER OR LATCH
A register (many bits) or latch (usually one bit) can be made up from a series of D-Flip-Flops driven by a common clock.
The transfer from the D side to the Q side for all D flip flops occurs simultaneously as this clock changes.
BIG ENDIAN AND LITTLE ENDIAN DATA FORMAT
> Technical definition for the order of bytes:
Big Endian:
The most significant byte resides in the smallest memory address
Little Endian:
The least significant byte resides in the smallest memory address
SUMMARY Registers
D Flip Flops generally form the building blocks of registers
Registers:
allow us to store bit strings that represent data
reside close to the CPU, allowing fast and easy access
> Bit Endianness determines how we interpret bit strings:
Big Endian: most significant bit comes first
Lile Endian: least signifcant bit comes frst
> Next Lecture: ripple counters with JK Flip Flops
SHIFT Registers
A shift register takes input from one end, and at each clock change this value is moved to the next D-Flip-Flop.
This is used in serial data transfer when a byte (say) of data sent on a cable one bit after another can be collected in a series of D Flip-Flops to rebuild the whole data byte. This is called serial-to-parallel conversion.
SHIFT REGISTERS
Here is how a high travels through a 3 bit shift register. For this example we assume that each of the shift register bits is cleared at the start.
SHIFT LEFT - SHIFT RIGHT Registers
The shift registers shown so far shift data to the right. A simple rewiring gives a shift register that can move data left. Of course these may also have the ability to parallel load.
WHAT DOES A SHIFT REGISTER DO?
Moves a state (number such as 0, 1) from a low order bit to a higher order bit.
Multiplies a (binary) number by two.
Number of "shifts" depends on number of clock cycles.
Level counter (e.g. volume control)
Can use a counter to enable/disable clock, thereby programming the amount of shift.
Can shift in the other direction - divide numbers by 2.
Can have two clocks - one for left-shift, one for right-shift or use gates to determine the shift direction
SERIAL-TO-PARALLEL CONVERSION
Some shift registers allow all flip-flops to load at once,
This gives parallel-to-serial conversion
RIGHT-SHIFT SERIAL OR PARALLEL INPUT: 4-BIT
Flip-flops are connected (output to input) with a common clock to cascade input from high bytes to low bytes.
WHAT ABOUT BI-DIRECTIONAL SHIFTING (SELECTABLE)?
This takes some thinking!
You need a pin to select which direction
You need to allow inputs to any D Flip Flop to come from either direction
Try and design a 2 bit selectable direction Shift Register!
Serial input only
You will probably need some OR gates, AND gates, and a NOT gate
THE LOGISIM SHIFT REGISTER
COUNTERS IN DIGITAL LOGIC
> A circuit that stores and increments the number of occurrences of an event:
Most commonly clock pulses
> Two types we will consider:
Asynchronous (ripple) counter
Synchronous (common clock) counter
> J-K Flip Flops central to both
RIPPlE COUNTER
Ripple counters utilise the toggle setting of J-K Flip Flops:
Consider this circuit
What happens to Q, each clock pulse ?:
Ripple counters utilise the toggle setting of J-K Flip Flops:
And finally, four FFS
Q, Q,, Q2, Q,?
> The ripple effect of FF toggles exactly matches binary counts:
each oscillates at half the frequency of the FF before
The bits stored in the FFs thus represent incrementing binary values
BUT - on which edge of the clock pulse?
COUNTING DOWN?
What if I want to count down instead of up?
Requires nothing more than a change of FF trigger:
Change FF to trigger on rising edge of clock pulse
Lets take a look
SUMMARY Ripple Counter
Ripple counter achieves a simple binary counter
It utilises the J-K toggle setting:
J= 1, K=1,
> It is asynchronous because:
No common clock
State changes ripple through from least significant FF
› We can change directionality of counter using FF trigger:
Falling edge = increment
Rising edge = decrement
MODULO 6 COUNTER
> What do we need to do ?
An incrementing counter: falling edge trigger - easy!
A forced reset of all Flip Flops to zero when the counter reaches 6:*
some extra logic gates to monitor things, and trigger a FF reset - not too tricky
Detect the illegal state (6)
Reset the FFs to wrap around to zero
MODULO 6 COUNTER WITH A MOMENTARY ILLEGAL STATE
Detecting the first illegal state (6 in this case) and immediately resetting to 0 (don't wait for the clock) by using the asynchronous master reset (MR) or CLR'
This circuit uses a cascading clock
Summary Modulo 6 counter
Counters don't have to wrap around at powers of 2
Can use outputs of FFs and approrpaite logic gates to detect illegal state"
Slight problem though:
Master reset is asynchronous
The illegal state momentarily still happens !
ALTERNATIVE - COUNTER WITH COMMON CLOCK
We can avoid the illegal state by detecting the last legal state (eg., 5 in a modulo 6 counter), and then set to 0 on next clock pulse.
This requires a non-cascading counter.
We need a common clock
Lets build it up from 1 to 3 bits .....
MOD 6 COUNTER WITH COMMON CLOCK
First a 1 bit counter
1-bit (counts 0...1...0...1...) Enable (Set)
Set J and K to make it toggle
2-BIT COUNTER
2-bit (counts 00...01...10...11...00...)
Connect Q1 to J2 and K2
Now Q2 only changes state if Q1 is set (halves the frequency).
but there is no Co
(when state = 11)
Because we have a common clock
COUNTING WITH 3 BITS (LITTLE-ENDIAN)
The issue is that the gates are all using the same clock, so Q3 will toggle at 3, and then the counter will reset to O Try it, You will get the number sequence: 0, 1, 2, 7, 0...
> Add an AND gate to only toggle Q3 when both Q1 and Q2 are set. That's our Carry Out (Co).
NOW TO RESET WHEN IT GETS TO 6 (ILLEGAL STATE)...
When the output is 110 (Q1=0, Q2=1, Q3=1), use a 2-input AND gate to trigger the Reset on all of the Flip-Flops.
Issues The synchronous (common clock) counter
Currently the illegal state (6) still momentarily occurs
Solution:
add D-Flip-Flops as a buffer before displaying
> Adds a delay of 1 clock pulse
3 bit Common Clock - Little Endian - No illegal state
SUMMARY The synchronous (common clock) counter
> Common clock counter synchronises FF state transitions
ROM (read only memory): All reading at full speed. Just got the address and go there.
-contents built-in at time of manufacture.
PROM - Programmable read only memory. Programmed by a once-only irreversible operation, e.g. in factory.
EPROM - Erasable Programmable read only memory. Can be removed from the computer and erased and programmed (slowly) by using special apparatus (e.g. UV light).
Sometimes called "field-programmable", i.e. "in the field"
Bulk erased: every byte erased at the same time
Byte programmable: write bytes one by one.
EEPROM - Electrically erasable read only memory. Can be erased and reprogrammed byte by byte in situ, but writing is slower than normal reading.
TYPES OF MEMORY - 2
RAM random access memory.
A misnomer it should be RWM (read write memory) - both ROM and RAM
allow random access
Static RAM - retains information until power removed. Fast, larger area of silicon per byte, modest power requirement.
Dynamic RAM - retains information as long as the contents are refreshed frequently enough. Smaller area of silicon per byte, low power requirement.
Does not use flip-flops.
Uses tiny capacitors to store electric charge.
Because the charge leaks away have to rewrite ("refresh") every few milliseconds.
TYPES OF MEMORY - 3
> SDRAM (Synchronous DRAM)
Hybrid of dynamic and static technology
"clocked" by the main CPU clock
> DDR (double data rate) SDRAM
Chips produce data on rising and falling edges of the clock
Higher data rates, eg 64 bits / nsec
TYPES OF MEMORY - 3
FRAM - Ferroelectric RAM
• Uses atom position in unit cell (in theory). Actual density about the same as DRAM. Non-volatile (no power / refresh to keep state).
Used in specialist devices (where you never want to
"turn it off and on again" to fix it. Random access
Flash memory - (EEPROM)
Charge stored between insulators. Write bit by injecting electrons through a barrier layer (physically damaging it). Used in USB drives. Good for about 30,000 writes. Random access
Core memory
Magnetic "core" memory: each bit is stored as the direction of magnetisation of tiny rings (like doughnuts).
1955-1975. 1000 x slower than DRAM.
Memory Addressing
RAM consists of one or many chips
RAM is organised into words (of e.g. 32 bits)
Words are grouped into pages.
Words are selected by an address
A number of m bits
Control bits specify whether to read or write
Bits stored in the selected word go to or from the data bus
A byte is typically the smallest addressable chunk of memory. We can address the 4 bytes in a 32-bit word, but we can't address individual
bits in each byte. Exceptions include
Micron 3D-XPoint
RAM ROM SUMMARY
> Many different types of memory:
ROMS variants
RAM variants
Trade offs of speed, space, expense, longevity
> All slower to access than registers
STACKS
Random access memory requires knowing the address of every byte/word you want to access
Stacks offer a way of organising and accessing memory without random (indexed) access:
There are hardware stacks and software stacks.
Hardware stacks created out of dedicated shift registers
Software stacks typically defined in RAM using conventions (we'll come back to this):
IN SIMPLE TERMS...
A stack allows us to mothball/backup/hibernate a process/ task at will on the receipt of an interrupt or code invocation.
To do this, we
push instructions/data that we will need later onto the stack;
do the task;
and then pop the stored data back off the stack and
continue as before.
Stacks (6 deep)
Example:
a hardware stack with a depth of 6.
Let it hold one item (5) as shown
If 7 is now pushed onto the stack it becomes Push on a 0.
Pop off the O
Pop off the 7
Push a 2
Pop off the 2
If you remove another item you get the 5 and the stack is now empty.
SOFTWARE STACKS
We can also use ordinary RAM to implement a stack.
Stack space only limited by the amount of addressable RAM
The "top" of the stack moves "up" and "down" as things are pushed or popped, and the location of the "top" is stored in a register (Stack pointer).
ADD SOME LEDS TO SEE THE PARALLEL OUTPUT / REGISTER
STATE
> We can now modulate the Data input and see the on state propagate through the register (left to right).
HOW DO WE MAKE IT GO BACKWARDS?
Wire-up the cascade backwards
Data goes in the far end, each Q outputs to the D input of the previous Flip-Flop.
HOW DO WE MAKE THE DIRECTION SELECTABLE?
> A controlled gate
Controlled input gate: Selects A if high, B if low
PROGRAMMABLE WIRING
We can use one of these circuits for each input to a Flip-Flop, and use a common control signal to determine direction.
SELECTING THE OUTPUT WITH GATES
> We also need to determine which direction output from a
Flip Flop flows!
> This circuit has a common input, and selectable output.
> We can use one of these for each output from a Flip-Flop, and use a common control signal to determine direction.
LET'S PUT THIS ALL TOGETHER (JUST 2 FFS TO START WITH)
SELECT INPUT
ADD THE OUTPUT DIRECTION SELECTION
> R circuit enabled (red)
OTHER DIRECTION SELECTED?
WHAT COULD POSSIBLY GO WRONG?
Can't short the outputs of two gates together.
Have to Add with an OR aate.
Conflict
THE FIX
The OR gates combine the outputs of the controlling AND gates and pass through the signal from whichever one is enabled... to the D inputs on each Flip-Flop.
Generally need about 3 gates per flip-flop per bit
NEED MORE DEPTH?
This is a 4-deep shift register (
http://www.ee.usyd.edu.au/tutorials/digital_tutorial/part2/
register06.html).
As you can see, there are alternative ways of wiring it up, with different logic gates.
TO MAKE A STACK...
So far we have made a 2-stage bi-directional 1-bit shift register.
To make a proper stack:
Add depth (flip-flops with associated control logic)
Add width (bits) in parallel (common clock, control signals).
identical shift-registers - one for each bit.
SUMMARY build a stack
Hardware stacks formed using banks of shift registers
Simple input selective circuits allow direction selectability:
Programmable gates!
To make a proper stack:
Add depth (flip-flops with associated control logic)
Add width (bits) in parallel (common clock, control signals).
identical shift-registers - one for each bit.
Assembly
int answer = 10 + 4;
Step 1 – the values 10 and 4 need to be represented somewhere in memory – probably registers
int answer = 10+4;
Step 2 – this plus sign needs to be mapped to some sort of instruction that tells the CPU/ALU to use it’s Adder (remember the adder circuit!) on these values
int answer =10+4;
Step 3 – the output of this “adder” must be stored somewhere – again, probably a register.
int answer=10+4;
Step 4 – define a location in memory big enough to hold an integer, and give it a name.
int answer = 10 + 4;
Step 5 – store the result of the addition, currently in a register, in the location associated with the symbolic name “answer”
mov r1, #1 ;r1 = register 1, #: delimiter, #1: literal value(not a pointer)
;1: value in decimal
Add v1
add r2, #1 ;r2: to register 2
add 1 to r2
r2 = r2+1
r2++
Add v2
add r2, r6; r2: to register 2
add r6 to r2
r2 = r2 + r6
Add r3
add r2, r5, r6
add r6 and r5 and put into r2
r2 = r5 + r6
SUB (works the same as ADD)
sub r2, #1; sub: subtract, r2: to register 2
subtract 1 from r2
‘dec’ in some asm
LABELS
Labels are used to give meaningful names for:
the location of instructions for branching,
specify locations of data for loading and storing.
Labels can be made up of any upper or lower case alpha-numeric characters
They must start with a letter
Labels essentially allow programmers to assign meaningful names to memory address locations that will be used often.
This is how we define variables (amongst other things we'll talk about later) in assembly code.
Labels definitions are looked at by the assembler before the executable code
someplace: .word 0
: ->Colon indicates it’s a label definition
someplace → label name (whatever you want)
.word → Type (i.e. size) of data to be stored
0 →Value to initialise memory location with
Define “someplace” as a label for a memory location holding a word value of 0
Store Register
str r1, someplace
, → delimiter, reference (a pointer)
someplace: offset value in decimal
write value in r1 into the memory location associated with the label “someplace”
Summary ARM Basic
ASM programming represents the lowest level of human readable programming
Near 1-to-1 match with machine instructions
RISC versus CISC
ASM programming is useful for:
Hardware specific programming
Optimising code for performance
Reverse engineering and analysing code behaviour
if (x < 2): (3)
x = 0 ← (1)
else: (4)
x = 3 ← (2)
(1) (2) ⇒ First thing to notice: we have two mutually exclusive computations to perform!
(3) (4) ⇒ Second thing to notice: we have a condition that determines which instruction to branch to ← This requires the ability to branch only when a specific condition is true
IF Tests: CMP
In ARM assembly, the CMP (Compare) instruction allows values to be compared:
CMP R1, R2 subtracts the 2nd value (R2) from the 15t (R1)
The result of this subtraction is then used to update the Application Program
Status Register (APSR)
Performed by the ALU
Specifically, 4 flag bits are updated within the APSR:
N
ALU result was Negative
Z
ALU result was Zero
C
ALU set the Carry bit
V
ALU result caused oVerflow
The Program Status Register (PSR) combines:
Application Program Status Register (APSR)
Interrupt Program Status Register (IPSR)
Execution Program Status Register (EPSR).
These registers are mutually exclusive bitfields in the 32-bit PSR. The bit assignments are:
Table 2.4. APSR bit assignments
Bits
Name
Function
[31]
N
Negative flag
[30]
Z
Zero flag
[29]
C
Carry or borrow flag
[28]
V
Overflow flag
[27]
Q
Saturation flag
[26:0]
-
Reserved
Access these registers individually or as a combination of any two or all three registers, using the register name as an argument to the MSR or MRS instructions. For example:
read all of the registers using PSR with the MRS instruction
write to the APSR N, Z, C, V, and Q bits using APSR_nzcvq with the MSR instruction.
inspect the outcome of a CMP here
Details
cmp r2, #1234 ;cmp: compare, r2: r2: register 2
compare r2 and #1234
r2 - #1234 set APSR with ALU flags, store the ALU flags in the APSR
Conditional Branching using the APSR
Branch (B) reads the APSR and jumps according to the flags and the relevant suffix you give it.
In ARMlite, there are five variants of Branch:
B - unconditional branch
BEQ
'Branch if EQual'
BGT
'Branch if Greater Than'
BLT
'Branch if Less Than'
BNE
'Branch if Not Equal'
In the ARM instruction set more generally, there are many others
Determining the comparison from the flags
In ARM assembly, the condition code suffix can be added to many operations. e.g. movne r1,#12
Suffix
Flags
Meaning
EQ
Z set
Equal
NE
Z clear
Not equal
CS or HS
C set
Higher or same (unsigned >= )
CC or LO
C clear
Lower (unsigned < )
MI
N set
Negative
PL
N clear
Positive or zero
VS
V set
Overflow
VC
V clear
No overflow
HI
C set and Z clear
Higher (unsigned >)
LS
C clear or Z set
Lower or same (unsigned <=)
GE
N and V the same
Signed >=
LT
N and V differ
Signed <
GT
Z clear, N and V the same
Signed >
LE
Z set, N and V differ
Signed <=
how does this translate to assembly code
if (x <= 2):
x = 0
LDR R0,X // assume X is holding a value
CMP R0,#2 // compare contents of R0 with #2
BGT skip // if R0 > 2 then jump to skip
MOV R0,#0 // if R0 <= 2 then assign R0 #0
skip: // continue program from here ...
if (x <= 2):
x = 0
else:
x = 3
LDR R0,X // assume X is holding a value
CMP R0,#2 // compare contents of R0 with #2
BGT else // if R0 > 2 then jump to else
MOV R0,#0 // case for R0 <= 2
B cont // Branch to label cont
else:
MOV R0,#3 // case for R0 > 2
cont:
if (x <= 2):
x = 0
else if (x <= 4):
x = 2
else
x = 4
LDR R0,X // assume X is holding a value
CMP R0,#2 // compare contents of R0 with #2
BGT else1 // if R0 > 2 then jump to label else
MOV R0,#0 // case for R0 <= 2
B cont // Branch to label cont
else1: // if R0 > 2
CMP R0,#4 // compare contents of R0 with #4
BGT else2 // if R0 > 4 then jump to label else2
MOV R0, #2 // otherwise, handled case for R0 > 2 <= 4
B cont // Branch to label cont
else2: // if R0 > 4
MOV R0,#4 // case for R0 > 4
cont: // exit point of conditionals. Continue with program from here
Looping and Branching
Recall Labels
Labels are used to give meaningful names to locations in memory
Previously we saw this used to define variables:
Eg someplace: . word O
Defines "someplace" as a label for a memory location holding a word of value O
But we can also use labels to mark the location of instructions in our programs !
Instructions have addresses too !
Recall that before a program is executed, it is loaded into memory
Every instruction therefore occupies a word of memory, and has an address like any other data
Recall also that this is a core principle of the Von Neumann architecture
We can therefore use the address of specific instructions to jump from one location to another
often referred to as branching
The Program Counter
• To understand labels and branching we need to understand how the
CPU keeps track of which instruction to execute next
• It uses the special register, the Program Counter (PC) register
Counter
After each instruction is executed, the PC register is updated with the address of the next instruction to execute.
This update is performed by a hardware counter within the ALU
Labels as branch points
MOV R0, #0
Loop: ;Label referring to the address of the next instruction
ADD R0,R0,#1
B Loop;A Branch instruction, telling the program counter that the next
;instruction to execute is at address of label ‘Loop’
HALT
Unconditional Branch
B someplace; This can be any word aligned address,
;but is almost always given as a label, defined elsewhere in the program
Branch to the address represented by the label someplace
Bit-wise operators
Instructions that manipulate individual bits given to them in their
operands
Generally best understood by considering values in their binary form
Many of these will feel familiar from the first half of this unit
AND – Logical bit-wise AND
Syntax:
AND R3, R1, R2
Performs a bit-wise logical AND on the two inputs given in the second and third operands
Stores the result in the destination, (first operand)
Eg,
R1 "AND" R2":
ORR — Logical bit-wise OR
Syntax:
ORR R3, R1, R2
Performs a bit-wise logical OR on the two inputs given in the second and third operands
Stores the result in the destination, (first operand)
Eg,
R1 "ORR" R2":
EOR – Logical bit-wise Exclusive OR
Syntax:
EOR R3, R1, R2
Performs a bit-wise logical XOR on the two inputs given in the second and third operands
Stores the result in the destination, (first operand)
Eg,
H
R1 "EOR" R2":
Observations on AND,ORR,EOR
Have clear analogies to the logical gates we have seen previously
Generally useful for specific bit manipulations to control hardware, as well as bit masking (e.g, to extract part but not all of a 32 bit string).
Also for performing comparisons between, or merging bit strings.
LSL - Logical Shift Left
Syntax:
LSL R1, R1, #3
Shifts each bit of the input value to the left, by the number of places specified in the third operand, losing the leftmost bits, and adding zeros on the right.
Stores the result in the destination, (first operand)
Eg,
If R1 = #0b01101110, and shift = #3
R1 becomes #0b01110000
LS - Logical Shift Left
Syntax:
LSR R1, R1, #3
Shifts each bit of the input value to the right, by the number of places specified in the third operand, losing the rightmost bits, and adding zeros on the left.
Stores the result in the destination, (first operand)
Eg,
If R1 = #0b01101110, and shift = #3
R1 becomes #0b00001101
LSL and LSR Observations
Assuming Little Endian notation:
A single LSL doubles the value represented by the bit string
A single LSR halves the value represented by the bit string
Both instructions invoke a shift register to perform task in hardware
As with all bit-wise operators, they can be performed in one CPU cycle.
Form the backbone for implementing multiplication and division operations, which typically require multiple CPU cycles to calculate (we'll come back to this later)
Memory Addressing - LDR and STR
A common workflow
The CPU and ALU work with values stored locally in registers
A standard workflow for operating on values in memory:
Load value(s) into registers
Perform operations on values in registers
Write result to register(s)
Store value(s) back to memory
LDR and STR
LDR ('Load Register') copies a value from a memory location into a register
STR ('Store Register) copies a value from a register into a memory location
E.g.:
LDR RO, Oxfc loads the contents of the word starting at memory address 0x000fc
into RO.
STR R1,120 stores the contents of R1 into the word starting at memory address
(decimal) 120.
STR R3, myLabel stores the contents of R3 into the word at the address represented by label myLabel.
LDR R4, myLabel loads the contents of the word at the address represented by label myLabel
LDR and STR (cont)
The address passed to LDR/STR must be 'word aligned'
that is, it must be divisible by 4
Therefore:
LDR R3, 0x81 is an invalid instruction because address 0x00081 is not a valid word address (it is not divisible by four).
The Assembler will trap this.
LDRB and STRB
LDR and STR have byte addressable variants!
LDRB will load just a single byte of memory
into the least significant 8 bits of a register - setting the other bits to 0
STRB will store the least significant 8-bits of a register into a specified
byte of memory
not altering any neighbouring bytes
LDRB and STRB (cont)
LDRB and STRB do not require the specified memory address to be word aligned (divisible by four).
For example
LDRB R3,0x81 is a valid instruction, and will load the contents of the single byte at address 0x00081 into the least significant 8 bits of R3.
Indirect Addressing
LDR and STR also support 'indirect addressing'
Indirect addressing allows memory to be addressed via registers holding the address
In some programming languages (e.g., C), this is the basis of 'pointers'
a memory word that holds the address of another memory word
In ARM assembly:
LDR R0,[R1] will load into RO, the contents of the memory address that is currently held in R1.
STR R3,[R4] will store the value in R3 to the memory word with address currently held in R4
Indirect and Indexed Memory Addressing
LDR and STR memory addressing
We have previously discussed and seen how LDR and STR support
indirect addressing
allows memory to be addressed via registers holding the address
For example:
LDR RO, [R1] will load into RO, the contents of the memory address that is currently held in R1.
STR R3, [R4] will store the value in R3 to the memory word with address currently held in R4
LDR Indirect example
MOV R1, #0x00040
LDR RO, [R1]
R1 = 0x00040
R0 = 0xffffffff
STR Indirect example
MOV RO, #Oxffffffff
MOV R1, #0×00040
STR RO, [R1]
STR: taking value into R0 and storing it into memory location,
Indexed Memory Addressing
It is often desirable to address specific words/bytes of memory as an offset from some base address.
For example:
Accessing dedicated memory locations that control/interact with external hardware/systems
Pixel display
Arrays (blocks of memory used to collect values of the same type)
Structs (blocks of memory used to collect values of different types)
In such examples, there is typically a base address indicating the start of addresses dedicated to the specific use-case.
We can therefore index specific memory locations within as an offset from this base address
Indexed Addressing Syntax
In ARM assembly
LDR R0, [R1 + #4] ; Load the value at memory address [R1] + 4 bytes into R0
STR R2, [R1 + #16] ; Store the value in R2 in memory at address [R1] + 16 bytes
LDR R0, [R1 + R3] ; Load the value at memory address [R1] + R3 (no. of bytes)
;into R0
STR R2, [R1 + R4] ; Store the value in R2 in memory at address [R1] + R4
;(no. of bytes)
the offset number of bytes must be a multiple of 4 (ie., 4 bytes per word)
For byte index addressing (ie STRB and LDRB):
individual bytes can be addressed so no need for offset to be multiple of 4
Remember ARMlite uses little endian byte addressing
If a byte offset is larger than 4 then it will be addressing bytes in adjacent words
Busy Wait Timer (Flashing LED)
Let's simulate a flashing LED
We can simulate a flashing LED by drawing a single pixel in ARMLite's graphical display
ARMlite has three modes of pixel graphics display:
Low resolution (32 × 24 pixels)
Medium resolution (64 × 48 pixels)
High resolution (128 × 96 pixels)
For this we will use the low resolution modality
ARMlite Low Res Pixel Graphics
In ARMLite low-res modality, there are 32 × 24 = 768 pixels.
The display window is mapped to a contiguous block of memory.
Pixel colours are specified as 24 bit Red-Green-Blue values (8 bits per colour component)
Eg Bright Green is 0x000FF000, White is Ox00FFFFFF
In ARMlite only the least significant 24 bits of each word are used
Each pixel therefore has an address, and can be written to (STR) or read from (LDR) like any memory location
ARMlite Low Res Pixel Graphics
In low-res mode, each pixel is assigned a label, starting from top left and moving across each row:
Pixelo, Pixel1, Pixel2 .....Pixel767
For convenience, labels pre-exist for standard colour codes:
Eg., the colour green can be represented by the predefined label green
green == #0×00008000
ARMlite Low Res Pixel Graphics
Let's paint a green pixel in the middle of the display:
The index of the middle of the display will be 11*32 + 15 = 367
MOV RO, #.green // set RO to RGB for green
STR RO, •Pixel367 // write RO to address of Pixel 367
simulate a flashing LED
Issue - the flashing is too fast !
In fact, in a real system this would probably not produce a flash at all,
just a dimmer LED
MOV R0, #.green
MOV R1, #.white
flash:
STR R0, .Pixel367 ;(1)
STR R1, .Pixel367 ;(2)
;The issue is between (1) and (2) -> we need to insert a delay
;between the “ON” and the “OFF" ->we need a timer!
B flash
A "Busy Wait" Timer
Pseudocode:
Set up pixels for display
Loop1:
Turn LED on (paint green pixel)
Busy wait
Turn LED off (paint white pixel)
Busy wait
branch to loop1
A "Dumb" Busy Wait Timer
Busy wait:
a loop that executes until it reaches a certain value
keep CPU occupied
set limit value // some large number
initialise counter to 0
timer:
Add 1 to counter
Compare counter to limit value
branch to timer if counter < limit
So we need a conditional branch as part of the timer loop !
MOV R2, #5000000 // init number of iterations for timer
MOV R2,R3 // MOV iterations into working register R3
timer1:
SUB R3,R3,#1 // subtract 1 from R3
CMP R3, #0 // compare R3 with #0
BNE timer1 // keep looping until R3 reaches zero
Incorporating timer into LED flash
• Recall pseudo code:
Set up pixels for display
Loop1:
Turn LED on (paint green pixel)
Busy wait
Turn LED off (paint white pixel)
Busy wait
MOV R2, #5000000 // init number of iterations for timer
MOV R2,R3 // MOV iterations into working register R3
timer1:
SUB R3,R3,#1 // subtract 1 from R3
CMP R3, #0 // compare R3 with #0
BNE timer1 // keep looping until R3 reaches zero
Imagine you executed the code on two different CPUs?
EG
RPi 2B (900MHz quad-core ARM Cortex-A7 processor)
RPi 4B (1.5GHz quad core ARM Cortex-A72 processor)
Will the delay be the same on both?
No !
A Better Busy Wait
Most CPUs have access to a register that maintains a real time count.
On the Raspberry Pi for example, a Timer register counts in 1 microsecond intervals (106 per second)
In ARMlite, a register maintains a 1 second counter
accessible via the pre-defined .Time label
Eg LDR RO, •Time loads the current time in seconds (since Jan 1 2000)
into RO
We can utilise this to produce a hardware independent real-time
delay
A Better Busy Wait - psuedocode
Set desired delay time in seconds
start_time = current time in seconds
loop:
now = current time in seconds
remaining_time = (now - start_time)
compare remaining_time, delay loop if remaining_time <= delay
A Better Busy Wait
Let's allocate some variables to registers
R2 - desired delay
R3 - now (where the current time will be stored)
R4 - start time
R5 - elapsed time (R3-R4)
Solution
1 mov r0, #.green
2 mov r1, #.white
3 mov r2, #1 ; 1 sec delay time
4 flash:
5 str r0, Pixel367 ; flash on
6 LDR r3, .Time ; start time
7 timer1:
8 LDR r4, .Time ; current time
9 sub r5, r4, r3 ; elapsed time = current time - start time
10 CMP r5, r2
11 BLT timer1
12 str r1, Pixel367 ; flash off
13 LDR r3, .Time ; start time
14 timer2:
15 LDR r4, .Time ; current time
16 sub r5, r4, r3 ; elapsed time = current time - start time
17 CMP r5, r2
18 BLT timer2
19 B flash
20 halt
A Better Busy Wait
Provides an absolute timer (no longer dependent on processor speed!)
However, our approach is still a form of busy wait!
CPU still occupied for duration of delay
Essentially this is polling the timer
Is there a potentially even better approach ?
Well yeah! We could implement an interrupt-based timer
We will come back to this later in semester!
Revisiting our Flashing LED Example
Recall our flashing LED program from last week, and in particular the delay function
;; *****************************************
;; delay function
;; inputs R0 - time delay in seconds
delay:
push {R3,R4,R5,R6}
MOV R3, R0 ; move delay time param into R3
LDR R4, .Time ; get start time
timer:
LDR R5, .Time ; update time
SUB R6, R5, R4 ; calc elapsed time
CMP R6, R3 ; compare elapsed to delay time
BLT timer
pop {R3,R4,R5,R6}
RET
;; *****************************************
This is a busy-wait timer, or in other words, an example of polling
The CPU constants checks for a state – in this case, that the time exceeds a delay time
There is a better way ! Using Interrupts
Now back to our Flashing LED
Lets use ARMlite's Clock Interrupt for this !
To set it up the interrupt handling:
The handler will be called everytime the clock raises an interrupt
So each time it will draw either a green ("LED on") pixel or a white ("LED off") depending on what state the LED is currently in
This is quite different to previously where we inserted a delay between each state change!
Now we're waiting to be told when to change state:
Event driven !
Flashing LED with INT: Interrupt Handler
/////////////////////////////////
/// interrupt handler: toggleLED
// toggles state of LED (on "green" or off "white")
//
toggleLED:
PUSH {RO}
CMP R3, #O // check state
BNE off
MOV R3, #1 // if R3 0, then 1
MOV RO, #.green
B drawpixel
off:
MOV R3, #0 // if R3 1, then O
MOV RO, #.white
drawpixel:
STR RO, .Pixel367 // draw the pixel
POP {RO}
RFE
Flashing LED with INT: Interrupt Handler
CMP R3, #O ; check state -> Check next state to enact (on or off)
BNE off
;next state is “off” ?
;Then turn LED off
off:
MOV R3, #0 ; if R3 1, then 0
MOV R0, #.white
MOV R3, #1 ; if R3 0, then 1 - next state is “ON” ?
;Then turn LED ON
MOV R0, #.green
drawpixel:
STR R0, .Pixel367 ; draw the pixel - Draw the pixel (R0 will be set as
;either green or white)
PUSH {R0}
POP {R0}
We push and pop R0 so we can restore it after the function is finished.
Flashing LED with INT: Full Program
1|// Set up Interrupt handling
2| MOV R0,#toggleLED
3| STR R0,.ClockISR
4| MOV R0,#1000
5| STR R0,.ClockInterruptFrequency
6| MOV R0,#1
7| STR R0,.InterruptRegister //Enable all interrupts
8| MOV R3, #1 // R3 keeps track of state of LED
9|mainProgram:
10| B mainProgram //Here, just an empty loop!
11|/////////////////////////////////
12|/// interrupt handler: toggleLED
13|// toggles state of LED (on "green" or off "white")
14|//
15|toggleLED:
16| PUSH {R0}
17| CMP R3, #0 // check state
18| BNE off
19| MOV R3, #1 // if R3 0, then 1
20| MOV R0, #.green
21| B drawpixel
22|off:
23| MOV R3, #0 // if R3 1, then 0
24| MOV R0, #.white
25|drawpixel:
26| STR R0, .Pixel367 // draw the pixel
27| POP {R0}
28| RFE
Line 9 and 10: This is just an infinite loop because we are only flashing LED and there is nothing else to do but wait for clock interrupts, however this could potentially be doing other useful things
FUNCTIONS
Functions/methods/procedures/sub-routines:
A callable block of organised, re-usable code
Typically single action
accepts arguments (ie parameters)
Eg in C:
int add(int x, inty)
{
int sum = x + y;
return(sum);
}
FUNCTION BASICS
When a function is called:
Arguments need to be placed somewhere the function can access
program control shifts to the function's instructions
When a function completes:
Return value needs to placed somewhere for the calling function to retrieve
program control shifts back to the instruction immediately after where it was called from
Managing this requires a some house keeping needed !
High level programming languages hide most of this !
Not ASM!
FUNCTIONS IN ASM
Not 'native' to assembly
We need to do a lot of the management ourselves
Argument passing:
How do we pass arguments from one function to another
Storing and recalling register values
each function we call will want to use the same registers (only 13 general purpose registers !)
How do we manage this?
Managing the program control
Jumping from one function to another, and then returning back !
RegIster MANAGEMENT
Application Binary Interface (ABI) sets standard way of using ARM registers.
ro-r3 used for function arguments and return values
r4-r12 promised not to be altered by functions
Ir and sp used for stack management
pc is the next instruction - we can use it to exit a function call
ABI
Register
Brief
Preserved
Rules
ro
Argument and result
No
r0 and r1 are used for passing the first two arguments to functions, and returning the results
r1
Argument and result
No
of functions. If a function does not use them for a return value, they can take any value after a function.
r2
Argument
No
r2 and r3 are used for passing the second two arguments to functions. There values after a
r3
Argument
No
function is called can be anything.
CALLING FUNCTIONS
By convention, the first two function arguments are loaded into r0 and r1.
The next two are put into r2 and r3.
The return value of the function is written into r0 and r1 (lowest word in ro).
The function promises not to alter r4-r12.
… but suppose the function needs to use many registers to do calculations...
SUMMARY FUNCTION
Functions are the building blocks of programs:
Organised, re-usable blocks of code
Higher level programming languages have built in support for functions:
Not ASM!
One thing we need to manage is register use
Application Binary Interface (ABI) defines conventions for the use of registers
Next lecture:
How do we store and recall register values ? With a stack of course !
FUNCTIONS IN ASM
Not 'native' to assembly
We need to do a lot of the management ourselves
Argument passing:
How do we pass arguments from one function to another
Storing and recalling register values
each function we call will want to use the same registers (only 13 general purpose registers !)
How do we manage this?
Managing the program control
Jumping from one function to another, and then returning back !
PUSH, POP AND THE STACK
ARM computers have a software stack*.
A separate area of RAM is available for temporary values.
A value in a register can be pushed onto the stack to preserve it for later.
It can be popped off later (in LIFO order).
We can get the memory location (a pointer to it) by checking the SP (R13) register.
SOFTWARE STACK?
A section of RAM managed by the SP (stack pointer)
register.
A sort of 32-bit (64-bit in ARM8) wide array which starts (element 0) high in RAM and grows down as values are added to it.
The stack pointer stores the memory location of the last value added (pushed) to the stack.
Each push decrements SP by 4 (4 bytes per word).
A pop operation removes the last value in the stack and increments the SP by 4 (4 bytes per word)
Software stack (depth only limited by RAM)
Example:
x = don’t care.
1.[SP] points to start of stack.
2.push 5 – SP decremented by 4. [SP] points to 5
3.push 7 - SP decremented by 4, [SP] points to 7
4.push 0 – SP decremented by 4. [SP] points to 0
5.pop 0 – SP incremented by 4. [SP] points to 7
6.pop 7 – SP incremented by 4. [SP] points to 5
7.push 2 – SP decremented by 4. [SP] points to 2.
8.pop 2 – SP incremented by 4. [SP] points to 5.
pop 5 - SP incremented back to it’s starting value.
The stack is now empty. [SP] points to end of stack.
EXAMPLE SYNTAX
Push and pop accept multiple registers if in a 1, ,,...y list
push {r4,r5} ;back them up onto the stack ;use r4 and r5 for something else pop {r4,r5} ;restore them from the stack
Alternatively, do one at a time (but pop in reverse order)
stack: Correct order is preserved for {lists}
RECALL THE ABI
Application Binary Interface (ABI) sets standard way of using ARM registers.
r0-r3 used for function arguments and return values
r4-r12 promised not to be altered by functions
Ir and sp used for stack management
pc is the next instruction - we can use it to exit a function call
ABI CONVENTIONS
ABI compliant functions:
Use ro-r3 for passing and returning values to functions
Promise not to alter r4-r12
... but suppose the function needs to use many registers to do calculations ??
We can use the stack to store and recall register values !
PassIng ARGUMENTS TO FUNCTIONS
To re-use the registers we need to:
Back up registers we need to re-use in a function
Store arguments for the function in r0-r3
Call the function
Read the return values from r0-r1 (optional)
Restore the registers we backed up.
SUMMARY
Software Stack:
Dedicated RAM used to store values FILO
Special register "sp" used to store address of start of the stack
Stacks allow us to store and recall register values efficiently
Stacks integral to functions:
We need to store and recall register values so we don't run out of registers to use!
FUNCTIONS IN ASM
Not 'native' to assembly
We need to do a lot of the management ourselves
Argument passing:
How do we pass arguments from one function to another
Storing and recalling register values
each function we call will want to use the same registers (only 13 general purpose registers !)
How do we manage this?
Managing the program control
Jumping from one function to another, and then returning back !
RECALL OUR “FLASHING LED” PROGRAM
1 mov r0, #.green
2 mov r1, #.white
3 mov r2, #1 ; 1sec delay time
4 flash:
5 str r0, Pixel367 ; flash on
6 LDR r3, .Time ; start time
7 timer1:
8 LDR r4, .Time ; current time
9 sub r5, r4, r3 ; elapsed time = current time - start time
10 CMP r5, r2
11 BLT timer1
12 str r1, Pixel367 ; flash off
13 LDR r3, .Time ; start time
14 timer2:
15 LDR r4, .Time ; current time
16 sub r5, r4, r3 ; elapsed time = current time - start time
17 CMP r5, r2
18 BLT timer2
19 B flash
20 halt
DEFINING FUNCTIONS
We're obviously duplicating code in this example, which in general we want to avoid
Why not write a function to do this !
Imagine we had already written a function called “delay”:
1 mov r0, #.green
2 mov r1, #.white
3 mov r2, #1 ; 1sec delay time
4 flash:
5 str r0, Pixel367 ; flash on
6 LDR r3, .Time ; start time
;BL delay: Calling function “delay” Program control jumps to Instruction
;address represented by the label delay
7 BL delay ; call delay function
8 str r1, Pixel367 ; flash off - str: Once the function is complete,
;program control returns to instruction after function call
9 LDR r3, .Time ; start time
10 BL delay ; call delay function
11 B flash
12 halt
Now we have replaced the duplicated code with a function call
KEY REGISTERS
Program counter (pc, also r15):
Holds the address of the next instruction to execute
Link Register (Ir, also r14):
Holds the address of instruction to return to after a function is complete
HOW ARE THEY USED FOR FUNCTION CALLS?
Program counter (рс):
Is updated when a branch to label (BL) is encountered
Link Register (Ir):
holds what was in po register before it was changed
i.e., address of the next instruction after the function call
brings us back to where we came from (we'd be lost otherwise!
HELPFUL INSTRUCTION - BL
BL label$:
causes program control to jump to labelS,
but also ....
copies next instruction to Ir so we know how to get back!
DEFINING FUNCTIONS
Imagine we had already written a function called “delay”:
7 BL delay
10 BL delay
BL delay sets: • the PC register to be address of delay • the LR to register to the address of next instruction
DEFINING DELAY FUNCTION
So far we have only talked about how we might call a delay function
Lets now think about writing the actual function
Lets the write the function so that it takes a single argument:
The number of seconds to delay
DELAY FUNCTION
delay: ;Label defining start of function in memory
push {R3,R4,R5,R6} ;begin Push all registers we are about to use onto stack
;so we can restore them at the completion of function
MOV R3, R0 ; move delay time param into R3
LDR R4, .Time ; get start time
timer:
LDR R5, .Time ; update time
SUB R6, R5, R4 ; calc elapsed time
CMP R6, R3 ; check elapsed time
BLT timer
pop {R3,R4,R5,R6} ;end Push all registers we are about to use onto stack
;so we can restore them at the completion of function
RET
MOV R3, R0: This is where we accept the parameter passed in via R0. We move it into a register to work with locally
LDR R4, .Time ; get start time
timer:
LDR R5, .Time ; update time
SUB R6, R5, R4 ; calc elapsed time
CMP R6, R3 ; check elapsed time
BLT timer
This implements the Delay just like we saw in (the “better dumb timer”)
RET
An ARMlite instruction specifically for returning from Functions. It makes sure the Program Counter register is updated using the address in Link Register
BL AND RET
BL and RET instructions work as a pair
BL (Branch to Label) essentially does this:
MOV LR,PC; copy current next instruction to LR
MOV PC, #delay; set PC to make "delay" the next instruction
MOV LR,PC; copy current next instruction to LR
We do this so we know where to return to after the function is finished
MOV PC, #delay; set PC to make "delay" the next instruction
We do this so we know where to jump to next to execute the function
BL AND RET
RET (ie., RETurn), is how we exit functions, and return where we came from.
RET essentially does this:
MOV PC, LR
It copies the original next instruction (ie., the one we first copied into LR before we called the function) back to PC so that it is the next instruction executed.
WHOLE PROGRAM Program Counter and Link Register
1 mov R0, #.green
2 mov R1, #.white
3 mov R2, #1 ; 1sec delay time
4 flash:
5 str R0, Pixel367 ; flash on
6 LDR R3, .Time ; start time
7 push {R0}
8 MOV R0, R2
9 BL delay ; call delay function
10 pop {R0}
11 str R1, Pixel367 ; flash off
12 LDR R3, .Time ; start time
13 push {R0}
14 MOV R0, R2
15 BL delay ; call delay function
16 pop {R0}
17 B flash
18 halt
19 delay:
20 push {R3,R4,R5,R6}
21 MOV R3, R0 ; move delay time param into R3
22 LDR R4, .Time ; get start time
23 timer:
24 LDR R5, .Time ; update time
25 SUB R6, R5, R4 ; calc elapsed time
26 CMP R6, R3
27 BLT timer
28 pop {R3,R4,R5,R6}
29 RET
SUMMARY Program Counter and Link Register
Function calls require branching to a different instruction address
Use bl to branch to a label
Use RET to return back to calling code
Program counter (pc) and Link Registers:
pc: address of the next instruction to execute
Ir: address of instruction to return to after a function is complete
FUNCTIONS IN ASM
Managing the program control
Jumping from one function to another, and then returning back!
RECALL OUR “FLASHING LED” PROGRAM
1 mov R0, #.green
2 mov R1, #.white
3 mov R2, #1 ; 1sec delay time
4 flash:
5 str R0, .Pixel367 ; flash on
6 LDR R3, .Time ; start time
7 push {R0}
8 MOV R0, R2
9 BL delay ; call delay function
10 pop {R0}
11 str R1, .Pixel367 ; flash off
12 LDR R3, .Time ; start time
13 push {R0}
14 MOV R0, R2
15 BL delay ; call delay function
16 pop {R0}
17 B flash
18 halt
19 delay:
20 push {R3,R4,R5,R6}
21 MOV R3, R0 ; move delay time param into R3
22 LDR R4, .Time ; get start time
23 timer:
24 LDR R5, .Time ; update time
25 SUB R6, R5, R4 ; calc elapsed time
26 CMP R6, R3 ; compare elapsed to delay time
27 BLT timer
28 pop {R3,R4,R5,R6}
29 RET
Last lecture we created a function to call for delays
Line 9 and 11, between line 19 and 29
Between 4 and 18 and inside that: between 5 and 10 and between 11 and 16
But we can decompose this further!
Let's write a flash function, and pass it a colour to flash, and a time delay between flashes.
The only difference between these blocks is the colour of the pixel being drawn.
We can write this as a function and pass the colour (and delay time) as parameters
DRAW PIXEL FUNCTION
drawpixel:
PUSH (R3, R4}
MOV R3, RO ; copy pixel colour to R3
MOV R4, R1 ; copy delay time to R4 STR R3, .Pixel367; draw pixel with colour
PUSH {RO}
MOV RO, R4 ; pass delay time to delay function
BL delay ; call delay function
Pop {RO} RET ; job done
FLASH LOOP (NOW CALLING DRAWPIXEL)
mov R2, #1 ; 1sec delay time
flash:
MOV R0, #.green ; pass green to drawpixel
MOV R1, R2 ; pass time delay
BL drawpixel ; "flash on“
MOV R0, #.white ; pass white to drawpixel
MOV R1, R2 ; pass time delay
BL drawpixel ; "flash off"
B flash
HALT
DRAW PIXEL FUNCTION (FIXED)
drawpixel:
PUSH {R3,R4}
MOV R3, R0 ; copy pixel colour to R3
MOV R4, R1 ; copy delay time to R4
STR R3, .Pixel367 ; draw pixel with colour
PUSH {R0}
MOV R0, R4 ; pass delay time to delay function
PUSH {LR} ; backup LR before BL delay overwrites it!
BL delay ; call delay function
POP {LR} ; restore LR after we’ve returned from delay
Pop {R0}
RET ; job done
SUMMARY - Nested Function Calls
All of this is interesting because this is exactly what compilers have to handle when translating your code in higher level languages
Critical components include LR and PC registers,
But also the stack !
The stack allows functions to do what they do using all registers, and then restores what was there previously
This is also critical to interrupt handling
we'll come back to this
Array Basics
Array basics
Arrays are dedicated blocks of memory for storing lists of values of the same type (ie size)
Almost all programming languages support arrays:
E.G in C programming:
int somearray|10];
Defines a block of memory referred to as "somearray", which holds 10 integer (32 bit)
Array basics
int somearray[10];
Using memory in ARMlite, the above could translate to a block of memory like this.
“somearray” refers to an address, marking the start of the allocated block
somearray[3] = 5;
How does this translate to ARM assembly ?
We simply define a label!
int somearray[10];
somearray: .Word
0
0
0
0
0
0
0
0
0
0
somearray[3] = 5;
We have the address of the array which needs to be stored in a register
MOV R0, #somearray
somearray[3] = 5;
We have an index to a 32 bit integer within the array.
Address of this value will be 3 x 4 = 12 bytes from start of array
We can store this in a register
MOV R0, #somearray
MOV R1, #12
somearray[3] = 5;
We can keep the value to store in another register
Only need to convert if read character is actually uppercase alphabetic
Convert only if ASCIl value is between 65 and 90
If character is uppercase alphabetic:
Add 32 to the value
Write the new value back to the array
If character is not uppercase alphabetic:
Leave as is (i.e. skip the above)
Determine if its uppercase alphabetic
toLower: // start of loop
LDRB R2, [R0+R1] // read byte from array
CMP R2, #65 // check lower value limit
BLT skipConversion
CMP R2, #90 // check upper value limit
BGT skipConversion
skipConversion: // jump to here if value not uppercase
ADD R1,R1,#1 // increment index
CMP R2, #0
BNE toLower
if it’s uppercase, convert to lowercase
toLower: // start of loop
LDRB R2, [R0+R1] // read byte from array
CMP R2, #65 // check lower value limit
BLT skipConversion
CMP R2, #90 // check upper value limit
BGT skipConversion
ADD R2,R2,#32 // convert to lowercase value
STRB R2, [R0+R1] // write it back to the array
skipConversion: // jump to here if value not uppercase
ADD R1,R1,#1 // increment index
CMP R2, #0
BNE toLower
All together
MOV R0, #stringData // store array base address
STR R0, .ReadString // read in input
MOV R1, #0 // initialise index
toLower: // start of loop
LDRB R2, [R0+R1] // read byte from array
CMP R2, #65 // check lower value limit
BLT skipConversion // if lower than uppercase range, skip conversion
CMP R2, #90 // check upper value limit
BGT skipConversion // if greater than uppercase range, skip conversion
ADD R2,R2,#32 // convert to lowercase value
STRB R2, [R0+R1] // write it back to the array
skipConversion: // branch destination if value was not uppercase
ADD R1,R1,#1 // increment index
CMP R2, #0 // check if NULL character
BNE toLower // Keep looping if not NULL character
STR R0, .WriteString // write converted string to output
HALT
.Align 256
stringData: .BLOCK 128