Lecture notes on RISC-V assembly


Learning Objectives

  • Be able to solve a problem using integer assembly instructions.
  • Be able to solve a conditional statement using branches.
  • Understand the various parts of assembly source code.
  • Understand which assembly sections store which data.
  • Understand the load and store instructions and data sizes.
  • Understand how to use the stack for local storage.

RISC-V Assembly

RISC-V assembly is like all other assembly and resembles MIPS assembly. Just like all assembly, we have a list of instructions that incrementally get us closer to our solution.

We will be using the riscv-g++ compiler and linking C++ files with assembly files. You will write the assembly files, and the C++ files help make the lab a little bit easier.
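
As a rough sketch of how the two kinds of files meet: the C++ side normally declares the assembly routine with C linkage, so the linker looks for a plain, unmangled label. The name add_two is hypothetical, and the C++ body below only stands in for a routine that would normally live in a .S file.

```cpp
#include <cassert>

// Hypothetical routine. In a lab it would be written in an assembly file
// under the label add_two, and only this declaration would be in C++.
extern "C" long add_two(long a, long b);

// Stand-in definition so this sketch links without an assembly file.
// The assembly version would add a0 and a1 and leave the sum in a0.
extern "C" long add_two(long a, long b) {
    return a + b;
}

// The caller cannot tell whether add_two came from C++ or assembly.
long sum_three(long x, long y, long z) {
    return add_two(add_two(x, y), z);
}

// Usage example.
void example_linkage() {
    assert(sum_three(1, 2, 3) == 6);
}
```

Without extern "C", the C++ compiler would mangle the name, and the label in the assembly file would no longer match.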

Assembly Files

Assembly files end in .S (capital S). The compiler includes all stages of compiling, assembling, and linking, but when we pass a file with a capital S, the compiler will skip right to the assembling stage. The capital S still allows us to use the pre-processor, whereas a lowercase s skips that as well.

vim myfile.S

RISC-V Register File

RISC-V contains 32 integer registers and 32 floating point registers. Through the ABI names, we reserve some of these registers for certain purposes. For example, all registers that start with a t for temporary can be used for any purpose. All registers that start with an a for argument are used for arguments passed to a function. All registers that start with s (except sp) for saved are registers that are preserved across function calls.
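
To make the naming concrete, here is a small lookup sketch that maps the integer register numbers x0 through x31 to their ABI names in the standard convention. The function names abi_name and callee_saved are my own, chosen for illustration.

```cpp
#include <cassert>
#include <string>

// ABI name for integer register x0..x31 in the standard RISC-V convention.
std::string abi_name(int x) {
    assert(x >= 0 && x < 32);
    static const char* const names[32] = {
        "zero", "ra", "sp", "gp", "tp", "t0", "t1", "t2",
        "s0", "s1", "a0", "a1", "a2", "a3", "a4", "a5",
        "a6", "a7", "s2", "s3", "s4", "s5", "s6", "s7",
        "s8", "s9", "s10", "s11", "t3", "t4", "t5", "t6"};
    return names[x];
}

// True for registers a function must restore before returning
// (sp and s0..s11). Temporaries, arguments, and ra may be clobbered.
bool callee_saved(int x) {
    assert(x >= 0 && x < 32);
    return x == 2 || x == 8 || x == 9 || (x >= 18 && x <= 27);
}
```

For example, abi_name(10) gives "a0", the first argument register, and callee_saved(5) is false because x5 is the temporary t0.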

Integer Instructions

RISC-V contains integer and logic instructions as well as a few memory instructions. RISC-V is a load/store architecture, so integer instruction operands must be registers.

Example               Description
lb t0, 8(sp)          Loads (dereferences) from memory address (sp + 8) into register t0. lb = load byte, lh = load halfword, lw = load word, ld = load doubleword.
sb t0, 8(sp)          Stores the value of register t0 into memory address (sp + 8). sb = store byte, sh = store halfword, sw = store word, sd = store doubleword.
add a0, t0, t1        Adds the value of t0 to the value of t1 and stores the sum into a0.
addi a0, t0, -10      Adds the value of t0 to the value -10 and stores the sum into a0.
sub a0, t0, t1        Subtracts the value of t1 from the value of t0 and stores the difference in a0.
mul a0, t0, t1        Multiplies the value of t0 by the value of t1 and stores the product in a0.
div a1, s3, t3        Divides the value of s3 (numerator) by the value of t3 (denominator) and stores the quotient into the register a1.
rem a1, s3, t3        Divides the value of s3 (numerator) by the value of t3 (denominator) and stores the remainder into the register a1.
and a3, t3, s3        Performs bitwise AND on operands t3 and s3 and stores the result into the register a3.
or a3, t3, s3         Performs bitwise OR on operands t3 and s3 and stores the result into the register a3.
xor a3, t3, s3        Performs bitwise XOR on operands t3 and s3 and stores the result into the register a3.
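
The data sizes in the load and store rows can be modeled in C++. The sketch below assumes a 64-bit, little-endian RISC-V target and treats memory as a byte array. The helper names are hypothetical and only mimic lb, lw, and sb.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Models lb: read one byte at base + offset and sign-extend it to 64 bits.
int64_t load_byte(const uint8_t* base, std::size_t offset) {
    assert(base != nullptr);
    return static_cast<int8_t>(base[offset]);
}

// Models lw: read four little-endian bytes and sign-extend to 64 bits.
int64_t load_word(const uint8_t* base, std::size_t offset) {
    assert(base != nullptr);
    uint32_t raw = 0;
    for (int i = 3; i >= 0; --i) {
        raw = (raw << 8) | base[offset + i];
    }
    return static_cast<int32_t>(raw);
}

// Models sb: store only the low byte of the register value.
void store_byte(uint8_t* base, std::size_t offset, int64_t value) {
    assert(base != nullptr);
    base[offset] = static_cast<uint8_t>(value);
}
```

The instruction picks the size, not the register: the same t0 can receive a byte, a word, or a doubleword depending on which load is used.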

Since RISC-V is a reduced instruction set, many instructions that can be accomplished by using another instruction are left out. For example, the neg a0, a1 (two's complement) instruction does not exist. However, it is equivalent to sub a0, zero, a1. In other words, 0 - a1 is the same as -a1.
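
A quick way to convince yourself of this is to write both forms in C++. This is only an illustration with made-up function names. Unsigned arithmetic is used so that wraparound is well defined.

```cpp
#include <cassert>
#include <cstdint>

// neg rd, rs written as sub rd, zero, rs: compute 0 - rs.
int64_t neg_via_sub(int64_t rs) {
    return static_cast<int64_t>(uint64_t{0} - static_cast<uint64_t>(rs));
}

// Two's complement negation written the other common way: invert, add 1.
int64_t neg_via_invert(int64_t rs) {
    return static_cast<int64_t>(~static_cast<uint64_t>(rs) + 1);
}

// Usage example: both forms agree.
void example_neg() {
    assert(neg_via_sub(5) == -5);
    assert(neg_via_invert(5) == neg_via_sub(5));
}
```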

Pseudo Instructions

The assembler provides many pseudoinstructions, which expand into real instructions. For example, neg above is a pseudoinstruction. Whenever the assembler reads this instruction, it automatically expands it into the sub instruction. Below is a list of pseudoinstructions and their functions.
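
A few common expansions, written as a small C++ lookup table. This is a partial sketch rather than the assembler's full list. The table layout and the names pseudo_table and expand are my own, and rd, rs, rt, and label are placeholders.

```cpp
#include <cassert>
#include <map>
#include <string>

// A few pseudoinstructions and the real instruction each one stands for.
const std::map<std::string, std::string>& pseudo_table() {
    static const std::map<std::string, std::string> table = {
        {"nop",               "addi zero, zero, 0"},
        {"mv rd, rs",         "addi rd, rs, 0"},
        {"not rd, rs",        "xori rd, rs, -1"},
        {"neg rd, rs",        "sub rd, zero, rs"},
        {"j label",           "jal zero, label"},
        {"ret",               "jalr zero, 0(ra)"},
        {"bgt rs, rt, label", "blt rt, rs, label"},
        {"ble rs, rt, label", "bge rt, rs, label"},
    };
    return table;
}

// Look up an expansion; returns an empty string for unknown entries.
std::string expand(const std::string& pseudo) {
    auto it = pseudo_table().find(pseudo);
    return it == pseudo_table().end() ? std::string() : it->second;
}

// Usage example: neg becomes a subtraction from the zero register.
void example_pseudo() {
    assert(expand("neg rd, rs") == "sub rd, zero, rs");
}
```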

Floating Point Instructions

The floating point instructions are prefixed with an f, such as fld and fsw, for floating-point load doubleword and floating-point store word, respectively. The floating point instructions come in two flavors: (1) single-precision and (2) double-precision. You can select which data size you want by adding a suffix, which is either .s (for single-precision) or .d (for double-precision).

# Load a single-precision value
flw     ft0, 0(sp)
# ft0 now contains whatever we loaded from memory address sp + 0
flw     ft1, 4(sp)
# ft1 now contains whatever we loaded from memory address sp + 4
fadd.s  ft2, ft0, ft1
# ft2 is now ft0 + ft1

Notice in the code above, we used the fadd.s instruction to tell the RISC-V processor to add two single-precision values (ft0 and ft1) and store the sum as a single-precision value into ft2.

We can convert between double and single precision using the instructions fcvt.d.s (convert from single to double) or fcvt.s.d (convert from double to single).
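
In C++ terms, .s corresponds to float and .d to double. The sketch below uses hypothetical function names that mirror the instruction mnemonics. It shows that widening is exact while narrowing may round.

```cpp
#include <cassert>

// Models fadd.s: both operands and the result are single precision.
float fadd_s(float a, float b) { return a + b; }

// Models fcvt.d.s: widening a float to a double is exact.
double fcvt_d_s(float a) { return static_cast<double>(a); }

// Models fcvt.s.d: narrowing a double rounds to a representable float.
float fcvt_s_d(double a) { return static_cast<float>(a); }

// Usage example.
void example_float() {
    assert(fadd_s(1.5f, 2.25f) == 3.75f);
    // A float survives a round trip through double unchanged.
    assert(fcvt_s_d(fcvt_d_s(0.1f)) == 0.1f);
    // 0.1 as a double carries more digits than a float can hold.
    assert(fcvt_d_s(fcvt_s_d(0.1)) != 0.1);
}
```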

Branching Instructions

Branching instructions are a way to jump to different parts of your code. If we didn't have branching instructions, the CPU would only be able to execute one instruction after another. With jumps and branches, we can go to any instruction, even out of order!

Branching instructions are how function calls and conditionals are implemented in assembly. Branching refers to the "conditional jump" instructions, such as beq, bne, bgt, bge, blt, ble for branch-if equals, not equals, greater than, greater than or equals, less than, and less than or equals, respectively. (bgt and ble are pseudoinstructions: the assembler swaps the operands and emits blt and bge.)

The branching instructions take three parameters: the two operands (registers) to compare, and then, if that comparison holds true, a memory label of the instruction you want to execute next. If the branch condition is false, the branch instruction is ignored and the CPU goes to the next instruction below.

# t0 = 0
li      t0, 0
li      t2, 10
loop_head:
bge     t0, t2, loop_end
# Repeated code goes here
addi    t0, t0, 1
j       loop_head
loop_end:

The assembly code above implements the following C++ loop.

for (int i = 0; i < 10; i++) {
    // Repeated code goes here.
}

Notice that I used the "opposite" view of the condition. In a for loop, as long as the condition holds true, we execute the body of the loop. In assembly, I took the opposite. I'm saying if t0 is greater than or equal to t2 (>= is the opposite of <), then jump OUT of the loop and be done.

Taking the opposite view can save us some instructions.
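
The same shape can be written in C++ with goto, which makes the opposite test and the jump back to the top explicit. This is only a teaching sketch with a hypothetical function name. The variables are named after the registers they stand for.

```cpp
#include <cassert>

// Counts loop iterations using the same shape as the assembly:
// test the opposite condition at the top, jump back at the bottom.
int count_iterations(int limit) {
    int body_runs = 0;
    int t0 = 0;                      // li   t0, 0
    int t2 = limit;                  // li   t2, <limit>
loop_head:
    if (t0 >= t2) goto loop_end;     // bge  t0, t2, loop_end
    ++body_runs;                     // repeated code goes here
    t0 = t0 + 1;                     // addi t0, t0, 1
    goto loop_head;                  // j    loop_head
loop_end:
    return body_runs;
}

// Usage example: a limit of 10 matches the loop in the notes.
void example_loop() {
    assert(count_iterations(10) == 10);
}
```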

Using the Stack

The stack is used for local memory storage. The stack grows from bottom (high memory) to top (low memory), and the top of the stack (the lowest address currently in use) has a dedicated register called sp for stack pointer.

Whenever we use the saved registers or if we want to preserve a temporary register across a function call, we must save it on the stack. To allocate from the stack, we subtract. To deallocate, we add. Notice we don't "clean" the stack. This is why uninitialized variables in C++ are considered "garbage", since anything left on the stack is still there.

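The stack MUST be aligned to 8, which means we must always subtract and add a multiple of 8 from/to the stack pointer. (The standard RISC-V calling convention is stricter: it keeps sp aligned to 16 bytes at function calls.)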

addi    sp, sp, -8
sd      ra, 0(sp)
call    printf
ld      ra, 0(sp)
addi    sp, sp, 8

The code above saves the return address on the stack, calls printf, and then when printf returns, we load the old value of the return address back off the stack, and then deallocate by adding 8.
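
Here is a toy model of that allocate, store, load, deallocate pattern, assuming 8-byte registers. ToyStack and its member names are hypothetical. The stack is a small byte array and sp is an index into it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// A toy stack: memory is a byte array and sp is an index into it.
struct ToyStack {
    uint8_t memory[64] = {};
    std::size_t sp = sizeof(memory);   // starts at the high end

    // addi sp, sp, -bytes
    void allocate(std::size_t bytes) {
        assert(bytes % 8 == 0 && bytes <= sp);
        sp -= bytes;
    }
    // addi sp, sp, bytes (the bytes themselves are left untouched)
    void deallocate(std::size_t bytes) {
        assert(bytes % 8 == 0 && sp + bytes <= sizeof(memory));
        sp += bytes;
    }
    // sd value, offset(sp)
    void store_dword(std::size_t offset, uint64_t value) {
        assert(sp + offset + sizeof(value) <= sizeof(memory));
        std::memcpy(&memory[sp + offset], &value, sizeof(value));
    }
    // ld from offset(sp)
    uint64_t load_dword(std::size_t offset) const {
        assert(sp + offset + sizeof(uint64_t) <= sizeof(memory));
        uint64_t value;
        std::memcpy(&value, &memory[sp + offset], sizeof(value));
        return value;
    }
};
```

If you allocate 8 bytes, store a value, deallocate, and allocate again, load_dword(0) still returns the old value. That leftover is the "garbage" an uninitialized local would see.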

C++ to Assembly Conversion

A compiler's job is to transform .cpp files into assembly files, where an assembler will assemble an assembly file into machine code as an object file. A linker then links all object files together into an executable or into a library.

We know that our C++ code boils down into assembly, so whatever we can do in C++, we can do in assembly. I've shown an example above of how to write a for loop, but let's take a look at the other C++ constructs.


Functions are just a memory label to the very first instruction. The application binary interface (ABI) specifies which registers get which parameters and how to return things. However, all functions have a prologue, which is mainly creating a stack frame for local storage, and an epilogue, which usually includes loading saved registers and the return address and moving the stack pointer before returning.

void my_function();

my_function:
    # Prologue
    addi    sp, sp, -32
    sd      ra, 0(sp)
    sd      a0, 8(sp)
    sd      s0, 16(sp)
    sd      s1, 24(sp)

    # Function body goes here

    # Epilogue
    ld      ra, 0(sp)
    ld      a0, 8(sp)
    ld      s0, 16(sp)
    ld      s1, 24(sp)
    addi    sp, sp, 32
    ret

This code shows that we first allocate 32 bytes from the stack, which is the size of 4 registers. You can see that I subtract all of the necessary space off of the stack first, store the values, run my code, and then do the epilogue. This is the main purpose of the offset in the store and load instructions.

Another thing to note is that I'm storing caller-saved registers (ra and a0). Once again, after we call another function we must consider all caller-saved registers to be destroyed. That includes all temporary, argument, and return address registers. I also saved the saved registers s0 and s1 above; remember, if we use the saved registers, we are required to put their original values back in them before we return.

We need one prologue and one epilogue. When we call other functions, we need our stack to be framed. In programming languages courses, you will hear about stack frames. So, we allocate ourselves ALL of the space necessary for the function, then store to it.
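
One way to picture the frame is as a C++ struct whose members sit at the same offsets as the sd and ld instructions use. This sketch assumes 8-byte (64-bit) registers. Frame and frame_bytes are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Layout of the 32-byte frame built by the prologue in the notes.
// Each member sits at the offset used by the matching sd/ld.
struct Frame {
    uint64_t ra;   //  0(sp)
    uint64_t a0;   //  8(sp)
    uint64_t s0;   // 16(sp)
    uint64_t s1;   // 24(sp)
};

static_assert(sizeof(Frame) == 32, "four 8-byte registers");
static_assert(offsetof(Frame, ra) == 0, "ra at 0(sp)");
static_assert(offsetof(Frame, s1) == 24, "s1 at 24(sp)");

// Bytes to subtract from sp for n saved 8-byte registers,
// rounded up to a multiple of align.
std::size_t frame_bytes(std::size_t n, std::size_t align) {
    assert(align != 0);
    std::size_t raw = n * 8;
    return (raw + align - 1) / align * align;
}
```

With four registers, frame_bytes(4, 8) is 32, matching the addi sp, sp, -32 in the prologue.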

    bne     t0, zero, 1f
    # Code goes here if t0 == 0
    j       2f
1:
    bne     t1, zero, 1f
    # Code goes here if t1 == 0
    j       2f
1:
    # Code goes here if t0 != 0 and t1 != 0
2:
    # Dumping point is here.

The assembly code above mimics the following C++ code.

if (!t0) {
    // Code goes here if t0 == 0
}
else if (!t1) {
    // Code goes here if t1 == 0
}
else {
    // Code goes here if t0 != 0 and t1 != 0
}
// Dumping point is here.

If you don't remember, the label 1f means to go to the numeric label 1 FORWARD of the given location. This is the opposite of 1b, which looks for a numeric label 1 BACKWARDS of the given location.
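
The branch structure can also be traced in C++ with goto. C++ labels must be unique, so the two forward 1 labels get distinct hypothetical names here. The function returns which block ran so the flow is easy to check.

```cpp
#include <cassert>

// Returns 0, 1, or 2 for the three blocks of the if / else if / else
// chain, with gotos placed where the assembly has branches and jumps.
int pick_branch(long t0, long t1) {
    int taken = -1;
    if (t0 != 0) goto first_1;     // bne t0, zero, 1f
    taken = 0;                     // code for t0 == 0
    goto label_2;                  // j   2f
first_1:
    if (t1 != 0) goto second_1;    // bne t1, zero, 1f
    taken = 1;                     // code for t1 == 0
    goto label_2;                  // j   2f
second_1:
    taken = 2;                     // code for t0 != 0 and t1 != 0
label_2:
    return taken;                  // dumping point
}

// Usage example: both values non-zero falls through to the else block.
void example_branch() {
    assert(pick_branch(3, 4) == 2);
}
```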

Using Printf

Printf requires that the first parameter be a C-style, null-terminated string, which we can create using the .asciz assembler directive. The following code gives an example of how to use printf.

.section .rodata
prompt: .asciz "Value of t0 = %ld and value of t1 = %ld\n"
.section .text
    addi    sp, sp, -8
    sd      ra, 0(sp)
    la      a0, prompt
    mv      a1, t0
    mv      a2, t1
    call    printf
    ld      ra, 0(sp)
    addi    sp, sp, 8

The code above shows that we put the first parameter to printf in a0, which is the string we want to output. Then we want to output the values of t0 and t1, so these must be moved into the other parameter registers a1 and a2, respectively.

Anytime you see a function call, you need to worry about saving the return address register, like I did above. I might not start off by using the stack, but every time I type "call", my fingers automatically want to start typing something to save the RA (return address) register. Also, make sure you always deallocate before you return!
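
For comparison, this is roughly the C++ call that the assembly performs. The sketch formats into a string with snprintf instead of printing, so the result is easy to inspect. format_values is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Formats the same message as the assembly example. The three arguments
// after the buffer correspond to a0 (format string), a1 (t0), a2 (t1).
std::string format_values(long t0, long t1) {
    char buffer[128];
    int written = std::snprintf(buffer, sizeof(buffer),
                                "Value of t0 = %ld and value of t1 = %ld\n",
                                t0, t1);
    assert(written > 0 &&
           static_cast<std::size_t>(written) < sizeof(buffer));
    return std::string(buffer);
}
```

Calling format_values(3, -4) produces the text the assembly would print with t0 = 3 and t1 = -4.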

Application Binary Interface (ABI)

We have 8 argument registers, a0 through a7. These hold the first 8 NON-FLOAT parameters passed to a function. This includes pointers, in which case aX will contain a memory address, or pass-by-value, in which case aX will contain the actual value. For floating point values only, you will use fa0 through fa7.

The ABI further states that we must return an integer value through a0 or a floating point value through fa0.

If you have a function that mixes integer and floating point, you use whatever number comes first that hasn't been taken. For example, consider the following prototype.

float func(int a, int *b, float c);

This function requires that int a be in the register a0, that int *b (the memory address that b points to) be in a1, and that the value of float c be in fa0. Since we return a float, the result must be put into fa0 before executing the ret instruction.
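
The "next free register of its kind" rule can be sketched as a small C++ function. This is a simplification that assumes a hardware floating point ABI and at most 8 parameters of each kind, with nothing spilling to the stack. Kind and assign_registers are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Kind { Integer, Float };   // pointers count as Integer

// Assigns each parameter the next free register of its class.
std::vector<std::string> assign_registers(const std::vector<Kind>& params) {
    std::vector<std::string> regs;
    int next_int = 0;
    int next_float = 0;
    for (Kind k : params) {
        if (k == Kind::Integer) {
            assert(next_int < 8);
            regs.push_back("a" + std::to_string(next_int++));
        } else {
            assert(next_float < 8);
            regs.push_back("fa" + std::to_string(next_float++));
        }
    }
    return regs;
}
```

For func(int a, int *b, float c), passing Integer, Integer, Float yields a0, a1, fa0, which matches the assignment described above.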


Take note that we use a0, a1, ..., a7. This goes for all sizes: byte, word, doubleword, and so on. Remember that we select the data size by choosing the instruction. For float versus double, we choose instruction.s versus instruction.d. For example, fadd.s fa0, ft0, ft1 adds single-precision values and fadd.d fa0, ft0, ft1 adds double-precision values.
