Operating System in 1,000 Lines - RISC-V 101

  1. Intro
  2. Getting Started
  3. RISC-V 101
  4. Overview
  5. Boot
  6. Hello World!
  7. C Standard Library
  8. Kernel Panic
  9. Exception
  10. Memory Allocation
  11. Process
  12. Page Table
  13. Application
  14. User Mode
  15. System Call
  16. Disk I/O
  17. File System
  18. Outro

RISC-V

Just like web browsers hide the differences between Windows/macOS/Linux, operating systems hide the differences between CPUs. In other words, operating system is a program which controls the CPU to provide an abstraction layer for applications.

In this book, I chose RISC-V as the target CPU because:

We will write an OS for 32-bit RISC-V. Of course you can write for 64-bit RISC-V with only a few changes. However, the wider bit width makes it slightly more complex, and the longer addresses can be tedious to read.

QEMU virt machine

Computers are composed of various devices: CPU, memory, network cards, hard disks, and so on. For example, although iPhone and Raspberry Pi use Arm CPUs, it's natural to consider them as different computers.

In this book, we support the QEMU virt machine (documentation) because:

RISC-V assembly 101

RISC-V, or RISC-V ISA (Instruction Set Architecture), defines the instructions that the CPU can execute. It's smilar to APIs or programming language specifications for programmers. When you write a C program, the compiler translates it into RISC-V assembly. Unfortunately, you need to write some assembly code to write an OS. But don't worry! Assembly is not as difficult as you might think.

Try Compiler Explorer!

A useful tool for learning assembly is Compiler Explorer, an online compiler. As you type C code, it shows the corresponding assembly code.

By default, Compiler Explorer uses x86-64 CPU assembly. Specify RISC-V rv32gc clang (trunk) in the right pane to output 32-bit RISC-V assembly.

Also, it would be interesting to specify optimization options like -O0 (optimization off) or -O2 (optimization level 2) in the compiler options and see how the assembly changes.

Assembly language basics

Assembly language is a (mostly) direct representation of machine code. Let's take a look at a simple example:

addi a0, a1, 123

Typically, each line of assembly code corresponds to a single instruction. The first column (addi) is the instruction name, also known as the opcode. The following columns (a0, a1, 123) are the operands, the arguments for the instruction. In this case, the addi instruction adds the value 123 to the value in register a1, and stores the result in register a0.

Registers

Registers are like temporary variables in the CPU, and they are way faster than memory. CPU reads data from memory into registers, does arithmetic operations on registers, and writes the results back to memory/registers.

Here are some common registers in RISC-V:

Register ABI Name (alias) Description
pc pc Program counter (where the next instruction is)
x0 zero Hardwired zero (always reads as zero)
x1 ra Return address
x2 sp Stack pointer
x5 - x7 t0 - t2 Temporary registers
x8 fp Stack frame pointer
x10 - x11 a0 - a1 Function arguments/return values
x12 - x17 a2 - a7 Function arguments
x18 - x27 s0 - s11 Temporary registers saved across calls
x28 - x31 t3 - t6 Temporary registers

Calling convention:

Generally, you may use CPU registers as you like, but for the sake of interoperability with other software, how registers are used is well defined - this is called the calling convention.

For example, x10 - x11 registers are used for function arguments and return values. For human readability, they are given aliases like a0 - a1 in the ABI. Check the spec for more details.

Memory access

Registers are really fast, but they are limited in number. Most of data are stored in memory, and programs reads/writes data from/to memory using the lw (load word) and sw (store word) instructions:

lw a0, (a1)  // Read a word (32-bits) from address in a1
             // and store it in a0. In C, this would be: a0 = *a1;
sw a0, (a1)  // Store a word in a0 to the address in a1.
             // In C, this would be: *a1 = a0;

You can consider (...) as a pointer dereference in C language. In this case, a1 is a pointer to a 32-bits-wide value.

Branch instructions

Branch instructions change the control flow of the program. They are used to implement if, for, and while statements,

    bnez    a0, <label>   // Go to <label> if a0 is not zero
    // If a0 is zero, continue here

<label>:
    // If a0 is not zero, continue here

bnez stands for "branch if not equal to zero". Other common branch instructions include beq (branch if equal) and blt (branch if less than). They are similar to goto in C, but with conditions.

Function calls

jal (jump and link) and ret (return) instructions are used for calling functions and returning from them:

    li  a0, 123      // Load 123 to a0 register (function argument)
    jal ra, <label>  // Jump to <label> and store the return address
                     // in the ra register.

    // After the function call, continue here...

// int func(int a) {
//   a += 1;
//   return a;
// }
<label>:
    addi a0, a0, 1    // Increment a0 (first argument) by 1

    ret               // Return to the address stored in ra.
                      // a0 register has the return value.

Function arguments are passed in a0 - a7 registers, and the return value is stored in a0 register, as per the calling convention.

Stack

Stack is a Last-In-First-Out (LIFO) memory space used for function calls and local variables. It grows downwards, and the stack pointer sp points to the top of the stack.

To save a value into the stack, decrement the stack pointer and store the value (aka. push operation):

    addi sp, sp, -4  // Move the stack pointer down by 4 bytes
                     // (i.e. stack allocation).

    sw   a0, (sp)    // Store a0 to the stack

To load a value from the stack, load the value and increment the stack pointer (aka. pop operation):

    lw   a0, (sp)    // Load a0 from the stack
    addi sp, sp, 4   // Move the stack pointer up by 4 bytes
                     // (i.e. stack deallocation).

In C, stack operations are generated by the compiler, so you don't have to write them manually.

CPU modes

CPU has multiple modes, each with different privileges. In RISC-V, there are three modes:

Mode Overview
M-mode Mode in which OpenSBI (i.e. BIOS) operates.
S-mode Mode in which the kernel operates, aka. "kernel mode".
U-mode Mode in which applications operate, aka. "user mode".

Privileged instructions

Among CPU instructions, there are types called privileged instructions that applications (user mode) cannot execute. In this book, we use the following privileged instructions:

Opcode and operands Overview Pseudocode
csrr rd, csr Read from CSR rd = csr;
csrw csr, rs Write to CSR csr = rs;
csrrw rd, csr, rs Read from and write to CSR at once tmp = csr; csr = rs; rd = tmp;
sret Return from trap handler (restoring program counter, operation mode, etc.)
sfence.vma Clear Translation Lookaside Buffer (TLB)

CSR (Control and Status Register) is a register that stores CPU settings. The list of CSRs can be found in RISC-V Privileged Specification.

Some instructions like sret do some somewhat complex operations. To understand what actually happens, reading RISC-V emulator source code might be helpful. Particularly, rvemu is written in a intuitive and easy-to-understand way (e.g. sret implementation).

Inline assembly

In following chapters, you'll encounter special C language syntax like this:

uint32_t value;
__asm__ __volatile__("csrr %0, sepc" : "=r"(value));

This is "inline assembly", a syntax for embedding assembly into C code. While you can write assembly in a separate file (.S extension), using inline assembly are generally preferred because:

How to write inline assembly

Inline assembly is written in the following format:

__asm__ __volatile__("assembly" : output operands : input operands : clobbered registers);
Part Description
__asm__ Indicates it's an inline assembly.
__volatile__ Tell the compiler not optimize the "assembly" code.
"assembly" Assembly code written as a string literal.
output operands C variables to store the results of the assembly.
input operands C expressions (e.g. 123, x) to be used in the assembly.
clobbered registers Registers whose contents are destroyed in the assembly. If forgotten, the C compiler won't preserve the contents of these registers and would cause a bug.

Output and input operands are comma-separated, and each operand is written in the format constraint (C expression). Constraints are used to specify the type of operand, and usually =r (register) for output operands, and r for input operands.

Output and input operands can be accessed in the assembly using %0, %1, %2, etc., in order starting from the output operands.

Examples

uint32_t value;
__asm__ __volatile__("csrr %0, sepc" : "=r"(value));

This reads the value of the sepc CSR using the csrr instruction, and assigns it to the value variable. %0 corresponds to the value variable.

__asm__ __volatile__("csrw sscratch, %0" : : "r"(123));

This writes 123 to the sscratch CSR, using the csrw instruction. %0 corresponds to the register containing 123 (r constraint), and it would actually look like:

li    a0, 123        // Set 123 to a0 register
csrw  sscratch, a0   // Write the value of a0 register to sscratch register

Although only the csrw instruction is written in the inline assembly, the li instruction is automatically inserted by the compiler to satisfy the "r" constraint (value in a register). It's super convenient!

Inline assembly is a compiler-specific extension not included in the C language specification. You can check detailed usage in the GCC documentation. However, it takes time to understand because constraint syntax differs depending on CPU architecture, and it has many complex functionalities.

For beginners, I recommend to search for real-world examples. For instance, HinaOS and xv6-riscv are good references.