ArvernOS in 2021Clermont-Fd, France
ArvernOS is a side project I started to learn more about operating systems and low level development. I usually work on it for a few weeks in a row, then pause it for a long time until I come back to it again.
At the begin of the year, I tried to make ArvernOS run on a Raspberry Pi 2. It was challenging because the original code was written for x86_64 and QEMU. It kinda worked and I learned a few things along the way but the code didn’t get merged into the main branch because I wasn’t happy with the changes.
Fast forward to November: I really wanted to work on this project again! I cleaned up the codebase, re-reading pretty much every single line of code I had written so far. That gave me a new perspective on the code and I decided to port the ArvernOS kernel to real hardware again.
New architectures supported
250+ commits later, I was able to run the same code, without hacks, on 3 different architectures (x86_64, AArch32 and AArch64) and 3 different real devices: Raspberry Pi 2, Raspberry Pi 3 and LicheePi Nano. Each feature I had to implement for ARM was an opportunity to transform some of the “legacy” x86_64 code into shared code.
Some implementation details
In order to have a mostly generic kernel, some architecture-specific functions have been identified and documented in the
include/kernel/arch/ folder. Each “target” (that is, architecture and/or board) has to implement these functions. That being said, most of them can be left empty and the kernel will still boot.
The boot sequence is target-specific, i.e. there is no unified bootloader. For x86_64, GRUB and Multiboot2 are used, the LicheePi Nano uses U-Boot (SPL) and the default Raspberry Pi bootloader is used for the RPi boards. On these last 3 boards, ARM tags (ATAGs) are used to pass information because of the lack of a Device Tree parser.
Each target follows the same boot sequence:
- the bootloader calls
boot.asmfor each target. This function contains the super-early initialization code required to execute the C code. For ARM, I tried to follow the ARM Linux convention.
- the assembly code calls the
kmain()C function, which is also architecture-specific. Its main task is to set up the hardware before the generic code gets executed (example).
kmain_start()is invoked. This is the ArvernOS generic kernel code, which ends the boot sequence by running a process (the kernel shell for example).
New build system
The build system has been refactored too, although it is still based on non-recursive Makefiles. A base Makefile, which contains most of the configuration, includes the target architecture Makefile and, optionally, the Makefile for the board. This isn’t too complicated and it works well so far.
Besides that, the biggest change has been the switch to LLVM, which greatly simplified compilation for different architectures.
The documentation has been improved as well and you can browse it at: williamdurand.fr/ArvernOS/. I use Doxygen with a more modern theme and a few scripts and hidden HTML comments in markdown files to improve the final content (e.g., the “Note” cards, TODOs, etc.).
The documentation isn’t perfect yet, in part because writing good docs takes time and also because there are still lots of moving parts. Ideally, all public header files should be extensively documented.
With two new architectures supported and more code written, it was time to refactor the Continuous Integration pipeline as well. There are now more than 20 jobs performing all kind of checks and running tests with different configuration flags.
___ ____ _____ / | ______ _____ _________ / __ \/ ___/ / /| | / ___/ | / / _ \/ ___/ __ \/ / / /\__ \ / ___ |/ / | |/ / __/ / / / / / /_/ /___/ / /_/ |_/_/ |___/\___/_/ /_/ /_/\____//____/ [ 0.000000] ArvernOS 0.0.3 (46839059) for aarch64+raspi3 has started [ 0.000000] kmain: cmdline is 'kshell selftest' [ 0.000000] core: initialize interrupt service routines [ 0.000000] time: initialize timer [ 0.006342] time: initialize clock [ 0.006620] sys: initialize syscalls [ 0.006880] fs: initialize virtual file system [ 0.007368] fs: mount tarfs (init ramdisk) - addr= 0x8000000 [ 0.008999] fs: mount devfs [ 0.010348] fs: mount devfs [ 0.010765] fs: mount sockfs [ 0.012040] kmain: loading kshell... [kernel syscalls] Hello, kshell! [kernel stacktrace] [memory] simple test with malloc()/free(): pointer before malloc() = 0000000000000042 success! p=0000000000400190 and value is: it works now allocating a large chunk of memory... Pointer before malloc() = 0000000000400190 success! [filesystem] This message should be written to the console. [aarch64] Got expected exception level: 1 Hello, syscall! Got expected syscall return value: 42 done. [ 0.079482] powering off system: exit_code=0 Powering off system now...
Currently, only the x86_64 architecture is able to load external programs (that can be found in userland). One of these programs is named
testsuite and the idea is to test the whole OS: the boot sequence, the switch to usermode and some system calls.
This project is an experiment that allows me to learn a lot of things and it also challenges me sometimes. I don’t intend to make it production-ready, fully featured or anything like that. Afterall, it is yet another Unix-like monolithic kernel, built on top of various tutorials, studies of various projects (including the Linux kernel), and trials and errors.
There are, however, things I’d like to get done:
- Multi-tasking: it should be possible to create, run and terminate multiple tasks. I don’t want to go fancy on that, just something that is both simple and functional.
- Usermode: the kernel should run at a higher more privileged level than regular (userland) programs. That kinda works on x86_64 already.
- Virtio drivers: in order to test more kernel code, I’d like to implement some virtio drivers like virtio-net.
- Linux compatibility: I’d like to explore how to execute (simple) unmodified Linux binaries on ArvernOS.
- Graphical User Interface (GUI): from what I’ve seen, that does not look like a fun thing to do because I don’t know much about it but maybe it is? In any case, that’s something I’d like to explore in the future.
- Symmetric Multiprocessing (SMP): this would allow the kernel to leverage all the CPU cores. There are interesting challenges, here, e.g., to avoid synchronization problems.
I have some WIP patches for some of these things and the rest is kinda unknown to me. We’ll see in 2022 I guess.
Feel free to fork and edit this post if you find a typo, thank you so much! This post is licensed under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.