To begin collaborating with others, we've open sourced several components for our secure operating system, called KataOS, on GitHub, as well as partnered with Antmicro on their Renode simulator and related frameworks. As the foundation for this new operating system, we chose seL4 as the microkernel because it puts security front and center; it is mathematically proven secure, with guaranteed confidentiality, integrity, and availability. Through the seL4 CAmkES framework, we're also able to provide statically-defined and analyzable system components. KataOS provides a verifiably-secure platform that protects the user's privacy because it is logically impossible for applications to breach the kernel's hardware security protections and the system components are verifiably secure. KataOS is also implemented almost entirely in Rust, which provides a strong starting point for software security, since it eliminates entire classes of bugs, such as off-by-one errors and buffer overflows.
Operating Systems
Using Rust to build or explore operating systems.
Welcome to a new issue of "This Month in Rust OSDev". In these posts, we give a regular overview of notable changes in the Rust operating system development ecosystem.
We have some new sections this month. We hope you like the content!
During the 2022 Linux Maintainers Summit in Dublin, Linus Torvalds asked CI systems to start testing the new Rust infrastructure. So, with that in mind, we are excited to announce that as of today, Rust testing has now been added to KernelCI!
For more than a decade, memory safety vulnerabilities have consistently represented more than 65% of vulnerabilities across products, and across the industry. On Android, we’re now seeing something different - a significant drop in memory safety vulnerabilities and an associated drop in the severity of our vulnerabilities. This drop coincides with a shift in programming language usage away from memory unsafe languages. Android 13 is the first Android release where a majority of new code added to the release is in a memory safe language.
For the past two decades the RPM package manager software has relied upon its own OpenPGP parser implementation for dealing with package keys and signatures. With Fedora 38 they plan to have their RPM package shifted to use the Rust-written "Sequoia" parser instead.
We have a lot to show since the 0.7.0 release! This release, care has been taken to ensure real hardware is working, i686 support has been added, features like audio and preliminary multi-display support have been enabled, and the boot and install infrastructure has been simplified and made more robust. I highly recommend skimming through the changes listed below before jumping into the images, if you want more details.
There have been a lot of significant changes merged into the mainline for the 6.1 release, but one of the changes that has received the most attention will also have the least short-term effect for users of the kernel: the introduction of support for the Rust programming language. No system with a production 6.1 kernel will be running any Rust code, but this change does give kernel developers a chance to play with the language in the kernel context and get a sense for how Rust development feels. Perhaps the most likely conclusion for most developers, though, will be that there isn't yet enough Rust in the kernel to do much of anything interesting.
My first post-Figma hobby project is a win32 emulator I've called retrowin32. It is now barely capable of executing a few unmodified Windows exe files in a browser (see the site for some links).
There are other projects to run old Windows programs. WoW64 is the name of the system within 64-bit Windows that makes old 32-bit Windows programs run. Wine shims the Windows API onto your host system; see the great How Wine works for a deep dive on what that means. And system emulator projects like qemu emulate a full x86 machine such that you can install Windows onto them. But WoW64 requires running 64-bit Windows, Wine requires x86 hardware, and qemu requires installing the full Windows OS into the emulator to run a Windows program.
In contrast, my toy emulates an x86 and enough of the Windows API to take a plain exe file and run it directly in my browser.
Parts of the Rust language may look familiar to C programmers, but the two languages differ in fundamental ways. One difference that turns out to be problematic for kernel programming is the stability of data in memory — or the lack thereof. A challenging session at the 2022 Kangrejos conference wrestled with ways to deal with objects that should not be moved behind the programmer's back.
The idea of being able to write kernel code in the Rust language has a certain appeal, but it is hard to judge how well that would actually work in the absence of examples to look at. Those examples, especially for modules beyond the "hello world" level of complexity, have been somewhat scarce, but that is beginning to change. At the 2022 Kangrejos gathering in Oviedo, Spain, two developers presented the modules they have developed and some lessons that have been learned from this exercise.
While the Rust language has appeal for kernel development, many developers are concerned by the fact that there is only one compiler available; there are many reasons why a second implementation would be desirable. At the 2022 Kangrejos gathering, three developers described projects to build Rust programs with GCC in two different ways. A fully featured, GCC-based Rust implementation is still going to take some time, but rapid progress is being made.
In addition to ongoing and upcoming efforts to improve detection of memory bugs, we are ramping up efforts to prevent them in the first place. Memory-safe languages are the most cost-effective means for preventing memory bugs. In addition to memory-safe languages like Kotlin and Java, we’re excited to announce that the Android Open Source Project (AOSP) now supports the Rust programming language for developing the OS itself.
We're announcing the start of the Portable SIMD Project Group within the Libs team. This group is dedicated to making a portable SIMD API available to stable Rust users.
For a long time I have been maintaining the build of the Rust compiler and development tools on Haiku. For this purpose, I maintain a separate tree with the Rust source, with some patches and specific build instructions. My ultimate goal is to have Rust build on Haiku from the original source, without any specific patches or workarounds. Instead, we are in a situation where we cannot build Rust on Haiku itself (we need to cross-compile it), and we need a customization to be able to run the Rust compiler (rustc) and package manager (cargo) on Haiku. This summer my goal is to find the underlying issue and fix it so that the patch will no longer be necessary in the future. Let’s go!
The Rust programming language has long aimed to be a suitable replacement for C in operating-system kernel development. As Rust has matured, many developers have expressed growing interest in using it in the Linux kernel. At the 2020 (virtual) Linux Plumbers Conference, the LLVM microconference track hosted a session on open questions about and obstacles to accepting Rust upstream in the Linux kernel. The interest in this topic can be seen in the fact that this was the single most heavily attended session at the 2020 event.
As mentioned back in July, upstream Linux developers have been working to figure out a path for adding Rust code to the Linux kernel. That topic is now being further explored at this week's virtual Linux Plumbers Conference, and it's still looking like it will happen; it's just a matter of when the initial infrastructure will be in place and how slow the rollout will be.
Printing is important. If something doesn’t work, you want to know why (e.g. by looking at the console output). When I first wrote the log macro for my kernel driver I didn’t think much about the security. I just thought: “Surely nobody will call it with the wrong format specifiers or the wrong number of arguments, because the usage is simple and straightforward”.
As you probably can guess, this is against all the principles of Rust. If you use unsafe you need to ensure that the implementation is safe. This way you can be sure that the driver will be fine when there are no compiler warnings.
So let’s make it a little safer.
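One way to get there, shown here as a minimal sketch rather than the author's actual macro, is to route every log call through `format_args!` so that the compiler checks the format string against the number and types of the arguments. The names `klog!` and `write_log` are hypothetical, and the sink is a plain `String` standing in for the kernel's debug output routine.

```rust
use std::fmt::{self, Write};

/// Hypothetical sink: a real driver would forward the text to the kernel's
/// debug output instead of appending to a String.
pub fn write_log(out: &mut String, args: fmt::Arguments<'_>) {
    // Writing into a String cannot fail, so the result is ignored.
    let _ = out.write_fmt(args);
    out.push('\n');
}

/// Hypothetical macro: `format_args!` makes the compiler verify the format
/// string against the number and types of the arguments.
macro_rules! klog {
    ($out:expr, $($arg:tt)*) => {
        write_log($out, format_args!($($arg)*))
    };
}
```

With this shape, a call such as `klog!(&mut out, "status = {:#x}", code)` compiles, while passing too few arguments or an unknown specifier is rejected at compile time. In a `no_std` driver the same idea should carry over with `core::fmt` in place of `std::fmt`.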
As part of our ongoing efforts towards safer systems programming, we’re pleased to announce that Windows Control Flow Guard (CFG) support is now available in the Clang C/C++ compiler and Rust.
Lately, I’ve been working on a Rust clone of fnm. I like Rust and it is a great way to learn the language. Thanks to the Rust cross-platform tooling, I got a simple proof-of-concept for Windows support quickly.
The first thing I got to do was an install_node_dist method, which takes a borrowed Version struct and a path to install into. I wrote a simple test to install Node 12.0.0 and it seemed to work on Linux, macOS and even Windows! I was excited, but since I don’t really use Windows and fnm works perfectly well for me, I dropped it (no pun intended).
Operating system differences can cause your Rust binaries to break when run in a different environment than they were compiled in. Here are the most common things to watch out for.
Thanks to PR #70740 and a lot of work by Vadim Petrochenkov, on the current Rust nightly the default binary output of the x86_64-unknown-linux-musl target is a static position-independent executable (static-pie) with address space layout randomization (ASLR) on execution.
My KVM Forum 2018 presentation titled Security in QEMU: How Virtual Machines provide Isolation (pdf) (video) reviewed security bugs in QEMU and found the most common causes were C programming bugs. This includes buffer overflows, use-after-free, uninitialized memory, and more. In this post I will argue for using Rust as a safer language that prevents these classes of bugs.
In 2018 the choice of a safer language was not clear. C++ offered safe abstractions without an effective way to prohibit unsafe language features. Go also offered safety but with concerns about runtime costs. Rust looked promising but few people had deep experience with it. In 2018 I was not able to argue confidently for moving away from C in QEMU.
Now in 2020 the situation is clearer. C programming bugs are still the main cause of CVEs in QEMU. Rust has matured, its ecosystem is growing and healthy, and there are virtualization projects like Crosvm, Firecracker, and cloud-hypervisor that prove Rust is an effective language for writing Virtual Machine Monitors (VMM). In the QEMU community Paolo Bonzini and Sergio Lopez's work on rust-vmm and vhost-user code inspired me to look more closely at moving away from C.
This post describes how Ebbflow vends its client which is written in Rust to its Linux users, describing the tools used to build the various packages for popular distributions. In a future post, we will discuss how these packages are ultimately vended to users.
An operating system is used to make our job easier when using graphics, in our case in addition to everything else. In this post, we will be writing a GPU (graphics processing unit) driver using the VirtIO specification. We will allow user applications to treat a portion of the screen as RAM, using what is commonly known as a framebuffer.
Welcome to a new issue of "This Month in Rust OSDev". In these posts, we give a regular overview of notable changes in the Rust operating system development ecosystem.
In July, we switched the bootloader/bootimage crates from `cargo-xbuild` to cargo's `build-std` feature. There was also some great progress on the acpi crate and much more.
Today I’m publishing tihle, a new emulator targeting TI graphing calculators (currently only the 83+, but maybe others later). There’s rather a lot to say about it, but here I will discuss the motivation for a new emulator and the state of the art followed by technical notes on the design and initial development process.
Data produced by programs need to be stored somewhere for future reference, and there must be some sort of organisation so we can quickly retrieve the desired information. A file system (FS) is responsible for this task and provides an abstraction over the storage devices where the data is physically stored.
In this post, we will learn more about the concepts used by file systems, and how they fit together when writing your own.
If you’ve been following my Redox Summer of Code progress, you might have noticed a long break after the last post. At first, the reason was that I just lost track of time. My previous years of RSoC have followed a similar inconsistent schedule, which I now refer to as an interval of one blog post per “programmer week”, where a “programmer week” is anywhere from 3 days to a month…
Now, the reason for not finishing is that I’m basically done! That’s right, GDB has served us reliably for the past few weeks, where we’ve been able to debug our dynamic linker (ld.so) and find problems with shared libraries.
This week has been mostly about advancing the interface as much as possible, with the goal of being the default for pcid, xhcid, and usbscsid, as I previously mentioned. With the introduction of the AsyncScheme trait, I have now actually been able to operate the pci: scheme socket (well, :pci) completely asynchronously and with io_uring, by making the in-kernel RootScheme async too.
This post is about how I booted to bare metal Rust on x86_64. My goal is to describe my learning path and hopefully get you interested in things I talk about. I’ll be very happy if you find this content useful. Note that I’m a beginner and I may be wrong about many things. If you want to learn more, I’ll put links to many resources.
Nixpkgs recently merged PR #93568, allowing the Nix package manager to cross-compile packages to Redox.
After last week, when I was mainly blocked by the bug about blocking init, I’ve now been able to make further progress with the io_uring design. I have improved the redox-iou crate, which is Redox’s own liburing alternative, to support a fully featured buffer pool allocator meant for userspace-to-userspace io_urings (where the kernel can’t manage memory); to work with multiple secondary rings other than the main kernel ring; and to support spawning, which you would expect from a proper executor in tokio or async-std.
Getting a timer interrupt is a common task on an OS developer's to-do list. Although it is a very simple task on some architectures, on AArch64 you first need to configure the so-called interrupt controller. In this post you will learn how to initialize the Generic Interrupt Controller (GIC), control priorities, and target an interrupt to a specific core.
Welcome to a new issue of "This Month in Rust OSDev". In these posts, we will give a regular overview of notable changes in the Rust operating system development ecosystem.
I read the official Rust book at the end of 2019 but never had a project idea. That’s why I decided to rewrite one of my existing C++ projects. A few months after I started, I had already gained lots of experience and began to wonder whether it’s possible to rewrite my Windows Kernel Drivers in Rust. A quick search led me to many unanswered questions and two GitHub repositories. One of these repositories is winapi-kmd-rs, which is unfortunately really complicated and outdated. I almost gave up until I stumbled upon win_driver_example, which made me realize that a lot has changed and that it’s not even that hard. This post summarizes what went wrong and what I learned.
I write a ton of articles about Rust. And in those articles, the main focus is about writing Rust code that compiles. Once it compiles, well, we're basically in the clear! Especially if it compiles to a single executable, that's made up entirely of Rust code.
That works great for short tutorials, or one-off explorations.
Unfortunately, "in the real world", our code often has to share the stage with other code. And Rust is great at that. Compiling Go code to a static library, for example, is relatively finicky. It insists on being built with GCC (and no other compiler), and linked with GNU ld (and no other linker).
In contrast, Rust lends itself very well to "just write a bit of fast and safe code and integrate it into something else". It uses LLVM for codegen, which, as detractors will point out, doesn't support as many targets as GCC does (but there's always work in progress to address that), and it supports using GCC, Clang, and MSVC to compile C dependencies, and GNU ld, the LLVM linker, and the Microsoft linker to link the result.
This week has initially been mostly minor bug fixes for the redox_syscall and kernel parts. I began the week by trying to get pcid to properly do all of its scheme logic, which it hasn’t previously done (its IPC is currently only based on passing command line arguments, or pipes). This meant that the kernel could no longer simply process the syscalls immediately (which I managed to do with non-blocking syscalls such as SYS_OPEN and SYS_CLOSE) by invoking the scheme functions directly from the kernel. So for the FilesUpdate opcode, I then tinkered a bit with the built-in event queues in the kernel, by adding a method to register interest of a context that will block on the event, and by allowing non-blocking polls of the event queues.
This week has been quite productive for the most part. I continued updating the RFC with some newer ideas that I came up with while working on the implementation, most importantly how the kernel is going to be involved in io_uring operation.
I also came up with a set of standard opcodes, that schemes are meant to use when using io_uring, unless in some special scenarios (like general-purpose IPC between processes). The opcodes at this point in time, can be found here.
Yesterday at 15:08 I sent this image excitedly to the Redox chat, along with the message “Debugging on Redox… We’re soon, soon, there.”
Earlier this year, we used the C2Rust framework to translate applications such as Quake 3 to Rust. In this post, we’ll show you that it is also possible to translate privileged software such as modules that are loaded by the Linux kernel. We’ll use a small, 3-file kernel module which is part of the Bareflank Hypervisor SDK developed by Assured Information Security, but you can use the same techniques to translate other kernel modules.
With Apple’s recent announcement that they are moving away from Intel x86 CPUs to their own ARM CPUs for future laptops and desktops, I thought it would be a good time to take a look at some of the differences that can affect systems programmers working in Rust.
This is my first year of Redox Summer of Code, and my intent is to continue my prior work (outside of RSoC) on improving the Redox drivers and the kernel. I started this week with quite a minor change: implementing a more advanced syscall for allocating physical memory, namely physalloc3. Unlike the more basic physalloc, which only takes a size as a parameter, physalloc3 also takes flags and a minimum size; this allows a driver to request a large range and fall back to multiple small ranges if the physical memory space is too fragmented, by using scatter-gather lists (a form of vectored I/O, like preadv, for hardware). It also adds support for 32-bit-only allocation for devices that do not support the entire 64-bit physical address space.
As you might know, last year I spent the summer implementing a ptrace-alternative for Redox OS. It’s a powerful system where the tracing is done using a file handle. You can read all about the design over at the RFC. Thanks to this system I also got strace working, and then I started working on a simple gdbserver in Rust, for both Linux and Redox, but mainly Linux at that point, to lay the foundation for debugging on Redox using a Rust-based program.
This week, I’ve been using the remnants of last year to work on porting this debugging server to Redox. To do this, I had to make some more changes to the kernel side of things.
Diosix 2.0 strives to be a lightweight, fast, and secure multiprocessor bare-metal hypervisor for 32-bit and 64-bit RISC-V systems. It is written in Rust, which is a C/C++-like systems programming language focused on memory and thread safety as well as performance and reliability.
The ultimate goal is to build fully open-source packages containing everything needed to configure FPGA-based systems with RISC-V cores and peripheral controllers, and boot a stack of software customized for a particular task, all generated on demand if necessary. This software should also run on supported ASICs and system-on-chips.
Right now, Diosix is a work in progress. It can bring up a RISC-V system, load a Linux kernel and minimal filesystem into a virtualized environment called a capsule, and begin executing it.
This is the moment we've all been waiting for. Ten chapters of setup have led us to this moment--to finally be able to load a process from the disk and run it. The file format for executables is called ELF (executable and linkable format). I will go into some detail about it, but there are plenty of avenues you can explore with this one file type.
Welcome back and thanks for joining us for the reads notes… the thirteenth installment of our series on ELF files, what they are, what they can do, what does the dynamic linker do to them, and how can we do it ourselves.
I've been pretty successfully avoiding talking about TLS so far (no, not that one) but I guess we've reached a point where it cannot be delayed any further, so.
So. Thread-local storage.
Let’s do another dive into packaging Rust for Debian with a slightly more complicated example.
Welcome to the second issue of "This Month in Rust OSDev". In these posts, we will give a regular overview of notable changes in the Rust operating system development ecosystem.
Did you know that Rust has a Tier 2 target called i586-pc-windows-msvc? I didn't either, until a few days ago. This target disables SSE2 support and only emits instructions available on the original Intel Pentium from 1993.
So, for fun, I wanted to try compiling a binary that works on similarly old systems. My retro Windows of choice is Windows 98 Second Edition, so that is what I have settled for as the initial target for this project.
Storage is an important part of an operating system. When we run a shell or execute another program, we're loading from some sort of secondary storage, such as a hard drive or USB stick. We talked about the block driver in the last chapter, but that only reads and writes to the storage. The storage itself arranges its 0s and 1s in a certain order. This order is called the file system. The file system I opted to use is the Minix 3 file system, whose practical applications I will describe here. For more overview of the Minix 3 file system or file systems in general, please refer to the course notes and/or video that I posted above.
I will go through each part of the Minix 3 file system, but the following diagram depicts all aspects and the structure of the Minix 3 file system.
The VirtIO protocol is a way to communicate with virtualized devices, such as a block device (hard drive) or input device (mouse/keyboard). For this post, I will show you how to write a block driver using the VirtIO protocol.
The first thing we must understand is that VirtIO is just a generic I/O communication protocol. Then, we have to look at the block device section to see the communication protocol specifically for block devices.
Welcome to the first issue of "This Month in Rust OSDev". In these posts, we will give a regular overview of notable changes in the Rust operating system development community.
These posts are the successor of the "Status Update" posts on the "Writing an OS in Rust" blog. Instead of only focusing on the updates to the blog and the directly related crates, we try to give an overview of the full Rust OSDev ecosystem in this new series. This includes all the projects under the rust-osdev GitHub organization, relevant projects of other organizations, and also personal OS projects.
Last fall I was working on a library to make a safe API for driving futures on top of an io-uring instance. Though I released bindings to liburing called iou, the futures integration, called ostkreuz, was never released. I don’t know if I will pick this work up again in the future but several different people have started writing other libraries with similar goals, so I wanted to write up some notes on what I learned working with io-uring and Rust’s futures model.
In Part 11, we spent some time clarifying mechanisms we had previously glossed over: how variables and functions from other ELF objects were accessed at runtime.
We saw that doing so “proper” required the cooperation of the compiler, the assembler, the linker, and the dynamic loader. We also learned that the mechanism for functions was actually quite complicated! And sorta clever!
And finally, we ignored all the cleverness and “made things work” with a three-line change, adding support for both GlobDat and JumpSlot relocations.
We're not done with relocations yet, of course - but I think we've earned ourselves a little break. There's plenty of other things we've been ignoring so far!
For example… how are command-line arguments passed to an executable?
In our last installment of “Making our own executable packer”, we did some code cleanups. We got rid of a bunch of unsafe code, and found a way to represent memory-mapped data structures safely.
But that article was merely a break in our otherwise colorful saga of “trying to get as many executables to run with our own dynamic loader”. The last thing we got running was the ifunc-nolibc program.
In this post we explore cooperative multitasking and the async/await feature of Rust. We take a detailed look how async/await works in Rust, including the design of the Future trait, the state machine transformation, and pinning. We then add basic support for async/await to our kernel by creating an asynchronous keyboard task and a basic executor.
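To make the executor idea concrete, here is a minimal sketch of a busy-polling executor, assuming a hosted `std` environment rather than the kernel the post targets. `block_on`, `noop_raw_waker`, and `YieldOnce` are illustrative names, not taken from the post; the waker deliberately does nothing, which is only acceptable because the executor polls in a loop instead of sleeping until it is woken.

```rust
use std::future::Future;
use std::pin::Pin;
use std::ptr;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// A waker that does nothing: enough for an executor that polls in a loop.
fn noop_raw_waker() -> RawWaker {
    fn clone(_: *const ()) -> RawWaker {
        noop_raw_waker()
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    RawWaker::new(ptr::null(), &VTABLE)
}

/// Drives one future to completion by polling it repeatedly (busy-waiting).
pub fn block_on<F: Future>(future: F) -> F::Output {
    // Boxing pins the future on the heap so it can be polled safely.
    let mut future = Box::pin(future);
    // SAFETY: the vtable functions never dereference the data pointer.
    let waker = unsafe { Waker::from_raw(noop_raw_waker()) };
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

/// A hand-written future that is pending on the first poll and ready on the
/// second, standing in for a task that waits on an event such as a key press.
pub struct YieldOnce(pub bool);

impl Future for YieldOnce {
    type Output = u32;

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u32> {
        if self.0 {
            Poll::Ready(42)
        } else {
            // Remember that we were polled once; the next poll completes.
            self.0 = true;
            Poll::Pending
        }
    }
}
```

Calling `block_on(YieldOnce(false))` polls twice and then returns the value. An `async` block that awaits `YieldOnce` is compiled by Rust into a state machine with the same pending-then-ready behavior, which is why the same `block_on` can drive it.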
Starting a process is what we've all been waiting for. The operating system's job is essentially to support running processes. In this post, we will look at a process from the OS's perspective as well as the CPU's perspective.
We looked at the process memory in the last chapter, but some of that has been modified so that we have a resident memory space (on the heap). Also, I will show you how to go from kernel mode into user mode. Right now, we've erased supervisor mode, but we will fix that when we revisit system calls in order to support processes.
Welcome back to the “Making our own executable packer” series, where digressions are our bread and butter.
Last time, we implemented indirect functions in a no-libc C program. Of course, we got lost on the way and accidentally implemented a couple of useful elk-powered GDB functions - with only the minimal required amount of Python code.
The article got pretty long, and we could use a nice distraction. And I have just the thing! A little while ago, a member of the Rust language design team stumbled upon this series and gave me some feedback.
It has been a while since the last Redox OS news, and I think it is good to provide an update on how things are progressing. The dynamic linking support in relibc got to the point where rustc could be loaded, but hangs occur after loading the LLVM codegen library. Debugging this issue has been difficult, so I am taking some time to consider other aspects of Redox OS. Recently, I have been working on a new package format, called pkgar.
Bottlerocket is a free and open-source Linux-based operating system meant for hosting containers. Bottlerocket focuses on security and maintainability, providing a reliable, consistent, and safe platform for container-based workloads. This is a reflection of what we've learned building operating systems and services at Amazon. You can read more about what drives us in our charter.
The base operating system has just what you need to run containers reliably, and is built with standard open-source components. Bottlerocket-specific additions focus on reliable updates and on the API. Instead of making configuration changes manually, you can change settings with an API call, and these changes are automatically migrated through updates.
In the last article, we cleaned up our dynamic linker a little. We even implemented the Dynamic relocation. But it's still pretty far away from running real-world applications.
In the last article, we managed to load a program (hello-dl) that uses a single dynamic library (libmsg.so) containing a single exported symbol, msg. So… we got one application to load. Does it work on other applications?
Let's pick up where we left off: we had just taught elk to load not only an executable, but also its dependencies, and then their dependencies as well.
We discovered that ld-linux walked the dependency graph breadth-first, and so we did that too. Of course, it's a little bit overkill since we only have one dependency, but, nevertheless, elk happily loads our executable and its one dependency.
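The breadth-first walk can be sketched as follows. This is an illustration under simplifying assumptions, not elk's actual code: the dependency graph is given up front as a map from object name to direct dependencies (standing in for reading each object's needed-library entries), and `load_order` is a hypothetical helper name.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Returns the load order for `root`, visiting dependencies breadth-first.
/// `deps` maps an object name to the names it directly depends on.
pub fn load_order(root: &str, deps: &HashMap<&str, Vec<&str>>) -> Vec<String> {
    let mut order = Vec::new();
    let mut seen = HashSet::new();
    let mut queue = VecDeque::new();

    seen.insert(root.to_string());
    queue.push_back(root.to_string());

    while let Some(name) = queue.pop_front() {
        // Enqueue direct dependencies that have not been scheduled yet.
        if let Some(children) = deps.get(name.as_str()) {
            for child in children {
                if seen.insert(child.to_string()) {
                    queue.push_back(child.to_string());
                }
            }
        }
        order.push(name);
    }
    order
}
```

The `seen` set keeps an object that several others depend on from being loaded twice, and the queue guarantees that all direct dependencies of the executable come before any of their own dependencies.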
My Rust adventure continues as I have been furiously working on Rust/WinRT for the last five months or so. I am looking forward to opening it up to the community as soon as possible. Even then, it will be early days and there will be much still to do. I remember chatting with Martyn Lovell about this a few years ago and we basically agreed that it takes about three years to build a language projection. Naturally, you can get value out of it before then but that’s what you need to keep in mind when you consider completeness.
Still, I’m starting to be able to make API calls with Rust/WinRT and it's very satisfying to see this come together. So, I’ll leave you with a sneak peek to give you a sense of what calling Windows APIs looks like in Rust.
Up until now, we've been loading a single ELF file, and there wasn't much structure to how we did it: everything just kinda happened in main, in no particular order.
But now that shared libraries are in the picture, we have to load multiple ELF files, with search paths, and keep them around so we can resolve symbols, and apply relocations across different objects.
After a long period of trawling through references, painstakingly turning C struct definitions into nom parsers and hunting down valid enum values… it's time for some graphs.
In our last article, we managed to load and execute a PIE (position-independent executable). The big improvement in that article was that we started caring about relocations. It was enough for the code and data segments to be in the right place relative to each other, because it used RIP-relative addressing.
We've seen that Relative relocations mean to replace the 64-bit integer at offset with the result of base + addend, where addend is specified in the relocation entry itself, and base is the address we chose to load the executable at. Well, that's all well and good. But what if we have two .asm files?
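The rule for Relative relocations can be written down in a few lines. This is a minimal sketch, assuming the loaded object is available as a mutable byte slice and uses little-endian encoding; `Rela` and `apply_relative` are illustrative names rather than the series' own types.

```rust
/// One relative relocation entry: patch the 64-bit value at `offset`
/// with `base + addend`.
pub struct Rela {
    pub offset: usize,
    pub addend: i64,
}

/// Applies `rel` to `image`, the bytes of the loaded object, assuming the
/// object was loaded at address `base` and uses little-endian encoding.
/// Returns `false` if the relocation target lies outside the image.
pub fn apply_relative(image: &mut [u8], base: u64, rel: &Rela) -> bool {
    let end = match rel.offset.checked_add(8) {
        Some(end) if end <= image.len() => end,
        _ => return false,
    };
    // The addend is signed, so add it with two's-complement wrapping.
    let value = base.wrapping_add(rel.addend as u64);
    image[rel.offset..end].copy_from_slice(&value.to_le_bytes());
    true
}
```

For example, with a base of 0x400000 and an addend of 0x10, the eight bytes at the relocation's offset end up holding 0x400010.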
The last article, Position-independent code, was a mess. But who could blame us? We looked at the world, and found it to be a chaotic and seemingly nonsensical place. So, in order to blend in, we had to let go of a little bit of sanity. The time has come to reclaim it.
Short of faulty memory sticks, memory locations don't magically turn from 0x0 into valid addresses. Someone is doing the turning, and we're going to find out who, if it takes the rest of the series.
While this is inspired by DOSBox, it is not a direct port. Many features are implemented differently or not at all. The goal was just to implement enough to play one of my favorite games and learn some Rust and emulation principles along the way.
In the last article, we found where code was hiding in our samples/hello executable, by disassembling the whole file and then looking for syscalls.
Later on, we learned how to inspect which memory ranges are mapped for a given PID (process identifier). We saw that memory areas weren't all equal: they can be readable, writable, and/or executable.
Finally, we learned about program headers and how they specified which parts of the executable file should be mapped to which memory areas.
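To make the readable/writable/executable distinction concrete, here is a small sketch that parses one line of Linux's `/proc/<pid>/maps` output. The `Perms` type and `parse_maps_line` function are hypothetical names for this example, not code from the series.

```rust
/// Permissions of one mapped range, taken from the `perms` column of
/// `/proc/<pid>/maps` (for example `r-xp`).
#[derive(Debug, PartialEq)]
struct Perms {
    read: bool,
    write: bool,
    exec: bool,
}

/// Parses a line such as
/// `00400000-00401000 r-xp 00000000 08:01 1234 /bin/true`
/// into (start, end, perms). Returns `None` on malformed input.
fn parse_maps_line(line: &str) -> Option<(u64, u64, Perms)> {
    let mut fields = line.split_whitespace();
    let range = fields.next()?;
    let perms = fields.next()?.as_bytes();
    let (start, end) = range.split_once('-')?;
    if perms.len() < 3 {
        return None;
    }
    Some((
        u64::from_str_radix(start, 16).ok()?,
        u64::from_str_radix(end, 16).ok()?,
        Perms {
            read: perms[0] == b'r',
            write: perms[1] == b'w',
            exec: perms[2] == b'x',
        },
    ))
}
```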
System calls are a way for unprivileged, user applications to request services from the kernel. In the RISC-V architecture, we invoke the call using the ecall instruction. This will cause the CPU to halt what it's doing, elevate privilege modes, and then jump to whatever function handler is stored in the mtvec (machine trap vector) register. Remember, this is the "funnel" where all traps are handled, including our system calls.
We have to set up our convention for handling system calls. We can use a convention that already exists, so we can interface with a library, such as newlib. But, let's make this ours! We get to say what the system call numbers are, and where they will be when we execute a system call.
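One possible shape for such a convention is sketched below. Everything here is an assumption for illustration: the `TrapFrame` type, the choice of `a7` for the call number and `a0` for the argument and return value, and the call numbers themselves are invented, not the ones the chapter settles on.

```rust
/// Hypothetical saved integer registers x0..x31 of the trapped hart.
struct TrapFrame {
    regs: [usize; 32],
}

// Assumed convention for this sketch only:
// call number in a7 (x17), first argument and return value in a0 (x10).
const REG_A0: usize = 10;
const REG_A7: usize = 17;

// Made-up system call numbers.
const SYS_EXIT: usize = 1;
const SYS_ADD_ONE: usize = 2;

/// Would be called from the trap handler once it has identified an `ecall`.
/// Returns `false` when the calling process should stop running.
fn do_syscall(frame: &mut TrapFrame) -> bool {
    match frame.regs[REG_A7] {
        SYS_EXIT => false,
        SYS_ADD_ONE => {
            frame.regs[REG_A0] = frame.regs[REG_A0].wrapping_add(1);
            true
        }
        // Unknown call: report an error value in a0 and keep running.
        _ => {
            frame.regs[REG_A0] = usize::MAX;
            true
        }
    }
}
```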
I have started to package Rust things for Debian, and the process has been pretty smooth so far, but it was very hard finding information on how to start, so here is a small writeup on how I packaged my first Rust crate for Debian.
In part 1, we've looked at three executables: sample, an assembly program that prints “hi there” using the write system call; entry_point, a C program that prints the address of main using printf; and the /bin/true executable, probably also a C program (because it's part of GNU coreutils), which just exits with code 0.
We noticed that entry_point printed different addresses when run with GDB, but always the same address when run directly.
What happens if we run it ourselves?
This post explains how to implement heap allocators from scratch. It presents and discusses different allocator designs, including bump allocation, linked list allocation, and fixed-size block allocation. For each of the three designs, we will create a basic implementation that can be used for our kernel.
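As a minimal sketch of the first of these designs, bump allocation: the version below works on plain addresses instead of implementing Rust's `GlobalAlloc` trait, and the names and fields are assumptions for illustration, not the post's actual code.

```rust
/// A bump allocator hands out memory by moving a `next` pointer forward.
/// It can only reclaim memory once every allocation has been freed.
struct BumpAllocator {
    heap_start: usize,
    heap_end: usize,
    next: usize,
    allocations: usize,
}

impl BumpAllocator {
    fn new(heap_start: usize, heap_size: usize) -> Self {
        BumpAllocator {
            heap_start,
            heap_end: heap_start + heap_size,
            next: heap_start,
            allocations: 0,
        }
    }

    /// Returns the start address of a block of `size` bytes aligned to
    /// `align` (which must be a power of two), or `None` if the heap is full.
    fn alloc(&mut self, size: usize, align: usize) -> Option<usize> {
        debug_assert!(align.is_power_of_two());
        let mask = align - 1;
        let start = self.next.checked_add(mask)? & !mask;
        let end = start.checked_add(size)?;
        if end > self.heap_end {
            return None;
        }
        self.next = end;
        self.allocations += 1;
        Some(start)
    }

    /// Frees one allocation; the heap is reset only when none remain.
    fn dealloc(&mut self) {
        self.allocations = self.allocations.saturating_sub(1);
        if self.allocations == 0 {
            self.next = self.heap_start;
        }
    }
}
```

The trade-off is visible in `dealloc`: allocation is just a pointer bump, but individual blocks cannot be reused until the whole heap is empty.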
In this post, we will implement cooperative multitasking. For simplicity, we will use a round-robin scheduler, where each thread will be run in a FIFO order.
What is a cooperative scheduler? Threads can run as long as they want, and can let other threads run by yielding to them. The problem? If threads refuse to yield, other threads will be unable to run.
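The round-robin idea can be sketched in ordinary user-space Rust. This is an assumption-laden model, not the kernel implementation from the post: a "thread" is a closure, each call runs it up to its next yield point, and the names `Step`, `Task`, and `Scheduler` are invented for the example.

```rust
use std::collections::VecDeque;

/// What a task reports when it hands control back to the scheduler.
enum Step {
    Yielded,
    Finished,
}

/// A task is modelled as a closure; each call runs it until it yields.
type Task = Box<dyn FnMut() -> Step>;

struct Scheduler {
    ready: VecDeque<Task>,
}

impl Scheduler {
    fn new() -> Self {
        Scheduler { ready: VecDeque::new() }
    }

    fn spawn(&mut self, task: Task) {
        self.ready.push_back(task);
    }

    /// Runs tasks in FIFO order. A task that yields goes to the back of
    /// the queue; a finished task is dropped.
    fn run(&mut self) {
        while let Some(mut task) = self.ready.pop_front() {
            match task() {
                Step::Yielded => self.ready.push_back(task),
                Step::Finished => {}
            }
        }
    }
}
```

A task that never returns from its closure would stall `run` forever, which is exactly the weakness of cooperative scheduling noted above.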
epoll kqueue iocp
This book aims to explain how Epoll, Kqueue and IOCP work, and how we can use them for efficient, high-performance I/O. The book is divided into three parts:
Part 1 - An express explanation: is probably what you want to read if you're interested in a short introduction.
The Appendix contains some additional references and small articles explaining some concepts that I found interesting and which are related to the kind of code we write here.
Part 2 is special. 99% of readers should not even go there. You'll find page up and down with code and explanations just to implement the simplest example of a cross-platform-eventloop that actually works. Turns out that there is no "express" way of doing this.
There are few times when I'm so excited about a new feature that I'll write about it before multiple PRs are merged in multiple repos. Typically one would wait and have the patience until everything is fully merged to master yet I can't wait to talk about this one cause it's just too damn cool.
What this new branch offers is a way to instantly reboot cloud vms whenever your application dies a horrible death. Let's say a bunch of bad packets from the wrong side of town arrive and decide to shoot your vm full of lead. In a typical linux setup that instance is probably dead, Jim. Your load balancer might start re-routing around it. In a container setup you might get the same sort of deal. Sure, if it was just the process that died systemd might be configured to restart on failure but the whole box?
However, what if you weren't running a full blown linux as your base vm? What if your base vm was only a single application that your vm booted straight into and your application was re-spawned in seconds as if it was a process instead of a vm?
Executables have been fascinating to me ever since I discovered, as a kid, that they were just files. If you renamed a .exe to something else, you could open it in notepad! And if you renamed something else to a .exe, you'd get a neat error dialog.
Clearly, something was different about these files. Seen from notepad, they were mostly gibberish, but there had to be order in that chaos. 12-year-old me knew that, although he didn't quite know how or where to dig to make sense of it all.
So, this series is dedicated to my past self. In it we'll attempt to understand how Linux executables are organized, how they are executed, and how to make a program that takes an executable fresh off the linker and compresses it - just because we can.
Since the last big series, Making our own ping, was all about Windows, this one will be focused on 64-bit Linux.
OxidizedOS is a multicore, x86-64 kernel written in Rust. In this series, we will be discussing the implementation of kernel threads and a scheduler in Rust.
Krabs is an experimental x86 bootloader written in Rust. Krabs can load and start the ELF format Linux kernel compressed with bzip2. Some of the source code uses libbzip2 C library for decompressing, but the rest is completely Rust only.
This is chapter 6 of a multi-part series on writing a RISC-V OS in Rust. Processes are the whole point of the operating system. We want to start doing "stuff", which we'll fit into a process and get it going. We will update the process structure in the future as we add features to a process. For now, we need a program counter (which instruction is executing), and a stack for local memory.
We will not create our standard library for processes. In this chapter, we're just going to write kernel functions and wrap them into a process. When we start creating our user processes, we will need to read from the block device and start executing instructions. That's quite a ways away, since we will need system calls and so forth.
In the past few months I’ve been working with Red Sift on RedBPF, a BPF toolkit for Rust. Red Sift uses RedBPF to power the security monitoring agent InGRAINd. Peter recently blogged about RedBPF and InGRAINd, and ran a workshop at RustFest Barcelona. We’ve continued to improve RedBPF since, fixing bugs, improving and adding new APIs, adding support for Google Kubernetes Engine kernels and more. We’ve also completed the relicensing of the project to Apache2/MIT – the licensing scheme used by many of the most prominent crates in the Rust ecosystem – which will hopefully make it even easier to adopt RedBPF.
In this post I’m going to go into some details into what RedBPF is, what its main components are, and what the full process of writing a BPF program looks like.
As a follow up to my post on distribution packaging, it was commented by Fraser Tweedale (@hackuador) that traditionally the “security” aspects of distribution packaging were a compelling reason to use distribution packages over “upstreams”. I want to dig into this further.
This post gives an overview of the recent updates to the Writing an OS in Rust blog and the used libraries and tools.
I moved to a new apartment mid-October and had lots of work to do there, so I didn't have the time for creating the October status update post. Therefore, this post lists the changes from both October and November. I'm slowly picking up speed again, but I still have a lot of mails in my backlog. Sorry if you haven't received an answer yet!
Microsoft can't throw away old Windows code, but the company's research under Project Verona is aiming to make Windows 10 more secure with its recent work on integrating Mozilla-developed Rust for low-level Windows components.
A few years back, I wrote up a detailed blog post on Docker's process 1, orphans, zombies, and signal handling. The solution from three years ago was a Haskell executable providing this functionality and a Docker image based on Ubuntu.
A few of the Haskellers on the FP Complete team have batted around the idea of rewriting pid1 in Rust as an educational exercise, and to have a nice comparison with Haskell. No one got around to it. However, when Rust 1.39 came out with async/await support, I was looking for a good use case to demonstrate, and decided I'd do this with pid1.
After the addition of the NVMe driver a couple months ago, I have been running Redox OS permanently (from an install to disk) on a System76 Galago Pro (galp3-c), with System76 Open Firmware as well as the un-announced, in-development, GPLv3 System76 EC firmware. This particular hardware has full support for the keyboard, touchpad, storage, and ethernet, making it easy to use with Redox.
Moonrise is a Linux init system written in Lua with Rust support code. An init system is a software suite responsible for bringing the userspace components of an operating system online and, in most cases, managing long-running components such as background services.
When I was writing a fingerd daemon in Rust (why? because I could), one thing that took me a little while to figure out was how to drop root privileges after I bound to port 79.
Neotron is an attempt to make computers simple again, whilst also taking advantage of the very latest in programming language development. It is based around four simple concepts: The ARM Thumb-v7M instruction set, A standardised OS interface, A standardised BIOS interface, and Use of the Rust Programming Language.
Today I’m releasing a library called iou. This library provides idiomatic Rust bindings to the C library called liburing, which itself is a higher-level interface for interacting with the io_uring Linux kernel interface. Here are the answers to some questions I expect it may provoke.
What is io_uring? io_uring is an interface added to the Linux kernel in version 5.1. Concurrent with that, the primary maintainer of that interface has also been publishing a library for interacting with it called liburing.
This blog describes part of the story of Rust adoption at Microsoft. Recently, I’ve been tasked with an experimental rewrite of a low-level system component of the Windows codebase (sorry, we can’t say which one yet). Instead of rewriting the code in C++, I was asked to use Rust, a memory-safe alternative. Though the project is not yet finished, I can say that my experience with Rust has been generally positive. It’s a good choice for those looking to avoid common mistakes that often lead to security vulnerabilities in C++ code bases.
During the product development process monitoring our pipelines proved challenging, and we wanted more visibility into our containers. After a short period of exploration, we found that eBPF would address most of the pain points and dark spots we were encountering.
There was one catch: no eBPF tooling would help us deploy and maintain new probes within our small, but focused ops team. BCC, while great for tinkering, requires significant effort to roll out to production. It also makes it difficult to integrate our toolkit into our usual CI/CD deployment models.
Faced with this dilemma, we decided the only option was for us to write our own Rust-based agent that integrated well with our testing and deployment strategies.
I have come to the point with C++/WinRT where I am largely satisfied with how it works and leverages C++ to the best of its ability. There is always room for improvement and I will continue to evolve and optimize C++/WinRT as the C++ language itself advances. But as a technology, the Windows Runtime has always been about more than just one language and we have started working on a few different projects to add support for various languages. None of these efforts could however draw me away from C++… that is until Rust showed up on my radar.
Rust is an intriguing language for me. It closely resembles C++ in many ways, hitting all the right notes when it comes to compilation and runtime model, type system and deterministic finalization, that I could not help but get a little excited about this fresh new take on language design. And so it is that I have started building the WinRT language projection for Rust.
Recently, a new Linux kernel interface, called io_uring, appeared. I have been looking into it a little bit and I can’t help but wonder about it. Unfortunately, I’ve had only enough time to keep thinking and reading about it. Nevertheless, I’ve decided to share what I’ve been thinking about so far in case someone wants to write some actual code and experiment. Basically, I have an idea for a crate and I’d love someone else to write it 😇.
In the world of systems programming where one may find themselves writing hardware drivers or interacting directly with memory-mapped devices, that interaction is almost always through memory-mapped registers provided by the hardware. We typically interact with these things through bitwise operations on some fixed-width numeric type.
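For illustration, here is what those bitwise operations typically look like on a `u32`. The register layout (an ENABLE bit at bit 0 and a 3-bit MODE field at bits 4 to 6) is entirely hypothetical, and the functions work on plain values; on real hardware the value would be read and written through a volatile pointer, for example with `core::ptr::read_volatile` and `core::ptr::write_volatile`.

```rust
// Hypothetical register layout, used only for this sketch:
// bit 0 = ENABLE, bits 4..=6 = MODE.
const ENABLE: u32 = 1 << 0;
const MODE_SHIFT: u32 = 4;
const MODE_MASK: u32 = 0b111 << MODE_SHIFT;

/// Sets the ENABLE bit, leaving all other bits untouched.
fn set_enable(reg: u32) -> u32 {
    reg | ENABLE
}

/// Clears the ENABLE bit, leaving all other bits untouched.
fn clear_enable(reg: u32) -> u32 {
    reg & !ENABLE
}

/// Replaces the MODE field; bits of `mode` that do not fit are discarded.
fn set_mode(reg: u32, mode: u32) -> u32 {
    (reg & !MODE_MASK) | ((mode << MODE_SHIFT) & MODE_MASK)
}

/// Extracts the MODE field as a small integer.
fn mode(reg: u32) -> u32 {
    (reg & MODE_MASK) >> MODE_SHIFT
}
```

Nothing in the types stops a caller from mixing up masks or shifts, which is the kind of mistake a typed register abstraction aims to rule out.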
QEMU and libvirt form the backend of the Red Hat userspace virtualization stack: they are used by our KVM-based products and by several applications included in Red Hat Enterprise Linux, such as virt-manager, libguestfs and GNOME Boxes.
Play with Linux process termination exploring such interesting features as PR_SET_CHILD_SUBREAPER and PR_SET_PDEATHSIG.
RISC-V ("risk five") and the Rust programming language both start with an R, so naturally they fit together. In this blog, we will write an operating system targeting the RISC-V architecture in Rust (mostly). If you have a sane development environment for RISC-V, you can skip the setup parts right to bootloading. Otherwise, it'll be fairly difficult to get started.
This tutorial will progressively build an operating system from start to something that you can show your friends or parents -- if they're significantly young enough. Since I'm rather new at this I decided to make it a "feature" that each blog post will mature as time goes on. More details will be added and some will be clarified.
We designed a framework to help developers quickly build device drivers in Rust. We also used Rust’s security features to provide infrastructure that lets developers easily handle kernel memory allocation and concurrency management, while alleviating some common bugs (e.g. use-after-free).
We demonstrate the generality of our framework by implementing a real-world device driver on Raspberry Pi 3, and our evaluation shows that device drivers generated by our framework have acceptable binary size for canonical embedded systems and the runtime overhead is negligible.
The Redox official website
Over the past few months, System76 has been developing a simple, easy-to-use tool for updating firmware on Pop!_OS and System76 hardware. Today, we’re excited to announce that you can now check and update firmware through Settings on Pop!_OS, and through the firmware manager GTK application on System76 hardware running other Debian-based distributions.
In the last few weeks, I've been working on a new solution to firmware management on the Linux desktop. A generic framework which combines fwupd and system76-firmware; with a GTK frontend library and application; that is written in Rust.
This week I’ve decided to skip trying to get GDB working for now (there are so many issues it’ll take forever to solve them), and instead decided to finally give focus to the final concerns I had about ptrace. Most changes this week were related to getting decent behavior of child processes, although the design feels… suboptimal, somehow (not sure why), so I feel I must be able to improve it later.
Another change was security: Tracers running as a non-root user can now, in addition to only tracing processes running as the same user, only trace processes that are directly or indirectly children of the tracer. In the future this can easily be allowed with some kind of capability, but currently in Redox there isn’t a capability-like system other than the simple (but really powerful) namespacing system which sadly I don’t think can be used for this.
Once again, last week’s action was merged, which means the full ptrace feature was merged, and it’s time to start tackling the final issues which I have delayed for so long. But, before that, I decided to try to get some basic ptrace compatibility in relibc, so we could see just how far away software like gdb is from being ported, and what concerns I haven’t thought about yet. redox-nix update: That said, I took a little break from the madness, to instead lay my focus on another interesting problem: Newer redoxer couldn’t be compiled using carnix, because of some dependency that used a cargo feature carnix didn’t support. Let me first explain what carnix is, and why this is a problem.
Before I dive in to this week’s actions, I am pleased to announce that all the last weeks’ work is merged! This merge means you can now experiment with basic ptrace functionality using only basic registers and PTRACE_SYSCALL/PTRACE_SINGLESTEP. I have already opened the second PR in the batch: Ptrace memory reading and floating point registers support which will supply the “final bits” of the initial implementation, before all the nitpicking of final concerns can start (not to underestimate the importance and difficulty of these nitpicks - there are some areas of ptrace that aren’t even thought about yet and those will need tending to)! I will comment on these changes in this blog post, as there are some interesting things going on!
Wrapping up the Ion as a library project. It is now possible to embed Ion in any Rust application. Ion takes any Read instance and can execute it (so yes, it is possible to run Ion without ever collecting the script’s binary stream). It takes care of expanding the input and managing the running applications in an efficient manner, with a comprehensive set of errors. Ion is now the rust-based, pipe-oriented liblua alternative.
The next step in the journey of ptrace was to bite the bullet (or at least I thought) and implement system-call tracing. Since the kernel must be able to handle system-calls of processes, it’s quite obvious that the way to set a breakpoint should involve the kernel, which, running in the context of the tracee, should notify the tracer and wait. So the biggest challenge would be to figure out how kernel synchronization worked.
This post adds support for heap allocation to our kernel. First, it gives an introduction to dynamic memory and shows how the borrow checker prevents common allocation errors. It then implements the basic allocation interface of Rust, creates a heap memory region, and sets up an allocator crate. At the end of this post all the allocation and collection types of the built-in alloc crate will be available to our kernel.
After having a pretty clear goal to meet specified by the RFC, time to get things moving. I started with what I thought would be low hanging fruit: Reading the registers of another process. It ended up being more difficult than I thought, but it ended up being really interesting and I want to share it with you :)
How to fetch batteries information from the macOS APIs with Rust
I will quickly show how I got bindgen (https://rust-lang.github.io/rust-bindgen) to generate the bindings to Fuse (libfuse) with the current stable release of Rust. By doing so, this should demonstrate how to bootstrap writing your own Fuse file system in Rust.
I do realise that there are some crates that already exist that aid in making Fuse drivers in Rust, but this was more or less an excuse to also try out bindgen, which I don't believe those existing libraries utilise.
Manticore is a research operating system, written in Rust, with the aim of exploring the parakernel OS architecture.
The OS is increasingly a bottleneck for server applications that want to take maximum advantage of the hardware. Many traditional kernel interfaces (such as in POSIX) were designed when I/O was significantly slower than the CPU. However, today I/O is getting faster, but single-threaded CPU performance has stagnated. For example, a 40 GbE NIC can receive a cache-line sized packet faster than the CPU can access its last-level cache (LLC), which makes it tricky for an OS to keep up with packets arriving from the network. Similarly, non-volatile memory (NVM) access speed is getting closer to DRAM speeds, which challenges OS abstractions for storage.
To address this OS bottleneck, server applications are increasingly adopting kernel-bypass techniques. For example, the Seastar framework is an OS implemented in userspace, which implements its own CPU and I/O scheduler, and bypasses the Linux kernel as much as it can. Parakernel is an OS architecture that eliminates many OS abstractions (similar to exokernels) and partitions hardware resources (similar to multikernels) to facilitate high-performance server application with increased application-level parallelism and predictable tail latency.
This repository contains a simple KVM firmware that is designed to be launched from anything that supports loading ELF binaries and running them with the Linux kernel loading standard. The ultimate goal is to be able to use this "firmware" to be able to load a bootloader from within a disk image.
This post explores unit and integration testing in no_std executables. We will use Rust's support for custom test frameworks to execute test functions inside our kernel. To report the results out of QEMU, we will use different features of QEMU and the bootimage tool.
Recently, the x86_64-unknown-uefi target was added to Rust mainline (https://github.com/rust-lang/rust/pull/56769). So, I tried to write a UEFI application with this update. There exists an awesome crate, uefi-rs, which provides a Rust interface for UEFI applications. However, this is my first time writing a UEFI application, so to understand what happens in it, I didn’t use any existing crate.
It has been one year and four days since the last release of Redox OS! In this time, we have been hard at work improving the Redox ecosystem. Much of this work was related to relibc, a new C library written in Rust and maintained by the Redox OS project, and adding new packages to the cookbook. We are proud to report that we have now far exceeded the capabilities of newlib, which we were using as our system C library before. We have added many important libraries and programs, which you can see listed below.
This post shows how to implement paging support in our kernel. It first explores different techniques to make the physical page table frames accessible to the kernel and discusses their respective advantages and drawbacks. It then implements an address translation function and a function to create a new mapping.
This will be the first in a series of weekly updates on progress made in the development of Pop!_OS. Thus, this will only contain content pertaining specifically to Pop!_OS, though at times there may be some overlap with the hardware side of System76.
I’ve decided to take a look at Minix, which is an interesting microkernel OS. Naturally after building Minix from git, the first thing I decided to try was porting Rust’s std to Minix so I could cross-compile Rust programs from Linux to run under Minix. Okay, I suppose I could have started with something else, but porting Rust software and modifying the platform-depending part of std is something I have experience with from working on Redox OS. And Rust really isn’t that hard to port.
We are going to make a demo linux web-server with systemd, config file and installable .deb binary in Rust.
This post introduces paging, a very common memory management scheme that we will also use for our operating system. It explains why memory isolation is needed, how segmentation works, what virtual memory is, and how paging solves memory fragmentation issues. It also explores the layout of multilevel page tables on the x86_64 architecture.
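The multilevel layout can be summarised in a few lines of arithmetic. With 4-level paging and 4 KiB pages on x86_64, a virtual address splits into four 9-bit table indices plus a 12-bit page offset; the function name below is a hypothetical one chosen for this sketch.

```rust
/// Splits a virtual address into its four page-table indices, ordered from
/// the level-4 table down to the level-1 table, plus the offset in the page.
/// Assumes 4-level paging with 4 KiB pages.
fn page_table_indices(virt: u64) -> ([usize; 4], usize) {
    // Each level uses 9 bits (512 entries per table); the lowest 12 bits
    // are the offset inside the 4 KiB page.
    let index = |level: u32| ((virt >> (12 + 9 * level)) & 0x1ff) as usize;
    (
        [index(3), index(2), index(1), index(0)],
        (virt & 0xfff) as usize,
    )
}
```

For example, an address built as `(1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5` yields indices `[1, 2, 3, 4]` and offset `5`.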
It has been a long-standing tradition to develop a language far enough to be able to write the language's compiler in the same language, and Rust does the same. Rust is nowadays written in Rust. We've tracked down the earlier Rust versions, which were written in OCaml, and were planning to use these to bootstrap Rust. But in parallel, John Hodge (Mutabah) developed a Rust compiler, called "mrustc", written in C++. mrustc is now good enough to compile Rust 1.19.0. Using mrustc, we were able to build Rust entirely from source with a bootstrap chain
In this post we set up the programmable interrupt controller to correctly forward hardware interrupts to the CPU. To handle these interrupts we add new entries to our interrupt descriptor table, just like we did for our exception handlers. We will learn how to get periodic timer interrupts and how to get input from the keyboard.
Stratis 1.0 was quietly released last week, with the 1.0 version marking its initial stable release and the stabilization of its on-disk metadata format. Red Hat engineers believe Stratis is now ready for more widespread testing.
Time for me to pack up and never ever contribute to Redox ever again… Just kidding. This isn’t goodbye, you can’t get rid of me that easily I’m afraid. I’ll definitely want to contribute more, can’t however say with certainty how much time I’ll get, for school is approaching, quickly
The previous blog post discusses how raw disk reads were implemented in the loader stub. The next step was to implement a clean read API which can be used by different filesystem libraries in order to read their respective filesystems. Since the raw reads from the BIOS interrupt had a granularity in terms of sectors (each sector being 512 bytes), the reads had to be translated in order to provide byte level granularity. The clone_from_slice function ensures that a direct call to memcopy is not required. The refined read function is here.
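The sector-to-byte translation can be sketched as follows. This is not the loader's actual read function: the `SectorRead` trait stands in for the BIOS-backed raw read, and `PatternDisk` is a toy disk invented so the example is self-contained.

```rust
const SECTOR_SIZE: usize = 512;

/// Stand-in for the raw, sector-granular read.
trait SectorRead {
    /// Fills `buf` with the contents of sector number `lba`.
    fn read_sector(&mut self, lba: u64, buf: &mut [u8; SECTOR_SIZE]);
}

/// Toy disk for demonstration: the byte at absolute position `p` is `p % 251`.
struct PatternDisk;

impl SectorRead for PatternDisk {
    fn read_sector(&mut self, lba: u64, buf: &mut [u8; SECTOR_SIZE]) {
        for (i, b) in buf.iter_mut().enumerate() {
            *b = ((lba * SECTOR_SIZE as u64 + i as u64) % 251) as u8;
        }
    }
}

/// Reads `out.len()` bytes starting at byte `offset`, one sector at a time.
fn read_bytes<D: SectorRead>(disk: &mut D, mut offset: u64, out: &mut [u8]) {
    let mut sector = [0u8; SECTOR_SIZE];
    let mut done = 0;
    while done < out.len() {
        let lba = offset / SECTOR_SIZE as u64;
        let within = (offset % SECTOR_SIZE as u64) as usize;
        // Copy whatever this sector can contribute to the remaining request.
        let n = (SECTOR_SIZE - within).min(out.len() - done);
        disk.read_sector(lba, &mut sector);
        out[done..done + n].clone_from_slice(&sector[within..within + n]);
        done += n;
        offset += n as u64;
    }
}
```

A request starting at byte 500 for 600 bytes touches three sectors here: the last 12 bytes of sector 0, all of sector 1, and the first 76 bytes of sector 2.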
At the time of writing the previous blog the plan was to target the Raspberry Pi 3 (Cortex A53) as a development platform because of its availability, popularity and community. Sadly, it seems that Broadcom went through a lot of shortcuts while implementing this specific design, which means features like GIC are half-there or completely missing, like in this case.
After a discussion with @microcolonel, he proposed and kindly sent me a HiKey960 reference SoC from the awesome Linaro 96Boards initiative. The quality of this board is definitely a lot better than the Raspberry Pi and the documentation is detailed and open. Great stuff.
With the recent addition of Rust 1.27.0 in the HaikuPorts repository, I thought it would be good to do a short, public write-up of the current state of Rust on Haiku, and some insight into the future.
This is the second blog post about implementing a FAT32 filesystem in Redox.
As promised in the previous article (thanks for all the valuable feedback ‒ I didn’t have the time to act on it yet, but I will), this talks about Unix signal handling.
Long story short, I wasn’t happy about the signal handling story in Rust and this is my attempt at improving it.
Over the last couple of weeks, Nebulet has progressed significantly. Because of that, I think it’s time to talk about why I made certain decisions when designing and writing Nebulet.
All excited. A first calendar entry to describe my attempt on arm64 support in Redox OS. Specifically, looking into the Raspberry Pi2/3b/3+ (all of them having a Cortex-A53 ARMv8 64-bit microprocessor, although for all my experiments I am going to use the Raspberry Pi 3b).
In this post we explore double faults in detail. We also set up an Interrupt Stack Table to catch double faults on a separate kernel stack. This way, we can completely prevent triple faults, even on kernel stack overflow.
In this post, we start exploring CPU exceptions. Exceptions occur in various erroneous situations, for example when accessing an invalid memory address or when dividing by zero. To catch them, we have to set up an interrupt descriptor table that provides handler functions. At the end of this post, our kernel will be able to catch breakpoint exceptions and to resume normal execution afterwards.
In this post we complete the testing picture by implementing a basic integration test framework, which allows us to run tests on the target system. The idea is to run tests inside QEMU and report the results back to the host through the serial port.
Last week I ended off stating that the redox netstack might soon switch to an edge-triggered model. Well, I ended up feeling bad about the idea of letting others do my work and decided to stop being lazy and just do it myself.
Rust is an extremely interesting language for the development of system software. This was the motivation to evaluate Rust for HermitCore and to develop an experimental version of our libOS in Rust. Components like the IP stack and uhyve (our unikernel hypervisor) are still written in C. In addition, the user applications are still compiled by our cross-compiler, which is based on gcc and supports C, C++, Fortran, and Go. The core of the kernel, however, is now written in Rust and published at GitHub. Our experiences so far are really good and we are looking into possibly new Rust activities, e.g., the support for Rust’s userland.
A first calendar entry to describe my attempt on ARM64 support in Redox OS. Specifically, looking into the Raspberry Pi2/3(B)/3+ (all of them having a Cortex-A53 ARMv8 64-bit microprocessor, although for all my experiments I am going to use the Raspberry Pi 3(B)).
This is a blog post about the work which I have done so far in implementing a FAT32 filesystem in Redox. Currently the Redox bootloader as well as the userspace filesystem daemon supports only RedoxFS.
This is the weekly summary for my Redox Summer of Code project: Porting tokio to redox. Most of the time was spent on one bug, and after that one was figured out and fixed it ended up being relatively easy! As of now, 11/13 tokio examples seem to work on redox. The remaining examples are UDP and seem to fail because of something either with the rust standard library or my setup.
This post explores unit testing in no_std executables using Rust's built-in test framework. We will adjust our code so that cargo test works and add some basic unit tests to our VGA buffer module.
Redox OS is running its own Summer of Code this year, after the Microkernel devroom did not get accepted into GSoC 2018. We are looking for both Students and Sponsors who want to help Redox OS grow. At the moment, Redox OS has $10,800 in donations from various platforms to use to fund students. This will give us three students working for three months, if each student requests $1200 per month on average as described in Payment.
In order to fund more students, we are looking for sponsors who are willing to fund RSoC. Donations can be made on the Donate page. All donations will be used to fund Redox OS activities, with about 90% of those over the past year currently allocated to RSoC.
Our second iteration of the 18.04 ISO is ready for testing. Testing the new installer and Optimus switching is our priority for this test release. Please test installing on a variety of hardware and provide feedback on any issues you encounter. If you run into any bugs, you can file them at https://github.com/pop-os/pop/issues.
Installing a toolchain for Rust is very easy, as support for CloudABI has been upstreamed into the Rust codebase. Automated builds are performed by the Rust developers. As there hasn’t been a stable release of Rust to include CloudABI support yet, you must for now make use of Rust’s nightly track.
Over the past six months we've been working on a second edition of this blog. Our goals for this new version are numerous and we are still not done yet, but today we reached a major milestone: It is now possible to build the OS natively on Windows, macOS, and Linux without any non-Rust dependencies.
Writing eBPF tracing tools in Rust
I have been playing with eBPF (extended Berkeley Packet Filters), a neat feature present in recent Linux versions (it evolved from the much older BPF filters). It is a virtual machine running in th…