This post explains how to implement heap allocators from scratch. It presents and discusses different allocator designs, including bump allocation, linked list allocation, and fixed-size block allocation. For each of the three designs, we will create a basic implementation that can be used for our kernel.
Operating SystemsUsing Rust to build or explore operating systems.
In this post, we will implement cooperative multitasking. For simplicity, we will use a round-robin scheduler, where each thread will be run in a FIFO order.
What is a cooperative scheduler? Threads can run as long they want, and can let other threads run by yielding to them. The problem? If threads refuse to yield, other threads will be unable to run.
iocp epoll kqueue
This book aims to explain how Epoll, Kqueue and IOCP works, and how we can use this for efficient, high performance I/O. The book is divided into three parts:
Part 1 - An express explanation: is probably what you want to read if you're interested in a short introduction.
The Appendix contains some additional references and small articles explaining some concepts that I found interesting and which is related to the kind of code we write here.
Part 2 is special. 99% of readers should not even go there. You'll find page up and down with code and explanations just to implement the simplest example of a cross-platform-eventloop that actually works. Turns out that there is no "express" way of doing this.
There are few times when I'm so excited about a new feature that I'll write about it before multiple PRs are merged in multiple repos. Typically one would wait and have the patience until everything is fully merged to master yet I can't wait to talk about this one cause it's just too damn cool.
What this new branch offers is a way to instantly reboot cloud vms whenever your application dies a horrible death. Let's say a bunch of bad packets from the wrong side of town arrive and decide to shoot your vm full of lead. In a typical linux setup that instance is probably dead, Jim. Your load balancer might start re-routing around it. In a container setup you might get the same sort of deal. Sure, if it was just the process that died systemd might be configured to restart on failure but the whole box?
However, what if you weren't running a full blown linux as your base vm? What if your base vm was only a single application that your vm booted straight into and your application was re-spawned in seconds as if it was a process instead of a vm?
Executables have been fascinating to me ever since I discovered, as a kid, that they were just files. If you renamed a .exe to something else, you could open it in notepad! And if you renamed something else to a .exe, you'd get a neat error dialog.
Clearly, something was different about these files. Seen from notepad, they were mostly gibberish, but there had to be order in that chaos. 12-year-old me knew that, although he didn't quite know how or where to dig to make sense of it all.
So, this series is dedicated to my past self. In it we'll attempt to understand how Linux executables are organized, how they are executed, and how to make a program that takes an executable fresh off the linker and compresses it - just because we can.
Since the last big series, Making our own ping, was all about Windows, this one will be focused on 64-bit Linux.
OxidizedOS is a multicore, x86-64 kernel written in Rust. In this Series, we will be discussing the implementation of kernel threads and a scheduler in Rust.
Krabs is an experimental x86 bootloader written in Rust. Krabs can load and start the ELF format Linux kernel compressed with bzip2. Some of the source code uses libbzip2 C library for decompressing, but the rest is completely Rust only.
This is chapter 6 of a multi-part series on writing a RISC-V OS in Rust. Processes are the whole point of the operating system. We want to start doing "stuff", which we'll fit into a process and get it going. We will update the process structure in the future as we add features to a process. For now, we need a program counter (which instruction is executing), and a stack for local memory.
We will not create our standard library for processes. In this chapter, we're just going to write kernel functions and wrap them into a process. When we start creating our user processes, we will need to read from the block device and start executing instructions. That's quite a ways a way, since we will need system calls and so forth.
In the past few months I’ve been working with Red Sift on RedBPF, a BPF toolkit for Rust. Red Sift uses RedBPF to power the security monitoring agent InGRAINd. Peter recently blogged about RedBPF and InGRAINd, and ran a workshop at RustFest Barcelona. We’ve continued to improve RedBPF since, fixing bugs, improving and adding new APIs, adding support for Google Kubernetes Engine kernels and more. We’ve also completed the relicensing of the project to Apache2/MIT – the licensing scheme used by many of the most prominent crates in the Rust ecosystem – which will hopefully make it even easier to adopt RedBPF.
In this post I’m going to go into some details into what RedBPF is, what its main components are, and what the full process of writing a BPF program looks like.
As a follow up to my post on distribution packaging, it was commented by Fraser Tweedale (@hackuador) that traditionally the “security” aspects of distribution packaging was a compelling reason to use distribution packages over “upstreams”. I want to dig into this further.
This post gives an overview of the recent updates to the Writing an OS in Rust blog and the used libraries and tools.
I moved to a new apartment mid-October and had lots of work to do there, so I didn't have the time for creating the October status update post. Therefore, this post lists the changes from both October and November. I'm slowly picking up speed again, but I still have a lot of mails in my backlog. Sorry if you haven't received an answer yet!
Microsoft can't throw away old Windows code, but the company's research under Project Verona is aiming to make Windows 10 more secure with its recent work on integrating Mozilla-developed Rust for low-level Windows components.
A few years back, I wrote up a detailed blog post on Docker's process 1, orphans, zombies, and signal handling. The solution from three years ago was a Haskell executable providing this functionality and a Docker image based on Ubuntu.
A few of the Haskellers on the FP Complete team have batted around the idea of rewriting pid1 in Rust as an educational exercise, and to have a nice comparison with Haskell. No one got around to it. However, when Rust 1.39 came out with async/await support, I was looking for a good use case to demonstrate, and decided I'd do this with pid1.
After the addition of the NVMe driver a couple months ago, I have been running Redox OS permanently (from an install to disk) on a System76 Galago Pro (galp3-c), with System76 Open Firmware as well as the un-announced, in-development, GPLv3 System76 EC firmware . This particular hardware has full support for the keyboard, touchpad, storage, and ethernet, making it easy to use with Redox.
Moonrise is a Linux init system written in Lua with Rust support code. An init system is a software suite responsible for bringing the userspace components of an operating system online and, in most cases, managing long-running components such as background services.
When I was writing a fingerd daemon in Rust (why? because I could), one thing that took me a little while to figure out was how to drop root privileges after I bound to port 79.
Neotron is an attempt to make computers simple again, whilst also taking advantage of the very latest in programming language development. It is based around four simple concepts: The ARM Thumb-v7M instruction set, A standardised OS interface, A standardised BIOS interface, and Use of the Rust Programming Language.
Today I’m releasing a library called iou. This library provides idiomatic Rust bindings to the C library called liburing, which itself is a higher interface for interacting with the io_uring Linux kernel interface. Here are the answers to some questions I expect that may provoke.
What is io_uring? io_uring is an interface added to the Linux kernel in version 5.1. Concurrent with that, the primary maintainer of that interface has also been publishing a library for interacting with it called liburing.
This blog describes part of the story of Rust adoption at Microsoft. Recently, I’ve been tasked with an experimental rewrite of a low-level system component of the Windows codebase (sorry, we can’t say which one yet). Instead of rewriting the code in C++, I was asked to use Rust, a memory-safe alternative. Though the project is not yet finished, I can say that my experience with Rust has been generally positive. It’s a good choice for those looking to avoid common mistakes that often lead to security vulnerabilities in C++ code bases.
During the product development process monitoring our pipelines proved challenging, and we wanted more visibility into our containers. After a short period of exploration, we found that eBPF would address most of the pain points and dark spots we were encountering.
There was one catch: no eBPF tooling would help us deploy and maintain new probes within our small, but focused ops team. BCC, while great for tinkering, requires significant effort to roll out to production. It also makes it difficult to integrate our toolkit into our usual CI/CD deployment models.
Faced with this dilemma, we decided the only option was for us to write our own Rust-based agent that integrated well with our testing and deployment strategies.
I have come to the point with C++/WinRT where I am largely satisfied with how it works and leverages C++ to the best of its ability. There is always room for improvement and I will continue to evolve and optimize C++/WinRT as the C++ language itself advances. But as a technology, the Windows Runtime has always been about more than just one language and we have started working on a few different projects to add support for various languages. None of these efforts could however draw me away from C++… that is until Rust showed up on my radar.
Rust is an intriguing language for me. It closely resembles C++ in many ways, hitting all the right notes when it comes to compilation and runtime model, type system and deterministic finalization, that I could not help but get a little excited about this fresh new take on language design. And so it is that I have started building the WinRT language projection for Rust.
Recently, a new Linux kernel interface, called io_uring, appeared. I have been looking into it a little bit and I can’t help but wondering about it. Unfortunately, I’ve had only enough time to keep thinking and reading about it. Nevertheless, I’ve decided to share what I’ve been thinking about so far in case someone wants to write some actual code and experiment. Basically, I have an idea for a crate and I’d love someone else to write it 😇.
In the world of systems programming where one may find themselves writing hardware drivers or interacting directly with memory-mapped devices, that interaction is almost always through memory-mapped registers provided by the hardware. We typically interact with these things through bitwise operations on some fixed-width numeric type.
QEMU and libvirt form the backend of the Red Hat userspace virtualization stack: they are used by our KVM-based products and by several applications included in Red Hat Enterprise Linux, such as virt-manager, libguestfs and GNOME Boxes.
Play with Linux process termination exploring such interesting features as PR_SET_CHILD_SUBREAPER and PR_SET_PDEATHSIG.
RISC-V ("risk five") and the Rust programming language both start with an R, so naturally they fit together. In this blog, we will write an operating system targeting the RISC-V architecture in Rust (mostly). If you have a sane development environment for RISC-V, you can skip the setup parts right to bootloading. Otherwise, it'll be fairly difficult to get started.
This tutorial will progressively build an operating system from start to something that you can show your friends or parents -- if they're significantly young enough. Since I'm rather new at this I decided to make it a "feature" that each blog post will mature as time goes on. More details will be added and some will be clarified.
We designed a framework to help developers to quickly build device drivers in Rust. We also utilized Rust’s security features to provide several useful infrastructures for developers so that they can easily handle kernel memory allocation and concurrency management, at the same time, some common bugs (e.g. use-after-free) can be alleviated.
We demonstrate the generality of our framework by implementing a real-world device driver on Raspberry Pi 3, and our evaluation shows that device drivers generated by our framework have acceptable binary size for canonical embedded systems and the runtime overhead is negligible.
The Redox official website
Over the past few months, System76 has been developing a simple, easy-to-use tool for updating firmware on Pop!_OS and System76 hardware. Today, we’re excited to announce that you can now check and update firmware through Settings on Pop!_OS, and through the firmware manager GTK application on System76 hardware running other Debian-based distributions.
In the last few weeks, I've been working on a new solution to firmware management on the Linux desktop. A generic framework which combines fwupd and system76-firmware; with a GTK frontend library and application; that is written in Rust.
The Redox official website
This week I’ve decided to skip trying to get GDB working for now (there are so many issues it’ll take forever to solve them), and instead decided to finally give focus to the final concerns I had about ptrace. Most changes this week was related to getting decent behavior of child processes, although the design feels… suboptimal, somehow (not sure why), so I feel I must be able to improve it better later.
Another change was security: Tracers running as a non-root user can now in addition to only tracing processes running as the same user, only trace processes that are directly or indirectly children of the tracer. In the future this can easily be allowed with some kind of capability, but currently in Redox there isn’t a capability-like system other than the simple (but really powerful) namespacing system which sadly I don’t think can be used for this.
Once again, last weeks action was merged, which means the full ptrace feature was merged, and it’s time to start tackling the final issues which I have delayed for so long. But, before that, I decided to try to get some basic ptrace compatibility in relibc, so we could see just how far away software like gdb is from being ported, and what concerns I haven’t thought about yet. redox-nix update: That said, I took a little break from the madness, to instead lay my focus on another interesting problem: Newer redoxer couldn’t be compiled using carnix, because of some dependency that used a cargo feature carnix didn’t support. Let me first explain what carnix is, and why this is a problem.
Before I dive in to this week’s actions, I am pleased to announce that all the last weeks’ work is merged! This merge means you can now experiment with basic ptrace functionality using only basic registers and PTRACE_SYSCALL/PTRACE_SINGLESTEP. I have already opened the second PR in the batch: Ptrace memory reading and floating point registers support which will supply the “final bits” of the initial implementation, before all the nitpicking of final concerns can start (not to underestimate the importance and difficulty of these nitpicks - there are some areas of ptrace that aren’t even thought about yet and those will need tending to)! I will comment on these changes in this blog post, as there are some interesting things going on!
Wrapping up the Ion as a library project. It is now possible to embed Ion in any Rust application. Ion takes any Read instance and can execute it (so yes, it is possible to run Ion without ever collecting the script’s binary stream). It takes care of expanding the input and managing the running applications in an efficient manner, with a comprehensive set of errors. Ion is now the rust-based, pipe-oriented liblua alternative.
The next step in the journey of ptrace was to bite the bullet (or at least I thought) and implement system-call tracing. Since the kernel must be able to handle system-calls of processes, it’s quite obvious that the way to set a breakpoint should involve the kernel, running in the context of the tracee, should notify the tracer and wait. So the biggest challenge would be to figure out how kernel synchronization worked.
This post adds support for heap allocation to our kernel. First, it gives an introduction to dynamic memory and shows how the borrow checker prevents common allocation errors. It then implements the basic allocation interface of Rust, creates a heap memory region, and sets up an allocator crate. At the end of this post all the allocation and collection types of the built-in alloc crate will be available to our kernel.
After having a pretty clear goal to meet specified by the RFC, time to get things moving. I started with what I thought would be low hanging fruit: Reading the registers of another process. It ended up being more difficult than I thought, but it ended up being really interesting and I want to share it with you :)
How to fetch batteries information from the macOS APIs with Rust
I will quickly show how I got bindgen (https://rust-lang.github.io/rust-bindgen) to generate the bindings to Fuse (libfuse) with the current stable1 release of Rust. By doing so, this should demonstrate how to bootstrap writing your own Fuse file system in Rust.
I do realise that there are some crates that already exist that aid in making Fuse drivers in Rust, but this was more or less an excuse to also try out bindgen, which I don't believe those existing libraries utilise.
Manticore is a research operating system, written in Rust, with the aim of exploring the parakernel OS architecture.
The OS is increasingly a bottleneck for server applications that want to take maximum advantage of the hardware. Many traditional kernel interfaces (such as in POSIX) were designed when I/O was significantly slower than the CPU. However, today I/O is getting faster, but single-threaded CPU performance has stagnated. For example, a 40 GbE NIC can receive a cache-line sized packet faster than the CPU can access its last-level cache (LLC), which makes it tricky for an OS to keep up with packets arriving from the network. Similarly, non-volatile memory (NVM) access speed is getting closer to DRAM speeds, which challenges OS abstractions for storage.
To address this OS bottleneck, server applications are increasingly adopting kernel-bypass techniques. For example, the Seastar framework is an OS implemented in userspace, which implements its own CPU and I/O scheduler, and bypasses the Linux kernel as much as it can. Parakernel is an OS architecture that eliminates many OS abstractions (similar to exokernels) and partitions hardware resources (similar to multikernels) to facilitate high-performance server application with increased application-level parallelism and predictable tail latency.
This repository contains a simple KVM firmware that is designed to be launched from anything that supports loading ELF binaries and running them with the Linux kernel loading standard. The ultimate goal is to be able to use this "firmware" to be able to load a bootloader from within a disk image.
This post explores unit and integration testing in no_std executables. We will use Rust's support for custom test frameworks to execute test functions inside our kernel. To report the results out of QEMU, we will use different features of QEMU and the bootimage tool.
Recently, x86_64-unknown-uefi target was added into Rust mainline (https://github.com/rust-lang/rust/pull/56769). So, I tried to write UEFI application with this update. There exists an awesome crate, uefi-rs, which provides Rust interface for UEFI application. However, this is my first time to write UEFI application, so to understand what happens in it, I didn’t use any existing crate.
It has been one year and four days since the last release of Redox OS! In this time, we have been hard at work improving the Redox ecosystem. Much of this work was related to relibc, a new C library written in Rust and maintained by the Redox OS project, and adding new packages to the cookbook. We are proud to report that we have now far exceeded the capabilities of newlib, which we were using as our system C library before. We have added many important libraries and programs, which you can see listed below.
This post shows how to implement paging support in our kernel. It first explores different techniques to make the physical page table frames accessible to the kernel and discusses their respective advantages and drawbacks. It then implements an address translation function and a function to create a new mapping.
This will be the first in a series of weekly updates on progress made in the development of Pop!_OS. Thus, this will only contain content pertaining specifically to Pop!_OS, though at times there may be some overlap with the hardware side of System76.
I’ve decided to take a look at Minix, which is an interesting microkernel OS. Naturally after building Minix from git, the first thing I decided to try was porting Rust’s std to Minix so I could cross-compile Rust programs from Linux to run under Minix. Okay, I suppose I could have started with something else, but porting Rust software and modifying the platform-depending part of std is something I have experience with from working on Redox OS. And Rust really isn’t that hard to port.
We are going to make a demo linux web-server with systemd, config file and installable .deb binary in Rust.
This post introduces paging, a very common memory management scheme that we will also use for our operating system. It explains why memory isolation is needed, how segmentation works, what virtual memory is, and how paging solves memory fragmentation issues. It also explores the layout of multilevel page tables on the x86_64 architecture.
It has been a long-standing tradition to develop a language far enough to be able to write the language's compiler in the same language, and Rust does the same. Rust is nowadays written in Rust. We've tracked down the earlier Rust versions, which were written in OCaml, and were planning to use these to bootstrap Rust. But in parallel, John Hudge (Mutabah) developed a Rust compiler, called "mrustc", written in C++. mrustc is now good enough to compile Rust 1.19.0. Using mrustc, we were able to build Rust entirely from source with a bootstrap chain
In this post we set up the programmable interrupt controller to correctly forward hardware interrupts to the CPU. To handle these interrupts we add new entries to our interrupt descriptor table, just like we did for our exception handlers. We will learn how to get periodic timer interrupts and how to get input from the keyboard.
Stratis 1.0 was quietly released last week with the 1.0 version marking its initial stable release and where also the on-disk meta-data format has been stabilized. Red Hat engineers believe Stratis is now ready for more widespread testing.
Time for me to pack up and never ever contribute to Redox ever again… Just kidding. This isn’t goodbye, you can’t get rid of me that easily I’m afraid. I’ll definitely want to contribute more, can’t however say with certainty how much time I’ll get, for school is approaching, quickly
The previous blog post discusses how raw disk reads were implemented in the loader stub. The next step was to implement a clean read API which can be used by different filesystem libraries in order to read their respective filesystems. Since the raw reads from the BIOS interrupt had a granularity in terms of sectors(each sector being 512 bytes), the reads had to be translated in order to provide byte level granularity. The clone_from_slice function ensures that a direct call to memcopy is not required. The refined read function is here.
At the time of writing the previous blog the plan was to target the Raspberry Pi 3 (Cortex A53) as a development platform because of its availability, popularity and community. Sadly, it seems that Broadcom went through a lot of shortcuts while implementing this specific design, which means features like GIC are half-there or completely missing, like in this case.
After a discussion with @microcolonel, he proposed and kindly sent me a HiKey960 reference SoC from the awesome Linaro 96Boards initiative. The quality of this board is definitely a lot better than the Raspberry Pi and the documentation is detailed and open. Great stuff.
With the recent addition of Rust 1.27.0 in the HaikuPorts repository, I thought it would be good to do a short, public write-up of the current state of Rust on Haiku, and some insight into the future.
This is the second blog post about implementing a FAT32 filesystem in Redox.
As promised in the previous article (thanks for all the valuable feedback ‒ I didn’t have the time to act on it yet, but I will), this talks about Unix signal handling.
Long story short, I wasn’t happy about the signal handling story in Rust and this is my attempt at improving it.
Over the last couple of weeks, Nebulet has progressed signifigantly. Because of that, I think it’s time to talk about why I made certain decisions when designing and writing Nebulet.
All excited. A first calendar entry to describe my attempt on arm64 support in Redox OS. Specifically, looking into the Raspberry Pi2/3b/3+(all of them having a Cortex-A53 ARMv8 64-bit microprocessor, although for all my experiments I am going to use the Raspberry Pi 3b.
In this post we explore double faults in detail. We also set up an Interrupt Stack Table to catch double faults on a separate kernel stack. This way, we can completely prevent triple faults, even on kernel stack overflow.
In this post, we start exploring CPU exceptions. Exceptions occur in various erroneous situations, for example when accessing an invalid memory address or when dividing by zero. To catch them, we have to set up an interrupt descriptor table that provides handler functions. At the end of this post, our kernel will be able to catch breakpoint exceptions and to resume normal execution afterwards.
In this post we complete the testing picture by implementing a basic integration test framework, which allows us to run tests on the target system. The idea is to run tests inside QEMU and report the results back to the host through the serial port.
Last week I ended off stating that the redox netstack might soon switch to an edge-triggered model. Well, I ended up feeling bad about the idea of letting others do my work and decided to stop being lazy and just do it myself.
Rust is an extremely interesting language for the development of system software. This was the motivation to evaluate Rust for HermitCore and to develop an experimental version of our libOS in Rust. Components like the IP stack and uhyve (our unikernel hypervisor) are still written in C. In addition, the user applications are still compiled by our cross-compiler, which is based on gcc and supports C, C++, Fortran, and Go. The core of the kernel, however, is now written in Rust and published at GitHub. Our experiences so far are really good and we are looking into possibly new Rust activities, e.g., the support for Rust’s userland.
A first calendar entry to describe my attempt on ARM64 support in Redox OS. Specifically, looking into the Raspberry Pi2/3(B)/3+ (all of them having a Cortex-A53 ARMv8 64-bit microprocessor, although for all my experiments I am going to use the Raspberry Pi 3(B)).
This is a blog post about the work which I have done so far in implementing a FAT32 filesystem in Redox. Currently the Redox bootloader as well as the userspace filesystem daemon supports only RedoxFS.
This is the weekly summary for my Redox Summer of Code project: Porting tokio to redox. Most of the time was spent on one bug, and after that one was figured out and fixed it ended up being relatively easy! As of now, 11⁄13 tokio examples seem to work on redox. The remaining examples are UDP and seem to fail because of something either with the rust standard library or my setup.
This post explores unit testing in no_std executables using Rust's built-in test framework. We will adjust our code so that cargo test works and add some basic unit tests to our VGA buffer module.
Redox OS is running its own Summer of Code this year, after the Microkernel devroom did not get accepted into GSoC 2018. We are looking for both Students and Sponsors who want to help Redox OS grow. At the moment, Redox OS has $10,800 in donations from various platforms to use to fund students. This will give us three students working for three months, if each student requests $1200 per month on average as described in Payment.
In order to fund more students, we are looking for sponsors who are willing to fund RSoC. Donations can be made on the Donate page. All donations will be used to fund Redox OS activities, with about 90% of those over the past year currently allocated to RSoC.
Our second iteration of the 18.04 ISO is ready for testing. Testing the new installer and Optimus switching is our priority for this test release. Please test installing on a variety of hardware and provide feedback on any issues you encounter. If you run into any bugs, you can file them at https://github.com/pop-os/pop/issues.
Installing a toolchain for Rust is very easy, as support for CloudABI has been upstreamed into the Rust codebase. Automated builds are performed by the Rust developers. As there hasn’t been a stable release of Rust to include CloudABI support yet, you must for now make use of Rust’s nightly track.
Over the past six months we've been working on a second edition of this blog. Our goals for this new version are numerous and we are still not done yet, but today we reached a major milestone: It is now possible to build the OS natively on Windows, macOS, and Linux without any non-Rust dependendencies.
Writing eBPF tracing tools in Rust
I have been playing with eBPF (extended Berkeley Packet Filters), a neat feature present in recent Linux versions (it evolved from the much older BPF filters). It is a virtual machine running in th…