In the eternal quest to rewrite everything in Rust, even the C standard library isn’t safe from carcinisation.
Modern Rust programs are, for the most part, written in Rust. For networking applications, the entire asynchronous stack is Rusty; no libuv in sight, only mio and polling. There is a robust rendering stack based on tiny-skia and cosmic-text. Even if you need FFI, the story is still pretty good. x11rb provides a robust wrapper around libxcb along with a fully Rust-based alternative, and wayland-rs does the same for Wayland.
Still, if you want to write pure Rust programs, there is one annoying dependency that nearly every Rust program has. Let’s take a basic smol-based program, written top-to-bottom in Rust. Or so I think. Let’s see what ldd says.
$ ldd linux-timerfd
linux-vdso.so.1 (0x00007fff58fae000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fe7681bb000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe767e00000)
/lib64/ld-linux-x86-64.so.2 (0x00007fe7682d4000)
Blech, disgusting! Let’s go over each of those libraries:
- linux-vdso is the vDSO, which implements certain system calls that can reasonably be handled in user space. It lets the program avoid syscall overhead unless a trip into the kernel is actually needed.
- libgcc provides implementations for certain operations, like floating point operations and exception handling. The “_s” stands for “shared”, since it is a shared library.
- ld-linux-x86-64 is the dynamic linker runtime. It’s what holds everything together.
- libc is the C standard library, which contains wrappers around every relevant system call as well as a handful of C-oriented routines.
We’re going to spend most of our time talking about libc.
What is libc?
libc wears a lot of hats. It provides an extensive library of functions that are useful for C programmers, like fopen and memchr. It also contains more platform-specific wrappers around OS-specific functionality, like kqueue. In addition, for many operating systems, it’s the only stable interface between user space and the kernel.
This part is important. Usually, a program tells the kernel to do something important through a system call, which is issued with a special interrupt-like instruction. However, on most operating systems these system call interfaces are unstable and prone to rapid, undocumented changes. This means that anyone trying to access kernel functionality has to go through libc, even if their program isn’t actually written in C. This isn’t a suggestion; Go tried to use direct system calls on macOS a while back and got burned by it. It turns out, when they say “unstable”, they actually do mean “will change in inconsistent, backwards-incompatible ways”.
There are two important exceptions to this. The first is Windows, which has its own Win32 API that its libc is just a thin-ish wrapper over. This wrapper is how C programs written for Linux can sometimes still be used on Windows, as long as they only use the portable parts of libc. For our purposes it isn’t important. The second exception is Linux, which actually does have a stable system-call interface. You can call it from anywhere without going through libc, and you don’t have to worry about the actual calls changing out from under you.
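To make that concrete, here is a minimal sketch of a system call made without libc, using Rust’s inline assembly. It assumes a Linux target on x86_64 or aarch64; the syscall numbers (39 and 172 for getpid) are specific to those architectures, and the function name raw_getpid is just something made up for this illustration.

```rust
use std::arch::asm;

/// Ask the kernel for our process ID without going through libc.
/// Assumption: Linux on x86_64, where `getpid` is syscall number 39.
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
fn raw_getpid() -> u32 {
    const SYS_GETPID: usize = 39;
    let ret: usize;
    unsafe {
        asm!(
            "syscall",
            // The syscall number goes in rax; the result comes back in rax.
            inlateout("rax") SYS_GETPID => ret,
            // The `syscall` instruction clobbers rcx and r11.
            out("rcx") _,
            out("r11") _,
            options(nostack),
        );
    }
    ret as u32
}

/// Same idea on aarch64 Linux, where `getpid` is syscall number 172.
#[cfg(all(target_os = "linux", target_arch = "aarch64"))]
fn raw_getpid() -> u32 {
    const SYS_GETPID: usize = 172;
    let ret: usize;
    unsafe {
        asm!(
            "svc 0",
            // The syscall number goes in x8; the result comes back in x0.
            in("x8") SYS_GETPID,
            lateout("x0") ret,
            options(nostack),
        );
    }
    ret as u32
}

fn main() {
    // Both values should agree, since they name the same process.
    println!("raw syscall: {}, std: {}", raw_getpid(), std::process::id());
}
```

Nobody should write this by hand in real code, which is exactly the gap that crates wrapping raw system calls fill.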
Musl Melee
Because the userspace interface for Linux is the system calls and not the libc, you actually have a choice in what implementation of libc to use aside from “whatever the OS developer wants you to use”. There are two prominent implementations:
- The GNU C Library (glibc), which is the battle-tested full-featured implementation.
- Musl libc, which aims to be a simpler implementation focused on static linking.
In fact, if you were wondering what the “gnu” at the end of “x86_64-unknown-linux-gnu” means, it stands for the GNU C Library that Rust is using as an interface to the system. If we have the x86_64-unknown-linux-musl target installed, we can switch that out for Musl pretty easily.
$ rustup target add x86_64-unknown-linux-musl
$ cargo build --target x86_64-unknown-linux-musl
I wonder what ldd says now?
$ ldd linux-timerfd
statically linked
Well, look at that! By default, the *-musl target statically links the binary. No more dependencies! Everything is good, forever!
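If you ever want the program itself to report which flavor it was built for, the last part of the target triple is exposed to Rust code as the target_env configuration value. A small sketch, assuming a Linux target (the libc_flavor helper is a name I made up for this example):

```rust
/// Name the C library flavor this binary was compiled for, based on the
/// `target_env` part of the target triple. Assumes a Linux target.
fn libc_flavor() -> &'static str {
    if cfg!(target_env = "gnu") {
        "glibc"
    } else if cfg!(target_env = "musl") {
        "musl"
    } else {
        "something else"
    }
}

fn main() {
    println!("built against: {}", libc_flavor());
}
```

Building this with and without --target x86_64-unknown-linux-musl should flip the answer, since cfg! is resolved at compile time.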
Rustix Revelation
Hmm, there’s something gnawing at the edge of my mind, like there’s something still wrong with this program. I just can’t put my finger on it.
It must be that there’s still C in there. Even though it’s statically linked, Musl is still a massive blob of unsafe, unsound, filthy C code.
Well, actually, Musl, and glibc for that matter, are extraordinarily well tested. Being C standard libraries, they’re actually held to a very high standard for security and soundness.
But after my exposure to the Rustonomicon, my sanity has begun to decay like a flower wilting in winter. So let’s leave the realm of rational thought and imagine what can be. A veil lifts in my mind, revealing the Platonic ideal of a perfect program:
All Rust. Down to the very last bit. Perfect, clean Rust.
If only there was some way that we could tear that C code out by the teeth and leave it rotting at the wayside. But alas, our program needs that little bit of C code to run. Even if we were to rewrite our Rust code to use only syscalls, there would still need to be some glue code for program initialization, signal handling and threading. All written in Assembly and dirty, dirty C.
No, I see something wondrous. The answer to my prayers. What will purify my unclean executables and let them ascend into His Light!
The answer is eyra.
eyra is a set of libraries that aim to replace the role of the traditional libc in modern programs. It is written entirely in Rust, not counting the bits of Assembly necessary to tie the entire thing together. Not even a trace of C.
eyra was written by Dan Gohman, who is also the primary author of rustix. rustix is a safe wrapper around either raw system calls on Linux or libc on other platforms, and is a very fascinating piece of software that deserves its own blogpost. The point is, eyra is rustix taken to its logical conclusion: a complete replacement for libc.
The main drawback of eyra is that the process of integrating it into your program is more involved than just setting --target. But, it’s not so bad. Let’s write an eyra example program.
Enabling Eyra
First things first, let’s let cargo take care of scaffolding for us.
$ cargo new --bin eyra-example
Created binary (application) `eyra-example` package
Let’s make it a little bit more complex than a “Hello, world!” program. Say, a smol-based TCP server that tells bad jokes.
$ cargo add smol fastrand eyre
Updating crates.io index
Adding smol v1.3.0 to dependencies.
Adding fastrand v2.0.1 to dependencies.
Features:
+ alloc
+ std
- getrandom
- js
Adding eyre v0.6.8 to dependencies.
Features:
+ auto-install
+ track-caller
- pyo3
Updating crates.io index
In src/main.rs, we write a simple TCP server:
// in src/main.rs
use eyre::Result;
use smol::io::BufReader;
use smol::net::{TcpListener, TcpStream};
use smol::prelude::*;

const BAD_JOKES: &[&str] = &[
    "What do you call a fly without wings? A walk.",
    "Did you hear about the dull pencil? It was pointless.",
    "Why did the golfer bring two pairs of pants? In case he got a hole in one."
];

/// Handle an incoming connection.
async fn handle_connection(stream: TcpStream) -> Result<()> {
    // Wrap the stream in a BufReader to ease reading lines.
    let mut stream = BufReader::new(stream);
    // Read a line from the stream.
    let mut command = String::new();
    stream.read_line(&mut command).await?;
    // Remove the newline at the end if there is one.
    if command.ends_with('\n') {
        command.pop();
    }
    // Send a joke if the user asked for one.
    command.make_ascii_lowercase();
    if command == "tell me a joke" {
        // Choose a joke and send it.
        let joke = format!("{}\n", fastrand::choice(BAD_JOKES).unwrap());
        stream.get_mut().write_all(joke.as_bytes()).await?;
    } else {
        // Otherwise, send an error message.
        let message = "I only know how to tell jokes.\n";
        stream.get_mut().write_all(message.as_bytes()).await?;
    }
    Ok(())
}

fn main() -> eyre::Result<()> {
    smol::block_on(async {
        // Listen on a random port.
        let listener = TcpListener::bind("127.0.0.1:0").await?;
        let addr = listener.local_addr()?;
        println!("Listening at address {:?}", addr);
        // Start running an executor.
        let ex = smol::Executor::new();
        ex.run(async {
            loop {
                // Accept a new connection.
                let (stream, _) = listener.accept().await?;
                // Spawn a task to handle the connection.
                ex.spawn(async move {
                    // If an error occurs while running the task, print it.
                    if let Err(e) = handle_connection(stream).await {
                        eprintln!("An error occurred: {}", e);
                    }
                }).detach();
            }
        }).await
    })
}
For those unfamiliar with smol’s API, see the comments for a breakdown of how the program works.
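One detail worth calling out: the server only strips a trailing "\n", so a client that ends its lines with "\r\n" (telnet, for instance) would get the error message instead of a joke. Here is a rough sketch of a more forgiving version of the command check. The helpers normalize_command and wants_joke are hypothetical names and are not part of the server above.

```rust
/// Strip any trailing "\n" or "\r\n" and lowercase the command.
fn normalize_command(line: &str) -> String {
    line.trim_end_matches(|c: char| c == '\r' || c == '\n')
        .to_ascii_lowercase()
}

/// Hypothetical helper: does this line ask for a joke?
fn wants_joke(line: &str) -> bool {
    normalize_command(line) == "tell me a joke"
}

fn main() {
    for line in ["tell me a joke\n", "Tell Me A Joke\r\n", "sing a song\n"] {
        println!("{:?} -> {}", line, wants_joke(line));
    }
}
```

Keeping this as a plain function over &str also means it can be tested without opening a socket.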
When we run the program, it tells us the IP address that it’s listening on:
$ cargo run
Compiling eyra-example v0.1.0 (/home/jtnunley/Programming/eyra-example)
Finished dev [unoptimized + debuginfo] target(s) in 0.64s
Running `/home/jtnunley/Programming/CargoTarget/debug/eyra-example`
Listening at address 127.0.0.1:44439
In lieu of a dedicated client, we can use netcat to test out the server.
$ echo "tell me a joke" | nc 127.0.0.1 44439
Why did the golfer bring two pairs of pants? In case he got a hole in one.
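If netcat isn’t available, a dedicated client is only a few lines of standard-library Rust. This is a sketch rather than anything from the project itself: it uses blocking std::net instead of smol, and the ask_for_joke function and the joke-client name in the usage message are made up for the example.

```rust
use std::io::{self, BufRead, BufReader, Write};
use std::net::{TcpStream, ToSocketAddrs};

/// Connect to the joke server, send the command, and return the one-line reply.
fn ask_for_joke<A: ToSocketAddrs>(addr: A) -> io::Result<String> {
    let mut stream = TcpStream::connect(addr)?;
    // The server reads a single line, so the trailing newline matters.
    stream.write_all(b"tell me a joke\n")?;

    // Read the single line the server sends back.
    let mut reply = String::new();
    BufReader::new(stream).read_line(&mut reply)?;
    Ok(reply.trim_end().to_string())
}

fn main() -> io::Result<()> {
    // Pass the address the server printed, e.g. 127.0.0.1:44439.
    match std::env::args().nth(1) {
        Some(addr) => println!("{}", ask_for_joke(addr.as_str())?),
        None => eprintln!("usage: joke-client <ip:port>"),
    }
    Ok(())
}
```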
By checking with ldd, we can see that we’ve compiled this program against glibc.
$ ldd /home/jtnunley/Programming/CargoTarget/debug/eyra-example
linux-vdso.so.1 (0x00007ffe8334d000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f4bb8376000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f4bb8000000)
/lib64/ld-linux-x86-64.so.2 (0x00007f4bb850a000)
Let’s try to integrate with eyra! (Not eyre, which I use to simplify error handling.) First, we need to add the latest version of eyra to our project. Let’s also add logging so we can see what eyra is doing under the hood.
$ cargo add eyra -F log,env_logger
Updating crates.io index
Adding eyra v0.15.2 to dependencies.
Features:
+ env_logger
+ log
- experimental-relocate
- max_level_off
Using cargo tree, we can see that this pulls in c-gull, c-scape and a bunch of other things.
$ cargo tree
eyra-example v0.1.0 (/home/jtnunley/Programming/eyra-example)
├── eyra v0.15.2
│ └── c-gull v0.15.3
│ ├── c-scape v0.15.3
│ │ ├── < like, a lot of packages >
< snip rest of the output >
Then, we add extern crate eyra to the top of the main.rs file so that Rust knows to link to eyra, even if we don’t directly use anything from it.
// in src/main.rs
extern crate eyra;
// <snip rest of file>
Finally, we have to add a build.rs file, which is a little build script that runs before your Rust crate is compiled. We use this to tell Rust to link using the -nostartfiles argument, which tells the linker not to bring in the C runtime’s startup files. This is because eyra has its own initializing runtime, written in Rust.
// in build.rs
fn main() {
    println!("cargo:rustc-link-arg=-nostartfiles");
}
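If the same crate also has to build on other operating systems, the flag can be emitted conditionally. The following is only a sketch under a couple of assumptions: that eyra is used for Linux builds only (the dependency itself would need to be made conditional too), and that the made-up helper link_args_for is an acceptable way to structure it. CARGO_CFG_TARGET_OS is the environment variable Cargo sets for build scripts to describe the target OS.

```rust
// in build.rs
use std::env;

/// Linker arguments to emit for a given target OS.
/// Hypothetical helper: only Linux gets `-nostartfiles`.
fn link_args_for(target_os: &str) -> Vec<&'static str> {
    if target_os == "linux" {
        vec!["-nostartfiles"]
    } else {
        Vec::new()
    }
}

fn main() {
    // Cargo sets this for build scripts; fall back to an empty string so the
    // script does not panic if it is ever run outside of Cargo.
    let target_os = env::var("CARGO_CFG_TARGET_OS").unwrap_or_default();
    for arg in link_args_for(&target_os) {
        println!("cargo:rustc-link-arg={}", arg);
    }
}
```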
Now, we can run cargo build, which builds significantly more dependencies than before. Afterwards, we still have the eyra-example executable. Let’s see what’s inside.
$ ldd /home/jtnunley/Programming/CargoTarget/debug/eyra-example
statically linked
Nice! It’s been statically linked, hopefully with 100% Rust code. Let’s run the executable with RUST_LOG=trace and see how it works.
$ RUST_LOG=trace cargo run
Finished dev [unoptimized + debuginfo] target(s) in 0.03s
Running `/home/jtnunley/Programming/CargoTarget/debug/eyra-example`
[TRACE origin::program] Program started
[TRACE origin::thread] Main Thread[Pid(89539)] initialized
[TRACE origin::program] Calling `.init_array`-registered function `0x563e1d8a5600(1, 0x7ffc4fff5d78, 0x7ffc4fff5d88)`
[TRACE origin::program] Calling `origin_main(1, 0x7ffc4fff5d78, 0x7ffc4fff5d88)`
[TRACE async_io::driver] block_on()
[TRACE origin::thread] Thread[Pid(89539)] launched thread Thread[89541] with stack_size=2097152 and guard_size=16384
[TRACE origin::thread] Thread[89541] marked as detached by Thread[Pid(89539)]
[TRACE polling::epoll] add: epoll_fd=4, fd=6, ev=Event { key: 18446744073709551615, readable: false, writable: false }
[TRACE polling::epoll] add: epoll_fd=4, fd=5, ev=Event { key: 18446744073709551615, readable: true, writable: false }
< snip: lots of logs from smol being initialized >
Let’s break it down:
[TRACE origin::program] Program started
[TRACE origin::thread] Main Thread[Pid(89539)] initialized
[TRACE origin::program] Calling `.init_array`-registered function `0x563e1d8a5600(1, 0x7ffc4fff5d78, 0x7ffc4fff5d88)`
[TRACE origin::program] Calling `origin_main(1, 0x7ffc4fff5d78, 0x7ffc4fff5d88)`
These logs come from the program starting up and setting everything up. It initializes the main thread, calls the program constructors (see the ctor crate if you want to know more about that), and launches the program’s entry point, origin_main.
[TRACE async_io::driver] block_on()
[TRACE origin::thread] Thread[Pid(89539)] launched thread Thread[89541] with stack_size=2097152 and guard_size=16384
[TRACE origin::thread] Thread[89541] marked as detached by Thread[Pid(89539)]
async-io, smol’s I/O driver, works by spawning a thread and then running epoll from that. This is used to deliver events throughout the program. Here we can see the driver being started, then a thread being launched to run epoll on.
As we can see, our program is now running on pure-Rust (and a little assembly) software. Does it work?
$ echo "tell me a joke" | nc 127.0.0.1 37279
What do you call a fly without wings? A walk.
Works like a charm!
Final Tally
Although it’s certainly a neat project that’s treading a lot of new ground, I probably wouldn’t recommend using eyra for a production-grade project. It’s still wet behind the ears and it doesn’t add much practical value to projects. Still, it’s cool to be able to say that my project is 100% Rust.
The setup is somewhat convoluted. It would be nice if there were some subcommand that set up eyra for a project temporarily, like cargo-hack does.
Also, eyra still doesn’t support every libc function. It’s a slow uphill battle. The project is open to contributions if you’re missing something important.
Until then, I’m very excited for what eyra will bring for Rust programs in the future.