The goal of the pantheon project is to develop an entire OS without external dependencies.
In addition, I try to not look at too much existing code or designs, in order to find
the pitfalls myself.
In the previous part I mentioned that we were going on a long yak shave; well, we are definitely not ready to go back on track yet. This part will build upon the argument parsing library in order to write a brand-new binary: a build system!
I named this build system vishvakarma (or vvk) after the Hindu god of craftsmen (and a divine
architect). As it is completely overkill, it defines its own language to declare build targets.
In addition, this build system should seamlessly propagate the correct build architecture through
the dependency graph, so that we never have to worry about it.
The complete repository for pantheon may become quite large, and as such we want to avoid building
parts of it that are not needed. A nice way (but completely overkill for a build-system) to do this
is to use a lazy language, such that nothing is evaluated if it is not needed.
This article will go into the different parts that go into making a build system, like parsing the build description, scheduling the builds or driving the compiler.
Let's say we have a shared library lib, and we want both hades and vvk
to use it. We want to be able to write this:
bare-metal {
name: 'hades',
root: 'hades/src/main.rs',
linker_script: 'hades/hades.x',
language: 'rust',
dependencies: [::lib],
}
executable {
name: 'vvk',
root: 'vishvakarma/src/main.rs',
language: 'rust',
dependencies: [::lib],
}
lib = library {
name: 'lib',
root: 'lib/src/lib.rs',
language: 'rust',
}
The library will be automatically built both for RISC-V 64 to be linked into hades, and for the native
architecture to be linked into vvk. In addition, if we modify hades/src/main.rs we want to only rebuild
hades, whereas if lib/src/lib.rs is modified we want to rebuild both hades and vvk.
Parsing will be divided in two parts, as usual:
- a lexer, turning the raw source into a stream of tokens
- a parser, turning the token stream into an AST
Both parts will be built upon a simple type used to generate errors, a SpannedValue:
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct Location {
pub start: usize,
pub end: usize,
pub source: Source,
}
#[derive(Clone, Debug)]
struct SpannedValue<T> {
pub location: Location,
pub v: T,
}
In order to make this type a bit more ergonomic we can define a few utilities for it:
impl<T> SpannedValue<T> {
pub fn span(&self) -> Location {
self.location.clone()
}
}
pub trait Spanned: Sized {
fn spanned<W: Into<Location>>(self, span: W) -> SpannedValue<Self>;
}
impl<T> Spanned for T {
fn spanned<W: Into<Location>>(self, span: W) -> SpannedValue<Self> {
SpannedValue {
location: span.into(),
v: self,
}
}
}
Those values allow us to tag an AST node with the correct span, and then to derive new values that reuse the same location.
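For example (a made-up snippet, assuming we already have a source loaded), a parsed value and its span travel together, and derived values can point diagnostics at the same place:

// `source` is assumed to be the Source the lexer is currently reading.
let location = Location { start: 0, end: 5, source: source.clone() };
let name: SpannedValue<String> = "hades".to_string().spanned(location);
// A value derived from `name` can reuse the same span for diagnostics:
let upper = name.v.to_uppercase().spanned(name.span());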
The Source type definition is a bit more involved, and a bit more interesting. We could define it
as follows:
#[derive(Debug)]
struct SourceInner {
pub path: PathBuf,
pub data: String,
}
#[derive(Clone, Debug)]
struct Source(Rc<SourceInner>);
Using a Rc allows us to cheaply create references to the Source in order to generate diagnostics
on errors.
But this has a bit of an issue. Sources will be the input to our lexer, and the main operation
of a lexer is to read some part of the data string. To reach this string we first need to dereference
the Rc, then the String, a nice pointer chasing in the hot path (do note that performance is
absolutely not necessary as the runtime will be dominated by calling rustc).
We can instead define the types as follows:
#[derive(Debug)]
struct SourceInner {
pub path: PathBuf,
pub data: str, // <--- This is no longer a pointer
}
#[derive(Clone, Debug)]
struct Source(Rc<SourceInner>); // This has not changed
This type has much better locality: we only need to dereference one pointer in the hot path, the
same as a String. But this definition is not really sufficient on its own: how would we go about creating
such a type? You can try to write a function fn(path: PathBuf) -> Source that does it, but
it’s not easy. This is because SourceInner is not a "normal" type, it is instead a Dynamically
Sized Type (or DST).
Such types are a bit special because pointers to them are not simply an address, they also contain
metadata that allows us to dynamically know some properties about them. For example a str pointer
contains the length of the string. As our type contains a str we want to forward the same information
to the Rc.
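To make the "fat pointer" idea concrete, here is a tiny standalone check (illustrative only, not part of vvk): a reference to a str is twice the size of a thin pointer, because it carries the length alongside the address.

fn main() {
    // A reference to a sized type is a single machine word...
    assert_eq!(std::mem::size_of::<&u8>(), std::mem::size_of::<usize>());
    // ...while a reference to a str also stores the length (the DST metadata).
    assert_eq!(std::mem::size_of::<&str>(), 2 * std::mem::size_of::<usize>());
}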
This will require us to slightly tweak the definition:
#[derive(Debug)]
#[repr(C)] // <--- new
struct SourceInner {
pub path: PathBuf,
end: (), // <--- new
pub data: str,
}
// A prefix of the SourceInner struct
#[repr(C)]
struct SourceInnerSized {
path: PathBuf,
}
This will allow us to rely on unsafe operations to construct a SourceInner:
pub fn new(path: PathBuf) -> Result<Source, Error> {
let data = std::fs::read_to_string(&path)?;
let layout = Layout::for_value(&path)
.extend(Layout::from_size_align(data.len(), 1).unwrap())
.unwrap()
.0
.pad_to_align(); // This is important to guarantee that the struct is correctly sized
let inner_alloc = unsafe { std::alloc::alloc(layout) as *mut SourceInnerSized };
unsafe {
let path_ptr = &raw mut (*inner_alloc).path;
path_ptr.write(path);
}
let str_dst_ptr = unsafe {
let str_ptr = inner_alloc.byte_add(std::mem::offset_of!(SourceInner, end)) as *mut u8;
std::ptr::copy_nonoverlapping(data.as_ptr(), str_ptr, data.len());
// This is the only way to obtain a DST with the metadata we want
std::ptr::slice_from_raw_parts_mut(str_ptr, data.len())
};
let dst_box = unsafe {
let p =
// Keep the DST metadata while changing the type to be correct
(str_dst_ptr as *mut SourceInner).byte_sub(std::mem::offset_of!(SourceInner, end));
Box::from_raw(p)
};
// Convert the dyn Box to a dyn Rc
Ok(Source(dst_box.into()))
}
Note that there are some strange things to do:
- the #[repr(C)] layout and the zero-sized end field are there so that we know exactly where the string data starts
- we cannot create the DST Rc directly, so we first create a Box and then convert it

Lexing will be quite simple, we will define the following list of tokens:
- 'string': String literal
- mod: Module keyword
- <identifier>: Identifier
- @<directive>: Directive
- ;, {, }, [, ], ,, :, =, ::: Symbols

The lexer structure itself will store the source, and the current offset in the source. When a new token is requested it can begin to match the tokens, by priority:
Firstly the symbols are searched, using str::starts_with. This requires ordering the tokens
such that if one symbol is a prefix of another, the shorter one comes later (:: must be checked before :).
Then if a ' is encountered the lexer will advance until it finds the first unescaped '.
For all remaining tokens we must first find the end of the potential token. This is done by searching for the first whitespace or symbol in the input.
If the substring we obtain starts with @ then we return it as a directive, else we will match it
as a keyword.
If it was neither a keyword nor a directive it must be an identifier (which is then checked to be
only alphanumerical).
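Putting those priorities together, the core of the matching can be sketched as a single function (a simplified sketch: the names are mine, only the mod keyword is handled, and the alphanumeric check on identifiers is left as a comment):

#[derive(Debug, PartialEq)]
enum TokenKind {
    Symbol(&'static str),
    String,
    Directive,
    Module,
    Identifier,
}

// `rest` is the not-yet-consumed input; returns the kind and how many bytes were consumed.
fn match_token(rest: &str) -> Result<(TokenKind, usize), String> {
    // 1. Symbols first, longer ones before their prefixes ("::" before ":")
    const SYMBOLS: &[&str] = &["::", ";", "{", "}", "[", "]", ",", ":", "="];
    for sym in SYMBOLS {
        if rest.starts_with(sym) {
            return Ok((TokenKind::Symbol(sym), sym.len()));
        }
    }
    // 2. String literals: advance until the first unescaped quote
    if let Some(body) = rest.strip_prefix('\'') {
        let mut escaped = false;
        for (i, c) in body.char_indices() {
            match c {
                '\\' if !escaped => escaped = true,
                '\'' if !escaped => return Ok((TokenKind::String, i + 2)),
                _ => escaped = false,
            }
        }
        return Err("unterminated string".to_string());
    }
    // 3. Everything else: cut at the first whitespace or symbol, then classify
    let end = rest
        .find(|c: char| c.is_whitespace() || ";{}[],:=".contains(c))
        .unwrap_or(rest.len());
    let word = &rest[..end];
    Ok(match word {
        w if w.starts_with('@') => (TokenKind::Directive, end),
        "mod" => (TokenKind::Module, end),
        _ => (TokenKind::Identifier, end), // checked to be alphanumeric in the real code
    })
}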
The parser will be required to peek at the token stream in order to create the syntax tree. The lexer
can be easily adapted to support this, by using a VecDeque to store the peeked tokens.
When the parser asks for the next token the queue will be checked first; only if it is empty do we
go back to lexing the input.
To avoid errors in the parser, peeking will only return the token type, not its value.
To help the parser create sensible error messages the Lexer also has a utility function that wraps
a closure to return a SpannedValue spanning the entire range of the source for the tokens that have
been consumed.
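A minimal sketch of that peeking layer, assuming the lexer exposes a next_token method and that tokens carry their kind (the names here are illustrative, not the exact ones used in vvk):

use std::collections::VecDeque;

struct TokenParser<'a> {
    lexer: Lexer<'a>,
    // Tokens that have been peeked at but not consumed yet
    peeked: VecDeque<Token<'a>>,
}

impl<'a> TokenParser<'a> {
    fn next(&mut self) -> Option<Token<'a>> {
        // Drain the peek queue first, only then pull a fresh token from the lexer
        self.peeked.pop_front().or_else(|| self.lexer.next_token())
    }

    // Peeking only exposes the token kind, never its value
    fn peek_nth(&mut self, n: usize) -> Option<TokenKind> {
        while self.peeked.len() <= n {
            self.peeked.push_back(self.lexer.next_token()?);
        }
        Some(self.peeked[n].kind)
    }
}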
I’ve used lalrpop in the past to write parsers, but I’d like to avoid external dependencies in pantheon, so I’ll need to use something custom. A long time ago I wrote a parser generator for SLR parsers, stelar, but from what I remember the error messages were terrible, and I believe there were still bugs to fix.
As such we will go in another direction, hand-writing a parser based on recursive descent.
We will first define our (naive, I know there are more cache efficient ways to create this) AST:
#[derive(Debug)]
pub enum Statement {
Module(String),
Assign {
name: String,
value: Rc<SpannedValue<Expression>>,
},
Expr(SpannedValue<Expression>),
}
#[derive(Debug, Clone)]
pub enum Expression {
String(Rc<str>),
Array(Vec<Rc<SpannedValue<Expression>>>),
Identifier(ItemPath),
Target(TargetExpr),
}
#[derive(Debug, Clone)]
pub struct TargetExpr {
pub directives: Vec<Directive>,
pub kind: TargetKind,
pub args: Arguments,
}
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TargetKind {
Executable,
Library,
ProcMacro,
Test,
BareMetalBin,
}
#[derive(Debug, Clone)]
pub struct Arguments(Rc<HashMap<SpannedValue<String>, Rc<SpannedValue<Expression>>>>);
// Not really, but let’s not complicate things
type Directive = String;
We will then need to write, for each AST node, a function that is able to parse it from the token stream.
For example for statements we could write the following:
impl Statement {
fn parse<'a>(
tokens: &mut TokenParser<'a>,
root: &ItemPath,
mut directives: Vec<SpannedValue<&'a str>>,
) -> Result<SpannedValue<Self>, ParseError> {
tokens.parse_seq(|tokens| -> Result<_, ParseError> {
let mut peeker = tokens.peek_cursor();
match peeker.peek()? {
// This is not directly "parsed" but accumulated for the expression that
// will require it
TokenKind::Directive => {
let token = tokens.next().unwrap();
directives.push(token.value);
Ok(Statement::parse(tokens, root, directives)?.v)
}
TokenKind::Module => {
if let Some(directive) = directives.first() {
return Err(ParseError::UnsupportedDirective {
name: directive.v.to_string(),
location: directive.span(),
});
}
// mod keyword
tokens.next().unwrap();
let name = tokens.next_identifier("module statement")?;
tokens.assert_next(TokenKind::Semicolon, "module statement")?;
Ok(Statement::Module(name.v.to_owned()))
}
TokenKind::Identifier => {
if peeker.peek()? == TokenKind::Equal {
let token = tokens.next().unwrap().value;
let name = token.v.to_string();
tokens.next().unwrap(); // Skip the equal sign
Ok(Statement::Assign {
name,
value: Rc::new(Expression::parse(tokens, root, &directives)?),
})
} else {
Ok(Statement::Expr(Expression::parse(
tokens,
root,
&directives,
)?))
}
}
_ => Ok(Statement::Expr(Expression::parse(
tokens,
root,
&directives,
)?)),
}
})
}
}
Expressions are a bit more complicated, but can be summarized as such:
- '<string>': A literal expression, nothing much to do
- ::: Parse a succession of :: and <identifier>, this represents an absolute item path
- [: Start parsing an array, this means parsing a succession of expressions followed by ,, terminated by a ]. As trailing commas are allowed, care must be taken to correctly detect the end of the array
- <identifier>: This is the most "tricky" part as there are two possible kinds of expressions, either a relative path to a variable (foo or foo::bar), or a TargetExpr. Those two cases can be decided by peeking one more token. This is the reason we had to have a VecDeque instead of simply an Option for the peeked value.

Target expressions are easy enough to parse, so we will not dwell on them more. This grammar is relatively simple, and as such using a handwritten parser is not an issue at all. The parser is not particularly optimized, but the runtime of the tool is going to be dwarfed by calls to the compiler, so this should not be a problem.
The evaluation is going to be split into two phases:
- identifying which targets actually need to be evaluated
- lazily evaluating the corresponding expressions

As stated in the goal we want the evaluation to be lazy; this means that we need a way to identify which targets are going to be run (and built).
A vvk tree can look something like this:
.
├── root.vvk
├── apis
│ ├── build.vvk
│ └── src
│ └── ...
├── hades
│ ├── build.vvk
│ └── src
│ └── ...
├── sheshat
│ ├── build.vvk
│ ├── derive
│ │ └── src
│ │ └── ...
│ ├── src
│ │ └── ...
│ └── tests
│ └── ...
└── vishvakarma
├── build.vvk
└── src
└── ...
vvk will search for a root.vvk file in the current directory, or any ancestor directory.
Once the root has been found the entire tree can be extracted by following the mod statements
and recursively parsing the child build.vvk files.
This does not evaluate anything, so we can do it eagerly.
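A rough sketch of that walk, reusing Source::new and the Statement type from above (parse is a stand-in for the recursive-descent entry point, and error handling is simplified to a single Error type):

use std::path::{Path, PathBuf};

fn load_tree(
    dir: &Path,
    file: &Path,
    out: &mut Vec<(PathBuf, Vec<Statement>)>,
) -> Result<(), Error> {
    let source = Source::new(file.to_path_buf())?;
    let statements = parse(&source)?;
    // Recurse into `<dir>/<name>/build.vvk` for every `mod <name>;` statement
    for statement in &statements {
        if let Statement::Module(name) = statement {
            let child = dir.join(name);
            load_tree(&child, &child.join("build.vvk"), out)?;
        }
    }
    out.push((file.to_path_buf(), statements));
    Ok(())
}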
The next step is to resolve all assignments (not to evaluate them!). This is done by registering the source code associated with each variable name, so that we can evaluate it at a later stage.
We will then choose the "primary" build file to start with. This can be done in two ways,
either by specifying the file on the command line, or by searching for the first build.vvk file
in a parent directory.
Once the build.vvk file has been identified vvk will read the statements in each file, recursively,
searching for targets annotated with the @def directive.
This directive identifies the "root" expressions that will be evaluated.
Alternatively, if vvk run is called, the target annotated with @main will be searched for in the current file,
and, if none was found, in the subtree.
This is the only use of directives as of today, constraining the evaluation.
We can’t use an argument in the target definition, as this would mean that we would need to start
evaluating targets.
We are now ready to start evaluating expressions! Let’s define the structures that will hold our values:
#[derive(Debug, Clone)]
struct LazyValue(Rc<RefCell<ValueInner>>);
#[derive(Debug)]
enum ValueInner {
/// A value that has already been evaluated
Realized(Value),
/// A value that has yet to be evaluated
Lazy(Rc<SpannedValue<Expression>>),
}
/// Our primitive types
#[derive(Debug, Clone)]
enum Value {
/// String literal
String(Rc<str>),
/// Array expressions
Array(Rc<[LazyValue]>),
/// Targets
Target(Rc<Target>),
}
// We don’t care for now
struct Target { ... }
The idea is that we convert all ast::Expression we encounter into a ValueInner::Lazy. We then
store this in the interpreter, until we need to go one level deeper.
This will look something like this:
impl Interpreter {
fn eval_lazy(&mut self, lazy: LazyValue) -> Result<Value, EvalError> {
let realized = {
// Our current value, starts with the LazyValue we are supplied
let mut v = lazy.clone();
loop {
// Get the write guard on the value
let guard = v.0.borrow_mut();
match &*guard {
// If it is already realized then we have nothing more to do
ValueInner::Realized(value) => break value.clone(),
// If it is not, then we need to evaluate the expression’s source code,
// and start again
ValueInner::Lazy(expression) => {
let realized = self.eval_expr(expression)?;
drop(guard);
v = realized;
}
}
}
};
// Store the realized value back into the original value. This is a bit inefficient if
// it was already realized
*lazy.0.borrow_mut() = ValueInner::Realized(realized.clone());
Ok(realized)
}
}
The reason we may need to evaluate in a loop is constructs like the following snippet:
a = target {}
b = a
c = b
If we want to evaluate c this will first return ValueInner::Lazy("a"), then ValueInner::Lazy("target {}").
Evaluating expressions themselves is quite straightforward:
impl Interpreter {
fn eval_expr(&mut self, value: &SpannedValue<Expression>) -> Result<LazyValue, EvalError> {
match &value.v {
Expression::String(s) => Ok(Value::String(s.clone()).into()),
Expression::Array(expressions) => {
Ok(Value::Array(expressions.iter().cloned().map(Into::into).collect()).into())
}
Expression::Identifier(path) => Ok(self.variables.get(path)?),
Expression::Target(TargetExpr { kind, args, .. }) => {
Ok(Value::Target(self.evaluate_target(value.span(), *kind, args)?).into())
}
}
}
}
The most tricky part is evaluating targets: targets are stateful, as evaluating them will result in compiler invocations. As such care must be taken to only evaluate them once, and to return the same value all other times. This is not a problem for the other expressions, as they are pure.
Evaluating targets will result in their arguments being realized, and at the end we have a
set of Rc<Target> that we will need to build.
As we have been careful to only instantiate each target once we can use the Rc itself in
our Hash and Eq implementations, using the following small wrapper:
#[derive(Debug)]
#[repr(transparent)]
struct RcCmp<T>(Rc<T>);
impl<T> Borrow<RcCmp<T>> for Rc<T> {
fn borrow(&self) -> &RcCmp<T> {
// SAFETY: RcCmp is a transparent structure so a pointer to a Rc is the same as a pointer
// to a RcCmp
unsafe { &*(self as *const _ as *const RcCmp<T>) }
}
}
impl<T> Clone for RcCmp<T> {
fn clone(&self) -> Self {
Self(self.0.clone())
}
}
impl<T> PartialEq for RcCmp<T> {
fn eq(&self, other: &Self) -> bool {
Rc::ptr_eq(&self.0, &other.0)
}
}
impl<T> Eq for RcCmp<T> {}
impl<T> std::hash::Hash for RcCmp<T> {
fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
std::ptr::hash(&*self.0, state)
}
}
impl<T> Deref for RcCmp<T> {
type Target = T;
fn deref(&self) -> &Self::Target {
self.0.deref()
}
}
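To see what this buys us, here is a toy usage (with integers standing in for Target): two handles are equal only if they point to the very same allocation, which matches the guarantee that every target is instantiated exactly once.

use std::collections::HashSet;
use std::rc::Rc;

let a = Rc::new(42);
let b = Rc::new(42); // same value, different allocation
let mut set = HashSet::new();
set.insert(RcCmp(a.clone()));
assert!(set.contains(&RcCmp(a))); // found: same allocation
assert!(!set.contains(&RcCmp(b))); // not found: pointer equality, not value equality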
Once we have obtained the set of targets we are going to need to evaluate them. This will require two sub-steps:
- finding the correct ordering in which to build the targets
- driving rustc correctly for each target

In order to find the correct ordering I have written arachne, a small graph library.
The library is, as usual, overengineered: it supports a generic Graph trait that
allows an implementation based on adjacency lists of indices, or adjacency lists based
on HashMaps. I’m not going to go into the code, as there is nothing very
new, though I enjoyed writing GATs for the first time.
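For the curious, a trait of that shape could look roughly like this (a sketch of the idea, not arachne’s actual API); the generic associated type is what lets each implementation return its own neighbor iterator:

trait Graph {
    type Node;
    // A GAT: the iterator type can borrow from the graph for the lifetime 'a
    type Neighbors<'a>: Iterator<Item = &'a Self::Node>
    where
        Self: 'a,
        Self::Node: 'a;

    fn add_edge(&mut self, from: Self::Node, to: Self::Node);
    fn neighbors(&self, node: &Self::Node) -> Self::Neighbors<'_>;
}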
This library allows us to convert our flat set of targets into a graph, using the dependencies
for each target.
Here lies the main feature of vvk: the values in the graph are not simply a RcCmp<Target>, they
are the following:
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
enum TargetArch {
Native,
BareRV64,
}
#[derive(Clone, PartialEq, Eq, Hash)]
struct RealizedTarget {
arch: TargetArch,
target: RcCmp<Target>,
}
vvk supports (as of today) two kinds of binaries: native & bare metal RISC-V 64. The library
dependencies of a binary are inserted in the graph with the architecture of that binary, and
this is propagated recursively through the dependency graph.
Care must be taken when encountering proc-macros, as they are always for the native architecture
even if the binary is a RISC-V 64 binary.
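A sketch of that propagation rule, assuming Target exposes its dependencies and a kind (both field names are made up here, not vvk’s actual code): every dependency is realized with the architecture of its dependent, except proc-macros, which always fall back to the native architecture.

use std::collections::HashSet;
use std::rc::Rc;

fn realize(
    target: &Rc<Target>,
    arch: TargetArch,
    seen: &mut HashSet<RealizedTarget>,
    edges: &mut Vec<(RealizedTarget, RealizedTarget)>,
) {
    let node = RealizedTarget { arch, target: RcCmp(target.clone()) };
    if !seen.insert(node.clone()) {
        return; // this (target, architecture) pair is already in the graph
    }
    for dependency in &target.dependencies {
        let dep_arch = match dependency.kind {
            TargetKind::ProcMacro => TargetArch::Native, // proc-macros run on the host
            _ => arch,
        };
        let dep_node = RealizedTarget { arch: dep_arch, target: RcCmp(dependency.clone()) };
        edges.push((node.clone(), dep_node));
        realize(dependency, dep_arch, seen, edges);
    }
}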
Once we have the dependency graph we first need to reverse all the edges to get the dependent graph, and we can then apply a topological sort on it in order to obtain the ordering that will allow us to build the project.
In order to store the artifacts we will need to define a build directory with the following structure:
.
├── debug
│ ├── bare-rv64
│ └── native
└── release
├── bare-rv64
└── native
In each of the leaf directories the project hierarchy will be re-created in order to store
the artifacts of the corresponding build.vvk.
Once we have identified the directory in which we will output the binary we can call rustc as follows:
rustc
--crate-name '<target name>'
--edition=2024
--emit=dep-info,link
--out-dir '<path to build directory>'
--crate-type 'bin|lib|proc-macro'
--remap-path-prefix '<project root>='
'<path to main source file>'
-C link-arg=-T'<linker_script>' # If any
--target riscv64gc-unknown-none-elf # If this is a RISC-V 64 target
--extern proc_macro # If we are building proc macros
# For each *direct* dependency
--extern foo=build_root/debug/native/...
# For all *direct* and *transitive dependencies
-L dependency=build_root/debug/native/...
In addition, if we are building a test we need to pass --test to rustc to include the test harness.
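As a sketch, driving this from Rust is mostly a matter of assembling a std::process::Command. The Target fields (name, root, kind) and the crate_type helper below are assumptions about structures the article does not show, and the dependency flags are left as a comment:

use std::path::Path;
use std::process::Command;

fn crate_type(kind: TargetKind) -> &'static str {
    match kind {
        TargetKind::Library => "lib",
        TargetKind::ProcMacro => "proc-macro",
        _ => "bin",
    }
}

fn rustc_command(target: &RealizedTarget, out_dir: &Path, project_root: &Path) -> Command {
    let mut cmd = Command::new("rustc");
    cmd.arg("--crate-name").arg(&target.target.name)
        .arg("--edition=2024")
        .arg("--emit=dep-info,link")
        .arg("--out-dir").arg(out_dir)
        .arg("--crate-type").arg(crate_type(target.target.kind))
        .arg("--remap-path-prefix").arg(format!("{}=", project_root.display()))
        .arg(&target.target.root);
    if target.arch == TargetArch::BareRV64 {
        cmd.arg("--target").arg("riscv64gc-unknown-none-elf");
    }
    if target.target.kind == TargetKind::Test {
        cmd.arg("--test");
    }
    // --extern and -L dependency=... flags for the dependencies are appended the same way
    cmd
}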
The dep-info emit generates a Makefile-style file for each target we are building that records
all the .rs files used to build this target. Using this information, we can
prune the target graph of all up-to-date targets, i.e. targets for which no .rs file is newer
than the compile output.
Note that to be completely correct we need to be careful to take into account the status of:
- the dep-info file itself, on top of the source files it lists

If all of those are older than the currently available output then we do not need to run anything.
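To make the pruning concrete, here is a minimal sketch of such a freshness check (it parses the .d file naively and ignores path escaping, which a real implementation would have to handle): a target is up to date if its output is newer than every file recorded in the dep-info.

use std::fs;
use std::path::Path;

fn is_up_to_date(output: &Path, dep_info: &Path) -> std::io::Result<bool> {
    let output_mtime = match fs::metadata(output) {
        Ok(meta) => meta.modified()?,
        Err(_) => return Ok(false), // never built
    };
    let deps = fs::read_to_string(dep_info)?;
    for line in deps.lines() {
        // Each line looks like "path/to/output: src/main.rs src/lib.rs ..."
        let Some((_, inputs)) = line.split_once(':') else { continue };
        for input in inputs.split_whitespace() {
            if fs::metadata(input)?.modified()? > output_mtime {
                return Ok(false);
            }
        }
    }
    Ok(true)
}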
This has taken me quite some time to write, but most of the time was spent being overwhelmed by
thinking about what needed to be written. As I wanted to finally write something on this blog,
vvk is still missing some features that will be nice to have when developing pantheon, like:
- running tests (using hades to run the tests)
- launching gdb to help debug issues (and this can be for both native & bare-metal binaries!)

In addition, I’ll add new targets to vvk as the need arises, like documentation targets or other
non-executable things.
The next step for pantheon should be back on track with a lower-level project, after all those high-level projects!
Thank you for having read this quite long article, on a subject I’m clearly not an expert on!