📜 Add better documentation across the compiler. #3

Merged
acw merged 19 commits from acw/better-docs into develop 2023-05-13 12:34:48 -07:00
15 changed files with 95 additions and 95 deletions
Showing only changes of commit c870eeeca3

View File

@@ -1,28 +1,28 @@
//! # The compiler backend: generation of machine code, both static and JIT.
//!
//!
//! This module is responsible for taking our intermediate representation from
//! [`crate::ir`] and turning it into Cranelift and then into object code that
//! can either be saved to disk or run in memory. Because the runtime functions
//! for NGR are very closely tied to the compiler implementation, we also include
//! information about these functions as part of the module.
//!
//!
//! ## Using the `Backend`
//!
//!
//! The backend of this compiler can be used in two modes: a static compilation
//! mode, where the goal is to write the compiled object to disk and then link
//! it later, and a JIT mode, where the goal is to write the compiled object to
//! memory and then run it. Both modes use the same `Backend` object, because
//! they share a lot of behaviors. However, you'll want to use different variants
//! based on your goals:
//!
//!
//! * Use `Backend<ObjectModule>`, constructed via [`Backend::object_file`],
//! if you want to compile to an object file on disk, which you're then going
//! to link later.
//! * Use `Backend<JITModule>`, constructed via [`Backend::jit`], if you want
//! to do just-in-time compilation and are just going to run things immediately.
//!
//!
//! ## Working with Runtime Functions
//!
//!
//! For now, runtime functions are pretty easy to describe, because there's
//! only one. In the future, though, the [`RuntimeFunctions`] object is there to
//! help provide a clean interface to them all.
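The "one `Backend` type, two modes" design described under "Using the `Backend`" can be sketched without Cranelift. This is only an illustration under stated assumptions: `FakeModule`, `FakeObjectModule`, `FakeJitModule`, `SketchBackend`, and `mode` are hypothetical stand-ins, not the crate's real types or Cranelift's `Module`, `ObjectModule`, or `JITModule`.

```rust
/// Hypothetical trait playing the role that Cranelift's `Module` plays
/// in the real backend.
trait FakeModule {
    fn kind(&self) -> &'static str;
}

/// Hypothetical stand-in for an object-file module.
struct FakeObjectModule;
/// Hypothetical stand-in for a JIT module.
struct FakeJitModule;

impl FakeModule for FakeObjectModule {
    fn kind(&self) -> &'static str {
        "object file"
    }
}

impl FakeModule for FakeJitModule {
    fn kind(&self) -> &'static str {
        "jit"
    }
}

/// One backend type, parameterized by the module it drives.
struct SketchBackend<M: FakeModule> {
    module: M,
    defined_strings: Vec<String>,
}

/// Constructor that only exists for the JIT variant.
impl SketchBackend<FakeJitModule> {
    fn jit() -> Self {
        SketchBackend { module: FakeJitModule, defined_strings: Vec::new() }
    }
}

/// Constructor that only exists for the object-file variant.
impl SketchBackend<FakeObjectModule> {
    fn object_file() -> Self {
        SketchBackend { module: FakeObjectModule, defined_strings: Vec::new() }
    }
}

/// Behavior shared by both variants lives in the generic impl.
impl<M: FakeModule> SketchBackend<M> {
    /// Record a string and return its index within this backend.
    fn define_string(&mut self, s: &str) -> usize {
        self.defined_strings.push(s.to_string());
        self.defined_strings.len() - 1
    }

    fn mode(&self) -> &'static str {
        self.module.kind()
    }
}
```

Putting the constructors in type-specific `impl` blocks means a caller cannot accidentally ask a JIT backend for object-file-only behavior; the compiler rejects it.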
@@ -45,7 +45,7 @@ use target_lexicon::Triple;
const EMPTY_DATUM: [u8; 8] = [0; 8];
/// An object representing an active backend.
///
///
/// Internally, this object holds a bunch of state useful for compiling one
/// or more functions into an object file or memory. It can be passed around,
/// but cannot currently be duplicated because some of that state is not
@@ -64,7 +64,7 @@ pub struct Backend<M: Module> {
impl Backend<JITModule> {
/// Create a new JIT backend for compiling NGR into memory.
///
///
/// The provided output buffer is not for the compiled code, but for the output
/// of any `print` expressions that are evaluated. If set to `None`, the output
/// will be written to `stdout` as per normal, but if a String buffer is provided,
@@ -95,7 +95,7 @@ impl Backend<JITModule> {
/// Given a compiled function ID, get a pointer to where that function was written
/// in memory.
///
///
/// The data at this pointer should not be mutated unless you really, really,
/// really know what you're doing. It can be run by casting it into a Rust
/// `fn() -> ()`, and then calling it from normal Rust.
@@ -106,7 +106,7 @@ impl Backend<JITModule> {
impl Backend<ObjectModule> {
/// Generate a backend for compiling into an object file for the given target.
///
///
/// This backend will generate a single output file per `Backend` object, although
/// that file may have multiple functions defined within it. Data between those
/// functions (in particular, strings) will be defined once and shared between
@@ -139,11 +139,11 @@ impl Backend<ObjectModule> {
impl<M: Module> Backend<M> {
/// Define a string within the current backend.
///
///
/// Note that this is a Cranelift [`DataId`], which then must be redeclared inside the
/// context of any functions or data items that want to use it. That being said, the
/// string value will be defined once in the file and then shared by all referencers.
///
///
/// This function will automatically add a null character (`'\0'`) to the end of the
/// string, to ensure that strings are null-terminated for interactions with other
/// languages.
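As a rough sketch of the null-termination step described in this doc comment, the bytes handed to the data section could be prepared as below. The helper name `null_terminated_bytes` is hypothetical and is not taken from the crate.

```rust
/// Return the UTF-8 bytes of `s` followed by a single NUL byte, which is
/// the layout C-style consumers expect.
fn null_terminated_bytes(s: &str) -> Vec<u8> {
    let mut bytes = Vec::with_capacity(s.len() + 1);
    bytes.extend_from_slice(s.as_bytes());
    bytes.push(0);
    bytes
}
```

A string that already contains an interior NUL would be cut short by a C reader, so a caller may want to validate the result with `std::ffi::CStr::from_bytes_with_nul`.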
@@ -163,7 +163,7 @@ impl<M: Module> Backend<M> {
}
/// Define a global variable within the current backend.
///
///
/// These variables can be shared between functions, and will be exported from the
/// module itself as public data in the case of static compilation. Their initial
/// value will be null.
@@ -179,7 +179,7 @@ impl<M: Module> Backend<M> {
}
/// Get a pointer to the output buffer for `print`ing, or `null`.
///
///
/// As suggested, returns `null` in the case where the user has not provided an
/// output buffer; it is your responsibility to check for this case and do
/// something sensible.
@@ -192,7 +192,7 @@ impl<M: Module> Backend<M> {
}
/// Get any captured output `print`ed by the program during execution.
///
///
/// If an output buffer was not provided, or if the program has not done any
/// printing, then this function will return an empty string.
pub fn output(self) -> String {

View File

@@ -10,7 +10,7 @@ use pretty::termcolor::{ColorChoice, StandardStream};
use target_lexicon::Triple;
/// A high-level compiler for NGR programs.
///
///
/// This object can be built once, and then re-used many times to build multiple
/// files. For most users, the [`Default`] implementation should be sufficient;
/// it will use `stderr` for warnings and errors, with default colors based on
@@ -62,17 +62,17 @@ impl Compiler {
/// This is the actual meat of the compilation chain; we hide it from the user
/// because the type is kind of unpleasant.
///
///
/// The weird error type comes from the fact that we can run into three types
/// of result:
///
///
/// * Fundamental errors, like an incorrectly formatted file or some
/// oddity with IO. These return `Err`.
/// * Validation errors, where we reject the program due to something
/// semantically wrong with it. These return `Ok(None)`.
/// * Success! In this case, we return `Ok(Some(...))`, where the bytes
/// returned are the contents of the compiled object file.
///
///
fn compile_internal(&mut self, input_file: &str) -> Result<Option<Vec<u8>>, CompilerError> {
// Try to parse the file into our syntax AST. If we fail, emit the error
// and then immediately return `None`.
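The three outcomes listed in the `compile_internal` doc comment map onto `Result<Option<Vec<u8>>, _>` in a way that a caller can unpack with a single `match`. The sketch below assumes a hypothetical error type `SketchError` and a hypothetical `Outcome` summary enum; neither is part of the crate.

```rust
/// Hypothetical stand-in for the crate's `CompilerError`.
#[derive(Debug, PartialEq)]
enum SketchError {
    Io(String),
}

/// Hypothetical summary of what happened, for illustration only.
#[derive(Debug, PartialEq)]
enum Outcome {
    /// Compilation succeeded; carries the object file size in bytes.
    Compiled(usize),
    /// The program was rejected by validation (diagnostics already emitted).
    Rejected,
    /// Something fundamental went wrong, such as an IO failure.
    Failed(String),
}

fn classify(result: Result<Option<Vec<u8>>, SketchError>) -> Outcome {
    match result {
        Ok(Some(bytes)) => Outcome::Compiled(bytes.len()),
        Ok(None) => Outcome::Rejected,
        Err(SketchError::Io(message)) => Outcome::Failed(message),
    }
}
```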
@@ -111,7 +111,7 @@ impl Compiler {
}
/// Emit a diagnostic.
///
///
/// This is just a really handy shorthand we use elsewhere in the object, because
/// there's a lot of boilerplate we'd like to skip.
fn emit(&mut self, diagnostic: Diagnostic<usize>) {

View File

@@ -1,5 +1,5 @@
//! Helpful functions for evaluating NGR programs.
//!
//!
//! Look, this is a compiler, and so you might be asking why it has a bunch of
//! stuff in it to help with writing interpreters. Well, the answer is simple:
//! testing. It's really nice to know that if you start with a program that
@@ -9,15 +9,15 @@
//! programs don't do 100% the same things in the same order, but you shouldn't
//! be able to observe the difference ... at least, not without a stopwatch,
//! memory profilers, etc.
//!
//!
//! The actual evaluators for our various syntaxes are hidden in `eval` functions
//! of the various ASTs. It's nice to have them "next to" the syntax that way, so
//! that we just edit stuff in one part of the source tree at a time. This module,
//! then, just contains some things that are generally helpful across all the
//! interpreters we've written.
//!
//!
//! In particular, this module helps with:
//!
//!
//! * Defining a common error type -- [`EvalError`] -- that we can reasonably
//! compare. It's nice to compare errors, here, because we want to know that
//! if a program used to fail, it will still fail after we change it, and
@@ -45,7 +45,7 @@ pub use value::Value;
use crate::backend::BackendError;
/// All of the errors that can happen trying to evaluate an NGR program.
///
///
/// This is yet another standard [`thiserror::Error`] type, but with the
/// caveat that it implements [`PartialEq`] even though some of its
/// constituent members don't. It does so through the very sketchy mechanism
@@ -102,17 +102,17 @@ impl PartialEq for EvalError {
EvalError::Linker(a) => match other {
EvalError::Linker(b) => a == b,
_ => false,
},
EvalError::ExitCode(a) => match other {
EvalError::ExitCode(b) => a == b,
_ => false,
},
EvalError::RuntimeOutput(a) => match other {
EvalError::RuntimeOutput(b) => a == b,
_ => false,
},
}
}
}
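For comparison, the nested `match` arms in this hunk can also be written by matching on a tuple of both sides. This is a sketch only: `SketchEvalError` and its payload types are assumptions, and only the three variant names come from the diff.

```rust
/// Hypothetical cut-down error type; the payload types are assumed.
#[derive(Debug)]
enum SketchEvalError {
    Linker(String),
    ExitCode(i32),
    RuntimeOutput(String),
}

impl PartialEq for SketchEvalError {
    fn eq(&self, other: &Self) -> bool {
        // Matching on the pair handles "same variant" and "different
        // variant" in one place.
        match (self, other) {
            (Self::Linker(a), Self::Linker(b)) => a == b,
            (Self::ExitCode(a), Self::ExitCode(b)) => a == b,
            (Self::RuntimeOutput(a), Self::RuntimeOutput(b)) => a == b,
            _ => false,
        }
    }
}
```

The trade-off is that the catch-all arm silently returns `false` for any variant added later, whereas the per-variant outer `match` in the diff forces a compile error until the new variant is handled.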

View File

@@ -4,7 +4,7 @@ use std::sync::Arc;
/// An evaluation environment, which maps variable names to their
/// current values.
///
///
/// One key difference between `EvalEnvironment` and `HashMap` is that
/// `EvalEnvironment` uses an `extend` mechanism to add keys, rather
/// than an `insert`. This difference allows you to add mappings for
@@ -20,7 +20,7 @@ enum EvalEnvInternal {
}
/// Errors that can happen when looking up a variable.
///
///
/// This enumeration may be extended in the future, depending on if we
/// get more subtle with our keys. But for now, this is just a handy
/// way to make lookup failures be `thiserror::Error`s.
@@ -45,7 +45,7 @@ impl EvalEnvironment {
}
/// Extend the environment with a new mapping.
///
///
/// Note the types: the result of this method is a new `EvalEnvironment`,
/// with its own lifetime, and the original environment is left unmodified.
pub fn extend(&self, name: ArcIntern<String>, value: Value) -> Self {
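The "extend returns a new environment" behavior can be sketched with an `Arc`-linked list. The type `SketchEnv`, its `i64` values, and the `lookup` method are hypothetical simplifications; the real `EvalEnvironment` uses `ArcIntern<String>` names and `Value`s, and its internals are not shown in this diff.

```rust
use std::sync::Arc;

/// Hypothetical persistent environment: each binding points at the
/// environment it extended.
#[derive(Clone)]
enum SketchEnv {
    Empty,
    Bound { name: String, value: i64, rest: Arc<SketchEnv> },
}

impl SketchEnv {
    fn empty() -> Self {
        SketchEnv::Empty
    }

    /// Build a new environment on top of `self`; `self` is not modified.
    /// Only the head node is cloned; everything behind it is shared via `Arc`.
    fn extend(&self, name: &str, value: i64) -> Self {
        SketchEnv::Bound {
            name: name.to_string(),
            value,
            rest: Arc::new(self.clone()),
        }
    }

    /// Walk from the newest binding to the oldest, so later bindings
    /// shadow earlier ones.
    fn lookup(&self, wanted: &str) -> Option<i64> {
        let mut current = self;
        loop {
            match current {
                SketchEnv::Empty => return None,
                SketchEnv::Bound { name, value, rest } => {
                    if name.as_str() == wanted {
                        return Some(*value);
                    }
                    current = &**rest;
                }
            }
        }
    }
}
```

Because the old environment stays valid, an interpreter can evaluate a nested scope with the extended environment and then simply drop it to return to the outer scope.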

View File

@@ -75,7 +75,7 @@ impl Value {
}
/// Calculate the result of running the given primitive on the given arguments.
///
///
/// This can cause errors in a whole mess of ways, so be careful about your
/// inputs. For example, addition only works when the two values have the exact
/// same type, so expect an error if you try to add values of different types. In addition, this

View File

@@ -1,7 +1,7 @@
use std::fmt::Display;
/// Values in the interpreter.
///
///
/// Yes, this is yet another definition of a structure called `Value`; the
/// definitions are almost entirely identical. However, it's nice to have them separated
/// by type so that we don't mix them up.

View File

@@ -1,12 +1,12 @@
//! The middle of the compiler: analysis, simplification, optimization.
//!
//!
//! For the moment, this module doesn't do much besides define an intermediate
//! representation for NGR programs that is a little easier to work with than
//! the structures we've built from the actual user syntax. For example, in the
//! IR syntax, function calls are simplified so that all their arguments are
//! either variables or constants, which can make reasoning about programs
//! (and implicit temporary variables) quite a bit easier.
//!
//!
//! For the foreseeable future, this module will likely remain mostly empty
//! besides definitions, as we'll likely want to focus on just processing /
//! validating syntax, and then figuring out how to turn it into Cranelift

View File

@@ -7,17 +7,17 @@ use proptest::{
};
/// We're going to represent variables as interned strings.
///
///
/// These should be fast enough for comparison that it's OK, since it's going to end up
/// being pretty much the pointer to the string.
type Variable = ArcIntern<String>;
/// The representation of a program within our IR. For now, this is exactly one file.
///
///
/// In addition, for the moment there's not really much of interest to hold here besides
/// the list of statements read from the file. Order is important. In the future, you
/// could imagine caching analysis information in this structure.
///
///
/// `Program` implements both [`Pretty`] and [`Arbitrary`]. The former should be used
/// to print the structure whenever possible, especially if you value your own or your
/// users' time. The latter is useful for testing that conversions of `Program` retain
@@ -63,15 +63,15 @@ impl Arbitrary for Program {
}
/// The representation of a statement in the language.
///
///
/// For now, this is either a binding site (`x = 4`) or a print statement
/// (`print x`). Someday, though, more!
///
///
/// As with `Program`, this type implements [`Pretty`], which should
/// be used to display the structure whenever possible. It does not
/// implement [`Arbitrary`], though, mostly because it's slightly
/// complicated to do so.
///
///
#[derive(Debug)]
pub enum Statement {
Binding(Location, Variable, Expression),
@@ -100,11 +100,11 @@ where
}
/// The representation of an expression.
///
///
/// Note that expressions, like everything else in this syntax tree,
/// support [`Pretty`], and it's strongly encouraged that you use
/// that trait/module when printing these structures.
///
///
/// Also, Expressions at this point in the compiler are explicitly
/// defined so that they are *not* recursive. By this point, if an
/// expression requires some other data (like, for example, invoking
@@ -148,7 +148,7 @@ where
}
/// A type representing the primitives allowed in the language.
///
///
/// Having this as an enumeration avoids a lot of "this should not happen"
/// cases, but might prove to be cumbersome in the future. If that happens,
/// this may either become a more hierarchical enumeration, or we'll just
@@ -191,7 +191,7 @@ where
}
/// An expression that is always either a value or a reference.
///
///
/// This is the type used to guarantee that we don't nest expressions
/// at this level. Instead, expressions that take arguments take one
/// of these, which can only be a constant or a reference.
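A small sketch of how a "value or reference" argument type keeps expressions flat. The types `SketchValueOrRef` and `SketchExpression`, the single `Add` operation, and the temporary-naming scheme in `flatten_nested_add` are all hypothetical; only the `From<ValueOrRef> for Expression` conversion is taken from the diff.

```rust
/// Hypothetical argument type: a constant or a variable reference.
#[derive(Debug, Clone, PartialEq)]
enum SketchValueOrRef {
    Value(i64),
    Ref(String),
}

/// Hypothetical flat expression: arguments can never be expressions.
#[derive(Debug, Clone, PartialEq)]
enum SketchExpression {
    Atom(SketchValueOrRef),
    Add(SketchValueOrRef, SketchValueOrRef),
}

impl From<SketchValueOrRef> for SketchExpression {
    fn from(atom: SketchValueOrRef) -> Self {
        SketchExpression::Atom(atom)
    }
}

/// Express `target = (a + b) + c` as two flat bindings by introducing a
/// temporary for the inner sum.
fn flatten_nested_add(target: &str, a: i64, b: i64, c: i64) -> Vec<(String, SketchExpression)> {
    let temp = format!("{}_tmp0", target);
    vec![
        (
            temp.clone(),
            SketchExpression::Add(SketchValueOrRef::Value(a), SketchValueOrRef::Value(b)),
        ),
        (
            target.to_string(),
            SketchExpression::Add(SketchValueOrRef::Ref(temp), SketchValueOrRef::Value(c)),
        ),
    ]
}
```

Since `Add` cannot hold another `SketchExpression`, nesting is ruled out by the type checker rather than by a runtime check.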
@@ -227,7 +227,7 @@ impl From<ValueOrRef> for Expression {
#[derive(Debug)]
pub enum Value {
/// A numerical constant.
///
///
/// The optional argument is the base that was used by the user to input
/// the number. By retaining it, we can ensure that if we need to print the
/// number back out, we can do so in the form that the user entered it.
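To illustrate why keeping the base is useful, here is a sketch of a number that prints itself in the base it was written in. `SketchNumber` is hypothetical, and the `0b`/`0o`/`0x` prefixes are an assumption about the surface syntax rather than something this diff confirms.

```rust
use std::fmt;

/// Hypothetical numeric constant: `base` is `None` when the user wrote
/// a plain decimal number.
struct SketchNumber {
    base: Option<u8>,
    value: u64,
}

impl fmt::Display for SketchNumber {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self.base {
            Some(2) => write!(f, "0b{:b}", self.value),
            Some(8) => write!(f, "0o{:o}", self.value),
            Some(16) => write!(f, "0x{:x}", self.value),
            // Unknown or absent base: fall back to decimal.
            _ => write!(f, "{}", self.value),
        }
    }
}
```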

View File

@@ -6,7 +6,7 @@ use super::{Primitive, ValueOrRef};
impl Program {
/// Evaluate the program, returning either an error or a string containing everything
/// the program printed out.
///
///
/// The printouts will be newline-separated, with one printout per line.
pub fn eval(&self) -> Result<String, EvalError> {
let mut env = EvalEnvironment::empty();

View File

@@ -4,7 +4,7 @@ use std::collections::HashSet;
impl Program {
/// Get the complete list of strings used within the program.
///
///
/// For the purposes of this function, strings are the variables used in
/// `print` statements.
pub fn strings(&self) -> HashSet<ArcIntern<String>> {

View File

@@ -1,41 +1,41 @@
//! # NGR (No Good Reason) Compiler
//!
//!
//! This is the top-level module for the NGR compiler, a compiler written
//! in Rust for no good reason. I may eventually try to turn this into a
//! basic guide for writing compilers, but for now it's a fairly silly
//! (although complete) language and implementation, featuring:
//!
//!
//! * Variable binding with basic arithmetic operators.
//! * The ability to print variable values.
//!
//!
//! I'll be extending this list into the future, with the eventual goal of
//! being able to implement basic programming tasks with it. For example,
//! I have a goal of eventually writing reasonably-clear
//! [Advent of Code](https://adventofcode.com/) implementations with it.
//!
//!
//! Users of this as a library will want to choose their adventure based
//! on how much they want to customize their experience; I've defaulted
//! to providing the ability to see internals, rather than masking them,
//! so folks can play with things as they see fit.
//!
//!
//! ## Easy Mode - Just Running a REPL or Compiler
//!
//!
//! For easiest use, you will want to use either the [`Compiler`] object
//! or the [`REPL`] object.
//!
//!
//! As you might expect, the [`Compiler`] object builds a compiler, which
//! can be re-used to compile as many files as you'd like. Right now,
//! that's all it does. (TODO: Add a linker function to it.)
//!
//!
//! The [`REPL`] object implements the core of what you'll need to
//! implement a just-in-time compiled read-eval-print loop. It will
//! maintain variable state and make sure that variables are linked
//! appropriately as the loop progresses.
//!
//!
//! ## Hard Mode - Looking at the individual passes
//!
//!
//! This compiler is broken into three core parts:
//!
//!
//! 1. The front-end / syntax engine. This portion of the compiler is
//! responsible for turning basic strings (or files) into a machine-
//! friendly abstract syntax tree. See the [`syntax`] module for
@@ -49,22 +49,22 @@
//! helps with either compiling them via JIT or statically compiling
//! them into a file. The [`backend`] module also contains information
//! about the runtime functions made available to the user.
//!
//!
//! ## Testing
//!
//!
//! Testing is a key focus of this effort. To that end, both the syntax
//! tree used in the syntax module and the IR used in the middle of the
//! compiler implement `Arbitrary`, and are subject to property-based
//! testing to make sure that various passes work properly.
//!
//!
//! In addition, to support basic equivalence testing, we include support
//! for evaluating all expressions. The [`eval`] module provides some
//! utility support for this work.
//!
pub mod backend;
pub mod eval;
pub mod ir;
pub mod syntax;
/// Implementation module for the high-level compiler.
mod compiler;

View File

@@ -10,11 +10,11 @@ use pretty::termcolor::{ColorChoice, StandardStream};
use std::collections::HashMap;
/// A high-level REPL helper for NGR.
///
///
/// This object holds most of the state required to implement some
/// form of interactive compiler for NGR; all you need to do is provide
/// the actual user IO.
///
///
/// For most console-based use cases, the [`Default`] implementation
/// should be sufficient; it prints any warnings or errors to `stdout`,
/// using a default color scheme that should work based on the terminal
@@ -62,7 +62,7 @@ impl From<REPLError> for Diagnostic<usize> {
impl REPL {
/// Construct a new REPL helper, using the given stream implementation and console configuration.
///
///
/// For most users, the [`Default::default`] implementation will be sufficient;
/// it will use `stdout` and a default console configuration. But if you need to
/// be more specific, this will help you provide more guidance to the REPL as it
@@ -79,7 +79,7 @@ impl REPL {
}
/// Emit a diagnostic to the configured console.
///
///
/// This is just a convenience function; there's a lot of boilerplate in printing
/// diagnostics, and it was nice to pull it out into its own function.
fn emit_diagnostic(
@@ -95,13 +95,13 @@ impl REPL {
}
/// Process a line of input, printing any problems or the results.
///
///
/// The line number argument is just for a modicum of source information, to
/// provide to the user if some parsing or validation step fails. It can be
/// changed to be any value you like that provides some insight into what
/// failed, although it is probably a good idea for it to be different for
/// every invocation of this function. (Not critical, but a good idea.)
///
///
/// Any warnings or errors generated in processing this command will be
/// printed to the configured console. If there are no problems, the
/// command will be compiled and then executed.
@@ -117,7 +117,7 @@ impl REPL {
}
/// The internal implementation, with a handy `Result` type.
///
///
/// All information from the documentation of `REPL::process_input` applies here,
/// as well; this is the internal implementation of that function, which is
/// differentiated by returning a `Result` type that is hidden from the user

View File

@@ -1,11 +1,11 @@
//! NGR Parsing: Reading input, turning it into sense (or errors).
//!
//!
//! This module implements the front end of the compiler, which is responsible for
//! reading in NGR syntax as a string, turning it into a series of reasonable Rust
//! structures for us to manipulate, and doing some validation while it's at it.
//!
//!
//! The core flow for this work is:
//!
//!
//! * Turning the string into a series of language-specific [`Token`]s.
//! * Taking those tokens, and computing a basic syntax tree from them,
//! using our parser ([`ProgramParser`] or [`StatementParser`], generated
@@ -15,12 +15,12 @@
//! * Simplifying the tree we have parsed, using the [`simplify`] module,
//! into something that's more easily turned into our [compiler internal
//! representation](super::ir).
//!
//!
//! In addition to all of this, we make sure that the structures defined in this
//! module are all:
//!
//! * Instances of [`Pretty`](::pretty::Pretty), so that you can print stuff back
//! out that can be read by a human.
//! * Instances of [`Arbitrary`](proptest::prelude::Arbitrary), so they can be
//! used in `proptest`-based property testing. There are built-in tests in
//! the library, for example, to make sure that the pretty-printing round-trips.

View File

@@ -1,7 +1,7 @@
use codespan_reporting::diagnostic::{Diagnostic, Label};
/// A source location, for use in pointing users towards warnings and errors.
///
///
/// Internally, locations are very tied to the `codespan_reporting` library,
/// and the primary use of them is to serve as anchors within that library.
#[derive(Clone, Debug, Eq, PartialEq)]
@@ -13,7 +13,7 @@ pub struct Location {
impl Location {
/// Generate a new `Location` from a file index and an offset from the
/// start of the file.
///
///
/// The file index is based on the file database being used. See the
/// `codespan_reporting::files::SimpleFiles::add` function, which is
/// normally where we get this index.
@@ -22,7 +22,7 @@ impl Location {
}
/// Generate a `Location` for a completely manufactured bit of code.
///
///
/// Ideally, this is used only in testing, as any code we generate as
/// part of the compiler should, theoretically, be tied to some actual
/// location in the source code. That being said, this can be used in
@@ -36,10 +36,10 @@ impl Location {
/// Generate a primary label for a [`Diagnostic`], based on this source
/// location.
///
///
/// Note that this is just the [`Label`]; you'll want to fill in the [`Diagnostic`]
/// with a lot more information.
///
///
/// Primary labels are the things that are the key cause of the message.
/// If, for example, it was an error to bind a variable named "x", and
/// then have another binding of a variable named "x", the second one
@@ -52,10 +52,10 @@ impl Location {
/// Generate a secondary label for a [`Diagnostic`], based on this source
/// location.
///
///
/// Note that this is just the [`Label`]; you'll want to fill in the [`Diagnostic`]
/// with a lot more information.
///
///
/// Secondary labels are the things that are involved in the message, but
/// aren't necessarily a problem in and of themselves. If, for example, it
/// was an error to bind a variable named "x", and then have another binding
@@ -63,20 +63,20 @@ impl Location {
/// label (because that's where the error actually happened), but you'd
/// probably want to make the first location the secondary label to help
/// users find it.
pub fn secondary_label(&self) -> Label<usize> {
Label::secondary(self.file_idx, self.offset..self.offset)
}
/// Given this location and another, generate a primary label that
/// specifies the area between those two locations.
///
///
/// See [`Self::primary_label`] for some discussion of primary versus
/// secondary labels. If the two locations are the same, this method does
/// the exact same thing as [`Self::primary_label`]. If this item was
/// generated by [`Self::manufactured`], it will act as if you'd called
/// `primary_label` on the argument. Otherwise, it will generate the obvious
/// span.
///
///
/// This function will return `None` only in the case that you provide
/// labels from two different files, which it cannot sensibly handle.
pub fn range_label(&self, end: &Location) -> Option<Label<usize>> {
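The core rule in the `range_label` doc comment (same file gives a span, different files give `None`) can be sketched independently of `codespan_reporting`. `SketchLocation` and `sketch_range` are hypothetical. The sketch deliberately leaves out manufactured locations, whose representation is not shown in this diff, and it assumes the first location does not come after the second.

```rust
use std::ops::Range;

/// Hypothetical stand-in for `Location`: a file index plus a byte offset.
#[derive(Clone, Debug, PartialEq)]
struct SketchLocation {
    file_idx: usize,
    offset: usize,
}

/// Return the file index and the byte range between the two locations,
/// or `None` when they live in different files.
fn sketch_range(start: &SketchLocation, end: &SketchLocation) -> Option<(usize, Range<usize>)> {
    if start.file_idx != end.file_idx {
        return None;
    }
    // When both locations are identical this degenerates to an empty
    // range at that offset, matching a single-point label.
    Some((start.file_idx, start.offset..end.offset))
}
```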
@@ -96,7 +96,7 @@ impl Location {
}
/// Return an error diagnostic centered at this location.
///
///
/// Note that this [`Diagnostic`] will have no information associated with
/// it other than that (a) there is an error, and (b) that the error is at
/// this particular location. You'll need to extend it with actually useful
@@ -109,7 +109,7 @@ impl Location {
}
/// Return an error diagnostic centered at this location, with the given message.
///
///
/// This is much more useful than [`Self::error`], because it actually provides
/// the user with some guidance. That being said, you still might want to add
/// even more information to it, using [`Diagnostic::with_labels`],

View File

@@ -6,28 +6,28 @@ use thiserror::Error;
/// A single token of the input stream; used to help the parsing go down
/// more easily.
///
///
/// The key way to generate this structure is via the [`Logos`] trait.
/// See the [`logos`] documentation for more information; we use the
/// [`Token::lexer`] function internally.
///
///
/// The first step in the compilation process is turning the raw string
/// data (in UTF-8, which is its own joy) into a sequence of more sensible
/// tokens. Here, for example, we turn "x=5" into three tokens: a
/// [`Token::Variable`] for "x", a [`Token::Equals`] for the "=", and
/// then a [`Token::Number`] for the "5". Later on, we'll worry about
/// making sense of those three tokens.
///
///
/// For now, our list of tokens is relatively straightforward. We'll
/// need/want to extend these later.
///
///
/// The [`std::fmt::Display`] implementation for [`Token`] should
/// round-trip; if you lex a string generated with the [`std::fmt::Display`]
/// trait, you should get back the exact same token.
#[derive(Logos, Clone, Debug, PartialEq, Eq)]
pub enum Token {
// Our first set of tokens are simple characters that we're
// going to use to structure NGR programs.
#[token("=")]
Equals,
@@ -74,7 +74,7 @@ pub enum Token {
#[regex(r"[ \t\r\n\f]+", logos::skip)]
// this is an extremely simple version of comments, just line
// comments. More complicated /* */ comments can be harder to
// implement, and didn't seem worth it at the time.
#[regex(r"//.*", logos::skip)]
/// This token represents that some core error happened in lexing;
/// possibly that something didn't match anything at all.
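The `Token` doc comment's "x=5" example and its `Display` round-trip property can be sketched with a small hand-written lexer. This swaps in a manual scanner in place of the Logos-generated one so the example needs only the standard library. `SketchToken` and `sketch_lex` are hypothetical, and the rules for what counts as a variable or a number are simplified assumptions.

```rust
use std::fmt;

/// Hypothetical cut-down token type with just the three tokens from the
/// "x=5" example.
#[derive(Clone, Debug, PartialEq, Eq)]
enum SketchToken {
    Variable(String),
    Equals,
    Number(u64),
}

impl fmt::Display for SketchToken {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SketchToken::Variable(name) => write!(f, "{}", name),
            SketchToken::Equals => write!(f, "="),
            SketchToken::Number(value) => write!(f, "{}", value),
        }
    }
}

/// Lex the input, skipping whitespace. Returns `None` on any character
/// the sketch does not understand, or on a number too large for `u64`.
fn sketch_lex(input: &str) -> Option<Vec<SketchToken>> {
    let chars: Vec<char> = input.chars().collect();
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c.is_whitespace() {
            i += 1;
        } else if c == '=' {
            tokens.push(SketchToken::Equals);
            i += 1;
        } else if c.is_ascii_digit() {
            let start = i;
            while i < chars.len() && chars[i].is_ascii_digit() {
                i += 1;
            }
            let text: String = chars[start..i].iter().collect();
            tokens.push(SketchToken::Number(text.parse::<u64>().ok()?));
        } else if c.is_ascii_alphabetic() {
            let start = i;
            while i < chars.len() && chars[i].is_ascii_alphanumeric() {
                i += 1;
            }
            tokens.push(SketchToken::Variable(chars[start..i].iter().collect()));
        } else {
            return None;
        }
    }
    Some(tokens)
}
```

The round-trip property then says that lexing `token.to_string()` yields exactly that one token, which is the kind of check a property-based test can run over many generated tokens.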