hsuScript Compiler
Small teaching compiler that walks from tokens to x86-64 with a bundled runtime.
C, Bash, GCC
Why I Built It
hsuScript started as my way to demystify "real" compiler pipelines by implementing every major stage by hand in portable C. I wanted a project small enough to read in a weekend but complete enough to show how lexing, parsing, analysis, and code generation cooperate to turn source text into a native binary.
fn main() {
    write("No Power Tools!");
}
Pipeline
Lexing & Parsing
- lexer.c tokenizes .hsc programs, tagging identifiers, numbers, strings, keywords, and operators while preserving line information for diagnostics.
- parser.c is a Pratt parser that builds an NK_Program AST, with helpers for loops, conditionals, and function declarations.
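To make the Pratt technique concrete, here is a minimal sketch of precedence-driven expression parsing in C. It is not taken from parser.c: the Node layout, the binding-power values, and every function name are assumptions made for illustration, and it handles only integers, the four arithmetic operators, unary minus, and parentheses.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical node type; hsuScript's real AST (NK_Program and friends) is richer. */
typedef struct Node {
    char op;                 /* '+', '-', '*', '/', 'n' (unary minus), or 0 for a number */
    long value;              /* meaningful only when op == 0 */
    struct Node *lhs, *rhs;
} Node;

typedef struct { const char *p; } Parser;   /* cursor into the source text */

static Node *new_node(char op, long value, Node *lhs, Node *rhs) {
    Node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->op = op; n->value = value; n->lhs = lhs; n->rhs = rhs;
    return n;
}

static void skip_ws(Parser *ps) {
    while (isspace((unsigned char)*ps->p)) ps->p++;
}

/* Binding power of an infix operator; 0 means "not an infix operator". */
static int binding_power(char c) {
    switch (c) {
    case '+': case '-': return 10;
    case '*': case '/': return 20;
    default:            return 0;
    }
}

static Node *parse_expr(Parser *ps, int min_bp);

/* Prefix position: parenthesised expressions, unary minus, integer literals. */
static Node *parse_prefix(Parser *ps) {
    skip_ws(ps);
    if (*ps->p == '(') {
        ps->p++;
        Node *inner = parse_expr(ps, 0);
        skip_ws(ps);
        if (*ps->p == ')') ps->p++;   /* a real parser would report a missing ')' */
        return inner;
    }
    if (*ps->p == '-') {
        ps->p++;
        return new_node('n', 0, parse_expr(ps, 30), NULL);  /* binds tighter than '*' */
    }
    char *end;
    long v = strtol(ps->p, &end, 10);
    ps->p = end;
    return new_node(0, v, NULL, NULL);
}

/* Core Pratt loop: keep absorbing operators that bind tighter than min_bp. */
static Node *parse_expr(Parser *ps, int min_bp) {
    Node *lhs = parse_prefix(ps);
    for (;;) {
        skip_ws(ps);
        char op = *ps->p;
        int bp = binding_power(op);
        if (bp <= min_bp) break;          /* also stops at ')', end of input, unknown chars */
        ps->p++;
        Node *rhs = parse_expr(ps, bp);   /* same power on the right gives left associativity */
        lhs = new_node(op, 0, lhs, rhs);
    }
    return lhs;
}

static long eval(const Node *n) {
    switch (n->op) {
    case 0:   return n->value;
    case 'n': return -eval(n->lhs);
    case '+': return eval(n->lhs) + eval(n->rhs);
    case '-': return eval(n->lhs) - eval(n->rhs);
    case '*': return eval(n->lhs) * eval(n->rhs);
    default:  return eval(n->lhs) / eval(n->rhs);  /* '/'; no divide-by-zero check here */
    }
}

static void free_tree(Node *n) {
    if (n == NULL) return;
    free_tree(n->lhs);
    free_tree(n->rhs);
    free(n);
}

/* Convenience entry point: parse the text, evaluate the tree, release it. */
long parse_and_eval(const char *src) {
    Parser ps = { src };
    Node *tree = parse_expr(&ps, 0);
    long result = eval(tree);
    free_tree(tree);
    return result;
}
```

Calling parse_and_eval("1 + 2 * 3") builds the tree 1 + (2 * 3) because `*` has the higher binding power, while "8 - 3 - 2" groups as (8 - 3) - 2 because the recursive call passes the operator's own power as the new minimum. A compiler would hand the tree to later stages instead of evaluating it; the evaluator is here only so the grouping can be checked.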
Semantic Analysis
- sem.c walks the tree with scoped symbol tables, enforces let declarations, and annotates each node with one of the core types: int, bool, string, or void.
- The checker also validates function signatures and ensures control-flow constructs type-check before codegen runs.
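One common way to implement scoped symbol tables is a flat array of symbols plus a stack of scope start indices. The sketch below assumes that layout; the SymTab type, the TY_ names, and the fixed capacities are hypothetical and are not claimed to match sem.c.

```c
#include <assert.h>
#include <string.h>

/* The four core types from the text, plus an error marker for failed lookups. */
typedef enum { TY_INT, TY_BOOL, TY_STRING, TY_VOID, TY_ERROR } Type;

enum { MAX_SYMS = 64, MAX_SCOPES = 16 };   /* arbitrary limits for the sketch */

typedef struct { const char *name; Type type; } Symbol;

typedef struct {
    Symbol syms[MAX_SYMS];
    int    nsyms;
    int    scope_start[MAX_SCOPES];  /* index of the first symbol in each open scope */
    int    depth;                    /* number of currently open scopes */
} SymTab;

void symtab_init(SymTab *t) { t->nsyms = 0; t->depth = 0; }

/* Entering a block: remember where this scope's symbols begin. */
void scope_push(SymTab *t) {
    assert(t->depth < MAX_SCOPES);
    t->scope_start[t->depth++] = t->nsyms;
}

/* Leaving a block: drop every symbol declared since the matching push. */
void scope_pop(SymTab *t) {
    assert(t->depth > 0);
    t->nsyms = t->scope_start[--t->depth];
}

/* Record a `let` binding in the innermost scope. The name pointer is stored
   as-is, so it must outlive the table entry. */
void declare(SymTab *t, const char *name, Type type) {
    assert(t->depth > 0 && t->nsyms < MAX_SYMS);
    t->syms[t->nsyms++] = (Symbol){ name, type };
}

/* Search newest-to-oldest so inner bindings shadow outer ones.
   TY_ERROR means the name was used without a visible declaration. */
Type lookup(const SymTab *t, const char *name) {
    for (int i = t->nsyms - 1; i >= 0; i--)
        if (strcmp(t->syms[i].name, name) == 0)
            return t->syms[i].type;
    return TY_ERROR;
}
```

A checker built on this would call scope_push on entering a block, declare for each let, and lookup for each identifier use, reporting a diagnostic whenever lookup returns TY_ERROR. Popping a scope restores any outer binding that an inner one shadowed.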
Code Generation
- codegen.c emits AT&T-flavoured x86-64 assembly directly from the typed AST. It manages a manual stack frame, keeps locals in a scope-indexed symbol table, and auto-aligns %rsp before every call (emit_call handles the ABI bookkeeping).
- String literals are interned once and spilled into a .rodata section; runtime helpers like hsu_concat handle heap strings at execution time.
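The alignment bookkeeping can be sketched as follows. The System V x86-64 ABI requires %rsp to be a multiple of 16 at the point of a call instruction, so a generator that pushes 8-byte temporaries has to pad whenever an odd number of them is live. The Emitter struct and the function names below are hypothetical; the real emit_call may track its state differently.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical codegen state: an output buffer plus the number of bytes
   pushed since %rsp was last known to be 16-byte aligned. */
typedef struct {
    char   out[512];
    size_t len;
    int    pushed_bytes;
} Emitter;

void emitter_init(Emitter *e) { memset(e, 0, sizeof *e); }

/* Append one indented line of assembly to the buffer. */
static void emit(Emitter *e, const char *line) {
    size_t room = sizeof e->out - e->len;
    int n = snprintf(e->out + e->len, room, "    %s\n", line);
    assert(n > 0 && (size_t)n < room);
    e->len += (size_t)n;
}

/* Spill a temporary; each push moves %rsp down by 8 bytes. */
void emit_push_rax(Emitter *e) {
    emit(e, "pushq %rax");
    e->pushed_bytes += 8;
}

/* Emit a call, padding the stack first if it is not 16-byte aligned. */
void emit_call_aligned(Emitter *e, const char *symbol) {
    char line[128];
    int pad = (16 - e->pushed_bytes % 16) % 16;
    if (pad) {
        snprintf(line, sizeof line, "subq $%d, %%rsp", pad);
        emit(e, line);
    }
    snprintf(line, sizeof line, "call %s", symbol);
    emit(e, line);
    if (pad) {   /* undo the padding so later pops see the expected layout */
        snprintf(line, sizeof line, "addq $%d, %%rsp", pad);
        emit(e, line);
    }
}
```

With one temporary pushed, emit_call_aligned wraps the call in a matching subq/addq pair of 8 bytes; with none pushed, it emits the bare call. The sketch assumes the function prologue has already left %rsp aligned, which is why only the pushes made afterwards are counted.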
Runtime & CLI
- runtime/rt.c exports hsu_print_cstr, hsu_print_int, and hsu_concat, which the generated assembly calls through the System V ABI.
- main.c wires the stages together and exposes CLI switches: --ast-only (print the AST), --emit-asm [path], --compile [output], and --dump-rt [path] for unpacking the embedded runtime object.
- The default mode writes assembly, assembles and links it with the runtime blob, and then executes the resulting binary so you get immediate feedback.
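For a sense of what a runtime helper of this kind involves, here is a minimal sketch of a concatenation routine. The name sketch_concat, its signature, and its fatal handling of allocation failure are assumptions; the actual hsu_concat in runtime/rt.c may differ on all three.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a runtime concat helper. Under the System V ABI
   the two pointer arguments arrive in %rdi and %rsi, and the result comes
   back in %rax, so generated assembly can call it like any C function. */
char *sketch_concat(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    size_t la = strlen(a), lb = strlen(b);
    char *out = malloc(la + lb + 1);
    if (out == NULL) abort();      /* this sketch treats allocation failure as fatal */
    memcpy(out, a, la);
    memcpy(out + la, b, lb + 1);   /* the +1 copies b's terminating NUL */
    return out;
}
```

The caller owns the returned buffer. Whether and when such buffers are freed is a policy decision for the runtime that this sketch leaves open.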
Language Surface
- let bindings, shadowing, and nested lexical scopes
- Arithmetic, comparison, logical operators, and unary negation
- Strings with runtime concatenation and printing
- if/elif/else, for, and while
- Top-level fn declarations with return types
- write(expr) for stdout and exit(code) to terminate with a status
Tooling & Tests
- tools/build.sh builds build/hsc, embeds the runtime object with objcopy, and links everything into a single executable.
- Parser fixtures under tests/cases compare expected AST dumps against actual output, catching regressions in syntactic handling.
- End-to-end samples in tests/exec compile, link, and run .hsc programs, verifying both stdout and exit codes via tools/runexec.sh.
- A top-level tools/run_all_tests.sh script stitches the suites together so one command exercises the entire pipeline.
What I Took Away
Manually managing ABI details, stack alignment, and even simple string interning gave me a concrete feel for what full-size compilers hide behind abstractions. hsuScript is now my go-to reference when explaining how source code becomes a runnable binary without relying on existing toolchains to do the hard parts.