Hindley-Milner Type Inference in Lateralus
One of the most common complaints I hear about statically typed languages is the annotation tax — the feeling that you're writing types for the compiler, not for yourself. In dynamically typed languages you just write x = 42 and move on. In many static languages you write int x = 42, which feels redundant when the compiler can clearly see there's a 42 right there.
When I designed Lateralus, I wanted the best of both worlds: the safety of static types with the ergonomics of dynamic typing. The answer was Hindley-Milner type inference — a well-studied algorithm from the 1970s and 80s that can infer the types of nearly every expression without a single annotation.
The Problem with Explicit Types
Consider a simple function that doubles a number. In many languages, you'd write something like this:
// Java-style: types everywhere
int double(int x) {
return x * 2;
}
// TypeScript: better, but still annotated
function double(x: number): number {
return x * 2;
}
In Lateralus, you just write the logic:
let double = x => x * 2
// The compiler infers: (Int) -> Int
// No annotation needed — the * operator constrains x to Int
This isn't just syntactic sugar. The compiler genuinely derives the type from the structure of the expression. And it works for far more complex cases than simple arithmetic.
What Is Hindley-Milner?
Hindley-Milner (often called HM) is a type system and inference algorithm discovered by Roger Hindley in 1969 and rediscovered independently by Robin Milner in 1978. Luis Damas and Milner gave the formal account, including Algorithm W, in 1982. The key insight is that for a language without subtyping or ad-hoc overloading, every well-typed expression has a unique most-general type that can be computed mechanically.
The algorithm works in two phases:
- Constraint generation — Walk the AST and emit equations like "the type of x must equal the type of y"
- Unification — Solve those equations simultaneously to find concrete types
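To make the two phases concrete, here is a minimal sketch in Python (illustrative only, not the Lateralus compiler) of a type representation and the single constraint that phase one would emit for the earlier double = x => x * 2 example:

```python
# Hypothetical sketch: type terms and the constraint emitted for `x => x * 2`.
from dataclasses import dataclass

@dataclass(frozen=True)
class TypeVar:
    name: str

@dataclass(frozen=True)
class Con:            # concrete type constructor, e.g. Int
    name: str

@dataclass(frozen=True)
class Arrow:          # function type: arg -> result
    arg: object
    result: object

Int = Con("Int")

# Phase 1: x gets a fresh type variable; the `*` operator forces
# both operands (and the result) to be Int.
a = TypeVar("a")                   # type of x
constraints = [(a, Int)]           # x is used with *, so a = Int
double_type = Arrow(a, Int)        # the lambda's type before solving
```

Phase two (unification, shown later) would solve the constraint and rewrite double_type to Int -> Int.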
Constraint Generation in Lateralus
Let's trace through a real example. Consider this function:
let apply_twice = (f, x) => x |> f |> f
The compiler assigns fresh type variables to each unknown:
// Internal constraint generation:
// f : α
// x : β
// apply_twice : (α, β) -> γ
//
// From `x |> f`:
// α must be a function: α = β -> δ
//
// From `(x |> f) |> f`:
// α must accept δ as input: α = δ -> γ
//
// Combining: β -> δ = δ -> γ
// therefore β = δ = γ
//
// Final type: apply_twice : ((β -> β), β) -> β
The compiler deduced that f must be a function from some type to the same type, and that the input and output of apply_twice share that type. All without a single annotation.
The Unification Algorithm
Unification is the heart of HM inference. Given a set of type equations, it finds a substitution — a mapping from type variables to types — that satisfies all constraints simultaneously. Here's a simplified view of how Lateralus implements it:
// Pseudocode for the unification engine
let unify = (t1, t2, subst) => match (apply_subst(t1, subst), apply_subst(t2, subst))
| (TypeVar(a), t) => if occurs(a, t)
then Error("infinite type")
else extend(subst, a, t)
| (t, TypeVar(a)) => unify(TypeVar(a), t, subst)
| (Arrow(a1, r1), Arrow(a2, r2)) =>
subst |> unify(a1, a2) |> unify(r1, r2)
| (Con(n1), Con(n2)) when n1 == n2 => subst
| _ => Error("type mismatch")
The occurs check in the TypeVar case prevents infinite types. Without it, you could unify α with List<α>, producing the nonsensical List<List<List<...>>>. The Lateralus compiler catches this and reports a clear error.
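The pseudocode ports directly to a runnable sketch. This Python version is illustrative rather than the actual Lateralus implementation; it solves the apply_twice constraint β -> δ = δ -> γ from the earlier trace, and demonstrates the occurs check rejecting an infinite type:

```python
# Runnable port of the unification pseudocode (illustrative sketch).
from dataclasses import dataclass

@dataclass(frozen=True)
class TypeVar:
    name: str

@dataclass(frozen=True)
class Con:
    name: str

@dataclass(frozen=True)
class Arrow:
    arg: object
    result: object

def apply_subst(t, subst):
    """Resolve a type under a substitution (chase variable bindings)."""
    while isinstance(t, TypeVar) and t in subst:
        t = subst[t]
    if isinstance(t, Arrow):
        return Arrow(apply_subst(t.arg, subst), apply_subst(t.result, subst))
    return t

def occurs(a, t):
    """Does variable a occur inside (already-resolved) type t?"""
    if t == a:
        return True
    if isinstance(t, Arrow):
        return occurs(a, t.arg) or occurs(a, t.result)
    return False

def unify(t1, t2, subst):
    t1, t2 = apply_subst(t1, subst), apply_subst(t2, subst)
    if isinstance(t1, TypeVar):
        if t1 == t2:
            return subst
        if occurs(t1, t2):
            raise TypeError("infinite type")
        return {**subst, t1: t2}
    if isinstance(t2, TypeVar):
        return unify(t2, t1, subst)
    if isinstance(t1, Arrow) and isinstance(t2, Arrow):
        subst = unify(t1.arg, t2.arg, subst)
        return unify(t1.result, t2.result, subst)
    if isinstance(t1, Con) and isinstance(t2, Con) and t1 == t2:
        return subst
    raise TypeError(f"type mismatch: {t1} vs {t2}")

# Solve the apply_twice constraint from the trace: beta -> delta = delta -> gamma
b, d, g = TypeVar("beta"), TypeVar("delta"), TypeVar("gamma")
s = unify(Arrow(b, d), Arrow(d, g), {})
# beta, delta, and gamma all resolve to the same variable:
assert apply_subst(b, s) == apply_subst(d, s) == apply_subst(g, s)

# The occurs check rejects alpha = alpha -> Int (an infinite type):
try:
    unify(TypeVar("alpha"), Arrow(TypeVar("alpha"), Con("Int")), {})
except TypeError as e:
    print(e)   # prints: infinite type
```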
Type Variables and Generalization
After inference completes for a let binding, the compiler generalizes any remaining type variables into polymorphic parameters. This is what lets you write generic functions naturally:
let identity = x => x
// Inferred: ∀α. α -> α
let first = (a, b) => a
// Inferred: ∀α β. (α, β) -> α
// Both work with any type:
identity(42) // Int
identity("hello") // String
first(1, "two") // Int
This is called let-polymorphism, and it's the feature that separates HM from simpler inference systems. Each use of identity gets a fresh instantiation of the type scheme, so it can be used at different types without conflict.
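Generalization and per-use instantiation can be sketched in a few lines of Python. The helper names (free_vars, generalize, instantiate) are my own for illustration, not Lateralus internals:

```python
# Hypothetical sketch of let-generalization: quantify a monotype's free
# variables into a scheme, then instantiate fresh variables per use site.
from dataclasses import dataclass
from itertools import count

@dataclass(frozen=True)
class TypeVar:
    name: str

@dataclass(frozen=True)
class Arrow:
    arg: object
    result: object

_fresh = count()

def free_vars(t):
    if isinstance(t, TypeVar):
        return {t}
    if isinstance(t, Arrow):
        return free_vars(t.arg) | free_vars(t.result)
    return set()

def generalize(t):
    """Close over every free variable: forall a... . t"""
    return (sorted(free_vars(t), key=lambda v: v.name), t)

def instantiate(scheme):
    """Replace each quantified variable with a fresh one."""
    qvars, t = scheme
    mapping = {v: TypeVar(f"t{next(_fresh)}") for v in qvars}
    def walk(t):
        if isinstance(t, TypeVar):
            return mapping.get(t, t)
        if isinstance(t, Arrow):
            return Arrow(walk(t.arg), walk(t.result))
        return t
    return walk(t)

# identity : forall a. a -> a
a = TypeVar("a")
identity_scheme = generalize(Arrow(a, a))

# Each use site gets distinct fresh variables, so identity(42) and
# identity("hello") can be typed at Int and String without conflict.
use1 = instantiate(identity_scheme)
use2 = instantiate(identity_scheme)
assert use1 != use2               # fresh variables per use
assert use1.arg == use1.result    # but each keeps the a -> a shape
```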
When You DO Need Annotations
HM inference is powerful, but it has limits. There are a few cases where Lateralus asks for explicit annotations:
- Recursive types — e.g., a tree node that contains children of its own type
- Higher-rank polymorphism — passing polymorphic functions as arguments
- FFI boundaries — interfacing with Python or C functions
- Ambiguous numeric literals — when 42 could be Int or Float in context
// Recursive type needs annotation
type Tree<T> = Leaf(T) | Branch(Tree<T>, Tree<T>)
// FFI boundary needs annotation
extern fn py_json_parse(s: String) -> Dict<String, Any>
// Everything else? The compiler handles it.
let process = data =>
data
|> filter(x => x.age > 18)
|> map(x => x.name)
|> sort()
// Inferred: (List<{age: Int, name: String, ...}>) -> List<String>
Comparison: Lateralus vs. TypeScript vs. Rust
TypeScript uses bidirectional type checking — it flows types both "forward" from expressions and "backward" from expected contexts. It's pragmatic but unsound by design (a single any breaks the guarantees). Lateralus has no such escape hatch: either inference succeeds completely or compilation fails.
Rust uses a more complex system with trait bounds, lifetimes, and borrow checking layered on top. Rust's inference is local — it can infer types within a function body but requires signatures on all function declarations. Lateralus infers across function boundaries, which means top-level functions can be written without signatures.
In practice, well-written Lateralus code carries 80-90% fewer type annotations than equivalent Rust, with soundness guarantees that TypeScript's escape hatches can't provide.
Pipeline Inference: The Killer Feature
Where HM inference really shines in Lateralus is with pipelines. Because the |> operator threads a value through a chain of functions, the compiler can infer the entire chain's types from a single starting point:
let result =
"hello world 123 foo 456"
|> split(" ") // String -> List<String>
|> filter(is_numeric) // List<String> -> List<String>
|> map(parse_int) // List<String> -> List<Int>
|> fold(0, add) // List<Int> -> Int
// result : Int — inferred through 4 pipeline stages
Each stage constrains the next. The type flows left-to-right, just like the data. I find this deeply satisfying — the type system and the syntax are aligned in a way that makes both more intuitive.
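This left-to-right flow can be modeled as a simple checker that threads a type through a list of stage signatures, matching each stage's input against the previous output. The signatures below are assumptions based on the comments in the example above, with types as plain strings for brevity:

```python
# Hypothetical model of pipeline inference: thread a type left to right
# through stage signatures (name, input type, output type).
stages = [
    ("split",  "String",       "List<String>"),
    ("filter", "List<String>", "List<String>"),
    ("map",    "List<String>", "List<Int>"),
    ("fold",   "List<Int>",    "Int"),
]

def infer_pipeline(start, stages):
    t = start
    for name, t_in, t_out in stages:
        if t != t_in:
            raise TypeError(f"stage '{name}' expects {t_in}, got {t}")
        t = t_out
    return t

result = infer_pipeline("String", stages)
assert result == "Int"   # inferred through 4 pipeline stages
```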
Types should follow the data. In Lateralus, they literally do — flowing through the pipeline just like the values themselves.
What's Next
We're actively working on extending the type system with row polymorphism for structural record types and effect inference to track side effects through pipelines. The goal is to maintain the zero-annotation ergonomics while giving the compiler even more information to optimize with.
Types and the pipeline operator
The type system's relationship with |> deserves special attention. Pipelines are typed as function composition:
// Each |> stage must type-check:
// filter : (List<T>, T -> Bool) -> List<T>
// map : (List<T>, T -> U) -> List<U>
// sum : (List<Numeric>) -> Numeric
let result = [1, 2, 3, 4, 5] // List<Int>
|> filter(x => x > 2) // List<Int> (filter preserves element type)
|> map(x => x.to_float() / 10) // List<Float> (map transforms element type)
|> sum() // Float
// Type error caught at compile time:
let bad = ["a", "b", "c"]
|> filter(x => x > 2) // Error: String does not support '>' with Int
The compiler traces types through the entire pipeline chain. If any stage produces a type incompatible with the next stage's input, you get a precise error pointing to the exact |> boundary where the mismatch occurs.
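The boundary-precise error reporting described here can be modeled with a small checker that walks the stages left to right and reports the index of the first |> boundary where types disagree. This is a hypothetical sketch with string types, not the compiler's actual diagnostics:

```python
# Hypothetical model of boundary-level pipeline diagnostics.
def check_pipeline(start, stages):
    t = start
    for i, (name, t_in, t_out) in enumerate(stages, start=1):
        if t != t_in:
            return f"error at |> boundary {i}: '{name}' expects {t_in}, got {t}"
        t = t_out
    return f"ok: pipeline has type {t}"

good = [("filter", "List<Int>", "List<Int>"), ("sum", "List<Int>", "Int")]
bad  = [("split", "String", "List<String>"), ("sum", "List<Int>", "Int")]

print(check_pipeline("List<Int>", good))
print(check_pipeline("String", bad))   # mismatch reported at the second |> boundary
```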