Language Reference

Deep Dive: Vocabulary of Intent

An exploration of the 10 explicit semantic contracts in X-Lang. For each contract, this page covers the performance problem it addresses, the guarantee it expresses, and the LLVM IR the compiler generates for it.

Memory & Data

@layout(SoA)

Data-Oriented Design for Vectorisation

The Problem

The default Array-of-Structures (AoS) layout creates strided memory access when iterating over a single field. Each cache line fetched is mostly filled with fields the loop never reads, and the vectorizer must either fall back to scalar code or use slow strided and gather loads.

The Contract

Instructs the compiler to physically transform the in-memory layout of an array of structs into a Structure-of-Arrays (SoA), achieving contiguous, unit-stride access.

LLVM Mechanism

The compiler rewrites the type definition into a 'handle' struct of pointers, and transforms single GEP (getelementptr) instructions into multi-step sequences that load the field base pointer before indexing.

X-Lang Source

#[layout(SoA)]

struct Particle {

x: f64,

y: f64,

vx: f64

}

fn update(p: &mut [Particle]) {

for i in 0..p.len() {

// Compiler generates contiguous access

p[i].x += p[i].vx;

}

}

Generated LLVM IR

; 1. Type is transformed into a Handle

%Particle_SoA = type { ptr, ptr, ptr }

; 2. GEPs are rewritten for unit-stride access

%x_ptr_ptr = getelementptr %Particle_SoA, ptr %handle, i32 0, i32 0

%x_base = load ptr, ptr %x_ptr_ptr

%x_addr = getelementptr double, ptr %x_base, i32 %i
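
The layout change can be approximated by hand in Rust. This is a minimal sketch, not the compiler's actual output: `ParticlesSoA`, `from_aos`, `update_aos` and `update_soa` are hypothetical names chosen for illustration, and `Vec<f64>` stands in for the raw field pointers held by the handle.

```rust
// Array-of-Structures: one struct per particle, fields interleaved in memory.
#[derive(Clone, Copy)]
struct Particle {
    x: f64,
    y: f64,
    vx: f64,
}

// Structure-of-Arrays: one contiguous array per field.
// Hypothetical stand-in for the compiler-generated handle.
struct ParticlesSoA {
    x: Vec<f64>,
    y: Vec<f64>,
    vx: Vec<f64>,
}

impl ParticlesSoA {
    // Copy each field of the AoS slice into its own array.
    fn from_aos(ps: &[Particle]) -> Self {
        ParticlesSoA {
            x: ps.iter().map(|p| p.x).collect(),
            y: ps.iter().map(|p| p.y).collect(),
            vx: ps.iter().map(|p| p.vx).collect(),
        }
    }
}

// AoS update: consecutive `x` values are 24 bytes apart (three f64 fields).
fn update_aos(ps: &mut [Particle]) {
    for p in ps.iter_mut() {
        p.x += p.vx;
    }
}

// SoA update: `x` and `vx` are each traversed with unit stride.
fn update_soa(ps: &mut ParticlesSoA) {
    for (x, vx) in ps.x.iter_mut().zip(ps.vx.iter()) {
        *x += *vx;
    }
}
```

Both functions compute the same result; only the memory access pattern differs, which is the property the contract is meant to change.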

@unique

Eliminating Alias Analysis Pessimism

The Problem

Without explicit guarantees, a compiler must conservatively assume any two pointers might alias. This creates a 'Wall of Ambiguity' that forces redundant memory reloads and prevents instruction reordering.

The Contract

Backed by the frontend borrow checker, it guarantees that for the duration of its lifetime, the reference is the sole, exclusive pointer to that memory region.

LLVM Mechanism

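Translates directly to the standard LLVM `noalias` parameter attribute. This tells LLVM's alias analysis that, for the duration of the call, memory accessed through the parameter is not also accessed through any pointer that is not derived from it. With that guarantee in hand, Global Value Numbering (GVN), Loop-Invariant Code Motion (LICM), and vectorization can all proceed.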

X-Lang Source

fn process(d1: &unique mut Data, d2: &unique mut Data) {

d1.a = 10;

d2.b = 20;

// Compiler knows d2 write cannot affect d1

let result = d1.a;

}

Generated LLVM IR

; High-level contract lowered to 'noalias' attribute

define void @process(ptr noalias %d1, ptr noalias %d2) {

store i32 10, ptr %d1

store i32 20, ptr %d2

; Redundant load is eliminated by the optimizer

; The value 10 is forwarded directly

ret void

}
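
The effect of the aliasing guarantee can be observed in Rust, where raw pointers carry no such guarantee and `&mut` references do. The sketch below is an illustration under that analogy, not X-Lang semantics; both function names are hypothetical.

```rust
// Raw pointers may alias, so the final read has to go back to memory:
// the store through `q` might have overwritten `*p`.
unsafe fn store_then_load_raw(p: *mut i32, q: *mut i32) -> i32 {
    unsafe {
        *p = 10;
        *q = 20;
        *p // 20 if p and q point to the same i32, otherwise 10
    }
}

// Two `&mut` parameters cannot refer to the same i32 in safe Rust, so the
// stored 10 may be forwarded to the return value without a reload.
fn store_then_load_unique(p: &mut i32, q: &mut i32) -> i32 {
    *p = 10;
    *q = 20;
    *p // always 10
}
```

Calling the raw-pointer version with the same pointer twice returns 20, which is exactly the case a conservative compiler has to keep working when no uniqueness contract is present.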

@invariant

Axiomatic Check Elimination

The Problem

Defensive programming requires runtime checks (e.g., bounds checking) to maintain internal consistency. In hot loops, the cumulative cost of these redundant conditional branches degrades pipeline performance.

The Contract

Provides a formal axiom that is guaranteed by the programmer to hold true between any two public method calls.

LLVM Mechanism

The compiler injects `call void @llvm.assume(i1 %inv)`. The optimizer treats the condition as a known fact, which lets it prove that defensive paths (like panics) are dead code and prune those branches entirely. If the invariant is ever violated at runtime, the behaviour is undefined.

X-Lang Source

struct BoundedVec {

len: i32, cap: i32

invariant { self.len <= self.cap }

}

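fn push(&mut self) {

// Defensive check (illustrative): the invariant len <= cap proves this branch is dead

if self.len > self.cap { panic("len exceeds cap"); }

// Still required: the invariant allows len == cap

if self.len < self.cap { self.len += 1; }

}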

Generated LLVM IR

; Axiom is passed to the optimizer

%inv = icmp sle i32 %len, %cap

call void @llvm.assume(i1 %inv)

; The panic branch is pruned, resulting in linear CFG

%new_len = add i32 %len, 1

ret i32 %new_len
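
A rough Rust analogue of the injected assumption is sketched below. It assumes `std::hint::unreachable_unchecked` as a stand-in for `llvm.assume`; the `assume` helper and the `bool` return value of `push` are hypothetical additions for illustration.

```rust
struct BoundedVec {
    len: i32,
    cap: i32,
}

impl BoundedVec {
    // Hypothetical helper: tells the optimizer that `cond` holds.
    // Safety: calling this with a false condition is undefined behaviour.
    unsafe fn assume(cond: bool) {
        if !cond {
            unsafe { std::hint::unreachable_unchecked() }
        }
    }

    // Returns true if an element slot was claimed.
    fn push(&mut self) -> bool {
        // Stand-in for the compiler-injected axiom `len <= cap`.
        unsafe { Self::assume(self.len <= self.cap) };

        // Defensive check: contradicts the assumption, so it can be pruned.
        if self.len > self.cap {
            panic!("len exceeds cap");
        }

        // This check is NOT implied by the invariant (len == cap is allowed).
        if self.len < self.cap {
            self.len += 1;
            true
        } else {
            false
        }
    }
}
```

Note which branch the axiom removes: only a check that contradicts `len <= cap` becomes dead code, while the capacity test itself still has to run.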

Purity & State

@pure

Guaranteeing Referential Transparency

The Problem

Function calls act as opaque memory barriers. The compiler must assume an unknown function might modify global state or aliased memory, invalidating cached registers and blocking code motion.

The Contract

A mathematically verifiable promise that the function's output depends solely on its inputs and that it produces no observable side effects.

LLVM Mechanism

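Lowers to the LLVM `readnone` (no memory access) or `readonly` (no memory writes) function attributes, which LLVM 16 and later spell `memory(none)` and `memory(read)`. With these attributes the optimizer no longer has to treat the call as a potential write to memory, which safely enables Common Subexpression Elimination (CSE) across call boundaries.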

X-Lang Source

pure fn calculate(x: f64) -> f64 {

return (x * 3.14) - (x * x);

}

fn main() {

let a = calculate(5.0);

let b = calculate(5.0); // Safely eliminated

}

Generated LLVM IR

; 'readnone' attribute proves zero side-effects

define double @calculate(double %x) readnone {

; ...

}

; At the call site, the second call is removed via CSE

%a = call double @calculate(double 5.0)

; %b is replaced by %a
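
The rewrite that CSE performs can be written out by hand. The sketch below only illustrates the before and after shapes; `as_written` and `after_cse` are hypothetical names, and Rust itself has no `pure` keyword.

```rust
// No side effects, and the result depends only on `x`.
fn calculate(x: f64) -> f64 {
    (x * 3.14) - (x * x)
}

// As written in the source: two calls with the same argument.
fn as_written() -> (f64, f64) {
    let a = calculate(5.0);
    let b = calculate(5.0);
    (a, b)
}

// After CSE: the second call is replaced by the first call's result.
// This is only valid because `calculate` is pure.
fn after_cse() -> (f64, f64) {
    let a = calculate(5.0);
    let b = a;
    (a, b)
}
```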

@memoise

Automated Dynamic Programming

The Problem

Recursive algorithms with overlapping subproblems (e.g., naive Fibonacci) exhibit exponential O(2^n) time complexity because the same subproblems are recomputed again and again. A naive fib(50) already makes tens of billions of calls.

The Contract

Instructs the compiler to automatically generate a stateful caching mechanism (memoization) for a `@pure` function.

LLVM Mechanism

The pass renames the original function to an internal implementation, generates a public wrapper with cache lookup/update logic (using hash maps or arrays), and rewrites all recursive calls to target the wrapper.

X-Lang Source

// Simply add the keyword; complexity becomes O(n)

pure memoised fn fib(n: u64) -> u64 {

if n <= 1 { return n; }

return fib(n - 1) + fib(n - 2);

}

Generated LLVM IR

; Cache lookup injected into generated wrapper

%found_ptr = call ptr @__xl_rt_get(ptr @cache, i64 %n)

%found = icmp ne ptr %found_ptr, null

br i1 %found, label %hit, label %miss

miss:

%val = call i64 @fib.impl(i64 %n)

call void @__xl_rt_set(ptr @cache, i64 %n, i64 %val)
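
The wrapper-plus-implementation split described above can be sketched in Rust. This is a simplified illustration under stated assumptions: the cache is a `HashMap` passed explicitly rather than a global, and `fib_impl` and `fib_memo` are hypothetical names mirroring `@fib.impl` and the generated wrapper.

```rust
use std::collections::HashMap;

// The original body, renamed. Its recursive calls target the wrapper,
// so every subproblem goes through the cache.
fn fib_impl(n: u64, cache: &mut HashMap<u64, u64>) -> u64 {
    if n <= 1 {
        return n;
    }
    fib_memo(n - 1, cache) + fib_memo(n - 2, cache)
}

// The generated wrapper: cache lookup, call the implementation on a miss,
// then store the result.
fn fib_memo(n: u64, cache: &mut HashMap<u64, u64>) -> u64 {
    if let Some(&hit) = cache.get(&n) {
        return hit;
    }
    let val = fib_impl(n, cache);
    cache.insert(n, val);
    val
}

// Public entry point with a fresh cache per top-level call.
fn fib(n: u64) -> u64 {
    let mut cache = HashMap::new();
    fib_memo(n, &mut cache)
}
```

Each value of `n` is computed once and then served from the cache, so the number of `fib_impl` invocations grows linearly with `n`.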

Execution Models

@static

Partial Evaluation via Function Specialisation

The Problem

Generic functions (like regex engines or state machines) evaluating dynamic configurations incur massive interpretive overhead—constant branching, parsing, and lookups at runtime.

The Contract

Declares that a specific parameter will be a compile-time constant for a given call site, explicitly triggering Partial Evaluation.

LLVM Mechanism

The `StaticSpecialisationPass` clones the function, propagates the constant argument into the clone, and re-runs aggressive optimization (constant folding, DCE). The result is a bespoke, hard-coded finite automaton.

X-Lang Source

fn regex_match(pattern: static &str, text: &str) -> bool {

// Generic interpretive logic...

}

// Generates a specialized version for this exact string

let is_email = regex_match("[a-z]+@[a-z]+", input);

Generated LLVM IR

; The generic interpreter is bypassed entirely

; A highly specialized function is generated

define i1 @regex_match_specialised_1(ptr %text) {

; The state machine is baked directly into the CFG

%res = call i1 @check_alpha_sequence(ptr %text)

ret i1 %res

}
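
To make the idea of specialisation concrete, the sketch below compares a generic interpreter with a version specialised by hand for one pattern. It assumes a tiny, hypothetical pattern language (`Step`) with anchored matching and no backtracking; it is not the X-Lang regex engine, and the real pass derives the specialised function automatically.

```rust
// Hypothetical pattern language: a sequence of steps, matched left to right.
#[derive(Clone, Copy)]
enum Step {
    LowerPlus, // one or more characters in [a-z]
    Lit(char), // exactly this character
}

// Generic interpreter: walks the pattern at runtime on every call.
fn match_generic(pattern: &[Step], text: &str) -> bool {
    let chars: Vec<char> = text.chars().collect();
    let mut pos = 0;
    for step in pattern {
        match *step {
            Step::LowerPlus => {
                let start = pos;
                while pos < chars.len() && chars[pos].is_ascii_lowercase() {
                    pos += 1;
                }
                if pos == start {
                    return false; // needed at least one letter
                }
            }
            Step::Lit(c) => {
                if pos < chars.len() && chars[pos] == c {
                    pos += 1;
                } else {
                    return false;
                }
            }
        }
    }
    pos == chars.len() // anchored: the whole text must be consumed
}

// Specialised for [LowerPlus, Lit('@'), LowerPlus]: the pattern loop and the
// per-step dispatch are gone, leaving only checks on the text.
fn match_email_specialised(text: &str) -> bool {
    match text.split_once('@') {
        Some((user, host)) => {
            !user.is_empty()
                && !host.is_empty()
                && user.bytes().all(|b| b.is_ascii_lowercase())
                && host.bytes().all(|b| b.is_ascii_lowercase())
        }
        None => false,
    }
}
```

The two functions accept the same inputs for this pattern; the specialised one simply has the pattern "baked in", which is the outcome partial evaluation aims for.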

@parallel

Programmer-Guided Fork-Join Concurrency

The Problem

Fully automatic parallelisation rarely succeeds on real code because static analysis often cannot prove the absence of loop-carried dependencies, and a compiler that cannot prove independence must keep the loop sequential.

The Contract

A guarantee from the programmer that all iterations of a `for` loop are strictly data-independent and can execute concurrently without race conditions. The compiler takes this guarantee on trust and skips its own dependency analysis.

LLVM Mechanism

The compiler outlines the loop body into a separate 'work' function, replaces the original loop with a call to the X-Lang parallel runtime, and dispatches the work across available hardware threads.

X-Lang Source

fn scale(data: &mut [f64], factor: f64) {

// Bypasses dependency analysis; forces threading

parallel for i in 0..data.len() {

data[i] *= factor;

}

}

Generated LLVM IR

; 1. Loop body is extracted to an internal work function

define internal void @work_fn(i32 %idx) { ... }

; 2. Sequential loop replaced with runtime dispatch

call void @__xl_rt_dispatch(ptr @work_fn, i32 %len)

ret void
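
The outline-and-dispatch shape can be sketched with Rust's scoped threads. This is an illustration only: `std::thread::scope` stands in for the X-Lang parallel runtime, the work function takes a chunk rather than a single index, and the `workers` parameter is a hypothetical knob that a real runtime would choose itself.

```rust
use std::thread;

// The outlined loop body: processes one contiguous chunk of the data.
fn work_fn(chunk: &mut [f64], factor: f64) {
    for v in chunk.iter_mut() {
        *v *= factor;
    }
}

// Stand-in for the runtime dispatch: split the slice into disjoint chunks
// and fork one thread per chunk.
fn scale_parallel(data: &mut [f64], factor: f64, workers: usize) {
    if data.is_empty() {
        return;
    }
    let workers = workers.max(1);
    // Ceiling division so every element lands in some chunk.
    let chunk_len = (data.len() + workers - 1) / workers;
    thread::scope(|s| {
        for chunk in data.chunks_mut(chunk_len) {
            s.spawn(move || work_fn(chunk, factor));
        }
    }); // join point: all spawned threads have finished here
}
```

Because `chunks_mut` hands out non-overlapping slices, the data independence that the `parallel` contract asks the programmer to promise is enforced by the type system in this sketch.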

Control Flow & Dispatch

@become

Guaranteed Tail-Call Optimisation (TCO)

The Problem

Deep recursion consumes O(n) stack space, which causes a stack overflow crash once the recursion depth exceeds the available stack. Standard compilers offer TCO only as a best-effort optimisation that can silently fail to apply.

The Contract

An explicit demand that the call must be compiled with Tail-Call Optimisation. If the call is not in a true tail position, the compiler produces a hard error.

LLVM Mechanism

Generates an LLVM `musttail` call. The plain `tail` marker is only a hint that the optimizer is free to ignore, whereas `musttail` guarantees that the call is compiled as a tail call. The backend transforms the recursive call into an in-place register update followed by a simple `jmp`, ensuring O(1) constant stack space.

X-Lang Source

fn sum(n: i64, acc: i64) -> i64 {

if n == 0 { return acc; }

// Compiler guarantees this will not grow the stack

become sum(n - 1, acc + n);

}

Generated LLVM IR

; 'musttail' obliges the backend to emit a jump; plain 'tail' is only a hint

%res = musttail call i64 @sum(i64 %n_minus_1, i64 %acc_next)

ret i64 %res

; Machine code generated:

; add rsi, rdi

; sub rdi, 1

; jmp .L_loop_body
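
The machine code above is the same shape as a hand-written loop. The Rust sketch below shows the recursive form next to that loop; the function names are hypothetical, and stable Rust does not guarantee that the recursive version is compiled as a jump.

```rust
// Recursive form: the call is in tail position, but nothing here forces
// the compiler to reuse the stack frame.
fn sum_recursive(n: i64, acc: i64) -> i64 {
    if n == 0 {
        return acc;
    }
    sum_recursive(n - 1, acc + n)
}

// What a guaranteed tail call amounts to: update the arguments in place
// and jump back to the top of the function.
fn sum_loop(mut n: i64, mut acc: i64) -> i64 {
    while n != 0 {
        acc += n; // corresponds to: add rsi, rdi
        n -= 1;   // corresponds to: sub rdi, 1
    }
    acc
}
```

The loop version runs in constant stack space for any `n`, which is the behaviour the `become` contract promises for the recursive source.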

@multi

Extensible and Efficient Multiple Dispatch

The Problem

Resolving behavior based on the runtime types of multiple objects (e.g., collision detection) traditionally requires the verbose, rigid, and branch-heavy Visitor Pattern (chained virtual calls).

The Contract

Defines a polymorphic interface where dispatch logic is automatically generated to resolve to the correct implementation based on the dynamic types of all arguments.

LLVM Mechanism

Assigns 32-bit type IDs, combines them into a single 64-bit key, and generates a unified resolver function using an LLVM `switch` instruction. LLVM lowers the switch to an O(1) jump table when the keys are dense and to a short tree of comparisons when they are sparse.

X-Lang Source

multi fn collide(a: &Circle, b: &Rect) -> bool { ... }

multi fn collide(a: &Rect, b: &Rect) -> bool { ... }

// Dispatched in O(1) time without virtual chaining

collide(shapeA, shapeB);

Generated LLVM IR

; Type IDs combined into a single 64-bit key

%key = or i64 %t0_shifted, %t1_zext

; LLVM optimizes this switch into a fast jump table

switch i64 %key, label %default [

; Keys assume illustrative type IDs: Circle = 1, Rect = 2

i64 4294967298, label %case_Circle_Rect

i64 8589934594, label %case_Rect_Rect

]
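
The key construction and the resolver can be sketched in Rust. The type IDs below (`CIRCLE = 1`, `RECT = 2`) are assumptions made for illustration, and the resolver returns a label instead of calling an implementation so that the example stays self-contained.

```rust
// Illustrative type IDs; in X-Lang these are assigned by the compiler.
const CIRCLE: u32 = 1;
const RECT: u32 = 2;

// Pack two 32-bit type IDs into one 64-bit key: first argument's ID in the
// high half, second argument's ID in the low half.
const fn dispatch_key(t0: u32, t1: u32) -> u64 {
    ((t0 as u64) << 32) | (t1 as u64)
}

const CIRCLE_RECT: u64 = dispatch_key(CIRCLE, RECT);
const RECT_RECT: u64 = dispatch_key(RECT, RECT);

// The unified resolver: a single match on the combined key, mirroring the
// `switch` in the generated IR.
fn resolve(t0: u32, t1: u32) -> &'static str {
    match dispatch_key(t0, t1) {
        CIRCLE_RECT => "collide(Circle, Rect)",
        RECT_RECT => "collide(Rect, Rect)",
        _ => "no matching implementation",
    }
}
```

Note that the key is order-sensitive: `(Rect, Circle)` produces a different key from `(Circle, Rect)` and falls through to the default case unless an implementation is registered for it.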

@no throws

Managing the High Cost of Exceptions

The Problem

The mere possibility that a function might throw an exception forces the compiler to generate complex, non-linear Control Flow Graphs (CFG) and costly stack unwinding metadata.

The Contract

A two-sided contract built on the `throws` keyword: its presence on a signature signals that callers need unwinding support, while its absence provides a formal `nounwind` guarantee that the function will never throw.

LLVM Mechanism

For `nounwind` functions, the compiler uses the simple `call` instruction instead of the branching `invoke` instruction. This eliminates landing pads, keeps the CFG linear, and significantly boosts inlining probability.

X-Lang Source

// Absence of 'throws' implies a nounwind guarantee

fn safe_add(a: i32, b: i32) -> i32 {

return a + b;

}

Generated LLVM IR

; Simple 'call' generated. Only one linear exit path.

%res = call i32 @safe_add(i32 %a, i32 %b)

ret i32 %res

; Compare to 'invoke' which requires a landingpad block:

; invoke i32 @might_throw() to label %cont unwind label %lpad