JIT

What ships today

Per-host pipeline stages

Each stage is independent. extract parses the object file produced by the host compiler. generate emits the on-disk byte table the runtime patcher reads. build is the full in-tree compile against that byte table. smoke is the test suite running through the JIT pipeline. parity is the four-way auto/on/off/lean byte-identical-stdout check.

Target          Format     extract  generate  build    smoke    parity
ARM64 Darwin    Mach-O 64  green    green     green    green    green
ARM64 Linux     ELF64      green    green     green    green    green
x86_64 Linux    ELF64      green    green     green    green    green
x86_64 Darwin   Mach-O 64  green    green     partial  partial  partial
x86_64 Windows  PE/COFF    green    green     green    green    partial

Partial-cell notes

Runtime control

Each row in the support table is a build claim. At runtime, every JIT-capable binary lets the host choose how the pipeline executes. The five public symbols are stable across releases:

void                  mino_state_set_jit_mode(mino_state_t *S,
                                              mino_jit_mode_t mode);
mino_jit_mode_t       mino_state_jit_mode(const mino_state_t *S);

void                  mino_state_set_jit_hot_threshold(mino_state_t *S,
                                                       unsigned n);
unsigned              mino_state_jit_hot_threshold(const mino_state_t *S);

mino_jit_capability_t mino_state_jit_capability(const mino_state_t *S);

Modes

MINO_JIT_MODE_AUTO (default): compile when the hot-call threshold trips.
MINO_JIT_MODE_OFF: never compile.
MINO_JIT_MODE_ON: compile on first call.

ON is intended for benchmarking and parity testing; embedders should normally stay on AUTO.
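The three modes can be modelled as one decision function. This is a minimal sketch, not mino's implementation: demo_jit_mode_t is a stand-in for mino_jit_mode_t (the real enum values and their order are not shown here), demo_should_compile is a hypothetical helper, and it assumes the threshold "trips" once the call count reaches it.

```c
#include <stdbool.h>

/* Stand-in for mino_jit_mode_t; the real values are an assumption. */
typedef enum {
    DEMO_JIT_MODE_AUTO, /* compile once the hot-call threshold trips */
    DEMO_JIT_MODE_OFF,  /* never compile */
    DEMO_JIT_MODE_ON    /* compile on first call */
} demo_jit_mode_t;

/* Hypothetical helper: should a function that has now been called
 * `calls` times be compiled under `mode` with the given threshold? */
bool demo_should_compile(demo_jit_mode_t mode, unsigned calls,
                         unsigned threshold)
{
    switch (mode) {
    case DEMO_JIT_MODE_OFF:  return false;
    case DEMO_JIT_MODE_ON:   return calls >= 1;
    case DEMO_JIT_MODE_AUTO: return calls >= threshold; /* assumed trip rule */
    }
    return false;
}
```

With the default threshold of 10, this model leaves a function interpreted for calls 1 through 9 in AUTO and compiles it on call 10; in ON it compiles on call 1.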

Hot threshold

Default seed is the compile-time MINO_JIT_THRESHOLD (currently 10 calls). Lower for shorter-lived scripts where warm-up matters; raise to avoid compiling rarely-called functions in long-lived embedders. Inside an AUTO region the threshold collapses to 1 for callees, so the warm-up gap doesn't compound across nested JIT'd calls.
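A sketch of how an embedder might apply this advice through the public setters. The four function signatures match the declarations in the Runtime control section; everything else is an assumption made so the sketch compiles on its own: the struct layout, the enum values, the stub bodies, the demo_tune_jit helper, and the scaling factors 2 and 10.

```c
/* Stand-ins so this sketch compiles without mino's header. In a real
 * embedder these come from mino itself; the layout and enum values
 * below are invented for the example. */
typedef enum { MINO_JIT_MODE_AUTO, MINO_JIT_MODE_OFF, MINO_JIT_MODE_ON }
    mino_jit_mode_t;
typedef struct mino_state {
    mino_jit_mode_t mode;
    unsigned        hot_threshold;
} mino_state_t;

void mino_state_set_jit_mode(mino_state_t *S, mino_jit_mode_t mode)
{ S->mode = mode; }
mino_jit_mode_t mino_state_jit_mode(const mino_state_t *S)
{ return S->mode; }
void mino_state_set_jit_hot_threshold(mino_state_t *S, unsigned n)
{ S->hot_threshold = n; }
unsigned mino_state_jit_hot_threshold(const mino_state_t *S)
{ return S->hot_threshold; }

/* Hypothetical embedder policy: short-lived scripts halve the threshold
 * so warm-up ends sooner; long-lived hosts multiply it by 10 so
 * rarely-called functions are not compiled. Both factors are
 * illustrative, not recommendations. */
void demo_tune_jit(mino_state_t *S, int short_lived)
{
    unsigned base = mino_state_jit_hot_threshold(S);

    mino_state_set_jit_mode(S, MINO_JIT_MODE_AUTO);
    if (short_lived) {
        unsigned lowered = base / 2;
        /* Clamp to 1 so the threshold never reaches 0. */
        mino_state_set_jit_hot_threshold(S, lowered ? lowered : 1);
    } else {
        mino_state_set_jit_hot_threshold(S, base * 10);
    }
}
```

Reading the current threshold before scaling it keeps the policy relative to whatever MINO_JIT_THRESHOLD the binary was built with.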

Capability discovery

mino_state_jit_capability returns a struct with available, mode, threshold, host_arch, and host_os fields. Embedders can call it at startup to size their tuning before any script runs. A mino-lean build returns {.available=0, ...}, so a host build that depends on JIT throughput knows to fall back.
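A startup check along these lines might look as follows. This is a sketch under stated assumptions: demo_jit_capability_t is a stand-in for mino_jit_capability_t whose field names follow the text above but whose field types are guesses, and demo_jit_usable is a hypothetical helper.

```c
#include <stdio.h>

/* Stand-in for mino_jit_capability_t. Field names come from the text;
 * the types are assumptions. */
typedef struct {
    int         available;  /* 0 when the build has no JIT (e.g. mino-lean) */
    int         mode;
    unsigned    threshold;
    const char *host_arch;
    const char *host_os;
} demo_jit_capability_t;

/* Hypothetical startup check: returns 1 if the host can keep its
 * JIT-dependent configuration, 0 if it should fall back. */
int demo_jit_usable(const demo_jit_capability_t *cap)
{
    if (!cap->available) {
        fprintf(stderr, "JIT unavailable on %s/%s; using fallback settings\n",
                cap->host_arch ? cap->host_arch : "unknown",
                cap->host_os ? cap->host_os : "unknown");
        return 0;
    }
    return 1;
}
```

In a real embedder the struct would come from mino_state_jit_capability(S) rather than being filled in by hand.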

CLI flags and env vars

Mode and threshold are also reachable from outside the embed surface for scripting use:

Side-exit deopt path

When a function's first unstenciled op sits past PC 0, the JIT compiles the supported prefix natively and plants an OP_DEOPT_TO_INTERP stencil at the first unstenciled position. The stencil records the resume PC on the state and returns NULL; mino_jit_invoke detects the deopt sentinel, clears the flag, and tail-calls mino_bc_run_resume to drive the interpreter over the same regs window from the recorded PC. The interpreter runs to function exit; subsequent calls re-enter the native prefix from the top, so the deopt cost is paid once per call, not per iteration.
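The control flow described above can be sketched as follows. All names here are simplified stand-ins: the real mino_state_t, mino_jit_invoke, and mino_bc_run_resume are not reproduced, and the register window is modelled as a plain array of long.

```c
#include <stddef.h>

/* Simplified stand-in for the state fields the deopt path touches. */
typedef struct {
    int      deopt_pending; /* set by the deopt stencil */
    unsigned resume_pc;     /* first unstenciled PC */
} demo_state_t;

typedef void *(*demo_native_fn)(demo_state_t *S, long *regs);
typedef void *(*demo_resume_fn)(demo_state_t *S, long *regs, unsigned pc);

/* Run the native prefix; if it side-exits, finish in the interpreter
 * over the same register window, starting at the recorded PC. */
void *demo_invoke(demo_state_t *S, long *regs,
                  demo_native_fn native, demo_resume_fn resume)
{
    void *result = native(S, regs);
    if (result == NULL && S->deopt_pending) {
        S->deopt_pending = 0;                 /* clear the sentinel */
        return resume(S, regs, S->resume_pc); /* interpreter takes over */
    }
    return result;
}

/* Toy native prefix: does some work, then side-exits at PC 3. */
void *demo_prefix(demo_state_t *S, long *regs)
{
    regs[0] += 1;
    S->resume_pc = 3;
    S->deopt_pending = 1;
    return NULL;
}

/* Toy interpreter tail: continues from `pc` on the same registers. */
void *demo_interp_tail(demo_state_t *S, long *regs, unsigned pc)
{
    (void)S;
    regs[0] += (long)pc;
    return regs;
}
```

Checking the flag as well as the NULL return is what lets this model tell a side exit apart from a native function that legitimately returned NULL.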

Two safety gates apply: the resume PC must fit in the 16-bit Bx slot the deopt stencil reads, and no direct-emit branch in the prefix may land past it. Both are checked by mino_jit_eligible before compile; fns failing either gate take the regular interpreter path. MINO_CPJIT_STATS=tracing surfaces an ok-with-deopt line per fn that took the compile-with-deopt path, and the bytes-blocked dashboard splits each op's total into hard (no native prefix) and ok-with-deopt counts so the reader can tell which blockers side-exit picked up.
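The two gates can be expressed as a small predicate. This is a hypothetical model: mino_jit_eligible's real signature and data structures are not shown in this document, and the sketch assumes a branch that lands exactly on the resume PC is allowed, since only targets past it are ruled out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the two side-exit gates. `branch_targets` holds
 * the target PCs of the direct-emit branches in the native prefix. */
bool demo_deopt_gates_pass(uint32_t resume_pc,
                           const uint32_t *branch_targets,
                           size_t n_targets)
{
    /* Gate 1: the resume PC must fit the 16-bit Bx slot. */
    if (resume_pc > UINT16_MAX)
        return false;

    /* Gate 2: no branch in the prefix may land past the resume PC. */
    for (size_t i = 0; i < n_targets; i++) {
        if (branch_targets[i] > resume_pc)
            return false;
    }
    return true;
}
```

A function that fails either check would simply stay on the regular interpreter path, as described above.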

Where the JIT shines: tight compute

This section covers loop kernels and recursive compute where the JIT's stencils cover the inner cycle end-to-end. These are the workloads the copy-and-patch substrate was designed for: no allocation per iteration, no transducer machinery, just fused tagged-int arithmetic and inline-cached call dispatch. Each figure is the median of three runs on Apple Silicon (arm64-darwin) against mino v0.323.0.

Workload                               JIT off    JIT on    Speedup
(dec-only 10M): counted-down loop      30.46 ms   15.20 ms  2.00x
(lt-only 10M): counted-up loop         30.84 ms   17.15 ms  1.80x
(sum-to 1M): counter + accumulator     19.41 ms   3.01 ms   6.46x
(fib 30): recursive compute            107.15 ms  53.34 ms  2.01x

The sum-to row is the strongest case in current shapes: the JIT covers both (< i n) and (+ acc i) inline (fused OP_LOOP_INT_LT_INC stencil), eliminating the tagged-int dispatch overhead on both the counter and the accumulator. The other rows only roughly halve: the JIT covers either the loop step or the recursion path, but the function-call layer still goes through the interpreter dispatcher for the recursive branch.

Where the JIT does not shine: alloc / GC pressure

Median of three runs per cell, captured on Apple Silicon (arm64-darwin) against mino v0.323.0. All numbers in ms/op except the sub-ms row in µs/op.

Row                                  JIT on    JIT off   Ratio (off/on)  Reading
build 5k int-map and sum             10.05 ms  10.34 ms  1.03x           within noise envelope
bump 5k int-map values               17.97 ms  16.94 ms  0.94x           within noise envelope
map/filter/map/reduce over 50k       757 µs    779 µs    1.03x           within noise envelope
nested vectors 500x100               18.03 ms  18.67 ms  1.04x           within noise envelope
realize 10k of lazy range            4.19 ms   4.48 ms   1.07x           within noise envelope
fibonacci(25)                        6.65 ms   9.21 ms   1.38x           meaningful JIT win

Five of six rows land within the +/- 7% noise envelope. Allocation- and GC-dominated workloads are not where the JIT lives; they are dominated by nursery sizing, write-barrier cost, and minor-cycle frequency. The JIT sits above the GC and cannot accelerate work the allocator and collector are already doing. The one row that moves meaningfully is fibonacci(25), pure compute that the JIT's recursive-call inline cache and fused tagged-int arithmetic cover end-to-end.
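The Reading column follows from simple arithmetic on the two timings. A minimal sketch, with hypothetical helper names and the 7% envelope taken from the paragraph above:

```c
/* Ratio as used in the table: JIT-off time divided by JIT-on time.
 * Both arguments must be in the same unit. */
double demo_ratio(double off, double on)
{
    return off / on;
}

/* Returns 1 if the ratio lies inside the +/- 7% noise envelope. */
int demo_within_noise(double ratio)
{
    double delta = ratio - 1.0;
    if (delta < 0.0)
        delta = -delta;
    return delta <= 0.07;
}
```

For the bump row, 16.94 / 17.97 is about 0.94, which is inside the envelope; for fibonacci(25), 9.21 / 6.65 is about 1.38, which is well outside it.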

Out of scope

How each cell is gated

All five targets land their byte tables through the same extractor (tools/stencil-extract), so a regression in a format parser breaks every host that shares that format. The synthetic-blob selftest (tools/stencil_extract --selftest) catches parser regressions before any compile runs; the per-host comparison of generated byte tables catches drift introduced after the parser passes.

Three workflows produce the green cells:

Next steps