Lean VM
1. System Objective
Lean Ethereum targets a transition from BLS to a post-quantum signature path using stateful hash-based signatures (XMSS), with aggregation handled by a hash-based SNARK. The dominant cost is proving many hash evaluations, and a minimal Cairo-inspired zkVM expresses the control logic and hashing.
Two operations are supported:
Aggregation of XMSS signatures
Merging of aggregate signatures via recursive SNARK verification
2. Protocol
2.1 Statement of Computation
All aggregation and merge behavior is unified into a single program, which is the only program whose execution must be proven.
2.1.1 Inputs
Public inputs: pub_keys, bitfield, msg
Private inputs: sub_bitfields, aggregate_proofs, signatures
2.1.2 Semantics
Bitfield consistency: the private sub-bitfields must combine, disjointly, into the public bitfield.
For every sub-bitfield except the last: verify a recursive aggregate proof with the corresponding inner public input.
For the last sub-bitfield: verify actual XMSS signatures for the indices it selects.
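The bitfield-consistency rule can be sketched as follows. This is a toy model under the assumption that the sub-bitfields must partition the public bitfield (the function name and bit-list representation are illustrative, not from the implementation):

```python
def check_bitfield_consistency(bitfield, sub_bitfields):
    """Toy check (assumed semantics): each signer index set in the public
    bitfield must be claimed by exactly one sub-bitfield, and no sub-bitfield
    may claim an index that is not set."""
    n = len(bitfield)
    if any(len(sb) != n for sb in sub_bitfields):
        return False
    # Summing (rather than OR-ing) rejects overlapping claims.
    return all(sum(sb[i] for sb in sub_bitfields) == bitfield[i]
               for i in range(n))
```

Summing instead of OR-ing is what makes the partition disjoint: two sub-bitfields claiming the same signer index would produce a column sum of 2 and fail the check.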
2.2 VM Execution Model
2.2.1 Field and Extension Field
Base field: the KoalaBear prime p = 2^31 − 2^24 + 1, chosen for fewer Poseidon rounds and an efficient Poseidon2 S-box.
Extension field dimension: 5 or 6 (a security/performance trade-off).
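These field properties can be sanity-checked directly; the constants below are the standard KoalaBear parameters:

```python
import math

# KoalaBear base field: p = 2^31 - 2^24 + 1.
P = 2**31 - 2**24 + 1

def is_prime(n):
    """Trial division; fine for a 31-bit modulus."""
    return n >= 2 and all(n % d for d in range(2, math.isqrt(n) + 1))

assert is_prime(P)
# p - 1 = 2^24 * 127: a large power-of-two subgroup, giving FFT-friendly domains.
assert P - 1 == 2**24 * 127
# gcd(3, p - 1) = 1, so x -> x^3 is a permutation of the field, which lets
# Poseidon2 use a degree-3 (cheap-to-prove) S-box.
assert math.gcd(3, P - 1) == 1
```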
2.2.2 Memory and Registers

Memory is read-only and word-aligned to base-field elements; the memory size is fixed at proof start.
An initial segment of memory cells contains the public input, written by the verifier.
Registers:
pc, fp; there is no ap register (no tracked allocation pointer in the VM state).
2.2.3 ISA and Precompiles
Core instructions:
add/mul
deref
conditional jump
Precompiles:
POSEIDON_16: Poseidon2 permutation over 16 field elements
POSEIDON_24: Poseidon2 permutation over 24 field elements
DOT_PRODUCT: computes a dot product between two slices of extension-field elements
MULTILINEAR_EVAL: interprets a memory chunk as a multilinear polynomial over the base field and evaluates it at an extension-field point; implemented via sparse equality constraints in WHIR, with no dedicated AIR table
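The MULTILINEAR_EVAL semantics can be sketched over plain integers (the real precompile works over the base field with an extension-field evaluation point; the function name is illustrative):

```python
def multilinear_eval(evals, point):
    """Evaluate the multilinear extension of `evals` at `point`.
    `evals` has length 2^n and is indexed by the boolean hypercube,
    first variable in the lowest bit of the index."""
    assert len(evals) == 1 << len(point)
    cur = list(evals)
    for x in point:
        # Fix the next variable to x: each pair (f(0,...), f(1,...))
        # folds to f(0,...) + x * (f(1,...) - f(0,...)).
        cur = [a + x * (b - a) for a, b in zip(cur[0::2], cur[1::2])]
    return cur[0]
```

On the hypercube itself this reproduces the table, e.g. `multilinear_eval([1, 2, 3, 4], [1, 0])` returns 2; off the hypercube it returns the unique multilinear interpolant.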
2.3 Reduced Commitments
2.3.1 Execution Table Commitments
Per cycle, the execution table commits five base-field elements.
2.3.2 Instruction Encoding and Bytecode Lookup
Each instruction is described by 15 field elements:
operands
flags
opcode flags: 8 one-bit selectors
auxiliary field
Instead of committing these 15 elements per cycle, the execution table commits only pc, and a LogUp*-based indexed lookup proves that the correct instruction row in the bytecode table is used.
2.4 Precompile Interaction Model: Two Buses
Precompiles live in dedicated tables and are connected to the execution table and memory via two logical “buses”.
2.4.1 Bus 1: Execution → Precompile Tables
For each precompile call, the tuple of call data recorded in the execution table must also appear in the corresponding precompile table.
A permutation/grand-product argument enforces this inclusion: the multiset of tuples requested in the execution table must match the multiset of tuples present in precompile tables.
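The grand-product argument can be sketched as follows; the tuple compression with challenge beta and the function names are illustrative, and the challenges are assumed to be sampled after the tuples are committed:

```python
import math

def multiset_equal(left, right, alpha, beta, p):
    """Toy grand-product multiset-equality check over F_p. Each tuple is
    compressed to one field element with powers of beta, then the two sides
    are compared as products of (alpha - element). Equal multisets always
    pass; unequal multisets pass only with negligible probability over the
    random challenges."""
    def compress(t):
        acc = 0
        for c in reversed(t):
            acc = (acc * beta + c) % p
        return acc
    def prod(side):
        return math.prod((alpha - compress(t)) % p for t in side) % p
    return prod(left) == prod(right)
```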
2.4.2 Bus 2: Precompile Tables → Memory
Each precompile row accesses memory and enforces correctness of the corresponding computation (for example, Poseidon over a block, or a dot product over an extension-field vector).
This is done by:
Using the addresses recorded in the precompile table
Performing memory lookups
Enforcing that the precompile outputs match the specified function of the fetched inputs
When memory accesses are aligned power-of-two blocks (e.g., Poseidon16 with alignment 8), lookups can be optimized by folding memory regions, reducing lookup cost.
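The folding optimization can be illustrated over plain integers: folding the low address variables of the memory table at the challenge point turns 2^k per-cell lookups for one aligned block into a single lookup into a smaller, folded table (names and integer arithmetic are illustrative):

```python
def mle(evals, point):
    """Multilinear extension of `evals` (length 2^n) evaluated at `point`."""
    cur = list(evals)
    for x in point:
        cur = [a + x * (b - a) for a, b in zip(cur[0::2], cur[1::2])]
    return cur[0]

def fold_memory(mem, r):
    """Fold the low len(r) address bits of `mem` at challenge point r:
    one entry per aligned block of 2^len(r) cells."""
    blk = 1 << len(r)
    assert len(mem) % blk == 0
    return [mle(mem[i:i + blk], r) for i in range(0, len(mem), blk)]

# A block of 8 contiguous cells now costs one lookup into the folded table:
mem = list(range(16))
folded = fold_memory(mem, [2, 3, 5])
assert folded == [28, 36]
assert folded[1] == mle(mem[8:16], [2, 3, 5])
```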
2.5 Commitment and Opening Schedule (Two-Phase)
The commitment scheme operates in two phases:
Phase 1: Commit to:
Memory
Execution table
Precompile tables for poseidon16, poseidon24, and dot_product
Phase 2: After sampling random challenges (Fiat–Shamir), commit to pushforwards for the two Logup* lookups (memory and bytecode).
Finally, all required values are opened at sampled points across these commitments, and all AIR, lookup, and bus constraints are verified at those points.
2.6 Polynomial Packing
The system uses multilinear polynomials (primarily AIR columns and table columns). Instead of committing each polynomial separately, multiple polynomials are packed into a single polynomial, reducing both:
The number of commitments
The number of openings
Evaluation claims on each constituent polynomial at some point are reduced to evaluation claims on the packed polynomial via:
Chunking each polynomial into a small number of power-of-two blocks
Treating its evaluation as a combination of “inner evaluations” within these blocks
Encoding these inner evaluations as sparse equality constraints on the packed polynomial
This avoids running a separate sumcheck for each polynomial and amortizes costs.
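A minimal sketch of the packing reduction, over plain integers with equal-size power-of-two chunks: an evaluation claim on polynomial j at point z becomes a claim on the packed polynomial at z extended by the selector bits of j (the variable ordering is an assumption of this sketch):

```python
def mle(evals, point):
    """Multilinear extension of `evals` (length 2^n) evaluated at `point`."""
    cur = list(evals)
    for x in point:
        cur = [a + x * (b - a) for a, b in zip(cur[0::2], cur[1::2])]
    return cur[0]

# Two 2-variable tables packed into one 3-variable table by concatenation.
f0, f1 = [1, 2, 3, 4], [5, 6, 7, 8]
packed = f0 + f1
z = [2, 3]
# Claim on f_j at z  ==  claim on `packed` at (z, selector bit j):
assert mle(packed, z + [0]) == mle(f0, z)
assert mle(packed, z + [1]) == mle(f1, z)
```

One commitment and one opening point on `packed` thus serve evaluation claims on both constituent tables.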
3. Implementation of Prover
See prove_execution in leanMultisig
3.1 Execution witness generation
The system begins by executing a program written in the lean-lang ISA, capturing the entire execution trace of the virtual machine: the state of the key registers (pc, fp) at each step, memory read operations, and invocations of precompile operations. Memory is modeled as immutable input and logged during execution, forming the primary witness.
3.2 AIR Table Construction
From the execution witness, the system constructs:
Execution Trace Table: Tracks VM state evolution (registers, flags) across cycles.
Memory Table: Records accessed memory cells and their contents.
Precompile Tables: for Poseidon16 and Poseidon24, these record the full round-by-round trace of each permutation; a further table records dot-product operations.
Each table is represented as a multilinear polynomial over a hypercube domain.
3.3 Phase 1: WHIR Commitments
Using the WHIR commitment scheme, the prover commits to five polynomial tables:
Execution trace
Memory table
Poseidon16 table
Poseidon24 table
Dot-product table
These commitments are absorbed into a Fiat–Shamir transcript and fix the witness data.
3.4 Phase 2: Logup* Lookup Arguments
The system performs two indexed lookups:
Memory Lookup:
Merges all (address, value) pairs from CPU and precompile access logs.
Verifies consistency against the committed memory table.
Uses random challenges to compress each (address, value) pair into a single field element.
Constructs a pushforward polynomial that interpolates these compressed values from the access logs.
Bytecode Lookup:
Captures all (pc, opcode) pairs from the execution trace.
Validates them against the committed bytecode ROM.
Applies similar Logup* compression and commitment as with memory.
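The consistency check underlying both lookups can be sketched with the basic LogUp identity over F_p; the pair compression with challenge beta and the function names are illustrative:

```python
P = 2**31 - 2**24 + 1  # KoalaBear prime

def inv(x, p=P):
    """Fermat modular inverse; x must be nonzero mod p."""
    return pow(x, p - 2, p)

def compress(pair, beta, p=P):
    """Compress an (address, value) lookup pair into one field element."""
    a, v = pair
    return (a + beta * v) % p

def logup_check(table, multiplicities, lookups, alpha, beta, p=P):
    """LogUp identity: sum_j 1/(alpha - lookup_j) must equal
    sum_i m_i/(alpha - table_i), where m_i is the claimed multiplicity
    of table row i among the lookups."""
    lhs = sum(inv((alpha - compress(l, beta)) % p) for l in lookups) % p
    rhs = sum(m * inv((alpha - compress(t, beta)) % p)
              for t, m in zip(table, multiplicities)) % p
    return lhs == rhs

mem_table = [(0, 10), (1, 20), (2, 30)]
accesses = [(0, 10), (1, 20), (0, 10)]  # cell 0 read twice
assert logup_check(mem_table, [2, 1, 0], accesses, alpha=12345, beta=999)
assert not logup_check(mem_table, [1, 1, 1], accesses, alpha=12345, beta=999)
```

The pushforward polynomial committed in Phase 2 plays the role of the multiplicity/accumulator data here, and the same shape applies to the bytecode lookup with (pc, instruction) in place of (address, value).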
3.5 WHIR Polynomial IOP
With all commitments in place, the system generates the final proof by invoking the WHIR interactive proof protocol (a multilinear polynomial IOP). Two batched openings are performed, one for memory and one for bytecode. At this point, the lookup arguments are represented as evaluation claims produced by the inner GKR of Logup*.
4. Reference
Written by Soowon Jeong of Fractalyze
Last updated