Designing State in Aleo. Merkle-Tree Model for Records and Transaction Storage
Aleo adopts a record-based state model (akin to UTXOs) rather than a plain account map. Each record is an encrypted data object that represents application-specific state. When a transaction executes, records are consumed (spent) or created, updating the global state. To maintain integrity and privacy, Aleo tracks all record commitments in a global Merkle-tree. Roughly speaking, each record r = (v, pid, apk, d, ρ, r)
(with visibility v
, program ID pid
, owner key apk
, data payload d
, random nonce ρ
, and randomness r
) is turned into a commitment
which hides its contents. When the record is created in a transaction, its commitment cm
is appended to the ledger; when the record is spent, its unique serial number
is revealed. The ledger (an append-only chain of blocks) maintains a global Merkle tree over all these commitments. The Merkle root of this tree (the “state root”) is included in each block header. Thus, the block header succinctly summarizes the entire record state. For example, Aleo’s block header fields include previous_state_root
(the Merkle root of all prior records) and transactions_root
(the Merkle root of that block’s transactions) (see Figure below).
Each Aleo transaction carries its own proof of correct execution. In particular, an execution contains a local state root ℓ_state that commits to all input/output records of that transaction. When the transaction is validated, the global state root G_state (from the previous block) is updated by including the new records and removing spent ones, yielding the new state root published in the next block. Crucially, the transaction’s zero-knowledge proof must include Merkle membership proofs: (1) that each input record’s commitment is indeed in the prior Merkle tree, and (2) (conceptually) that each new output record commitment will be inserted correctly. In practice, outputs need not prove non-membership (they are new by construction), but inputs must prove membership in the existing state tree. These proofs ensure integrity without revealing record contents.
Aleo’s transaction verification involves two inclusion checks (plus serial-number checks):
- Local inclusion (ℓ_state): Each transaction groups one or more state transitions (function calls) and computes a local state root ℓ_state by hashing all its input and output commitments. Internally, verifying an execution requires checking that every input record’s commitment
cmi
matches its authentic value and thatcmi
is on the Merkle tree committed by ℓ_state. In code this is like:
// For each input record ri with nonce ρi: sn_i = PRFSN(skPRF, ρi); // compute serial cm_i = CM.Commit(v || apk || d || ρi; r_i); assert(T.Verify(ℓ_state, cm_i, path_i) == 1); // Merkle proof w.r.t ℓ_state
This ensures the transaction internally “knows” its input records and that they hash up to ℓ_state.
- Global inclusion (G_state): Each input commitment must also be present in the global ledger tree (the state up to the previous block). The transaction must provide a ledger membership witness (a Merkle path) for every consumed
cmi
. Formally, the prover shows for each input:
i.e. cmi
is a leaf in the global Merkle tree whose root is G_state. Meanwhile, revealing unique serial numbers sn_i
ensures no double-spend (the same record is not consumed twice). (Outputs, when published on-chain, automatically extend the state tree to the new root, so their “inclusion” is just by appending.)
In summary, Merkle proofs in Aleo work much like in other blockchains: a record commitment is a leaf in a hash tree, and the path of hashes up to the root serves as a membership proof. If a protocol needed to prove non-membership (that a record does not exist), one could use a sparse-Merkle scheme: showing that at a given index the tree has only a default value. Aleo’s core model, however, mainly requires positive membership proofs for spent records. The effect is that miners (validators) need only verify O(log N) hashes per record, plus the SNARK, instead of scanning the whole state.
Merkle Trees and Record Commitments
Merkle trees are binary hash trees where each non-leaf node is H(left ∥ right)
and leaves are hashes of data (or commitments). In Aleo, when records are added to the ledger, their commitments cm
become new leaves. Figure [59] below illustrates a generic Merkle tree: transactions (or record commitments) at the bottom level are hashed pairwise, then those hashes are hashed again, and so on up to the Merkle root. The Aleo block header stores exactly this root.
T
is hashed upward: pairwise hashing produces parent nodes, continuing until the single Merkle root. The root is stored in the block header.Mathematically, if the leaf hashes are $h_0,h_1,\dots,h_{k-1}$, the tree computes pairs
and so on; if the leaf count is odd, the last hash may be duplicated (as in Bitcoin) or treated specially. In Aleo, a Poseidon hash (Poseidon2::hash_to_field
) is typically used under the hood for ZK SNARK friendliness. Concretely, the Merkle root for 8 leaves is built by hashing pairs, then those results, etc. The Merkle root provides a single 𝔽-field value that cryptographically binds all leaves. In code one might write (in Leo) a function to verify a Merkle proof:
function verify_merkle(root: field, leaf: field, path: [field; d], index: u8) -> bool { // Recompute the root from the given leaf and sibling hashes. let mut hash = leaf; for i in 0..d { let sibling = path[i]; // bit i of index tells whether hash is left or right child if ((index >> i) & 1u8) == 0u8 { hash = Poseidon2::hash_to_field(hash + sibling); } else { hash = Poseidon2::hash_to_field(sibling + hash); } } return hash == root; }
This circuit checks that starting from leaf
and hashing with the provided path
, we recover the known root
. In Aleo, this kind of loop would be part of the SNARK proof logic. (In practice, Aleo’s proof equations like Rloc and Rglb formalize these checks.) Intuitively, proving membership is like proving you hold a valid VIP ticket in the tree of all tickets, without showing the ticket itself.
Because the tree depth is $\approx\log_2(N)$, membership proofs cost $O(\log N)$ hashing time. This scales well even for millions of records. Updating the state (appending or removing leaves) can also be done in $O(\log N)$ time per change, e.g. by re-computing hashes on the path from a leaf to the root. Aleo’s validators batch many record updates per block: they “take all these record updates” and apply them to the Merkle tree in one go. This bulk-update yields a new Merkle group or snapshot, whose root goes into the block. The FAQ explains this as allowing concurrent updates: the network can combine all parties’ diffs and then “patch up to a Merkle group” for the block header. In effect, Aleo regains some account-model concurrency (parallel state changes) while still using a UTXO-like record model.
Proving Membership in Transaction Verification
In Aleo’s execution model, each transaction supplies all data needed to verify it, including two sets of Merkle proofs:
- Local State Proofs: A transaction may consist of multiple function calls. The prover first hashes all inputs/outputs of those calls into a local state commitment ℓ_state. The SNARK then enforces that each input record satisfies
where cic_i is the record commitment recomputed from the prover’s inputs and wi(ℓ)w^{(ℓ)}_{i} is the Merkle path from cic_i to ℓ_state. Similarly, each output commitment (representing a newly created record) must hash into the same local root ℓ_state. This binds the transaction internals into a single digest.
- Global State Proofs: To finalize the transaction, each input commitment must also appear in the global ledger tree. The prover provides a ledger membership witness (a path in the global Merkle tree) for every
c_i
. The validator checks
where GstateG_{\text{state}} is the prior global root. In other words, the transaction proves it is “consuming” existing records. In addition, each input’s serial number sn_i = PRF(sk_PRF, ρ_i)
is verified unique and consistent. (This prevents double-spend: once a record’s sn
appears on-chain, that record cannot be used again.)
Concretely, a spend proof in the SNARK includes constraints like: parse each input record ri=(vi,apki,di,ρi,ri)r_i=(v_i,apk_i,d_i,ρ_i,r_i), compute
then require T.Verify(ℓ_state, c_i, w_i^(ℓ)) = 1
and L.Verify(G_state, c_i, w_i^(G)) = 1
. Output records have similar commitments but need only prove consistency (they automatically extend the tree). The result is that a block’s transactions can be checked purely via public hashes: validators recompute the record commitments and verify the Merkle proofs against the known state root.
Finally, once a transaction is accepted, validators append the new output commitments to the state. Each new block thus produces an updated global root. Any user can later obtain a Merkle proof for any recorded commitment using APIs (e.g. getMerkleProof
). This allows light clients to verify inclusion of a record in the public state without downloading everything.
Transaction History and Efficient State Access
Aleo blocks record not only state roots but also the transaction list. Internally, the block body contains an array of transactions; the header stores the Merkle root of this array (the “transactions_root”). This means each block is itself a small Merkle tree: one can prove a transaction is in a block by its Merkle path. In practice, this lets explorers or relayers show inclusion or build SPV proofs for transactions.
Beyond Merkle trees, Aleo also uses mappings (account-like tables) for certain applications. For example, if a program uses a global mapping (address → value), the ledger maintains a Merkle accumulator of that map’s entries. An update to a mapping entry (like an account transfer) likewise provides a membership proof against the old root and yields a new root via an UpdateMappings
operation. The state thus can mix UTXO-like records with key-value stores, each versioned by its root hash in the chain.
Because Aleo is a ZK-VM, all state is essentially off-chain data with on-chain commitments. Validators do not store plaintext; they only need to maintain the Merkle roots. To access data (e.g. read a record’s contents), a user typically queries full nodes. However, provers only need random access (through the Merkle paths they include). The static Merkle structure allows fast lookup by index (if records have deterministic indices) or by searching the tree (in a sparse tree, index can be the hash of the commitment itself). In any case, membership proofs are $O(\log N)$ in size. For large-scale use, Aleo could employ techniques like partitioning the state (each program has its own tree) or storing Merkle forests for scaling. The existing specs suggest batching updates “in one go” to avoid constant tree rebalancing, which hints at a design like a state Merkle forest where each program’s records form a subtree, then combined.
In terms of history, Aleo also keeps an append-only ledger of all transactions and state roots. Because each transaction’s execution root ℓ_state is included in that transaction, one can reconstruct the state evolution by iterating the blocks. (Figure [62] in the spec shows a diagram of how transitions in transactions are “Merkilized” to produce ℓ_state, and then how ℓ_state’s are similarly combined per block.) For long-term pruning or light-clients, one could checkpoint roots. But fundamentally, Aleo archives everything in block history (block headers + Merkle roots + SNARK proofs).
Performance and Scalability Analysis
Operation cost: All Merkle operations in Aleo cost $O(\log N)$ hashes per record, where $N$ is the number of records in the state. Computing or verifying a Merkle root requires hashing along a tree branch of length $\approx\log_2(N)$. Poseidon hash (used by default) is efficient in SNARK circuits, so membership proofs are succinct (small, constant-size per proof) and fast to verify. Record insertion is similar: one must hash the new leaf with its sibling, and propagate up to the root, again $O(\log N)$. In practice, a block may contain many transactions each with multiple inputs, so validators may update thousands of leaves and recompute many paths each block. This is mitigated by batching: since all updates in a block produce one new root, miners can aggregate updates and re-compute the tree once (e.g. using incremental Merkle-tree techniques). The FAQ implies exactly this: “the network can effectively take all these record updates, and then in one go, update a Merkle tree with everybody’s diffs... and patch up to a Merkle group”.
Latency: The biggest bottleneck in Aleo is actually generating the zero-knowledge proof, which happens off-chain by either the user or a prover. The Merkle-check parts are light compared to the arithmetic-circuit part of the proof. Verification on-chain (by validators) is quite fast: besides a SNARK verify, only a few hash computations are needed. Aleo’s consensus (PoSW/AleoBFT) targets on-chain throughput of a few hundred TPS initially, but theoretical limits are much higher with optimized hardware. By comparison, Ethereum’s world state uses a Merkle–Patricia Trie. That structure also offers $O(\log N)$ operations, but in practice it can be slower due to RLP-encoding overhead and larger node fanout. Aleo’s flat Merkle tree is simpler (binary, fixed structure) and easier to parallelize when batching.
- Ethereum (Account State): Ethereum stores accounts in a global trie. Lookups/inserts are $O(\log N)$, but nodes have up to 16 branches (hexary trie), and modification requires updating all ancestor nodes. Aleo’s Merkle tree is binary (branch factor 2) and updates happen only on paths touched. Also, Ethereum’s state is plaintext on-chain (no ZK privacy), while Aleo’s commitments hide values. Ethereum’s future upgrade to Verkle trees promises faster updates (even batched), which is conceptually similar to Aleo’s approach.
- Zcash (Note Commitment Tree): Zcash’s shielded pool (Sprout/Sapling) also uses a Merkle tree of note commitments. Spending a note requires a Merkle proof that the note’s commitment is in the tree (and revealing a nullifier). This is very close to Aleo’s record model, except Zcash’s tree only grows (notes are never “removed”, they just become spent via nullifiers) and the proof system is specialized for transfers. Aleo generalizes this to arbitrary programs: any record, not just a currency note, sits in the tree. Both systems incur $O(\log N)$ costs for proofs. Zcash’s major overhead is in the SNARK proving time (spent-and-output spends), whereas Aleo also has SNARK cost but can batch multiple state transitions in one proof. (Note: Aleo uses Fiat-Shamir NIZKs, not just Groth16 zk-SNARKs.)
- Solana (Account Model, No Tree): Solana famously avoids any global Merkle structure for state. Each account’s state is stored as a flat key-value entry in a big memory-mapped database (Cloudbreak). Looking up an account is $O(1)$ (hashtable-like), and validators don’t compute state roots at runtime. This yields very high TPS (tens of thousands) but sacrifices on-chain verifiability of history. There is no on-chain Merkle root to audit (finality relies on checkpoints and signature voting instead). By contrast, Aleo trades some speed for verifiable privacy: every change is double-checked by hash commitments and ZK proofs. In Solana, a transaction’s success means the runtime wrote to accounts; in Aleo, it also means “I proved I was allowed and correct.” Naturally, state size in Solana can grow huge and requires pruning, whereas in Aleo the global state is cryptographically compressed into roots (though full nodes still store the data).
Scalability: Aleo’s Merkle-tree model scales logarithmically. As $N$ grows, proofs lengthen only by a few bits. Even if the network had millions of records, a membership proof is just ~20 hashes (if $N\sim10^6$). The heavy lifting remains the SNARK proof, which can be parallelized off-chain. On-chain storage is small: each block header stores only roots. By contrast, naive UTXO models require storing all UTXOs on-chain, and pure account models store the entire map state. Aleo’s hybrid (Merkle commitments + ZK proofs) is designed so that the on-chain footprint is minimal (just hashes and proofs), and bulk data (encrypted records) lives off-chain or only in node databases.
Finally, end-to-end latency and throughput depend also on consensus. Aleo’s novel PoSW/AleoBFT mixes proof-generation effort with BFT. In principle, many transactions (each with their own multi-transition proofs) can be batched into one block, and parallelized prover nodes can run SNARKs. The concurrency trick of merging Merkle-deltas means Aleo can handle high contention: many users updating disjoint records cause little conflict. This is an advantage over traditional account-based systems, where global locks (on the state tree) can serialize updates.
Example: Leo Code Snippets
Below is a simplified Leo snippet showing how one might check Merkle membership within a circuit (pseudocode):
program verify_record.aleo { // Verifies a record commitment is in the given Merkle root. function check_record(root: field, leaf_commitment: field, merkle_path: [field; 32], index: u8) -> bool { let computed = leaf_commitment; for i in 0..32u8 { let sib = merkle_path[i]; if ((index >> i) & 1u8) == 0u8 { computed = Poseidon2::hash_to_field(computed + sib); } else { computed = Poseidon2::hash_to_field(sib + computed); } } // Ensure the recomputed root matches. assert(computed == root); true } }
This function iteratively rebuilds the Merkle root from a leaf and its siblings, bit by bit. In an actual Aleo circuit, root
would be either ℓ_state or G_state, and the assert would be part of the SNARK constraints. All arithmetic is in the finite field (Poseidon hash is circuit-friendly).
Another snippet: computing a record serial number (nullifier) in Leo:
program serial.aleo { function compute_serial(sk_prf: field, rho: field) -> field { // PRF function for serial numbers (here using Poseidon as a stand-in). return Poseidon2::hash_to_field(sk_prf + rho); } }
In reality Aleo uses a specific PRF circuit PRF_{SN}
for serials, but the idea is the same: the serial sn
is a deterministic hash of the secret key and nonce. The SNARK includes the constraint sn == PRF_SN(sk_prf, ρ)
to bind the prover’s knowledge of the spending key.
Conclusion
Aleo’s state design blends the UTXO-style record model with Merkle-accumulator proofs to achieve verifiable, private state transitions. Every record’s commitment lives in a giant Merkle tree whose root is on-chain. Transaction validity is anchored by SNARKs and Merkle proofs, ensuring that anything needed (membership of inputs, uniqueness of serials, correct commit of outputs) is checked without leaking state. This model scales logarithmically, can exploit parallel updates, and allows full auditability of historical state. In practice, it means Aleo validators and clients rely on very efficient hash computations and proofs, while the user’s view remains succinct (only roots and proofs). Compared to traditional blockchains, Aleo’s approach is like Bitcoin’s UTXO model on steroids: it generalizes UTXOs to arbitrary programs and encrypts them by default, but still enjoys the efficient lookup and proof-of-inclusion properties of a Merkle tree. The result is a state architecture that is both rigorous (provably consistent) and scalable, suited for private ZK applications without sacrificing performance.