P1 - The architecture of SP1 [FULL OVERVIEW]
Welcome to the wacky world of Succinct Processor 1 (SP1), our zero-knowledge proving prodigy! In this series, we’ll tour the ins and outs of SP1’s architecture – from how it runs your Rust code to how it conjures cryptographic proofs. Buckle up for a slightly chaotic, humorous, yet informative ride through zkVM-land. By the end of Part 1, you’ll have a high-level grasp of what makes SP1 tick (and why it’s so darn cool). And yes, there’s a game demo invite waiting for you at the end. Let’s dive in! 🎉
Meet SP1: A ZK Virtual Machine with Personality
SP1 in a nutshell: It’s a zero-knowledge Virtual Machine (zkVM) that proves a computer program ran correctly – without needing everyone to rerun that program step-by-step. Think of it as a truth machine for code. You give SP1 a program (say, written in Rust), and it gives back a cryptographic proof that “I swear on math, this program executed correctly!” 🔒✨.
Why is this awesome? Traditionally, using zero-knowledge proofs (ZKPs) was like rocket science – you’d have to hand-craft circuits or learn obscure frameworks. SP1 changes that. It lets developers write normal code (Rust, or any language that can compile to RISC-V) and then automatically proves that code’s execution is valid. No need to trust a specialized circuit or a PhD in cryptography – SP1 makes truth programmable in a familiar language. In short, SP1 is here to democratize “moon math” for everyday developers, bringing ZK proofs to rollups, bridges, dApps, or whatever wild idea you have, with performance that’s actually practical.
Oh, and did we mention performance? SP1 isn’t just any zkVM – it’s built to be blazing fast. We’re talking up to 10× cheaper proof generation and ~28× faster execution than some alternatives. How? Spoiler alert: a lot of clever engineering (and a dash of black magic) went into SP1’s design – from a novel “precompile-centric” architecture to GPU-powered proving. Don’t worry, we’ll unpack all those buzzwords in a moment. Just know that SP1 has a need for speed, and it shows.
Before we get carried away (too late?), let’s start at the beginning of a proof’s journey: how your Rust code becomes something SP1 can run and prove.
From Rust to RISC-V: Code, Meet zkVM
SP1’s motto could be “Write Rust, get proofs.” But how does Rust actually run on this VM? The secret sauce is RISC-V, a simple and open Instruction Set Architecture (ISA). SP1 uses RISC-V as its language for low-level instructions. This means any program in Rust (or C/C++/Swift/etc.) can be compiled down to RISC-V machine code, which SP1 understands.
- Write code in Rust. Use your usual tools and libraries – SP1 even supports the Rust standard library, so you don’t have to limit yourself to no-std kung-fu.
- Compile to a RISC-V ELF binary. SP1 provides a toolchain (via sp1-sdk) that takes your Rust code and produces a RISC-V executable (an ELF file). Essentially, your high-level code is translated into RISC-V instructions (think of them as assembly language ops).
- Feed the RISC-V code to SP1’s zkVM. The SP1 virtual machine will emulate these RISC-V instructions in a special way: not only does it execute them, it records every step it takes (more on this in a sec), preparing the ground to prove those steps were all legit.
Why RISC-V? Because it’s a general-purpose ISA that’s well-supported (lots of compilers target it) and relatively simple. Simplicity is key here – every RISC-V instruction that SP1 supports needs to be turned into a bunch of mathematical constraints (yikes 🙃), so the fewer and simpler the instructions, the easier that job is. Using RISC-V gives SP1 flexibility (we can run any code compiled for RISC-V) while staying lean enough for proving. It’s like choosing a common language everyone understands, then teaching a robot to prove statements in that language.
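To make the "simple ISA" point concrete, here’s a toy Rust sketch of decoding and executing one RISC-V instruction (ADDI). Every instruction is a fixed 32-bit word with regular bit fields, which is exactly the kind of structure that’s pleasant to turn into constraints. This is purely illustrative – it is not SP1’s actual decoder, just the flavor of the job.

```rust
/// Decoded fields of a RISC-V I-type instruction (illustrative sketch).
pub struct IType {
    pub rd: u8,   // destination register
    pub rs1: u8,  // source register
    pub imm: i32, // sign-extended 12-bit immediate
}

/// Decode an ADDI instruction (opcode 0x13, funct3 0), per the RISC-V spec's
/// I-type layout: imm[31:20] | rs1[19:15] | funct3[14:12] | rd[11:7] | opcode[6:0].
pub fn decode_addi(word: u32) -> Option<IType> {
    let opcode = word & 0x7f;
    let funct3 = (word >> 12) & 0x7;
    if opcode != 0x13 || funct3 != 0 {
        return None; // not an ADDI
    }
    Some(IType {
        rd: ((word >> 7) & 0x1f) as u8,
        rs1: ((word >> 15) & 0x1f) as u8,
        // Arithmetic right shift on i32 gives us sign extension for free.
        imm: (word as i32) >> 20,
    })
}

/// Execute ADDI against a toy register file: regs[rd] = regs[rs1] + imm.
pub fn exec_addi(regs: &mut [u32; 32], i: &IType) {
    regs[i.rd as usize] = regs[i.rs1 as usize].wrapping_add(i.imm as u32);
}
```

In a zkVM, each of those little steps (mask out the opcode, pick the registers, do the add) becomes an algebraic constraint – so an ISA with few, regular instructions keeps the constraint system small.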
So at this stage, we have a RISC-V binary ready to roll. Time to run it, right? Here’s where things get interesting: SP1 doesn’t run the program on a normal CPU; it runs it on a virtual circuit and produces an execution trace – a fancy log of everything that happened. Let’s explore that next.
Execution Trace: Every Step Leaves a Trail
When SP1 executes your program, it’s not just computing outputs – it’s also keeping a detailed diary of the execution. This diary is what we call the execution trace. In a normal computer, you don’t log every CPU cycle and memory access (that would be overkill), but in a zkVM, this trace is golden: it’s the data we’ll feed into our proof system to convince the world your program did what it should.
Think of the trace as a giant table of rows and columns, where each row is like a snapshot of the system at a given step. Each step might include things like the current instruction, register values, memory read/writes, etc. In SP1, the trace is actually composed of multiple tables (one for the CPU, one for memory, etc.), all linked together in harmony. This multi-table design is called a cross-table lookup architecture – a cool trick to keep different parts of the VM in sync. For example, if the CPU part of the trace says “I loaded data from memory address X,” the memory table better have a matching entry for that access. SP1 uses permutation checks under the hood to enforce this consistency. It’s like reconciling two copies of a story to make sure they match – any inconsistency, and the proof will fail. (No sneaky business allowed! 😉)
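Here’s the cross-table idea in miniature, as a hedged Rust sketch: the CPU side of the trace records its memory accesses, the memory table records the same accesses (possibly in a different order), and consistency means the two multisets match. The struct fields and the sort-and-compare check are illustrative – the real proof system enforces this with a randomized permutation argument, not sorting.

```rust
/// One logged memory access (toy model; not SP1's real column layout).
#[derive(Clone, PartialEq, Eq, PartialOrd, Ord, Debug)]
pub struct MemAccess {
    pub clk: u64,   // cycle at which the access happened
    pub addr: u32,  // memory address touched
    pub value: u32, // value read or written
}

/// Cross-table consistency, toy version: the CPU table's accesses and the
/// memory table's accesses must form the same multiset. We sort and compare;
/// a STARK does this with a permutation (grand-product) check instead.
pub fn tables_consistent(cpu_side: &[MemAccess], mem_side: &[MemAccess]) -> bool {
    let mut a = cpu_side.to_vec();
    let mut b = mem_side.to_vec();
    a.sort();
    b.sort();
    a == b
}
```

If either side fudges a single entry – say the memory table claims a different value was stored – the multisets diverge and the check fails, which is exactly the "no sneaky business" property the permutation argument gives the proof.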
So, as your RISC-V code runs step by step, SP1 populates these trace tables with all the nitty-gritty details. By the end of execution, we have a complete log of what happened. If someone were crazy enough to replay the computation from this trace, they could – it has all the info needed. But instead of replaying it, we’re going to prove it was correct using some cryptographic mojo. Enter the world of STARKs!
STARK Proofs: Math Magic to Verify Execution
Now for the fun part: turning that execution trace into a proof that anyone can verify without re-executing the program. SP1 uses a proving system in the STARK family, which stands for Scalable Transparent ARgument of Knowledge. Despite the intimidating name, the idea is straightforward (okay, relatively straightforward 😅):
- Scalable: can handle large computations (our trace can be huge) efficiently.
- Transparent: no trusted setup ceremony needed (no secret keys or powers of tau ceremonies – just pure math and randomness).
- Argument of Knowledge: it’s a proof that we “know” a valid execution trace without revealing it fully.
In practice, what SP1 does is convert the trace into a bunch of polynomials and then uses a protocol to prove that those polynomials satisfy certain conditions (the conditions encoding “every step followed the rules of RISC-V and the program logic”). If you’ve never heard of polynomials being used this way, it’s wild – essentially each column of the trace table becomes a polynomial, and relations between steps become equations those polynomials must satisfy. Proving the program ran correctly boils down to proving these polynomial equations hold true for the trace we have.
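To see what "relations between steps become equations" means, here’s a tiny sketch of an AIR-style check over one trace column. In a real STARK the column is interpolated into a polynomial and the rules are checked algebraically; here we just apply them row by row. The specific rules (execution starts at the entry point, and the program counter advances by 4 bytes each step) are a simplified illustration – real traces also handle jumps and branches.

```rust
/// Toy AIR-style validity check for a program-counter column.
/// Boundary constraint: the first row must be the program entry point.
/// Transition constraint: each row's pc is the previous pc plus 4
/// (one 32-bit instruction). Illustrative only; real constraints also
/// cover jumps, branches, and every other column of the trace.
pub fn trace_valid(pc_column: &[u64], entry_point: u64) -> bool {
    let boundary_ok = pc_column.first() == Some(&entry_point);
    let transition_ok = pc_column.windows(2).all(|w| w[1] == w[0] + 4);
    boundary_ok && transition_ok
}
```

The proof system’s job is to certify that checks like this hold on *every* row of a possibly enormous table – without the verifier reading the table.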
This is where a component called AIR (Algebraic Intermediate Representation) comes in. AIR is like the high-level recipe for those polynomial relations. SP1’s designers (and the underlying Plonky3 library) defined an AIR that captures the RISC-V logic: increment the program counter, update registers, enforce memory consistency, etc. It’s all encoded algebraically. Sounds complex? It is – but the good news is, as a user of SP1, you don’t have to write those equations; SP1’s circuits do it for you. 🎉
Once the AIR constraints are set up, the actual proof is generated using a STARK protocol. Here’s a (very) high-level sketch of what happens next:
- SP1 takes all those trace polynomials and commits to them (sort of like generating cryptographic fingerprints of the data).
- It uses a fancy check called FRI (Fast Reed-Solomon Interactive Oracle Proof of Proximity) to prove these polynomials have the right form and satisfy our constraints, without revealing the whole polynomials. FRI is like checking a huge math problem by sampling a few points – a bit of probabilistic wizardry that keeps the proof small and verification quick.
- After a few rounds of this algebraic ping-pong, SP1 produces a proof (a bundle of cryptographic data) that can convince anyone who checks it that “yep, those trace polynomials exist and are consistent with a correct execution.”
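The "sample a few points" intuition can be shown in miniature. Two *distinct* low-degree polynomials over a prime field can agree on only a handful of points, so a few random spot-checks catch a mismatch with overwhelming probability. This is the Schwartz-Zippel idea that underpins the sampling step – it is not FRI itself, which adds a lot more machinery (commitments, folding rounds) on top. The modulus and sample points below are arbitrary illustrative choices.

```rust
/// A toy prime field modulus (2^31 - 1, a Mersenne prime). Illustrative only.
const P: u64 = 2_147_483_647;

/// Evaluate a polynomial (coefficients in ascending degree) at x, mod P,
/// using Horner's rule. Coefficients are assumed already reduced mod P.
pub fn eval(coeffs: &[u64], x: u64) -> u64 {
    let x = x % P;
    coeffs.iter().rev().fold(0u64, |acc, &c| (acc * x + c) % P)
}

/// Spot-check that two claimed-equal polynomials agree at the sample points.
/// If they're actually different low-degree polynomials, a random sample
/// exposes the lie with high probability (Schwartz-Zippel).
pub fn spot_check(a: &[u64], b: &[u64], samples: &[u64]) -> bool {
    samples.iter().all(|&x| eval(a, x) == eval(b, x))
}
```

A real verifier derives its sample points from a cryptographic hash of the prover’s commitments (the Fiat-Shamir trick), so the prover can’t predict where it will be checked.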
The result? A STARK proof attesting to the validity of the program’s execution. No one needs to crunch the original program again – they can just check this proof (which is much faster to verify) and be sure the output is trustably correct. It’s as if SP1 compressed the entire computation and its correctness into a little mathematical certificate. Pretty neat, right?
But we’re not done. Raw STARK proofs can be large (think megabytes). And while they’re fast to verify relative to re-running a program, in blockchain settings we often want tiny proofs for on-chain verification. This is where SP1 gets even craftier: it uses recursion and SNARK wrapping to shrink and stack proofs in clever ways.
Recursion: Proofs of Proofs (ZK-ception!)
Recall that verifying a STARK proof is faster than running the whole program – but it’s still some work. What if that verification step is too heavy for the place we want to use it (like a smart contract)? Enter recursive proofs. Recursion in zkVMs means generating a proof about the verification of another proof. Mind blown yet? 🤯
Imagine we have a proof P1 for our program. We can design a second program whose job is to verify P1 and then prove that was done correctly. The outcome is a new proof P2 which essentially says “Hey, P1 is valid.” Why is this useful? Because P2 can be much smaller or in a different format that’s cheaper to verify somewhere else. It’s like compressing a big video into a thumbnail – you trade off some computation (to compress/prove it) in order to make sharing/checking it easier.
SP1 supports recursion natively. In fact, the team built a mini zkVM specifically for verifying SP1 proofs inside SP1! If that sounds meta, it is – SP1 has a special recursion ISA and DSL (domain-specific language) tailored for succinctly verifying proofs. This recursion VM can check a STARK proof of SP1 and produce its own proof (a STARK proving that verification step). By doing this, SP1 can aggregate many proofs into one or convert a big STARK proof into a smaller proof. You can even recursively verify multiple program executions and bundle them into one ultimate proof. It’s like chaining proofs together until you get a single proof that convinces you of the truth of a whole batch of computations.
In practical terms, recursion is what enables proof aggregation. For example, if you have 100 transactions each proven by SP1, you could recursively prove all 100 proofs were valid and end up with one proof to verify on-chain instead of 100. That’s a huge savings in verification cost for blockchains (where every byte of proof data can cost gas). SP1’s recursion is highly optimized (borrowing techniques from Plonky3 and some custom gadgets), making it one of the first zkVMs to have efficient STARK recursion in production.
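Here’s the *shape* of aggregation as a toy Rust sketch: verify each inner "proof", then emit a single outer artifact committing to the whole batch. Everything here is a stand-in – verification is just a checksum test, the outer "proof" is a plain (non-cryptographic) hash, and real SP1 recursion instead proves the verifier computation itself inside the zkVM.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy stand-in for a proof: a claim plus a checksum over it.
/// (A real proof is cryptographic data; DefaultHasher is NOT cryptographic.)
pub struct ToyProof {
    pub claim: String,
    pub checksum: u64,
}

fn hash_str(s: &str) -> u64 {
    let mut h = DefaultHasher::new();
    s.hash(&mut h);
    h.finish()
}

impl ToyProof {
    /// "Prove" a claim by committing to it.
    pub fn prove(claim: &str) -> Self {
        ToyProof { claim: claim.to_string(), checksum: hash_str(claim) }
    }
    /// "Verify" by recomputing the commitment.
    pub fn verify(&self) -> bool {
        self.checksum == hash_str(&self.claim)
    }
}

/// "Aggregate": check every inner proof, then commit to the batch with one value.
/// Returns None if any inner proof fails, mirroring how a recursion circuit
/// simply cannot produce an outer proof over an invalid inner one.
pub fn aggregate(proofs: &[ToyProof]) -> Option<u64> {
    if !proofs.iter().all(|p| p.verify()) {
        return None;
    }
    let joined: String = proofs.iter().map(|p| p.claim.as_str()).collect();
    Some(hash_str(&joined))
}
```

The on-chain verifier then only ever sees the single outer value – one check instead of a hundred.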
At this point, you might wonder: if STARK proofs are big, and we want to verify on an Ethereum smart contract, do we really want a giant STARK verifier on-chain? Probably not – even recursive STARK proofs might be too bulky for an Ethereum contract to check directly (at least until Ethereum gets more math-friendly precompiles). So SP1 employs one more trick: SNARK wrapping.
SNARK Wrapping: Making SP1 Proofs Ethereum-Friendly
SNARKs are another type of proof (like STARKs, but with different trade-offs). Notably, some SNARKs (e.g., Groth16 or Plonk) produce super tiny proofs and have efficient verifiers that fit nicely on-chain. The downside? They often require a trusted setup and aren’t as scalable for huge computations by themselves. But what if we use a SNARK just to verify our STARK proof? Best of both worlds! 🌟
SP1 does exactly this. After generating a STARK proof of your program, SP1 can wrap it in a SNARK – specifically, it can produce a Groth16 proof that “a valid STARK proof exists for the following computation.” This Groth16 proof is succinct and can be verified on Ethereum with just a precompile call (on bn254 elliptic curve) and about 300k gas.
How does SP1 generate a SNARK proof about a STARK proof? It uses those precompiles we hinted at and some auxiliary circuits. Essentially, SP1’s proof system itself can output proofs in two layers: first a STARK (for heavy lifting), then a SNARK that attests to the correctness of that STARK. It’s like writing a detailed novel (the STARK) and then publishing a one-paragraph review (the SNARK) that everyone can quickly read to be convinced the novel is legit without reading it in full.
With SNARK wrapping, SP1 achieves on-chain verification compatibility. You get the transparency and speed of STARK off-chain, and the succinct convenience of SNARK on-chain. As a developer, you can generate a proof of your Rust program and immediately verify it in an Ethereum smart contract – no sweat. In fact, SP1’s output can be a Groth16 proof that any Ethereum contract can check, which makes it plug-and-play for existing zk-rollup and dApp infrastructures.
Alright, we’ve talked a lot about how SP1 ensures correctness and compresses proofs. Now let’s switch gears to how SP1 achieves its speed. Two big ingredients: precompiles (special accelerated circuits) and GPU parallelism. Time to turbocharge this zkVM!
Precompiles: Hardware Acceleration, but for Proofs
Remember how we compile to RISC-V and run every instruction as part of the proof? That’s flexible but sometimes inefficient – especially for tasks like hashing or elliptic curve ops that would take many RISC-V instructions. Enter SP1 precompiles: think of them as built-in cheat codes for common heavy tasks.
Instead of interpreting, say, 1000 RISC-V instructions to do a SHA-256 hash, SP1 can invoke a hash precompile – a purpose-built circuit that computes the hash much more efficiently within the proof. It’s like having a hardware accelerator in your CPU that speeds up cryptography – but here it’s a ZK-circuit accelerator. SP1’s precompile system was designed to be flexible: new precompiles can be added for different operations without overhauling the whole architecture.
Current precompiles in SP1 include things like:
- Hash functions (e.g., Keccak, SHA) for fast Merkle proofs or crypto hashing.
- Elliptic curves (bn254, bls12-381, secp256k1) for fast signature checks or recursive proof verification.
- Others for specific use-cases (you name it – if it’s used a lot and slow in pure RISC-V, it could be a precompile).
The benefit is massive speed-ups. As an example, verifying a Groth16 proof inside SP1 went from 173 million cycles down to ~9 million cycles by using the bn254 precompile – roughly a 20× improvement! That’s because the specialized circuit can do the elliptic curve math in far fewer steps than simulating it via basic arithmetic instructions. Similarly, hashing large data or doing big integer math is much faster with precompiles.
From a proof standpoint, precompiles reduce the trace length significantly for those operations, which means smaller proofs and faster proving. And thanks to SP1’s cross-table design, integrating a precompile is seamless – the output of a precompile circuit feeds back into the main CPU trace via lookups, as if the CPU “called a subroutine” that instantly did the work. This precompile-centric architecture is a big reason SP1 benchmarks so well against other zkVMs.
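A back-of-the-envelope model makes the trade-off visible: the same operation can either run as a long stream of base RISC-V instructions (many CPU-table rows) or as one precompile call that contributes a handful of rows to a specialized table plus a single lookup. The numbers and the one-row-per-call accounting below are made up for illustration – they are not SP1’s real costs.

```rust
/// Toy cost model for the precompile trade-off (illustrative numbers only).
pub enum Op {
    /// Executed instruction-by-instruction on the base CPU table.
    Software { instructions: u64 },
    /// Routed to a dedicated precompile table via a cross-table lookup.
    Precompile { table_rows: u64 },
}

/// Total trace rows contributed by a sequence of operations:
/// software ops cost one row per instruction; a precompile costs one CPU
/// row for the "call" plus its specialized table's rows.
pub fn trace_rows(ops: &[Op]) -> u64 {
    ops.iter()
        .map(|op| match op {
            Op::Software { instructions } => *instructions,
            Op::Precompile { table_rows } => 1 + table_rows,
        })
        .sum()
}
```

Since proving time grows with trace length, turning a thousand-row software hash into a few dozen precompile rows translates almost directly into prover speed-up.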
In short, precompiles give SP1 the ability to cheat (legally): do in one step what others might do in a thousand. It’s one of our favorite tricks in the toolbox for performance. But what about raw computational power? That’s where GPUs join the party.
GPU Parallelism: Proving at Warp Speed
Even with all the optimizations, generating ZK proofs is heavy work – lots of math on big polynomials and vectors. Traditionally, proof generation was done on CPUs, but why not use a whole fleet of cores? This is where GPUs (Graphics Processing Units) shine. They’re basically many-core processors optimized for parallel number crunching, perfect for the linear algebra and FFTs in STARK proving.
SP1 leverages GPUs to dramatically speed up proving. In fact, the latest SP1 prover can offload major parts of the computation to GPUs, achieving enormous speed-ups (and cost reductions) over CPU-only proving. Real-world benchmarks showed SP1 proving something like an Ethereum block’s worth of transactions 10× cheaper using GPUs. The intuition: instead of one brain doing all the work sequentially, SP1 uses many smaller brains tackling chunks of the problem in parallel.
For example, the polynomial FFTs (Fast Fourier Transforms) and cryptographic hashing in STARKs can be parallelized. A GPU with thousands of threads can evaluate or interpolate polynomials much faster than a single CPU core. SP1’s engineering team spent a ton of effort optimizing these GPU kernels (and, on the CPU side, leaning on vectorized instruction sets like AVX-512). The result is an order-of-magnitude leap in proving throughput.
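Here’s the parallelism intuition in CPU miniature, using std threads: split a big batch of polynomial evaluations into chunks, hand each chunk to a worker, and stitch the results back together. A real GPU kernel parallelizes at far finer grain (individual FFT butterflies, hash rounds), so treat this purely as a sketch of the divide-the-work shape; the modulus is an arbitrary toy choice.

```rust
use std::thread;

/// Evaluate a polynomial (ascending-degree coefficients) at x mod `modulus`,
/// via Horner's rule. Coefficients are assumed already reduced.
pub fn eval_poly(coeffs: &[u64], x: u64, modulus: u64) -> u64 {
    let x = x % modulus;
    coeffs.iter().rev().fold(0u64, |acc, &c| (acc * x + c) % modulus)
}

/// Evaluate one polynomial at many points, splitting the points across
/// `workers` scoped threads. Toy stand-in for a GPU batch evaluation.
pub fn eval_many_parallel(coeffs: &[u64], xs: &[u64], workers: usize) -> Vec<u64> {
    const P: u64 = 2_147_483_647; // toy prime field modulus (2^31 - 1)
    let workers = workers.max(1);
    let chunk = ((xs.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        // Spawn one worker per chunk of sample points...
        let handles: Vec<_> = xs
            .chunks(chunk)
            .map(|points| {
                s.spawn(move || {
                    points.iter().map(|&x| eval_poly(coeffs, x, P)).collect::<Vec<u64>>()
                })
            })
            .collect();
        // ...then concatenate the per-chunk results in order.
        handles.into_iter().flat_map(|h| h.join().unwrap()).collect()
    })
}
```

The same answer comes out regardless of worker count – only the wall-clock time changes, which is exactly the property that lets provers scale out across GPU threads (or whole machines).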
What does this mean for you? Faster proofs, cheaper cloud costs, and the ability to tackle bigger computations. SP1’s proving on a decent GPU rig can turn hours into minutes for certain tasks. And if one GPU isn’t enough, you can even scale out with multiple GPUs or machines working in parallel (think of a prover cluster racing through chunks of the trace).
In summary: SP1 + GPUs = 🚀. It’s the difference between carving a statue with a toothpick versus using a laser cutter. Both can get the job done, but one is way faster.
Decentralized Prover Network: Proofs as a Service
Okay, so SP1 can run fast on powerful hardware. But what if you don’t have a beefy GPU farm at home? Or what if your protocol needs to generate tons of proofs and you’d rather not foot the hardware bill at all? That’s where the Succinct Prover Network comes in – a distributed network of provers that anyone can use (and contribute to) to get proofs done. Think of it as the “Uber for ZK proofs” 🚕, or a decentralized AWS for proving.
The idea is simple: you submit a proving job (your program and inputs) to the network, and a network of participants (provers) will do the heavy lifting and return you a valid proof. These provers are incentivized (they earn fees or rewards) to compute correctly and quickly. Since it’s a network, you have reliability and scale – many machines across the world can work on many proofs concurrently. It removes the need for each project to maintain its own specialized ZK hardware cluster.
Succinct’s Prover Network is designed with an auction-like marketplace: provers bid to take on your job, ensuring you get a competitive price and fast turnaround. And because proofs can be verified easily, you don’t have to trust the prover – the proof they return either checks out or not. This decentralizes trust and makes proof generation a commodity service.
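The marketplace logic above can be caricatured in a few lines of Rust: provers bid on a job, the cheapest bid wins, and the returned proof is only accepted if it verifies (so trust never enters the picture). The types and selection rule here are invented for illustration – the real network’s auction and slashing mechanics are considerably richer.

```rust
/// A prover's bid on a proving job (illustrative toy model).
pub struct Bid {
    pub prover: String, // who is offering to prove
    pub price: u64,     // fee they ask for the job
}

/// Pick the cheapest bid, if any. A real marketplace would also weigh
/// latency, stake, and reputation; lowest-price is the simplest caricature.
pub fn select_winner(bids: &[Bid]) -> Option<&Bid> {
    bids.iter().min_by_key(|b| b.price)
}
```

The key point is the last step the sketch leaves implicit: whoever wins, the requester checks the proof itself, so a cheating prover gains nothing except a slashed stake.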
In the context of SP1, the Prover Network makes it easy for developers to adopt. You can write your app’s logic in Rust, and rather than also becoming a DevOps expert in ZK hardware, you just call the network to get your proofs. It’s like having a cloud render farm for your zkVM. Early partners and projects have already been using our private beta to generate thousands of proofs this way. The network is now coming online to the public, unlocking ZK capacity for everyone.
For those interested in the nitty-gritty: the network coordinates jobs, ensures provers put up stakes (so they can be slashed if they cheat), and will even support proof aggregation services. The long-term vision is an open marketplace where proving power is abundant and cheap, driving down the cost for all users. We won’t dive too deep here (that could be its own article), but keep an eye out for future parts of this series where we explore how the Prover Network works and how it ties into SP1’s design.
Wrapping Up (and What’s Next)
Congratulations, you made it through a whirlwind tour of SP1’s architecture! 🎊 We covered a lot of ground: starting from high-level Rust code, through the RISC-V pipeline and execution trace, into the arcane realm of STARKs and FRI, climbing up the recursion ladder, wrapping things in neat SNARK bows, and turbocharging it all with precompiles, GPUs, and a distributed network of provers. Phew! That’s a mouthful (or brainful). The key takeaway is that SP1 combines many advanced techniques into one coherent system – all so you can prove programs and verify them efficiently.
But guess what? We’re just getting started. This was Part 1, giving you the big picture. In upcoming parts, we’ll zoom in on specific pieces of SP1. Curious about how exactly those trace tables are constructed? Or how our recursive proof DSL works under the hood? Or maybe you want a deep dive into the precompile design (we’ve got some fun stories about implementing elliptic curves in a zkVM 😜)? Stay tuned – we’ll tackle each of these in a friendly, accessible way, continuing our (slightly crazy) journey through ZK land.
Before you go: why not see SP1 in action for yourself? On my personal site, I’ve put up a quirky demo game that runs on SP1. It’s a simple puzzle game where, behind the scenes, SP1 is proving that the game logic was followed correctly (no cheating!). You can play the game and then inspect a real SP1 proof that verifies your win. 🤯 It’s both fun and a neat tech showcase. Consider this an open invitation – head over to the game demo on my site and let me know what you think!
Thanks for reading this far. We hope you enjoyed the energetic romp through SP1’s architecture. Zero-knowledge proofs can be complex, but with the right analogies (and a dash of humor), they become a lot more approachable. Stick around for Part 2, where we’ll deep-dive into “How Does SP1 Actually Prove a Program?” – we promise more clarity, more chaos, and more knowledge. Until then, keep it succinct and zk-on! 🫡