Plonky3, the small-fast-cheap revolution
Why plonky3 — small fields, FRI commitments, no trusted setup — is the proof system to watch in 2026. The Mersenne31 / BabyBear / Goldilocks landscape, the FRI folding step, and why your laptop is suddenly a viable prover.
- FROM
- Dax the Dev <[email protected]>
- SOURCE
- https://blog.skill-issue.dev/blog/plonky3_small_fast_cheap/
- FILED
- 2026-05-02 17:00 UTC
- REVISED
- 2026-05-02 17:00 UTC
- TIME
- 9 min read
- SERIES
- ZK SNARKs in production
For a decade the dominant question in proof-system engineering was which curve. BN254 because Ethereum verifies it cheaply. BLS12-381 because Zcash and Filecoin standardised on it. The conversation orbited 254-bit and 381-bit pairing-friendly prime fields, and the engineering economy followed: every multiplier, every NTT, every MSM was tuned for those sizes.
Then Polygon Zero shipped plonky2 in 2022, then plonky3 in 2024, and the question changed. The new question is which 31-bit prime. Mersenne31. BabyBear. KoalaBear. Fields small enough that two elements fit in a single 64-bit word. Fields where AVX-512 SIMD lanes hold sixteen field elements at once. Fields where a consumer laptop is suddenly a viable prover for circuits that used to require a small datacentre.
This is the small-fast-cheap revolution. It is also the most underrated story in production cryptography in 2026, because most of the conversation about it is happening inside Polygon, Succinct, and a handful of zkVM teams, and it hasn’t yet hit the popular “ZK in 2026” articles. This post is my attempt to write the article I keep wishing existed.
The case for small fields
Every proof-system operation eventually reduces to multiplying two field elements modulo a prime $p$. The cost of one of those multiplies is essentially:
$$\text{cost} \approx \left\lceil \frac{\log_2 p}{w} \right\rceil^{2}$$
where $w$ is your machine’s word size (typically 64 bits); that is, the cost is quadratic in the number of machine words required to hold a field element. For BN254’s 254-bit prime that’s 4 limbs, so $4^2 = 16$ low-level multiplies per high-level field multiplication. For Mersenne31 (the prime $2^{31}-1$) that’s one limb, so one low-level multiply. Sixteen times faster on the floor.
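The cost model above is easy to sanity-check. Here is a minimal Python sketch, assuming schoolbook (quadratic) limb multiplication and ignoring the cost of the modular reduction itself. The helper names `limbs` and `limb_multiplies` are illustrative, not taken from any library.

```python
def limbs(field_bits: int, word_bits: int = 64) -> int:
    """Number of machine words needed to hold one field element."""
    return -(-field_bits // word_bits)  # ceiling division


def limb_multiplies(field_bits: int, word_bits: int = 64) -> int:
    """Word-level multiplies in one schoolbook field multiplication."""
    return limbs(field_bits, word_bits) ** 2


# Field sizes in bits, as quoted in this post.
FIELDS = [("BN254", 254), ("BLS12-381", 381), ("Goldilocks", 64), ("Mersenne31", 31)]

for name, bits in FIELDS:
    print(f"{name:>10}: {limbs(bits)} limb(s), {limb_multiplies(bits)} word multiplies")
```

Real implementations use Montgomery or other specialised reductions, so the measured ratio will differ from this count; the sketch only shows where the quadratic factor comes from.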
The headline win is fewer cycles per multiply. The hidden win, and the one that actually shifts the deployment landscape, is SIMD parallelism. AVX2 holds eight 32-bit lanes; AVX-512 holds sixteen. With BN254 you can fit only two field elements in an AVX-512 register, and parallelism is awkward. With Mersenne31 you fit sixteen, and operations like NTTs become embarrassingly parallel.
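The lane counts in the paragraph above follow from a single division. A small sketch, assuming each element is stored in the smallest power-of-two width that holds it (32 bits for the 31-bit primes, 256 bits for BN254); the `lanes` helper and both tables are illustrative:

```python
# Storage width per field element, in bits (assumed power-of-two packing).
STORAGE_BITS = {"Mersenne31": 32, "BabyBear": 32, "Goldilocks": 64, "BN254": 256}

# SIMD register widths, in bits.
REGISTER_BITS = {"AVX2": 256, "AVX-512": 512}


def lanes(register_bits: int, element_bits: int) -> int:
    """How many whole field elements fit in one SIMD register."""
    return register_bits // element_bits


for isa, reg in REGISTER_BITS.items():
    for field, width in STORAGE_BITS.items():
        print(f"{isa:>7} / {field:<10}: {lanes(reg, width)} element(s) per register")
```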
There is one cost: soundness. A 31-bit prime gives you at most ~31 bits of soundness from any single random challenge in a STARK / FRI-based protocol. To get to the standard 100-bit security, you query the FRI oracle multiple times (~100 queries), or you work in a quadratic / quartic / quintic extension field during the protocol’s soundness-critical steps. Plonky3 does both: prover work happens in the base field for speed, and the random-evaluation challenges (where soundness lives) happen in an extension field.
This is the core trick. Big fields where you need security; small fields everywhere else. It buys an order of magnitude in prover time without compromising the threat model.
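One way to see why the challenges move to an extension field: by the Schwartz-Zippel lemma, a random-evaluation check on a nonzero polynomial of degree at most $d$ over a field of size $q$ passes wrongly with probability at most $d/q$. The sketch below turns that bound into bits for Mersenne31 and a few extension degrees. It is a deliberate simplification: a full FRI soundness analysis has further terms (code rate, query count), and both `challenge_soundness_bits` and the degree $2^{20}$ are illustrative choices.

```python
import math

M31 = 2**31 - 1


def challenge_soundness_bits(p: int, ext_degree: int, poly_degree: int) -> float:
    """-log2 of the Schwartz-Zippel error d / p^k for one random challenge."""
    field_bits = ext_degree * math.log2(p)
    return field_bits - math.log2(poly_degree)


# Assumed polynomial degree, roughly a million-row trace.
DEGREE = 2**20

for k in (1, 2, 4, 5):
    bits = challenge_soundness_bits(M31, k, DEGREE)
    print(f"extension degree {k}: about {bits:.0f} bits from one challenge")
```

Under these assumptions the base field alone is nowhere near 100 bits, while a quartic extension clears it.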
The four small-field contenders
There are four primes the 2026 ecosystem cares about. They’re all chosen because they admit fast modular reduction (no expensive division per multiply) and they all fit comfortably in a 64-bit word.
| Field | Prime | Why this prime |
|---|---|---|
| Mersenne31 | $2^{31}-1$ | Mersenne prime: reduction is one shift + one add; smallest sensible prime field |
| BabyBear | $2^{31}-2^{27}+1$ | NTT-friendly: has a 2-adicity of 27, so domain sizes up to $2^{27}$ admit fast FFTs |
| KoalaBear | $2^{31}-2^{24}+1$ | NTT-friendly: slightly worse 2-adicity (24) but better extension-field arithmetic |
| Goldilocks | $2^{64}-2^{32}+1$ | 64-bit prime; used by plonky2 and Risc Zero; fits in one machine word |
Plonky3 supports all of them and lets you pick at compile time. The choice changes the constant in front of the prover time and the security analysis but doesn’t change the protocol shape.
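Two properties of the four fields in the table above can be checked directly: the 2-adicity (how many times 2 divides $p-1$) and the shift-and-add reduction for Mersenne31. The following is a plain-Python sketch for illustration, not how Plonky3 implements either; `two_adicity` and `reduce_m31` are hypothetical helper names.

```python
PRIMES = {
    "Mersenne31": 2**31 - 1,
    "BabyBear": 2**31 - 2**27 + 1,
    "KoalaBear": 2**31 - 2**24 + 1,
    "Goldilocks": 2**64 - 2**32 + 1,
}


def two_adicity(p: int) -> int:
    """Largest k such that 2^k divides p - 1."""
    n, k = p - 1, 0
    while n % 2 == 0:
        n //= 2
        k += 1
    return k


M31 = PRIMES["Mersenne31"]


def reduce_m31(x: int) -> int:
    """Reduce 0 <= x < 2^62 modulo 2^31 - 1 without division.

    Uses 2^31 = 1 (mod p): the high bits can simply be added onto the low bits.
    """
    x = (x >> 31) + (x & M31)  # first fold: result is below 2^32
    x = (x >> 31) + (x & M31)  # second fold: result is at most p
    return x - M31 if x >= M31 else x  # canonicalise p itself to 0


for name, p in PRIMES.items():
    print(f"{name:>10}: 2-adicity {two_adicity(p)}")
```

Mersenne31 comes out with a 2-adicity of only 1, which is why it is paired with the circle-STARK construction rather than a conventional multiplicative-subgroup NTT. The second fold and the final conditional subtraction are only needed to reach a canonical representative; the core of the reduction is the single shift and add.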
In production:
- plonky2 (the older Polygon Zero proof system, still widely deployed) uses Goldilocks.
- plonky3 primarily ships with BabyBear or KoalaBear as the recommended defaults.
- Risc Zero’s zkVM uses Goldilocks.
- Succinct’s SP1 uses BabyBear.
- Stwo / StarkWare’s next-gen uses Mersenne31 (the M31 / `circle-stark` program).
The convergence is striking: every serious 2026 zkVM is on a small field. The big-field era for zkVMs specifically is closing.
Figure (flowchart): proof-system lineage. 2014 Pinocchio → 2016 Groth16 (BN254), which then splits. Big-field branch: 2019 PLONK + KZG → 2020 Halo2 (Pasta, IPA) → 2024 Halo2 (KZG/BN254) → EVM rollups. Small-field branch: 2018 STARK (Goldilocks) → 2022 plonky2 (Goldilocks) → 2024 plonky3 (BabyBear) → zkVMs (SP1, RISC0, Stwo).
FRI — the polynomial commitment behind everything small
The reason small fields work in proof systems at all is FRI (Fast Reed-Solomon Interactive Oracle Proof of Proximity), introduced in Ben-Sasson, Bentov, Horesh, Riabzev (2018). FRI is a polynomial commitment scheme that works over any field: no pairing-friendliness required, no trusted setup, no SRS. The trade-off is proof size. FRI proofs are tens of kilobytes, where KZG-based proofs are roughly 600 bytes.
For the prover, FRI is the most expensive thing in the protocol. Most of it is folding: at each round you take a polynomial of degree less than $d$ and reduce it to a polynomial of degree less than $d/2$ by combining adjacent coefficient pairs. Repeat $\log_2 d$ times and you arrive at a constant polynomial that the verifier can check directly.
The folding step is one line of arithmetic:
$$f(x) = f_{\text{even}}(x^2) + x \, f_{\text{odd}}(x^2), \qquad f'(y) = f_{\text{even}}(y) + \beta \, f_{\text{odd}}(y)$$
where $\beta$ is a random challenge from the verifier. If $f$ has degree less than $d$, $f'$ has degree less than $d/2$. The verifier checks consistency at a small number of query points drawn at random.
The original post embeds a tiny Sandpack demo at this point that visualises the folding step on a small polynomial: you pick a degree-7 polynomial, and the demo folds it to degree 3, then degree 1, then a constant, showing the coefficients at each step.
What’s worth internalising from the demo: each fold is a linear combination over field elements. There’s nothing exotic here. The reason FRI is fast in production is that the inner loop of “combine pairs of coefficients with a random multiplier” is exactly the kind of thing AVX-512 was built for. Sixteen lanes. Per cycle. Per core.
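The folding rule described above can be sketched in a few lines of Python. In coefficient form each output coefficient is `c[2i] + beta * c[2i+1]`, where `beta` stands for the verifier's challenge. The sketch uses BabyBear purely as a convenient prime and draws `beta` from the base field for simplicity, whereas a production prover would draw it from an extension field; the function names are illustrative.

```python
import random

P = 2**31 - 2**27 + 1  # BabyBear, used here only as a convenient prime


def fold(coeffs, beta, p=P):
    """One FRI fold in coefficient form: combine each (even, odd) pair."""
    assert len(coeffs) % 2 == 0
    return [(coeffs[i] + beta * coeffs[i + 1]) % p for i in range(0, len(coeffs), 2)]


def evaluate(coeffs, x, p=P):
    """Horner evaluation of sum(coeffs[i] * x^i) mod p."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc


def fold_from_evals(fx, f_neg_x, x, beta, p=P):
    """Value of the folded polynomial at x^2, computed from f(x) and f(-x)."""
    even = (fx + f_neg_x) * pow(2, -1, p) % p      # f_even(x^2)
    odd = (fx - f_neg_x) * pow(2 * x, -1, p) % p   # f_odd(x^2)
    return (even + beta * odd) % p


random.seed(0)
f = [random.randrange(P) for _ in range(8)]  # 8 coefficients: degree 7
layers, betas = [f], []
while len(layers[-1]) > 1:
    betas.append(random.randrange(P))
    layers.append(fold(layers[-1], betas[-1]))

print([len(layer) for layer in layers])  # coefficient counts per round
```

`fold_from_evals` is the consistency relation a verifier spot-checks: given the prover's claimed values of `f` at `x` and `-x`, it recomputes what the next layer must evaluate to at `x^2`.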
Why “consumer hardware” matters in 2026
Here are wall-clock prover times for a 1-million-cycle zkVM trace, measured across the major 2026 zkVM stacks on a consumer machine — a 2024 MacBook Pro with M3 Max, 14 cores, 48 GB RAM. (Numbers from public benchmarks, normalised to the same reference input.)
| Stack | Field | Prover time | Notes |
|---|---|---|---|
| RISC Zero (zkVM) | Goldilocks | ~3 minutes | STARK + AIR |
| SP1 (zkVM) | BabyBear | ~95 seconds | plonky3-based |
| Stwo (zkVM) | Mersenne31 | ~80 seconds | circle-STARK on M31 |
| zkSync (Boojum) | Goldilocks | ~5 minutes | older arithmetisation |
Two years ago, none of these were under five minutes. Today the zkVM leaderboard is a tight band between 80 seconds and 3 minutes (Boojum, on its older arithmetisation, trails at ~5 minutes), and the difference is dominated by which small field. The big-field equivalent (a pure BN254 PLONK prover at the same trace) would take 30+ minutes on the same machine.
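To put the table in ratio form, here is a small sketch that divides the quoted BN254 figure by each stack's prover time. The inputs are copied from the table and the paragraph above, and "30+ minutes" is treated as exactly 30 minutes, so the ratios are lower bounds on rough numbers rather than new measurements.

```python
# Baseline: the "30+ minutes" BN254 PLONK figure, taken as a lower bound.
BASELINE_SECONDS = 30 * 60

# Approximate prover times from the table, in seconds.
PROVER_SECONDS = {
    "RISC Zero": 3 * 60,
    "SP1": 95,
    "Stwo": 80,
    "zkSync (Boojum)": 5 * 60,
}


def speedup(seconds: float, baseline: float = BASELINE_SECONDS) -> float:
    """How many times faster than the big-field baseline."""
    return baseline / seconds


for stack, secs in PROVER_SECONDS.items():
    print(f"{stack:>16}: at least {speedup(secs):.1f}x faster than the BN254 baseline")
```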
This is what “consumer hardware is now a viable prover” means in 2026. The substantial barrier — the one that kept zkVMs off consumer hardware until 2024 — was the cost of MSMs and NTTs over big fields. Small fields removed that barrier.
The four-prime tradeoff
| Field | Arithmetic | Speed | Maturity | Notes |
|---|---|---|---|---|
| BN254 (~254 bits) | Pairing-friendly; 4 limbs per element; small SIMD parallelism | Slow per-op; required for EVM verification | Standard; battle-tested by Ethereum and every Groth16 circuit | The default in 2020-2024; still required for EVM verifier outputs |
| BLS12-381 (~381 bits) | Pairing-friendly; 6 limbs per element | Slower than BN254 in-circuit; better aggregate signatures | Standard; Filecoin / Ethereum consensus signatures | Use when you need 128-bit security pairings, not for prover work |
| Mersenne31 ($2^{31}-1$) | Tiny; trivial reduction; 16x SIMD parallelism on AVX-512 | ~30x faster per multiply than BN254 | Newer; requires extension-field handling for soundness | What StarkWare's circle-STARK uses; future-proof choice |
| Goldilocks ($2^{64}-2^{32}+1$) | Single u64 limb; clean reduction via algebraic identity | Slower than M31 but more 2-adicity for big NTTs | Used by plonky2, Risc Zero, zkSync Boojum; mature | The pragmatic 2024-2026 default for STARK-based zkVMs |
Why this should change how you think about ZK costs
The dominant ZK cost model from 2018 to 2024 was: more constraints = more dollars. Field arithmetic was the bottleneck, the constants were huge, and a million-constraint circuit was a real research expense.
The 2026 cost model is different. Constraint count still matters, but the constants have collapsed. A million-constraint Plonky3 trace proves on a $1500 laptop in under two minutes. That’s three orders of magnitude cheaper than the equivalent BN254 PLONK prover four years ago. Prover-side cost is no longer the binding constraint for most applications.
The new binding constraints are:
- Memory bandwidth. Big NTTs are memory-bound, not compute-bound. The win from small fields is partly that more elements fit in cache.
- Verifier complexity on-chain. Plonky3 proofs are 50–200 KB; verifying them on Ethereum requires either an EVM-friendly final wrap (which is what the SP1 / RISC0 / Stwo verifiers do) or a Solana-style permissive compute budget.
- Ecosystem maturity. snarkjs / Halo2-axiom / circomlib have a decade of accreted gadgets; Plonky3 is in year three of its current incarnation. The libraries are catching up but they’re not at parity yet.
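The memory-bandwidth point in the list above is mostly arithmetic on element sizes. A rough sketch, assuming one trace column of $2^{20}$ elements and power-of-two storage per element (4 bytes for the 31-bit fields, 8 for Goldilocks, 32 for BN254); `column_bytes` is an illustrative helper:

```python
# Bytes per stored field element (assumed packing).
ELEMENT_BYTES = {"31-bit fields": 4, "Goldilocks": 8, "BN254": 32}

ROWS = 2**20  # one column of a roughly million-row trace


def column_bytes(rows: int, element_bytes: int) -> int:
    """Memory footprint of a single trace column."""
    return rows * element_bytes


for field, size in ELEMENT_BYTES.items():
    mib = column_bytes(ROWS, size) / 2**20
    print(f"{field:>13}: {mib:.0f} MiB per column")
```

Comparing those footprints against your own CPU's cache sizes shows how many columns of an NTT stay cache-resident in each field.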
Where this leaves zera-sdk
Inside zera-sdk the substrate is BN254 + Groth16 because Solana’s verifier is BN254-and-only-BN254 today. There’s no equivalent of sol_alt_bn128_pairing for any of the small-field protocols. That means Plonky3 is not a choice we get to make for the deposit / transfer / withdraw circuits — the on-chain side fixes the curve.
What we do track is the Solana CPI proposal for STARK verification (no number yet; was last discussed in 2025) and the related “compute-budget-friendly Halo2 verifier” path. The day Solana ships either of those, the prover-side win from migrating off BN254 is large enough to justify a circuit rewrite. Until then, BN254 it is.
For off-chain proving — CI checks, offline auditing, batch verification — Plonky3 is already the right tool, and we’re using it inside the test harness for cross-validating circuit semantics.
What I’d build differently in 2027
Three follow-ups, in order of how much I expect them to matter:
- A small-field shielded pool. Every privacy pool today is BN254 + Groth16 + per-circuit ceremony. The day Solana (or any high-throughput L1) ships a STARK verifier, the design space opens: no ceremony, faster proving, smaller wallets. Someone will publish this design before the verifier ships and they’ll be right to.
- A unified extension-field abstraction. Plonky3 has different extension-field arithmetic per base field. A single `Ext<F, k>` with consistent ergonomics would make cross-field experimentation trivial. The team is aware; not yet shipped.
- A small-field Poseidon variant. Poseidon-128 is parameterised for BN254. The recommended hash for BabyBear is Monolith or Poseidon2 over BabyBear, and the constraint counts are different enough that constraint-counting intuition from BN254 doesn’t transfer. A “Poseidon constraint cost calculator” that takes a field as input and emits constraint counts for common circuits would close a real reasoning gap.
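To make the extension-field item concrete, here is a toy sketch of what a field-generic extension type could look like. It models elements of $\mathbb{F}_p[x]/(x^k - w)$ with a single multiplication routine shared by every base field. It is a hypothetical illustration, not Plonky3's API: `BinomialExt` and `m31_complex` are made-up names, and the caller is assumed to supply a `w` for which $x^k - w$ is irreducible. The example uses Mersenne31 with $x^2 + 1$, which is irreducible because $2^{31}-1 \equiv 3 \pmod 4$.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinomialExt:
    """Hypothetical element of F_p[x] / (x^k - w), with k = len(coeffs)."""
    p: int
    w: int
    coeffs: tuple

    def __add__(self, other):
        summed = tuple((a + b) % self.p for a, b in zip(self.coeffs, other.coeffs))
        return BinomialExt(self.p, self.w, summed)

    def __mul__(self, other):
        k = len(self.coeffs)
        out = [0] * k
        for i, a in enumerate(self.coeffs):
            for j, b in enumerate(other.coeffs):
                term = a * b
                if i + j >= k:  # x^k == w, so wrap around and scale by w
                    term *= self.w
                out[(i + j) % k] = (out[(i + j) % k] + term) % self.p
        return BinomialExt(self.p, self.w, tuple(out))


M31 = 2**31 - 1


def m31_complex(a: int, b: int) -> BinomialExt:
    """a + b*i over Mersenne31, with i^2 = -1 (so w = p - 1, k = 2)."""
    return BinomialExt(M31, M31 - 1, (a % M31, b % M31))


i = m31_complex(0, 1)
print(i * i == m31_complex(-1, 0))
```

Swapping in a different base field or degree only changes the `p`, `w`, and tuple length passed in, which is the ergonomic point of a unified abstraction.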
Further reading
- `Plonky3/Plonky3`: the toolkit; the README is the closest thing to a paper
- Polygon Plonky3 is Production Ready: Polygon’s announcement, summarising the small-field bet
- Scalable, transparent, and post-quantum secure computational integrity — Ben-Sasson, Bentov, Horesh, Riabzev (2018) — the FRI / STARK paper
- Risc Zero zkVM proof system — the contrast point: Goldilocks + STARK in production
- Halo2 in 2026: what changed since the Zcash era — sister post on the big-field / KZG lineage
- Proving in the browser, by the numbers — what Plonky3 means for in-browser proving (spoiler: enormous, eventually)
- Poseidon, by hand and by code — the hash that’s being re-parameterised for small fields