Poseidon, by hand and by code

Why one of the cheapest hashes in zero-knowledge cryptography also has the strangest insides. Derive the S-box, count the constraints, and run a 30-line implementation in the browser.

FROM
Dax the Dev <[email protected]>
SOURCE
https://blog.skill-issue.dev/blog/poseidon_by_hand_and_by_code/
FILED
2026-04-22 15:00 UTC
REVISED
2026-04-22 15:00 UTC
TIME
9 min read
SERIES
ZK SNARKs in production
TAGS
#cryptography #poseidon #zk #snark #phd #math

A SHA-256 of “abc” inside a SNARK takes about 24,000 R1CS constraints. The same input through Poseidon — properly parameterised — takes about 250.

Two orders of magnitude. That ratio is the entire reason ZERA’s unified shielded pool ships with consumer-grade UX in 2026. It’s also the reason every modern ZK system you can name uses Poseidon, Rescue, or one of their cousins instead of something the cryptographic community has been beating on for twenty years.

This post is the long answer to why.

The problem with hashing inside a SNARK

A zero-knowledge SNARK proves you know a witness $w$ such that $C(w) = 0$ for some arithmetic circuit $C$ over a prime field $\mathbb{F}_p$. Every operation in $C$ becomes a constraint, and proof time scales roughly linearly with the number of constraints.

The trouble with SHA-256 is that it was designed for CPU efficiency, not arithmetic-circuit efficiency. Its building blocks (XOR, AND, bitwise rotation) are cheap on a CPU and catastrophically expensive in $\mathbb{F}_p$. A single XOR over 32-bit words requires unpacking each word into 32 individual bits, each pinned to 0 or 1 by its own constraint, doing the XOR bit-by-bit, then packing back. SHA-256 has 64 rounds of mixing, and every round does several of these.

The constraint cost looks roughly like:

$$\text{cost}_{\text{SHA-256 in SNARK}} \approx 64 \times (k_{\text{xor}} + k_{\text{and}} + k_{\text{rot}}) \times w$$

where $w = 32$ bits and the per-operation constants $k$ run between 30 and 100. You end up north of 25k constraints for a 64-byte input, and that’s just the hash. A real circuit has dozens of these per spend.
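
To make the bit-unpacking cost concrete, here is a minimal sketch of one 32-bit XOR written with field arithmetic only. The constraint tally assumes a naive encoding (one booleanity check per input bit, one multiplication per output bit) and ignores the linear packing step; the function names are hypothetical, and real circuit compilers may count differently.

```typescript
// BN254 scalar-field modulus.
const P = 21888242871839275222246405745257275088548364400416034343698204186575808495617n;

const mod = (x: bigint): bigint => ((x % P) + P) % P;

// XOR of two field elements already known to be 0 or 1: a + b - 2ab.
function xorBit(a: bigint, b: bigint): bigint {
  return mod(a + b - 2n * a * b);
}

// Unpack a 32-bit word into little-endian bits, each stored as a field element.
function toBits(word: number): bigint[] {
  return Array.from({ length: 32 }, (_, i) => BigInt((word >>> i) & 1));
}

// Repack little-endian bits into a 32-bit word.
function fromBits(bits: bigint[]): number {
  return Number(bits.reduce((acc, bit, i) => acc + (bit << BigInt(i)), 0n));
}

function xorWord(a: number, b: number): { result: number; constraints: number } {
  const aBits = toBits(a);
  const bBits = toBits(b);
  const outBits = aBits.map((bit, i) => xorBit(bit, bBits[i]));
  // Assumed tally: 32 booleanity checks per input word, plus 32 multiplications.
  const constraints = 32 + 32 + 32;
  return { result: fromBits(outBits), constraints };
}

console.log(xorWord(0xdeadbeef, 0x12345678).constraints); // 96
```

One machine instruction on a CPU becomes roughly a hundred constraints under these assumptions, and SHA-256 repeats that pattern throughout its 64 rounds.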

This is the gap that hash-friendly arithmetisation closes.

Poseidon’s design: only field operations, all the way down

Grassi, Khovratovich, Rechberger, Roy, and Schofnegger (2021) had a different idea: design the hash natively in $\mathbb{F}_p$. No bits. No bytes. Just field elements all the way down.

Poseidon is a permutation-based sponge. The state is $t$ field elements, typically $t = 3$ for two-to-one hashing ($\text{input} \,\|\, \text{input} \to \text{output}$) and $t = 5$ for absorbing four field elements at once. The permutation alternates two kinds of rounds: full rounds, which apply the S-box to all $t$ state elements, and partial rounds, which apply it to just one.

The S-box is the simplest possible non-linear function over a prime field:

$$S(x) = x^\alpha$$

with $\alpha$ chosen as the smallest exponent greater than 1 for which $\gcd(\alpha, p - 1) = 1$ (so the map is a bijection). For BN254 (the curve underlying most production ZK pairings, including the one ZERA’s SDK uses), $p - 1$ is divisible by 2 and 3, so $\alpha = 5$ is the smallest legal exponent. Poseidon over BN254 ships with $\alpha = 5$.
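
The exponent search is small enough to run directly. This sketch assumes the BN254 scalar-field modulus for $p$; the helper names are hypothetical.

```typescript
// BN254 scalar-field modulus.
const BN254_P = 21888242871839275222246405745257275088548364400416034343698204186575808495617n;

// Euclid's algorithm on bigints.
function gcd(a: bigint, b: bigint): bigint {
  while (b !== 0n) {
    [a, b] = [b, a % b];
  }
  return a;
}

// Smallest alpha >= 2 such that x -> x^alpha permutes F_p.
function smallestSboxExponent(p: bigint): bigint {
  let alpha = 2n;
  while (gcd(alpha, p - 1n) !== 1n) alpha += 1n;
  return alpha;
}

console.log(smallestSboxExponent(BN254_P)); // 5n
```

The same function returns 3 for any prime where 3 does not divide $p - 1$, for example the toy prime 11.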

The full permutation is:

Three primitives, repeated $R_F + R_P$ times: add round constants → S-box → MDS matrix multiplication.

That’s the whole algorithm.

Counting the constraints

This is where the order-of-magnitude advantage shows up.

Each S-box is $x^5 = x^2 \cdot x^2 \cdot x$. In R1CS that’s three multiplication constraints (one for $x^2$, one for $x^4 = x^2 \cdot x^2$, one for $x^4 \cdot x = x^5$). The MDS matrix is a fixed $t \times t$ matrix of constants applied to the state; that’s free in R1CS because constant multiplications fold into linear combinations and don’t generate constraints.

So per round:

$$\text{cost}_{\text{full round}} = 3t, \quad \text{cost}_{\text{partial round}} = 3$$

Recommended parameters for BN254 with $t = 3$ (hashing two field elements) are $R_F = 8$ full rounds and $R_P = 57$ partial rounds. Total constraint count:

$$8 \cdot (3 \cdot 3) + 57 \cdot 3 = 72 + 171 = 243$$

Two hundred and forty-three constraints. For a hash of two field elements (~64 bytes of payload). SHA-256 was 24,000+ for a similar payload. That ratio — about 100× — is the entire ball game.
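
The count above can be written as a small calculator. This is a sketch under the same cost model (only S-box multiplications count, linear layers are free); the function names are hypothetical.

```typescript
// Multiplications needed for x^alpha by square-and-multiply:
// one squaring per bit after the leading one, plus one multiply per extra set bit.
function sboxMultiplications(alpha: number): number {
  const bits = alpha.toString(2);
  const squarings = bits.length - 1;
  const setBits = bits.split("").filter((b) => b === "1").length;
  return squarings + (setBits - 1);
}

// Full rounds apply the S-box to all t elements, partial rounds to one.
function poseidonConstraints(
  t: number,
  fullRounds: number,
  partialRounds: number,
  alpha: number = 5,
): number {
  const perSbox = sboxMultiplications(alpha);
  return fullRounds * t * perSbox + partialRounds * perSbox;
}

console.log(sboxMultiplications(5)); // 3
console.log(poseidonConstraints(3, 8, 57)); // 243
```

Making every round full, `poseidonConstraints(3, 65, 0)`, gives 585, which is the saving the partial rounds buy.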

| Option | Cost | Latency | Blast radius | Notes |
|---|---|---|---|---|
| SHA-256 (in-circuit) | ~24,000 constraints / 64-byte input | Fast on CPU; brutal in SNARKs | Standard; battle-tested | Designed for hardware, not for finite fields |
| Poseidon-128, t=3, α=5 (BN254) | ~243 constraints / 2 field elements | Slow on CPU vs SHA; fast in SNARKs | Younger primitive; growing analysis | Designed for SNARKs first; the standard since 2020 |
| Rescue-Prime | ~150 constraints / 2 field elements | Slightly fewer constraints than Poseidon | Less peer review than Poseidon | Closer to a research curiosity in 2026 |
| Anemoi | ~120 constraints / 2 field elements | Newest; lowest constraint count | Very young; minimal cryptanalysis | Promising but I would not bet a production pool on it yet |

The blast-radius column is doing real work. Poseidon’s the one I’m comfortable shipping in zera-sdk right now. Rescue and Anemoi are interesting but the cryptanalysis hasn’t caught up to the deployment.

A 30-line Poseidon you can run in the browser

Here’s a complete, working Poseidon-128 over BN254, written in TypeScript with bigint arithmetic. It’s not optimised — production code uses Montgomery form, precomputes S-box squares, and uses constant-time field arithmetic — but it’s correct and small enough to read in one sitting.
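
A minimal structural sketch of the permutation in TypeScript, assuming $t = 3$, $\alpha = 5$, $R_F = 8$ and $R_P = 57$. The round constants and the matrix here are illustrative placeholders, not the Circomlib parameters, so the outputs will not match circomlib or zera-sdk. The capacity-first input layout and first-element output in `hash2` are also assumptions.

```typescript
// BN254 scalar-field modulus.
const P = 21888242871839275222246405745257275088548364400416034343698204186575808495617n;
const T = 3; // state width
const R_F = 8; // full rounds (half before, half after the partial rounds)
const R_P = 57; // partial rounds

const mod = (x: bigint): bigint => ((x % P) + P) % P;

// Square-and-multiply exponentiation in F_p.
function pow(base: bigint, exp: bigint): bigint {
  let result = 1n;
  let b = mod(base);
  for (let e = exp; e > 0n; e >>= 1n) {
    if ((e & 1n) === 1n) result = mod(result * b);
    b = mod(b * b);
  }
  return result;
}

// Field inverse via Fermat's little theorem.
const inverse = (x: bigint): bigint => pow(x, P - 2n);

// x^5 with three multiplications: x^2, x^4, x^5.
function sbox(x: bigint): bigint {
  const x2 = mod(x * x);
  const x4 = mod(x2 * x2);
  return mod(x4 * x);
}

// PLACEHOLDER round constants (one per state element per round), not the real ones.
const ROUND_CONSTANTS: bigint[][] = Array.from({ length: R_F + R_P }, (_, r) =>
  Array.from({ length: T }, (_, i) => pow(BigInt(r * T + i + 2), 7n))
);

// PLACEHOLDER Cauchy matrix M[i][j] = 1 / (i + j + T); Cauchy matrices are MDS.
const MDS: bigint[][] = Array.from({ length: T }, (_, i) =>
  Array.from({ length: T }, (_, j) => inverse(BigInt(i + j + T)))
);

function permute(input: bigint[]): bigint[] {
  let state = input.map(mod);
  for (let r = 0; r < R_F + R_P; r++) {
    // 1. Add round constants.
    state = state.map((x, i) => mod(x + ROUND_CONSTANTS[r][i]));
    // 2. S-box: every element in a full round, only the first in a partial round.
    const isFull = r < R_F / 2 || r >= R_F / 2 + R_P;
    state = state.map((x, i) => (isFull || i === 0 ? sbox(x) : x));
    // 3. Multiply the state by the matrix.
    const prev = state;
    state = MDS.map((row) => mod(row.reduce((acc, m, j) => acc + m * prev[j], 0n)));
  }
  return state;
}

// Two-to-one hash: capacity element first, output is the first state element.
function hash2(a: bigint, b: bigint): bigint {
  return permute([0n, a, b])[0];
}

console.log(hash2(1n, 2n).toString(16));
```

Swapping in real parameters means replacing `ROUND_CONSTANTS` and `MDS` with values from a vetted parameter file; the loop itself stays the same.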


The thing that’s striking when you write this out is how little there is. A SHA-256 implementation is hundreds of lines of bit-twiddling. Poseidon is essentially: add a constant, raise to the fifth power, multiply by a fixed matrix, repeat.

Why $\alpha = 5$ specifically

The S-box choice is the most-questioned part of Poseidon. Why not $\alpha = 3$? Or $\alpha = 7$?

Two constraints:

  1. Bijection. $x \mapsto x^\alpha$ is a permutation of $\mathbb{F}_p$ if and only if $\gcd(\alpha, p - 1) = 1$. For BN254, $p - 1 = 2^{28} \cdot 3 \cdot \text{(other stuff)}$, so $\alpha \in \{2, 3, 4\}$ all share a factor with $p - 1$ and produce non-bijective maps. The smallest $\alpha$ that works is 5.

  2. Algebraic degree. The whole point of the S-box is to introduce algebraic non-linearity that defeats interpolation attacks. Higher $\alpha$ → more non-linearity → fewer rounds needed. So you want $\alpha$ small enough to be cheap, large enough to need few rounds.

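For fields where $\gcd(3, p - 1) = 1$, the choice flips to $\alpha = 3$. Each S-box then costs two multiplications instead of three, but its lower degree means more rounds are needed for the same security margin. The trade-off is: cheaper per round but more rounds. (BLS12-381’s scalar field is not such a case: 3 divides $p - 1$ there as well, so Poseidon over BLS12-381 also uses $\alpha = 5$.)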

The choice of $\alpha = 5$ for the prime field of BN254 is dictated by the requirement that the S-box must be a permutation: it must hold that $\gcd(\alpha, p - 1) = 1$. The next constraint, the one that determines round counts, is the algebraic degree.

Grassi, Khovratovich, Rechberger, Roy, Schofnegger
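
The bijection condition can be checked concretely. The sketch below, with hypothetical helper names and the BN254 scalar-field modulus assumed, inverts $x^5$ using the exponent $5^{-1} \bmod (p - 1)$, then shows why $x^3$ fails: a non-trivial cube root of unity maps to the same output as 1.

```typescript
// BN254 scalar-field modulus.
const P = 21888242871839275222246405745257275088548364400416034343698204186575808495617n;

// Square-and-multiply modular exponentiation.
function modPow(base: bigint, exp: bigint, m: bigint): bigint {
  let result = 1n;
  let b = ((base % m) + m) % m;
  for (let e = exp; e > 0n; e >>= 1n) {
    if ((e & 1n) === 1n) result = (result * b) % m;
    b = (b * b) % m;
  }
  return result;
}

// Extended Euclid: inverse of a modulo m; throws when gcd(a, m) != 1.
function modInverse(a: bigint, m: bigint): bigint {
  let [oldR, r] = [a % m, m];
  let [oldS, s] = [1n, 0n];
  while (r !== 0n) {
    const q = oldR / r;
    [oldR, r] = [r, oldR - q * r];
    [oldS, s] = [s, oldS - q * s];
  }
  if (oldR !== 1n) throw new Error("not invertible");
  return ((oldS % m) + m) % m;
}

// alpha = 5: an inverse exponent d exists, so x -> x^5 can be undone.
const d = modInverse(5n, P - 1n);
const x = 123456789n;
console.log(modPow(modPow(x, 5n, P), d, P) === x); // true

// alpha = 3: search for omega != 1 with omega^3 = 1; it collides with 1 under x^3.
function cubeRootOfUnity(): bigint {
  for (let g = 2n; ; g += 1n) {
    const w = modPow(g, (P - 1n) / 3n, P);
    if (w !== 1n) return w;
  }
}
const omega = cubeRootOfUnity();
console.log(omega !== 1n && modPow(omega, 3n, P) === 1n); // true
```

Calling `modInverse(3n, P - 1n)` throws, which is the same failure seen from the exponent side.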

What I would change in a v2

Three things, if I were re-designing Poseidon for 2027:

  1. Drop the partial-rounds split. The original design has 8 full + 57 partial rounds; the partial rounds save a lot of constraints but make security analysis harder. Poseidon2 (Grassi, Khovratovich, Schofnegger 2023) keeps a similar structure with cleaner analysis. I’d ship Poseidon2 by default in a fresh deployment.

  2. Make the MDS matrix circulant. A circulant MDS — where each row is a rotation of the previous — has identical security properties but lets you exploit FFT-friendly arithmetic. Worth it on the prover side.

  3. Standardise the parameter file format. Every implementation rolls its own format for round constants. The Circomlib JSON format works, but a CBOR or Cap’n Proto schema would let implementations cross-check parameters in a way that’s currently per-vendor. I keep the Circomlib JSON in zera-sdk because compatibility, not because it’s the right choice.
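
For the circulant idea in point 2, here is a minimal sketch that builds a circulant matrix from its first row and brute-force checks the MDS property (every square submatrix has a non-zero determinant). The first row $(2, 1, 1)$ is an illustrative choice, the helper names are hypothetical, and the exhaustive check is only practical for small $t$.

```typescript
// BN254 scalar-field modulus.
const P = 21888242871839275222246405745257275088548364400416034343698204186575808495617n;
const mod = (x: bigint): bigint => ((x % P) + P) % P;

// Row i is the first row rotated right by i positions.
function circulant(firstRow: bigint[]): bigint[][] {
  const n = firstRow.length;
  return firstRow.map((_, i) => firstRow.map((_, j) => firstRow[(j - i + n) % n]));
}

// Determinant mod P by Laplace expansion along the first row (fine for tiny matrices).
function det(m: bigint[][]): bigint {
  if (m.length === 1) return mod(m[0][0]);
  let total = 0n;
  for (let j = 0; j < m.length; j++) {
    const minor = m.slice(1).map((row) => row.filter((_, c) => c !== j));
    const sign = j % 2 === 0 ? 1n : -1n;
    total += sign * m[0][j] * det(minor);
  }
  return mod(total);
}

// All non-empty subsets of {0, ..., n - 1}, as index lists.
function subsets(n: number): number[][] {
  const out: number[][] = [];
  for (let mask = 1; mask < 1 << n; mask++) {
    const picked: number[] = [];
    for (let i = 0; i < n; i++) {
      if (((mask >> i) & 1) === 1) picked.push(i);
    }
    out.push(picked);
  }
  return out;
}

// MDS check: every square submatrix must be non-singular.
function isMds(m: bigint[][]): boolean {
  const index = subsets(m.length);
  return index.every((rows) =>
    index
      .filter((cols) => cols.length === rows.length)
      .every((cols) => det(rows.map((r) => cols.map((c) => m[r][c]))) !== 0n)
  );
}

console.log(isMds(circulant([2n, 1n, 1n]))); // true
console.log(isMds(circulant([1n, 1n, 1n]))); // false
```

A circulant matrix is fully described by its first row, so a parameter file would carry $t$ values instead of $t^2$.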

Where this goes in production

Inside zera-sdk the Poseidon implementation is crates/zera-sdk-core/src/hash/poseidon.rs. It’s about 200 lines of safe Rust, written against the ff crate for field arithmetic, with the round constants loaded from a JSON file extracted from Circomlib for cross-implementation parity.
