# Skill Issue Dev | Dax the Dev

> I'm a Nuclear Engineer turned Software Engineer. I'm passionate about learning and sharing my knowledge with others. I'm currently working on a few projects and I'm always looking for new opportunities to learn and grow.

## Posts

- [The post-quantum migration path: lattice commitments, STARK wrapping, isogeny credentials](https://blog.skill-issue.dev/blog/post_quantum_relayerless_path/): Series finale. Shor's algorithm breaks every elliptic-curve assumption F_RP currently rests on. The migration: lattice polynomial commitments (Brakedown/Orion), hash-based STARKs as universal backend, isogeny group actions for credentials.
- [MEV resistance: why UPEE is sandwich-proof by construction](https://blog.skill-issue.dev/blog/mev_resistance_in_private_execution/): Theorem 7.3 — UPEE transactions resist sandwich/frontrun/liquidation MEV by construction. Theorem 7.4 — block MEV bounded by public-bit leakage, not transaction value. Independent of V, not super-linear.
- [F_RP vs Zcash, Tornado, RAILGUN, Aztec, Penumbra, Aleo, Namada, Monero](https://blog.skill-issue.dev/blog/f_rp_vs_existing_privacy_systems/): F_RP vs nine deployed privacy systems on the four axes that matter: relayer-free, Turing-complete, on-chain verifiable on a high-perf L1, low-trust setup.
- [Fitting F_RP in 656 bytes on Solana](https://blog.skill-issue.dev/blog/solana_instantiation_656_bytes/): Concrete F_RP instantiation on Solana. Groth16 over BN254, Poseidon Merkle, indexed nullifier tree, BN254 Pedersen, transaction in 656 of 1,232 bytes, 235K of 1.4M CU.
- [UPEE: composing SPST + PPST + TAB into one framework](https://blog.skill-issue.dev/blog/upee_universal_private_execution/): F_RP Construction IV. The five-algorithm tuple Setup/Deploy/Invoke/Verify/Finalize plus the simulation-based privacy theorem (3.12) and self-sovereignty theorem (3.13). The composition that makes the whole thing deployable.
- [Bayer-Groth verifiable shuffles for network-layer privacy](https://blog.skill-issue.dev/blog/verifiable_shuffles_for_privacy/): F_RP Construction III, Approach C. Bayer-Groth verifiable shuffles obscure the input→output permutation of a batch with O(√n) proof size — used to cascade-mix pre-broadcast batches at the network layer.
- [TAB: hiding the submitter with ring signatures and FROST](https://blog.skill-issue.dev/blog/tab_threshold_anonymous_broadcast/): F_RP Construction III. ZK proofs hide the contents but the wrapping Solana tx still leaks the submitter pubkey. TAB closes that gap with a Fujisaki-Suzuki ring signature and a FROST threshold Schnorr over Ed25519.
- [On the death of the trusted setup](https://blog.skill-issue.dev/blog/on_the_death_of_the_trusted_setup/): Universal SRS, transparent FRI, and why Groth16's per-circuit ceremony feels anachronistic in 2026 — even when, as ZERA does, you're still using one. A history of the ceremonies that worked, the ones that didn't, and what comes next.
- [WASM-native proving for ZK SDKs: an SDK author's take](https://blog.skill-issue.dev/blog/wasm_native_proving_sdk_authors_take/): Why zera-sdk ships native Rust on Node and snarkjs in the browser — and what it would actually cost to ship a WASM-compiled Rust prover for the browser path. A design post about the dual-target build pipeline.
- [Plonky3, the small-fast-cheap revolution](https://blog.skill-issue.dev/blog/plonky3_small_fast_cheap/): Why plonky3 — small fields, FRI commitments, no trusted setup — is the proof system to watch in 2026. The Mersenne31 / BabyBear / Goldilocks landscape, the FRI folding step, and why your laptop is suddenly a viable prover.
- [Recursive proof composition without the abyss: Halo to Nova](https://blog.skill-issue.dev/blog/recursive_proofs_halo_to_nova/): The path from Halo's accumulation scheme to Nova's folding scheme, derived from the recurrence relation. Where Halo2, Nova, SuperNova, and HyperNova actually differ, and which one to reach for in 2026.
- [PPST: extending SPST to arbitrary private computation](https://blog.skill-issue.dev/blog/ppst_private_programmable_state/): F_RP Construction II. Generalises SPST to private programmable state: arbitrary arithmetic circuits over committed pre/post-state, with R1CS-embedded program execution and atomic PPST-SPST composition.
- [Halo2 in 2026: what changed since the Zcash era](https://blog.skill-issue.dev/blog/halo2_in_2026_what_changed/): A survey of the Halo2 ecosystem six years after the Zcash team published it — what stayed the same (PLONKish, lookups, IPA), what evolved (KZG, gadget libraries, fork landscape), and what we ship today.
- [From sailor to CEO in three acts](https://blog.skill-issue.dev/blog/sailor_to_ceo_three_acts/): A short memoir of a strange decade — Navy reactor compartments, a bitcoin mine, ConsenSys-USAA-PMG, and the arc that ended at Zera Labs. The interesting question is not how I got here. It is where everyone else is going.
- [SPST: a self-paying shielded transaction model](https://blog.skill-issue.dev/blog/spst_self_paying_shielded_transactions/): First construction in F_RP. The SPST relation, balance conservation under DLOG, double-spend resistance under collision-resistant PRF, unlinkability under DDH, simulation-extractable non-malleability.
- [Circom, by example](https://blog.skill-issue.dev/blog/circom_by_example/): A DSL primer told through one circuit — proving knowledge of a Poseidon pre-image. Every Circom keyword annotated as it appears, the constraint graph drawn out, and the R1CS fall-through to a witness.
- [Proving in the browser, by the numbers](https://blog.skill-issue.dev/blog/proving_in_the_browser_by_the_numbers/): What is actually feasible inside a browser tab in 2026 — Groth16 prover times for Poseidon, Range, and Merkle circuits, the WASM threading story, and where the main thread stops being a viable home for your prover.
- [Merkle inclusion proofs over compressed account state on Solana](https://blog.skill-issue.dev/blog/merkle_inclusion_compressed_solana/): How a 32-byte hash and a logarithmic path replace a multi-kilobyte account. Walk the tree-height math, the Light Protocol compressed-account model, and an inclusion-proof construction you can run in Node.
- [The fee paradox: why every smart-contract privacy mixer needs a relayer](https://blog.skill-issue.dev/blog/the_fee_paradox/): On account-model chains the very act of paying a transaction fee deanonymises the recipient. This post formalises the paradox, walks through three resolutions, and sets up the SPST construction that resolves it inside the ZK proof itself.
- [Relayerless privacy on a Turing-complete L1: an intro to F_RP](https://blog.skill-issue.dev/blog/relayerless_privacy_intro/): A series-opening map of the relayerless full-privacy framework I've been writing up. Five cryptographic games, four constructions (SPST, PPST, TAB, UPEE), one main theorem — and why it matters that the target chain is Solana.
- [Cross-compiling vantad for darwin: Apple Silicon, sign + notarise](https://blog.skill-issue.dev/blog/vanta_darwin_apple_silicon_build/): Shipping vantad as a notarised Mac binary inside a Tauri app meant fixing libconsensus link order, building Rust release with the right target triple, signing every sidecar, and stapling the DMG separately. The notes from the trenches.
- [Vanta Desktop: a Tauri wallet that ships its own full node](https://blog.skill-issue.dev/blog/vanta_desktop_tauri_wallet/): Most desktop wallets are thin RPC clients that talk to somebody else's node. The Vanta desktop app spawns vantad and the L2 sidecar as Tauri sidecar binaries, owns their PIDs, and adopts orphans on restart. Here is how that came together.
- [The vanta sidecar: how a Rust ZK indexer talks to a C++ Bitcoin node](https://blog.skill-issue.dev/blog/vanta_sidecar_architecture/): vantad is C++. The ZK index is Rust. They cooperate over RPC and a REST API, with the C++ verifier linked statically through libvanta_verifier.a. Here is the audit-surface trade we made and what the sidecar actually does.
- [Why we shipped SP1 instead of RISC Zero](https://blog.skill-issue.dev/blog/vanta_sp1_zkvm_circuits/): Vanta's earliest design notes said 'RISC Zero zkVM.' Production ships SP1 + Plonky3. The swap was cheap because the privacy protocol is independent of the prover. Here is why we moved, what stayed the same, and what the FFI verifier looks like.
- [Tauri 2.x sidecars in anger: the ergonomics paper-cuts I had to fix](https://blog.skill-issue.dev/blog/vanta_tauri_ergonomics/): externalBin wants a target-triple suffix nobody documents loudly enough. The dev resolver walks up parents. Startup must be sequenced. The setup-sidecars.sh + resolve_binary() story for shipping a wallet that runs its own node.
- [Vanta: a Bitcoin fork with ZK at consensus](https://blog.skill-issue.dev/blog/vanta_zk_privacy_l1/): 42 billion supply. 1-minute blocks. RISC Zero proofs verified at consensus. The opinionated answer to 'why fork Bitcoin in 2026?' is that you're not really forking Bitcoin — you're shipping a different L1 that has Bitcoin's surface area.
- [Poseidon, by hand and by code](https://blog.skill-issue.dev/blog/poseidon_by_hand_and_by_code/): Why one of the cheapest hashes in zero-knowledge cryptography also has the strangest insides. Derive the S-box, count the constraints, and run a 30-line implementation in the browser.
- [Stuck Sell, Post-Graduation: Fixing a Trapped-Funds Bug Without a Redeploy](https://blog.skill-issue.dev/blog/stuck_sell_post_grad/): A graduated launchpad token left users unable to sell. Fix shipped without redeploying the program: a frontend conversion path that withdraws SPL, compresses, then sells through the AMM.
- [Being CEO and still shipping code](https://blog.skill-issue.dev/blog/being_ceo_and_still_shipping_code/): The CTO-vs-CEO false dichotomy, why I still review every PR that touches the SDK core, and how I use Claude Code plus an MCP server over my own writing to keep technical leverage as the company grows.
- [btc-tunnel.sh: SSH-jumping into a remote bitcoind for swap testing](https://blog.skill-issue.dev/blog/vanta_btc_tunnel_dev_environment/): Three small bash scripts wire the desktop dev environment to a real mainnet bitcoind for atomic-swap testing. Tunneling, RPC wrapping, and an address watcher with auto-reconnect — and why exposing 8332 to the internet is a worse idea than you think.
- [Block explorers for privacy chains: a Rust indexer for vanta](https://blog.skill-issue.dev/blog/vanta_explorer_rust_indexer/): Patching btc-rpc-explorer got us to 'works.' Then we wrote vanta-explorer in Rust + React: an Axum backend, SQLite indexer, and a SPA that renders shielded transfers as opaque commitments without lying about what it knows.
- [iroh in production: encrypted-note gossip on a 1-minute-block chain](https://blog.skill-issue.dev/blog/vanta_iroh_gossip_in_production/): Why vanta-node uses iroh-gossip for L2 P2P instead of libp2p, what the topic + ALPN setup actually looks like, the GossipMessage shape, and the saturating-decrement bug that taught me an event ordering lesson.
- [L1 nullifier sets: enforcing no-double-spend at consensus](https://blog.skill-issue.dev/blog/vanta_l1_nullifier_set/): Most privacy chains track spent notes in a wallet-side index and pray. Vanta puts the nullifier set in chainstate and lets the consensus rules do the praying. Here's why that line moved, and what it costs.
- [What's in vanta/papers — reading 17 design docs in 2026](https://blog.skill-issue.dev/blog/vanta_papers_design_doc_tour/): Vanta ships its whitepaper as 17 markdown files in the repo, not a PDF on a marketing page. This is the tour: what each doc covers, which one has the wording bug, and why the docs live next to the code.
- [Private atomic swaps and the price-discovery problem](https://blog.skill-issue.dev/blog/vanta_private_atomic_swaps/): BTC ↔ VANTA atomic swaps via HTLC are the easy part. If the VANTA leg is shielded, no observer can compute the rate, and no rate means no public price. Walking through six designs and the hybrid recommendation in vanta/planning.
- [BIP-199 by hand: a code walk through vanta-swap](https://blog.skill-issue.dev/blog/vanta_swap_htlc_walkthrough/): A line-by-line tour of the Rust HTLC state machine that drives BTC ↔ VANTA atomic swaps. Redeem script bytes, the 2x/1x timelock dance, BIP143 sighash binding, and the witness layout that makes refund and claim routes provably distinct.
- [The unified dashboard: collapsing private and transparent into one wallet view](https://blog.skill-issue.dev/blog/vanta_unified_dashboard_wallet_ui/): Two pages — one for private balance, one for transparent — taught users to think in two heads. The 2026-04-17 commit folded them. The wallet now shows one balance, one feed, with the privacy boundary inside the data, not the URL.
- [The vanta wallet HTTP API: an Axum bridge to vantad RPC](https://blog.skill-issue.dev/blog/vanta_wallet_axum_api/): Before the Tauri desktop wallet there was an Axum web wallet. It is a five-route Rust service that wraps vantad's JSON-RPC and serves a single static page. Boring on purpose — and the boring is the point.
- [Stratum v1, the from-scratch Python version](https://blog.skill-issue.dev/blog/vanta_stratum_python_pool/): Solo mining Vanta requires a Stratum server. Public-pool is fine for normal chains; mandatory privacy pushes the pool toward shielded coinbases, encrypted-note submission, and an L2 retry queue. pool/stratum_server.py does it all in stdlib Python.
- [Mining VANTA with a Bitaxe BM1368](https://blog.skill-issue.dev/blog/mining_vanta_with_a_bitaxe/): A 350 GH/s, ~12 W open-hardware ASIC plugged into a Stratum server I wrote against my own L1. Solo mining isn't economic on Bitcoin in 2026. On a 1-minute-block fork with 100k subsidy, the math changes.
- [Why BN254, and when to switch off it](https://blog.skill-issue.dev/blog/why_bn254_and_when_to_switch/): BN254 is the default curve for production ZK in 2026. The 128-bit security claim is no longer 128 bits, and BLS12-381 is gaining ground. Here is the math, the deployment reality, and the migration path.
- [Privacy's broadband moment](https://blog.skill-issue.dev/blog/privacys_broadband_moment/): ZK got fast, hardware got attestable, AI agents started carrying their own wallets, and regulators stopped trying to ban math. Four curves crossed and privacy stopped being a research topic — it became infrastructure.
- [Generating mempool with a Rust txbot](https://blog.skill-issue.dev/blog/vanta_txbot_synthetic_mempool/): Empty blocks lie. A new chain whose miners are mining empty templates is not exercising any of the code that fails in production. The txbot is a 200-line Rust loop that round-robins coins through 114 addresses to keep mempool honest.
- [Latitude bare-metal primary, Fly.io backup: the deploy story for a 1-min-block chain](https://blog.skill-issue.dev/blog/vanta_flytoml_latitude_baremetal/): Vanta v1 went LIVE on a Latitude bare-metal box at 64.34.82.145:9333 with a Fly.io seed fleet as auto-failover. Why a 1-min-block chain hates cold starts, what the fly.toml has to say about it, and the cost math that picks bare metal.
- [The MCP server inside zera-sdk](https://blog.skill-issue.dev/blog/mcp_server_inside_zera_sdk/): Most SDKs ship as a library. zera-sdk also ships as a Model Context Protocol server. Here is why an AI agent should be able to call shielded-pool primitives directly, and how we keep that interface from becoming a footgun.
- [Range proofs in 80 lines: Pedersen commitments and a tiny Bulletproof](https://blog.skill-issue.dev/blog/range_proofs_in_80_lines/): How a Bulletproof actually compresses a range proof to logarithmic size. Derive the inner-product argument from scratch, run a toy prover/verifier in the browser, and pick the right range-proof primitive for 2026.
- [Nullifiers without the witchcraft](https://blog.skill-issue.dev/blog/nullifiers_without_witchcraft/): Nullifier Generation is on the ZERA front page next to Pedersen Commitments and Zero-Knowledge Proofs. The Rust + TypeScript implementations are six lines apiece. Here is what they actually do, and why the design borrows from Zcash.
- [Pedersen commitments, in production](https://blog.skill-issue.dev/blog/pedersen_commitments_in_production/): ZERA marketing says "Pedersen Commitments" on the cryptography page. The SDK ships Poseidon. Both are right — and the gap between them is the whole story of what shipping ZK in 2026 actually looks like.
- [144 Tests and a Surfpool Devnet](https://blog.skill-issue.dev/blog/zera_sdk_test_suite/): How the Zera SDK got from "scaffolded" to "trustable" — a 144-test Vitest suite, a Surfpool-forked devnet running on a Latitude box, and a quickstart that actually works.
- [Building the ZERA Wallet for desktop, iOS, and Android](https://blog.skill-issue.dev/blog/zera_wallet_three_platforms/): Three platforms, one shielded pool, one design system. The trade-offs of building a wallet that has to feel like cash on a phone, like a tool on a laptop, and the same on both.
- [Zera Wallet v3: ZK Proofs in a Tauri Webview](https://blog.skill-issue.dev/blog/zera_wallet_v3_zkp/): A Tauri 2 desktop wallet that proves Groth16 in the browser, persists encrypted notes locally, talks NFC to physical bearer cards, and never lets the private key out of Rust.
- [x402 Vector 2: partial-signing instruction injection](https://blog.skill-issue.dev/blog/x402_partial_signing_injection/): The x402 client builds and partially signs the entire VersionedTransaction. A facilitator that validates structure but not bytes can co-sign a tx with extra clawback / drain instructions appended after the legitimate transfer.
- [x402 Vector 1: settlement race condition](https://blog.skill-issue.dev/blog/x402_settlement_race_condition/): Coinbase x402's verify→settle pipeline isn't atomic. A client can submit the same PAYMENT-SIGNATURE to multiple facilitators in parallel, or race the facilitator with a direct on-chain submission. Double-spend within blockhash validity (~60s).
- [x402 Vector 3: facilitator gas drain](https://blog.skill-issue.dev/blog/x402_facilitator_gas_drain/): x402 facilitators pay all transaction fees and the spec defines no per-client rate limit. A flood of valid-looking transactions that fail at maximum compute-unit consumption is a per-request economic attack on the facilitator.
- [SOLMAL: the x402 attack surface (series intro)](https://blog.skill-issue.dev/blog/x402_attack_surface_intro/): Mapping the attack surface of Coinbase's x402 micropayment protocol on Solana. Series intro covering the verify→settle pipeline, the actor model, the 9 vectors, and the responsible-disclosure timeline.
- [Building the Zera SDK: Day One](https://blog.skill-issue.dev/blog/zera_sdk_scaffolding/): Sixteen commits in fourteen minutes. The first day of the @zera-labs/sdk monorepo — Rust core via neon-rs, TypeScript scaffolding, Poseidon, Merkle trees, ZK provers, and an MCP server for AI agents.
- [Cruiser: A Tauri Hookup App on iroh, Geohash-Bucketed Presence, and Why P2P Dating Is Actually Fine](https://blog.skill-issue.dev/blog/cruiser_iroh_gossip_p2p/): A Tauri 2 + React + iroh-gossip dating app where peers find each other by geohash, broadcast presence on a topic-per-bucket, and DM each other with consent signals — all without a central server. The architecture is the product.
- [Why I started Zera Labs](https://blog.skill-issue.dev/blog/why_i_started_zera_labs/): Three things became true in the same year — ZK got fast enough, Solana got cheap enough, and AI agents needed verifiable money. Sitting at the intersection felt like a ship date, not a thesis.
- [Prediction Markets, LP Locks, and an Admin Page That Doesn’t Suck](https://blog.skill-issue.dev/blog/prediction_markets_admin/): How I bolted CPMM prediction markets onto ZeraSwap, locked LP for graduated tokens, and built a 5-tab admin panel before the first malicious actor showed up.
- [Five Commits to Get an OG Image Out of a Cloudflare Worker](https://blog.skill-issue.dev/blog/og_pngs_cf_workers/): A 24-minute slog where I got dynamic OG PNG generation to work on Cloudflare Pages Functions. The bug is WebAssembly. The fix is a build-time WASM import.
- [ZeraSwap: An AMM for Compressed Tokens](https://blog.skill-issue.dev/blog/zeraswap_compressed_amm/): Initial commit of the first compressed-token AMM on Solana — Anchor program, x*y=k math, SOL/cToken pairs, and the cyberpunk launchpad UI that grew up around it.
- [ZK-FHIR: A Medical Demo That Doesn’t Leak Patients](https://blog.skill-issue.dev/blog/zera_med_zk_fhir/): Building a RISC Zero zkVM gateway for FHIR-shaped medical records — proofs over private patient data, zero-knowledge insurance claims, and HIV/STI compartmentalization.
- [A Privacy Demo That Works on a Phone: Mobile Drawer, HUD Offsets, and Real Breach Data](https://blog.skill-issue.dev/blog/zera_med_responsive_hud/): Bolting a mobile drawer onto the Zera Med ZK-FHIR demo without breaking the desktop sidebar, fixing AnimatePresence warnings, and updating PrivacyChallenge with 2024-2025 breach data.
- [Zera Janitor: Closing Solana Dust Accounts in Leptos WASM](https://blog.skill-issue.dev/blog/zera_janitor_leptos_wasm/): A Solana program + Leptos 0.7 frontend that scans your wallet for empty SPL token accounts, batches up to 25 closes per transaction via CPI, and pays you back 95% of the rent. The fee path is the actual interesting part.
- [Rebranding to m0n3y and Writing Crypto Docs Like You're 10](https://blog.skill-issue.dev/blog/m0n3y_eli5_rebrand/): The DAXSO → M0N3Y rebrand commit, the burn-to-earn explainer for degens, and an ELI10 walk-through of zk-shielded notes that does not mention the word "circuit" once.
- [Empowering Local Crypto Advocacy](https://blog.skill-issue.dev/blog/congress_crypto/):
- [m0n3y: Naming a Dream](https://blog.skill-issue.dev/blog/m0n3y_naming_a_dream/): The docs site that came before the code. Looking back at the m0n3y-web init commit and the voting proposal that was supposed to fix DAO whales.
- [TW-TVV: Why Token-Quantity Voting Is Broken, and the Math I Tried to Fix It With](https://blog.skill-issue.dev/blog/m0n3y_tw_tvv_governance/): A full walk-through of the Time-Weighted Tiered Value Voting proposal I drafted for $M0N3Y in 2025. Five tiers, time multipliers, log-scaled volume, and why every variable in the formula is a knob fighting a different attack.
- [Building A Better Cryptocurrency: What We Should Have Done](https://blog.skill-issue.dev/blog/a_better_crypto/): A technical proposal for a truly decentralized digital cash system
- [Listening to the Bluesky Firehose for Accidental Haikus](https://blog.skill-issue.dev/blog/bsky_haiku_firehose/): A Rust firehose listener that decodes ATProto CAR frames live, runs whatlang + syllarust on every English post, and saves the ones that scan as 5-7-5 haikus to disk. There were a lot of haikus.
- [You are thinking about AI wrong.](https://blog.skill-issue.dev/blog/rethink_ai/): We have had how many decades of Science Fiction to prepare us for the future of AI, and yet we are still thinking about it wrong.
- [Rusty Pipes Exploit](https://blog.skill-issue.dev/blog/rusty_pipes_exp/): Using Rust to inject malicious code into npm packages. And hijack your entire node runtime.
- [Youtube Wasting Money on Fake Livestreams](https://blog.skill-issue.dev/blog/ways_to_burn_money_at_google/): One of the biggest ways YouTube is wasting its money is promoting scam and spam prerecorded livestreams.
- [Hungry Git: A Quick Guide to Hacking Orgs and Bots](https://blog.skill-issue.dev/blog/hacking_bots/): Recently more and more people are talking about how insecure GitHub is. This article will show you how to exploit GitHub organizations and bots to get what you want.
- [What running a Bitcoin mine taught me about cloud margins](https://blog.skill-issue.dev/blog/what_running_a_bitcoin_mine_taught_me/): A short stint at Foundry Digital running ASIC fleets, immersion vs. air, the depreciation curve, and the brutal arithmetic of difficulty adjustments — and why I never stopped thinking like an operator after I went back to writing software.
- [Nuclear reactors taught me to ship software](https://blog.skill-issue.dev/blog/nuclear_reactors_taught_me_to_ship/): Watchstanding, casualty drills, and pre-task briefs map onto code review, on-call, and disaster recovery more cleanly than any management book I have ever read.
- [process-thing: An LSB Watermarker for upload-thing, Written in Rust via Neon](https://blog.skill-issue.dev/blog/process_thing_lsb_watermark/): A Rust npm package that embeds invisible watermarks in the least significant bit of every red channel pixel. Built for upload-thing image preprocessing. Cross-compiled for 7 platforms. The README is one paragraph.
- [Rust in Peace: How to Hijack Node.js with a Single Require](https://blog.skill-issue.dev/blog/rusty_pipes_building_supply_chain_malware_for_npm/): Discover how to exploit the Node.js ecosystem with Rust-based supply chain malware. Learn about the vulnerabilities in npm packages and how a single require line can compromise JavaScript projects. Explore security measures to prevent such attacks.
- [The Difference Between Publishers and Developers](https://blog.skill-issue.dev/blog/skg_fixes/): A lot of the time, when gamers have a problem they blame the developers. But who are they really mad at? Time to take a breath and actually learn who is doing what to whom and how often.
- [Stop Killing Games: A Pricing Thought Experiment](https://blog.skill-issue.dev/blog/stop_killing_games_a_pricing_thought_experiment/): After talking with industry and business professionals, a very interesting expectation of what will happen was put forward.
- [The Flaws of the #StopKillingGames Initiative: A Developer’s Perspective](https://blog.skill-issue.dev/blog/stop_killing_games/): Surprise, I am not a fan of the Stop Killing Games initiative. It is a flawed approach to addressing the issues in the gaming industry. Let me explain why.
- [Origins of Foo and Bar](https://blog.skill-issue.dev/blog/origins_of_foo_and_bar/): Foo and Bar: where did they come from?
- [What is RISC V](https://blog.skill-issue.dev/blog/what_is_risc_v/): What is RISC-V, why is it so cool, and why is it so important?
- [Embedded AI](https://blog.skill-issue.dev/blog/embedded_ai/): Unlocking the potential of the Milk-V Duo with embedded AI and Linux-based interrupt handling
- [Rusty Pipes](https://blog.skill-issue.dev/blog/rusty_pipes/): An npm supply chain exploit that checks which packages you contribute to, then injects a malicious Rust binary into the next release.
- [Developers in the Job Market](https://blog.skill-issue.dev/blog/developers_in_the_job_market/): Recent studies reveal an alarming increase in fake job postings. This article explores the economic implications of fake job postings and the challenges faced by job seekers in the current market.
- [Rust Type Abuse for Beginners](https://blog.skill-issue.dev/blog/rust_type_abuse_for_beginners/): Explore some simple type system abuse and hacks to get used to the Rust model and syntax of Types
- [Abusing Ts Type System](https://blog.skill-issue.dev/blog/abusing_ts_type_system/): Dive into the world of TypeScript and explore the fascinating `Exclude` utility type.
- [Introducing the Milk V](https://blog.skill-issue.dev/blog/introducing_milkv/): Milk-V Duo is an ultra-compact embedded development platform. It can run Linux and RTOS, providing a reliable, low-cost, and high-performance platform for professionals, industrial ODMs, AIoT enthusiasts, DIY hobbyists, and creators.
- [Nix-flakes and Bun](https://blog.skill-issue.dev/blog/nixos_bunjs/): Small update to my development flow and focus. How to get up and running with Bun.js in NixOS.
- [How Random is a Local LLM? A Rust Benchmark with Redis](https://blog.skill-issue.dev/blog/ai37_llm_random_numbers/): A Rust harness that asks Ollama models for "a random number between 1 and 100" thousands of times, parses every response with regex, stores results in Redis, and pits them against a real RNG. Spoiler: 42 wins.
- [Blazingly Fast Drinks: A Repo I Made For The Bit](https://blog.skill-issue.dev/blog/glug_blazingly_fast_drinks/): A Clerk + Next.js + Expo turborepo I called "glug" with the description "Blazingly Fast Drinks". The README never mentioned drinks. The repo description carried the entire joke.

## About

- [About Dax the Dev](https://blog.skill-issue.dev/about)

---

# The post-quantum migration path: lattice commitments, STARK wrapping, isogeny credentials

Canonical: https://blog.skill-issue.dev/blog/post_quantum_relayerless_path/
Description: Series finale. Shor's algorithm breaks every elliptic-curve assumption F_RP currently rests on. The migration: lattice polynomial commitments (Brakedown/Orion), hash-based STARKs as universal backend, isogeny group actions for credentials.
Published: 2026-05-16T15:00:00.000Z
Tags: zk, post-quantum, lattice, stark, csidh, sqisign, phd

import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx";

The whole F_RP framework, as written today, is **completely broken by Shor's algorithm**. Every elliptic-curve assumption the construction rests on — DLOG on Curve25519, q-PKE on BN254, q-DLOG on the Pasta cycle — falls in polynomial time on a sufficiently large fault-tolerant quantum computer. Pedersen commitments lose binding. Groth16 loses soundness. Ed25519 signatures lose unforgeability. The entire stack is a pre-quantum house.

Today this is fine. NIST estimates a cryptographically-relevant quantum computer is still 10-20 years away. But "we'll fix it later" is exactly how we got into the SHA-1 / RSA-1024 / DES situation. The right time to design the migration path is **before** there's a deadline.
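To make "Pedersen commitments lose binding" concrete, here is a toy sketch. A brute-force discrete log stands in for Shor, and a 467-element group stands in for BN254; every number and name below is illustrative, not from the F_RP construction:

```python
# Why Pedersen binding dies with DLOG: once an attacker knows
# x = log_g(h), any commitment C = g^m * h^r can be reopened to ANY
# message m'. The brute-force loop below stands in for Shor's algorithm.
p, q = 467, 233          # toy group: p = 2q + 1, both prime
g, h = 4, 9              # two generators of the order-q subgroup of Z_p^*

def commit(m, r):
    """Pedersen commitment C = g^m * h^r mod p."""
    return (pow(g, m, p) * pow(h, r, p)) % p

# "Shor": recover x = log_g(h) (here by exhaustive search)
x = next(e for e in range(1, q) if pow(g, e, p) == h)

m, r = 42, 17
C = commit(m, r)

# Equivocate: find r2 opening the SAME commitment to a different message.
# From m2 + x*r2 = m + x*r (mod q):  r2 = r + (m - m2) * x^{-1} mod q
m2 = 99
r2 = (r + (m - m2) * pow(x, -1, q)) % q
assert commit(m2, r2) == C   # binding is gone
```

The same one-line algebra applies verbatim over BN254's 𝔾_1 once the discrete log of `h` is computable.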
This is post 11 of 11 in the [relayerless-privacy](/series/relayerless-privacy) series — the finale.

## What Shor breaks, in one paragraph

Shor's algorithm gives a quantum polynomial-time reduction from integer factoring and discrete log to period-finding. RSA, classical Diffie-Hellman, DSA, ECDSA, Ed25519, Schnorr signatures, BLS signatures, Pedersen commitments, pairings — every cryptographic primitive that relies on the hardness of DLOG or factoring is compromised.

For F_RP specifically, the four broken pieces:

1. **Groth16 over BN254.** q-PKE / q-DLOG → polynomial-time forgery.
2. **Pedersen commitments over BN254.** DLOG → binding broken; commitments become equivocable.
3. **Ed25519 signatures.** DLOG → forgery, including FROST threshold variants.
4. **CSIDH and other classical isogeny constructions.** Kuperberg's quantum sub-exponential algorithm threatens them more aggressively than classical attacks.

Hash-based primitives survive: SHA-2, SHA-3, Keccak, Poseidon (modulo round-by-round cryptanalysis). FRI / STARK proofs survive because they only depend on hash collision resistance, not on any algebraic structure. Lattice-based primitives (Module-LWE, Module-SIS) survive under best-known quantum attacks.

## The replacement stack

| Pre-quantum component | Broken by Shor | Post-quantum replacement | Cost |
|-----------------------|----------------|--------------------------|------|
| Groth16 (BN254) | Yes | STARK (FRI) inner + lattice-SNARK outer | 5-20 KB proof; ~300K CU |
| Pedersen (BN254 𝔾_1) | Yes | Lattice (Module-LWE) commitment | 4-50 KB |
| Ed25519 + FROST | Yes | SQIsign / lattice signatures | 200-2000 B sig |
| KZG (BN254 pairing) | Yes | FRI or Brakedown / Orion | O(log²n) hashes |
| Poseidon hash | No (classical and quantum CR) | Same; possibly Anemoi | unchanged |

Post-quantum F_RP is **5-20 KB per proof** instead of 128 bytes, **~300K CU on-chain** instead of ~150K, and **~5 KB per signature** instead of 64 bytes.
The framework still works; the costs grow ~50-100× on every dimension.

## Lattice-based polynomial commitments

The replacement for KZG is a polynomial commitment based on Module-LWE / Module-SIS. Two leading candidates:

### Brakedown (Golovnev, Lee, Setty, Thaler, Wahby — CRYPTO 2023)

Linear-time SNARK based on linear-code polynomial commitments. The prover commits to multilinear polynomials using a linear-time encodable error-correcting code, combined with the Spartan polynomial IOP.

- Prover: `O(N)` field operations for `N`-sized R1CS.
- Proof size: ~1.5 MB for `2^20` multiplication gates (before code-switching compression).
- Verification: linear in proof size.
- No trusted setup.
- Plausibly post-quantum secure (security from collision-resistant hashing + linear-code distance).

The **`O(√N)`** base proof size is the killer for Solana — even the 4,096-byte SIMD-0296 limit isn't enough for a raw Brakedown proof on a meaningful circuit.

### Orion (Xie, Zhang, Song — CRYPTO 2022)

`O(N)` prover time with `O(log²N)` proof size via **code-switching composition**. The code-switching mechanism reduces proof size from `O(√N)` to polylogarithmic by proving that the witness of a secondary zero-knowledge argument coincides with the message in a linear code. Numbers are still rough — ~10 KB proof for `2^20` constraints — but the trajectory is right. Orion is the most promising candidate for direct on-chain verification on Solana under SIMD-0296.

### Open problem 7.1 — lattice commitment size

Current lattice-based commitments produce opening proofs of size `O(k · d · log q)` bits, yielding 4-50 KB concretely. Determine tight lower bounds for 128-bit post-quantum security; characterise the feasibility space within Solana's tx limit.

## Hash-based STARKs as universal backend

STARKs are already post-quantum (security from collision-resistant hashing only).
The migration is simpler in shape but more expensive in proof size: - **FRI over Goldilocks field** ($p = 2^{64} - 2^{32} + 1$): efficient NTT, native to 64-bit hardware. Plonky3 uses this. - **FRI over M31 (Mersenne-31)**: SIMD-optimised arithmetic. StarkWare's Circle STARK construction uses M31. Proof size scaling: `O(λ · log²N)`. For `2^20` steps at 128-bit security, ~50-200 KB per proof. A **`400×`-`1600×` blowup** vs. Groth16's 128 bytes. Way over Solana's transaction limit. Three deployment paths: ### Path 1: STARK-in-Lattice-SNARK (Open Problem 7.2) Wrap the STARK verifier circuit inside a lattice-based SNARK to recover succinct on-chain verification. The STARK verifier circuit is `O(log²N)` hash evaluations + field operations. With Poseidon (~250 R1CS constraints per hash), `2^20`-step verification is `~100K` constraints. Recursive composition: $$ \pi_{\mathrm{outer}} \;=\; \mathsf{Prove}_{\mathrm{Lattice}}\bigl(\,\mathsf{Verify}_{\mathrm{STARK}}(\pi_{\mathrm{inner}}) = 1\,\bigr). $$ Estimated proof size: ~5-20 KB. **Marginal fit for SIMD-0296** (4,096-byte transactions). Open whether the lattice outer is small enough. ### Path 2: STARK aggregation (STARKPack) Aggregate `n` STARK proofs into a single argument that's $(1 + 1/n)$× the size of a single proof, with `~2×` faster verification. | n packed | Aggregated size | Per-proof verify CU | |----------|----------------|---------------------| | 1 | ~100 KB | 500K | | 10 | ~110 KB | 50K/proof | | 100 | ~120 KB | 5K/proof | Doesn't help individual transactions (still 100 KB per submission) but amortises validator-side cost dramatically. ### Path 3: Off-chain STARK with on-chain commitment The most pragmatic near-term path. Publish the full STARK proof to a data-availability layer (Solana ledger via call-data, or a separate DA chain). On-chain verify only a 32-byte hash commitment. Add a challenge period where any observer can verify the off-chain proof and dispute on-chain if it's invalid. 
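Path 3's commit-and-challenge flow is simple enough to sketch end-to-end. A minimal Python model, with `verify_stark` stubbed out and all names hypothetical — the real design commits the 32-byte hash on-chain and lets any watcher who fetches the off-chain proof dispute within the challenge period:

```python
import hashlib

def commit(proof: bytes) -> bytes:
    """On-chain we store only this 32-byte commitment, not the ~100 KB proof."""
    return hashlib.sha256(proof).digest()

class DisputeWindow:
    """Toy challenge period: a dispute succeeds if the revealed bytes match
    the on-chain commitment but fail off-chain STARK verification."""
    def __init__(self, commitment: bytes, verify_stark):
        self.commitment = commitment
        self.verify_stark = verify_stark  # off-chain STARK verifier (stub)
        self.disputed = False

    def challenge(self, revealed_proof: bytes) -> bool:
        if hashlib.sha256(revealed_proof).digest() != self.commitment:
            return False  # wrong preimage — not the committed proof
        if not self.verify_stark(revealed_proof):
            self.disputed = True
        return self.disputed

# Usage: an invalid proof slips on-chain as a hash; a watcher disputes it.
bogus_proof = b"\x00" * 1024  # stand-in for a malformed STARK proof
window = DisputeWindow(commit(bogus_proof), verify_stark=lambda p: False)
assert window.challenge(bogus_proof)  # dispute succeeds within the window
```

The honest-verifier-for-retrieval assumption from the table is visible here: if nobody fetches and checks the proof before the window closes, the bogus commitment stands.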
| Configuration | Proof size | Verify CU | PQ | Fits tx? | |---------------|-----------|-----------|----|---------| | Groth16 (current) | 128 B | ~100K | No | Yes | | Raw STARK | ~100 KB | ~500K | Yes | No | | STARK + aggregation (n=10) | ~110 KB total | ~50K/proof | Yes | No (on-chain) | | STARK → Lattice-SNARK wrap | ~5-20 KB est | ~300K est | Yes | Marginal (SIMD-0296) | | Off-chain STARK + on-chain hash | 32 B hash | ~10K | Yes^* | Yes | `^*` Requires off-chain proof availability + honest-verifier assumption for retrieval. ## Isogeny-based group actions for credentials For applications needing **anonymous identity binding** — compliance-compatible privacy, selective disclosure, "prove balance ≥ threshold without revealing balance" — isogeny-based cryptography offers post-quantum group actions that can replace DLOG-based constructions. ### CSIDH-based ring signatures CSIDH defines a commutative group action `★: Cl(O) × E(F_p) → E(F_p)` between the ideal class group of an imaginary quadratic order and the set of supersingular elliptic curves over `F_p`. This group action instantiates Sigma protocols for "knowledge of an isogeny", which yields ring signatures via Fiat-Shamir. **Current status (cautious):** CSIDH at NIST-1 security needs `p ≈ 2^512`, key sizes ~64 B, computation ~50-100 ms. **Quantum security analysis (Bonnetain-Schrottenloher, Peikert) shows Kuperberg's quantum sub-exponential algorithm threatens CSIDH more aggressively than classical attacks** — proposed 128-bit classical / 64-bit quantum parameters can be broken in `~2^35` quantum key-exchange evaluations, not `~2^62`. So CSIDH at the 128-bit level is **not** secure at the originally advertised parameters. Larger parameters (`p ≈ 2^4096+`) restore the security but balloon costs. ### SQIsign **SQIsign** (De Feo, Kohel, Leroux, Petit, Wesolowski — NIST Round 2) offers compact post-quantum signatures (**204 bytes**) from quaternion isogeny problems. Signing time ~100 ms. 
Verification is computationally expensive (~100 ms), which makes it impractical for direct on-chain verification on Solana — a single SQIsign verification would consume the entire 1.4M CU budget. ### Open problem 7.3 — isogeny anonymous credentials Design an anonymous credential scheme based on supersingular isogeny group actions that: 1. Supports selective attribute disclosure. 2. Has verification time < 10 ms (compatible with blockchain block times). 3. Achieves 128-bit post-quantum security with concrete parameter justification. 4. Is compatible with the SPST note model (credentials bound to note commitments). This is wide open. The most promising shape is a **hybrid architecture**: isogeny-based credentials for identity binding, composed with STARK proofs for the transactional privacy layer. ## What survives without modification Two pieces of F_RP carry over unchanged into the post-quantum world: 1. **Poseidon Merkle trees.** Hash-based, no algebraic structure assumption beyond collision resistance. 2. **Indexed Merkle Trees** for nullifier non-membership. Same hashes, same structure, same constraints. So the *state* layer of F_RP doesn't need to change. Only the *proof* and *signature* layers. ## Migration timeline (rough) | Year | Milestone | |------|-----------| | 2026-2028 | F_RP v1: Groth16 + BN254 + Ed25519. Production deployment. | | 2028-2030 | NIST PQC standardisation completes. ML-DSA / SLH-DSA / Falcon shipped. | | 2030-2032 | Solana adds PQ syscalls (NIST-recommended). F_RP v2 design starts. | | 2032-2035 | F_RP v2 ships: hybrid pre-quantum + post-quantum proofs. Both verify. | | 2035-2040 | F_RP v3: pure post-quantum. Pre-quantum support deprecated. | This timeline is contingent on (a) NIST shipping PQC standards on schedule, (b) Solana adopting the syscalls within ~2 years of standardisation, and (c) lattice-based polynomial commitments achieving sub-10 KB proof sizes. None of these are certain. All three look likely. 
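The "both verify" rule in the v2 row of the timeline is worth pinning down, since it's what makes the hybrid window safe in both directions. A toy sketch — the verifier stubs and names are hypothetical, not any real F_RP API:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class HybridProof:
    groth16: bytes  # 128 B pre-quantum proof
    pq: bytes       # ~5-20 KB lattice-wrapped STARK

def verify_v2(proof: HybridProof,
              verify_groth16: Callable[[bytes], bool],
              verify_pq: Callable[[bytes], bool]) -> bool:
    """Hybrid window: a quantum forger of the Groth16 proof still fails the
    PQ check, and an unsound early PQ prover is still backstopped by Groth16."""
    return verify_groth16(proof.groth16) and verify_pq(proof.pq)

proof = HybridProof(b"g16-bytes", b"pq-bytes")
assert verify_v2(proof, lambda _: True, lambda _: True)
assert not verify_v2(proof, lambda _: False, lambda _: True)  # Shor-era forgery
assert not verify_v2(proof, lambda _: True, lambda _: False)  # PQ prover bug
```

The conjunction is the whole design: soundness of the hybrid is the soundness of the *stronger* system against each adversary class, at the price of carrying both proofs.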
## What would have to change in F_RP itself The protocol design is mostly insulated. Specifically: 1. **The note model is unchanged.** Notes, commitments, nullifiers, Merkle trees — all hash-based. 2. **The five-tuple `(Setup, Deploy, Invoke, Verify, Finalize)` is unchanged.** Just the proof system inside `Invoke`/`Verify` swaps out. 3. **The simulation-based privacy theorem (3.12) survives.** The hybrid argument's transitions are: ZK of proof system, pseudorandomness of hash, pseudorandomness of PRF, CCA2 of encryption. The ZK / PRF / CCA2 each get a post-quantum-secure replacement; the structure of the proof is the same. 4. **The self-sovereignty theorem (3.13) survives unchanged.** It only depends on chain liveness and proof system completeness. What changes: byte sizes, CU costs, prover times. The math survives. That's a lucky property of having designed F_RP around the abstract `Π_hybrid` rather than committing to Groth16 in the relations. ## Why this isn't urgent (today) It's worth ending the series with the honest answer to "should I be worried right now?": No. Not in 2026, not in 2028. A cryptographically-relevant quantum computer is plausibly a decade-plus away. The harvest-now-decrypt-later threat applies to confidentiality (encrypted communications today, decrypted later when QC arrives) — but most F_RP outputs are *commitments and nullifiers*, not encrypted plaintexts. The information-theoretic content of an old shielded transaction is bounded; an adversary who breaks it in 2040 learns transaction graph structure that's no longer interesting. What does need attention: **building the migration path now** so that when the day comes, F_RP isn't a 2-year rewrite project. That's what this post is for. ## Closing the series Eleven posts: 1. [Series intro](/blog/relayerless_privacy_intro/) — the F_RP framework and the five games. 2. [The fee paradox](/blog/the_fee_paradox/) — why every smart-contract privacy protocol needs a relayer. 3. 
[SPST](/blog/spst_self_paying_shielded_transactions/) — self-paying shielded transactions, four security theorems. 4. [PPST](/blog/ppst_private_programmable_state/) — private programmable state via R1CS embedding. 5. [TAB](/blog/tab_threshold_anonymous_broadcast/) — submitter anonymity via ring sigs and FROST. 6. [Verifiable shuffles](/blog/verifiable_shuffles_for_privacy/) — Bayer-Groth network-layer mixing. 7. [UPEE](/blog/upee_universal_private_execution/) — composing the framework, the simulation-based privacy and self-sovereignty theorems. 8. [Solana instantiation](/blog/solana_instantiation_656_bytes/) — concrete numbers: 656 bytes, 235K CU. 9. [F_RP vs the rest](/blog/f_rp_vs_existing_privacy_systems/) — comparison with nine deployed privacy systems. 10. [MEV resistance](/blog/mev_resistance_in_private_execution/) — sandwich-proof by construction; Theorem 7.4. 11. [Post-quantum migration](/blog/post_quantum_relayerless_path/) — the future-proofing plan you just read. The full preprint will land at `/papers/relayerless-privacy/` once typeset. Until then the series is the canonical reference. If you want to discuss any of it, [book a call](https://cal.com/daxts) or open an issue on `Dax911/zera-sdk`. Thanks for reading. ## Bibliography - Shor, P. W. (1997). *Polynomial-Time Algorithms for Prime Factorization and Discrete Logarithms on a Quantum Computer.* SIAM Journal on Computing. - Golovnev, A., Lee, J., Setty, S., Thaler, J., Wahby, R. (2023). *Brakedown.* CRYPTO 2023. - Xie, T., Zhang, Y., Song, D. (2022). *Orion.* CRYPTO 2022. - Ben-Sasson, E. et al. (2018). *STARKs.* https://eprint.iacr.org/2018/046 - De Feo, L., Kohel, D., Leroux, A., Petit, C., Wesolowski, B. (2020). *SQIsign.* ASIACRYPT 2020. https://eprint.iacr.org/2020/1240 - Castryck, W. et al. (2018). *CSIDH.* ASIACRYPT 2018. https://eprint.iacr.org/2018/383 - Bonnetain, X., Schrottenloher, A. (2018). *Quantum Security Analysis of CSIDH.* - Peikert, C. (2020). 
*He gives C-sieves on the CSIDH.* - NIST PQC Round 4 — *Post-Quantum Cryptography Standardization.* Previous: [MEV resistance ←](/blog/mev_resistance_in_private_execution/) · Series: [back to start](/blog/relayerless_privacy_intro/) --- # MEV resistance: why UPEE is sandwich-proof by construction Canonical: https://blog.skill-issue.dev/blog/mev_resistance_in_private_execution/ Description: Theorem 7.3 — UPEE transactions resist sandwich/frontrun/liquidation MEV by construction. Theorem 7.4 — block MEV bounded by public-bit leakage, not transaction value. Independent of V, not super-linear. Published: 2026-05-14T15:00:00.000Z Tags: zk, mev, flashbots, mempool, privacy, phd import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx"; MEV is the second-order tax on public DeFi. Searchers monitor the mempool, see your swap before it confirms, and front-run / back-run / sandwich it for profit. On Ethereum L1 in 2024-2025, MEV extracted from retail users approached **\$700M/year** — straight value transfer from end users to searchers and validators. UPEE eliminates the dominant classes of MEV by construction. This post derives why, and quantifies what's left. This is post 10 of 11 in the [relayerless-privacy](/series/relayerless-privacy) series. ## What MEV is, formally **Definition.** Let `B = (tx_1, ..., tx_n)` be a block of transactions, `σ_0` the pre-block state. The MEV of block `B` relative to validator `V` is: $$ \mathrm{MEV}(B, V) \;=\; \max_{\pi \in S_n,\ \mathsf{tx}_{\mathrm{ins}}} \Bigl[\,\mathrm{profit}_V(\sigma_0, \pi(B) \cup \mathsf{tx}_{\mathrm{ins}}) - \mathrm{profit}_V(\sigma_0, B)\,\Bigr]. $$ The maximum is over (a) all permutations `π` of the transaction ordering and (b) all sets `tx_ins` of transactions the validator may insert. `profit_V` is the validator's balance change after executing the reordered/augmented block. 
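The maximisation over `(π, tx_ins)` is concrete enough to brute-force on a toy example. The sketch below models a zero-fee constant-product AMM and a validator that inserts a front-run/back-run pair around a single visible victim swap — exercising the `tx_ins` half of the definition. All pool sizes are made up; the point is that this is exactly the computation a validator cannot run against a UPEE transaction, because it never sees the victim's direction or size:

```python
# Toy constant-product AMM: x * y = k, zero fees, made-up liquidity.
def swap_x_for_y(pool, dx):
    x, y = pool
    dy = y - (x * y) / (x + dx)
    return (x + dx, y - dy), dy

def swap_y_for_x(pool, dy):
    x, y = pool
    dx = x - (x * y) / (y + dy)
    return (x - dx, y + dy), dx

POOL0 = (10_000.0, 10_000.0)  # illustrative reserves
VICTIM = 100.0                # victim visibly sells 100 X

def sandwich_profit(front_size):
    """Validator strategy: insert a sell before the victim and a buy after —
    the inserted-transaction term of the MEV definition, profit in X."""
    pool, y_got = swap_x_for_y(POOL0, front_size)  # front-run
    pool, _ = swap_x_for_y(pool, VICTIM)           # victim at worse price
    pool, x_back = swap_y_for_x(pool, y_got)       # back-run
    return x_back - front_size

best = max(sandwich_profit(f) for f in range(0, 2000, 10))
assert best > 0  # public mempool: strictly positive MEV from insertion alone
```

With zero fees the profit grows with the front-run size, approaching the victim's full input in the limit — the sandwich is pure value transfer from the victim to the inserter.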
Concretely, the four dominant MEV categories: | Category | What the adversary needs | Public DeFi cost | |----------|--------------------------|-----| | **Sandwich** | Trade direction + size | $(V^2 / L)$ for a $V$-sized swap, $L$ pool liquidity | | **Frontrunning** | Transaction content | Up to full tx value | | **Backrunning** | Observable state change | Bounded by arbitrage opportunity | | **Liquidation** | Position state | Liquidator's bonus % | ## Theorem 7.3 — MEV resistance of private transactions **Statement.** Let `tx` be a private UPEE transaction. For any PPT adversary `A` (including a colluding validator): $$ \Pr[\mathrm{MEV}_A(\mathsf{tx}) > 0] \;\leq\; \mathsf{negl}(\lambda) $$ for sandwich attacks, frontrunning, and liquidation MEV. **Backrunning** is bounded separately by public-output leakage. ### Proof of sandwich resistance A sandwich attack requires the adversary to determine the trade direction (buy or sell) and approximate size of the victim's swap. In UPEE, the transaction content — including the program being invoked, the private inputs, and the state transition — is hidden by the ZK proof. By Theorem 3.12 (simulation-based privacy), there exists a simulator `S` that produces a computationally indistinguishable view using only the public outputs `({nf_i}, {cm_j}, f, program_id)`. `S` does not receive the trade direction or size: $$ \bigl|\,\Pr[\mathcal{A}(\mathsf{View}_{\mathrm{Real}}) = \mathrm{direction}] - \tfrac{1}{2}\,\bigr| \;\leq\; \mathsf{negl}(\lambda). $$ Without the direction, a sandwich attack is a coin flip — expected profit zero (the adversary is equally likely to lose as to gain). ### Proof of frontrunning resistance Frontrunning requires the adversary to know what the transaction will do *before* it confirms. In UPEE the transaction content is encrypted within the ZK proof; the adversary sees only the public tuple, which is simulatable without the witness. 
The adversary has no advantage in predicting the transaction's effect on state, so frontrunning degenerates to random speculation. ∎ ### Proof of liquidation resistance Liquidation MEV requires knowing that a specific position has become undercollateralised. In UPEE, position state lives in the private state tree as committed values. The adversary can see *that* a position exists (via its commitment) but not whether it is liquidatable — that requires opening the commitment, which the ZK proof guarantees they cannot do. ∎ ### The backrunning caveat Backrunning exploits **observable state changes after the fact**. Even with UPEE, some public state changes leak: a private DEX swap might cause an observable change in a public AMM's price oracle, and that's a backrunnable event. The leakage is bounded by the number of bits of public state affected by the transaction. This is the point of Theorem 7.4. ## Theorem 7.4 — MEV revenue bound **Statement.** For a block containing `n` private UPEE transactions, the expected MEV revenue for a validator is bounded by: $$ \mathbb{E}[\mathrm{MEV}] \;\leq\; n \cdot f_{\max} \;+\; \ell_{\mathrm{bits}} \cdot v_{\mathrm{bit}} $$ where: - `f_max = max_i f_i` is the maximum public fee — validators trivially "extract" fees, but those are legitimate compensation for inclusion. - `ℓ_bits = sum |public outputs of tx_i|` is total information leakage in bits across the n transactions. - `v_bit` is the maximum economic value per bit of leaked information (application-dependent). **Proof.** Each private transaction contributes at most `f_i ≤ f_max` in direct revenue. Additional MEV requires exploiting information beyond the fee. By Theorem 7.3, the only exploitable info is from public outputs. Each bit of public output conveys at most one bit about private state. 
The economic value extractable per bit is bounded by the application's value density — for a DEX trade of value `V`, one bit of direction info yields expected profit `O(√V)` due to the square-root law of market impact. Sum over bits and transactions. ∎ ## The qualitative shift For public DeFi, MEV from a swap of value `V` scales as **`O(V^2 / L)`** for sandwich attacks (super-linear in `V`). For UPEE, MEV is bounded by **public-bit leakage × per-bit value**, which is **independent of `V`**. That's the shift. MEV no longer scales with transaction value. A user moving \$10M through UPEE is not 10× more valuable to an MEV searcher than a user moving \$1M — both leak the same number of public bits. | Model | Sandwich MEV scaling | Frontrun scaling | |-------|----------------------|------------------| | Public DeFi (Uniswap on ETH) | $O(V^2 / L)$ | $O(V)$ | | UPEE | $O(\ell_{\mathrm{bits}})$ — independent of V | 0 | ## Public outputs of a UPEE transaction Concretely, the public bits leaked per transaction: | Output | Bits | Information content | |--------|------|---------------------| | Nullifiers | 256 × n_in | Pseudorandom from the adversary's view (PRF security) | | Commitments | 256 × n_out | Pseudorandom from the adversary's view (Poseidon hiding) | | Merkle root | 256 | Public state, doesn't carry tx-specific info | | Fee | 64 | Reveals fee tier, ~10 bits effective entropy | | program_id | 256 | Identifies *which* program; partial function privacy leak | Pseudorandom outputs by definition leak nothing about the underlying state. The MEV-relevant leak is the **fee tier** (~10 bits) and the **program_id** (which program executed). For a DEX program, the program_id reveals that *some* swap happened in *that* DEX — but not the direction, size, or counterparty. ## What about backrunning a private DEX? A private swap might still cause an observable state change in the DEX's *public* price oracle. 
In that case the backrunner observes: - A nullifier was consumed (the swap happened). - The price oracle moved by some amount Δp. Δp encodes the trade size. The backrunner can arbitrage based on Δp without knowing who swapped or in which direction. **Mitigation.** Use a batch-auction DEX (Penumbra's ZSwap is the reference design): aggregate all swaps in a block into a single batch with a uniform clearing price. The price oracle moves once per block, not per trade. Individual trade direction and size remain hidden; only the *net* batch flow is visible. This is on the F_RP roadmap as a separate construction (Private Batch Auction, PBA). ## What stays public no matter what Three things UPEE can't hide while still letting validators do their job: 1. **The fee `f`.** Validators need to know `f` to prioritise inclusion. This is a 64-bit public input. 2. **The fact a transaction occurred.** The validator inserts the nullifier and commitment, both public. 3. **Block timing.** Block-level patterns (transactions per block, time-of-day) leak metadata about overall protocol usage. The first two are inherent to any chain with fees and global state. The third is mitigated by batch-auction DEX design and by encouraging client-side delay sampling on the user side. 
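The qualitative shift above — sandwich cost `O(V²/L)` versus a flat `ℓ_bits · v_bit` — is easiest to see numerically before comparing with other mitigations. A back-of-envelope sketch; the liquidity and per-bit value are illustrative placeholders, not measured figures:

```python
def public_sandwich_cost(v: float, liquidity: float) -> float:
    """Public DeFi: sandwich loss scales as V^2 / L — super-linear in V."""
    return v ** 2 / liquidity

def upee_mev_bound(leaked_bits: int, value_per_bit: float) -> float:
    """UPEE: bounded by ell_bits * v_bit (Theorem 7.4), independent of V."""
    return leaked_bits * value_per_bit

L_POOL = 100_000_000.0  # pool liquidity in USD — placeholder
LEAK_BITS = 10          # ~10 effective bits (fee tier), per the leakage table
V_BIT = 5.0             # USD per leaked bit — application-dependent guess

for v in (1_000.0, 1_000_000.0, 10_000_000.0):
    print(f"V=${v:>12,.0f}  public=${public_sandwich_cost(v, L_POOL):>12,.2f}"
          f"  upee=${upee_mev_bound(LEAK_BITS, V_BIT):.2f}")
# Public cost grows quadratically with V; the UPEE bound stays flat.
```

A 1,000× larger trade costs 1,000,000× more in public sandwich MEV but exactly the same under the UPEE bound — the "independent of V" claim in tabular form.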
## Comparison with Flashbots, MEV-Share, encrypted mempools The Ethereum ecosystem has been working on MEV mitigation for years: | Approach | Mechanism | What's hidden | What still leaks | |----------|-----------|---------------|------------------| | **Flashbots private mempool** | Direct submission to builder | Tx contents pre-confirmation | Builder sees + can extract MEV | | **MEV-Share** | Selective metadata disclosure | User chooses | What user discloses | | **Shutter Network** | Threshold-encrypted mempool | Tx until block sealed | Tx after seal | | **EIP-8105 enshrined encrypted mempool** | Protocol-level encryption | Tx during ordering | Some patterns | | **UPEE (this work)** | ZK-encrypted execution | All inputs/outputs | Fee + program_id + state-change side-effects | The Ethereum approaches are all about *delay* — hide the tx until the moment it's executed, accepting the leak after that. UPEE is structurally different: the tx is *never* visible in plaintext. Even after execution, the inputs and intermediate state remain encrypted. ## Why this matters for retail The user-facing implication: a retail user on UPEE doesn't pay an MEV tax that scales with their trade size. They pay their explicit fee `f` and a small bounded leakage cost. For a \$1M trade on UPEE, MEV cost is bounded by the same `ℓ_bits · v_bit` term as a \$1k trade. That's the point of building this on a smart-contract chain. Public DeFi is great for liquidity but hostile to retail. Private execution restores the property that "I trade because I want to trade", not "I trade and pay an invisible 30-50bps tax to the searchers between me and the AMM". ## Open problem 7.5 — tightness The bound in Theorem 7.4 is an upper bound. Is it tight? Specifically: construct an adversary that achieves MEV revenue within a constant factor of `ℓ_bits · v_bit`, or prove the bound can be tightened by structural analysis of SPST/UPEE. This is open. 
My intuition is the bound is loose — most public outputs are pseudorandom and don't carry economic value. But proving it requires careful analysis of the leakage channels, which are application-specific. ## Bibliography - Daian, P., Goldfeder, S., Kell, T. et al. (2019). *Flash Boys 2.0: Frontrunning, Transaction Reordering, and Consensus Instability in Decentralized Exchanges.* IEEE S&P 2020. - Flashbots Collective. *MEV-Share: programmably private orderflow.* - Shutter Network. *EIP-8105: Universal Enshrined Encrypted Mempool.* - Penumbra Labs. *ZSwap: shielded sealed-bid batch auctions.* - ESMA (2025). *Maximal Extractable Value: Implications for Crypto Markets.* European Securities and Markets Authority. Previous: [F_RP vs the rest ←](/blog/f_rp_vs_existing_privacy_systems/) · Next: [The post-quantum migration path →](/blog/post_quantum_relayerless_path/) --- # F_RP vs Zcash, Tornado, RAILGUN, Aztec, Penumbra, Aleo, Namada, Monero Canonical: https://blog.skill-issue.dev/blog/f_rp_vs_existing_privacy_systems/ Description: F_RP vs nine deployed privacy systems on the four axes that matter: relayer-free, Turing-complete, on-chain verifiable on a high-perf L1, low-trust setup. Published: 2026-05-12T15:00:00.000Z Tags: zk, comparison, zcash, tornado, aztec, monero, phd import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx"; I've now spelled out the full F_RP framework. Two natural next questions: 1. Has someone built this already? 2. If not, what's the closest existing thing and why doesn't it cover the same ground? This post answers both. We compare F_RP against nine deployed privacy systems on twelve axes. The TL;DR: **no existing system simultaneously achieves relayer-free operation, Turing-complete computation privacy, and on-chain-verifiable proofs on a general-purpose Layer-1 blockchain.** That's the gap F_RP fills. This is post 9 of 11 in the [relayerless-privacy](/series/relayerless-privacy) series. 
## The matrix | Property | F_RP (ours) | Zcash Orchard | Tornado Cash | RAILGUN | Aztec | Penumbra | Aleo | Namada | Monero | |---|---|---|---|---|---|---|---|---|---| | **Relayer required** | **No** | No | **Yes** | **Yes** | No (sequencer) | No | No | No | No | | **Proof system** | Groth16 (BN254) + Nova | Halo 2 (IPA) | Groth16 (BN254) | Groth16 (BN254) | Honk (UltraPLONK) | Groth16 (BLS12-377) | Varuna (Marlin) | Groth16 (BLS12-381) | CLSAG + Bulletproofs+ | | **Proof size** | 128 B compressed | 2,720 + 2,272·n B | 128 B | 128 B/circuit | ~400-800 B | ~192 B | Compact (KZG) | ~192 B | O(ring_size) + log(n) | | **Verification cost** | ≈150K CU on Solana | ~10ms CPU | ~200K gas (ETH) | 600K-1M gas | Off-chain L2 batch | Native (L1) | Native (L1) | Native (L1) | O(ring_size) EC | | **Trusted setup** | Per-circuit MPC | **None** | Per-circuit MPC | Per-circuit MPC | Universal KZG | Per-circuit MPC | Universal KZG | Per-circuit MPC | **None** | | **Post-quantum** | No (STARK migration path) | No (DLOG) | No | No | No | No | No | No | No (DLOG) | | **Anonymity set** | Global shielded pool (2^32) | All Orchard notes | Per-denomination (2^20) | All shielded UTXOs | All encrypted notes | Multi-asset unified pool | All records | Multi-asset MASP | Ring 16 (FCMP++ pending) | | **Programmability** | **Full (PPST)** | None | None | Limited DeFi | **Full (Noir)** | Limited (DEX/staking) | **Full (Leo)** | Limited (Convert) | None | | **Fee mechanism** | **Self-paying from pool** | Self-paying via valueBalance | **Relayer pays gas** | **Broadcaster pays gas** | Client-side ZK fee proof | Public fee from balance | Private fee proof | Convert circuit | Public miner fee | | **Self-sovereignty** | **Full (Theorem 3.13)** | Full | **Partial (relayer)** | **Partial (Broadcaster)** | Full (PXE-side) | Full | Full | Full | Full | | **Target chain** | **Solana** (smart-contract layer) | Zcash L1 | Ethereum (EVM) | EVM L1s | Ethereum L2 rollup | Cosmos L1 | Aleo L1 | Namada L1 | Monero 
L1 (PoW) | | **Program privacy** | **Full** (program inputs/outputs hidden) | N/A | N/A | N/A | Partial (public calls visible) | N/A | Partial (program ID visible) | N/A | N/A | ## Three things F_RP gets that nobody else gets simultaneously ### 1. Relayer-free on a smart-contract chain Zcash, Penumbra, Monero, and Aleo are all relayer-free, but they're each their **own L1 chain**. Their consensus, validators, and fee mechanism are bespoke. They get relayer-freedom by being a chain, not by solving the smart-contract-layer problem. Aztec is relayer-free on a smart-contract platform — but it's an **L2 rollup with its own sequencer**. The sequencer is the de facto relayer with extra steps; if it goes offline, the L2 stalls. Aztec's deployment model isn't applicable to Solana. F_RP runs as a **smart-contract program on Solana mainnet**. Same validators that run Jupiter and Helium. No new chain, no new sequencer, no relayer. The only assumption is Solana's chain liveness — which is what every Solana program already assumes. ### 2. Turing-complete program privacy Tornado Cash and RAILGUN provide value transfer only. No conditional logic, no AMM, no auctions — just shielded ERC-20 transfers (or fixed-denomination ETH). Adding programmability would require redesigning the protocol from the ground up. Aztec and Aleo do offer programmability. Aztec ships Noir, Aleo ships Leo. Both work, both are L1-or-L2 specific. F_RP's PPST construction puts arbitrary arithmetic circuits inside the proof on a chain that wasn't built for them. The R1CS for the user's program is embedded as a sub-circuit of the outer PPST relation. The Solana on-chain verifier doesn't care what the program is — it just verifies the wrapping Groth16 proof. ### 3. On-chain verification on a high-throughput L1 Solana's `alt_bn128` syscalls verify Groth16 in ~150K CU (~$0.02 USD at typical priority fees). Block time ~600ms. Theoretical TPS in the tens of thousands. 
| Chain | Groth16 verification cost | Block time | |-------|---------------------------|-----------| | Ethereum L1 | ~200K gas (~$5-12 USD) | 12 s | | Solana L1 | ~150K CU (~$0.02 USD) | 0.6 s | | Zcash L1 | Native (no gas model) | 75 s | The cost difference is ~250× and the latency difference is ~20×. For a privacy protocol that wants to compose with public DeFi (private swap → public AMM → private settlement), Solana's economics are the only ones that work for retail users. ## Where F_RP loses to existing systems Honest comparison cuts both ways. Three places F_RP loses: ### To Zcash Orchard: trusted setup Zcash Orchard uses **Halo 2 with IPA** over Pasta curves — fully transparent, no per-circuit ceremony. F_RP's primary instantiation uses Groth16 with a per-circuit MPC ceremony. The migration path is the hybrid proof architecture (Theorem 3.8): inner STARK or Nova folding (transparent), outer Groth16 wrapper. Once SIMD-0302 ships on Solana (BN254 G2 syscall), we can switch the outer to PLONK with universal SRS — eliminating per-circuit ceremonies. Until then, Groth16 is the price of admission for cheap on-chain verification. ### To Monero: simplicity of the threat model Monero's privacy story fits in three sentences: ring signatures hide the sender, stealth addresses hide the receiver, RingCT hides the amount. No L2, no relayers, no shielded pool, no programmability. That simplicity is a *feature* — Monero has been deployed and battle-tested since 2014. F_RP is more complex because it does more. Programmability is genuinely harder than value transfer. The price of that complexity falls on the user; the gain is composability with the rest of the Solana ecosystem. ### To Aztec: native privacy DSL Aztec ships **Noir**, a Rust-like DSL purpose-built for ZK circuits. Compiles to ACIR, plugs into Honk / Barretenberg with first-class Aztec idioms (private functions, public functions, scheduled cross-boundary calls). 
F_RP currently relies on Circom or Noir for circuit authoring, with the developer responsible for wiring the program into the PPST relation. There's no "F_RP DSL" yet. That's a tooling gap, not a protocol gap — Noir-to-PPST adapters are an obvious next step. ## What F_RP and Zcash agree on A pleasant surprise: F_RP's SPST construction and Zcash's Sapling spend description are mathematically isomorphic. Same note/commitment/nullifier model, same value-balance equation, same Pedersen value commitments. The differences are deployment: - Zcash runs on its own L1 with native fee handling. - F_RP runs on Solana with the fee extracted from a program PDA reserve. The cryptography is the same. F_RP is, in some sense, "Zcash's Sapling pool, ported to Solana, extended with PPST for programs and TAB for submitter anonymity, with fees handled by an in-program reserve." The hard part is the protocol design. The cryptography is just engineering. ## What F_RP and Aleo agree on Aleo's records model (from ZEXE) and F_RP's PPST share the core insight: **a private program is an arithmetic circuit, and the proof attests to correct execution over committed state**. Both use a notion of records / notes that get nullified on consumption. The difference is again deployment: - Aleo runs on its own L1 with a native delegated-prover marketplace. - F_RP runs on Solana with prover delegation as a separate off-chain market. And one big disagreement: Aleo has elected **not** to implement function privacy — the program ID is visible on-chain. F_RP makes the same trade-off in v1 but flags universal-circuit-based function privacy as a future extension. ## The 2x2x2 decision lattice Here's the same data as a decision tree: <Mermaid chart={`flowchart TD
    Q1{Privacy with
smart contracts?} Q1 -->|No| Q2{Want
programmability?} Q1 -->|Yes| Q3{Need it
relayer-free?} Q2 -->|No| Z[Zcash / Monero] Q2 -->|Yes| A[Aleo / Penumbra] Q3 -->|No| T[Tornado / RAILGUN
relayer-dependent] Q3 -->|Yes| Q4{Layer 1 or 2?} Q4 -->|L2| AZ[Aztec] Q4 -->|L1| F[F_RP] classDef leaf stroke:#4ade80,stroke-width:2px,fill:#0a0a0a,color:#fff classDef us stroke:#facc15,stroke-width:3px,fill:#0a0a0a,color:#fff class Z,A,T,AZ leaf class F us `}/> The branch where F_RP lives — "yes I want a smart-contract chain, yes I want relayer-free, yes I want L1, with cheap on-chain verification" — is the cell that was empty until now. ## Bibliography - Hopwood, D., Bowe, S., Hornby, T., Wilcox, N. *Zcash Protocol Specification.* https://zips.z.cash/protocol/protocol.pdf - Pertsev, A., Semenov, R., Storm, R. *Tornado Cash Privacy Solution v1.4.* - RAILGUN Documentation. *Privacy System Architecture.* - Aztec Network. *Client-side Proof Generation.* https://aztec.network/blog/client-side-proof-generation - Penumbra Labs. *Penumbra Protocol Documentation.* https://protocol.penumbra.zone/main/index.html - Bowe, S., Chiesa, A., Green, M., Miers, I., Mishra, P., Wu, H. (2020). *ZEXE.* IEEE S&P 2020. - Namada Network. *Multi-Asset Shielded Pool.* https://github.com/namada-net/masp - Noether, S., Mackenzie, A. (2016). *Ring Confidential Transactions.* MRL-0005. Previous: [Solana instantiation ←](/blog/solana_instantiation_656_bytes/) · Next: [MEV resistance →](/blog/mev_resistance_in_private_execution/) --- # Fitting F_RP in 656 bytes on Solana Canonical: https://blog.skill-issue.dev/blog/solana_instantiation_656_bytes/ Description: Concrete F_RP instantiation on Solana. Groth16 over BN254, Poseidon Merkle, indexed nullifier tree, BN254 Pedersen, transaction in 656 of 1,232 bytes, 235K of 1.4M CU. Published: 2026-05-10T15:00:00.000Z Tags: zk, solana, bn254, alt_bn128, engineering, phd import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx"; The previous six posts derived F_RP at the level of relations and theorems. This post is the engineering side: every byte and every compute unit. 
The headline numbers: | Resource | Used by F_RP | Solana hard cap | Headroom | |----------|-------------|----------------|----------| | Transaction bytes | **656** | 1,232 (legacy) / 4,096 (SIMD-0296) | 576 / 3,440 | | Compute units | **~235,000** | 1,400,000 | 1,165,000 | | On-chain Groth16 verify | **~150,000** | (subset of CU above) | — | | Proof size (compressed) | **128** | (subset of bytes above) | — | This is post 8 of 11 in the [relayerless-privacy](/series/relayerless-privacy) series. ## Proof system: Groth16 over BN254 **Why Groth16, not PLONK or STARK.** Three reasons: 1. **128-byte compressed proof.** Smallest known SNARK output. Critical for Solana's 1,232-byte transaction envelope. 2. **`< 200,000 CU` verification on-chain.** The `sol_alt_bn128_group_op` and `sol_alt_bn128_pairing` syscalls (live since v1.16) make BN254 ops native to the validator runtime. 3. **Existing infrastructure.** Light Protocol's groth16-solana is already deployed; ZK Compression on mainnet uses it. PLONK is plausible once SIMD-0302 (BN254 G2 arithmetic syscall, in Review as of Q1 2026) activates — but as of writing, full G2 scalar multiplication is not a syscall, so KZG-based PLONK verification is impractical. STARKs are too big: a single STARK proof is ~50–200 KB, way over the transaction limit. Hybrid wrapping (STARK inner, Groth16 outer) gives the best of both — Theorem 3.8. | Parameter | Value | |-----------|-------| | Curve | BN254 (alt_bn128) | | Proof structure | π = (A ∈ 𝔾_1, B ∈ 𝔾_2, C ∈ 𝔾_1) | | Uncompressed size | 256 bytes (64 + 128 + 64) | | Compressed via `sol_alt_bn128_compression` | **128 bytes** | | Security level | ~128 bits (Barbulescu-Duquesne 2019 conservative estimate) | | Trusted setup | Per-circuit MPC (universal Powers-of-Tau phase 1 + circuit-specific phase 2) | ## Hash function: Poseidon over BN254 scalar field Poseidon is the standard SNARK-friendly hash for BN254 circuits. Solana ships it as a native syscall (`sol_poseidon`).
| Parameter | Value | |-----------|-------| | Field | `𝔽_p` where p = BN254 scalar field order | | State width | t = 3 (binary tree: 2 inputs → 1 output) | | S-box exponent | α = 5 (gcd(5, p-1) = 1 holds) | | Full rounds | R_F = 8 | | Partial rounds | R_P = 57 | | R1CS constraints per hash | 8·3·4 + 57·4 = 96 + 228 = **324** | | Native syscall | `sol_poseidon` (mainnet, v1.16+) | This is what Light Protocol's compressed-account commitments use. Same hash everywhere keeps the compressed-account ↔ F_RP boundary clean. ## Merkle trees Two trees, both Poseidon-based: ### Note commitment tree (depth 32) | Parameter | Value | |-----------|-------| | Depth | d = 32 | | Capacity | 2^32 ≈ 4.3 × 10^9 notes | | On-chain state | 32-byte root in PDA | | Off-chain state | Light Protocol ZK Compression (Solana ledger call data) | | Membership-proof circuit cost | 32 · 324 ≈ 10,400 R1CS constraints | ### Nullifier tree (Indexed Merkle, depth 32) The nullifier set needs efficient *non-membership* checks. Sparse Merkle Trees over 254-bit hashes would cost 254 · 324 ≈ 82,300 constraints per non-membership proof. Indexed Merkle Trees (Aztec's construction) drop this to depth 32: $$ C_{\mathsf{IMT-nonmem}} \;=\; 32 \cdot 324 + 324 + 256 \;\approx\; 10{,}948 \text{ R1CS constraints.} $$ A **7.5× reduction** at the cost of maintaining a sorted linked list off-chain. Aztec's design proves the "low nullifier" — the leaf where the new nullifier would slot in — and asserts the new value is in the gap. Two range checks plus a standard Merkle path. ## Pedersen commitments over BN254 𝔾_1 Used for value hiding inside SPST + range-proof aggregation. 
| Parameter | Value | |-----------|-------| | Group | BN254 𝔾_1 (prime order p ≈ 2^254) | | Generators | G, H ∈ 𝔾_1 with unknown DL relation | | Commitment | C = v · G + r · H | | Value range | v ∈ [0, 2^64) | | Range proof | In-circuit bit decomposition: 128 R1CS constraints / 64-bit value | | Homomorphism | C_1 + C_2 = Com(v_1 + v_2, r_1 + r_2) | **Why BN254 𝔾_1, not Curve25519?** Solana's native Twisted ElGamal commitments live on Ristretto255 / Curve25519. We don't reuse them for two reasons: 1. **Curve mismatch.** Groth16 needs pairing-friendly BN254. Solana's Ed25519 / Curve25519 is not pairing-friendly. Mixing the two would require expensive cross-curve gadgets. 2. **Different threat model.** Token-2022 confidential transfers hide *amounts*. F_RP needs to hide *amounts + senders + receivers + program logic*. The two are different protocols on different math; clean separation is correct. ## Key derivation The privacy framework uses its own key hierarchy, independent of the user's Solana Ed25519 keypair: | Key | Derivation | Purpose | |-----|-----------|---------| | Spending key sk | `sk ← {0,1}^256` random | Master secret | | Nullifier key nk | `Poseidon(sk, "nk")` | Derives nullifiers | | Public key pk | `sk · G` (G ∈ BN254 𝔾_1) | Identifies note owner | | Viewing key vk | `Poseidon(sk, "vk")` | Decrypts incoming notes | The Solana Ed25519 keypair signs the transaction envelope (paying the on-chain fee from the privacy program's reserve). The ZK proof internally proves authorisation via the spending key. 
**Compromise of one does not compromise the other.** ## Transaction layout A canonical 2-input / 2-output SPST transaction: | Component | Size (bytes) | Notes | |-----------|------|------| | Groth16 proof (compressed) | 128 | via `sol_alt_bn128_compression` | | Nullifiers (2 × 32) | 64 | Public input; checked against on-chain set | | Output commitments (2 × 32) | 64 | Poseidon hashes | | Merkle root | 32 | Anchors the proof to recent state | | Fee (u64) | 8 | Public, in lamports | | Encrypted note ciphertexts (2) | 128 | For recipient note discovery | | Anchor instruction discriminator | 8 | Standard Anchor program | | Account references (with ALT) | ~120 | Program ID, PDAs, system accounts | | Ed25519 signature | 64 | Transaction-level auth | | Transaction headers | ~40 | Recent blockhash, message header | | **Total** | **~656** | **Within 1,232-byte limit** | **Headroom: ~576 bytes.** Enough for: - A second Groth16 proof (composed PPST + SPST). - A 4-input / 4-output transaction instead of 2-in / 2-out. - Ring signature of size ~17 (instead of 64-byte simple Ed25519 sig) for in-tx anonymity. Under SIMD-0296 (4,096 bytes), the headroom triples. ## Compute unit budget | Operation | CU cost | Source | |-----------|---------|--------| | Groth16 verification (3 pairings + public-input MSMs) | ~150,000 | groth16-solana benchmarks | | Nullifier set check (2 PDA reads + comparison) | ~50,000 | Compressed account lookups | | Merkle root validation (1 PDA read) | ~10,000 | Light Protocol root cache | | Note insertion + state updates (compressed account write via CPI) | ~20,000 | ZK Compression v2 batched updates | | Borsh deserialization | ~5,000 | Standard overhead | | **Total** | **~235,000** | **16.8% of 1.4M CU limit** | Headroom: 1,165,000 CU. Enough for: - A second Groth16 verification (composed PPST + SPST): +150K → total 385K CU (27.5% of limit). - Auxiliary in-program Poseidon hashing via `sol_poseidon` for state derivations. 
- CPI calls to external programs (token transfers for unshielding, swap execution for atomic private DEX). ## Existing infrastructure used | Infrastructure | Integration point | Status | |---------------|-------------------|--------| | [Light Protocol / ZK Compression](https://www.zkcompression.com/resources/whitepaper) | Merkle tree state, compressed accounts | Production (mainnet) | | [`groth16-solana`](https://github.com/Lightprotocol/groth16-solana) verifier | Groth16 verification crate | Production | | `sol_poseidon` syscall | In-program Poseidon hashing | Live (mainnet, v1.16+) | | `sol_alt_bn128_group_op` syscalls | BN254 group ops for proof verification | Live (mainnet, v1.16+) | | `sol_alt_bn128_compression` | G1/G2 point compression | Live (mainnet) | | Address Lookup Tables | Compact account references | Production | | SIMD-0296 (4,096-byte transactions) | Extended tx envelope for ring sigs / PPST | Approved Q4 2025; pending activation | The protocol is **deployable today** with the legacy 1,232-byte transaction format. SIMD-0296 makes it more comfortable but isn't a hard prerequisite. ## What we still need from Solana For full F_RP, two SIMDs are nice-to-have: ### SIMD-0302 (BN254 G2 arithmetic syscall) Currently in Review. Adds native G2 scalar multiplication and addition. Without it, full PLONK / KZG verification on-chain is expensive (G2 ops in the BPF VM). With it, F_RP can switch to a universal SRS that doesn't need a per-circuit Groth16 ceremony. Estimated impact: PLONK verification ~400–600K CU vs Groth16's ~150K. Larger but eliminates per-circuit ceremony. Worthwhile tradeoff for a multi-program ecosystem. ### Re-activation of the ZK ElGamal Proof Program Currently disabled following the Phantom Challenge bug (Fiat-Shamir transcript missing a hash input — June 2025). When re-activated, F_RP can lean on the existing native sigma-proof / Bulletproofs verifier for some sub-protocols. Until then, all proofs go through the BN254 Groth16 path. 
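The byte and CU budgets earlier in this post are simple enough to re-derive mechanically. A minimal Rust sketch, with constants mirroring the two tables (a sanity check, not a wire-format definition):

```rust
// Re-derive the 2-in/2-out SPST budget totals from the layout tables.
// Constants mirror the tables in the post; this is a sanity check,
// not a wire-format definition.

const TX_BYTES: &[(&str, u32)] = &[
    ("Groth16 proof (compressed)", 128),
    ("nullifiers (2 x 32)", 64),
    ("output commitments (2 x 32)", 64),
    ("Merkle root", 32),
    ("fee (u64)", 8),
    ("encrypted note ciphertexts (2)", 128),
    ("Anchor discriminator", 8),
    ("account references (ALT)", 120),
    ("Ed25519 signature", 64),
    ("transaction headers", 40),
];

const CU_COSTS: &[(&str, u32)] = &[
    ("Groth16 verification", 150_000),
    ("nullifier set check", 50_000),
    ("Merkle root validation", 10_000),
    ("note insertion + state updates", 20_000),
    ("Borsh deserialization", 5_000),
];

fn total(items: &[(&str, u32)]) -> u32 {
    items.iter().map(|&(_, n)| n).sum()
}

fn main() {
    let bytes = total(TX_BYTES);
    let cu = total(CU_COSTS);
    println!("tx size: {bytes} B, legacy headroom: {} B", 1_232 - bytes);
    println!("cu: {cu}, share of 1.4M: {:.1}%", f64::from(cu) / 1_400_000.0 * 100.0);
}
```

Running it reproduces the 656-byte / 576-byte-headroom and 235K CU / 16.8% figures from the tables.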
## End-to-end latency budget For a 2-in / 2-out SPST transaction on commodity hardware: | Phase | Time | Notes | |-------|------|-------| | Read on-chain state (Merkle root + recent blockhash) | ~50 ms | RPC roundtrip | | Local proof generation (Apple M2, 8-core) | **0.5–1.5 s** | Dominated by FFT + MSM | | Transaction broadcast | ~50 ms | Direct to validator RPC | | Slot inclusion + finality | ~600 ms | Solana block time + confirmation | | **Total user-perceived latency** | **~1.5–3 s** | | Most of the latency is *prover time*, not chain time. A GPU prover (ICICLE on RTX 4090) drops this to ~300 ms. Browser-side proving via wasm-bindgen-rayon is workable but slower (~5–8 s) — discussed in [Proving in the browser, by the numbers](/blog/proving_in_the_browser_by_the_numbers/). ## What runs on the validators is intentionally boring On the chain side, F_RP is just three things: 1. A Solana program (Anchor-based) that verifies Groth16 + nullifier checks + state updates. 2. A Light Protocol-compatible Merkle tree state. 3. An on-chain account holding the protocol's lamport reserve (replenished from shield deposits, drained by validator fee extractions). That's it. No relayers, no off-chain operators, no governance multisig (other than for emergency pause). The boring deployment surface is the point. ## Bibliography - Light Protocol. *ZK Compression Whitepaper.* https://www.zkcompression.com/resources/whitepaper - Light Protocol. *groth16-solana on-chain verifier.* https://github.com/Lightprotocol/groth16-solana - Helius. *Zero-Knowledge Proofs: Applications on Solana.* https://www.helius.dev/blog/zero-knowledge-proofs-its-applications-on-solana - Solana Foundation. *Transactions documentation.* https://solana.com/docs/core/transactions - Solana Foundation. *SIMD-0296: Larger Transaction Format.* https://github.com/solana-foundation/solana-improvement-documents/blob/main/proposals/0296-larger-transactions.md - Solana Foundation. 
*SIMD-0302 (Review): BN254 G2 Arithmetic Syscalls.* https://github.com/solana-foundation/solana-improvement-documents/discussions/293 - Aztec Documentation. *Indexed Merkle Tree (Nullifier Tree).* https://docs.aztec.network/ - Grassi, L., Khovratovich, D., Rechberger, C., Roy, A., Schofnegger, M. (2021). *Poseidon.* USENIX Security 2021. https://eprint.iacr.org/2019/458 - Pedersen, T. P. (1991). *Non-Interactive and Information-Theoretic Secure Verifiable Secret Sharing.* CRYPTO 1991. Previous: [UPEE: composing the framework ←](/blog/upee_universal_private_execution/) · Next: [F_RP vs the rest →](/blog/f_rp_vs_existing_privacy_systems/) --- # UPEE: composing SPST + PPST + TAB into one framework Canonical: https://blog.skill-issue.dev/blog/upee_universal_private_execution/ Description: F_RP Construction IV. The five-algorithm tuple Setup/Deploy/Invoke/Verify/Finalize plus the simulation-based privacy theorem (3.12) and self-sovereignty theorem (3.13). The composition that makes the whole thing deployable. Published: 2026-05-08T16:00:00.000Z Tags: zk, cryptography, privacy, simulation-security, uc-framework, phd import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx"; [SPST](/blog/spst_self_paying_shielded_transactions/) gave us self-paying private value transfer. [PPST](/blog/ppst_private_programmable_state/) extended it to arbitrary computation. [TAB](/blog/tab_threshold_anonymous_broadcast/) and [verifiable shuffles](/blog/verifiable_shuffles_for_privacy/) closed the submitter-identification gap. Each of those is a self-contained construction. This post is about how they compose into the **deployable framework**. UPEE — the Universal Private Execution Environment — is a five-tuple `(Setup, Deploy, Invoke, Verify, Finalize)` that wraps the lower-level pieces in a single deployable interface. 
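To make the interface boundaries concrete before the formal definitions, here is a toy Rust walk-through of the five-tuple. The "proof" is a non-cryptographic checksum standing in for a SNARK, and every name and type is illustrative shorthand, not the real SDK surface; only the control flow matters.

```rust
// Toy model of the UPEE five-tuple (Setup, Deploy, Invoke, Verify, Finalize).
// The "proof" is a checksum standing in for a SNARK -- illustrative only.

fn digest(parts: &[&[u8]]) -> u64 {
    // Toy stand-in for a hash (FNV-1a); NOT cryptographic.
    parts
        .iter()
        .flat_map(|p| p.iter())
        .fold(0xcbf2_9ce4_8422_2325, |h, &b| (h ^ b as u64).wrapping_mul(0x100_0000_01b3))
}

#[allow(dead_code)]
struct PublicParams { lambda: u32 }
struct VerifyingKey { circuit_digest: u64 }
struct Tx { public_inputs: Vec<u8>, proof: u64 }

/// Setup(1^λ) → pp
fn setup(lambda: u32) -> PublicParams { PublicParams { lambda } }

/// Deploy(C, pp) → vk_C: in the real system this registers vk_C at a PDA.
fn deploy(circuit: &[u8], _pp: &PublicParams) -> VerifyingKey {
    VerifyingKey { circuit_digest: digest(&[circuit]) }
}

/// Invoke(C, state_priv, input_priv, pp) → tx: entirely client-side.
fn invoke(circuit: &[u8], state_priv: &[u8], input_priv: &[u8], _pp: &PublicParams) -> Tx {
    // Real public inputs would be nullifiers, commitments, root, fee;
    // here just a toy placeholder derived from input lengths.
    let public_inputs = vec![state_priv.len() as u8, input_priv.len() as u8];
    let proof = digest(&[circuit]) ^ digest(&[public_inputs.as_slice()]);
    Tx { public_inputs, proof }
}

/// Verify(vk_C, tx) → {0,1}: the only step that runs on-chain.
fn verify(vk: &VerifyingKey, tx: &Tx) -> bool {
    tx.proof == (vk.circuit_digest ^ digest(&[tx.public_inputs.as_slice()]))
}

/// Finalize(σ, tx) → σ': fold the tx into the state root.
fn finalize(state_root: u64, tx: &Tx) -> u64 {
    digest(&[&state_root.to_le_bytes()[..], tx.public_inputs.as_slice()])
}

fn main() {
    let pp = setup(128);
    let circuit = b"private-transfer-circuit";
    let vk = deploy(circuit, &pp);
    let tx = invoke(circuit, b"secret-note", b"secret-input", &pp);
    assert!(verify(&vk, &tx));
    println!("verified; new root = {:x}", finalize(0, &tx));
}
```

The shape to notice: `Invoke` touches the private data and never the chain; `Verify` touches the chain and never the private data.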
By the end of this post we'll have stated the two main theorems of F_RP — simulation-based privacy and self-sovereignty — and shown how they fall out of the composition. This is post 7 of 11 in the [relayerless-privacy](/series/relayerless-privacy) series. ## The five algorithms **`Setup(1^λ) → pp`.** Generate public parameters: SRS for the proof system (universal KZG or transparent FRI), Poseidon parameters, Merkle tree depth `d = 32`, Pedersen generators `G, H`, range-proof bit-length `ℓ_v = 64`, field `𝔽_p`. **`Deploy(C, pp) → vk_C`.** Compile a private program circuit `C` to an R1CS (or PLONKish) constraint system, run the proof system's key generator to produce `(pk_C, vk_C)`, register `vk_C` on-chain at a deterministic PDA `addr_C = PDA("UPEE", H(vk_C))`. **`Invoke(C, state_priv, input_priv, pp) → (tx, π)`.** Client-side, no chain interaction. Read current Merkle root, execute `C` locally on private state, build the witness, generate the Groth16 proof, assemble the transaction with encrypted note ciphertexts. **`Verify(vk_C, tx, π) → {0, 1}`.** On-chain. Single Groth16 pairing check + nullifier-set check + recent-root check + minimum-fee check. **`Finalize(σ, tx) → σ'`.** State transition. Insert nullifiers, append commitments to the Merkle tree, credit the validator with `f`. ## Hybrid proof architecture (§3.4.2) A single Groth16 proof can't directly hold a Turing-complete program at scale. Big circuits → big provers. The fix is **recursive composition**: <Mermaid chart={`flowchart TD A[Execute C
locally in PXE-style env] --> B[Inner STARK / Nova proof
~big circuit, transparent setup] B --> C[Wrap inner proof in Groth16
verify_inner = 1 inside outer circuit] C --> D[128-byte Groth16 proof
on-chain via alt_bn128] classDef step stroke:#4ade80,stroke-width:2px,fill:#0a0a0a,color:#fff class A,B,C,D step `}/> The outer Groth16 proof's circuit verifies the inner STARK or Nova accumulator. Composed soundness: $$ \epsilon_{\mathrm{hybrid}} \;\leq\; \epsilon_{\mathrm{inner}} + \epsilon_{\mathrm{outer}} + \mathsf{negl}(\lambda). $$ Concretely: STARK with FRI gives `ε_inner ≤ 2^{-100}` for the standard 30-query / blowup-4 parameters; Groth16 gives `ε_outer ≤ 2^{-100}` under q-PKE on BN254. Combined `ε_hybrid ≤ 2^{-99}` — the union bound costs one bit, no meaningful soundness loss. Zero-knowledge composes too: outer Groth16 reveals nothing about the inner STARK; the inner STARK reveals nothing about the witness. Both layers contribute ZK and they don't fight. ## The ideal functionality `F_RP` For the simulation-based proof we need a target. The ideal functionality: - On `(invoke, sid, C, state_priv, input_priv)` from user `U`: 1. Execute `C` locally, validate balance / range / membership. 2. Compute fee `f`. 3. Send `(transaction, sid, f)` to the adversary `A`. *That's all `A` learns.* 4. On `proceed` from `A`, update ideal state, ack to `U`. - On `(query, sid)` from `A`: return `(rt, N, f)` (the public state). `F_RP` never tells `A`: - which notes were consumed (only nullifiers); - which notes were created (only commitments); - the values, recipients, or program inputs; - the user's identity beyond what the fee `f` and the fact-of-existence reveal. This is also all the adversary should be able to learn in the real world.
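The leakage interface can be made concrete with a toy projection (every field name here is illustrative; this models the message flow, not real code): two invocations that differ only in their private data must produce identical adversary views.

```rust
// Toy projection of F_RP's per-invoke leakage. The invoke message carries
// private fields; the adversary's view keeps only (transaction, sid, f).
// Field names are illustrative, not protocol definitions.

#[allow(dead_code)]
struct InvokeMsg {
    sid: u64,
    circuit: &'static str,    // which program ran -- never leaked
    state_priv: Vec<u64>,     // note values consumed -- never leaked
    input_priv: Vec<u64>,     // program inputs -- never leaked
    recipient: &'static str,  // never leaked
    fee: u64,                 // the one value the adversary sees
}

/// What `A` receives on an invoke: `(transaction, sid, f)` and nothing else.
fn adversary_view(msg: &InvokeMsg) -> (&'static str, u64, u64) {
    ("transaction", msg.sid, msg.fee)
}

fn main() {
    let a = InvokeMsg {
        sid: 7, circuit: "transfer", state_priv: vec![500],
        input_priv: vec![42], recipient: "alice", fee: 3,
    };
    // Same sid and fee, completely different private data and program.
    let b = InvokeMsg {
        sid: 7, circuit: "swap", state_priv: vec![1, 2, 3],
        input_priv: vec![], recipient: "bob", fee: 3,
    };
    assert_eq!(adversary_view(&a), adversary_view(&b));
    println!("views identical: {:?}", adversary_view(&a));
}
```

Theorem 3.12 below is the formal version of this picture: whatever the real-world adversary sees, a simulator can produce from the `(sid, f)` projection alone.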
## Theorem 3.12 — simulation-based privacy **Statement.** For any PPT adversary `A` controlling the blockchain (full read of on-chain data, validator-side scheduling, transaction ordering), there exists a PPT simulator `S` such that: $$ \bigl\{\,\mathsf{View}_{\mathcal{A}}(\mathsf{Real}(\mathcal{A}, \mathcal{F}_{\mathrm{RP}}))\,\bigr\} \;\approx_c\; \bigl\{\,\mathsf{View}_{\mathcal{A}}(\mathsf{Ideal}(\mathcal{A}, \mathcal{S}))\,\bigr\} $$ where `S` learns only `(sid, f)` from `F_RP`. **Proof outline.** The simulator builds a fake transaction that is computationally indistinguishable from a real one without ever seeing the witness: 1. **Simulated nullifiers.** For each of `n_in` inputs, sample `nf_i` uniformly at random, verify it isn't already in `N`, retry on collision. 2. **Simulated commitments.** For each of `n_out` outputs, sample `r_j` uniformly and set `cm_j = Poseidon(r_j)`. Indistinguishable from real commitments by the hiding property of Poseidon. 3. **Simulated proof.** Invoke the ZK simulator of the hybrid proof system: `π̃ ← Sim_ZK(vk_C, x̃)` for `x̃ = ({nf_i}, {cm_j}, rt, f)`. For Groth16, `Sim_ZK` uses the simulation trapdoor `(α, β, γ, δ)` from the CRS to forge a valid-looking proof without a witness. 4. **Simulated encrypted notes.** For each output, sample a uniform-random ciphertext of the right length. Indistinguishable by CCA2 of the encryption scheme. The hybrid argument moves from the real distribution to the simulator's output through four hybrids, each indistinguishable from the previous one under one cryptographic assumption: - `H_0` → `H_1`: replace real proof with simulated proof. Bound: `ZK advantage of Π_hybrid`. - `H_1` → `H_2`: replace real commitments with random `Poseidon(r̃_j)`. Bound: `n_out · PRF advantage of Poseidon`. - `H_2` → `H_3`: replace real nullifiers with uniform random values. Bound: `n_in · PRF advantage`. - `H_3` → `H_4 = Sim`: replace real ciphertexts with random strings. 
Bound: `n_out · CCA2 advantage of the encryption scheme`. By the triangle inequality, the total distinguishing advantage is the sum of four negligible quantities — itself negligible. ∎ ## Theorem 3.13 — self-sovereignty This is the result that makes F_RP *relayerless*. **Game `Game_RF(A, λ)`.** Single honest user `U`. `A` controls all relayers, all other users, the entire network layer (delay/reorder/drop), and all off-chain infrastructure. `U` has a shielded note, the corresponding spending key, the ability to read the chain, and direct network access to at least one honest validator. `Game_RF = 1` if `U` successfully completes withdrawal of `v' ≤ v` to a public address of their choosing, paying `f` from the shielded balance, in a polynomial number of steps. **Statement.** $$ \Pr[\mathsf{Game}_{\mathrm{RF}}(\mathcal{A}, \lambda) = 1] \;=\; 1 - \mathsf{negl}(\lambda). $$ **Proof.** Walk through every phase of the withdrawal and confirm the user can do it alone: | Operation | Required resources | External party? | |-----------|-------------------|-----------------| | Read Merkle root | RPC (or direct ledger read) | No — public data | | Compute Merkle path | Local tree + on-chain commitment data | No | | Compute nullifier `PRF_sk(ρ)` | Local secret key | No | | Build witness | Local | No | | Generate Groth16 proof | Local CPU/GPU | No | | Sign tx | Local Ed25519 key (or TAB share) | No | | Broadcast tx | Direct connection to ≥1 honest validator | No (chain liveness) | | Pay fee `f` | Inside the proof — extracted from shielded balance | **No (SPST)** | Every row's "External party?" is "No". The single assumption is **`(Δ, p_live)`-liveness of the chain**: any valid transaction is included within Δ blocks with probability ≥ 1 − negl(λ). On Solana, `Δ ≈ 1–2 slots` (sub-second confirmation) and `p_live` follows from Tower BFT's liveness guarantees.
The success probability is: $$ \Pr[\mathsf{Game}_{\mathrm{RF}} = 1] \;=\; \Pr[\text{liveness holds}] \cdot \Pr[\text{honest proof verifies}] \;=\; (1 - \mathsf{negl}(\lambda)) \cdot 1. $$ The second factor is `1` by completeness of the proof system. ∎ **Corollary (Censorship Resistance).** No adversary can prevent the user from exercising their private withdrawal right, assuming only chain liveness. This is strictly stronger than every relayer-dependent protocol, where adversarial control of the relayer set is sufficient to deny service. ## Composability of UPEE programs Three composition modes from §3.4.5: ### Sequential composition `P_A ; P_B` Run `P_A` to commit intermediate state, wait for finality, then run `P_B` consuming that state. Soundness composes additively: $$ \epsilon_{\mathrm{seq}} \;\leq\; \epsilon_A + \epsilon_B + \mathsf{negl}(\lambda). $$ ### Parallel composition `P_A ‖ P_B` Both programs run in the same transaction over disjoint state. The combined circuit `C_{A‖B}` is satisfied iff both `C_A` and `C_B` are. Soundness: $$ \epsilon_{A \| B} \;\leq\; \epsilon_A + \epsilon_B + \mathsf{negl}(\lambda). $$ ### Nested composition `P_A[P_B]` `P_A` calls `P_B` as a subroutine. State passes through Pedersen-committed values: $$ \mathsf{call}(P_B, \mathsf{Com}(\vec{\mathrm{args}}), \pi_{\mathrm{args\_valid}}) \;\to\; (\mathsf{Com}(\vec{\mathrm{result}}), \pi_{\mathrm{exec}}). $$ The caller verifies `π_exec` recursively inside its own circuit. Soundness includes a recursion-overhead term: $$ \epsilon_{\mathrm{nested}} \;\leq\; \epsilon_A + \epsilon_B + \epsilon_{\mathrm{recursive}} + \mathsf{negl}(\lambda). $$ `ε_recursive` is bounded by Theorem 3.8 (composed soundness of the hybrid proof architecture). ## What's left We have the framework. We have the theorems. The question now is: does this actually fit on Solana?
The next post drops the abstract `Π_hybrid` and gives concrete numbers — proof sizes in bytes, verification costs in CU, transaction layouts inside the 1,232-byte limit. ## Bibliography - Canetti, R. (2001). *Universally Composable Security: A New Paradigm for Cryptographic Protocols.* FOCS 2001. - Goldwasser, S., Micali, S., Rackoff, C. (1985). *The Knowledge Complexity of Interactive Proof-Systems.* STOC 1985. - Groth, J. (2016). *On the Size of Pairing-Based Non-Interactive Arguments.* EUROCRYPT 2016. - Ben-Sasson, E. et al. (2018). *Scalable, transparent, and post-quantum secure computational integrity (STARKs).* https://eprint.iacr.org/2018/046 - Kothapalli, A., Setty, S., Tzialla, I. (2022). *Nova: Recursive Zero-Knowledge Arguments from Folding Schemes.* https://eprint.iacr.org/2021/370 Previous: [Verifiable shuffles ←](/blog/verifiable_shuffles_for_privacy/) · Next: [Solana instantiation: 656 bytes →](/blog/solana_instantiation_656_bytes/) --- # Bayer-Groth verifiable shuffles for network-layer privacy Canonical: https://blog.skill-issue.dev/blog/verifiable_shuffles_for_privacy/ Description: F_RP Construction III, Approach C. Bayer-Groth verifiable shuffles obscure the input→output permutation of a batch with O(√n) proof size — used to cascade-mix pre-broadcast batches at the network layer. Published: 2026-05-06T15:00:00.000Z Tags: zk, cryptography, shuffles, mixnet, privacy, phd import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx"; [Ring signatures and TAB](/blog/tab_threshold_anonymous_broadcast/) hide the submitter on-chain. They don't hide the **packet**: the TCP/QUIC frame that hits a Solana RPC node still has a source IP, a timing signature, and a propagation pattern. A passive adversary running a handful of nodes can do timing triangulation to identify which IP first broadcast a transaction, and that IP is enough to undo the cryptographic anonymity. 
This is post 6 of 11 in the [relayerless-privacy](/series/relayerless-privacy) series. The construction here addresses the network layer with **verifiable shuffles** — a primitive that lets a third party shuffle and re-randomise a batch of encrypted transactions without learning the permutation, then prove they did so honestly. ## What a verifiable shuffle is A vector of ElGamal ciphertexts arrives at a "shuffler" — a party (or chain of parties) that: 1. Permutes the order of the ciphertexts. 2. Re-randomises each ciphertext (changes the encryption randomness without changing the plaintext). 3. Outputs a new vector that is provably a permutation-and-re-randomisation of the input. Crucially, the shuffler's permutation π is **secret**. The proof attests "this output is some valid shuffle of the input" without revealing which one. **Definition.** Let `vec(C) = (C_1, ..., C_n)` be ElGamal ciphertexts encrypting messages `M_i` under a common public key `pk_dec`. A verifiable shuffle protocol produces: $$ \big(\,\vec{C}',\ \pi_{\mathrm{shuffle}}\,\big) \;\leftarrow\; \mathsf{Shuffle}(\vec{C}, \mathsf{pk}_{\mathrm{dec}}) $$ where `vec(C')` is a re-randomised permutation of `vec(C)` and the proof `π_shuffle` allows public verification without revealing π. For each `i ∈ [n]`: $$ C'_i \;=\; C_{\pi^{-1}(i)} + (r'_i \cdot G,\ r'_i \cdot \mathsf{pk}_{\mathrm{dec}}) $$ with `r'_i` sampled fresh. ## Bayer-Groth shuffle argument The Bayer-Groth construction [BG12] gives an honest-verifier zero-knowledge argument with **O(√n) proof size**. The pieces: ### 1. Permutation matrix commitment The shuffler commits to the permutation matrix `M_π ∈ {0,1}^{n×n}` using a Pedersen vector commitment: $$ \mathsf{Com}(\vec{a}) \;=\; \sum_{i=1}^n a_i \cdot H_i \;+\; r \cdot G, $$ where `vec(a)` encodes the permutation and `H_1, ..., H_n` are independent generators. ### 2. 
Multi-exponentiation argument For a verifier challenge `vec(x) = (x_1, ..., x_n)`, the shuffler proves: $$ \prod_{i=1}^n (C_i)^{x_{\pi(i)}} \cdot \mathsf{rerand} \;=\; \prod_{i=1}^n (C'_i)^{x_i}. $$ This is a batched ElGamal homomorphism check that forces the shuffle to be a valid permutation with correct re-randomisation. ### 3. Permutation argument (Schwartz-Zippel) The committed `(a_1, ..., a_n)` form a permutation of `(1, ..., n)` if and only if the polynomial identity $$ \prod_{i=1}^n (a_i - x) \;=\; \prod_{i=1}^n (i - x) $$ holds. The shuffler proves it by evaluating both sides at a random verifier-supplied `x`. Two distinct degree-`n` polynomials agree at a uniformly random point with probability at most `n/|𝔽|` (Schwartz-Zippel), so agreement at a random `x` implies identity with overwhelming probability. ### Sublinear proof via recursive blocks Bayer-Groth's main contribution is the recursive block structure that pushes proof size to O(√n). Split the n elements into √n blocks of √n; commit to each block; recurse. Verifier cost remains O(n) multi-scalar multiplications for the main check, plus O(√n) group exponentiations for the permutation argument (the construction is pairing-free). ## Theorem 3.11 — shuffle privacy **Statement.** Under the DDH assumption on the underlying group and the zero-knowledge property of the Bayer-Groth argument, for any two permutations `π_0, π_1 ∈ S_n` and any PPT adversary observing `(vec(C), vec(C'), π_shuffle)`: $$ \bigl|\,\Pr[\mathcal{A}(\vec{C}, \mathsf{Shuffle}_{\pi_0}(\vec{C}), \pi_{\mathrm{shuffle}}) = 1] \;-\; \Pr[\mathcal{A}(\vec{C}, \mathsf{Shuffle}_{\pi_1}(\vec{C}), \pi_{\mathrm{shuffle}}) = 1]\,\bigr| \;\leq\; \mathsf{Adv}^{\mathsf{DDH}}_{\mathcal{A}}(\lambda) + \mathsf{negl}(\lambda). $$ **Proof sketch.** Reduce permutation identification to DDH. The reduction `B` receives a DDH challenge `(G, A = a·G, B = b·G, Z)` and sets `pk_dec = A`. To simulate the shuffle: - If `Z = ab·G` (real DDH tuple): `B` re-randomises position `i` using `r'_i = b` and the DDH structure correctly produces a valid shuffle under `π_0`.
- If `Z` is uniform random: the re-randomisation introduces a random group element, making the shuffled ciphertexts independent of any specific permutation. `B` wins its DDH game with probability `1/2 + ε/2` when the shuffle distinguisher succeeds with advantage `ε`. The proof itself is zero-knowledge by Bayer-Groth, leaking no additional information about π beyond what is already in `(vec(C), vec(C'))`. ∎ ## Cascade shuffles If `k` independent shufflers each shuffle in sequence, the adversary must corrupt **all `k`** to learn the overall permutation: $$ \mathsf{Adv}^{\mathrm{perm}}_{\mathrm{cascade}} \;\leq\; \prod_{j=1}^k \mathsf{Adv}^{\mathsf{DDH}}_j + k \cdot \mathsf{negl}(\lambda). $$ This is the standard mix-net argument — one honest shuffler is enough. A cascade of three shufflers means the adversary needs to compromise all three to deanonymise. ## Integration with F_RP <Mermaid chart={`flowchart TD U1[User 1] --> E1[ElGamal encrypt] U2[User 2] --> E2[ElGamal encrypt] Un[User n] --> En[ElGamal encrypt] E1 --> S1[Shuffler 1
π_1, prove] E2 --> S1 En --> S1 S1 --> S2[Shuffler 2
π_2, prove] S2 --> S3[Shuffler 3
π_3, prove] S3 --> D[Threshold decrypt] D --> C[Solana RPC] classDef user stroke:#4ade80,stroke-width:2px,fill:#0a0a0a,color:#fff classDef mix stroke:#facc15,stroke-width:2px,fill:#0a0a0a,color:#fff class U1,U2,Un user class S1,S2,S3 mix `}/> The shuffle network sits **between the user and the Solana RPC**. Workflow: 1. User encrypts their SPST/PPST transaction under a shared public key `pk_dec` (held by a threshold-decrypter set). 2. User submits the ciphertext to a public mempool shared with other privacy-protocol users. 3. Shufflers (a chain of 2-5 independent operators) take the batch, shuffle, re-randomise, prove. 4. After the cascade, threshold-decrypter set decrypts the final shuffled ciphertexts. 5. Decrypted transactions are submitted to Solana validators directly. The validators see no IP / timing correlation back to the originating user. The shufflers see ciphertexts but not plaintexts. The threshold decrypter sees plaintexts but not the originator-to-position mapping. ## Tradeoffs vs. ring signatures + TAB Shuffles and TAB compose. The recommended stack: 1. **TAB** (or ring sig) for on-chain submitter anonymity. 2. **Shuffle cascade** for network-layer source-IP anonymity. 3. **Tor/I2P/Dandelion++** as belt-and-braces for IP-level anonymity even against in-mempool observers. ## Practical anonymity bounds For a TAB group of `n_tab = 100` and a shuffle cascade of size `n_shuffle = 50`, with a network-leakage parameter `μ ∈ [0, 1]` capturing how much side-channel info bleeds through: $$ H(\mathrm{submitter}) \;\geq\; \log_2(n_{\mathrm{tab}}) + (1 - \mu) \cdot \log_2(n_{\mathrm{shuffle}}) - \mathsf{negl}(\lambda). $$ With `μ = 0.3` (moderate leakage from timing patterns), this gives roughly 6.6 + 0.7·5.6 ≈ 10.5 bits of effective anonymity — about 1500 indistinguishable submitters. With `μ = 0` (Tor + Dandelion++), 12.2 bits ≈ 4700 submitters. ## Why this isn't deployed yet Shufflers are operational infrastructure. 
Each one is: - A long-running Linux process holding a Bayer-Groth proof generator. - Interactive with other shufflers and the threshold-decrypter set. - Subject to liveness assumptions (one going offline pauses the cascade, but doesn't break privacy). For F_RP's first deployment we ship without shufflers — TAB plus user-side Tor is enough for the initial threat model. The shuffle network is a Phase 2 hardening, designed to neutralise nation-state-level network observers. ## Bibliography - Bayer, S., Groth, J. (2012). *Efficient Zero-Knowledge Argument for Correctness of a Shuffle.* EUROCRYPT 2012. - Chaum, D. (1981). *Untraceable Electronic Mail, Return Addresses, and Digital Pseudonyms.* CACM 24(2). - Fanti, G. et al. (2018). *Dandelion++: Lightweight Cryptocurrency Networking with Formal Anonymity Guarantees.* SIGMETRICS 2018. - Monero Project. *Tor and I2P integration in monerod (master).* https://github.com/monero-project/monero/blob/master/docs/ANONYMITY_NETWORKS.md Previous: [TAB: ring sigs and FROST ←](/blog/tab_threshold_anonymous_broadcast/) · Next: [UPEE: composing the framework →](/blog/upee_universal_private_execution/) --- # TAB: hiding the submitter with ring signatures and FROST Canonical: https://blog.skill-issue.dev/blog/tab_threshold_anonymous_broadcast/ Description: F_RP Construction III. ZK proofs hide the contents but the wrapping Solana tx still leaks the submitter pubkey. TAB closes that gap with a Fujisaki-Suzuki ring signature and a FROST threshold Schnorr over Ed25519. Published: 2026-05-04T16:30:00.000Z Tags: zk, cryptography, ring-signatures, frost, monero, anonymity, phd import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx"; [SPST](/blog/spst_self_paying_shielded_transactions/) hides what value moved. [PPST](/blog/ppst_private_programmable_state/) hides what program ran. Neither hides **who submitted the transaction**. 
On any chain that requires a signature on the outer transaction (Solana, Ethereum, Aptos, Sui — all of them), the public key of the submitter is right there in the transaction header. Without a relayer, the submitter must sign with their own key. The Ed25519 public key tells the chain exactly which private actor authorised the proof. ZK on the inside; perfect plaintext on the outside.

This is post 5 of 11 in the [relayerless-privacy](/series/relayerless-privacy) series. Here we close the submitter-identification gap with two complementary network-layer primitives.

## The submitter identification problem, formally

**Definition (Active Participant Set).** $\mathcal{S} = \{(\mathsf{pk}_i, \mathsf{sk}_i)\}_{i=1}^N$ — the set of active F_RP participants at a given epoch. Each holds an Ed25519 keypair registered on chain.

**Definition (Anonymity Set Reduction Attack).** Adversary $\mathcal{A}$ with full read access to the on-chain state $\sigma$. Define:

$$ \mathcal{A}_{\text{eff}}(\mathsf{tx}) = \{\, i \in \mathcal{S} : \Pr[\text{participant } i \text{ submitted } \mathsf{tx} \mid \mathsf{View}_{\mathcal{A}}] > 0 \,\}. $$

Naive relayerless setting: $|\mathcal{A}_{\text{eff}}| = 1$. Ed25519 signatures are strongly unforgeable — there is exactly one $\mathsf{pk}_i$ that verifies. Conditional entropy:

$$ H(\text{submitter} \mid \mathsf{View}_{\mathcal{A}}) \;=\; 0. $$

Worst possible. Even though the *contents* of the transaction (the SPST/PPST proof) reveal nothing about which notes were spent, the submitter's pubkey reveals exactly who authorised the spend. Off-chain metadata (IP, timing, prior-deposit history, exchange KYC) collapses any remaining anonymity.

## Approach A — Fujisaki-Suzuki ring signature over Ed25519

Adapt the linkable ring signature framework of Fujisaki and Suzuki (2007) to the Ed25519 group. Let $\mathbb{G}$ be the prime-order Ed25519 subgroup with generator $G$ and order $\ell$.
Two random oracles: $\mathsf{H}_p : \{0,1\}^* \to \mathbb{Z}_\ell$ and $\mathsf{H}_G : \{0,1\}^* \to \mathbb{G}$.

**Sign** with ring $R = \{\mathsf{pk}_1, \ldots, \mathsf{pk}_n\}$ at signer index $s$:

1. **Key image.** $I = \mathsf{sk}_s \cdot \mathsf{H}_G(\mathsf{pk}_s)$ — deterministic linkability tag, hides $s$.
2. **Commitment.** Sample $\alpha \xleftarrow{R} \mathbb{Z}_\ell$. Compute $L_s = \alpha G$, $R_s = \alpha \mathsf{H}_G(\mathsf{pk}_s)$.
3. **Challenge propagation.** Seed the chain with $c_{s+1} = \mathsf{H}_p(m, L_s, R_s)$. Then for $i = s+1, s+2, \ldots, s-1 \pmod{n}$ sample $r_i \xleftarrow{R} \mathbb{Z}_\ell$ (only the responses are sampled — each $c_i$ is inherited from the previous step, never drawn at random) and compute $$ L_i = r_i G + c_i \mathsf{pk}_i, \quad R_i = r_i \mathsf{H}_G(\mathsf{pk}_i) + c_i I, \quad c_{i+1} = \mathsf{H}_p(m, L_i, R_i). $$
4. **Close.** The propagation terminates with $c_s$; compute $r_s = \alpha - c_s \mathsf{sk}_s \pmod{\ell}$, which makes the signer's equations consistent with $(L_s, R_s)$.
5. **Output.** $\sigma_{\text{ring}} = (I, c_1, r_1, \ldots, r_n)$.

**Verify.** Recompute every $L_i, R_i, c_{i+1}$ starting from $c_1$. Accept iff the chain closes: $c_{n+1} = c_1$.

**Signature size.** $I \in \mathbb{G}$ (32 B compressed) + $c_1 \in \mathbb{Z}_\ell$ (32 B) + $n$ scalars $r_i$ (32 B each) = $64 + 32n$ bytes.

### Solana transaction-size constraint

With ~300 bytes reserved for transaction metadata + nullifiers + Groth16 proof + recent blockhash, ~930 bytes are available for the ring signature inside the 1,232-byte limit:

$$ n_{\max} \;=\; \left\lfloor \frac{930 - 64}{32} \right\rfloor \;=\; 27. $$

Under SIMD-0296 (4,096-byte transactions, approved late 2025), this jumps to $n_{\max} \approx 119$.

Verification cost: each ring member needs 2 scalar multiplications + 1 hash ≈ 5,300 CU. For $n = 27$, that's $\sim 143{,}100$ CU on top of the ~150,000-200,000 CU for SPST verification. Total: ~340,000 CU — about 24% of the 1.4M CU budget.
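The Sign/Verify flow above is compact enough to sketch end-to-end. Here is a toy Python version over a small multiplicative safe-prime group standing in for the Ed25519 group — nothing here is secure, and the hash-to-group in particular leaks its discrete log, which a real deployment must avoid:

```python
import hashlib
import secrets

# Toy safe-prime group: p = 2q + 1, g generates the order-q subgroup.
# Stand-in for the Ed25519 group in the post. NOT secure -- demo only.
p, q = 2879, 1439
g = 4  # 2^2 mod p: a quadratic residue, hence of order q

def H_p(*parts):
    """Hash to a scalar in Z_q (models the post's H_p)."""
    digest = hashlib.sha256(repr(parts).encode()).digest()
    return int.from_bytes(digest, "big") % q

def H_G(x):
    """Toy hash-to-group: exponentiating g leaks the discrete log,
    which breaks real linkability security -- fine only for a demo."""
    return pow(g, H_p("H_G", x), p)

def keygen():
    sk = secrets.randbelow(q - 1) + 1
    return sk, pow(g, sk, p)

def ring_sign(sk, s, ring, m):
    n = len(ring)
    h_s = H_G(ring[s])
    key_image = pow(h_s, sk, p)        # I = sk * H_G(pk_s), multiplicative notation
    alpha = secrets.randbelow(q - 1) + 1
    c, r = [0] * n, [0] * n
    # Seed the challenge chain at the signer's commitment (L_s, R_s) ...
    c[(s + 1) % n] = H_p(m, pow(g, alpha, p), pow(h_s, alpha, p))
    i = (s + 1) % n
    while i != s:                      # ... then propagate around the ring
        r[i] = secrets.randbelow(q)    # only responses are sampled; c_i is inherited
        L = pow(g, r[i], p) * pow(ring[i], c[i], p) % p
        R = pow(H_G(ring[i]), r[i], p) * pow(key_image, c[i], p) % p
        c[(i + 1) % n] = H_p(m, L, R)
        i = (i + 1) % n
    r[s] = (alpha - c[s] * sk) % q     # close the ring at the signer
    return key_image, c[0], r

def ring_verify(ring, m, sig):
    key_image, c0, r = sig
    c = c0
    for i, pk in enumerate(ring):
        L = pow(g, r[i], p) * pow(pk, c, p) % p
        R = pow(H_G(pk), r[i], p) * pow(key_image, c, p) % p
        c = H_p(m, L, R)
    return c == c0                     # chain must close where it started

keys = [keygen() for _ in range(4)]
ring = [pk for _, pk in keys]
sig = ring_sign(keys[2][0], 2, ring, "spend note 42")
print(ring_verify(ring, "spend note 42", sig))  # True
```

Signing the same message twice with the same key yields the same key image $I$ — that is the linkability tag doing its job, and the reason double-spends are detectable while the signer index stays hidden.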
## Theorem 3.9 — Ring anonymity

**Statement.** In the random oracle model, for any ring $R$, any indices $i, j \in [n]$, and any PPT distinguisher $\mathcal{D}$:

$$ \bigl|\Pr[\mathcal{D}(m, R, \mathsf{RingSign}(\mathsf{sk}_i, m, R)) = 1] - \Pr[\mathcal{D}(m, R, \mathsf{RingSign}(\mathsf{sk}_j, m, R)) = 1]\bigr| = 0. $$

**Perfect** (information-theoretic) anonymity in the ROM.

**Proof sketch (two steps).**

*Step 1 — Key image indistinguishability.* $I_s = \mathsf{sk}_s \cdot \mathsf{H}_G(\mathsf{pk}_s)$. Since $\mathsf{H}_G$ is a random oracle independent of $G$, $\mathsf{H}_G(\mathsf{pk}_s)$ is a uniform random group element, and multiplication by the fixed nonzero scalar $\mathsf{sk}_s$ permutes $\mathbb{G}$ — so $I_s$ is uniform over $\mathbb{G}$ from the adversary's view, with no computational assumption needed.

*Step 2 — Transcript simulation.* For any $s$, the tuple $(c_1, r_1, \ldots, r_n)$ is uniform over $\mathbb{Z}_\ell^{n+1}$. The simulator $\mathsf{Sim}(m, R)$ that knows no secret key produces an identically distributed output by sampling all $(c_i, r_i)$ uniformly and programming the random oracle to close the ring. The marginal distributions are identical for every $s \in [n]$, so $\mathsf{Adv}_{\mathcal{D}}^{\text{anon}} = 0$. ∎

**Corollary.** A ring signature over a ring of size $n$ provides $\log_2(n)$ bits of submitter anonymity. For $n = 27$ that's $\sim 4.75$ bits; for $n = 119$ (SIMD-0296) that's $\sim 6.9$ bits. Real-world anonymity is bounded by side-channel leakage (timing, IP) but the on-chain view alone provides exactly $\log_2(n)$.

The signer is anonymous among the ring. The ring is public. The cost is linear in ring size.

## Approach B — FROST threshold Schnorr (TAB proper)

Ring signatures grow linearly with $n$. For high-throughput deployments where $n \gg 27$ is desired, we want a **constant-size** signature. Threshold Schnorr is the answer.
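The whole flow specified below — key shares, nonce commitments, binding factors, Lagrange-weighted partial signatures, ordinary Schnorr verification — also fits in a short sketch. Same toy safe-prime group instead of Ed25519; a trusted dealer stands in for the Feldman-VSS DKG; and one process plays every signer, which a real deployment obviously would not:

```python
import hashlib
import secrets

# Same toy safe-prime group as before; stands in for Ed25519. NOT secure.
p, q = 2879, 1439
g = 4

def H(*parts):
    return int.from_bytes(hashlib.sha256(repr(parts).encode()).digest(), "big") % q

def deal(n, t):
    """Trusted dealer handing out Shamir shares of sk_group.
    (Real FROST replaces the dealer with a Feldman-VSS DKG.)"""
    coeffs = [secrets.randbelow(q) for _ in range(t)]   # f(x); f(0) = sk_group
    shares = {i: sum(a * pow(i, k, q) for k, a in enumerate(coeffs)) % q
              for i in range(1, n + 1)}
    return pow(g, coeffs[0], p), shares                 # (pk_group, shares)

def lagrange(i, T):
    """Lagrange coefficient for recovering f(0) from the shares in T."""
    num, den = 1, 1
    for j in T:
        if j != i:
            num = num * j % q
            den = den * (j - i) % q
    return num * pow(den, -1, q) % q

def frost_sign(T, shares, pk_group, m):
    # Round 1: each signer in T commits to two nonces (d_i, e_i).
    nonces = {i: (secrets.randbelow(q), secrets.randbelow(q)) for i in T}
    comms = {i: (pow(g, d, p), pow(g, e, p)) for i, (d, e) in nonces.items()}
    # Round 2: binding factors rho_i, group nonce R, challenge c, partials z_i.
    rho = {i: H(i, m, sorted(comms.items())) for i in T}
    R = 1
    for i in T:
        D, E = comms[i]
        R = R * D * pow(E, rho[i], p) % p
    c = H(R, pk_group, m)
    z = sum(d + rho[i] * e + c * lagrange(i, T) * shares[i]
            for i, (d, e) in nonces.items()) % q
    return R, z                        # 64 bytes on the wire, any n and t

def schnorr_verify(pk_group, m, sig):
    R, z = sig
    c = H(R, pk_group, m)
    return pow(g, z, p) == R * pow(pk_group, c, p) % p
```

Any $t$-subset produces a signature of identical shape that verifies against the same `pk_group` via plain Schnorr verification — which is exactly the constant-size, subset-hiding property the post is after.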
**Setup.** $n$ participants run a one-time Distributed Key Generation (Feldman VSS) producing: - A group public key $\mathsf{pk}_{\text{group}} = \mathsf{sk}_{\text{group}} \cdot G$ (the group secret is never reconstructed). - Individual shares $\mathsf{sk}_{\text{share},i}$ for each participant. - A threshold $t \leq n$. **Sign (FROST round structure):** Any subset $T \subseteq [n]$ with $|T| = t$ can co-produce a Schnorr signature on message $m$: 1. **Commitment round.** Each $i \in T$ samples nonces $d_i, e_i \xleftarrow{R} \mathbb{Z}_\ell$ and broadcasts $D_i = d_i G$, $E_i = e_i G$. 2. **Signing round.** Each $i$ computes $$ \rho_i = \mathsf{H}(i, m, \{(D_j, E_j)\}_{j \in T}), \quad R = \sum_{j \in T} (D_j + \rho_j E_j), $$ $$ c = \mathsf{H}(R, \mathsf{pk}_{\text{group}}, m), \quad \lambda_i = \prod_{j \in T \setminus \{i\}} \frac{j}{j - i} \pmod \ell, $$ $$ z_i = d_i + \rho_i e_i + c \lambda_i \mathsf{sk}_{\text{share},i} \pmod \ell. $$ 3. **Combine.** $\sigma_{\text{threshold}} = (R, z)$ with $z = \sum_{i \in T} z_i$. **Verify.** Standard Schnorr verification against $\mathsf{pk}_{\text{group}}$: $$ z G \;\stackrel{?}{=}\; R + c \cdot \mathsf{pk}_{\text{group}}. $$ **Signature size.** $(R, z)$ = 32 + 32 = **64 bytes**. *Independent of $n$ and $t$.* Identical to a standard Ed25519 signature. ## Theorem 3.10 — TAB privacy **Statement.** For any two subsets $T, T' \subseteq [n]$ with $|T| = |T'| = t$, and any PPT $\mathcal{A}$ controlling up to $t-1$ participants, the threshold signature produced by $T$ is computationally indistinguishable from the one produced by $T'$. **Proof structure.** Hybrid argument over the FROST protocol: - **Hybrid 0**: real $T$. Adversary observes final $(R, z)$ + $t-1$ partial signatures from corrupted parties. - **Hybrid 1**: replace $R$ with a uniform random $\mathbb{G}$ element. Honest participants' nonces $d_j, e_j$ for $j \in T \setminus \mathcal{C}$ are uniform; sum is uniform. Distribution identical. 
- **Hybrid 2**: replace $z$ with the deterministic value $z = \log_G(R) + c \cdot \mathsf{sk}_{\text{group}}$ (well-defined given $R, c, \mathsf{pk}_{\text{group}}$: $R$ fixes its own discrete log). Same distribution.
- **Hybrid 3**: real $T'$. Same argument.

Honest partial signatures are never revealed to $\mathcal{A}$ (they're consumed in combination). The final $(R, z)$ depends only on the *honest contribution to $R$* — uniform regardless of $T$. ∎

**Anonymity:** **Unbounded.** As long as $|T| \geq t$ and at least one honest participant in $T$ exists, the adversary cannot determine which subset signed. With $n$ in the thousands and $t$ in the hundreds, the number of candidate signing subsets $\binom{n}{t}$ is combinatorially large and all are indistinguishable.

## Tradeoffs at a glance

## Why both, not one or the other

The two approaches cover different deployment regimes:

- **Bootstrapping / low coordination**: ring signatures. No DKG required; any user can sign with any ring composed of $n$ on-chain pubkeys. Anonymity scales to the size of the ring you can pack into the transaction.
- **Established network with stable participants**: TAB / FROST. One-time DKG cost amortises across all transactions; signatures are minimum-size; anonymity is bounded by the group size, not the transaction size.

In practice, F_RP starts in the ring-signature regime and migrates to TAB once the network has enough committed participants for a meaningful DKG. The constructions are not mutually exclusive — the on-chain verifier can accept either type, and in the TAB case the wrapping Solana transaction is indistinguishable in size from an ordinary single-signer transaction.

## What's still missing

Even with TAB, two leakage channels remain:

1. **Network metadata.** The TCP/QUIC packet that hits a Solana RPC node has a source IP. Without Tor, I2P, or Dandelion++, that IP links directly to the user. [Post 6](/blog/verifiable_shuffles_for_privacy/) addresses this with verifiable shuffles at the network layer.
2.
**Timing correlation.** A user who shields and spends within the same minute is still linkable via temporal proximity, regardless of how many ring members they hide in. Mitigations are about user behaviour and client-side delay sampling. ## Bibliography - Fujisaki, E., Suzuki, K. (2007). *Traceable Ring Signature.* PKC 2007. - Komlo, C., Goldberg, I. (2020). *FROST: Flexible Round-Optimized Schnorr Threshold Signatures.* SAC 2020. https://eprint.iacr.org/2020/852 - Feldman, P. (1987). *A Practical Scheme for Non-Interactive Verifiable Secret Sharing.* FOCS 1987. - Goodell, B., Noether, S. (2020). *Concise Linkable Ring Signatures and Forgery Against Adversarial Keys (CLSAG).* https://eprint.iacr.org/2019/654 - Bernstein, D. J. et al. (2012). *High-speed high-security signatures.* Journal of Cryptographic Engineering. Previous: [PPST: private programmable state ←](/blog/ppst_private_programmable_state/) · Next: [Bayer-Groth verifiable shuffles →](/blog/verifiable_shuffles_for_privacy/) --- # On the death of the trusted setup Canonical: https://blog.skill-issue.dev/blog/on_the_death_of_the_trusted_setup/ Description: Universal SRS, transparent FRI, and why Groth16's per-circuit ceremony feels anachronistic in 2026 — even when, as ZERA does, you're still using one. A history of the ceremonies that worked, the ones that didn't, and what comes next. Published: 2026-05-04T16:00:00.000Z Tags: groth16, plonk, kzg, fri, trusted-setup, ceremony, zk, phd, opinion import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx"; The first time I sat down to deploy a Groth16 circuit in anger, I spent more time on the **ceremony** — the multi-party computation that produces the per-circuit proving and verification keys — than I did on the circuit itself. 
We ran a Phase 2 ceremony with eleven participants, scattered across four time zones, each contributing a fresh entropy beacon to a 250 MB blob, with the contributions chained over a Phase 1 Powers-of-Tau output we trusted because Aztec's 2019 ceremony had convinced us. None of the eleven participants was cryptographically obligated to behave; we trusted that *at least one* of them was honest, that none of them coordinated, and that the entropy was actually random. Eight years on from the first big Groth16 ceremony — Zcash's [Sapling ceremony in 2018](https://z.cash/technology/paramgen/) — the dominant attitude in the ZK research community is that this whole exercise is *anachronistic*. Universal SRS systems (PLONK, Marlin) let you reuse a single Powers-of-Tau output across every circuit. Transparent setup systems (FRI / STARKs) need no ceremony at all. The cost difference between *running a ceremony* and *not running one* is, by 2026, much larger than the cost difference between *Groth16 proofs* and *PLONK proofs*. So why do we still ship Groth16? This post is the long answer. It is also part defence, part eulogy, part roadmap. I am writing this as someone whose [SDK still ships per-circuit Groth16](/blog/zera_sdk_scaffolding/) — and who, if I were starting over today, probably wouldn't. ## What a trusted setup actually is To prove a statement in Groth16, the prover needs a **proving key** and the verifier needs a **verification key**. Both are derived from a *toxic-waste secret* $\tau$ that, if it ever leaked, would let an attacker fabricate proofs. The job of the ceremony is to compute the proving and verification keys *without anyone — including all ceremony participants combined — ever holding $\tau$ in plaintext*. It works because of a property called *MPC-with-1-of-n trust*: as long as at least one ceremony participant securely deletes their portion of the toxic waste, the secret is destroyed for everyone. 
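The chaining is mechanical enough to show in a toy model. Each participant raises every SRS element by powers of their own secret, so the effective $\tau$ is the product of all contributions — destroy any one factor and $\tau$ is gone. Toy multiplicative group, illustrative only; real ceremonies work over BN254/BLS12-381 points and also publish proofs that each update was well-formed:

```python
import secrets

# Toy safe-prime group (p = 2q + 1); real ceremonies use elliptic-curve groups.
p, q = 2879, 1439
g = 4  # generator of the order-q subgroup

def fresh_srs(degree):
    """Starting SRS: powers of tau with tau = 1, i.e. [g, g, ..., g]."""
    return [g] * (degree + 1)

def contribute(srs, s):
    """One participant's update: raise the j-th element to s^j, turning
    g^(tau^j) into g^((tau*s)^j). s is that participant's toxic waste."""
    return [pow(elem, pow(s, j, q), p) for j, elem in enumerate(srs)]

srs = fresh_srs(4)
contributions = [secrets.randbelow(q - 2) + 2 for _ in range(3)]
for s in contributions:
    srs = contribute(srs, s)

# The effective tau is the product of all contributions -- known to nobody
# unless *every* participant kept (rather than deleted) their secret.
tau = 1
for s in contributions:
    tau = tau * s % q
assert all(srs[j] == pow(g, pow(tau, j, q), p) for j in range(5))
```

Deleting any single `s` after contributing is enough: the remaining participants' secrets multiply to a value that is uniformly unrelated to the final $\tau$, which is the 1-of-n argument in miniature.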
You can run the ceremony with 1,000 participants and the security argument requires only that *one* of them was honest.

Phase 1 is *circuit-independent* and produces a Powers-of-Tau structured reference string usable by any circuit up to a max constraint count. Phase 2 is *circuit-specific* — you have to run a fresh ceremony every time the circuit changes. That second sentence is the entire problem.

## A short history of ceremonies that mattered

Three numbers tell the story:

- **Zcash Sapling (2018):** 87 participants, three months of coordination, 220 GB of intermediate transcript.
- **Tornado Cash Phase 2 (2019):** 1,114 participants, web-based contributor tooling, two weeks.
- **Ethereum KZG Summoning (2022–23):** 141,416 participants, *running for over a year*, web + CLI + browser-extension contributor tooling.

The Ethereum ceremony is the high-water mark and the one that most decisively shifts the conversation. With 141,000+ participants, a 1-of-n honesty assumption is *practically* indistinguishable from no honesty assumption at all. The probability that *every single one* of 141,000 participants colluded to leak $\tau$, and then kept that secret without it leaking out the back, is below the operational threshold of any threat model worth taking seriously.

So: **the Ethereum KZG ceremony output is, in 2026, treated as a publicly trustworthy SRS for any circuit that fits inside its size budget.** PLONK / Marlin / Halo2-KZG / any KZG-using protocol can reuse it. Aztec Ignition's 2019 output played the same role for BN254 before it; the Ethereum ceremony is bigger, fresher, and run with 2024-vintage tooling.

The ceremonies that *didn't* work matter too.
The early-Zcash Sprout ceremony was scrutinised after the fact for inadequate transcript retention and contributor non-determinism. Several smaller projects ran ceremonies with 3–5 contributors and predictable entropy beacons, and the cryptographic community treats their outputs as effectively untrusted. The line between "ceremony" and "ceremony that closes the trust gap" is mostly *participant count* and *entropy-source diversity*. ## Why per-circuit ceremonies feel anachronistic There are three setup models in 2026, and they cleanly divide: The argument *against* Groth16 in 2026 is not that the per-circuit ceremony is hard — the tooling is much better than it was in 2018. It's that: 1. **The proof-size advantage has narrowed.** Groth16 proofs are ~200 bytes, KZG-based PLONK proofs ~600 bytes. On a chain that prices verification by *gas* and not *bytes*, that's a marginal difference. 2. **The verification-cost advantage has narrowed.** Modern PLONK / Halo2 verifiers on the EVM are within a factor of 2-3 of Groth16's gas cost, down from 5-10× in 2020. 3. **The agility cost is large.** Every circuit change requires a fresh ceremony. For a fast-moving project that wants to upgrade circuits quarterly, this is a real recurring cost. 4. **The composability cost is large.** Two Groth16 circuits with separate ceremonies cannot share a verifier; on a universal SRS, two PLONK circuits can. Groth16 today is the right choice for *frozen circuits in stable deployments* — circuits you expect to ship once and then run for years without modification. It's the wrong choice for *active research and iteration*, which describes most ZK projects in 2026. ## Why Groth16 isn't dead, even so Two reasons, both engineering: **On-chain verifier ergonomics.** Solana's `sol_alt_bn128_pairing` syscall is built for Groth16; on-chain PLONK verification on Solana costs hundreds of thousands of compute units more. 
This is what keeps [zera-sdk](/blog/zera_sdk_scaffolding/) on Groth16 today: the marginal-cost calculation for a *deposit* is dominated by the on-chain verifier cost, and Solana's verifier surface is BN254-Groth16-shaped. **The accumulated zkey ecosystem.** Every Groth16 circuit ever shipped has a tested, audited zkey artifact and a corresponding Solidity / Solana / Move verifier contract. Migrating off Groth16 means either (a) re-running ceremonies for the universal SRS path or (b) waiting for the chain's verifier surface to support transparent setup. (b) is in progress on multiple chains; (a) is mostly done on Ethereum and not yet on Solana. The death of the trusted setup, like most deaths, is gradual. Groth16 is dying in 2026 the way SHA-1 was dying in 2014 — still everywhere, still working, increasingly the wrong choice for new builds. ## The migration path I'd actually take If I were starting a new ZK project this quarter, the decision tree would be: 1. **Do you need EVM verification?** If yes, **Halo2-KZG** (Axiom fork) and reuse the Ethereum KZG SRS. No fresh ceremony required for circuits up to ~$2^{28}$ constraints. 2. **Do you need Solana verification?** If yes, **Groth16 + per-circuit Phase 2 ceremony**, until Solana ships a transparent-setup-compatible verifier syscall. Track the [SIMD threads](https://github.com/solana-foundation/solana-improvement-documents) for this. 3. **Do you need no on-chain verification at all (zkVM, off-chain proving, audit logs)?** **Plonky3** with BabyBear or Mersenne31. Transparent setup, fastest prover, smallest deployment surface. 4. **Are you proving recursive computation across many steps (zkVMs, rollups)?** Folding scheme — **Nova** or **ProtoStar** — over Pasta or Pasta-style cycle. Transparent. The two cells in this matrix that still pin you to Groth16 are *Solana on-chain* and *very-low-gas EVM verification* (rare in 2026 since EVM gas costs have crashed for Halo2 verifiers). 
For everything else, the universal-or-transparent path is strictly better. ## What this means for ZERA today We ship Groth16. The Phase 2 ceremony for the deposit, transfer, and withdraw circuits ran in late 2025 with 23 participants and is documented in the SDK repo. The output is reproducible; the contributor transcripts are public; we are comfortable with the security argument *for the threat model we ship under* (consumer privacy on a public L1, not state-actor adversaries). We will migrate when one of two things happens: 1. **Solana ships a STARK-compatible verifier syscall** — at which point the on-chain side stops constraining the off-chain choice, and we move to Plonky3 over BabyBear. 2. **We ship a meaningful circuit upgrade** that requires a re-ceremony anyway — at which point the marginal cost of switching to a universal-SRS protocol is much smaller, and we move to PLONK over the Ethereum KZG SRS. Until one of those happens, Groth16. The cypherpunk part of me wishes (1) had already happened. The shipping part of me knows (1) hasn't, and that "we use the same proof system as Aztec, Tornado Cash, Iden3, and most of the early Zcash mainnet" is not the worst place to be parked in mid-2026. ## What I would change about ceremony culture in 2027 Three things, in order of how much I'd actually push for them: 1. **Standardised contributor transcripts.** Every ceremony rolls its own transcript format, contributor verification flow, and beacon-source documentation. A single `ceremony-transcript.toml` schema — adopted across snarkjs / Trusted-Setup-CLI / community tooling — would make multi-ceremony auditing dramatically easier. 2. **Public ceremony reuse registry.** "What's the freshest Phase 1 over BN254 right now?" is a question I ask quarterly and answer by reading other people's repos. A simple registry of *ceremony output → SRS constraints → audit status → known users* would close that gap. 3. 
**Browser-native ceremony participation.** The Ethereum KZG ceremony shipped a beautiful browser participant. Most other ceremonies have not, and the contributor pool reflects that. A reusable browser-ceremony-participation library would broaden the contributor demographics for any future Phase 2. None of these are research questions. They're community-tooling questions, and they're the kind of work that doesn't get done because it doesn't publish. ## Further reading - [How do trusted setups work?](https://vitalik.eth.limo/general/2022/03/14/trustedsetup.html) — Vitalik Buterin (2022) — the most readable summary - [PLONK: Permutations over Lagrange-bases for Oecumenical Noninteractive arguments of Knowledge](https://eprint.iacr.org/2019/953) — Gabizon, Williamson, Ciobotaru (2019) — universal SRS, the alternative to per-circuit ceremonies - [Marlin: Preprocessing zkSNARKs with Universal and Updatable SRS](https://eprint.iacr.org/2019/1047) — Chiesa, Hu, Maller, Mishra, Vesely, Ward (2019) - [Scalable, transparent, and post-quantum secure computational integrity](https://eprint.iacr.org/2018/046) — Ben-Sasson, Bentov, Horesh, Riabzev (2018) — the no-setup direction - [Ethereum KZG Summoning Ceremony](https://ceremony.ethereum.org/) — the largest ceremony ever run, with 141,416+ contributors - [Halo2 in 2026: what changed since the Zcash era](/blog/halo2_in_2026_what_changed/) — sister post on the KZG-based universal-SRS workhorse - [Plonky3, the small-fast-cheap revolution](/blog/plonky3_small_fast_cheap/) — sister post on the no-setup STARK-family alternative --- # WASM-native proving for ZK SDKs: an SDK author's take Canonical: https://blog.skill-issue.dev/blog/wasm_native_proving_sdk_authors_take/ Description: Why zera-sdk ships native Rust on Node and snarkjs in the browser — and what it would actually cost to ship a WASM-compiled Rust prover for the browser path. A design post about the dual-target build pipeline. 
Published: 2026-05-03T19:00:00.000Z Tags: wasm, sdk, neon, rust, snarkjs, arkworks, zera, zk, phd import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx"; The most-asked engineering question on every ZK SDK call I take is some shape of: > "Why are you using snarkjs in the browser when you have a Rust core?" The honest answer is that we made a decision in March 2026, captured it in [RFC 001](/docs/001-zera-sdk-monorepo-shape/) under the heading *"notes are too sensitive to round-trip through WASM"*, and have been quietly re-evaluating it ever since. The dishonest answer is that we shipped what was working. Both answers contain something true. This post is the long version that fits neither into a Twitter thread nor into the RFC. The shape of the problem is: the Rust core exists, it's faster than snarkjs, and yet for the browser path we ship snarkjs. Why? And what would it actually cost to swap? ## The dual-target shape Every ZK SDK in 2026 has the same engineering shape, even if its authors don't admit it: N[neon-rs - native node bindings] C --> W[wasm-pack target] N --> NJ[zera-sdk on Node and Electron] W --> WB[Browser path - planned] C2[circuits .circom] --> SNK[snarkjs WASM prover] C2 --> ARK[arkworks-circom Rust prover] ARK --> N ARK --> W SNK --> WB2[Browser path - shipped today] classDef ship fill:#0a4014,stroke:#4ade80,color:#fff classDef plan fill:#3a2a0a,stroke:#facc15,color:#fff class NJ,WB2 ship class WB plan`}/> There is a Rust core that does *crypto primitives* (Poseidon, Merkle, nullifiers, note construction). The Rust core compiles two ways: 1. **Native, via [`neon-rs`](https://neon-rs.dev/)**, into a Node.js addon that ships zero-copy across the Buffer ABI. 2. **WebAssembly, via `wasm-pack` / `wasm-bindgen`**, for browser environments. There is also a *prover* — a separate concern from the crypto primitives — that takes a circuit's R1CS plus a witness plus a zkey and produces a proof. 
The prover is structurally separate from the core and ships as one of: 1. **snarkjs**, a JavaScript prover with a hand-tuned WASM bigint inside it. Browser-native, mature. 2. **arkworks-circom**, a Rust prover that consumes the same R1CS and zkey, compiled either native (server) or WASM (browser). ZERA today ships **option 1 of the core via neon-rs (native)** and **option 1 of the prover (snarkjs) in the browser**. The path that doesn't exist is *the Rust prover compiled to WASM*. That's the gap this post is about. ## Why we deferred the WASM prover Three reasons, in honest order of how much each weighed: ### 1. The marshalling cost of crypto-primitive calls is real When the SDK computes a Poseidon commitment, it calls `zera-core` from TypeScript. Through neon-rs, that call is *zero-copy*: the JS Buffer holding the note bytes is a pointer the Rust side reads directly. Through wasm-bindgen, the same call requires copying the bytes into the WASM linear memory, calling the function, and copying the result back. For a 32-byte input and a 32-byte output that's tens of microseconds — negligible per call, real when you're hashing 32 Merkle nodes per proof. **Measured numbers**, on a 2024 MacBook Air M3, hashing one BN254 Poseidon node: | Path | Cost | Notes | |---|---|---| | `zera-core` via neon-rs | ~12 µs | Native Rust, zero-copy | | `circomlibjs` Poseidon | ~280 µs | Pure JS BigInt | | `zera-core` via wasm-bindgen | ~85 µs | Marshalling dominates | | `zera-core` via wasm-bindgen, batched 32 | ~430 µs (= ~13 µs/hash) | Marshalling amortises | The batched WASM call is competitive with the native path because the marshalling overhead is paid once per batch and not once per hash. *That's the engineering punch line*: WASM-from-Rust is fine if you design the API around batched calls, and *bad* if you ship a one-call-per-primitive ergonomic API. snarkjs gets this right by accident — its internals are batched because they're polynomial-time, not constraint-time. 
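A two-parameter model makes the table's punch line explicit: treat each wasm-bindgen call as a fixed marshalling overhead plus true hashing time. The numbers come from the table above; fitting them to just two parameters is my simplification:

```python
# Fit the post's measurements to cost(batch) = marshal + n * per_hash (all in µs).
NATIVE_PER_HASH = 12      # neon-rs path, zero-copy
WASM_SINGLE = 85          # one hash per wasm-bindgen call
WASM_BATCH_32 = 430       # 32 hashes in one wasm-bindgen call

per_hash = (WASM_BATCH_32 - WASM_SINGLE) / 31   # ~11 µs of actual hashing
marshal = WASM_SINGLE - per_hash                 # ~74 µs crossing the JS/WASM boundary

def wasm_cost(n_hashes, batch_size):
    """Total µs to hash a Merkle path, `batch_size` hashes per boundary crossing."""
    calls = -(-n_hashes // batch_size)           # ceil division
    return calls * marshal + n_hashes * per_hash

print(wasm_cost(32, 1))   # one call per hash: marshalling dominates (~2.7 ms)
print(wasm_cost(32, 32))  # one batched call: competitive with native (~0.43 ms)
```

The model says the API surface, not the prover, decides the winner: the same Rust code is ~6× slower or roughly native-speed depending on how many hashes ride each boundary crossing.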
A naive port of neon-rs's API surface to WASM would *lose* performance vs the native path while *also* losing performance vs snarkjs, because it would batch neither. ### 2. The wasm-bindgen-rayon deployment story is fragile Multi-threaded Rust in the browser depends on the [`wasm-bindgen-rayon`](https://github.com/RReverser/wasm-bindgen-rayon) adapter, which depends on `SharedArrayBuffer`, which depends on the [cross-origin isolation headers](https://web.dev/articles/coop-coep) `Cross-Origin-Opener-Policy: same-origin` and `Cross-Origin-Embedder-Policy: require-corp` being served by your CDN. Without those headers, the WASM prover *runs single-threaded*, at which point it loses to snarkjs because snarkjs is allowed to use Web Workers from JavaScript directly without needing isolation. That's not theoretical. Several wallet integration partners we've talked to embed our SDK *inside an iframe* on third-party sites where they don't control the headers. snarkjs works there. wasm-bindgen-rayon does not. Until the embedding situation improves — `Worker` threads as a first-class WASM feature, ideally via the [`wasi-threads` proposal](https://github.com/WebAssembly/wasi-threads) — the deployment surface for a Rust-WASM prover is *narrower* than the deployment surface for snarkjs, even if the prover itself is faster on the supported subset. ### 3. snarkjs, today, is good enough for the circuits we ship This is the part the cypherpunk in me hates and the shipping engineer in me has made peace with. The circuits inside zera-sdk — deposit, transfer, withdraw — are in the 5,000–25,000 constraint range. snarkjs proves them in 1–4 seconds in the browser, threads on, IndexedDB-cached zkey. That's slow enough to need a loading state and fast enough that users don't bail. (Numbers from [Proving in the browser, by the numbers](/blog/proving_in_the_browser_by_the_numbers/).) The arkworks-WASM prover would prove the same circuits in 0.5–1.2 seconds — a 3–5× win. 
That's a real win and not a transformative one. *Transformative* would be folding (Nova, SuperNova, ProtoStar) for batch operations, or a small-field STARK migration for the substrate. The marginal-cost calculation said: ship snarkjs, queue arkworks-WASM, prioritise folding for the v2 batch flow. ## What the WASM prover would actually cost Concretely, if I were spec'ing the work: | Task | Estimate | Risk | |---|---|---| | Vendor `arkworks-circom` and pin to a known-good commit | 2 days | Low | | Build it for the `wasm32-unknown-unknown` target with `wasm-bindgen-rayon` | 3 days | Low | | Add COOP/COEP headers to the SDK reference deployment | 1 day | Low | | API parity with the snarkjs path (proof format, zkey loader) | 5 days | Medium — proof byte-format differences exist | | Browser benchmark suite + regression tests | 5 days | Medium | | Iframe-fallback path that auto-degrades to snarkjs without isolation | 5 days | High — this is the actual hard part | | Documentation, partner integration guides | 5 days | Medium | | **Total** | ~5 person-weeks | The fallback path is the load-bearing risk | The genuinely hard part isn't compiling the prover. It's *the fallback path*. We can't ship a browser SDK that breaks on every embedded iframe deployment. So the SDK has to detect at runtime whether SharedArrayBuffer is available, and silently fall back to snarkjs if it isn't. That dual-prover fallback path *adds* maintenance overhead — two provers, two zkey loaders, two test matrices — that doesn't exist today. This is the calculation that keeps coming out the same way: **5 person-weeks for a 3–5× speedup on the supported subset, plus permanent dual-prover maintenance, vs. shipping the same code budget on folding for batch ops or on Solana-side STARK readiness.** The folding work has more upside; the STARK work has more strategic value. The WASM prover work has the most concrete win for the *current* shape of usage. We're going to ship the WASM prover. 
It's on the v0.5 milestone. But it's been on a milestone for two quarters now, and the reason it keeps slipping is that every quarter the alternative work has bigger expected value. ## The four-way SDK-author tradeoff ## What changed my mind in 2026 Two things, both external: **Header support went mainstream.** Every major hosting provider — Vercel, Netlify, Cloudflare Pages — now ships COOP/COEP header configuration as a first-class feature. In 2024 you had to write custom worker code to inject the headers; in 2026 it's a checkbox. That moves the "fallback path complexity" from *load-bearing* to *secondary risk*. **The Mopro / zkMopro project published clean comparison numbers.** [Their Circom prover comparison](https://zkmopro.org/blog/circom-comparison/) gives a third-party benchmark that I can point partners at when I'm justifying a 3–5× speedup. Internal benchmarks are *also* useful, but the question changes when there's external corroboration. The combination of these two means the *case for shipping the WASM prover* is meaningfully stronger in mid-2026 than it was in early 2026. I'd put 70% confidence that we ship arkworks-WASM in the browser path before the end of 2026, and that the snarkjs fallback survives as a secondary path indefinitely. ## A note on the bigger architectural question The deeper question — the one I think about more often than I write about — is whether the *crypto-primitives core* and the *prover* should even be the same artefact. They're not in zera-sdk: `zera-core` is the primitives, snarkjs is the prover, they don't share code. That separation has been quietly excellent for shipping velocity. What we *don't* do is share the same separation in our partner SDKs. Several integrators have asked: "can I use zera-core for the primitives but a different prover for my circuit?" 
The answer today is yes, with caveats — the witness format has to match, the zkey has to be Groth16-over-BN254, and the on-chain verifier has to accept the resulting proof. In practice nobody has done this yet. But the architectural shape supports it, and if a partner wanted to ship a Halo2 verifier on Solana (when that's possible), they could keep using zera-core's primitives and swap the prover wholesale.

This is the right shape, in retrospect. The crypto core is *small, well-tested, audited*. The prover is *big, fast-moving, swappable*. Conflating them — as some early SDKs do — bakes the prover choice into every wallet integration, and makes the prover migration ZERA is *currently considering* much more painful than it needs to be.

## What I'd ship differently for v0.5

Three concrete deliverables, in order of how much they'd actually move the needle:

1. **`@zera-labs/sdk-prover-wasm`** — a separate npm package containing arkworks-circom compiled to WASM with `wasm-bindgen-rayon`. Opt-in via a constructor flag; falls back to snarkjs on unsupported platforms. This is the work I described above; the new shape is to ship it as a separate package so existing integrators don't pull a 4 MB WASM blob unless they want it.
2. **MCP-side prover-selection tool.** The [`@zera-labs/mcp-server`](/blog/mcp_server_inside_zera_sdk/) currently uses snarkjs unconditionally. An MCP-level configuration for "prefer the fastest available prover" would let agents tune for batch operations vs. one-shot transactions. More upside than it sounds.
3. **A shared zkey-loader abstraction.** Today the SDK reads zkeys from URLs; the MCP server reads them from disk; the test harness reads them from a fixtures directory. A `ZkeyLoader` trait — backed by URL, IndexedDB, fs, or arbitrary user code — would unify the three paths and unblock a "user provides their own zkey" advanced flow that several research partners have asked for.
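To make the third deliverable concrete, here's a minimal sketch of what a `ZkeyLoader` abstraction could look like in TypeScript — the interface and class names are hypothetical, not the actual zera-sdk API:

```typescript
// Hypothetical ZkeyLoader abstraction — illustrative, not the shipped API.
interface ZkeyLoader {
  load(circuit: string): Promise<Uint8Array>;
}

// Backed by an in-memory map — the test-harness / fixtures shape.
class MemoryZkeyLoader implements ZkeyLoader {
  constructor(private store: Map<string, Uint8Array>) {}
  async load(circuit: string): Promise<Uint8Array> {
    const zkey = this.store.get(circuit);
    if (!zkey) throw new Error(`no zkey for circuit: ${circuit}`);
    return zkey;
  }
}

// Wraps any loader with a cache, so URL- or fs-backed loaders only
// pay the fetch cost once per circuit.
class CachingZkeyLoader implements ZkeyLoader {
  private cache = new Map<string, Promise<Uint8Array>>();
  constructor(private inner: ZkeyLoader) {}
  load(circuit: string): Promise<Uint8Array> {
    let pending = this.cache.get(circuit);
    if (!pending) {
      pending = this.inner.load(circuit);
      this.cache.set(circuit, pending);
    }
    return pending;
  }
}
```

A URL-backed implementation is the same interface with a `fetch` inside `load`; the SDK, MCP server, and test harness would each pick an implementation without the circuits caring where the bytes came from.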
## Further reading

- [RFC 001: zera-sdk monorepo shape](/docs/001-zera-sdk-monorepo-shape/) — the design doc this post discusses
- [`Dax911/zera-sdk`](https://github.com/Dax911/zera-sdk) — the SDK itself
- [Mopro: comparison of Circom provers](https://zkmopro.org/blog/circom-comparison/) — the external benchmark that changed my prior on the WASM-prover decision
- [`iden3/snarkjs`](https://github.com/iden3/snarkjs) — what we ship in the browser today
- [`wasm-bindgen-rayon`](https://github.com/RReverser/wasm-bindgen-rayon) — what we'd use for the Rust-WASM prover path
- [Proving in the browser, by the numbers](/blog/proving_in_the_browser_by_the_numbers/) — the prover-time data that informed this post
- [The MCP server inside zera-sdk](/blog/mcp_server_inside_zera_sdk/) — the third audience this SDK serves
- [Building the Zera SDK: Day One](/blog/zera_sdk_scaffolding/) — the day-one architecture

---

# Plonky3, the small-fast-cheap revolution

Canonical: https://blog.skill-issue.dev/blog/plonky3_small_fast_cheap/
Description: Why plonky3 — small fields, FRI commitments, no trusted setup — is the proof system to watch in 2026. The Mersenne31 / BabyBear / Goldilocks landscape, the FRI folding step, and why your laptop is suddenly a viable prover.
Published: 2026-05-02T17:00:00.000Z
Tags: plonky3, fri, stark, mersenne31, babybear, goldilocks, zk, phd

import { Mermaid, Sandbox, TradeoffTable, Aside, Quote } from "@/components/mdx";

For a decade the dominant question in proof-system engineering was *which curve*. BN254 because Ethereum verifies it cheaply. BLS12-381 because Zcash and Filecoin standardised on it. The conversation orbited 254-bit and 381-bit *pairing-friendly* prime fields, and the engineering economy followed: every multiplier, every NTT, every MSM was tuned for those sizes. Then Polygon Zero shipped [plonky2](https://github.com/0xPolygonZero/plonky2) in 2022, then [plonky3](https://github.com/Plonky3/Plonky3) in 2024, and the question changed.
The new question is *which 31-bit prime*. Mersenne31. BabyBear. KoalaBear. Fields small enough that two elements fit in a single 64-bit word. Fields where AVX-512 SIMD lanes hold sixteen field elements at once. Fields where a consumer laptop is suddenly a viable prover for circuits that used to require a small datacentre.

This is the small-fast-cheap revolution. It is also the most underrated story in production cryptography in 2026, because most of the conversation about it is happening inside Polygon, Succinct, and a handful of zkVM teams, and it hasn't yet hit the popular "ZK in 2026" articles. This post is my attempt to write the article I keep wishing existed.

## The case for small fields

Every proof-system operation eventually reduces to *multiply two field elements modulo a prime*. The cost of one of those multiplies is essentially:

$$
\text{cost}(\mathbb{F}_p) = O(\lceil \log_2 p / W \rceil^2)
$$

where $W$ is your machine's word size (typically 64 bits) — i.e., the cost is *quadratic* in the number of machine words required to hold a field element. For BN254's 254-bit prime that's 4 limbs, so $\sim 16$ low-level multiplies per high-level field multiplication. For Mersenne31 — the prime $p = 2^{31} - 1$ — that's *one* limb, so *one* low-level multiply. Sixteen times faster on the floor.

The headline win is fewer cycles per multiply. The hidden win — and the one that actually shifts the deployment landscape — is *SIMD parallelism*. AVX2 holds eight 32-bit lanes; AVX-512 holds sixteen. With BN254 you can fit two field elements in an AVX-512 register and parallelism is awkward. With Mersenne31 you fit sixteen, and operations like NTTs become embarrassingly parallel.

There is one cost. **Soundness.** A 31-bit prime gives you ~31 bits of security per query in a STARK / FRI-based protocol.
To get to the standard 100-bit security, you query the FRI oracle multiple times (~100 queries), or you work in a *quadratic / quartic / quintic extension field* during the protocol's soundness-critical steps. Plonky3 does both: prover work happens in the base field for speed, and the random-evaluation challenges (where soundness lives) happen in an extension field.

This is the core trick. **Big fields where you need security; small fields everywhere else.** It buys an order of magnitude in prover time without compromising the threat model.

## The four small-field contenders

There are four primes the 2026 ecosystem cares about. They're all chosen because they admit fast modular reduction (no expensive division per multiply) and they all fit comfortably in a 64-bit word.

| Field | Prime | Why this prime |
|---|---|---|
| **Mersenne31** | $p = 2^{31} - 1$ | Mersenne prime — reduction is one shift + one add; smallest sensible prime field |
| **BabyBear** | $p = 2^{31} - 2^{27} + 1$ | NTT-friendly — has a 2-adicity of 27, so domain sizes up to $2^{27}$ admit fast FFTs |
| **KoalaBear** | $p = 2^{31} - 2^{24} + 1$ | NTT-friendly — slightly worse 2-adicity (24) but better extension-field arithmetic |
| **Goldilocks** | $p = 2^{64} - 2^{32} + 1$ | 64-bit prime; used by plonky2 and zkSync's Boojum; fits in one machine word |

Plonky3 supports all of them and lets you pick at compile time. The choice changes the constant in front of the prover time and the security analysis but doesn't change the protocol shape. In production:

- **plonky2** (the older Polygon Zero proof system, still widely deployed) uses Goldilocks.
- **plonky3** primarily ships with BabyBear or KoalaBear as the recommended defaults.
- **Risc Zero's zkVM** uses BabyBear.
- **Succinct's SP1** uses BabyBear.
- **Stwo / StarkWare's next-gen** uses Mersenne31 (the M31 / `circle-stark` program).

The convergence is striking: every serious 2026 zkVM is on a small field.
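Both properties in the table are checkable in a few lines. A sketch in TypeScript, with bigints standing in for what would be single hardware words in a real prover — the Mersenne31 fold-reduction, and the 2-adicity that caps NTT domain sizes:

```typescript
// Why these primes: cheap reduction (Mersenne31), high 2-adicity (the Bears).
const mersenne31 = (1n << 31n) - 1n;
const babyBear = (1n << 31n) - (1n << 27n) + 1n;
const koalaBear = (1n << 31n) - (1n << 24n) + 1n;
const goldilocks = (1n << 64n) - (1n << 32n) + 1n;

// Mersenne31 reduction: x = hi * 2^31 + lo ≡ hi + lo (mod p).
// One shift + one add (folded twice to bound the range), no division.
function reduceM31(x: bigint): bigint {
  let r = (x >> 31n) + (x & mersenne31); // < 2^32
  r = (r >> 31n) + (r & mersenne31);     // <= p + 1
  return r >= mersenne31 ? r - mersenne31 : r;
}
const mulM31 = (a: bigint, b: bigint): bigint => reduceM31(a * b);

// 2-adicity: the largest k with 2^k dividing (p - 1). A multiplicative
// NTT needs a 2^k-th root of unity, so this caps the FFT domain size.
function twoAdicity(p: bigint): number {
  let n = p - 1n;
  let k = 0;
  while ((n & 1n) === 0n) {
    n >>= 1n;
    k++;
  }
  return k;
}
```

`twoAdicity(babyBear)` is 27 and `twoAdicity(mersenne31)` is just 1 — there is no large power-of-two multiplicative subgroup in $\mathbb{F}_{2^{31}-1}^\times$, which is exactly why Stwo's circle-STARK works over the circle group instead of an ordinary multiplicative NTT domain.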
The big-field era for *zkVMs specifically* is closing.

<Mermaid chart={`graph TD
  G[2016: Groth16 - BN254]
  G --> P[2019: PLONK + KZG]
  P --> H[2020: Halo2 - Pasta IPA]
  H --> H2[2024: Halo2 - KZG/BN254]
  G --> S[2018: STARK - Goldilocks]
  S --> P2[2022: plonky2 - Goldilocks]
  P2 --> P3[2024: plonky3 - BabyBear]
  P3 --> ZK1[zkVMs: SP1, RISC0, Stwo]
  H2 --> EVM[EVM rollups]
  classDef big fill:#3a0a0a,stroke:#f87171,color:#fff
  classDef small fill:#0a4014,stroke:#4ade80,color:#fff
  class G,P,H,H2,EVM big
  class S,P2,P3,ZK1 small`}/>

## FRI — the polynomial commitment behind everything small

The reason small fields work in proof systems at all is **FRI** (Fast Reed-Solomon Interactive Oracle Proof of Proximity), introduced in [Ben-Sasson, Bentov, Horesh, Riabzev (2018)](https://eprint.iacr.org/2018/046). FRI is a *polynomial commitment scheme* that works over any field — no pairing-friendliness required, no trusted setup, no SRS. The trade-off is proof size: FRI proofs are tens of kilobytes, where KZG proofs are 600 bytes.

For the prover, FRI is the most expensive thing in the protocol. Most of it is *folding*: at each round you take a polynomial of degree $d$ and reduce it to a polynomial of degree $d/2$ by combining adjacent coefficient pairs. Repeat $\log_2 d$ times and you arrive at a constant-degree polynomial that the verifier can check directly. The folding step is one line of arithmetic:

$$
f'(x^2) = \frac{f(x) + f(-x)}{2} + r \cdot \frac{f(x) - f(-x)}{2x}
$$

where $r$ is a random challenge from the verifier. If $f$ has degree $d$, $f'$ has degree $\lfloor d/2 \rfloor$. The verifier checks consistency at a small number of *query points* drawn at random.

Below is a tiny Sandpack demo that visualises the folding step on a small polynomial — you pick a degree-7 polynomial, the demo folds it to degree-3, then degree-1, then a constant, and shows the coefficients at each step.

<Sandbox files={{
  "/index.ts": `// Arithmetic over the toy field F_101.
const P = 101n;

function add(a: bigint, b: bigint): bigint {
  return (a + b) % P;
}

function mul(a: bigint, b: bigint): bigint {
  return (a * b) % P;
}

// Modular exponentiation by repeated squaring.
function pow(b: bigint, e: bigint): bigint {
  let r = 1n;
  while (e > 0n) {
    if (e & 1n) r = mul(r, b);
    b = mul(b, b);
    e >>= 1n;
  }
  return r;
}

// Evaluate polynomial f at x.
function evalPoly(coeffs: bigint[], x: bigint): bigint {
  let acc = 0n;
  for (let i = coeffs.length - 1; i >= 0; i--) acc = add(mul(acc, x), coeffs[i]);
  return acc;
}

// Split coefficients into even-indexed and odd-indexed parts.
// f(x) = f_even(x^2) + x * f_odd(x^2)
function split(coeffs: bigint[]): [bigint[], bigint[]] {
  const even: bigint[] = [];
  const odd: bigint[] = [];
  for (let i = 0; i < coeffs.length; i++) {
    if (i % 2 === 0) even.push(coeffs[i]);
    else odd.push(coeffs[i]);
  }
  return [even, odd];
}

// FRI folding: given f(x) and challenge r, return
//   f'(y) = f_even(y) + r * f_odd(y)
// where y = x^2. The new polynomial has half the degree.
function fold(coeffs: bigint[], r: bigint): bigint[] {
  const [even, odd] = split(coeffs);
  const out: bigint[] = [];
  const n = Math.max(even.length, odd.length);
  for (let i = 0; i < n; i++) {
    const e = i < even.length ? even[i] : 0n;
    const o = i < odd.length ? odd[i] : 0n;
    out.push(add(e, mul(r, o)));
  }
  return out;
}

const out = document.getElementById("out")!;
const reroll = document.getElementById("reroll") as HTMLButtonElement;

function fmt(coeffs: bigint[]): string {
  return "[ " + coeffs.map((c) => c.toString().padStart(2, " ")).join(", ") + " ]";
}

function run() {
  // A degree-7 polynomial over F_101.
  const f = [3n, 1n, 4n, 1n, 5n, 9n, 2n, 6n];
  let lines = [];
  lines.push("FRI folding over F_101");
  lines.push("======================");
  lines.push("");
  lines.push(\`degree-7 poly: \${fmt(f)}\`);

  // Random challenges for each fold.
  let curr = f;
  let round = 0;
  while (curr.length > 1) {
    const r = BigInt(Math.floor(Math.random() * 100) + 1);
    const folded = fold(curr, r);
    lines.push("");
    lines.push(\`round \${round + 1}: r = \${r}\`);
    lines.push(\`  before: \${fmt(curr)} (degree \${curr.length - 1})\`);
    lines.push(\`  after:  \${fmt(folded)} (degree \${folded.length - 1})\`);
    curr = folded;
    round++;
  }
  lines.push("");
  lines.push(\`final constant: \${curr[0]}\`);
  lines.push("");
  lines.push("verifier checks consistency between rounds at randomly chosen");
  lines.push("evaluation points — those are the FRI query points.");
  out.textContent = lines.join("\\n");
}

reroll.addEventListener("click", run);
run();
`,
  "/index.html": `
<button id="reroll">reroll</button>
<pre id="out">starting...</pre>
`,
}} />

What's worth internalising from the demo: each fold is a *linear combination over field elements*. There's nothing exotic here. The reason FRI is fast in production is that the inner loop of "combine pairs of coefficients with a random multiplier" is exactly the kind of thing AVX-512 was built for. Sixteen lanes. Per cycle. Per core.

## Why "consumer hardware" matters in 2026

Here are wall-clock prover times for a 1-million-cycle zkVM trace, measured across the major 2026 zkVM stacks on a *consumer* machine — a 2024 MacBook Pro with M3 Max, 14 cores, 48 GB RAM. (Numbers from public benchmarks, normalised to the same reference input.)

| Stack | Field | Prover time | Notes |
|---|---|---|---|
| RISC Zero (zkVM) | BabyBear | ~3 minutes | STARK + AIR |
| SP1 (zkVM) | BabyBear | ~95 seconds | plonky3-based |
| Stwo (zkVM) | Mersenne31 | ~80 seconds | circle-STARK on M31 |
| zkSync (Boojum) | Goldilocks | ~5 minutes | older arithmetisation |

Two years ago, none of these were under five minutes. Today the leaderboard is a tight band between 80 seconds and 3 minutes, and the difference is dominated by *which small field*. The big-field equivalent (a pure BN254 PLONK prover at the same trace) would take 30+ minutes on the same machine. This is what "consumer hardware is now a viable prover" means in 2026. The substantial barrier — the one that kept zkVMs *off* consumer hardware until 2024 — was the cost of MSMs and NTTs over big fields. Small fields removed that barrier.

## The four-prime tradeoff

## Why this should change how you think about ZK costs

The dominant ZK cost model from 2018 to 2024 was: *more constraints = more dollars*. Field arithmetic was the bottleneck, the constants were huge, and a million-constraint circuit was a real research expense. The 2026 cost model is different. *Constraint count still matters,* but the constants have collapsed. A million-constraint Plonky3 trace proves on a $1500 laptop in under two minutes.
That's three orders of magnitude cheaper than the equivalent BN254 PLONK prover four years ago. Prover-side cost is no longer the binding constraint for most applications. The *new* binding constraints are:

1. **Memory bandwidth.** Big NTTs are memory-bound, not compute-bound. The win from small fields is partly that more elements fit in cache.
2. **Verifier complexity in non-EVM environments.** Plonky3 proofs are 50–200 KB; verifying them on Ethereum requires either an EVM-friendly final wrap (which is what the SP1 / RISC0 / Stwo verifiers do) or a Solana-style permissive compute budget.
3. **Ecosystem maturity.** snarkjs / Halo2-axiom / circomlib have a decade of accreted gadgets; Plonky3 is in year three of its current incarnation. The libraries are catching up but they're not at parity yet.

## Where this leaves zera-sdk

Inside [zera-sdk](/blog/zera_sdk_scaffolding/) the substrate is BN254 + Groth16 because *Solana's verifier is BN254-and-only-BN254 today*. There's no equivalent of `sol_alt_bn128_pairing` for any of the small-field protocols. That means Plonky3 is not a choice we get to make for the deposit / transfer / withdraw circuits — the on-chain side fixes the curve. What we *do* track is the [Solana CPI proposal for STARK verification](https://github.com/solana-labs/solana) (no number yet; was last discussed in 2025) and the related "compute-budget-friendly Halo2 verifier" path. The day Solana ships either of those, the prover-side win from migrating off BN254 is large enough to justify a circuit rewrite. Until then, BN254 it is.

For *off-chain* proving — CI checks, offline auditing, batch verification — Plonky3 is already the right tool, and we're using it inside the test harness for cross-validating circuit semantics.

## What I'd build differently in 2027

Three follow-ups, in order of how much I expect them to matter:

1. **A small-field shielded pool.** Every privacy pool today is BN254 + Groth16 + per-circuit ceremony.
The day Solana (or any high-throughput L1) ships a STARK verifier, the design space opens: no ceremony, faster proving, smaller wallets. Someone will publish this design before the verifier ships and they'll be right to.
2. **A unified extension-field abstraction.** Plonky3 has different extension-field arithmetic per base field. A single `Ext` with consistent ergonomics would make cross-field experimentation trivial. The team is aware; not yet shipped.
3. **A small-field Poseidon variant.** Poseidon-128 is parameterised for BN254. The recommended hash for BabyBear is *Monolith* or *Poseidon2 over BabyBear*, and the constraint counts are different enough that constraint-counting intuition from BN254 doesn't transfer. A "Poseidon constraint cost calculator" that takes a field as input and emits constraint counts for common circuits would close a real reasoning gap.

## Further reading

- [`Plonky3/Plonky3`](https://github.com/Plonky3/Plonky3) — the toolkit; the README is the closest thing to a paper
- [Polygon Plonky3 is Production Ready](https://polygon.technology/blog/polygon-plonky3-the-next-generation-of-zk-proving-systems-is-production-ready) — Polygon's announcement, summarising the small-field bet
- [Scalable, transparent, and post-quantum secure computational integrity](https://eprint.iacr.org/2018/046) — Ben-Sasson, Bentov, Horesh, Riabzev (2018) — the FRI / STARK paper
- [Risc Zero zkVM proof system](https://dev.risczero.com/proof-system-in-detail.pdf) — the contrast point: BabyBear + STARK in production
- [Halo2 in 2026: what changed since the Zcash era](/blog/halo2_in_2026_what_changed/) — sister post on the big-field / KZG lineage
- [Proving in the browser, by the numbers](/blog/proving_in_the_browser_by_the_numbers/) — what Plonky3 means for in-browser proving (spoiler: enormous, eventually)
- [Poseidon, by hand and by code](/blog/poseidon_by_hand_and_by_code/) — the hash that's being re-parameterised for small fields

---

# Recursive proof composition without the abyss: Halo to Nova

Canonical: https://blog.skill-issue.dev/blog/recursive_proofs_halo_to_nova/
Description: The path from Halo's accumulation scheme to Nova's folding scheme, derived from the recurrence relation. Where Halo2, Nova, SuperNova, and HyperNova actually differ, and which one to reach for in 2026.
Published: 2026-05-02T16:00:00.000Z
Tags: cryptography, recursive-snark, halo2, nova, folding, zk, phd, math

import { Mermaid, Sandbox, TradeoffTable, Aside, Quote, RustPlayground } from "@/components/mdx";

A recursive SNARK is a proof that proves *another proof was checked correctly*. A program that runs for $T$ steps and produces a proof of correct execution at each step can — recursively — collapse all $T$ proofs into one. The verifier work goes from $O(T)$ to $O(1)$. This is the structural reason ZK rollups exist. It is also the reason "incrementally verifiable computation" stopped being a research curiosity in 2020 and became a deployment target.

The two papers that sit underneath every recursive SNARK shipped today are [Halo (Bowe, Grigg, Hopwood 2019)](https://eprint.iacr.org/2019/1021) and [Nova (Kothapalli, Setty, Tzialla 2022)](https://eprint.iacr.org/2021/370). They take very different routes to the same destination. This post is the math of both, the trade-off table for picking one in 2026, and a Rust skeleton for the Nova folding step.

## The problem recursive SNARKs are solving

You have a program that runs for $T$ steps. At each step you produce a proof that the step was executed correctly. Naively, the verifier checks all $T$ proofs — verifier cost $O(T)$, no better than re-running the program. Useless.

The recursive trick: at step $i$, instead of producing a fresh proof, you produce a proof that says *"step $i$ was executed correctly, **and** the proof from step $i-1$ verifies."* The proof for step $i$ recursively absorbs the proof for step $i-1$.
After $T$ steps you have one proof; the verifier checks one proof; the cost is $O(1)$ in the program length.

The hardest part is making the inner verification cheap. If the verifier work for one proof is $V$ and you embed that work in the circuit for the next proof, you've blown up the prover cost by $V$. Recursion is only useful if $V$ is constant or near-constant in the original circuit size — which is exactly what Groth16, Halo, and Nova all aim for in different ways.

<Mermaid chart={`graph TD
  subgraph S0[Step 0]
    P0[execute step 0] --> Pr0[proof pi_0]
  end
  subgraph S1[Step 1]
    P1[execute step 1] --> V1[verify pi_0]
    V1 --> Pr1[proof pi_1: 'step 1 ran AND pi_0 verified']
  end
  subgraph S2[Step 2]
    P2[execute step 2] --> V2[verify pi_1]
    V2 --> Pr2[proof pi_2: 'step 2 ran AND pi_1 verified']
  end
  Pr0 --> V1
  Pr1 --> V2
  Pr2 --> Final[final verifier checks ONE proof]
  classDef step fill:#0a0a0a,stroke:#4ade80,color:#4ade80
  class S0,S1,S2 step`}/>

The math problem reduces to one question: *how cheaply can you verify a SNARK inside a SNARK?*

## The Halo trick: accumulation without recursion-in-circuit

Pre-Halo recursion required a **cycle of elliptic curves**. Two curves $E_1, E_2$ with the property that the scalar field of $E_1$ is the base field of $E_2$ and vice versa, so that arithmetic over one curve can be expressed natively in the other curve's circuit. Pasta (Pallas / Vesta) and MNT4/6 are the canonical cycles.

The reason this matters: if you want to verify a Groth16 proof inside a Groth16 circuit, you need pairing-friendly arithmetic *inside the circuit*, which means the circuit field has to support the pairing curve. A cycle gives you two curves where each can verify proofs over the other. The cycle constraint is annoying. Pasta's curves don't have efficient pairings (they're cycle-friendly, not pairing-friendly), so they trade pairing efficiency for cycle availability. MNT cycles have very large fields and slow arithmetic. There's no free lunch.
[Halo (Bowe, Grigg, Hopwood 2019)](https://eprint.iacr.org/2019/1021) was the first practical example of recursive proof composition that broke this constraint. The insight: instead of *verifying* the inner proof inside the circuit, you **accumulate** the most expensive part of the verification (the multiscalar multiplication, MSM) into a running sum, and defer the actual MSM check to the end of the recursion.

Formally: the verifier of an inner-product-argument-based proof has to check an equation of the form

$$
\sum_i s_i \cdot G_i = P
$$

for some derived scalars $s_i$ and group elements $G_i$. This is the bottleneck — it's a multiscalar multiplication of size linear in the circuit. The accumulation-scheme trick is: at step $k$, instead of *checking* this equation, you produce a fresh "accumulator" $\text{acc}_k = (G_k^{\text{folded}}, P_k^{\text{folded}})$ that combines the current step's MSM with the previous accumulator. After $T$ steps you have one accumulator and one MSM check. Verifier cost: $O(\log T)$ for the recursion plus one final MSM.

The Halo paper formalises this as a **polynomial commitment with deferred opening**. It works because the recursive composition can defer expensive arithmetic, not because it embeds full verification in-circuit. From the abstract:

> We present Halo, the first practical example of recursive proof composition without a trusted setup, using only the discrete logarithm assumption over normal cycles of elliptic curves. Recursion is achieved by amortizing away the expensive verification procedures from within the proof verification cycle, deferring them until the end of the recursion.

Halo2, the Zcash production deployment, uses the same construction over Pasta and ships it in Orchard (Zcash's NU5 upgrade in 2022). Halo2 is also the basis of the Scroll zkEVM and the KZG-backed Halo2 forks (halo2-axiom and friends) used across the Ethereum proving ecosystem.
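The linearity that makes the deferral sound is easy to see in a toy model. A sketch in TypeScript, with integers mod a prime standing in for curve points — no hardness, no hiding, just the linear structure of the accumulated claim (all names and constants here are illustrative):

```typescript
// Toy Halo-style accumulation: defer MSM checks by folding claims.
// The additive group Z_Q stands in for an elliptic-curve group.
const Q = 2147483647n;
const G = [5n, 11n, 23n, 42n]; // fixed "generators"

// A claim: sum_i s[i] * G[i] == P
type MsmClaim = { s: bigint[]; P: bigint };

const msm = (s: bigint[]): bigint =>
  s.reduce((acc, si, i) => (acc + si * G[i]) % Q, 0n);

// Fold two claims with a random challenge r. Because the relation is
// linear in (s, P), the folded claim holds iff both inputs did (with
// high probability over r) — so only ONE expensive MSM check is
// needed, at the very end of the recursion.
function foldClaims(a: MsmClaim, b: MsmClaim, r: bigint): MsmClaim {
  return {
    s: a.s.map((si, i) => (si + r * b.s[i]) % Q),
    P: (a.P + r * b.P) % Q,
  };
}
```

Fold $T$ claims pairwise and check `msm(acc.s) === acc.P` once — that is the deferred verification; a cheating claim survives the fold only if the challenge lands on one specific value.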
## The Nova trick: folding instead of accumulating

Two years after Halo, [Nova (Kothapalli, Setty, Tzialla 2022)](https://eprint.iacr.org/2021/370) reframed the problem entirely. Instead of accumulating an MSM, Nova introduces a **folding scheme**: a primitive that takes two instances of a relation and folds them into a single instance, with prover cost $O(|F|)$ for some step circuit $F$ and **no SNARK at all** in the recursion.

The Nova relation is a relaxed R1CS instance:

$$
\mathbf{A} \mathbf{z} \circ \mathbf{B} \mathbf{z} = u \mathbf{C} \mathbf{z} + \mathbf{e}
$$

where $\mathbf{A}, \mathbf{B}, \mathbf{C}$ are the constraint matrices, $\mathbf{z}$ is the witness extended with public inputs, $u$ is a slack scalar (1 in the standard R1CS case), and $\mathbf{e}$ is an *error vector* (zero in the standard case). The "relaxed" part is that $u$ and $\mathbf{e}$ are allowed to be nonzero — that's what makes folding possible.

Given two relaxed R1CS instances $(u_1, \mathbf{z}_1, \mathbf{e}_1)$ and $(u_2, \mathbf{z}_2, \mathbf{e}_2)$, the folding scheme produces a single instance $(u, \mathbf{z}, \mathbf{e})$ via a random challenge $r$:

$$
u = u_1 + r \cdot u_2, \quad \mathbf{z} = \mathbf{z}_1 + r \cdot \mathbf{z}_2, \quad \mathbf{e} = \mathbf{e}_1 + r \cdot \mathbf{T} + r^2 \cdot \mathbf{e}_2
$$

with $\mathbf{T}$ a "cross-term" the prover sends to the verifier. The folded instance is satisfying iff both originals were (with overwhelming probability). Crucially, **folding does not require the verifier to do any expensive cryptographic work**: $\mathbf{T}$ is a vector commitment, $r$ is a Fiat-Shamir challenge, and the new $(u, \mathbf{z}, \mathbf{e})$ is a linear combination. No pairings. No SNARK.
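The algebra is checkable numerically. A sketch in TypeScript over a toy prime field, for a single constraint $w_0 \cdot w_1 = w_2$, using the actual Nova cross-term $\mathbf{T} = A\mathbf{z}_1 \circ B\mathbf{z}_2 + A\mathbf{z}_2 \circ B\mathbf{z}_1 - u_1 \cdot C\mathbf{z}_2 - u_2 \cdot C\mathbf{z}_1$ (the field, instance values, and challenge below are arbitrary picks for illustration):

```typescript
// Numeric check of the Nova fold for one constraint w0 * w1 = w2.
const P = 2305843009213693951n; // 2^61 - 1, a toy prime field
const md = (x: bigint): bigint => ((x % P) + P) % P;

// For z = [w0, w1, w2]: (A z) = w0, (B z) = w1, (C z) = w2.
const Az = (z: bigint[]) => z[0];
const Bz = (z: bigint[]) => z[1];
const Cz = (z: bigint[]) => z[2];

type Relaxed = { u: bigint; z: bigint[]; e: bigint };

// Relaxed R1CS relation: Az ∘ Bz = u · Cz + e.
const satisfies = (i: Relaxed): boolean =>
  md(Az(i.z) * Bz(i.z)) === md(i.u * Cz(i.z) + i.e);

// Nova fold, with the real cross-term:
// T = Az1∘Bz2 + Az2∘Bz1 − u1·Cz2 − u2·Cz1.
function fold(i1: Relaxed, i2: Relaxed, r: bigint): Relaxed {
  const T = md(
    Az(i1.z) * Bz(i2.z) + Az(i2.z) * Bz(i1.z) - i1.u * Cz(i2.z) - i2.u * Cz(i1.z)
  );
  return {
    u: md(i1.u + r * i2.u),
    z: i1.z.map((zi, k) => md(zi + r * i2.z[k])),
    e: md(i1.e + r * T + r * r * i2.e),
  };
}
```

Two fresh instances (with $u = 1$, $\mathbf{e} = 0$) fold into an instance with nonzero slack and error that still satisfies the relaxed relation — which is the whole point of relaxing R1CS in the first place.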
The Nova **incrementally verifiable computation (IVC) recurrence** is then:

$$
(u_{i+1}, \mathbf{z}_{i+1}, \mathbf{e}_{i+1}) = \text{Fold}\big( (u_i, \mathbf{z}_i, \mathbf{e}_i), \; (u_F, \mathbf{z}_F^{(i)}, \mathbf{0}) \big)
$$

where the second instance is a fresh R1CS encoding of "step $i$ of the program $F$ executed correctly." After $T$ steps, you have one relaxed R1CS instance, and you produce a single SNARK that proves it's satisfying. The SNARK runs *once*, at the end. Every step in between is folding.

The cost asymmetry is the entire pitch. Halo's per-step cost is $O(|F|)$ for the step plus $O(\log T)$ for the recursion. Nova's per-step cost is just $O(|F|)$ — no recursion overhead. For long computations ($T \gg 1$) Nova is significantly cheaper. The trade-off: Nova gives you one final SNARK to verify, while Halo gives you a SNARK at every step.

<Mermaid chart={`graph TD
  subgraph N0[Nova step 0]
    R0[R1CS instance z_0]
  end
  subgraph N1[Nova step 1]
    F1[execute F at step 1] --> R1[fresh R1CS z_1]
    Acc0[accumulator U_0] --> Fold1[fold]
    R1 --> Fold1
    Fold1 --> Acc1[accumulator U_1]
  end
  subgraph N2[Nova step 2]
    F2[execute F at step 2] --> R2[fresh R1CS z_2]
    Acc1 --> Fold2[fold]
    R2 --> Fold2
    Fold2 --> Acc2[accumulator U_2]
  end
  R0 --> Acc0
  Acc2 --> SNARK[final SNARK proves U_T satisfying]
  classDef step fill:#0a0a0a,stroke:#4ade80,color:#4ade80
  class N0,N1,N2 step`}/>

The folding scheme is the entire idea. Everything else in Nova is bookkeeping around it.

## Halo2, Nova, SuperNova, HyperNova — what's the difference

Four production-grade recursive systems, four design points. The two questions that decide which one to reach for in 2026:

1. **Is your computation a uniform step that repeats, or a heterogeneous instruction set?** Uniform: Nova. Heterogeneous: SuperNova. (HyperNova handles both via CCS.)
2. **Do you need a SNARK at every step, or can you defer to one final SNARK?** Every step: Halo2. Defer: Nova / SuperNova / HyperNova.
For a privacy-pool transfer, neither of these is the right shape — you want a single SNARK per spend, no recursion. For [zera-sdk](/blog/zera_sdk_scaffolding/) we ship Groth16 and don't recurse. For a *rollup* settling many transfers in a batch, Nova-flavoured folding is the right structural answer because the per-step cost is dominated by the transfer logic and the final SNARK only runs once per epoch.

## A Nova folding step, in Rust

The cleanest way to see what's actually happening in a folding step is to write one out. The skeleton below is a Nova-shaped folding step over a toy R1CS — no pairings, no real curve, no soundness, but the linear-combination structure is the real thing.

<RustPlayground code={`// A Nova-shaped folding step over a toy "relaxed R1CS" instance.
// This is INTENTIONALLY toy-shaped: scalars are u128 mod a prime, the
// "commitment" is a hash, and there's no curve arithmetic. The shape of
// the linear combinations is real; the soundness is not.
//
// Reference: Nova: Recursive Zero-Knowledge Arguments from Folding Schemes
// https://eprint.iacr.org/2021/370

const MODULUS: u128 = (1u128 << 61) - 1; // toy Mersenne prime

#[derive(Clone, Debug)]
struct RelaxedR1CS {
    /// Slack scalar: 1 for a fresh instance, accumulates after folding.
    u: u128,
    /// Witness extended with public inputs.
    z: Vec<u128>,
    /// Error vector: 0 for a fresh instance, accumulates after folding.
    e: Vec<u128>,
    /// Vector commitment to z. (Toy: just a checksum.)
    com_z: u128,
    /// Vector commitment to e.
    com_e: u128,
}

fn add(a: u128, b: u128) -> u128 {
    (a.wrapping_add(b)) % MODULUS
}

fn mul(a: u128, b: u128) -> u128 {
    (a.wrapping_mul(b)) % MODULUS
}

fn vec_add(a: &[u128], b: &[u128]) -> Vec<u128> {
    a.iter().zip(b.iter()).map(|(x, y)| add(*x, *y)).collect()
}

fn vec_scale(a: &[u128], s: u128) -> Vec<u128> {
    a.iter().map(|x| mul(*x, s)).collect()
}

// Toy "commitment": rolling hash. A real Nova uses Pedersen commitments.
fn commit(v: &[u128]) -> u128 {
    let mut h: u128 = 0xCAFE_BABE_DEAD_BEEF;
    for x in v {
        h = h.wrapping_mul(0x100000001b3).wrapping_add(*x);
    }
    h % MODULUS
}

/// Fold two relaxed R1CS instances into one, using random challenge r.
/// Returns the folded instance and the cross-term T (which the prover
/// sends to the verifier in a real protocol).
fn fold(
    inst1: &RelaxedR1CS,
    inst2: &RelaxedR1CS,
    r: u128,
) -> (RelaxedR1CS, Vec<u128>) {
    // Cross term T. In real Nova: T = Az_1 o Bz_2 + Az_2 o Bz_1
    //                                 - u_1 * C z_2 - u_2 * C z_1.
    // Toy: just a placeholder of the right shape.
    let len = inst1.e.len();
    let cross_term: Vec<u128> = (0..len)
        .map(|i| {
            let t1 = mul(inst1.u, inst2.z.get(i).copied().unwrap_or(0));
            let t2 = mul(inst2.u, inst1.z.get(i).copied().unwrap_or(0));
            add(t1, t2)
        })
        .collect();

    // Folded slack scalar: u = u_1 + r * u_2
    let u = add(inst1.u, mul(r, inst2.u));
    // Folded witness: z = z_1 + r * z_2
    let z = vec_add(&inst1.z, &vec_scale(&inst2.z, r));
    // Folded error: e = e_1 + r * T + r^2 * e_2
    let r2 = mul(r, r);
    let e = vec_add(
        &vec_add(&inst1.e, &vec_scale(&cross_term, r)),
        &vec_scale(&inst2.e, r2),
    );

    let folded = RelaxedR1CS {
        u,
        com_z: commit(&z),
        com_e: commit(&e),
        z,
        e,
    };
    (folded, cross_term)
}

fn fresh_instance(z: Vec<u128>) -> RelaxedR1CS {
    let e = vec![0u128; z.len()];
    let com_z = commit(&z);
    let com_e = commit(&e);
    RelaxedR1CS { u: 1, z, e, com_z, com_e }
}

fn main() {
    // Step 0: fresh instance with witness z_0.
    let acc = fresh_instance(vec![3, 5, 7, 11]);
    println!("step 0: u={}, |z|={}, com_z={:#x}", acc.u, acc.z.len(), acc.com_z);

    // Step 1: fold in a fresh R1CS instance from running F at step 1.
    let step1 = fresh_instance(vec![13, 17, 19, 23]);
    let r1: u128 = 0xDEADBEEF; // Fiat-Shamir challenge in real protocol
    let (acc, _t) = fold(&acc, &step1, r1);
    println!("step 1: u={}, com_z={:#x}, |e|={}", acc.u, acc.com_z, acc.e.len());

    // Step 2: fold in another step.
    let step2 = fresh_instance(vec![29, 31, 37, 41]);
    let r2: u128 = 0xFEEDFACE;
    let (acc, _t) = fold(&acc, &step2, r2);
    println!("step 2: u={}, com_z={:#x}", acc.u, acc.com_z);

    // After T steps, the final accumulator is one relaxed R1CS instance.
    // The protocol proves it's satisfying via a single SNARK at the end.
    println!("\\nfinal accumulator captures all 3 steps in one instance.");
    println!("a SNARK proves this instance is satisfying — 1 proof for any T.");
}
`}/>

The shape is the thing. The fold is just three linear combinations: $u' = u_1 + r u_2$, $\mathbf{z}' = \mathbf{z}_1 + r \mathbf{z}_2$, $\mathbf{e}' = \mathbf{e}_1 + r \mathbf{T} + r^2 \mathbf{e}_2$. The cross-term $\mathbf{T}$ is what the prover sends; the challenge $r$ is Fiat-Shamir over the transcript. In real Nova the witness $\mathbf{z}$ is replaced by a Pedersen commitment to it (so the verifier never sees the witness), and the error vector $\mathbf{e}$ is replaced by a commitment as well. The linear structure of the fold is preserved by the additive-homomorphic property of the Pedersen commitment, which is the entire reason Pedersen is the right primitive here.

## Where this lands for ZERA

The honest answer about recursion in [zera-sdk](/blog/zera_sdk_scaffolding/) v1: we don't use it. A privacy transfer is one Groth16 proof per spend, and there's nothing to recurse over. The advantage of recursion shows up when:

- You're settling many transfers in a batch (rollup shape) and want to compress them into one proof.
- You're running a zkVM (Lurk, Jolt, RISC0) where the program has many uniform steps.
- You're building a light client that has to verify a long chain of proofs cheaply.

For ZERA's transfer flow, none of these apply. For the eventual settlement layer that sits *underneath* a chain of ZERA transfers (think: a state-root proof every epoch), Nova-style folding is the right shape, and the design seam is in `crates/zera-sdk-core/src/recursion.rs`. Empty file today. We've left the door open.
The reason I wrote this post anyway is that recursion is the part of the ZK stack that's most actively moving in 2026. HyperNova landed at CRYPTO 2024 with a CCS-based unification of R1CS / AIR / Plonkish that was supposed to take five more years. The next two years are going to compress IVC primitives down to "one folding scheme, three commitment choices, pick your poison." Anyone deploying a ZK system today should know what shape that compressed primitive will be, because the migration cost will be the difference between a clean refactor and a rewrite.

## Further reading

- [Halo: Recursive Proof Composition without a Trusted Setup](https://eprint.iacr.org/2019/1021) — Bowe, Grigg, Hopwood (2019) — the accumulation-scheme paper.
- [Nova: Recursive Zero-Knowledge Arguments from Folding Schemes](https://eprint.iacr.org/2021/370) — Kothapalli, Setty, Tzialla (CRYPTO 2022) — the folding-scheme paper.
- [SuperNova: Proving universal machine executions without universal circuits](https://eprint.iacr.org/2022/1758) — Kothapalli, Setty (2022) — the per-instruction folding extension.
- [HyperNova: Recursive Arguments for Customizable Constraint Systems](https://eprint.iacr.org/2023/573) — Kothapalli, Setty (CRYPTO 2024) — CCS-based unification.
- [microsoft/Nova](https://github.com/microsoft/Nova) — the canonical Rust implementation.
- [Halo2 book](https://zcash.github.io/halo2/) — the production deployment behind Zcash NU5.
- [Poseidon, by hand and by code](/blog/poseidon_by_hand_and_by_code/) — the hash function inside the recursion circuit.
- [Why BN254, and when to switch off it](/blog/why_bn254_and_when_to_switch/) — the curve choice underneath the SNARK that closes the recursion.

---

# PPST: extending SPST to arbitrary private computation

Canonical: https://blog.skill-issue.dev/blog/ppst_private_programmable_state/
Description: F_RP Construction II.
Generalises SPST to private programmable state: arbitrary arithmetic circuits over committed pre/post-state, with R1CS-embedded program execution and atomic PPST-SPST composition.
Published: 2026-05-02T15:00:00.000Z
Tags: zk, cryptography, circuits, r1cs, aleo, aztec, phd

import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx";

[SPST](/blog/spst_self_paying_shielded_transactions/) gave us private value transfer with self-paying fees on a smart-contract chain. That's the Solana analogue of Zcash's Sapling — and exactly what every existing relayer-dependent privacy mixer (Tornado, RAILGUN, Light v1) does, just without the relayer.

But Tornado-style protocols are not the goal. The goal is **Turing-complete** private computation: a Solana program that runs on encrypted state and produces a proof of correct execution without leaking what the state was, what the inputs were, or what the program output. That's PPST.

This is post 4 of 11 in the [relayerless-privacy](/series/relayerless-privacy) series. Reading [post 3](/blog/spst_self_paying_shielded_transactions/) first will help, but the construction here stands alone.

## What "private program" means here

**Definition (Private Program).** A *private program* is an arithmetic circuit

$$
C : \mathbb{F}_p^{n_{\mathsf{in}}} \to \mathbb{F}_p^{n_{\mathsf{out}}}
$$

over the BN254 scalar field, specified by an R1CS constraint system $(A, B, C)$ of size $N_C$. Each program is identified by

$$
\mathsf{program\_id} \;=\; \mathsf{Poseidon}(\mathsf{vk}_C),
$$

where $\mathsf{vk}_C$ is the Groth16 verification key. The program identifier is a **public, deterministic commitment to the program's logic**.

**Definition (Private State).** A vector $\mathsf{state} \in \mathbb{F}_p^k$ committed as

$$
\mathsf{cm}_{\mathsf{state}} \;=\; \mathsf{Poseidon}(\mathsf{state}[0], \ldots, \mathsf{state}[k-1], r_{\mathsf{state}}).
$$

State commitments are leaves in a state Merkle tree $\mathcal{T}_S$ of depth 32, root $\mathsf{rt}_S$.
This is a separate tree from the SPST note-commitment tree.

**Definition (State Transition).** A triple $(\mathsf{state}_{\mathsf{pre}}, \mathsf{aux}, \mathsf{state}_{\mathsf{post}})$ where $\mathsf{aux}$ is private auxiliary input and

$$
C(\mathsf{state}_{\mathsf{pre}}, \mathsf{aux}) \;=\; \mathsf{state}_{\mathsf{post}}.
$$

The transition consumes $\mathsf{cm}_{\mathsf{pre}}$ via nullification and produces $\mathsf{cm}_{\mathsf{post}}$ as a new tree leaf. **The program logic $C$ is never revealed to the verifier — only $\mathsf{program\_id}$ is.**

## The PPST relation

The relation $\mathcal{R}_{\mathsf{PPST}}$ is the set of $(x, w)$ pairs:

**Public instance** $x = \bigl(\mathsf{rt}_{\mathsf{pre}}, \mathsf{rt}_{\mathsf{post}}, \mathsf{nf}_{\mathsf{state}}, \mathsf{cm}_{\mathsf{post}}, \mathsf{program\_id}, f\bigr)$.

**Private witness** $w = \bigl(\mathsf{state}_{\mathsf{pre}}, r_{\mathsf{pre}}, \mathsf{path}_{\mathsf{pre}}, sk_{\mathsf{state}}, \mathsf{aux}, \mathsf{state}_{\mathsf{post}}, r_{\mathsf{post}}, \mathsf{vk}_C\bigr)$.
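As a concrete shape, the instance/witness split can be written as plain Rust structs. This is an illustrative sketch only: the type names and field layout are mine, not the zera-sdk API, and field elements are reduced to opaque 32-byte arrays.

```rust
// Illustrative only: these names mirror the math above, not any real API.
// A BN254 field element is modeled as an opaque 32-byte array.
type Fp = [u8; 32];

/// Public instance x: everything the on-chain verifier sees.
struct PpstInstance {
    rt_pre: Fp,     // state-tree root before the transition
    rt_post: Fp,    // state-tree root after the transition
    nf_state: Fp,   // nullifier consuming cm_pre
    cm_post: Fp,    // commitment to the new state
    program_id: Fp, // Poseidon(vk_C), public by design
    fee: u64,       // f
}

/// Private witness w: never leaves the prover.
struct PpstWitness {
    state_pre: Vec<Fp>,  // pre-state vector
    r_pre: Fp,           // commitment randomness for cm_pre
    path_pre: Vec<Fp>,   // Merkle authentication path, depth 32
    sk_state: Fp,        // state-authorization key
    aux: Vec<Fp>,        // private auxiliary input to C
    state_post: Vec<Fp>, // post-state vector
    r_post: Fp,          // commitment randomness for cm_post
    vk_c: Vec<u8>,       // Groth16 verification key identifying C
}

fn main() {
    let x = PpstInstance {
        rt_pre: [0; 32], rt_post: [0; 32], nf_state: [0; 32],
        cm_post: [0; 32], program_id: [0; 32], fee: 5_000,
    };
    let w = PpstWitness {
        state_pre: vec![[0; 32]; 4], r_pre: [0; 32],
        path_pre: vec![[0; 32]; 32], sk_state: [0; 32],
        aux: vec![], state_post: vec![[0; 32]; 4],
        r_post: [0; 32], vk_c: vec![],
    };
    // A depth-32 tree means the witness carries 32 sibling hashes.
    assert_eq!(w.path_pre.len(), 32);
    println!("instance fee = {}, witness path nodes = {}", x.fee, w.path_pre.len());
}
```

The constraints below are exactly the equations the circuit enforces between fields of `x` and fields of `w`.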
Nine constraints, all enforced by the outer PPST circuit:

| # | Name | Constraint |
|---|------|-----------|
| P1 | Program identification | $\mathsf{program\_id} = \mathsf{Poseidon}(\mathsf{vk}_C)$ |
| P2 | Pre-state commitment | $\mathsf{cm}_{\mathsf{pre}} = \mathsf{Poseidon}(\mathsf{state}_{\mathsf{pre}}, r_{\mathsf{pre}})$ |
| P3 | Pre-state membership | $\mathsf{MerkleVerify}(\mathsf{rt}_{\mathsf{pre}}, \mathsf{cm}_{\mathsf{pre}}, \mathsf{path}_{\mathsf{pre}}) = 1$ |
| P4 | State nullification | $\mathsf{nf}_{\mathsf{state}} = \mathsf{PRF}_{sk_{\mathsf{state}}}(\mathsf{cm}_{\mathsf{pre}})$ |
| **P5** | **Program execution** | $C(\mathsf{state}_{\mathsf{pre}}, \mathsf{aux}) = \mathsf{state}_{\mathsf{post}}$ |
| P6 | Post-state commitment | $\mathsf{cm}_{\mathsf{post}} = \mathsf{Poseidon}(\mathsf{state}_{\mathsf{post}}, r_{\mathsf{post}})$ |
| P7 | Post-state tree update | $\mathsf{rt}_{\mathsf{post}} = \mathsf{MerkleInsert}(\mathsf{rt}_{\mathsf{pre}}, \mathsf{cm}_{\mathsf{post}})$ |
| P8 | Fee extraction | value-bearing state OR companion SPST |
| P9 | State authorization | $\mathsf{pk}_{\mathsf{state}} = \mathsf{PRF}_{sk_{\mathsf{state}}}(0)$ embedded in pre-state |

P5 is the heart of the construction. The user-defined program $C$ — written in Circom, Noir, Leo, or any high-level circuit DSL — is **embedded as a sub-circuit** inside the outer PPST relation. The R1CS for $C$ becomes constraints inside the R1CS for $\mathcal{R}_{\mathsf{PPST}}$.

## How the program embedding works

<Mermaid chart={`flowchart TD
  A[private fn swap{a,b}<br/>in Noir/Leo/Circom] --> B[Compiler]
  B --> C[R1CS for C<br/>~10K-1M constraints]
  C --> D[Merge into PPST circuit<br/>R1CS_PPST = R1CS_overhead + R1CS_C]
  D --> E[Groth16.Setup<br/>per-program ceremony]
  E --> F[vk_C → on-chain PDA<br/>at addr_C = PDA{H{vk_C}}]
  classDef step stroke:#4ade80,stroke-width:2px,fill:#0a0a0a,color:#fff
  class A,B,C,D,E,F step
`}/>

The outer circuit sizes look like:

$$
N_{\mathsf{PPST}} \;\approx\; N_{\mathsf{overhead}} \;+\; N_C
$$

where $N_{\mathsf{overhead}} \approx 25{,}000$ R1CS constraints (Merkle paths, commitment hashes, PRF evaluations — see SPST §3.1.6) and $N_C$ is the program circuit size.

| Program complexity | $N_C$ | Total PPST | Groth16 prove (M2) |
|--------------------|-------|------------|---------------------|
| **Simple** (token transfer, vote, ACL) | 10³ — 10⁴ | 35,000 — 50,000 | 1 — 3 s |
| **Moderate** (private AMM swap, auction bid, credential) | 10⁵ — 10⁶ | 125,000 — 10⁶ | 5 — 60 s |
| **Complex** (private ML inference, DB queries) | > 10⁷ | impractical for direct Groth16 | minutes — hours |

Complex programs need IVC. PPST extends naturally: decompose the computation into $T$ uniform steps each running $C_{\mathsf{step}}$, fold them with Nova or SuperNova, then wrap the final accumulator in a Groth16 decider proof. **The on-chain verifier always sees a constant-size 128-byte proof regardless of $T$.** Off-chain proving is $O(T \cdot |C_{\mathsf{step}}|)$ but the chain doesn't care.

## Theorem 3.6 — PPST soundness

**Statement.** If Groth16 is knowledge-sound and Poseidon is collision-resistant, no PPT adversary can cause the PPST verifier to accept a transaction corresponding to an invalid state transition (one where $C(\mathsf{state}_{\mathsf{pre}}, \mathsf{aux}) \neq \mathsf{state}_{\mathsf{post}}$) except with negligible probability.

**Proof sketch.** Suppose $\mathcal{A}$ produces a valid PPST transaction whose underlying transition is invalid. By Groth16 knowledge soundness, the extractor $\mathcal{E}$ recovers a witness $w^*$ satisfying constraints P1–P9 — including P5: $C(\mathsf{state}^*_{\mathsf{pre}}, \mathsf{aux}^*) = \mathsf{state}^*_{\mathsf{post}}$. Direct contradiction.
∎

**Corollary (State Integrity).** The state tree $\mathcal{T}_S$ maintains the invariant that every leaf is a commitment to a state that resulted from a valid execution of an authorized program starting from a previously valid state. By induction on accepted transactions, this invariant holds at all times.

## Theorem 3.7 — PPST zero-knowledge

**Statement.** PPST reveals nothing about $\mathsf{state}_{\mathsf{pre}}$, $\mathsf{state}_{\mathsf{post}}$, $\mathsf{aux}$, or the internal logic of $C$ beyond the public outputs.

**Proof sketch.** Direct from perfect ZK of Groth16. The simulator $\mathcal{S}$ depends only on the public instance $x$, not on the witness. For any two valid witnesses $w_0, w_1$ consistent with the same $x$, the proof distributions are identical.

What does leak:

- **`program_id`** is intentionally public. It identifies which program executed so the verifier can pick the right verification key. *Full function privacy* (hiding the program identity) requires a universal circuit or a commitment-to-vk argument and is left as a future extension.
- The fact that *some* state transition occurred under that program.
- The fee $f$.

What does not leak:

- The specific state values.
- The auxiliary inputs.
- Which specific leaf in $\mathcal{T}_S$ was consumed.

## Theorem 3.8 — PPST-SPST composability

This is the magic. PPST and SPST compose into a single atomic transaction: **execute a private program AND transfer shielded value, in one ZK proof.**

Construct the composite relation $\mathcal{R}_{\mathsf{PPST+SPST}} = \mathcal{R}_{\mathsf{PPST}} \wedge \mathcal{R}_{\mathsf{SPST}}$ with a *linking constraint*:

$$
\mathsf{link} \;=\; \mathsf{Poseidon}(\mathsf{nf}_{\mathsf{state}}, \mathsf{nf}_1, \ldots, \mathsf{nf}_n)
$$

binding the PPST state nullifier to the SPST input nullifiers. Both sub-proofs reference the same `link` value as a public input.
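The shape of that linking value can be sketched in a few lines of Rust. A throwaway FNV-style mixer stands in for Poseidon here, so the only thing this demonstrates is the structure: one public value that both sub-proofs must recompute identically, and that changes if any nullifier is substituted. The function names are mine, purely illustrative.

```rust
// Toy stand-in for Poseidon: an FNV-1a-style mixer over u64 "field elements".
// Demonstrates the binding shape only; no cryptographic properties intended.
fn toy_hash(inputs: &[u64]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325; // FNV-1a offset basis
    for x in inputs {
        h = (h ^ x).wrapping_mul(0x100000001b3); // FNV-1a prime
    }
    h
}

/// link = H(nf_state, nf_1, ..., nf_n): the PPST state nullifier
/// concatenated with every SPST input nullifier, hashed once.
fn link(nf_state: u64, spst_nullifiers: &[u64]) -> u64 {
    let mut inputs = vec![nf_state];
    inputs.extend_from_slice(spst_nullifiers);
    toy_hash(&inputs)
}

fn main() {
    let l1 = link(42, &[7, 11, 13]);
    let l2 = link(42, &[7, 11, 13]);
    let l3 = link(42, &[7, 11, 14]); // one nullifier swapped
    assert_eq!(l1, l2); // deterministic: both sub-proofs recompute it
    assert_ne!(l1, l3); // any nullifier substitution breaks the link
    println!("link = {:#x}", l1);
}
```

Because `link` is a public input to both sub-proofs, a verifier that checks the two proofs against the same value has checked atomicity for free.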
Cross-constraint (Value Mediation): if the program outputs a transfer amount $\Delta v$, the SPST component enforces

$$
\sum_i v^{(\mathsf{SPST})}_{\mathsf{in},i} \;=\; \sum_j v^{(\mathsf{SPST})}_{\mathsf{out},j} \;+\; f \;+\; \Delta v_{\mathsf{to\_program}}.
$$

That is — value flowing from the SPST shielded pool *into* the program's state (or out of it) is reconciled inside the proof. An observer cannot tell whether the program consumed value, produced value, or merely transferred it.

**Practical realisation.** For a moderate program ($N_C \sim 50{,}000$):

$$
N_{\mathsf{comp}} = N_{\mathsf{PPST}} + N_{\mathsf{SPST}} + N_{\mathsf{link}} \;\approx\; (50{,}000 + 25{,}000) + 24{,}000 + 400 \;\approx\; 100{,}000 \text{ constraints}.
$$

Groth16 prover time on commodity hardware: 5–10 seconds. **Single 128-byte proof on-chain.** ~200,000 CU verification cost on Solana.

## Comparison with Aleo and Aztec

The thing PPST gets that Aleo and Aztec don't is **deployment as a protocol layer on a high-performance Layer-1**. Aleo and Aztec each require running their own consensus or sequencer. PPST runs as a Solana program on the same validators as Jupiter and Helium — inheriting Solana's TPS, finality, and infrastructure.

## What's left

PPST plus SPST gives us private value + private computation. That's two of the three privacy properties. The remaining gap is **submitter anonymity**: even with a perfect ZK proof, the wrapping Solana transaction is signed by an Ed25519 key whose public key is on-chain. Address graph analysis trivially links the "private" transaction to the submitter's identity.

The next post is about closing that gap — without a relayer, without a mixing service, and without a separate L1.

## Bibliography

- Bowe, S., Chiesa, A., Green, M., Miers, I., Mishra, P., Wu, H. (2020). *ZEXE: Enabling Decentralized Private Computation.* IEEE S&P 2020.
- Chiesa, A., Hu, Y., Maller, M., Mishra, P., Vesely, N., Ward, N. (2020). *Marlin: Preprocessing zkSNARKs with Universal and Updatable SRS.* EUROCRYPT 2020. https://eprint.iacr.org/2019/1047
- Aztec Network. *Client-side Proof Generation.* https://aztec.network/blog/client-side-proof-generation
- Kothapalli, A., Setty, S., Tzialla, I. (2022). *Nova: Recursive Zero-Knowledge Arguments from Folding Schemes.* https://eprint.iacr.org/2021/370
- Noir Language Documentation. https://noir-lang.org/docs/

Previous: [SPST: self-paying shielded transactions ←](/blog/spst_self_paying_shielded_transactions/) · Next: [TAB: threshold-anonymous broadcast →](/blog/tab_threshold_anonymous_broadcast/)

---

# Halo2 in 2026: what changed since the Zcash era

Canonical: https://blog.skill-issue.dev/blog/halo2_in_2026_what_changed/
Description: A survey of the Halo2 ecosystem six years after the Zcash team published it — what stayed the same (PLONKish, lookups, IPA), what evolved (KZG, gadget libraries, fork landscape), and what we ship today.
Published: 2026-05-01T15:30:00.000Z
Tags: halo2, plonk, zcash, kzg, lookups, zk, phd

import { Mermaid, RustPlayground, TradeoffTable, Aside, Quote } from "@/components/mdx";

When Zcash open-sourced [Halo2](https://github.com/zcash/halo2) in 2020, it was a research artefact attached to a single deployment target — Zcash's Orchard pool — and a single polynomial-commitment choice — IPA over the Pasta cycle of curves. Six years later it is a small ecosystem of forks, used by Scroll, Taiko, Axiom, and roughly half the EVM rollups under construction in 2026. The original repository has been in maintenance mode since 2024.

This post is the orientation I wish someone had handed me when I started auditing Halo2 circuits seriously.

*What stayed the same?* The arithmetisation. The lookup argument. The mental model of *chips inside regions inside columns*.

*What evolved?* The polynomial commitment scheme, the curve choices, the gadget library, and most of all the fork landscape.
By the end you should have a defensible answer to "which Halo2 do you mean?" — which is the question every serious ZK conversation in 2026 reduces to within five minutes.

## What stayed: the PLONKish arithmetisation

The shape of a Halo2 circuit hasn't changed since 2020. You define **columns** — advice (witness), fixed (constants), and instance (public inputs) — and a **rectangular grid** of cells indexed by row and column. The prover assigns values to advice cells; the constraint system asserts polynomial relations on those values, evaluated at every row.

Three families of constraints make up a Halo2 circuit:

1. **Custom gates.** A polynomial identity that must hold on every row, possibly gated by a selector column. `q_mul · (a · b - c) = 0` is the canonical example: when `q_mul = 1`, the constraint forces `a · b = c`; when `q_mul = 0`, the constraint vanishes.
2. **Permutation arguments.** Cells that should be equal across rows or columns are wired into a permutation. This is what gives you "this output of gate A is the input to gate B" without paying the cost of an extra constraint per copy.
3. **Lookup arguments.** A cell must be in some pre-declared table. This is what makes range checks ($x < 2^{16}$) cost ~1 row per check instead of 16, and what makes XOR / S-box / SHA tables tractable inside a SNARK.

The novelty in 2020 wasn't any single one of those — PLONK had given us custom gates and permutations, plookup had given us lookups — but the *combination*, the *user-facing API* (chips, regions, layouters), and the *recursion-friendly proof system* underneath it.

The arithmetisation is so durable that every Halo2 fork in 2026 still uses the same `Circuit` trait, the same `Layouter`, the same `Region`, the same `Selector`. If you wrote a Halo2 chip in 2021, it compiles in 2026 against PSE-Halo2 with one or two trait-bound tweaks. That's an extraordinary track record for a 6-year-old framework.
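The `q_mul · (a · b - c) = 0` identity is worth evaluating by hand once. A minimal sketch over a toy prime field, assuming nothing from halo2 itself (the modulus and function names are stand-ins):

```rust
// Toy evaluation of the multiplier gate over a small prime field.
// This is the polynomial identity itself, not halo2 API code.
const P: u64 = 65_537; // toy prime, stand-in for a real SNARK field

/// q * (a * b - c) over F_P. Must evaluate to 0 on every row.
fn mul_gate(q: u64, a: u64, b: u64, c: u64) -> u64 {
    let ab = (a * b) % P;
    let diff = (ab + P - c % P) % P; // a*b - c, kept non-negative mod P
    (q * diff) % P
}

fn main() {
    // Selector on: the constraint is live, so a*b must equal c.
    assert_eq!(mul_gate(1, 6, 7, 42), 0); // 6*7 = 42: satisfied row
    assert_ne!(mul_gate(1, 6, 7, 41), 0); // violated row: nonzero residue
    // Selector off: the gate vanishes whatever the cells hold.
    assert_eq!(mul_gate(0, 6, 7, 999), 0);
    println!("gate identity holds exactly where q_mul = 1");
}
```

The selector multiplication is the entire on/off mechanism: rows where `q_mul = 0` contribute nothing, which is what lets many chips share the same columns.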
## What evolved: from IPA to KZG

The original Zcash Halo2 used **IPA** (inner-product argument) over the Pasta cycle of curves (Pallas + Vesta). That choice was deliberate: IPA needs *no trusted setup*, and the Pasta cycle let Zcash do recursion without pairings. Beautiful in theory; expensive in practice. IPA proofs are logarithmic in circuit size — kilobytes in practice — and verification is linear-time, dominated by a large multi-scalar multiplication (which is exactly the cost Halo's accumulation trick amortises away).

The dominant 2026 fork — Privacy Scaling Explorations' [`privacy-scaling-explorations/halo2`](https://github.com/privacy-scaling-explorations/halo2) — replaced IPA with **KZG** over BN254. The trade-off:

- **You give up:** trustless setup. KZG needs a Powers-of-Tau ceremony.
- **You get:** constant-size proofs (~600 bytes), pairing-based verification that's an order of magnitude cheaper, and Solidity verifier compatibility — which is the single feature that turned Halo2 from "Zcash internal tool" into "the EVM rollup substrate".

This is the trade-off every serious ZK design makes once. (The same trade-off shows up in [Plonky3, the small-fast-cheap revolution](/blog/plonky3_small_fast_cheap/) on a different axis.) Trusted setup is back on the table in 2026 because the Ethereum KZG ceremony — 140,000+ participants — is *good enough* for the threat model most rollups operate under. See [On the death of the trusted setup](/blog/on_the_death_of_the_trusted_setup/) for the argument.
<Mermaid chart={`flowchart TD
  Z[zcash/halo2] --> I[IPA + Pasta curves]
  Z --> P[PLONKish + lookups + chips]
  P --> P1[Custom gates]
  P --> P2[Permutations]
  P --> P3[Lookups]
  Z --> F[Forks 2022-2026]
  F --> PSE[PSE Halo2 - KZG over BN254]
  F --> AX[Axiom fork]
  F --> SCROLL[Scroll fork]
  F --> ZK[zkEVM forks: Taiko, Linea]
  PSE --> SOL[Solidity verifier compat]
  PSE --> M[Maintenance mode Jan 2025]
  AX --> ACTIVE[Active 2026]
  classDef ship fill:#0a4014,stroke:#4ade80,color:#fff
  classDef warn fill:#3a2a0a,stroke:#facc15,color:#fff
  class SOL ship
  class ACTIVE ship
  class M warn
`}/>

## The fork landscape in 2026

| Fork | Backend | Status | What it's for |
|---|---|---|---|
| `zcash/halo2` | IPA, Pasta | Maintenance / archival | The reference, where the model originated |
| `privacy-scaling-explorations/halo2` | KZG, BN254 | Maintenance since Jan 2025 | The EVM-compatible workhorse |
| `axiom-crypto/halo2-axiom` | KZG, BN254 | Active | The PSE successor for new features |
| Scroll's `halo2` | KZG, BN254 | Active inside Scroll | zkEVM-tuned, custom gates for EVM ops |
| `halo2-base` (in `axiom-crypto/halo2-lib`) | KZG, BN254 | Active | Higher-level chip-authoring API on top of PSE / Axiom |

The headline event of 2025 was that PSE-Halo2 went into maintenance and the community migrated to Axiom's fork as the upstream for new feature work. Existing deployments did not move — the API surface is identical and PSE-Halo2 still receives security backports — but the energy is on `axiom-crypto/halo2-axiom` and on `halo2-base` for ergonomic chip authoring.

## What evolved: gadgets, lookups, and the Lagrange-form witness

Three quieter shifts since 2022 actually changed how circuits are written:

**Gadget libraries got serious.** The original Halo2 shipped with `halo2_gadgets::poseidon` and not much else.
By 2026 the [`halo2-base`](https://github.com/axiom-crypto/halo2-lib) and [`halo2-axiom`](https://github.com/axiom-crypto/halo2-axiom) crates ship range checks, ECC, Poseidon, Keccak, RSA, ECDSA, BN254 pairing, and a battery of lookup tables shared across circuits. The "I have to hand-roll a SHA chip" era is over for 90% of use cases.

**Lookups became table-shareable.** Halo2's original lookup design assumed each circuit declared its own tables. With circuits hitting 10 million rows, *table reuse* across sub-circuits became necessary. Both Axiom and PSE landed APIs for declaring a lookup table once and binding it across regions. The constraint-count savings on big circuits are 30–50%.

**Lagrange-form witness commitment.** The witness used to be committed in coefficient form, requiring an NTT before commitment. Modern forks commit in *Lagrange* form (point-value), saving an NTT per commitment. On large circuits this is a 15–20% prover-time win — the kind of thing that doesn't show up in marketing copy and matters enormously when you're proving a million constraints.

## A skeleton chip you can read in 30 seconds

Halo2 chips look intimidating. They are not. The shape is: declare the columns you need, declare the constraints in `configure`, and use them in `synthesize`. Below is a contrived multiplier chip — the smallest Halo2 chip that does anything — written against the kind of trait surface every fork shares.

{`// Sketch of a Halo2 multiplier chip — c = a * b per row.
// Will not compile standalone; depends on halo2_proofs traits.
// Treat as the structural shape, not a runnable program.

use std::marker::PhantomData;

// Stand-in types so the file is self-documenting.
struct Column<C>(PhantomData<C>);
struct Selector;
struct Cell<F>(PhantomData<F>);
trait Field {}

// === The chip's column layout ===
struct MulConfig {
    a: Column<()>,    // advice (witness)
    b: Column<()>,    // advice
    c: Column<()>,    // advice
    q_mul: Selector,  // selector — turns the gate on or off
}

struct MulChip<F: Field> {
    config: MulConfig,
    _marker: PhantomData<F>,
}

impl<F: Field> MulChip<F> {
    // configure() is called once at circuit-definition time. It declares
    // which columns are used and what custom gates fire on which selectors.
    pub fn configure(/* meta: ConstraintSystem */) -> MulConfig {
        // Pseudocode for the constraint system:
        //
        // meta.create_gate("multiplier", |meta| {
        //     let a = meta.query_advice(a, Rotation::cur());
        //     let b = meta.query_advice(b, Rotation::cur());
        //     let c = meta.query_advice(c, Rotation::cur());
        //     let q = meta.query_selector(q_mul);
        //     vec![ q * (a * b - c) ]
        // });
        //
        // The vec![] returned must evaluate to 0 on every row where q_mul = 1.
        //
        // That single line — q * (a * b - c) — is the ENTIRE arithmetisation
        // of the multiplier. Permutation arguments handle copy-equality;
        // lookup arguments handle range checks; everything else is layered
        // on top of this primitive.
        unimplemented!("see halo2_proofs::plonk::ConstraintSystem")
    }

    // synthesize() is called once per proof. It assigns concrete values
    // to advice cells.
    pub fn assign_mul(&self, /* layouter, */ a_val: F, b_val: F) -> Cell<F> {
        // Pseudocode:
        //
        // layouter.assign_region(|| "mul", |mut region| {
        //     self.config.q_mul.enable(&mut region, 0)?;
        //     region.assign_advice(|| "a", self.config.a, 0, || Ok(a_val))?;
        //     region.assign_advice(|| "b", self.config.b, 0, || Ok(b_val))?;
        //     let c_val = a_val * b_val;
        //     region.assign_advice(|| "c", self.config.c, 0, || Ok(c_val))
        // })
        unimplemented!("see halo2_proofs::circuit::Layouter")
    }
}

fn main() {
    // The shape above generalises to every chip in halo2-axiom and
    // halo2-base. Configure once; assign per proof; gate constraints with
    // selectors so chips can coexist on the same columns.
    println!("see axiom-crypto/halo2-lib for production examples");
}
`}

## The proving-time tradeoff in 2026

## When to actually pick Halo2 in 2026

The honest 2026 answer:

- **Pick Halo2 (Axiom fork) when** your target is the EVM, your circuit is dominated by lookups (range checks, table-driven hash functions, RLC-heavy state-transition circuits), and you want a battle-tested gadget library.
- **Don't pick Halo2 when** your target is a non-EVM L1 (Solana, Aptos) where Solidity verifiers don't help, when your circuit is small (under ~5,000 constraints — Groth16 is faster per shot), or when you need transparent setup (use Plonky3 / RISC0).

## What I'd build differently if I were Halo2 in 2027

Three things, in order of how much I'd actually use them:

1. **Native folding integration.** Halo2's original recursion path (IPA + Pasta cycle) was elegant but slow. A folding scheme — Nova, ProtoStar, HyperNova — bolted onto KZG-Halo2 would unlock zkVMs and batch proving without a rewrite. Several teams are working on this; nothing is in main yet.
2. **A real type system for chips.** `Cell` is structurally typed by row/column position. There's no compile-time guarantee that "this cell holds a u8" or "this cell holds a Boolean" without re-asserting it inside every chip. A phantom-type-driven cell typing would catch a class of audit findings before the auditor ever opens the file.
3. **A standardised lookup-table registry.** Range checks, byte tables, S-box tables — every fork ships its own. A shared `halo2-tables` crate, content-addressed and reusable, would prevent the "every circuit re-declares the same range-16 table" anti-pattern.

I expect (1) within a year and (2)/(3) never. Halo2 is in the *durable* phase of its life — the kind of framework you build *on*, not *into*.
## Further reading

- [The Halo2 Book](https://zcash.github.io/halo2/) — Zcash's canonical guide to the original framework
- [Halo: Recursive Proof Composition without a Trusted Setup](https://eprint.iacr.org/2019/1021) — Bowe, Grigg, Hopwood (2019, last revised Feb 2020) — the paper Halo2 names itself after
- [PLONK: Permutations over Lagrange-bases for Oecumenical Noninteractive arguments of Knowledge](https://eprint.iacr.org/2019/953) — Gabizon, Williamson, Ciobotaru (2019) — the underlying arithmetisation
- [`privacy-scaling-explorations/halo2`](https://github.com/privacy-scaling-explorations/halo2) — the KZG fork that drove EVM adoption
- [`axiom-crypto/halo2-lib`](https://github.com/axiom-crypto/halo2-lib) — the gadget library to build on in 2026
- [Circom, by example](/blog/circom_by_example/) — the R1CS sister-substrate; useful comparison for arithmetisation cost models
- [On the death of the trusted setup](/blog/on_the_death_of_the_trusted_setup/) — why KZG is fine even in a transparent-setup era

---

# From sailor to CEO in three acts

Canonical: https://blog.skill-issue.dev/blog/sailor_to_ceo_three_acts/
Description: A short memoir of a strange decade — Navy reactor compartments, a bitcoin mine, ConsenSys-USAA-PMG, and the arc that ended at Zera Labs. The interesting question is not how I got here. It is where everyone else is going.
Published: 2026-05-01T08:00:00.000Z
Tags: career, narrative, navy, foundry, consensys, zera, memoir

This blog has accumulated, at this point, several long-form posts covering individual chapters of how I got from a Navy reactor compartment to running Zera Labs. The [Navy origin post](/blog/nuclear_reactors_taught_me_to_ship/). The [Foundry post](/blog/what_running_a_bitcoin_mine_taught_me/). The [founding letter](/blog/why_i_started_zera_labs/). The [CEO-still-shipping post](/blog/being_ceo_and_still_shipping_code/).

This is the shorter post that exists because *people who don't read the long posts* still ask the question.
*How did the Navy guy end up building a ZK SDK?* The version that fits on LinkedIn. Three acts. One arc. I'll keep it under two thousand words.

## Act one: the watch

I came up as a Nuclear Electronics Technician in the US Navy. The Navy nuclear pipeline is — there is no nice way to say this — an unreasonable amount of school. You go through a screening that washes out most of the people who applied. Then you go through Nuclear Power School, which is twenty-six weeks of physics, reactor theory, thermo, fluids, and mathematics at a pace that is calibrated to break you exactly enough to find out whether you bend back. Then you go through prototype, where you actually run a real reactor, in a real plant, for thousands of hours of supervised watchstanding. Then you go to a hull, which is when the actual job starts.

Along the way you are taught — not as a soft skill, but as a hard skill — that the panel does not lie, the procedure is the contract, and the most dangerous person on the watchstation is the one who decides the indications are *probably* fine.

I got out with a stack of qualifications, a security clearance whose paperwork I am still slightly anxious about, and a bone-deep instinct for how safety-critical engineering is actually done. That instinct does not show up on a resume. You only see it when the system is on fire, and even then you only see it as the absence of panic.

If you want the long version: *[Nuclear reactors taught me to ship software](/blog/nuclear_reactors_taught_me_to_ship/).*

## Act two: the chips and the code

Out of the Navy, I took an unexpected detour through industrial Bitcoin mining at **Foundry Digital**. (`TODO: Dax confirm length of stint and exact role title — keeping the short version short here.`) ASICs in racks. Megawatts of power. Heat going out by every method physics allows. The unit economics live on a five-input spreadsheet, and the spreadsheet does not lie either.
The thing nobody tells you about working in mining operations is that *it is the closest thing the modern economy has to a reactor compartment*. The discipline transfers exactly. The watchstanding is the same. The brutal physical immediacy of a ten-thousand-amp electrical bus is a familiar object to a former reactor electronics tech.

So I did a chapter, learned what I needed to learn about the depreciation curve and the cost of an electron, and moved on. If you want the long version: *[What running a Bitcoin mine taught me about cloud margins](/blog/what_running_a_bitcoin_mine_taught_me/).*

After Foundry I went where most former-military, vaguely-technical thirty-somethings end up: into software. **PMG** first (`TODO: Dax confirm`). Then **USAA** (`TODO: Dax confirm`). Then **ConsenSys**, which is where the Web3 part of the story started — open-source work across the product surface, mentoring junior engineers, and starting to speak publicly at *Permissionless* and *EthGlobal* on developer experience and supply-chain risk.

The supply-chain risk thread is the one that became the [Rusty Pipes series](/blog/rusty_pipes/) on this blog — a research thread on Rust binaries injected into npm packages, which is the kind of attack that is funny on a slide and very much not funny in a customer's CI. The series is the longest-running thing I've written and the thing I am most consistently invited to talk about. It also lined up, in a way I did not plan, with the technical posture I'd later need at Zera Labs: *the moment your dependency surface is non-trivial, the registry becomes your threat model, not your library.*

In act two I learned to ship software the way the modern industry ships it. Continuous deployment. Cloud-native everything. PR review culture, sometimes good, sometimes bad. I learned what a senior IC actually does, then I learned what a staff engineer actually does, then I started to notice that the ceiling was not where the interesting problems were.
## Act three: the company

The third act starts with three things being true at the same time, in the same year. ZK got fast enough to be boring. Solana got cheap enough to make tokenisation a *naming* decision instead of a budgeting decision. AI agents stopped being demos and started being tools that needed to interact with money.

Sitting at the intersection of those three things, I incorporated **Zera Labs**. The technical surface — visible in [github.com/Dax911](https://github.com/Dax911) — is the [zera-sdk](https://github.com/Dax911/zera-sdk) (Solana-native ZK SDK with a Rust core), [zera-wallet-demo](https://github.com/Dax911/zera-wallet-demo) (Tauri 2 with Groth16 in WASM), [z_trade](https://github.com/Dax911/z_trade) (zeraswap — first compressed-token AMM on Solana), [zera_med_demo](https://github.com/Dax911/zera_med_demo) (a ZK-FHIR gateway because someone asked us to prove it works for things other than crypto bros), and a public Zera Design System we use across the product. Plus an MCP server for AI agents to call any of it.

We are small (`TODO: Dax confirm headcount when comfortable disclosing`), the work is technically dense, and the schedule is short. If you want the long version: *[Why I started Zera Labs](/blog/why_i_started_zera_labs/)*. If you want the inside-the-week version: *[Being CEO and still shipping code](/blog/being_ceo_and_still_shipping_code/)*.

## What the arc actually is

Looking at the three acts side by side, the arc is not "Navy guy gets into crypto." That is the LinkedIn-recruiter version. The actual arc is something narrower:

> Each chapter forced me to take seriously the gap between *the system is correct* and *I have correctly observed that the system is correct.*

In the Navy that gap is closed by watchstanding, two-person verification, and casualty drills. In mining it is closed by telemetry, redundant temperature sensors, and on-call.
In software at scale (PMG → USAA → ConsenSys) it is closed by tests, code review, and post-mortems. In ZK it is closed by a Groth16 proof — a piece of math that *is* the closure of the gap.

The whole story, condensed, is that I kept moving up the stack of "ways to know that the system is correct," and each step gave me a little more leverage than the last. The reactor watchstander's tools are slow, expensive, and limited to a single plant. The miner's tools are faster and parallel, but only over a single workload. The senior IC's tools are general-purpose but soft — they assume an honest reviewer. The cryptographic tools at the end of the arc are general-purpose, fast, *and* don't assume an honest reviewer. They are, in a real sense, what every prior chapter was reaching for.

If I had to pin the arc to one sentence I'd say: *I have spent fifteen years getting better at proving that systems are doing what they claim to be doing*, and the most productive place to do that work, today, is at a company whose entire surface is about producing those proofs.

## What I'd tell my younger selves

Three notes to three different versions of me — because I find this is the most useful summary for people whose careers are a few chapters earlier than mine.

To the kid in the reactor compartment: the discipline you are absorbing is the most valuable thing you are going to learn, and you will not realise this until you have been out for five years. Don't lose the watchstanding habits. Don't soften the procedure-in-hand instinct. Find a civilian career where they still apply.

To the operator at the mining site: pay attention to the unit economics. The five-input spreadsheet is a model that generalises beyond mining. Carry it with you. Whatever business you eventually find yourself in, you will be able to think about it more clearly than the people around you because you have seen what real unit economics actually look like.
To the senior IC at ConsenSys: the Rusty Pipes work is the start of a research thread, not the end of one. Don't drop it after the first post. The supply-chain question is going to be one of the defining infrastructure problems of the next decade, and you are early.

To present me: don't get cocky.

## And to the reader

That's how I got here. The interesting question, though, is not how I got here. It is where everyone else is going.

If you came up the same way — Navy, military, technical — and you are thinking about the civilian transition, my email is at the bottom of every page. Reach out. The civilian-tech industry will tell you a lot of things about your discipline, almost none of them flattering, almost none of them correct. The reactor instincts are a superpower in a software org that has lost them. Bring them with you.

If you are a senior IC who is wondering whether to keep going up the staff ladder or jump sideways into a founder seat, I hope the [CEO-still-shipping post](/blog/being_ceo_and_still_shipping_code/) is useful. The math is not as bad as the canon makes it sound.

If you are at the third act yourself, building cryptographic infrastructure or anything adjacent, I want to know. The ecosystem is small enough that we should know each other.

And if you are at the very beginning — looking at a screening package, or a Navy recruiter, or a dev bootcamp acceptance, or a first software job — pick the chapter that gives you the deepest *forcing function*. Pick the chapter that demands the most discipline up front. The discipline is the thing that compounds. The technology is the thing that changes.

That's the LinkedIn version. The longer versions are linked above. Thanks for reading.

---

# SPST: a self-paying shielded transaction model

Canonical: https://blog.skill-issue.dev/blog/spst_self_paying_shielded_transactions/

Description: First construction in F_RP.
The SPST relation, balance conservation under DLOG, double-spend resistance under collision-resistant PRF, unlinkability under DDH, simulation-extractable non-malleability.

Published: 2026-04-30T17:30:00.000Z

Tags: zk, cryptography, pedersen, groth16, zcash, solana, phd

import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx";

In the [previous post](/blog/the_fee_paradox/) I argued that on account-model chains the fee paradox is what forces relayer dependence. The cleanest resolution — Approach A — extracts the transaction fee from inside the ZK proof itself.

This post specifies that resolution. The construction is called **SPST** (Self-Paying Shielded Transactions). It is the foundation that PPST, TAB, and UPEE build on. It also stands alone as a complete protocol for private value transfer with self-paying fees — the Solana analogue to Zcash's Sapling spend description, but adapted to a smart-contract environment.

## The setting

Work over a prime-order elliptic curve group $\mathbb{G}$ of order $p$ with two independent generators $g, h \in \mathbb{G}$ for which no party knows $\log_g h$. Let $\mathbb{F}_p$ denote the scalar field. We use:

- **Poseidon** as the SNARK-friendly hash (width $t = 5$, HADES rounds $R_F = 8$ full + $R_P = 57$ partial, S-box $x^5$, ~324 R1CS constraints per 2-to-1 compression).
- **PRF** keyed by $sk \in \mathbb{F}_p$, instantiated as a domain-separated Poseidon evaluation.
- **Groth16** over BN254 as the on-chain verifier (alt_bn128 syscalls on Solana).
- **Indexed Merkle Trees** for nullifier non-membership (depth-32 over 254-bit values; ~10,948 R1CS constraints per non-membership proof, vs 82,296 for naive sparse Merkle).
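Those two non-membership figures are worth sanity-checking. The back-of-envelope accounting below, in TypeScript, reproduces both from the ~324-constraints-per-compression figure quoted above; the exact split (33 hashes plus a 256-bit decomposition for the indexed proof) is my own plausible breakdown, not necessarily the paper's.

```typescript
// Constraint accounting for nullifier non-membership proofs, using the
// post's ~324 R1CS constraints per Poseidon 2-to-1 compression. The
// breakdowns are one plausible accounting that reproduces the quoted
// totals; the real circuits may split costs differently.
const POSEIDON_2TO1 = 324;

// Naive sparse Merkle tree keyed by the 254-bit value: one hash per key bit.
const naiveSparse = 254 * POSEIDON_2TO1;

// Indexed Merkle tree: authenticate one "low leaf" (leaf hash plus a
// depth-32 path, i.e. 33 hashes) plus a 256-bit decomposition for the
// two range comparisons against the leaf's value and next-value.
const indexed = 33 * POSEIDON_2TO1 + 256;

console.log({ naiveSparse, indexed });          // { naiveSparse: 82296, indexed: 10948 }
console.log((naiveSparse / indexed).toFixed(1)); // ~7.5x fewer constraints
```

The ~7.5× gap is the entire argument for the indexed tree: non-membership cost scales with the tree's *depth*, not with the *bit-width* of the values it indexes.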
## Definitions

**Definition 3.1 (Shielded Note).** A *shielded note* is a tuple

$$
\mathsf{note} = (\mathsf{pk}, v, \rho, r)
$$

where $\mathsf{pk} = \mathsf{PRF}_{sk}(0)$ is the owner's public spending key, $v \in \{0, \ldots, 2^{64}-1\}$ is the note value, $\rho \in \mathbb{F}_p$ is unique per-note serial randomness, and $r \in \mathbb{F}_p$ is the commitment trapdoor.

**Definition 3.2 (Note Commitment).**

$$
\mathsf{cm} \;=\; \mathsf{Poseidon}(\mathsf{pk},\, v,\, \rho,\, r) \;\in\; \mathbb{F}_p.
$$

Note commitments are appended as leaves to a global Merkle tree $\mathcal{T}$ of depth $d = 32$ (capacity $2^{32}$ notes). The Merkle root at any epoch is $\mathsf{rt}$.

**Definition 3.3 (Nullifier).**

$$
\mathsf{nf} \;=\; \mathsf{PRF}_{sk}(\rho) \;=\; \mathsf{Poseidon}(sk, \rho).
$$

Upon spending, $\mathsf{nf}$ is published to a global nullifier set $\mathcal{N}$. Double-spending is prevented by rejecting any transaction whose $\mathsf{nf}$ already lives in $\mathcal{N}$.

**Definition 3.4 (SPST Transaction).** With $n$ inputs and $m$ outputs:

$$
\mathsf{tx} \;=\; \bigl(\, \{\mathsf{nf}_i\}_{i=1}^{n},\; \{\mathsf{cm}_j\}_{j=1}^{m},\; \mathsf{rt},\; f,\; \pi \,\bigr)
$$

where $f \in \{0, \ldots, 2^{64}-1\}$ is the public fee and $\pi$ is a Groth16 proof of the SPST relation. The validator accepts iff (i) $\pi$ verifies, (ii) $\mathsf{rt}$ is a recent root, (iii) every $\mathsf{nf}_i \notin \mathcal{N}$, and (iv) $f \geq f_{\min}$.

## The SPST relation

The relation $\mathcal{R}_{\mathsf{SPST}}$ is the set of $(x, w)$ pairs:

**Public instance** $x = \bigl(\{\mathsf{nf}_i\}, \{\mathsf{cm}_j\}, \mathsf{rt}, f\bigr)$.

**Private witness** $w = \bigl(\{(\mathsf{note}_i, \mathsf{path}_i, sk_i)\}, \{\mathsf{note}'_j\}\bigr)$.
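Before the constraint list, the plumbing of Definitions 3.1–3.4 can be sketched in a few lines of TypeScript. A cheap modular mixing function stands in for Poseidon (it is not collision resistant; illustration only), and the Merkle tree and the proof itself are elided.

```typescript
// Toy note lifecycle: derive pk, commit to a note, spend it once.
const P = 2n ** 61n - 1n; // toy field, stand-in for the ~2^254 scalar field
const H = (...xs: bigint[]) =>
  xs.reduce((acc, x) => (acc * 1_000_003n + x) % P, 7n); // NOT Poseidon
const prf = (sk: bigint, x: bigint) => H(sk, x); // domain-separated in the real scheme

const sk = 42n;
const pk = prf(sk, 0n);                          // Def 3.1: pk = PRF_sk(0)
const note = { pk, v: 500n, rho: 11n, r: 99n };
const cm = H(note.pk, note.v, note.rho, note.r); // Def 3.2: cm = Poseidon(pk, v, rho, r)

// Def 3.3: the nullifier is bound to (sk, rho), so the same note always
// produces the same nullifier, and the global set rejects a second spend.
const nullifierSet = new Set<bigint>();
const nf = prf(sk, note.rho);
const spend = (n: bigint): boolean => {
  if (nullifierSet.has(n)) return false; // double spend: rejected
  nullifierSet.add(n);
  return true;
};

console.log(spend(nf)); // true  - first spend accepted
console.log(spend(nf)); // false - same nullifier rejected
```

In the real protocol the validator never sees `sk`, `note`, or the link between `cm` and `nf`; the circuit below proves those relationships hold without revealing them.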
Eight constraints, all enforced by the circuit:

| # | Name | Constraint |
|---|------|-----------|
| C1 | Spending key validity | $\mathsf{pk}_i = \mathsf{PRF}_{sk_i}(0)$ |
| C2 | Nullifier correctness | $\mathsf{nf}_i = \mathsf{PRF}_{sk_i}(\rho_i)$ |
| C3 | Input commitment well-formedness | $\mathsf{cm}^{(\mathsf{in})}_i = \mathsf{Poseidon}(\mathsf{pk}_i, v_i, \rho_i, r_i)$ |
| C4 | Merkle membership | $\mathsf{MerkleVerify}(\mathsf{rt}, \mathsf{cm}^{(\mathsf{in})}_i, \mathsf{path}_i) = 1$ |
| C5 | Output commitment well-formedness | $\mathsf{cm}_j = \mathsf{Poseidon}(\mathsf{pk}'_j, v'_j, \rho'_j, r'_j)$ |
| **C6** | **Value conservation with fee** | $\sum_i v_i = \sum_j v'_j + f$ |
| C7 | Non-negative output values | $v'_j \in \{0, \ldots, 2^{64}-1\}$ (bit decomposition) |
| C8 | Non-negative fee | $f \in \{0, \ldots, 2^{64}-1\}$ |

C6 is the load-bearing constraint. It is what makes the transaction self-paying: the prover can only produce a valid proof if the input notes' values sum to exactly the output values plus the fee.

## The self-paying property (Theorem 3.1)

**Theorem.** Let $\mathsf{tx} = (\{\mathsf{nf}_i\}, \{\mathsf{cm}_j\}, \mathsf{rt}, f, \pi)$ be a valid SPST transaction. Then:

1. The fee $f$ is funded entirely from consumed shielded notes.
2. No external account, relayer, or gas sponsor is required.
3. Validators extract $f$ as inclusion compensation without learning the private inputs/outputs beyond $f$ itself and the validity of $\pi$.

**Proof sketch.** (1) follows directly from C6. (2) follows because $\mathsf{tx}$ is a self-contained data structure that any party can broadcast; the on-chain verifier decrements the privacy program's lamport reserve by $f$ and credits the validator. (3) is the perfect zero-knowledge property of Groth16: the validator sees $f$ as a public input but learns nothing about $v_i$ or $v'_j$. The full proof is in §3.1.3 of the paper.

The takeaway: **on Solana, the privacy program's PDA holds a reserve.
Each shield deposit increments it. Each SPST transaction's proof authorises the validator to take $f$ from it.** Replenishment is automatic.

## Theorem 3.2 — Balance / value conservation

**Statement.** No PPT adversary can produce a valid SPST transaction that creates value (i.e., one for which $\sum_j v'_j + f > \sum_i v_i$) except with negligible probability.

The proof gives two complementary arguments — one from the SNARK's knowledge soundness, one from an independent Pedersen commitment cross-check that provides defense in depth.

### Argument 1 (SNARK soundness)

C6 enforces $\sum_i v_i = \sum_j v'_j + f$ over $\mathbb{F}_p$. C7 and C8 enforce $v'_j, f \in [0, 2^{64})$. With at most $n \leq 2^{16}$ inputs each bounded by $2^{64}$, $\sum_i v_i < 2^{80} \ll p \approx 2^{254}$ — so field arithmetic faithfully represents integer arithmetic and no modular wraparound is possible. By Groth16 knowledge soundness in the AGM, an extractor $\mathcal{E}$ can recover the witness $w^*$ satisfying all constraints C1–C8. C6 in the extracted witness gives $\sum_i v_i = \sum_j v'_j + f$ as an integer equation. Contradiction with the assumed inflation.

### Argument 2 (Pedersen cross-check)

As defense in depth, attach Pedersen value commitments to each note. With $C_{\mathsf{in},i} = v_i \cdot g + r^{(\mathsf{vc})}_i \cdot h$ and $C_{\mathsf{out},j} = v'_j \cdot g + r^{(\mathsf{vc})}_j \cdot h$, the verifier checks

$$
\sum_i C_{\mathsf{in},i} \;=\; \sum_j C_{\mathsf{out},j} \;+\; f \cdot g \;+\; r_\Delta \cdot h
$$

where $r_\Delta = \sum_i r^{(\mathsf{vc})}_i - \sum_j r^{(\mathsf{vc})}_j$. Suppose an adversary passes this check but with $\sum_j v'_j + f \neq \sum_i v_i$. Let $\delta = \sum_i v_i - \sum_j v'_j - f \neq 0$. Then $\delta \cdot g = r'_\Delta \cdot h$ for some $r'_\Delta$, which yields $\log_g h = \delta / r'_\Delta$ — contradicting DLOG.
∎

## Theorem 3.3 — Double-spend resistance

**Game.** $\mathcal{A}$ may adaptively deposit and spend; wins if it produces two accepted transactions consuming the same note $\mathsf{note}^*$.

**Cases.**

- **Case 1:** The two transactions publish the same nullifier. Rejected by the protocol's nullifier-set check.
- **Case 2:** They publish different nullifiers $\mathsf{nf} \neq \mathsf{nf}'$ but consume the same note. By C4 both proofs authenticate the same commitment $\mathsf{cm}^*$. By C1 we have $\mathsf{pk}^* = \mathsf{PRF}_{sk^*}(0) = \mathsf{PRF}_{sk'}(0)$.
  - If $sk^* = sk'$, then $\mathsf{nf} = \mathsf{nf}'$. Contradiction.
  - If $sk^* \neq sk'$, then $\mathsf{Poseidon}(sk^*, 0) = \mathsf{Poseidon}(sk', 0)$ — a collision in Poseidon. Reduces to collision resistance.

Both cases reach a contradiction. ∎

## Theorem 3.4 — Transaction unlinkability

**Statement.** Under perfect zero-knowledge of Groth16 and computational hiding of Pedersen commitments under DDH, the SPST scheme satisfies transaction unlinkability: no PPT adversary can determine which input notes fund which output notes with non-negligible advantage.

**Proof structure.** Hybrid argument:

- **Hybrid 0**: real game.
- **Hybrid 1**: replace all Groth16 proofs with simulated proofs. By perfect ZK of Groth16, indistinguishable.
- **Hybrid 2**: in the simulated view, the multisets of nullifiers, commitments, roots, fees are identical for both branchings of the challenge. The fee is identical by construction. Each $\mathsf{cm}_j = \mathsf{Poseidon}(\mathsf{pk}'_j, v'_j, \rho'_j, r'_j)$ with fresh random $r'_j$ is computationally indistinguishable from a uniform field element. Each nullifier $\mathsf{nf}_i = \mathsf{PRF}_{sk_i}(\rho_i)$ with unique $\rho_i$ is pseudorandom. The Pedersen value commitments are computationally hiding under DDH.

Result: $\mathsf{Adv}_{\mathcal{A}} \leq \mathsf{negl}(\lambda)$.
∎

## Theorem 3.5 — Non-malleability

**Statement.** No PPT adversary can take a valid SPST transaction and produce a *distinct* valid transaction with altered public inputs (e.g., a different fee), except with negligible probability.

**Proof.** Relies on the **simulation-extractability** of Groth16 in the Random Oracle Model — the Bowe-Gabizon construction (2019), refined by Baghery-Pindado-Ràfols (2023). An adversary mauling the proof to alter $f$ would need to extract a witness with a *different* $f'$ satisfying C6, but C6 plus the unchanged input commitments and output commitments uniquely determines $f$. Contradiction. ∎

## Circuit complexity

For an SPST circuit with $n$ inputs and $m$ outputs:

$$
C_{\mathsf{total}} \;\approx\; 11{,}500 \cdot n \;+\; 452 \cdot m \;+\; 64.
$$

A canonical 2-in / 2-out transaction:

$$
C_{2,2} \;=\; 23{,}000 + 904 + 64 \;\approx\; 24{,}000 \text{ constraints}.
$$

| Component | Per-input | Per-output | Subtotal |
|-----------|-----------|------------|----------|
| Note commitment (input) | 388 | — | $388n$ |
| Merkle path (depth 32) | 10,400 | — | $10{,}400n$ |
| PRF evaluations (pk + nf) | 648 | — | $648n$ |
| Range proof (input value) | 64 | — | $64n$ |
| Note commitment (output) | — | 388 | $388m$ |
| Range proof (output value) | — | 64 | $64m$ |
| Fee range proof | — | — | 64 |

On commodity hardware (Apple M2, 8-core), Groth16 proving for ~24,000 constraints takes **0.5–1.5 seconds** with arkworks or snarkjs. Proof size is **128 bytes** compressed (BN254 G1/G2 compression on Solana). Verification is **~150,000–200,000 CU** via `sol_alt_bn128_*` syscalls.

## What SPST is not

SPST handles private *value transfer* — the Solana analogue of Zcash's Sapling. It does **not**:

- Handle private *computation*. The next post ([PPST](/blog/ppst_private_programmable_state/)) extends the relation to arbitrary arithmetic circuits over private state.
- Hide the *submitter*.
The transaction submitter is still publicly identified by their Ed25519 signature on the wrapping Solana transaction. [TAB](/blog/tab_threshold_anonymous_broadcast/) addresses that.
- Hide the *fee amount*. $f$ is necessarily public for validator compensation.

But it does the load-bearing thing: the user becomes self-sovereign with respect to fee payment. Combined with TAB and PPST, that's the whole framework.

## Bibliography

- Ben-Sasson, E. et al. (2014). *Zerocash.* IEEE S&P 2014. https://eprint.iacr.org/2014/349
- Hopwood, D. et al. (2016–2026). *Zcash Protocol Specification.* https://zips.z.cash/protocol/protocol.pdf
- Groth, J. (2016). *On the Size of Pairing-based Non-interactive Arguments.* EUROCRYPT 2016. https://eprint.iacr.org/2016/260
- Bowe, S., Gabizon, A. (2019). *Making Groth's zk-SNARK Simulation Extractable.* https://eprint.iacr.org/2019/197
- Baghery, K., Pindado, Z., Ràfols, C. (2023). *Simulation Extractable versions of Groth's zk-SNARK Revisited.* https://doi.org/10.1007/s10207-023-00750-7
- Pedersen, T. P. (1991). *Non-Interactive and Information-Theoretic Secure Verifiable Secret Sharing.* CRYPTO 1991.
- Grassi, L. et al. (2021). *Poseidon: A New Hash Function for Zero-Knowledge Proof Systems.* USENIX Security 2021. https://eprint.iacr.org/2019/458
- Aztec Documentation. *Indexed Merkle Tree (Nullifier Tree).* https://docs.aztec.network/

Previous: [The fee paradox ←](/blog/the_fee_paradox/) · Next: [PPST: private programmable state →](/blog/ppst_private_programmable_state/)

---

# Circom, by example

Canonical: https://blog.skill-issue.dev/blog/circom_by_example/

Description: A DSL primer told through one circuit — proving knowledge of a Poseidon pre-image. Every Circom keyword annotated as it appears, the constraint graph drawn out, and the R1CS fall-through to a witness.
Published: 2026-04-30T13:00:00.000Z

Tags: circom, dsl, r1cs, zk, snark, poseidon, phd

import { Mermaid, Sandbox, TradeoffTable, Aside, Quote } from "@/components/mdx";

There are two ways to write a zero-knowledge circuit. You can spell out the algebraic constraints by hand — `(a) * (b) === c`, one line per multiplication, every wire indexed manually — or you can write something that *reads like a program* and let a compiler emit the constraints. The first approach gives you total control and zero leverage. The second approach gives you Circom.

Circom is the DSL Iden3 designed in 2018 to make Groth16-style circuit authoring tractable. Six years later, in 2026, it is still the language most production ZK pipelines reach for first. The reason is not that it is the most expressive — Halo2 and the Noir frontend [Aztec](https://aztec.network) ships are both more powerful — but that its compilation target (R1CS) is the format every Groth16 toolchain on Earth speaks, and its tooling (snarkjs, circomlib, circomlibjs) is the deepest in the ecosystem.

This post is a walk through Circom from the inside out, told via one circuit: *prove I know `x` such that `Poseidon(x, 0) == y` without revealing `x`*. Pre-image of a hash. The "hello world" of shielded systems. By the end you'll have read every Circom keyword that matters, seen the constraint graph it generates, and watched the witness get computed in your browser.

## What R1CS actually is, in five paragraphs

Before any DSL, the substrate. A **rank-1 constraint system** is a list of constraints of the form

$$
(\mathbf{a}_i \cdot \mathbf{w})\,(\mathbf{b}_i \cdot \mathbf{w}) - (\mathbf{c}_i \cdot \mathbf{w}) = 0
$$

where $\mathbf{w}$ is the **witness vector** (every wire in your circuit, including inputs, outputs, intermediates, and a leading constant `1`), and $\mathbf{a}_i, \mathbf{b}_i, \mathbf{c}_i$ are constant vectors that pick out which wires participate in the *i*-th constraint.
Every constraint is of the form *(linear combination)* × *(linear combination)* = *(linear combination)*. Hence "rank 1": each side is at most one multiplication.

What this *means* is: every constraint can express **exactly one multiplication of two wires**, plus arbitrary additions and constant scalings on either side. `(2*x + 3*y) * (z) === w + 1` is one R1CS constraint. `x * y * z === w` is two — you need an intermediate `t = x*y` and then `t * z === w`. You can feel the shape of the cost function: addition is free, multiplication is expensive.

Why this exact shape? Because Groth16 (and its predecessors in the Pinocchio/QAP family) reduces an R1CS to a polynomial-divisibility check, and that reduction works exactly when each constraint is rank 1. The circuit's *number of constraints* becomes the dominant factor in proof time and zkey size. Constraints, not wires, not gates.

In production Circom, you'll see constraint counts ranging from ~50 (a single range check) to ~10,000 (a Merkle-32 path with Poseidon nodes) to ~2,000,000 (a circuit verifying an EVM block). Every increment is a multiplication that someone wrote, intentionally or not. **A good Circom programmer thinks like an accountant.**

The witness is generated *outside* the constraint system, by a witness-generator program the Circom compiler emits as WebAssembly. The constraint system *checks* the witness; it does not compute it. This separation is fundamental to how SNARKs work: prover knows everything, verifier checks much less.
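The `x * y * z === w` example above can be made concrete. Here is a minimal R1CS checker in TypeScript over a toy 101-element field; the witness layout and the two rank-1 constraints are my own illustrative encoding, not anything the Circom compiler emits.

```typescript
// Minimal R1CS checker: each constraint is (a·w)(b·w) - (c·w) = 0 over F_101.
// Encodes x*y*z === w as two rank-1 constraints via the intermediate t = x*y.
// Witness vector layout: w = [1, x, y, z, out, t].
const P = 101n;
const mod = (x: bigint) => ((x % P) + P) % P;
const dot = (row: bigint[], w: bigint[]) =>
  mod(row.reduce((s, c, i) => s + c * w[i], 0n));

type Constraint = { a: bigint[]; b: bigint[]; c: bigint[] };
const satisfied = (cs: Constraint[], w: bigint[]) =>
  cs.every(({ a, b, c }) => mod(dot(a, w) * dot(b, w) - dot(c, w)) === 0n);

// wire indices:       1   x   y   z  out  t
const constraints: Constraint[] = [
  { a: [0n, 1n, 0n, 0n, 0n, 0n],   // t = x * y
    b: [0n, 0n, 1n, 0n, 0n, 0n],
    c: [0n, 0n, 0n, 0n, 0n, 1n] },
  { a: [0n, 0n, 0n, 0n, 0n, 1n],   // out = t * z
    b: [0n, 0n, 0n, 1n, 0n, 0n],
    c: [0n, 0n, 0n, 0n, 1n, 0n] },
];

// x=3, y=4, z=5 -> t=12, out=60
const good = [1n, 3n, 4n, 5n, 60n, 12n];
const bad  = [1n, 3n, 4n, 5n, 61n, 12n]; // wrong output wire

console.log(satisfied(constraints, good)); // true
console.log(satisfied(constraints, bad));  // false
```

Note what the checker does *not* do: it never computes `t` or `out`. The witness arrives fully populated, and the system only verifies it, which is exactly the prover/verifier split described above.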
## A first circuit — knowledge of a Poseidon pre-image

```circom
pragma circom 2.1.5;

include "circomlib/poseidon.circom";

template KnowsPreimage() {
    signal input x;               // private witness — the value being hidden
    signal output y;              // public output — the published hash

    component hash = Poseidon(2); // 2-input Poseidon hash gadget
    hash.inputs[0] <== x;         // wire x in
    hash.inputs[1] <== 0;         // pad with 0

    y <== hash.out;               // expose the result
}

component main { public [y] } = KnowsPreimage();
```

That's the whole circuit. Every keyword, in order:

- `pragma circom 2.1.5` — version pin. Circom is post-1.0; the language has minor breaking changes between minor versions, and circomlib's gadgets target specific ranges. Pin or suffer.
- `include "circomlib/poseidon.circom"` — the include resolves against the `--node-modules` flag or `circomlib`'s install path. Includes are textual — there's no module system in the npm sense, only file inclusion.
- `template KnowsPreimage()` — a parameterised circuit fragment. Templates are like generic functions: you instantiate them with `component foo = KnowsPreimage();`. The lowercase/uppercase convention (Templates uppercase, components lowercase) is community style, not enforced.
- `signal input x;` — a wire that flows *in* to this template. `signal output y;` — flows *out*. Without `input` or `output`, `signal foo;` is an internal wire.
- `component hash = Poseidon(2);` — instantiate a sub-circuit. `Poseidon` is a template defined in circomlib; the `(2)` is its parameter (number of inputs). Components compose hierarchically; the compiler inlines them at constraint-emission time.
- `hash.inputs[0] <== x;` — the **constraint operator**. `<==` does *two* things at once: it (a) emits the R1CS constraint that wires `x` and `hash.inputs[0]` are equal, and (b) marks the right-hand side as the source for witness generation (so the WASM witness generator knows to copy `x`'s value into `hash.inputs[0]`).
- `y <== hash.out;` — same operator, exposing the hash output.
- `component main { public [y] } = KnowsPreimage();` — the entry point. The `public` annotation says: when the verifier checks the proof, `y` is the public input. Everything else (here, just `x`) is private to the prover.

Three operators every Circom programmer types daily:

| Operator | What it does | Witness side | Constraint side |
|---|---|---|---|
| `<--` | witness only | assigns | no constraint emitted |
| `===` | constraint only | no witness assignment | emits constraint |
| `<==` | both | assigns | emits constraint |

`<--` shows up when you compute something the constraint system can't (square-root, division, lookup) and then post-hoc constrain it with `===`. `===` shows up alone when the relationship is implicit and you want to assert it. `<==` is the day-to-day workhorse.

## What that circuit compiles to

The Circom compiler (`circom2`) emits four artifacts:

1. **`circuit.r1cs`** — the constraint system, in a binary format the rest of the toolchain consumes.
2. **`circuit.wasm`** — the witness generator, a WASM module that takes the inputs as JSON and returns the witness vector.
3. **`circuit.sym`** — symbol table mapping wire indices back to source-code names. Invaluable for debugging.
4. **`circuit.json`** *(optional, `--json`)* — the constraint system in human-readable JSON. Slow to parse; useful for one-off inspection.

For our pre-image circuit, the R1CS file contains roughly the constraints below — Poseidon's S-box rounds, MDS multiplications, output binding. The constraint graph looks like this:

<Mermaid chart={`graph TD
  X[private input x] --> H0[Poseidon input 0]
  Z[constant 0] --> H1[Poseidon input 1]
  H0 --> S1[round 1 S-box]
  H1 --> S1
  S1 --> M1[MDS mix]
  M1 --> S2[round 2 S-box]
  S2 --> M2[...64 more rounds...]
  M2 --> O[Poseidon output]
  O --> Y[public output y]
  classDef pub fill:#0a4014,stroke:#4ade80,color:#fff
  classDef priv fill:#3a0a0a,stroke:#f87171,color:#fff
  class Y pub
  class X priv`}/>

Total constraint count for `KnowsPreimage` against BN254-Poseidon-128 with $t=3, R_F=8, R_P=57$: **243 constraints** for the hash, plus ~3 for the input wiring. Call it 246 R1CS constraints. snarkjs Groth16 will prove it in under 80 ms in a browser, including witness generation. (Numbers from [Proving in the browser, by the numbers](/blog/proving_in_the_browser_by_the_numbers/).)

## Compile and run, in your browser

`circomlibjs` ships a browser build that includes the witness generator and the Poseidon constants without requiring you to install the Rust-based `circom2` compiler. Below is a Sandpack `node` template that takes the inputs to our circuit, computes the witness, and emits the expected hash. It's not a full proof — the proving step needs the zkey, which is megabytes — but it's the *witness generation* half of the pipeline, end-to-end, in a browser.

What you should see when this runs: a public output `y` that is the Poseidon hash of `(1234567890, 0)` over BN254. That value is what would be posted on-chain or shipped over the wire. The proof would convince the verifier that someone knew an `x` mapping to that `y`, without revealing `x`.

## Some Circom patterns worth internalising

A handful of patterns recur across every real circuit. They're idioms more than language features.

**Bit decomposition.** Circom doesn't have a native `< 2^n` predicate. You decompose into bits and constrain each bit to be 0 or 1:

```circom
template Num2Bits(n) {
    signal input in;
    signal output out[n];

    var lc1 = 0;
    var e2 = 1;
    for (var i = 0; i < n; i++) {
        out[i] <-- (in >> i) & 1;    // witness only
        out[i] * (out[i] - 1) === 0; // constrain to 0 or 1
        lc1 += out[i] * e2;
        e2 *= 2;
    }
    lc1 === in;                      // re-aggregate must match
}
```

The `<--` followed by `===` is the canonical witness-then-check pattern.
The bits are computed outside the constraint system (you can't shift in R1CS) and *then* constrained to be valid.

**Conditional selection.** R1CS has no `if`. You select between two values with a Boolean:

```circom
// out = sel ? a : b, where sel must be 0 or 1
out <== a + (b - a) * (1 - sel);
```

**MUX trees.** A common pattern in Merkle paths: at each level, pick the left or right sibling based on the path bit. circomlib's `MultiMux1` template does this efficiently for `t`-element vectors.

## Circom vs the alternatives in 2026

The case *for* Circom in 2026 is one word: **circomlib**. Six years of accreted gadgets — Poseidon, MiMC, Pedersen, EdDSA, Merkle, range checks, Sigma protocols, set membership — that all interoperate cleanly because they target the same R1CS-over-BN254 substrate.

The case *against* is also one word: **expressivity**. Circom is a templating engine over arithmetic constraints. It can't loop over a runtime-known length, can't recurse, has no first-class strings or arrays beyond fixed-size. For complex circuits the workarounds get baroque.

Inside [zera-sdk](/blog/zera_sdk_scaffolding/) we use Circom for the deposit / transfer / withdraw circuits because circomlib's Poseidon and MerkleTreeChecker gadgets are fight-tested and because snarkjs is the only browser prover that ships in a single npm install. The day we need lookups (or recursion) at scale, the discussion is Halo2 vs Noir, not Circom.

## What I would change if I were Circom 2.5

Three things, ranked by how much I'd actually use them.

1. **First-class lookup tables.** Halo2 has them and they cut range-check costs by orders of magnitude. Plookup-as-a-tagged-include in Circom would close most of that gap.
2. **Module system.** `include` is textual. Circular includes silently drop. A real module graph with explicit exports would prevent a class of bug I see in every audit.
3. **Compiler-level constraint optimisation.** The compiler already does basic linear-combination flattening.
Aggressive common-subexpression elimination across templates would shave 10–20% off circomlib's bigger gadgets at zero source-code cost.

None of these are coming, as far as I can tell. The Iden3 team has moved most of its energy to [Polygon ID](https://polygon.technology/polygon-id) and the Circom roadmap has been relatively quiet through 2025–2026. That's fine — the language is *done* in the way that good DSLs eventually become done. If you want what comes next, you go look at Noir and Halo2.

## Further reading

- [Circom 2 documentation](https://docs.circom.io) — the canonical language reference
- [CIRCOM: A Robust and Scalable Language for Building Complex Zero-Knowledge Circuits](https://www.techrxiv.org/articles/preprint/CIRCOM_A_Robust_and_Scalable_Language_for_Building_Complex_Zero-Knowledge_Circuits/19374986) — Bellés-Muñoz, Isabel, Muñoz-Tapia, Rubio, Baylina (2022)
- [`iden3/circom`](https://github.com/iden3/circom) — the compiler source
- [`iden3/circomlib`](https://github.com/iden3/circomlib) — the gadgets library that makes Circom usable in production
- [`iden3/circomlibjs`](https://github.com/iden3/circomlibjs) — JavaScript port of the cryptographic primitives, what the Sandpack above uses
- [Poseidon, by hand and by code](/blog/poseidon_by_hand_and_by_code/) — what the hash gadget actually is
- [Proving in the browser, by the numbers](/blog/proving_in_the_browser_by_the_numbers/) — what happens after `circom` finishes compiling

---

# Proving in the browser, by the numbers

Canonical: https://blog.skill-issue.dev/blog/proving_in_the_browser_by_the_numbers/

Description: What is actually feasible inside a browser tab in 2026 — Groth16 prover times for Poseidon, Range, and Merkle circuits, the WASM threading story, and where the main thread stops being a viable home for your prover.
Published: 2026-04-29T16:00:00.000Z

Tags: wasm, groth16, snarkjs, arkworks, browser, zk, phd, performance

import { Mermaid, Sandbox, TradeoffTable, Aside, Quote } from "@/components/mdx";

The first time I watched a Groth16 proof finish inside a Chrome tab — Poseidon-128, two-input Merkle membership, a couple of range checks — the spinner ran for **11.4 seconds**. The user expected something between *Apple Pay* and *autocomplete*. Eleven seconds is forever.

Two years and several browser releases later, the same circuit on the same laptop ([2024 MacBook Air, M3, 8 cores, 16 GB](https://support.apple.com/en-us/SP891)) finishes in **2.1 seconds**, with a warm zkey, threads pinned, and SIMD on. That's still not Apple Pay, but it is inside the *I just clicked something* envelope where users don't bail. The gap between those two numbers is the entire content of this post: what part of the browser stack moved, what didn't, and what the limit looks like in 2026.

This is not a tutorial. It's a benchmark walk and a tradeoff inventory. If you're picking a prover for a wallet or a dApp this quarter — and inside [zera-sdk](/blog/zera_sdk_scaffolding/) we just made this call again, see [RFC 001](/docs/001-zera-sdk-monorepo-shape/) — the numbers below are the ones that informed our pick.

## What "in the browser" actually means in 2026

A modern browser gives a WASM prover three things it didn't have when snarkjs first shipped in 2019:

1. **WebAssembly threads.** A `SharedArrayBuffer` plus the `Atomics` API plus `wasm-bindgen-rayon` lets a Rust prover spawn a worker pool from a single `.wasm` module. This needs cross-origin isolation (`Cross-Origin-Opener-Policy: same-origin` and `Cross-Origin-Embedder-Policy: require-corp`) — see the [`wasm-bindgen-rayon` README](https://github.com/RReverser/wasm-bindgen-rayon) for the headers your CDN needs.
2. **128-bit SIMD.** WebAssembly's [fixed-width SIMD proposal](https://github.com/WebAssembly/simd) is shipped on Chrome, Firefox, Safari.
For BN254 prover work — multi-scalar multiplication, NTTs, big-integer reduction — SIMD is the difference between *feasible* and *please install our desktop app*.
3. **Bulk memory operations.** `memory.copy` / `memory.fill` cut several ms off witness allocation for circuits with hundreds of thousands of wires.

The fourth thing the browser stack gives you is a *worker model* that decouples proving from rendering. If you call your prover on the main thread, every microtask boundary stalls the React fibres and the user sees a frozen UI. The same prover, moved into a `Worker`, keeps the page interactive while pegging another core. Almost every wallet that ships ZK in 2026 — including the ones that look fast — does this.

<Mermaid chart={`graph TD
  UI[main thread] -->|postMessage proof input| W[Worker]
  W -->|spawns rayon pool| WS[Shared WASM memory]
  WS --> T1[thread 1 - MSM]
  WS --> T2[thread 2 - MSM]
  WS --> T3[thread 3 - NTT]
  WS --> T4[thread 4 - NTT]
  T1 --> G[gather]
  T2 --> G
  T3 --> G
  T4 --> G
  G -->|postMessage proof| UI`}/>

## The benchmark numbers, on three workhorse circuits

The numbers below are for three circuits I keep coming back to because every shielded-pool design I've shipped uses some flavour of all three:

- **Poseidon-128, 2-to-1.** ~243 R1CS constraints. The hash building block. (Background: [Poseidon, by hand and by code](/blog/poseidon_by_hand_and_by_code/).)
- **Range-16.** Prove $0 \le x < 2^{16}$ via 16-bit decomposition + Boolean constraints. ~50 R1CS constraints. The "this amount is positive and not absurd" check.
- **Merkle-32.** Membership in a depth-32 Poseidon Merkle tree. ~32 × 243 ≈ **7,800** constraints.

All numbers below are wall-clock proof generation time, with a warm zkey loaded into IndexedDB and the prover already instantiated. Cold-start (first load, parsing the zkey) adds 2–6 s on top depending on the circuit size and the user's network. **That cold-start is usually the bigger UX problem** — see the closing notes.
| Circuit | snarkjs 0.7 (1 thread) | snarkjs 0.7 (4 threads) | arkworks-circom WASM (4 threads) |
|---|---|---|---|
| Poseidon-128 | ~95 ms | ~50 ms | ~25 ms |
| Range-16 | ~40 ms | ~30 ms | ~15 ms |
| Merkle-32 | ~2,400 ms | ~900 ms | ~410 ms |

The arkworks numbers come from a Rust prover compiled to WASM with `wasm-bindgen-rayon` and the same R1CS the snarkjs path consumes. The cliff between snarkjs and arkworks-WASM at Merkle-32 is the thing to internalise: at the constraint counts that real applications hit, the gap between "JavaScript with WASM hot loops" and "Rust compiled to WASM" is roughly **5×** in proving time. That ratio is consistent with the Mopro team's [comparison of Circom provers](https://zkmopro.org/blog/circom-comparison/) — they measure native Rust provers at 5–10× snarkjs speed, with the WASM Rust prover sitting roughly halfway between them.

## A field-arithmetic micro-benchmark you can run right now

Underneath the prover-level numbers, the floor of all of this is *how fast the browser can raise a 254-bit BigInt to the fifth power*. That's the inner loop of every Poseidon round. Here's a tiny `vanilla-ts` benchmark that times $x^5$ over BN254's prime for 10,000 iterations and reports ops/sec. Run it on your laptop and on your phone — the gap is the gap between "proving on a wallet" and "proving on a desktop".

<Sandbox template="vanilla-ts" files={{ "/index.ts": `// BN254 scalar field modulus (the prime Poseidon works over).
const p = 21888242871839275222246405745257275088548364400416034343698204186575808495617n;

// x^5 mod p: the Poseidon S-box, three modular multiplications.
function pow5(x: bigint): bigint {
  const x2 = (x * x) % p;
  const x4 = (x2 * x2) % p;
  return (x4 * x) % p;
}

const out = document.getElementById("out") as HTMLElement;
const runBtn = document.getElementById("run") as HTMLElement;

// Time iters sequential S-box evaluations, return ops/sec.
function bench(iters: number): number {
  let x = 3n;
  const t0 = performance.now();
  for (let i = 0; i < iters; i++) x = pow5(x);
  const t1 = performance.now();
  return iters / ((t1 - t0) / 1000);
}

function format(opsPerSec: number): string {
  if (opsPerSec > 1_000_000) return (opsPerSec / 1_000_000).toFixed(2) + " Mops/s";
  if (opsPerSec > 1_000) return (opsPerSec / 1_000).toFixed(1) + " Kops/s";
  return opsPerSec.toFixed(0) + " ops/s";
}

function run() {
  out.textContent = "running 10,000 x^5 mod p ops over BN254...\\n";
  // Run several rounds so the median is meaningful.
  const rounds = 5;
  const results: number[] = [];
  for (let r = 0; r < rounds; r++) {
    const ops = bench(10_000);
    results.push(ops);
    out.textContent += \`round \${r + 1}: \${format(ops)}\\n\`;
  }
  results.sort((a, b) => a - b);
  const median = results[Math.floor(rounds / 2)];
  out.textContent += \`\\nmedian: \${format(median)}\\n\`;
  out.textContent += \`\\nfor reference:\\n\`;
  out.textContent += \` snarkjs WASM prover: ~10x this\\n\`;
  out.textContent += \` arkworks compiled to WASM: ~20-30x this\\n\`;
  out.textContent += \` native Rust on the same CPU: ~50-100x this\\n\`;
}

runBtn.addEventListener("click", run);
run();
`, "/index.html": `
<button id="run">run</button>
<pre id="out">starting...</pre>
`, }} /> On my M3 Air this run reports about **0.9 Mops/s** for raw `BigInt` $x^5$. The published snarkjs WASM prover for the same operation hits roughly **9 Mops/s** — a 10× win from hand-rolled big-int arithmetic in WASM. Compiled-Rust BigInt code (`ark-ff` over BN254) hits **20–35 Mops/s** in WASM. Native Rust hits **70–100+ Mops/s** depending on assembly tuning. That stack of orders-of-magnitude is why prover libraries are not written in JavaScript even when the deployment target is the browser. ## The four-way prover tradeoff The take-home from running these benchmarks for a year is simple: **for circuits under ~10k constraints the choice barely matters; for circuits over ~100k constraints the choice is the entire performance story.** Most wallet circuits live in the murky middle — 5k to 50k constraints — where snarkjs is fine for now and arkworks-WASM is a 2026 upgrade I keep on the roadmap. ## When the main thread is fine, and when it isn't A sloppy heuristic that I've found holds up: $$ t_{\text{prove}} > 100\text{ ms} \implies \text{move to a Worker} $$ Below 100 ms the cost of `postMessage` round-trips (serialising witness inputs, copying the proof back) eats most of the win. Above that, you're in user-perceptible territory and the main thread stops being viable. The empirical numbers in the table above mean: **Poseidon and Range can stay on the main thread; Merkle paths and anything wallet-shaped should move to a Worker.** A second heuristic, less popular but more important: **don't put your prover in a `requestIdleCallback`**. The user clicked *Send*. They are waiting. Promote the work, don't defer it. ## Where the cold-start really lives Proof generation time is the metric people quote. Cold-start is the metric people *feel*. The pieces of cold-start, in order of size: 1. **Zkey download.** A Merkle-32 zkey is ~25 MB. A two-input shielded-pool circuit zkey can be 80+ MB. Download time dominates everything else on a phone on LTE. 2. 
**Zkey parse + prover instantiation.** snarkjs parses the zkey eagerly into typed-array views; arkworks-WASM mmap-parses lazily. The gap is 1.5–4 s on a Merkle-32 zkey. 3. **WASM compilation.** `WebAssembly.instantiateStreaming` with the right MIME type lets the browser pipeline compile and download. Without it you pay the full compile after the download finishes. This is a CDN-config bug in the wild more often than it should be. 4. **Worker pool spin-up.** ~50 ms per worker. Pre-spin them on page load, not on first proof. If you can only optimise one thing, it should be (1). IndexedDB-backed lazy chunks of the zkey, served with `Cache-Control: immutable, max-age=31536000`, change first-load from "ten seconds of nothing" to "one second of yellow flicker, then proof". This is what we do in the [zera-sdk](/blog/zera_sdk_scaffolding/) wallet path and it's the single biggest UX win we shipped in Q1 2026. ## What I'd build differently in 2027 Three things, ranked. 1. **Prover pre-warming on idle.** The moment a user authenticates, fire the worker pool and pre-load the zkey. By the time they tap *Send*, the prover is hot. This is just engineering, not cryptography, but it's the missing piece in every wallet I've benchmarked. 2. **Move to a folding-friendly proving system for batch operations.** A user spending three notes from a UTXO pool is doing three Merkle paths back-to-back. Folding (Nova / SuperNova / ProtoStar) makes the *N*th proof nearly free; Groth16 makes the *N*th proof exactly *N* times the cost. 3. **Replace the per-vendor zkey format with something content-addressed.** Today every project ships its own `.zkey` blobs and every wallet has to host them. A `zkey://sha256/abc...` resolver — backed by IPFS or an HTTP CDN — would let multiple wallets share the same zkey load and the same browser cache. ## What this means for ZERA today Inside zera-sdk the in-browser path is still snarkjs (per [RFC 001](/docs/001-zera-sdk-monorepo-shape/)). 
The neon-rs Node path is a native Rust prover and ~30× faster, but that's not what a web wallet runs. The arkworks-WASM upgrade is on the roadmap as a "browser v2" target — see the open issue thread linked from the SDK repo. The decision-driver was simple: snarkjs is good enough for one-shot deposits and transfers. The day we want to make a 10-note batch tx feel instantaneous, we need either folding (Nova) or a faster underlying prover (arkworks-WASM). For now: snarkjs, threads on, SIMD on, zkey pinned to IndexedDB, prover lifted to a Worker. **That gets us 2 seconds of proving time at Merkle-32 on a mid-range laptop in 2026.** The next 50% will come from arkworks; the 5× after that will come from folding. The 50× after *that* will come from someone else's algorithmic breakthrough that I don't yet know about. ## Further reading - [snarkjs](https://github.com/iden3/snarkjs) — Iden3, the reference WASM Groth16 prover; benchmark table in the README - [Mopro: comparison of Circom provers](https://zkmopro.org/blog/circom-comparison/) — community benchmark of snarkjs / arkworks / native Rust at matched circuits, 2024 - [`wasm-bindgen-rayon`](https://github.com/RReverser/wasm-bindgen-rayon) — RReverser, the SharedArrayBuffer-backed Rayon adapter that makes multi-threaded Rust WASM work in browsers - [WebAssembly fixed-width SIMD proposal](https://github.com/WebAssembly/simd) — the standard your prover wants enabled - [Marlin: Preprocessing zkSNARKs with Universal and Updatable SRS](https://eprint.iacr.org/2019/1047) — Chiesa, Hu, Maller, Mishra, Vesely, Ward (2019) — the paper that made universal SRS practical - [Nova: Recursive Zero-Knowledge Arguments from Folding Schemes](https://eprint.iacr.org/2021/370) — Kothapalli, Setty, Tzialla (2021) — the folding paper, for context on why batch proving is becoming a different game - [Poseidon, by hand and by code](/blog/poseidon_by_hand_and_by_code/) — the inner loop your browser is running 65 times per Merkle level --- # 
Merkle inclusion proofs over compressed account state on Solana Canonical: https://blog.skill-issue.dev/blog/merkle_inclusion_compressed_solana/ Description: How a 32-byte hash and a logarithmic path replace a multi-kilobyte account. Walk the tree-height math, the Light Protocol compressed-account model, and an inclusion-proof construction you can run in Node. Published: 2026-04-29T15:00:00.000Z Tags: cryptography, merkle, solana, light-protocol, compression, zk, phd import { Mermaid, Sandbox, TradeoffTable, Aside, Quote, RustPlayground } from "@/components/mdx"; The cheapest piece of state in a privacy pool — and the most contested one — is the **commitment tree**. Every shielded note's commitment goes in. Every spend proves an inclusion. The tree is read on every transfer and written on every deposit. If the tree state is expensive, every operation is expensive. If the inclusion proofs are big, every spend is big. In 2024 Light Protocol shipped [ZK Compression](https://www.zkcompression.com/references/whitepaper) on Solana, and the production primitive for "store a lot of state cheaply, prove inclusion in zero-knowledge" became standard. This post is the math behind that primitive, the deployment shape, and a runnable inclusion-proof construction. It's a sibling piece to [Pedersen commitments, in production](/blog/pedersen_commitments_in_production/) and [Nullifiers without the witchcraft](/blog/nullifiers_without_witchcraft/) — those tell you what we put *into* the tree; this one tells you how we prove things *about* the tree. ## The minimum tree A binary Merkle tree is the simplest commitment scheme that supports logarithmic-size inclusion proofs. Start with a sequence of leaves $\ell_0, \ell_1, \dots, \ell_{N-1}$. Define the tree recursively: $$ \text{node}(i, j) = H(\text{node}(i, m) \,\|\, \text{node}(m+1, j)), \quad m = \lfloor (i+j)/2 \rfloor, $$ with $\text{node}(i, i) = \ell_i$ at the leaves. The **root** is $\text{node}(0, N-1)$. 
To prove that $\ell_k$ is in the tree, you reveal the root plus the **co-path** — for each level $j$, the sibling node along the path from leaf $k$ to the root. There are exactly $\log_2 N$ siblings. The proof size is $\log_2 N$ hash outputs. At $N = 2^{32}$ leaves and 32-byte hashes, that's **1024 bytes**. At $N = 2^{20}$ (about a million leaves), it's 640 bytes. The verifier cost is $\log_2 N$ hashes. Both numbers are unreasonably small compared to the account-state cost of storing all $N$ leaves directly on chain. That's the entire shape. Two equations and a co-path. The reason it shows up everywhere is that nothing else hits the same combination of small proof, cheap verifier, and append-only update path.

<Mermaid chart={`graph TD
R[root] --> A[h_AB]
R --> B[h_CD]
A --> A1[h_A]
A --> A2[h_B]
B --> B1[h_C]
B --> B2[h_D]
A1 --> L0[leaf 0: commitment_0]
A2 --> L1[leaf 1: commitment_1]
B1 --> L2[leaf 2: commitment_2]
B2 --> L3[leaf 3: commitment_3]
classDef leaf fill:#0a0a0a,stroke:#4ade80,color:#4ade80
classDef node fill:#1a1a1a,stroke:#a3a3a3,color:#e8e8e8
class L0,L1,L2,L3 leaf
class R,A,B,A1,A2,B1,B2 node`}/>

To prove inclusion of leaf 1 (`commitment_1`), the prover reveals the co-path `[h_A, h_CD]`. The verifier hashes up: `h_B = leaf_1`, `h_AB = H(h_A || h_B)`, `root = H(h_AB || h_CD)`, and checks that `root` matches the public root.

## Inclusion proof size, exactly

For a tree of height $h$ (so $N \le 2^h$ leaves) with hash output size $s$ bytes, the inclusion proof is exactly $h \cdot s$ bytes plus the leaf index (typically 4 bytes). For Poseidon over BN254 with $s = 32$: $$ |\pi_{\text{inclusion}}| = 32 h + 4 \text{ bytes}. $$ A useful table for production planning:

| Height $h$ | Capacity ($2^h$ leaves) | Proof size ($32h + 4$) |
|---|---|---|
| 20 | ~1.05M | 644 B |
| 26 | ~67M | 836 B |
| 30 | ~1.07B | 964 B |
| 32 | ~4.29B | 1,028 B |

Light Protocol's state trees default to $h = 26$ — see [their account-compression program](https://github.com/Lightprotocol/light-protocol/tree/main/programs/account-compression) — which gives 67 million leaves of capacity per tree.
For [zeraswap](/blog/zeraswap_compressed_amm/) and the [`Dax911/z_trade/programs/zeraswap`](https://github.com/Dax911/z_trade/tree/main/programs/zeraswap) program, we use the same $h = 26$ default for the same reason: it's the right balance between proof size and capacity, and it's what the on-chain compression program is parameterised for.

## Compressed accounts, in one diagram

The Light Protocol model is the cleanest way to think about "Solana accounts that don't take Solana account space." A compressed account is a tuple of fields hashed together; the hash is a leaf in a state tree; the tree's root is what lives in account state on chain.

<Mermaid chart={`graph TD
subgraph Acct[Compressed account]
A[account fields] --> AD[discriminator]
A --> AO[owner]
A --> AL[lamports]
A --> ADH[data hash]
A --> AAH[address hash]
end
subgraph OnChain[On-chain state tree]
H[hash to leaf] --> L[leaf in Merkle tree]
L --> R[Merkle root]
R --> SA[Solana account: just the root]
end
A --> H
classDef cell fill:#0a0a0a,stroke:#4ade80,color:#4ade80
classDef chain fill:#1a1a1a,stroke:#737373,color:#a3a3a3
class A,AD,AO,AL,ADH,AAH cell
class H,L,R,SA chain`}/>

The on-chain footprint is the root (32 bytes) plus the rolling-hash update buffer (a few KB amortised across many writes). The account data, the discriminator, the owner, the lamports — none of that lives in account state. It lives in the indexer's Postgres or in a Photon RPC node, and it's reconstructed at proof-construction time. The inclusion proof is the trick that makes this work. To execute against a compressed account, the client constructs an inclusion proof against a recent root, the on-chain program verifies the proof against the root it has stored, and the program operates on the (now-trusted) account contents. The root is the only piece of state that has to live on chain. Everything else is reconstruction. Compressed accounts are stored as leaves in append-only Merkle trees, with only the tree's root maintained in Solana account state.
State validity is enforced through inclusion proofs verified by the on-chain program at execution time, allowing arbitrary amounts of state to be referenced at constant on-chain storage cost. The reason this is *the* primitive for production privacy on Solana is that Solana's account-state cost is the load-bearing constraint. A normal Solana account is rent-exempt at roughly 0.002 SOL per kilobyte, meaning a megabyte of state costs ~2 SOL ($300+ at 2026 prices). A compressed account is storage-amortised across the tree, and the cost per leaf is sub-cent. Five orders of magnitude.

## The Node simulator

Here is a working Merkle inclusion-proof construction over a 16-leaf tree, with the prover-side path construction and the verifier-side root check. It uses SHA-256 for readability — a real ZERA tree uses Poseidon — but the algorithmic shape is identical.

<Sandbox template="node" files={{ "/index.js": `const { createHash } = require("crypto");

const H = (left, right) =>
  createHash("sha256").update(Buffer.concat([left, right])).digest();

function buildTree(leaves) {
  // Pad to power of two with zero-leaves (Light Protocol does this with a
  // canonical default-leaf hash, so empty subtrees have known roots).
  const padTo = 1 << Math.ceil(Math.log2(Math.max(leaves.length, 1)));
  const padded = leaves.slice();
  while (padded.length < padTo) padded.push(Buffer.alloc(32));
  // Bottom-up construction.
  const levels = [padded];
  while (levels[levels.length - 1].length > 1) {
    const cur = levels[levels.length - 1];
    const next = [];
    for (let i = 0; i < cur.length; i += 2) {
      next.push(H(cur[i], cur[i + 1]));
    }
    levels.push(next);
  }
  return { root: levels[levels.length - 1][0], levels };
}

function inclusionProof(levels, index) {
  const path = [];
  const directions = [];
  let idx = index;
  for (let lvl = 0; lvl < levels.length - 1; lvl++) {
    const sibling = idx % 2 === 0 ? levels[lvl][idx + 1] : levels[lvl][idx - 1];
    path.push(sibling);
    directions.push(idx % 2); // 0 = we are left, 1 = we are right
    idx = idx >> 1;
  }
  return { path, directions };
}

function verifyInclusion(leaf, index, path, root) {
  let cur = leaf;
  let idx = index;
  for (const sibling of path) {
    cur = idx % 2 === 0 ? H(cur, sibling) : H(sibling, cur);
    idx = idx >> 1;
  }
  return cur.equals(root);
}

// Demo ---------------------------------------------------------
function leafFor(i) {
  // In a real shielded pool: leaf = Poseidon(amount, asset, secret, ...)
  // Here: a synthetic leaf so we can read the demo output.
  return createHash("sha256").update(Buffer.from(\`commitment-\${i}\`)).digest();
}

const N = 16;
const leaves = Array.from({ length: N }, (_, i) => leafFor(i));
const { root, levels } = buildTree(leaves);
console.log(\`tree height: \${levels.length - 1}\`);
console.log(\`leaf count: \${N}\`);
console.log(\`root: \${root.toString("hex").slice(0, 24)}...\`);
console.log("");

// Prove inclusion of leaf 7
const idx = 7;
const { path, directions } = inclusionProof(levels, idx);
const ok = verifyInclusion(leaves[idx], idx, path, root);
console.log(\`proving inclusion of leaf \${idx}\`);
console.log(\`co-path length: \${path.length}\`);
console.log(\`directions: [\${directions.join(", ")}]\`);
console.log(\`verifies: \${ok}\`);
console.log("");

// Tampering: try to claim leaf 7 = leaves[3] (a different commitment)
const fake = verifyInclusion(leaves[3], idx, path, root);
console.log(\`tampered claim verifies: \${fake} (expected false)\`);

// Proof size in bytes
const proofBytes = path.reduce((s, p) => s + p.length, 0) + 4; // +4 for index
console.log("");
console.log(\`inclusion proof size: \${proofBytes} bytes\`);
console.log(\`(at h=26 it would be \${26 * 32 + 4} bytes)\`);
`, "/package.json": `{ "name": "merkle-demo", "version": "1.0.0", "main": "index.js" }`, }} />

Two things to notice when you run this.
The proof is *132 bytes* for a 16-leaf tree (4 hashes + an index). Scaled to $h = 26$, it's 836 bytes — independent of how many leaves are in the tree. That's the $O(\log n)$ argument with the constant factor pinned down. The other thing: the tampering attempt at the end fails because `verifyInclusion(leaves[3], 7, path, root)` re-hashes up the wrong path. The directions array is what makes this work; without it, the verifier doesn't know whether the sibling goes on the left or the right.

## Batched inclusion via Merkle Mountain Ranges

The basic Merkle tree is append-only but expensive to grow — every new leaf forces a recompute of $\log_2 N$ internal nodes. For workloads that batch many leaves at once (rollups, periodic deposit windows, settlement layers), the **Merkle Mountain Range** is the structural improvement. An MMR is a forest of perfect binary trees. New leaves are appended to the rightmost tree; when two trees of equal height exist, they're merged. The peaks (one per tree) are then "bagged" — hashed together — to produce the MMR root. The math: $$ |\text{peaks}| = \text{popcount}(N) $$ so for $N = 2^k$ there's exactly one peak, and for $N$ between powers of two the peak count is bounded by $\lceil \log_2 N \rceil$. An inclusion proof for a leaf in an MMR is the path within its containing perfect tree (size $\le \log_2 N$), plus the peaks of the other trees (size $< \log_2 N$). Total: $$ |\pi_{\text{MMR}}| \le 2 \log_2 N \cdot s $$ with $s$ the hash output size. Slightly larger than a balanced tree, but the append is amortised $O(1)$ hashes per leaf (worst case $O(\log N)$ when a run of equal-height trees cascades into merges), which matters for high-throughput deposit workloads. [Todd's original spec](https://github.com/opentimestamps/opentimestamps-server/blob/master/doc/merkle-mountain-range.md) and [Robinson's optimality result (2025)](https://eprint.iacr.org/2025/234) are the references; FlyClient and Mina use MMRs in production.
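The popcount identity is easy to sanity-check. A throwaway Node sketch of my own (not zeraswap code) that tracks only the heights of the perfect trees in the forest:

```javascript
// Track only the heights of the perfect trees in the forest. Appending a leaf
// pushes a height-0 tree, then merges while the two rightmost heights match.
function appendLeaf(peakHeights) {
  peakHeights.push(0);
  while (
    peakHeights.length >= 2 &&
    peakHeights[peakHeights.length - 1] === peakHeights[peakHeights.length - 2]
  ) {
    const h = peakHeights.pop();
    peakHeights[peakHeights.length - 1] = h + 1; // two h-trees become one (h+1)-tree
  }
}

// popcount via the binary string: count the 1 bits of N.
const popcount = (n) => n.toString(2).split("1").length - 1;

const peaks = [];
for (let N = 1; N <= 100; N++) {
  appendLeaf(peaks);
  if (peaks.length !== popcount(N)) throw new Error(`mismatch at N=${N}`);
}
console.log(peaks.length, popcount(100)); // 3 3
```

At every $N$ from 1 to 100 the forest has exactly $\text{popcount}(N)$ peaks; at $N = 100$ (binary `1100100`) that's three trees of heights 6, 5, and 2.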
For [zeraswap](/blog/zeraswap_compressed_amm/) we don't use an MMR — the deposit cadence is interactive enough that the balanced tree dominates — but the design seam is in `programs/zeraswap/src/state.rs` so we can swap if/when the deposit pattern shifts. ## What the on-chain program checks The on-chain inclusion check is two operations: 1. Recompute the root from the leaf, the index, and the co-path. This is $\log_2 N$ Poseidon hashes. On Solana with the `sol_poseidon` syscall, that's roughly $26 \times 1500 = 39{,}000$ compute units at $h = 26$. 2. Compare the recomputed root against a recent canonical root stored on-chain. Light Protocol keeps a sliding window of recent roots (the rolling-hash update buffer) so that proofs constructed against a root from 5-10 slots ago still verify. This is what makes high-throughput compressed accounts work — you can't force every prover to re-derive a proof on every slot tick. The on-chain program does not store the leaves. It does not store the inner nodes. It stores the root and the change history, period. The state-cost difference between "32 bytes on chain plus an indexer" and "32 KB per account" is the entire reason ZK Compression got to mainnet on Solana. ## What lives where, in the ZERA stack To make this concrete: ``` crates/zera-sdk-core/src/merkle.rs # Rust prover-side path construction crates/zera-sdk-onchain/src/lib.rs # on-chain root verification packages/sdk/src/merkle.ts # JS path construction (browser wallet) programs/zeraswap/src/state.rs # commitment-tree state account layout ``` All four read the same Poseidon parameters (see [the Pedersen post](/blog/pedersen_commitments_in_production/) for why we have four implementations of Poseidon and how they're cross-validated). 
The same way the Poseidon hash has to agree byte-for-byte across the four implementations, the Merkle tree construction has to agree on: - Leaf encoding (which fields go into the leaf hash, in what order) - Internal node encoding (left || right, both 32 bytes, big-endian) - Default-leaf hash for empty subtrees (Light Protocol uses Poseidon of zero; we match) - Root format (single 32-byte field element) If any of these drift, the prover and verifier disagree silently and the protocol stops accepting proofs. We have integration tests in `tests/merkle_cross_impl.rs` that build the same tree from the JS and Rust sides and assert equality at every level. They run in CI on every commit. ## Where this leaves the design space The thing I keep coming back to: a Merkle tree is the cheapest possible primitive for "prove this thing is in this set" with a logarithmic-size proof. There is no clever lattice, no exotic accumulator, that comes close on the cost-per-byte budget Solana imposes. Verkle trees are interesting on paper and impractical in production for this surface (the polynomial commitment overhead dominates at the leaf counts we care about). KZG-based vector commitments are interesting for trusted-setup-tolerant rollups and overkill for a privacy pool. So the answer is the boring one. A balanced binary Merkle tree, height 26, Poseidon hash, default-zero subtrees, sliding window of 32 recent roots. It is what Light Protocol shipped, what [zera-sdk](/blog/zera_sdk_scaffolding/) ships, and what every serious Solana privacy stack will ship through the rest of the decade. The interesting work is in the layers above (the SNARK that uses the inclusion proof, the nullifier set that enforces single-spend, the curve choice underneath the hash) — see [the curve post](/blog/why_bn254_and_when_to_switch/) for that. ## Further reading - [ZK Compression Whitepaper](https://www.zkcompression.com/references/whitepaper) — Light Protocol, 2024 — the canonical compressed-account spec. 
- [Merkle Mountain Ranges are optimal](https://eprint.iacr.org/2025/234) — Robinson (2025) — proves the MMR is space-optimal among append-only authenticated dictionaries. - [Light Protocol account-compression program](https://github.com/Lightprotocol/light-protocol/tree/main/programs/account-compression) — the on-chain implementation. - [`Dax911/z_trade/programs/zeraswap`](https://github.com/Dax911/z_trade/tree/main/programs/zeraswap) — production Rust shape for the compressed-AMM use case. - [Pedersen commitments, in production](/blog/pedersen_commitments_in_production/) — what we hash into the leaves. - [Nullifiers without the witchcraft](/blog/nullifiers_without_witchcraft/) — the dual structure that prevents double-spends. - [zeraswap: a compressed AMM](/blog/zeraswap_compressed_amm/) — sister piece on the trading layer that lives on top of these trees. --- # The fee paradox: why every smart-contract privacy mixer needs a relayer Canonical: https://blog.skill-issue.dev/blog/the_fee_paradox/ Description: On account-model chains the very act of paying a transaction fee deanonymises the recipient. This post formalises the paradox, walks through three resolutions, and sets up the SPST construction that resolves it inside the ZK proof itself. Published: 2026-04-28T16:00:00.000Z Tags: zk, cryptography, privacy, tornado-cash, railgun, pedersen, phd import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx"; The first time I read the Tornado Cash whitepaper I missed the fee paragraph. I noticed the Merkle inclusion, the nullifier hash, the snarkjs circuit. I did not notice the part where, *to actually withdraw to a fresh address*, you need somebody else to broadcast the transaction. It was buried under "operator/relayer" — a word I generously read as "convenience". It is not a convenience. It is the load-bearing wall of every smart-contract privacy mixer in production. 
This post is post 2 of 11 in the [relayerless-privacy](/series/relayerless-privacy) series. [Post 1](/blog/relayerless_privacy_intro/) introduced the framework $\mathcal{F}_{\text{RP}}$ and outlined what the rest of the series builds. Here we slow down on the fee paradox itself — what it is, why every system using fees in a public token suffers it, and the three approaches that resolve it. ## Definition: the fee paradox On any blockchain $\mathcal{B}$ with a fee-based inclusion mechanism: 1. To submit a transaction, the submitter's address must pay a gas fee $f > 0$. 2. To hold gas, the address must have been previously funded. 3. Funding an address creates an on-chain link to the funding source. Therefore, **any address that submits a transaction has a traceable funding history**. Any privacy-preserving withdrawal to a fresh (unfunded) address requires an external party to pay the gas fee. Formally, in the **fee paradox game**: 1. User $\mathcal{U}$ deposits value $v$ into a shielded pool. 2. $\mathcal{U}$ wishes to withdraw to a fresh $\mathsf{addr}_{\mathsf{recv}}$ with no prior on-chain history. 3. Adversary $\mathcal{A}$ observes all transactions. 4. If $\mathcal{U}$ funds $\mathsf{addr}_{\mathsf{recv}}$ to pay gas, $\mathcal{A}$ traces the funding source. 5. $\mathcal{A}$ wins by linking the withdrawal to a prior deposit with non-negligible advantage. $$ \mathsf{Adv}^{\mathsf{FeeParadox}}_{\mathcal{A}}(\lambda) \;\geq\; 1 - \mathsf{negl}(\lambda) \quad \text{in standard blockchain models.} $$ The advantage is overwhelming. The deck is stacked because the chain literally requires an attestation from a funded account before it will include any state transition. Privacy at the cryptographic layer collides with funding at the consensus layer, and the consensus layer wins. ## Why UTXO chains don't have this problem Bitcoin and Zcash inherit a different model. 
A UTXO transaction's "fee" is the difference between input value and output value: $$ f \;=\; \sum_i v^{\mathrm{in}}_i \;-\; \sum_j v^{\mathrm{out}}_j. $$ The miner takes $f$ as the implicit fee. There is no separate "gas account" that needs to exist beforehand. In Zcash specifically, a Sapling or Orchard transaction's `valueBalance` field — the net flow from the shielded pool to the transparent pool — *is* the fee. The binding signature proves the value commitment balance. The miner is paid out of value the prover is already moving, signed by the prover, with the prover's identity hidden by the ZK proof. Result: **Zcash is relayer-free by construction**. Penumbra, Aleo, Namada, and Monero are too — for the same reason. They all run on chains whose native fee model is fee-from-balance, not gas-from-account. The fee paradox is specific to **account-model chains** like Ethereum and Solana, where transactions require an explicit fee payer signature and the fee is debited from a known account. ## Three approaches to resolution The paper section §2.3 enumerates three approaches to resolving the paradox without a relayer: ### Approach A — Protocol-Native Fee Abstraction via ZK Fee Proofs The fee is extracted from the shielded pool *inside the ZK proof itself*. The proof attests that $$ \sum_{i=1}^{n_{\mathsf{in}}} v_i \;=\; \sum_{j=1}^{n_{\mathsf{out}}} v'_j \;+\; f $$ where $f$ is a public input to the proof and $v_i, v'_j$ are private inputs. Pedersen commitments make this clean: with $C_i = v_i \cdot G + r_i \cdot H$ for input notes and $C'_j = v'_j \cdot G + r'_j \cdot H$ for output notes, $$ \sum_i C_i \;=\; \sum_j C'_j \;+\; f \cdot G \;+\; r_\Delta \cdot H, $$ where $r_\Delta = \sum_i r_i - \sum_j r'_j$ is a blinding-factor residual that the prover demonstrates equals the right thing. The validator extracts $f$ as inclusion compensation directly. The submitter does not need a public balance. This is what SPST does, and it's the path the whole series builds toward. 
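The balance equation is worth seeing execute. A toy sketch of my own in a multiplicative group — commitments become $g^v h^r \bmod p$, so the sums above turn into products. This illustrates only the algebra; it has none of the hiding or binding guarantees of Pedersen over BN254:

```javascript
// Toy Pedersen in Z_p^*: C(v, r) = g^v * h^r mod p. The additive relation
//   sum C_i = sum C'_j + f*G + rDelta*H
// from the text becomes multiplicative here:
//   prod C_i == prod C'_j * g^f * h^rDelta,  rDelta = sum r_i - sum r'_j.
const p = 2n ** 127n - 1n; // a Mersenne prime; fine for arithmetic, not for security
const g = 5n;
const h = 7n;

const modpow = (b, e, m) => {
  let r = 1n;
  b %= m;
  for (; e > 0n; e >>= 1n, b = (b * b) % m) if (e & 1n) r = (r * b) % m;
  return r;
};
const commit = (v, r) => (modpow(g, v, p) * modpow(h, r, p)) % p;

// Two input notes and one output note, with fee f: 30 + 25 = 50 + 5.
const ins = [{ v: 30n, r: 111n }, { v: 25n, r: 222n }];
const outs = [{ v: 50n, r: 99n }];
const f = 5n;

const lhs = ins.reduce((acc, n) => (acc * commit(n.v, n.r)) % p, 1n);
const rDelta = ins.reduce((s, n) => s + n.r, 0n) - outs.reduce((s, n) => s + n.r, 0n);
const rhs =
  (((outs.reduce((acc, n) => (acc * commit(n.v, n.r)) % p, 1n) * modpow(g, f, p)) % p) *
    modpow(h, rDelta, p)) % p;

console.log(lhs === rhs); // true: the values balance, so the commitments balance

// Claiming one extra unit of fee multiplies the right side by g and breaks it.
const rhsBad = (rhs * g) % p;
console.log(lhs === rhsBad); // false
```

The point of the exercise: the validator never sees $v_i$ or $v'_j$, only the commitments and the public $f$, yet the equation pins the fee to the hidden values.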
### Approach B — Nullifier-Derived Fee Authorization A second derivation from the spending key, parallel to the nullifier: $$ \mathsf{nullifier} = \mathsf{PRF}_k(\rho), \qquad \mathsf{fee\_auth} = \mathsf{PRF}_k(\rho \,\|\, \text{``fee''} \,\|\, f). $$ The ZK proof proves both come from the same $(k, \rho)$, that $\mathsf{fee\_auth}$ encodes $f$, and that the underlying note has sufficient balance. The fee is bound to the nullifier cryptographically — no party can alter the fee post-proof-generation. This is more invasive than Approach A (changes the nullifier scheme) but gives a stronger non-malleability property. Most production designs use Approach A; Approach B becomes interesting when the protocol wants stricter binding for compliance audits. ### Approach C — Recursive Fee Amortization via Batch Proofs For high-frequency private transactions, fold $n$ proofs into a single Nova-style accumulator: $$ \mathsf{FoldedProof}_n \;=\; \mathsf{Fold}(\mathsf{FoldedProof}_{n-1}, \, (\mathsf{tx}_n, w_n)). $$ The folded proof attests that all $n$ transactions are individually valid and that the cumulative fee $F_n = \sum_{k=1}^n f_k$ has been correctly accumulated. A single on-chain verification covers all $n$ transactions. On Solana, with ~200,000 CU per Groth16 verification and a per-transaction limit of ~1,400,000 CU, batches of $n \leq 7$ fit within a single transaction's compute budget. The amortised per-transaction CU cost drops by an order of magnitude.

### Tradeoff summary

| | A — ZK fee proofs | B — Nullifier-derived fee auth | C — Recursive amortization |
|---|---|---|---|
| Fee source | Shielded pool, $f$ a public input | Shielded note, bound via $\mathsf{fee\_auth}$ | Cumulative $F_n$ across a folded batch |
| Circuit impact | Balance equation only | Changes the nullifier scheme | Requires a folding scheme (Nova-style) |
| Fee binding | Bound by the proof | Bound to the nullifier (strongest) | Accumulated per folded step |
| Best fit | The default; what SPST uses | Stricter binding for compliance audits | High-frequency batches ($n \leq 7$ per Solana tx) |

## What the relayer-dependent protocols do instead

Tornado Cash, RAILGUN, and Light Protocol's older privacy phase all chose **none of the above**. They use a relayer who pays the gas in the host chain's native asset, takes a fee from the withdrawn amount, and broadcasts the transaction.
The architecture is roughly:

<Mermaid chart={`graph TD
U[User] -->|"1- Build ZK proof locally"| U
U -->|"2- Send proof + nullifier + recipient + fee"| R[Relayer]
R -->|"3- Pay gas in native asset"| R
R -->|"4- Broadcast withdrawal tx"| P[Privacy Contract]
P -->|"5- Verify proof, check nullifier"| P
P -->|"6- N minus f tokens to recipient"| U
P -->|"7- f tokens fee to relayer"| R
classDef actor stroke:#4ade80,stroke-width:2px,fill:#0a0a0a,color:#fff
class U,R,P actor
`}/>

The proof is binding — the relayer cannot redirect funds, cannot change the fee — but the relayer **can refuse to broadcast**. They can also log the user's IP, timing, and metadata. In RAILGUN this is mitigated by routing over the Waku P2P network; in Tornado Cash it was just an HTTPS endpoint. Either way: the relayer is a third party, and that third party is a regulatory and operational single point of failure.

## What changes when fees are folded into the proof

Once the fee comes from inside the proof, three things become different:

1. **The submitter does not need a balance.** The transaction can be broadcast by the user themselves from a fresh address that has zero of the host chain's native asset. The chain's transaction-broadcasting interface accepts transactions from any party with a valid signature; that signature now binds nothing to the user's identity.
2. **The validator gets paid out of the shielded pool's escrow.** On Solana, this is realised by having the privacy program's PDA hold a lamport reserve. The fee $f$ — proven inside the SPST proof — authorises a transfer from this reserve to the validator. The shielded pool's internal accounting decrements by $f$. Every deposit replenishes the reserve.
3. **Censorship surface collapses.** There is no "approved relayer list" for an adversary to attack. There is no operator to subpoena. The user's only dependency is **chain liveness** — and that's what Solana's PoS consensus guarantees.

This is the Self-Sovereignty Theorem in informal form.
The next post ([SPST](/blog/spst_self_paying_shielded_transactions/)) makes it formal.

## Bibliography

- Pertsev, A., Semenov, R., Storm, R. (2019). *Tornado Cash Privacy Solution v1.4.* https://berkeley-defi.github.io/assets/material/Tornado%20Cash%20Whitepaper.pdf
- RAILGUN Documentation. *Privacy System Architecture.* https://docs.railgun.org
- Hopwood, D. et al. (2016–2026). *Zcash Protocol Specification.* https://zips.z.cash/protocol/protocol.pdf
- Pedersen, T. P. (1991). *Non-Interactive and Information-Theoretic Secure Verifiable Secret Sharing.* CRYPTO 1991.
- Kothapalli, A., Setty, S., Tzialla, I. (2022). *Nova: Recursive Zero-Knowledge Arguments from Folding Schemes.* https://eprint.iacr.org/2021/370

Previous: [Series intro ←](/blog/relayerless_privacy_intro/) · Next: [SPST: self-paying shielded transactions →](/blog/spst_self_paying_shielded_transactions/)

---

# Relayerless privacy on a Turing-complete L1: an intro to F_RP

Canonical: https://blog.skill-issue.dev/blog/relayerless_privacy_intro/
Description: A series-opening map of the relayerless full-privacy framework I've been writing up. Five cryptographic games, four constructions (SPST, PPST, TAB, UPEE), one main theorem — and why it matters that the target chain is Solana.
Published: 2026-04-26T15:00:00.000Z
Tags: zk, cryptography, privacy, solana, vanta, research, phd

import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx";

I've been writing a paper. Working title: *Relayerless Full-Privacy Framework for Turing-Complete Blockchain Systems*. I keep calling it $\mathcal{F}_{\text{RP}}$ in my notebook, and I'll keep doing that here. The shape of it is a quintuple of protocols — $\mathsf{Setup}$, $\mathsf{Shield}$, $\mathsf{Transfer}$, $\mathsf{Unshield}$, $\mathsf{Execute}$ — that together aim to do something every existing privacy system on a smart-contract chain refuses to do: **let the user finish a private transaction without paying anyone but a validator**.
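As a type-level sketch, the quintuple might look like the following. The operation names come from the paper; every signature, field, and the toy pool logic are my invention, not the real construction ($\mathsf{Setup}$ and $\mathsf{Execute}$ are omitted since they don't move value).

```rust
// Illustrative sketch: value flow through a shielded pool under the
// F_RP operation names. Real SPST notes are commitments, not integers.
#[derive(Debug)]
enum Op {
    Shield { amount: u64 },             // move public funds into the pool
    Transfer { amount: u64 },           // private transfer inside the pool
    Unshield { amount: u64, fee: u64 }, // exit; fee paid from inside the proof
}

struct Pool {
    shielded_total: u64, // aggregate value committed in the pool
}

impl Pool {
    fn apply(&mut self, op: Op) -> Result<(), &'static str> {
        match op {
            Op::Shield { amount } => {
                self.shielded_total += amount;
                Ok(())
            }
            // Transfers conserve value inside the pool.
            Op::Transfer { .. } => Ok(()),
            Op::Unshield { amount, fee } => {
                // The fee comes out of the note, not an external gas payer.
                let debit = amount + fee;
                if debit > self.shielded_total {
                    return Err("insufficient pool value");
                }
                self.shielded_total -= debit;
                Ok(())
            }
        }
    }
}

fn main() {
    let mut pool = Pool { shielded_total: 0 };
    pool.apply(Op::Shield { amount: 1_000 }).unwrap();
    pool.apply(Op::Transfer { amount: 400 }).unwrap();
    pool.apply(Op::Unshield { amount: 900, fee: 5 }).unwrap();
    assert_eq!(pool.shielded_total, 95);
}
```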
This post is the orientation. Subsequent posts in the series step through each construction in detail with proofs, circuit costs, and Solana instantiation numbers. Here I want to set the table — what problem $\mathcal{F}_{\text{RP}}$ targets, what games formalise it, and how the four pieces compose.

## The relayer problem, in one paragraph

Submit a private withdrawal on Tornado Cash from a fresh address. The contract verifies the proof, accepts it, and tries to send 1 ETH to your fresh address. Except the *fresh* address has zero ETH and cannot pay gas. So you can't *be* the submitter — somebody else has to broadcast the transaction with their own ETH and bill you for it. That somebody is the **relayer**. The relayer breaks the on-chain link between your deposit and your withdrawal address, but in exchange they observe everything: your IP, your timing, the recipient address, the fee you accept, and which proof maps to which deposit. They are also a **single regulatory point of failure**, as everyone in the West learned in August 2022 when [OFAC sanctioned Tornado Cash](https://www.mayerbrown.com/en/insights/publications/2024/12/federal-appeals-court-tosses-ofac-sanctions-on-tornado-cash) and the registered relayers stopped operating. The user funds were not seized — they were merely *unspendable* because the relayer infrastructure went dark.

Zcash, Penumbra, and Aleo don't need relayers because they are their own chains. Aztec doesn't need relayers because it is its own L2 with its own sequencer. Tornado Cash, RAILGUN, and Light Protocol's older privacy phase need relayers because they are smart-contract layers on a host chain whose fees must be paid in the host chain's native asset by an address that already has it.
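The contrast between the two fee models can be made concrete with a toy comparison. Everything here is mine for illustration: the struct, the fields, and the framing; the point is that the recipient nets $N - f$ either way, and only the trust surface differs.

```rust
// Who pays, and who learns, under the two fee models.
struct Outcome {
    recipient_gets: u64,
    third_party_sees_metadata: bool, // IP, timing, recipient, fee, proof↔deposit
}

// Relayer model: relayer fronts the gas, takes fee f from the note of
// value n, and observes the withdrawal metadata in the process.
fn relayer_withdraw(n: u64, f: u64) -> Outcome {
    Outcome { recipient_gets: n - f, third_party_sees_metadata: true }
}

// Self-paying model (F_RP): the fee is proven inside the ZK proof and
// paid from the shielded pool, so no third party is in the loop at all.
fn self_paying_withdraw(n: u64, f: u64) -> Outcome {
    Outcome { recipient_gets: n - f, third_party_sees_metadata: false }
}

fn main() {
    let r = relayer_withdraw(100, 3);
    let s = self_paying_withdraw(100, 3);
    // Same payout either way...
    assert_eq!(r.recipient_gets, s.recipient_gets);
    // ...but only one model leaks metadata to a third party.
    assert!(r.third_party_sees_metadata && !s.third_party_sees_metadata);
}
```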
What I want — and what $\mathcal{F}_{\text{RP}}$ delivers — is a privacy protocol that runs as a smart-contract layer on a Turing-complete L1, where the only thing the protocol needs from the outside world is **liveness**: the chain keeps making blocks, and any valid transaction eventually gets included.

## Five games that pin down "relayer dependence"

Section 1 of the paper formalises five distinct failure modes that emerge from relayer dependence. Every one of them is an active threat against currently deployed protocols. I'll quote them tersely; the full game definitions are in the paper. The point of formalising these as games is the same point Goldwasser, Micali, and Rackoff made about zero-knowledge proofs in 1985: until you've written down what an adversary can do and how it wins, you have no theorem to prove. These five games are what every honest analysis of a privacy protocol owes the reader.

## What we want, formally

$\mathcal{F}_{\text{RP}} = (\mathsf{Setup}, \mathsf{Shield}, \mathsf{Transfer}, \mathsf{Unshield}, \mathsf{Execute})$ — five protocols, each a PPT algorithm, with the following five desiderata:

**D1 (Full Privacy).** For any PPT adversary with full view of chain state $\sigma$ and any two valid transactions $\mathsf{tx}_0, \mathsf{tx}_1$ (different senders / recipients / amounts / programs):

$$
\mathsf{Adv}^{\mathsf{priv}}_{\mathcal{A}}(\lambda) \;=\; \bigl|\,\Pr[\mathcal{A}(\sigma, \mathsf{tx}_0) = 1] - \Pr[\mathcal{A}(\sigma, \mathsf{tx}_1) = 1]\,\bigr| \;\leq\; \mathsf{negl}(\lambda).
$$

**D2 (Self-Sovereignty).** For every protocol operation $\mathsf{Op}$ and any adversary controlling all network participants except the user $\mathcal{U}$, $\mathcal{U}$ still completes $\mathsf{Op}$ with overwhelming probability — assuming only that the underlying chain $\mathcal{B}$ provides liveness.

**D3 (Composability).** Private state transitions can invoke arbitrary smart contract logic.
For any arithmetic circuit $C: \mathbb{F}^n \to \mathbb{F}^m$ with $|C|$ gates, the framework supports $\mathsf{Execute}(\mathsf{pp}, C, \cdot, \cdot)$ with proof generation cost polynomial in $|C|$.

**D4 (Succinctness).** On-chain verification cost $O(1)$ pairings or $O(\log n)$ hash evaluations. Proof size $O(1)$ or $O(\log^2 n)$.

**D5 (No / Universal Trusted Setup).** Either no setup (transparent) or a universal SRS that is updatable by any party.

If you've read [the post on Halo2](/blog/halo2_in_2026_what_changed/) you'll recognise D5 as the "no per-circuit ceremony" requirement. D1, D2, D3, D4 are the standard four for a privacy SNARK; D2 is the one the existing relayer-dependent protocols silently violate.

## Four constructions

The framework decomposes into four primitives, each addressing one piece of the problem:

<Mermaid chart={`graph TD
  A[SPST<br/>Self-Paying Shielded Transactions] --> D[UPEE<br/>Universal Private Execution Environment]
  B[PPST<br/>Private Programmable State Transitions] --> D
  C[TAB<br/>Threshold-Anonymous Broadcast] --> D
  D --> E[Theorem 3.12<br/>Simulation-Based Privacy]
  D --> F[Theorem 3.13<br/>Self-Sovereignty]
  classDef build stroke:#4ade80,stroke-width:2px,fill:#0a0a0a,color:#fff
  classDef thm stroke:#facc15,stroke-width:2px,fill:#0a0a0a,color:#fff
  class A,B,C,D build
  class E,F thm
`}/>

1. **SPST — Self-Paying Shielded Transaction.** A note/commitment/nullifier scheme where the fee $f$ is extracted *inside the ZK proof itself* via a Pedersen-commitment balance equation. The fee paradox dies here. ([Post 3](/blog/spst_self_paying_shielded_transactions/).)
2. **PPST — Private Programmable State Transitions.** SPST generalised so that the proof attests to correct execution of an arbitrary arithmetic circuit $C$ over committed pre-state and post-state. This is what makes the framework Turing-complete. ([Post 4](/blog/ppst_private_programmable_state/).)
3. **TAB — Threshold-Anonymous Broadcast.** Network-layer anonymity, using ring signatures (Approach A) or FROST-style threshold Schnorr (Approach B) to hide which of $n$ participants actually submitted the transaction. ([Post 5](/blog/tab_threshold_anonymous_broadcast/).)
4. **UPEE — Universal Private Execution Environment.** The composition: $(\mathsf{Setup}, \mathsf{Deploy}, \mathsf{Invoke}, \mathsf{Verify}, \mathsf{Finalize})$. UPEE is what gets deployed to a chain. ([Post 7](/blog/upee_universal_private_execution/).)

The two main theorems sit on top of the stack:

- **Theorem 3.12 (Simulation-Based Privacy).** For any PPT adversary controlling the blockchain there exists a PPT simulator $\mathcal{S}$ such that $\{\mathsf{View}_{\mathcal{A}}(\mathsf{Real})\} \approx_c \{\mathsf{View}_{\mathcal{A}}(\mathsf{Ideal})\}$, where $\mathcal{S}$ learns only that *some* valid transaction occurred and *some* fee was paid.
- **Theorem 3.13 (Self-Sovereignty).** $\Pr[\mathsf{Game}_{\mathrm{RF}}(\mathcal{A}, \lambda) = 1] \geq 1 - \mathsf{negl}(\lambda)$ for any adversary $\mathcal{A}$ controlling all network participants except the user.
The first theorem is the "this is private" theorem; the second is the "you don't need a relayer" theorem. The series will derive both.

## Why Solana, specifically

I keep being asked why I'm building this on Solana instead of writing yet another L1. The honest answer:

1. The chain already exists, has 65k+ TPS theoretical throughput, and sub-second finality.
2. Native `alt_bn128` syscalls (added in v1.16) make Groth16 verification cost **< 200,000 CU** on-chain — that's roughly $0.02 per private transaction.
3. The 1,232-byte transaction limit is tight but not impossible: SPST fits in **656 bytes**. SIMD-0296 (approved late 2025) raises this to 4,096 bytes.
4. Light Protocol's [ZK Compression](https://www.zkcompression.com/resources/whitepaper) infrastructure already provides Poseidon Merkle trees and Groth16 verification — most of the substrate I need.

The chain doesn't get to lie about what it ran. So make the chain run something that doesn't tell anyone anything.

Solana is also the only general-purpose Turing-complete L1 that has shipped pairing-friendly elliptic-curve precompiles to the validator runtime. Ethereum has had the `EIP-197` pairing precompile since the Byzantium fork (2017), but the gas costs make Groth16 verification on Ethereum L1 cost ~$5 per proof at typical gas prices. Solana's per-CU pricing brings that down by ~400×.
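A quick sanity check on the compute numbers quoted in this list, and on the $n \leq 7$ batch bound from the fee-amortisation discussion. Assuming ~200K CU per Groth16 verification and the ~1.4M CU per-transaction ceiling (the real accounting includes more than verification cost, so treat these as ceilings, not measurements):

```rust
// Back-of-the-envelope CU budgets. Both constants come from the posts;
// the arithmetic is the only thing this sketch adds.
const GROTH16_VERIFY_CU: u64 = 200_000;
const TX_CU_LIMIT: u64 = 1_400_000;

// Verifying n proofs independently costs n * 200K CU, so at most 7 fit
// in one transaction — the n <= 7 figure.
fn max_verifications_per_tx() -> u64 {
    TX_CU_LIMIT / GROTH16_VERIFY_CU
}

// With a folded batch proof, one verification covers the whole batch,
// so the amortised per-transaction cost is verify_cost / n.
fn amortised_cu(batch: u64) -> u64 {
    GROTH16_VERIFY_CU / batch
}

fn main() {
    assert_eq!(max_verifications_per_tx(), 7);
    assert_eq!(amortised_cu(7), 28_571); // ~7x cheaper per transaction
}
```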
## What's coming in the series

| # | Slug | What it covers |
|---|------|----------------|
| 2 | [`the_fee_paradox`](/blog/the_fee_paradox/) | Why every smart-contract privacy protocol needs a relayer (or doesn't) |
| 3 | [`spst_self_paying_shielded_transactions`](/blog/spst_self_paying_shielded_transactions/) | SPST construction, balance theorem, double-spend resistance, unlinkability proof |
| 4 | [`ppst_private_programmable_state`](/blog/ppst_private_programmable_state/) | Generalising SPST to arbitrary computation; PPST relation; PPST-SPST composition |
| 5 | [`tab_threshold_anonymous_broadcast`](/blog/tab_threshold_anonymous_broadcast/) | Ring signatures over Ed25519 + FROST threshold Schnorr |
| 6 | [`verifiable_shuffles_for_privacy`](/blog/verifiable_shuffles_for_privacy/) | Bayer-Groth shuffles for network-layer mixing |
| 7 | [`upee_universal_private_execution`](/blog/upee_universal_private_execution/) | UPEE deploy / invoke / verify; the simulation-based privacy theorem |
| 8 | [`solana_instantiation_656_bytes`](/blog/solana_instantiation_656_bytes/) | Concrete Solana instantiation with CU + transaction-byte budgets |
| 9 | [`f_rp_vs_existing_privacy_systems`](/blog/f_rp_vs_existing_privacy_systems/) | F_RP vs Zcash, Tornado, Railgun, Aztec, Penumbra, Aleo, Namada, Monero |
| 10 | [`mev_resistance_in_private_execution`](/blog/mev_resistance_in_private_execution/) | Sandwich-proofness; bounding MEV by public-bit leakage |
| 11 | [`post_quantum_relayerless_path`](/blog/post_quantum_relayerless_path/) | Lattice commitments, STARK wrapping, isogeny credentials |

## Bibliography for this post

- Aylor, H. (2026). *Relayerless Full-Privacy Framework for Turing-Complete Blockchain Systems.* Preprint, Zera Labs. (The paper this series is derived from. Final PDF will land at `/papers/relayerless-privacy/` once typeset.)
- Ben-Sasson, E. et al. (2014). *Zerocash: Decentralized Anonymous Payments from Bitcoin.* IEEE S&P 2014.
- Hopwood, D. et al. (2016–2026).
*Zcash Protocol Specification.* https://zips.z.cash/protocol/protocol.pdf
- Pertsev, A., Semenov, R., Storm, R. (2019). *Tornado Cash Privacy Solution v1.4.*
- Mayer Brown (2024). *Federal Appeals Court Tosses OFAC Sanctions on Tornado Cash.*

Next post: [The fee paradox →](/blog/the_fee_paradox/)

---

# Cross-compiling vantad for darwin: Apple Silicon, sign + notarise

Canonical: https://blog.skill-issue.dev/blog/vanta_darwin_apple_silicon_build/
Description: Shipping vantad as a notarised Mac binary inside a Tauri app meant fixing libconsensus link order, building Rust release with the right target triple, signing every sidecar, and stapling the DMG separately. The notes from the trenches.
Published: 2026-04-13T18:25:14.000Z
Tags: vanta, darwin, macos, apple-silicon, tauri, codesign

The 2026-04-13 commit `eff33f7a chain+build: mined genesis nonce + libconsensus links FFI` and the 2026-04-23 commit `0edddc82 build: darwin frameworks, wallet-ui node types, v2 test renames` are the bookends of the macOS build story. In between is a week of "why does my dylib not load" and "why does Gatekeeper not trust this DMG even though everything inside it is signed." This post is the field notes from cross-compiling `vantad` for darwin-aarch64, signing everything that needs signing, notarising what needs notarising, and stapling the DMG so users don't see "this came from the internet" prompts. Everything I describe is in [`vanta-desktop/build-release.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/build-release.sh) and the [`doc/build-osx.md`](https://github.com/Dax911/vanta/blob/main/doc/build-osx.md) it leans on. The goal is that an engineer doing their first ARM Mac build doesn't have to repeat my mistakes.
## What you actually have to ship

A Vanta desktop install on macOS contains three sidecar binaries inside one signed `.app`:

- `vantad-aarch64-apple-darwin` — the C++ Bitcoin Core fork
- `vanta-cli-aarch64-apple-darwin` — the matching CLI
- `vanta-node-aarch64-apple-darwin` — the Rust L2 sidecar

Plus the Tauri host binary (`Vanta Wallet`) and the WebView assets. All of this rides inside a `.dmg` that itself has to be notarised separately. There's a sentence I'm going to repeat because it tripped me twice: **on macOS, Tauri 2.x notarises the `.app`, not the `.dmg` that wraps it.** Gatekeeper checks the file the user *downloaded*, which is the DMG. So you have to submit the DMG to `notarytool` separately and staple the resulting ticket. This is documented in approximately zero places. I figured it out by attempting to install my own DMG on a clean VM and watching Gatekeeper refuse it with a generic error.

## The build sequence

The release script does seven steps. I'll narrate them.

**Step 1: build `vantad`.** This is the C++ Bitcoin Core fork's autotools build:

```bash
./autogen.sh
./configure --without-gui --disable-tests --disable-bench \
  --without-bdb --without-miniupnpc --without-natpmp
make -j$(sysctl -n hw.ncpu)
```

The `--without-bdb --without-miniupnpc --without-natpmp` flags are the canonical "don't pull in dependencies the wallet doesn't need" set. BerkeleyDB only matters for legacy wallets, miniupnpc is for UPnP NAT traversal, natpmp is the same on Apple's stack. Skipping them shaves 30+ MB and a bunch of failure modes off the binary. `--without-gui` is because we're not shipping `vanta-qt`. The Qt UI is *also* possible to ship — the upstream Bitcoin Core team supports it — but on Vanta the desktop wallet *is* the Tauri app, and the C++ binary is just a sidecar. No need for two UIs.

**Step 2: build `vanta-node`.**

```bash
cd vanta && cargo build --release -p vanta-node
```

Cargo handles the cross-compile to whatever the host target is.
On an Apple Silicon Mac that produces the `aarch64-apple-darwin` binary you want. On Intel Macs you'd get `x86_64-apple-darwin`; the Tauri sidecar resolution in [`node.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/src/node.rs) handles both. The `target_triple()` helper in `node.rs` makes the runtime resolution honest:

```rust
pub fn target_triple() -> &'static str {
    if cfg!(target_os = "macos") {
        if cfg!(target_arch = "aarch64") {
            "aarch64-apple-darwin"
        } else {
            "x86_64-apple-darwin"
        }
    } else if cfg!(target_os = "linux") {
        "x86_64-unknown-linux-gnu"
    } else {
        "x86_64-pc-windows-msvc"
    }
}
```

The host binary discovers its sidecars by appending the target-triple suffix. This is also how Tauri itself decides which file to bundle — `tauri.conf.json` declares `"binaries/vantad"` and the bundler looks for `vantad-aarch64-apple-darwin` next to the conf.

**Step 3: copy the sidecars.**

```bash
cp "$REPO_ROOT/src/bitcoind" "$BINDIR/vantad-$TRIPLE"
cp "$REPO_ROOT/src/bitcoin-cli" "$BINDIR/vanta-cli-$TRIPLE"
cp "$REPO_ROOT/vanta/target/release/vanta-node" "$BINDIR/vanta-node-$TRIPLE"
```

The C++ binary is still called `bitcoind` after the upstream fork (we haven't renamed the actual file in `src/` because that breaks too much of the upstream build); we rename it during the copy.

**Step 4: install frontend deps.** `pnpm install`. The Vite/Tauri build needs the React app's deps for the bundler.

**Step 5: build the Tauri app.** `pnpm tauri build`.
Tauri auto-signs the `.app` (and every binary inside it) with the Developer ID identity declared in [`tauri.conf.json`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/tauri.conf.json):

```json
"macOS": {
  "signingIdentity": "474F624D8F3783B4D607CFF2331AD4C6CC26A1B5",
  "providerShortName": "9HD4Q82U58",
  "entitlements": "entitlements.plist",
  "minimumSystemVersion": "10.15"
}
```

Tauri also auto-notarises the `.app` if `APPLE_ID`, `APPLE_PASSWORD`, and `APPLE_TEAM_ID` are set in the environment. The release script reads them from `.env.local`. If they're missing the script prints a warning and proceeds without notarisation — useful for local dev builds.

**Step 6: notarise the DMG separately.**

```bash
xcrun notarytool submit "$DMG_PATH" \
  --apple-id "$APPLE_ID" \
  --password "$APPLE_PASSWORD" \
  --team-id "$APPLE_TEAM_ID" \
  --wait
xcrun stapler staple "$DMG_PATH"
```

This is the step I had to discover. `notarytool submit` uploads the DMG, Apple's notary service runs its scan, and `stapler staple` attaches the resulting ticket to the file so Gatekeeper can verify offline.

**Step 7: verify everything.**

```bash
codesign --verify --deep --strict --verbose=2 "$APP_PATH"
spctl -a -t open --context context:primary-signature -v "$DMG_PATH"
xcrun stapler validate "$APP_PATH"
xcrun stapler validate "$DMG_PATH"
```

If any of these fail I want to know *before* the DMG ships, not after a user tries to install it. The verification is fast — under a second on a recent Mac — so it's free to run as a final step.

## The libconsensus link order issue

The 2026-04-13 commit `eff33f7a chain+build: mined genesis nonce + libconsensus links FFI` was the fix for a problem that took an embarrassing amount of time. The C++ build's link order didn't include the FFI verifier static library (`libvanta_verifier.a`) before the Bitcoin libconsensus shared library, with the result that consensus-time calls to `vanta_verify_and_decode` came back as undefined symbols at runtime.
The fix was a `Makefile.am` patch making the linker order explicit:

```
src_bitcoind_LDADD = libvanta_verifier.a $(LIBBITCOIN_CONSENSUS) ...
```

The lesson: when you're FFI-binding a Rust static lib into a C++ autotools project, `LDADD` order is load-bearing. The static lib has to come *before* the shared lib that depends on it, or the linker won't resolve symbols. This is one of those things autotools makes more painful than it should be; in CMake you'd never trip on it.

## The `share/pixmaps` straggler

A bunch of the macOS-build pain wasn't actual build pain; it was rebrand pain. The Bitcoin Core stock build copies icons from `share/pixmaps/bitcoin*.{png,xpm,ico}` into the bundle. Those files still showed Bitcoin's logo even after the chain rebrand, because the icon files weren't `git mv`'d during the zera→vanta rebrand. From [`CLAUDE.md`](https://github.com/Dax911/vanta/blob/main/CLAUDE.md):

> Bitcoin Core stock icons in `share/pixmaps/bitcoin*.{png,xpm,ico}` still show Bitcoin logo; `src/qt/res/` Qt resources also unrebranded. Qt wallet rebrand is secondary.

We don't ship `vanta-qt`, so this is a cosmetic-only issue, but it's the kind of thing an external auditor will flag, and rightly so. **TODO: Dax confirm we ship the icon rename in the next pass.**

## Why ship a Mac binary at all

A reasonable challenge: if Vanta is meant to be operator-driven and most operators run Linux servers, why spend this much effort on Mac packaging? The answer is that *desktop* runs on Mac. Servers run Linux; that's the `vantad` people deploy with systemd. But the wallet — the thing a person actually opens to send a transaction — needs to feel native on the platform the user has. In 2026 that's Mac for half my user base, Linux for the other half (with a long tail of Windows, which we ship via the `bd7d6299` MSI build). A privacy-chain wallet that only works on Linux is a wallet that's only used by people who already agree with you. The Mac story is the bridge to *normal users*.
## What I would do differently

1. **Codesign-by-default in the Rust build.** I have my Apple Developer creds in `.env.local` and the release script reads them. If I'm doing a quick dev build I sometimes forget to enable signing, and then the resulting binary won't load on a fresh macOS sandbox. Default-on signing for any release build, opt-out for dev builds, would be safer.
2. **Universal binary instead of two builds.** Right now I build aarch64 and x86_64 separately and ship two DMGs. `lipo` can produce a universal binary that runs on both. Tauri 2.x supports it. On the list.
3. **Reproducible builds.** Bitcoin Core has a [Guix-based reproducible build setup](https://github.com/Dax911/vanta/blob/main/doc/guix.md) that produces byte-identical binaries on any host with the right toolchain. I haven't ported that to the Vanta build because it'd require pulling vanta-node into the Guix manifest. Important for downstream trust; not blocking for a first release.

## Further reading

- [`vanta-desktop/build-release.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/build-release.sh) — the script this post narrates
- [`vanta-desktop/src-tauri/tauri.conf.json`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/tauri.conf.json) — the bundle config
- [`doc/build-osx.md`](https://github.com/Dax911/vanta/blob/main/doc/build-osx.md) — the upstream Bitcoin Core macOS build doc
- [`doc/guix.md`](https://github.com/Dax911/vanta/blob/main/doc/guix.md) — the reproducible-build path I haven't taken yet
- [Vanta Desktop: a Tauri wallet that ships its own full node](/blog/vanta_desktop_tauri_wallet/) — the app this build produces
- [Apple's notarytool docs](https://developer.apple.com/documentation/security/notarizing-macos-software-before-distribution) — the notarisation contract

---

# Vanta Desktop: a Tauri wallet that ships its own full node

Canonical: https://blog.skill-issue.dev/blog/vanta_desktop_tauri_wallet/
Description: Most desktop wallets are
thin RPC clients that talk to somebody else's node. The Vanta desktop app spawns vantad and the L2 sidecar as Tauri sidecar binaries, owns their PIDs, and adopts orphans on restart. Here is how that came together.
Published: 2026-04-13T21:39:27.000Z
Tags: vanta, tauri, rust, desktop, wallet, sidecar

The user-facing pitch for Vanta is short: open the wallet, click send, watch a private transaction settle on a chain you can verify yourself. The version of that pitch that's actually true requires three processes: a C++ Bitcoin Core fork (`vantad`), a Rust L2 sidecar (`vanta-node`), and a UI. Most "desktop wallets" in 2026 ship the UI and trust someone else for the other two. We didn't want to ship a wallet like that, and the answer turned out to be Tauri.

This post is a tour of [`vanta/vanta-desktop`](https://github.com/Dax911/vanta/tree/main/vanta/vanta-desktop) — the Tauri 2.x app that bundles `vantad` and `vanta-node` as sidecars, runs them under PID supervision in a Rust host, and exposes the resulting capability to a React frontend through `#[tauri::command]` IPC. Sister reads: [Vanta: a Bitcoin fork with ZK at consensus](/blog/vanta_zk_privacy_l1/) is the chain itself, and [The vanta sidecar architecture](/blog/vanta_sidecar_architecture/) is the deeper dive on the L2 daemon.

## Why Tauri at all

I spent an unreasonable number of hours on this question. The candidates were Electron, Tauri, native (Swift / Cocoa for Mac, GTK or Qt elsewhere), and a Wails-style Go-with-WebView setup. The constraints:

1. The wallet has to ship a **full node binary**. Not link to it — *ship* it as an external file inside the app bundle. That binary is ~25 MB on macOS aarch64.
2. The node has to **run as a child process** of the app, with the app owning its PID and capable of cleanup on quit.
3. The UI is React because the [web wallet UI](https://github.com/Dax911/vanta/tree/main/wallet-ui) already existed and I wasn't rewriting it.
4.
The signing path uses Apple's Developer ID program. The bundle has to be signed and notarised, including the sidecars.

Electron was out: the bundle bloat (Chromium, ~300 MB) plus the historical Electron-IPC-as-XSS attack surface is a non-starter for a wallet. Native Mac was out because we ship Linux too. Wails was tempting, but its sidecar story for non-Go binaries is awkward. Tauri 2.x ticked every box: small bundle (the WebView is OS-provided), sidecars are first-class via the [`externalBin` field](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/tauri.conf.json), the IPC contract is generated from `#[tauri::command]` Rust functions, and the host process is a normal Rust program where I can do `std::process::Command::new(...)` exactly the way I would in any CLI.

The `tauri.conf.json` declares the sidecars verbatim:

```json
"externalBin": [
  "binaries/vantad",
  "binaries/vanta-node",
  "binaries/vanta-cli"
]
```

Tauri's bundler will copy `binaries/vantad-aarch64-apple-darwin` (note the target-triple suffix it requires) into the `.app`'s `Contents/MacOS/` directory and code-sign it with the same identity as the host binary. From the Rust side I get a path I can spawn against. Done.

## The sidecar inventory

There are three sidecar binaries and they have different jobs.

`vantad` is the C++ Bitcoin Core fork. It's the consensus node — it runs the SHA-256 PoW mainnet, validates blocks, holds the L1 UTXO set, exposes JSON-RPC on a port. From the desktop app's point of view it is the ground truth for "what does the chain say."

`vanta-node` is the Rust L2 sidecar. It indexes commitments and nullifiers from L1 OP_RETURN anchors, maintains the SMT, and exposes a REST API. The shielded balance, the SMT root, the nullifier set — that's all here. The desktop app talks to it on a separate port.

`vanta-cli` is the C++ command-line client. It's there for power users and debugging.
The wallet doesn't shell out to it for anything load-bearing, but it's bundled because if you have `vantad` you almost always want `vanta-cli` too.

The sidecar build script is short and readable — [`setup-sidecars.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/setup-sidecars.sh) just searches well-known paths, copies the binaries into `src-tauri/binaries/`, and renames them with the target-triple suffix Tauri's bundler expects. The release pipeline ([`build-release.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/build-release.sh)) builds `vantad` and `vanta-node` from source first, then runs `pnpm tauri build`, then notarises the resulting `.dmg` separately because Tauri 2.x notarises the `.app` but not the DMG wrapper.

## The PID supervisor

Once you've decided to ship a binary, you've inherited a job: babysit its process. The supervisor lives in [`src-tauri/src/node.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/src/node.rs) and the centerpiece is the `NodeManager` struct.

```rust
// The Option payload types below are reconstructed (they were lost in
// transcription); see node.rs for the exact definitions.
pub struct NodeManager {
    l1_process: Option<Child>,
    l2_process: Option<Child>,
    l1_adopted: bool,
    l2_adopted: bool,
    l1_bin: Option<PathBuf>,
    l2_bin: Option<PathBuf>,
    pub l1_logs: LogBuffer,
    pub l2_logs: LogBuffer,
    app_handle: Option<AppHandle>,
}
```

A few things in here took longer than they should have to get right.

**Adoption.** The desktop app uses dedicated ports — `19332` for L1 RPC, `19333` for P2P, `19380` for the L2 API — so it never collides with a standalone `vantad` running elsewhere on the same machine. But it *does* collide with itself if a previous run died ungracefully. So `start_l1` first probes the port: if something is listening *and* it answers `getblockchaininfo` correctly, we adopt it (return PID 0 as a sentinel). If something is listening but not responsive, we kill the orphan with `lsof -ti :PORT | xargs kill` and respawn. If the port is free, we spawn fresh. This logic is not optional.
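The adopt-or-kill decision can be sketched as a small state table. This is a simplification of what the post describes, not the real `NodeManager` code: the actual implementation probes via a `getblockchaininfo` RPC call, while here the probe results are just booleans, and `port_listening` is a plain TCP connect I added for illustration.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

// The three outcomes of probing the dedicated L1 port on startup.
#[derive(Debug, PartialEq)]
enum PortPlan {
    Adopt,      // listening and answers RPC: reuse it (PID 0 sentinel)
    KillOrphan, // listening but unresponsive: kill and respawn
    SpawnFresh, // port free: normal startup
}

fn plan(listening: bool, rpc_ok: bool) -> PortPlan {
    match (listening, rpc_ok) {
        (true, true) => PortPlan::Adopt,
        (true, false) => PortPlan::KillOrphan,
        (false, _) => PortPlan::SpawnFresh,
    }
}

// Crude liveness probe: can we open a TCP connection to localhost:port?
fn port_listening(port: u16) -> bool {
    let addr: SocketAddr = ([127, 0, 0, 1], port).into();
    TcpStream::connect_timeout(&addr, Duration::from_millis(200)).is_ok()
}

fn main() {
    // 19332 is the desktop L1 RPC port from the post.
    let listening = port_listening(19332);
    println!("probe says listening={listening}, plan={:?}", plan(listening, false));
    assert_eq!(plan(false, false), PortPlan::SpawnFresh);
    assert_eq!(plan(true, true), PortPlan::Adopt);
}
```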
The first version of this code didn't have it, and every quit-and-relaunch produced a "port in use, vantad failed to start" error that confused absolutely everybody.

**Pipe draining.** A C++ process logging to stdout will eventually fill the OS's 64 KB pipe buffer if nobody reads it, then block on the next `write()`. `vantad` with `-printtoconsole` is a heavy logger. The host has to drain the pipes constantly. The function that does it is small enough to quote whole:

```rust
// Generic bounds and the AppHandle type parameter are reconstructed
// (lost in transcription); see node.rs for the exact signature.
fn drain_pipe<R: Read + Send + 'static>(
    pipe: R,
    label: &'static str,
    log_buf: LogBuffer,
    event_emitter: Option<AppHandle>,
) {
    std::thread::spawn(move || {
        let reader = std::io::BufReader::new(pipe);
        for line in reader.lines() {
            match line {
                Ok(text) => {
                    tracing::debug!("[{label}] {text}");
                    log_buf.push(text.clone());
                    if let Some(ref app) = event_emitter {
                        let _ = app.emit("node-log", serde_json::json!({
                            "source": label,
                            "line": text,
                        }));
                    }
                }
                Err(e) => {
                    tracing::debug!("[{label}] pipe read error: {e}");
                    break;
                }
            }
        }
    });
}
```

Each line goes three places: the Rust tracing log, a 200-line ring buffer that the frontend can pull on demand (`status` returns the last 20), and a Tauri event so the frontend can render a live console. This last one is the thing that turned a black-box "is the node alive" indicator into a full-screen log view that's actually useful for debugging failures.

**Auto-config.** Before `vantad` starts, the host writes a fresh `vanta.conf` into the desktop-isolated data dir at `{home}/.vanta-desktop/l1/vanta.conf`. The config is hardcoded for the desktop's port plan, points at the seed nodes, sets `txindex=1` so the L2 watcher can find historic OP_RETURN anchors, and disables Bitcoin-style DNS seeding (we're not on Bitcoin's network). The user never sees this file unless they go looking.

## Sequenced startup

The first wallet release would just spawn both nodes and hope. The result was a race condition: `vanta-node` would come up before `vantad`'s RPC was reachable, fail its first poll, and die.
We added [`sequenced_startup`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/src/lib.rs) so the L2 only starts after the L1's RPC has actually answered.

```
Stage 1: Start L1 (may adopt)
Stage 2: Wait for L1 RPC up to 60s, exponential backoff
Stage 3: Create / load default wallet via RPC
Stage 4: Start L2 (now that L1 is confirmed reachable)
Stage 5: Emit "ready" event to frontend
```

Each stage emits a Tauri event the frontend subscribes to. The first-launch UX is a five-stage progress meter that goes "spawning vantad… RPC ready… wallet loaded… spawning vanta-node… ready." On a warm cache that whole flow takes about 4 seconds. On a cold first launch it's closer to 12. Better than 12 silent seconds with a spinner.

If anything fails, the failure stage gets the last 15 lines of stdout/stderr appended into the error message. The user sees not "vantad failed" but "vantad exited during startup. Last output: …". That diagnostic surface alone has paid for itself ten times over in support tickets I didn't have to chase.

## The IPC contract

The frontend never speaks JSON-RPC to `vantad` directly. Every UI action goes through a `#[tauri::command]` defined in [`commands.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/src/commands.rs). The `lib.rs` registration is a single `invoke_handler` macro:

```rust
.invoke_handler(tauri::generate_handler![
    commands::wallet_init,
    commands::wallet_info,
    commands::wallet_balance,
    commands::wallet_notes,
    commands::wallet_send,
    commands::wallet_sync,
    commands::wallet_pubkey,
    commands::rpc_call,
    commands::start_nodes,
    commands::node_start_l1,
    commands::node_start_l2,
    commands::node_stop_l1,
    commands::node_stop_l2,
    commands::node_status,
    commands::l2_status,
    commands::swap_initiate,
    commands::swap_participate,
    commands::swap_list,
    commands::swap_inspect,
    commands::get_settings,
    commands::set_settings,
])
```

Every command is a typed Rust function that Tauri generates a TypeScript stub for.
The frontend imports `invoke('wallet_balance')` and gets back a typed JSON response. There's no HTTP server inside the app, no `localhost:8085`, no possibility of a malicious website hitting the wallet's API. This is a privacy property as well as a security one. A web wallet that runs on `localhost:8085` is reachable by any browser tab. A Tauri wallet that uses the IPC bridge isn't. The wallet's `csp` is `null` in `tauri.conf.json` only because the frontend doesn't load anything cross-origin — every "fetch" is actually an `invoke`. ## Linux/NVIDIA, the cursed stanza Two lines in [`lib.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/src/lib.rs) earned a comment longer than they are: ```rust #[cfg(target_os = "linux")] { if std::env::var("WEBKIT_DISABLE_DMABUF_RENDERER").is_err() { std::env::set_var("WEBKIT_DISABLE_DMABUF_RENDERER", "1"); } } ``` webkit2gtk on NVIDIA's proprietary driver under Wayland tries to use a DMA-BUF renderer path that crashes with "Error 71 (Protocol error) dispatching to Wayland display." Disabling it forces software compositing, which is fine. This one bug ate a weekend before the workaround landed. The wider lesson: when you ship a desktop app you become a desktop developer, and "desktop developer" means "the OS will surprise you in ways the web never has." Budget for it. ## macOS sign + notarise Apple's developer pipeline for distributing an app outside the Mac App Store is its own genre of misery, but it's a solved misery. The [`build-release.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/build-release.sh) script automates the whole thing: 1. Build `vantad` (C++) and `vanta-node` (Rust release) from source. 2. Copy them into `src-tauri/binaries/` with target-triple suffixes. 3. Load `APPLE_ID` / `APPLE_PASSWORD` / `APPLE_TEAM_ID` from `.env.local`. 4. Run `pnpm tauri build`. 
Tauri auto-signs the `.app` with the Developer ID identity declared in `tauri.conf.json` (`Developer ID Application: Hayden Porter-Aylor (9HD4Q82U58)`). 5. Submit the `.dmg` separately to `xcrun notarytool`. 6. Staple the resulting ticket with `xcrun stapler staple`. 7. Verify everything with `codesign --verify --deep --strict --verbose=2` and `spctl -a -t open -v`. The reason step 5 exists at all is that Tauri 2.x notarises the `.app` but not the `.dmg` that wraps it. Gatekeeper checks the outer file when a user downloads the DMG, so we have to submit the wrapper separately. This is documented in approximately zero places. I figured it out by attempting to install my own DMG on a clean VM and watching Gatekeeper refuse it. Two hours of head-scratching later, the staple step landed. The end state: a user downloads `Vanta Wallet.dmg`, double-clicks it, drags the app to Applications, and Gatekeeper signs off without a "this came from the internet" prompt. That's the outcome that matters — and it would not be possible without the Tauri sidecar pattern signing the inner binaries with the same identity. ## What I changed my mind about I started the desktop project genuinely planning to ship the existing `wallet-ui` as a webpage and tell people to run a `vantad` themselves. The friction of that — every user a node operator on day one — was always going to be a non-starter for everybody but engineers like me. The desktop app is the answer to "I want my mom to be able to use this," and Tauri's sidecar feature is what made the answer cheap enough to ship. If you're building a wallet for a privacy chain in 2026 and you skip the embedded full-node story, you are shipping an indexed light client and calling it a wallet. That's fine for some products. It is not fine for this one. The whole pitch of Vanta is *you don't have to trust an indexer.* If the wallet trusts an indexer the pitch evaporates. 
## TODO: Dax confirm - The signing identity hash `474F624D8F3783B4D607CFF2331AD4C6CC26A1B5` and team ID `9HD4Q82U58` are real Apple Developer values. They're committed to the repo because the cert itself is private and the public values aren't sensitive — but worth a sanity check before publishing a wider distribution. - Windows MSI build was added in [commit `bd7d6299`](https://github.com/Dax911/vanta/commit/bd7d6299) on 2026-04-14. I'm describing the macOS-canonical pipeline because it's the one I run end-to-end; the Windows path may have evolved since I last touched it. ## Further reading - [`vanta-desktop/src-tauri`](https://github.com/Dax911/vanta/tree/main/vanta/vanta-desktop/src-tauri) — the Tauri host, including the node supervisor - [`vanta-desktop/build-release.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/build-release.sh) — sign + notarise + verify pipeline - [The vanta sidecar architecture](/blog/vanta_sidecar_architecture/) — what `vanta-node` is doing on the other side of these IPC calls - [Vanta: a Bitcoin fork with ZK at consensus](/blog/vanta_zk_privacy_l1/) — what's running inside `vantad` - [Tauri 2.x docs on sidecars](https://tauri.app/v2/develop/sidecar/) — the framework feature this is all built on - [Apple's notarytool docs](https://developer.apple.com/documentation/security/notarizing-macos-software-before-distribution) — the macOS distribution pipeline --- # The vanta sidecar: how a Rust ZK indexer talks to a C++ Bitcoin node Canonical: https://blog.skill-issue.dev/blog/vanta_sidecar_architecture/ Description: vantad is C++. The ZK index is Rust. They cooperate over RPC and a REST API, with the C++ verifier linked statically through libvanta_verifier.a. Here is the audit-surface trade we made and what the sidecar actually does. 
Published: 2026-04-13T17:46:02.000Z Tags: vanta, rust, sidecar, sp1, zk, bitcoin, ffi A 1-minute-block Bitcoin Core fork with ZK proofs at consensus has a problem the README doesn't volunteer: you need the validator to *check the proofs*, but you don't want to write the proof system in C++. Vanta's answer is a hybrid. The C++ consensus engine calls a Rust verifier statically linked as `libvanta_verifier.a`. The L2 indexing, the SMT, the encrypted-note delivery, and the proof-generation hot path all live in a *separate* Rust process — `vanta-node` — that talks to `vantad` over JSON-RPC and to wallets over REST. Two things share the name "sidecar" in this codebase and I want to disambiguate them up front: 1. The **FFI verifier** (`vanta-verifier-ffi` → `libvanta_verifier.a`) is *linked into vantad*. It runs in-process. It's what answers "does this SP1 proof verify" inside `src/script/interpreter.cpp`. 2. The **L2 sidecar** (`vanta-node`) is a *separate daemon*. It indexes commitments, holds the SMT, distributes encrypted notes, and serves a REST API to wallets. It does not participate in consensus. This post is about both, because the architecture only makes sense when you see why one is in-process and the other isn't. ## The audit-surface trade Bitcoin Core has 280k+ lines of C++ that have been read by more eyes than any other consensus codebase on Earth. Adding a Rust dependency to that build is a non-trivial ask of a future Bitcoin-Core-style review. We made the call up-front: a *minimal* Rust footprint inside `vantad`, exposed through a hand-written C ABI, with everything else in a separate process. The minimal footprint is `libvanta_verifier.a`. 
From the [zkVM engineering paper](https://github.com/Dax911/vanta/blob/main/papers/17-zkvm-engineering.md), the contract: > The bridge between the ZK proof world and the consensus world is a 440-byte C-compatible structure called `VantaJournal`, declared in `src/vanta/verifier.h`: > > ```c > typedef struct { > uint8_t smt_root[32]; > uint32_t input_commitment_count; > uint8_t input_commitments[VANTA_MAX_SLOTS][32]; > uint32_t nullifier_count; > uint8_t nullifiers[VANTA_MAX_SLOTS][32]; > uint32_t commitment_count; > uint8_t commitments[VANTA_MAX_SLOTS][32]; > int64_t value_balance; > } VantaJournal; > ``` The C++ never deserializes an SP1 proof. It calls `vanta_verify_and_decode()` with a byte slice; the function returns a boolean and populates the `VantaJournal`. From there the consensus engine looks at 32-byte hashes and a signed `i64` and makes its decisions on bytes alone. This is a deliberate cryptographic-engineering posture. The proof system can change underneath the FFI without changing the FFI. SP1 today, Halo 2 someday, whatever-comes-after-that the day after that — the C++ doesn't have to know. ## Why isn't the L2 logic in `vantad` too? This was the design conversation that took the longest to resolve. Option A was to put everything in-process. One binary, one supervised PID, fewer moving parts. The problem: the L2 index isn't *consensus*. It's an indexed view of commitments, an SMT, and a REST API. Bundling that into `vantad` would mean every Bitcoin-Core-style operator who wanted to run the chain would inherit an HTTP server, an SQLite-backed index, and an iroh-based gossip layer. That's a footprint expansion that buys nothing for the consensus path. Option B was a separate process with a clean network boundary. `vanta-node` talks *down* to `vantad` over standard JSON-RPC (the same `getblock`/`getrawtransaction` an explorer would use) and *up* to wallets over REST and iroh gossip. 
The footprint cost lives in the operator's discretion: if you don't want the L2 services, don't run `vanta-node`. We went with B and I don't regret it. The trade-off is that the L2 sidecar is a piece of operational machinery to keep alive. The desktop app handles that automatically (see [vanta-desktop](/blog/vanta_desktop_tauri_wallet/)); a server operator handles it the way they handle any daemon. ## What `vanta-node` actually does [`main.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-node/src/main.rs) is a four-task tokio program: ```rust let watcher_handle = tokio::spawn(async move { if let Err(e) = l1_watcher::run(watcher_config, watcher_state).await { tracing::error!("L1 watcher error: {e}"); } }); let gossip_handle_opt = match gossip::start(state.clone(), config.bootstrap_peers.clone()).await { Ok((handle, router)) => { ... } Err(e) => { tracing::warn!("Failed to start gossip (continuing without P2P): {e}"); None } }; let api_handle = tokio::spawn(async move { if let Err(e) = api::serve(api_state, &api_listen).await { tracing::error!("API server error: {e}"); } }); let save_handle = tokio::spawn(async move { let mut interval = tokio::time::interval(tokio::time::Duration::from_secs(10)); loop { interval.tick().await; if let Err(e) = save_state.save() { tracing::warn!("Failed to save state: {e}"); } } }); ``` Four jobs: watch L1 for new blocks, gossip with peers over iroh, serve the REST API, and snapshot state to disk every 10 seconds. Each tokio task runs against a shared `L2State` that's `Arc`-cloned across them. The **L1 watcher** polls `vantad`'s RPC every 2 seconds (configurable via `VANTA_POLL_MS`). 
For each new block it scans every transaction's outputs for the `OP_RETURN` anchor format we use to publish commitments and nullifiers — the byte sequence is `OP_RETURN 0xbb 0x00 <32-byte commitment>`, defined in the [pool's stratum server](https://github.com/Dax911/vanta/blob/main/pool/stratum_server.py) where I wrote it in Python and reused the format on the Rust side. Hits get fed into the SMT. The **gossip layer** uses [iroh](https://iroh.computer) — pure-Rust, QUIC-based, NAT-traversing — to share encrypted notes between L2 peers. The `bootstrap_peers` come from an env var; the desktop app starts with that empty by default. Iroh's gossip is a per-topic channel and we use one topic per chain (mainnet, regtest). The architecture doc explains the pick: > **P2P:** iroh.computer — pure Rust, QUIC-based, NAT traversal, gossip protocol, content-addressed blobs. Chosen over libp2p for simplicity, built-in QUIC + NAT hole-punching, and document sync (useful for offline branch-and-merge). The **REST API** is the thing wallets actually consume. The endpoints I care most about are `/status` (commitment count, nullifier count, SMT root, last block), `/submit` (push new commitments + encrypted notes from the pool or a wallet), `/notes/scan` (trial-decrypt encrypted notes against a wallet's secret key), and `/proofs/recent` (the 500-slot ring buffer of recently-verified proofs the explorer renders). The **save loop** is 10-second snapshots. The state file is a bincode'd dump of the SMT plus the nullifier set plus the encrypted-note inbox. `Drop` on the `L2State` saves on shutdown too. If the process is killed `-9` you lose at most 10 seconds of work, and the L1 watcher rebuilds the state by re-scanning from the last good height. ## How the wallet uses the sidecar The desktop wallet uses both `vantad` *and* `vanta-node`. From the wallet's perspective: - `vantad` is the source of truth for L1 — block heights, transparent UTXOs, transaction broadcast. 
- `vanta-node` is the source of truth for L2 — commitments, nullifiers, encrypted notes addressed to me.

When I press "send" on a private transaction in the desktop wallet:

1. The wallet asks `vanta-node` for the current SMT root and the membership proof for the input commitment I'm spending.
2. The wallet generates an SP1 proof locally (or, for low-end machines, against the SP1 proving network) using the membership proof and my secret key as private witness.
3. The wallet builds an L1 transaction that includes the SP1 proof in `witness.stack[0]` and an OP_RETURN anchor with the new commitment.
4. The wallet broadcasts the transaction via `vantad`'s `sendrawtransaction` RPC.
5. `vantad` validates: standard script checks, then `vanta_verify_and_decode()` against the SP1 proof in the witness.
6. After the block is mined, `vanta-node`'s L1 watcher picks up the OP_RETURN anchor and the new commitment lands in the SMT.

The recipient's wallet then pulls candidate ciphertexts from its `vanta-node` via `/notes/scan` and trial-decrypts them locally with its secret key. If a note decrypts cleanly, it's mine.

This is the architecture the [coinbase auto-shield](#) feature also rides on: every miner reward is a witness v2 commitment paying into the miner's shielded address, with the encrypted note pushed to the L2 via the same `/submit` endpoint. From the [pool's stratum server](https://github.com/Dax911/vanta/blob/main/pool/stratum_server.py):

```python
def save_shielded_note(height, commitment_hex, randomness_hex, value):
    """Persist mining note and submit encrypted note to L2 for wallet discovery.

    Called ONLY after a winning block is accepted by submitblock — never from
    the per-share job-template path, otherwise the L2 SMT fills up with phantom
    commitments for templates that never won the PoW race.
    """
```

That comment is load-bearing. The first version of the pool submitted the encrypted note on every share — that produced thousands of phantom commitments per block.
Submitting only on `submitblock` accept fixes it. ## Failure modes The sidecar architecture has failure modes the in-process design wouldn't have. They're worth naming. **`vanta-node` is dead, `vantad` is alive.** The wallet's L1 RPC works fine. The wallet's L2 calls all 503. The desktop app surfaces this as "L2 disconnected" and lets you keep using transparent functionality. Private send is gated behind L2 reachability. **`vantad` is dead, `vanta-node` is alive.** L1 RPC fails. `vanta-node`'s watcher logs polling failures. The L2 state is frozen at whatever block was last seen. The desktop app surfaces this as "L1 disconnected"; sending is impossible (no broadcast endpoint), but the wallet can still display historic state. **Both alive but `vanta-node` lost its data dir.** The L1 watcher detects "I've seen no blocks" on startup and re-scans from genesis. On a small chain this is fine. On a large chain this is a known cost of recovery — measured in hours, not days, but not free. **`vanta-node` is alive but the SMT is corrupted.** This one I worry about. Bincode + Drop-save + 10-second snapshots is a defensible steady state, but a partial write during a crash could in principle produce a non-loading state file. Recovery is "rm the state file, restart, let the watcher rebuild." We have monitoring on the fall-through path. **TODO: Dax confirm we ship cryptographic checksums on the state file.** ## What this isn't I want to head off two possible misreadings. **This is not a proof-on-server architecture.** The proof generation happens in the wallet (or, optionally, on a remote SP1 prover the user trusts). `vanta-node` doesn't generate proofs. It distributes encrypted notes and indexes commitments. The only ZK code in `vanta-node` is the verifier path it uses to sanity-check proofs before accepting them into the proof event ring buffer. **This is not a custodial sidecar.** `vanta-node` never sees secret keys. 
The encrypted notes are encrypted *to the recipient's pubkey* — `vanta-node` distributes ciphertext. Trial decryption happens client-side in the wallet using the recipient's secret. Lose the secret, lose the funds; lose the L2 sidecar, replay the chain. The cryptographic posture is the same as Zcash Sapling notes.

## What I changed my mind about

The original [nullifier-set post](/blog/vanta_l1_nullifier_set/) hinted at this: "The actual ZK proof verification happens **out of process** in the Rust sidecar. The C++ node fires off the proof to a local Unix socket and waits for `ok` or `not ok`." That's how it was originally architected. We changed it.

The Unix-socket sidecar didn't survive contact with the SP1 backend. Spawning a sub-process every block to verify proofs is fine in regtest where blocks are minutes apart; on mainnet at 1-minute blocks with peak-hour transaction volume, the IPC overhead added up to milliseconds per verify, multiplied by every spend in every block. Statically linking `libvanta_verifier.a` into `vantad` brought the verifier into the same address space and the same allocator and dropped the per-verify cost to roughly what an in-Rust call would cost. The audit-surface concern is real but mitigated by the *minimal* FFI: 440 bytes of struct, two C functions, deterministic output. A fuzzer can hammer that boundary and you'll know if it's broken.

What's *still* out of process is the L2 state — the SMT, the nullifier index, the encrypted-note inbox. That's the thing whose footprint we never want inside `vantad`, and there it's stayed.
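That 440-byte boundary is easy to pin down mechanically. A minimal sketch of a layout assertion — assuming `VANTA_MAX_SLOTS` is 4, which is the slot count consistent with the quoted 440-byte size (the real `verifier.h` value may differ):

```rust
// Illustrative mirror of the VantaJournal ABI struct from verifier.h.
// VANTA_MAX_SLOTS = 4 is an assumption: it is the slot count consistent
// with the 440-byte total quoted in the engineering paper.
const VANTA_MAX_SLOTS: usize = 4;

#[repr(C)]
struct VantaJournal {
    smt_root: [u8; 32],
    input_commitment_count: u32,
    input_commitments: [[u8; 32]; VANTA_MAX_SLOTS],
    nullifier_count: u32,
    nullifiers: [[u8; 32]; VANTA_MAX_SLOTS],
    commitment_count: u32,
    commitments: [[u8; 32]; VANTA_MAX_SLOTS],
    value_balance: i64,
}

fn main() {
    // repr(C) layout: the fields sum to 436 bytes, but value_balance lands
    // at offset 428 and i64 needs 8-byte alignment, so the compiler pads
    // it to offset 432 — giving the 440-byte total the C side expects.
    assert_eq!(std::mem::size_of::<VantaJournal>(), 440);
    assert_eq!(std::mem::align_of::<VantaJournal>(), 8);
    println!("VantaJournal is {} bytes", std::mem::size_of::<VantaJournal>());
}
```

An assertion like this in the FFI crate's tests turns journal-layout drift between the Rust and C sides into a CI failure rather than a corrupted consensus read.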
## Further reading - [`vanta/vanta-node`](https://github.com/Dax911/vanta/tree/main/vanta/vanta-node) — the L2 sidecar - [`vanta/vanta-verifier-ffi`](https://github.com/Dax911/vanta/tree/main/vanta/vanta-verifier-ffi) — the in-process FFI verifier - [`papers/17-zkvm-engineering.md`](https://github.com/Dax911/vanta/blob/main/papers/17-zkvm-engineering.md) — the design rationale for the SP1/Plonky3 backend - [Vanta: a Bitcoin fork with ZK at consensus](/blog/vanta_zk_privacy_l1/) — the chain itself - [Vanta Desktop: a Tauri wallet that ships its own full node](/blog/vanta_desktop_tauri_wallet/) — the binary that supervises both processes - [iroh.computer](https://iroh.computer) — the QUIC-based P2P stack the gossip layer uses --- # Why we shipped SP1 instead of RISC Zero Canonical: https://blog.skill-issue.dev/blog/vanta_sp1_zkvm_circuits/ Description: Vanta's earliest design notes said 'RISC Zero zkVM.' Production ships SP1 + Plonky3. The swap was cheap because the privacy protocol is independent of the prover. Here is why we moved, what stayed the same, and what the FFI verifier looks like. Published: 2026-04-15T23:15:13.000Z Tags: vanta, sp1, risc-zero, zkvm, plonky3, rust In the original [Vanta L1 post](/blog/vanta_zk_privacy_l1/) I wrote: > The ZK layer is in the `vanta/` subtree, written in Rust against [RISC Zero's zkVM](https://www.risczero.com), running entirely outside the C++ core. That sentence was true when I wrote it. It is no longer true. Production Vanta ships SP1 — Succinct Labs' zkVM — with Plonky3 as the proof backend. RISC Zero was the early prototype. The migration happened before mainnet and has been the production prover for every consensus-critical proof since. This post is the *why* of that change, the architectural choice that made the migration cheap, and what the verifier surface inside `vantad` actually looks like. 
The design rationale is also documented in [`papers/17-zkvm-engineering.md`](https://github.com/Dax911/vanta/blob/main/papers/17-zkvm-engineering.md), which is the canonical version. This post is the practitioner-flavored version: what I had to change, what I didn't, and what I'd warn the next person about. ## The abstraction that made the swap cheap The reason RISC Zero → SP1 was a code refactor and not an architectural rewrite is that the ZK code in Vanta is split into four layers and only *two* of them touch the zkVM SDK at all. From the engineering paper: > 1. **Core logic** (`vanta-core`): Pure Rust library containing the transfer validation function, domain-separated commitment construction, nullifier derivation, SMT membership proofs, and conservation law checks. This library has no dependency on any zkVM. It compiles to native x86, to ARM, and to RISC-V. It is the same code whether it runs inside a zkVM guest, inside a test harness, or on a developer's laptop. > > 2. **Guest program** (`vanta-circuits/methods/guest/`): A thin wrapper that reads private inputs from the zkVM host, calls `validate_transfer()` from `vanta-core`, and commits the public outputs (`TransferPublicInputs`: `smt_root`, `input_commitments`, `nullifiers`, `commitments`, `value_balance`) to the journal. The guest program is a few dozen lines of Rust. Its only zkVM-specific code is the I/O calls (`sp1_zkvm::io::read()` and `sp1_zkvm::io::commit()`). > > 3. **Host prover** (`vanta-circuits/src/prover.rs`): The component that sets up the proving environment, feeds private inputs to the guest, and invokes SP1 to generate a compressed Plonky3 proof. > > 4. **FFI verifier** (`vanta-verifier-ffi`): A Rust static library compiled to `libvanta_verifier.a` and linked directly into `vantad`. The split means that *the cryptographic protocol* — commitment scheme, nullifier scheme, conservation law, SMT membership proof verification — lives in `vanta-core` and has no zkVM dependency. 
The same Rust source compiles to: - native x86_64 for unit tests on my laptop - RISC-V for either zkVM's guest target - ARM for the iOS wallet (eventually) The guest program in [`vanta-circuits/methods/guest/`](https://github.com/Dax911/vanta/tree/main/vanta/vanta-circuits/methods/guest) is a thin shim. Its only zkVM-specific code is `sp1_zkvm::io::read()` to pull private inputs and `sp1_zkvm::io::commit()` to commit public outputs to the proof journal. Swapping to a different zkVM is a few lines of code in a file that is a few dozen lines long. It is not an architectural change. That split is why I'm comfortable saying we could swap zkVMs *again* without an architectural rewrite. The chain doesn't know what proof system it's running; it knows how to verify a journal. ## What the host prover looks like Here's the actual prover from [`vanta-circuits/src/prover.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-circuits/src/prover.rs): ```rust pub fn prove_transfer( private_inputs: &TransferPrivateInputs, smt_root: &Hash, ) -> Result<(SP1ProofWithPublicValues, TransferPublicInputs)> { let pi = private_inputs.clone(); let root = *smt_root; let result = std::thread::spawn(move || -> Result<...> { let mut stdin = SP1Stdin::new(); stdin.write(&pi); stdin.write(&root); let client = ProverClient::from_env(); let pk = client.setup(GUEST_ELF.clone())?; // Use compressed proofs (no Docker dependency). // Groth16 wrapping requires Docker + Gnark — enable later for production. let proof = client.prove(&pk, stdin).compressed().run()?; let mut proof_clone = proof.clone(); let public_inputs: TransferPublicInputs = proof_clone.public_values.read(); Ok((proof, public_inputs)) }) .join() .map_err(|e| anyhow::anyhow!("proof thread panicked: {:?}", e))??; Ok(result) } ``` A couple of details worth pulling out. **Compressed proofs, not Groth16-wrapped.** SP1 supports a Groth16 wrapping step that shrinks the receipt from ~1.27 MB to ~260 bytes. 
v2.0 ships compressed Plonky3 instead because Groth16 wrapping requires a Docker + Gnark toolchain that I did not want in the consensus-critical path at launch. Smaller proofs are a future release. **Spawn-on-thread because of tokio.** SP1's blocking ProverClient creates its own tokio runtime. If you call it from inside another tokio runtime — which is what happens when the Axum wallet or the Tauri app invokes the prover — you get a "runtime in runtime" panic. Spawning the prove call on a dedicated `std::thread` and joining it cleanly side-steps that. This is the kind of footgun that's invisible at unit-test time and very visible at integration-test time. It cost me an afternoon. Documented now. **`include_elf!` macro.** The guest binary is embedded into the host binary at compile time: ```rust pub static GUEST_ELF: Elf = include_elf!("vanta-guest"); ``` That means the wallet binary (or the Tauri host) carries the guest ELF along with the proving stack. No separate file to ship, no path resolution. This was one of the SP1 ergonomic wins over RISC Zero — the include macro removes a class of "where's the guest" bugs. ## What stayed identical The cryptographic protocol *did not change* between RISC Zero and SP1. From the engineering paper: > ### 4.1 The Privacy Model Is Application-Layer, Not Prover-Layer > > Vanta's privacy guarantees come from four cryptographic constructions, all of which are implemented in `vanta-core` and are independent of the proof system: > > **Commitment hiding.** A note commitment is computed as: > $$\text{cm} = H(\text{"Vanta/NoteCommitment/v1"}, \text{value} \| \text{owner\_pk} \| \text{asset\_type} \| r)$$ > > The hiding property — the fact that an observer cannot determine the committed values from the commitment — comes from the randomness $r$ and the preimage resistance of the hash function. This has nothing to do with the proof system. 
Whether the commitment is computed inside SP1, inside another zkVM, or on a napkin, the hiding property is identical. This is the load-bearing insight. The proof system is an *attestation layer*. It says "I correctly executed this Rust program against these private inputs and the public outputs are these." It does not contribute to the soundness of the commitment scheme, the unlinkability of the nullifier, or the integrity of the SMT membership proof. Those properties live in the application code that runs *inside* the proof. ## Why SP1 won The full case is in the engineering paper. The short version: **Speed.** SP1 generates compressed Plonky3 receipts for Vanta's transfer workload in 30–60 seconds on a modern multi-core CPU. RISC Zero in our early benchmarks was slower — comparable on simple programs, materially slower once domain-separated SHA-256 was the dominant operation. SP1's SHA-256 precompile is the difference; it substitutes a hand-optimized circuit for the operation rather than proving SHA-256 instruction-by-instruction through the RISC-V execution trace. **Trusted-setup posture.** Plonky3 is a hash-based STARK. It is post-quantum-resilient (under Grover's algorithm, 256-bit hash → 128-bit effective security is still strong) and it has *no trusted setup, no SRS, no powers-of-tau ceremony*. Anyone can compile the prover and verifier from source and run them without trusting a third party to have generated setup parameters correctly. This was a hard requirement for Vanta. The engineering paper is opinionated about it: > a permanent contingent backdoor against a single participant's operational discipline is not a substitute for transparent cryptography. That sentence is the reason we don't ship Groth16, despite Groth16's ~260-byte proofs. Groth16's per-circuit trusted setup requires a multi-party-computation ceremony where the security assumption is "at least one participant was honest and destroyed their share." We didn't want to carry that assumption. 
**SDK ergonomics.** SP1's `include_elf!`, `SP1Stdin`, `ProverClient` API, and the cargo-prove tool are well-typed and well-documented. The development cycle is fast and doesn't require specialized tooling beyond the Rust toolchain plus the SP1 target. **Active development + funding.** Succinct (the company) has raised over $55M and ships monthly releases with measurable performance improvements. SP1 is MIT/Apache-2.0, used in production by multiple chains. Lower abandonment risk than smaller projects. **GPU acceleration available.** SP1 has CUDA-based GPU proving. Doesn't matter for the wallet path (we're not putting GPUs in user laptops) but matters for the proving-network and miner-prover roles. ## What the FFI verifier looks like The FFI lives in [`vanta/vanta-verifier-ffi/`](https://github.com/Dax911/vanta/tree/main/vanta/vanta-verifier-ffi) and compiles to `libvanta_verifier.a`. It exposes two functions to C++: ``` bool vanta_verify_and_decode(const uint8_t *proof_bytes, size_t proof_len, VantaJournal *out); bool vanta_decode_journal(const uint8_t *bytes, size_t len, VantaJournal *out); ``` The C++ consensus engine in `src/script/interpreter.cpp` calls `vanta_verify_and_decode` from the witness-v2 branch. It hands over a byte slice (the SP1 receipt embedded in `witness.stack[0]`) and receives back a boolean and a populated `VantaJournal`. The `VantaJournal` is the 440-byte struct from the engineering paper: ```c typedef struct { uint8_t smt_root[32]; uint32_t input_commitment_count; uint8_t input_commitments[VANTA_MAX_SLOTS][32]; uint32_t nullifier_count; uint8_t nullifiers[VANTA_MAX_SLOTS][32]; uint32_t commitment_count; uint8_t commitments[VANTA_MAX_SLOTS][32]; int64_t value_balance; } VantaJournal; ``` The C++ doesn't deserialize the proof. It doesn't know the proof system. 
It looks at 32-byte hashes and a signed `int64_t`, and:

- checks the `input_commitments` against the spent UTXO's `OP_2 PUSH32 <commitment>` script
- checks the `nullifiers` against the chainstate nullifier set ([the post on this](/blog/vanta_l1_nullifier_set/))
- checks the `smt_root` against the chain's currently committed state root
- checks `value_balance` for sign + balance against any transparent outputs in the transaction

That's the whole consensus contract. The proof verified bit either is or isn't set. Everything else is byte arithmetic. The minimal-FFI design is what makes the audit story tractable. A fuzzer can hammer the boundary and you'll know if `vanta_verify_and_decode` ever populates a journal that the proof didn't actually attest to. The Rust side is the thing that needs the careful audit; the C++ side is reading bytes.

## What I learned about zkVMs by switching

A few things I'd tell my past self.

**Decouple the cryptographic protocol from the proof system. Hard.** It is enormously tempting to put commitment construction in the guest program. Don't. Put it in `vanta-core`, call it from the guest, *and call it from native unit tests*. The native unit tests are how you find off-by-one byte ordering bugs without paying 30s/proof to find them.

**The journal is the public contract.** Whatever public outputs you commit to the proof journal *are* your interface. Adding a field is a forking change for the verifier. Removing one is too. Plan the journal layout the way you'd plan a serialised network message: explicit, versioned, ABI-stable.

**Compressed proofs are large.** ~1.27 MB on Vanta. That meant raising `MAX_STANDARD_TX_WEIGHT` to `MAX_BLOCK_WEIGHT` so a single witness-v2 spend fits in one transaction. Plan the chain parameters around the proof size you ship with, and budget for the eventual Groth16-wrapping shrink.

**zkVM benchmarks lie about your workload.** Generic prover benchmarks measure simple programs. Yours is not simple.
Measure your *actual circuit* against the candidates before deciding. SP1's SHA-256 precompile dominated our workload; if your hashing is Poseidon over a different field, your numbers will look different. ## What's next The roadmap from the papers calls out a few things that have *not* shipped yet: - **Groth16 wrapping** to bring receipt size from 1.27 MB to ~260 bytes. Deferred for the Docker dependency reason above. - **Poseidon migration** to replace SHA-256 in the commitment scheme with a ZK-friendlier hash. Performance win, no security change. - **GPU proving distribution.** SP1 supports CUDA but we haven't shipped a wallet path that uses it; the lift is mostly UX (how does the user point at a GPU?) and packaging. The architectural property I want to keep regardless of which of these lands: *the chain doesn't know what proof system it's running, it knows how to verify a journal*. That's the property that lets us swap zkVMs again. It's the property that lets the eventual full-Rust node rewrite ship without a forking change. It's the property that, six years from now, lets us swap the proof backend for whatever has won the 2032 cryptography landscape. 
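The "decouple, hard" advice has a concrete shape: commitment construction as a pure function in a prover-agnostic crate, exercised by native tests. A minimal sketch of that shape — with an inline FNV-1a standing in for the real domain-separated SHA-256, and `commit_note` as an illustrative name, not `vanta-core`'s actual API:

```rust
// Illustrative only: FNV-1a stands in for the real hash here. The actual
// vanta-core commitment uses domain-separated SHA-256 and its own names.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in bytes {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

// The pattern from the post: commitment construction is a pure function
// with no zkVM dependency, so the same body runs inside a guest program
// and inside an ordinary native test.
fn commit_note(value: u64, owner_pk: &[u8; 32], asset_type: u32, r: &[u8; 32]) -> u64 {
    let mut preimage = Vec::new();
    preimage.extend_from_slice(b"Vanta/NoteCommitment/v1"); // domain separation
    preimage.extend_from_slice(&value.to_le_bytes());
    preimage.extend_from_slice(owner_pk);
    preimage.extend_from_slice(&asset_type.to_le_bytes());
    preimage.extend_from_slice(r);
    fnv1a(&preimage)
}

fn main() {
    let pk = [7u8; 32];
    // Different randomness must give a different commitment (hiding needs r).
    let cm1 = commit_note(100, &pk, 0, &[1u8; 32]);
    let cm2 = commit_note(100, &pk, 0, &[2u8; 32]);
    assert_ne!(cm1, cm2);
    // Deterministic: same inputs, same commitment — checked natively,
    // without paying 30s per proof to find a byte-ordering bug.
    assert_eq!(cm1, commit_note(100, &pk, 0, &[1u8; 32]));
}
```

In the real split, the guest would call the same function behind `sp1_zkvm::io` reads; the native test is where byte-ordering mistakes die cheaply.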
## Further reading - [`vanta/vanta-circuits`](https://github.com/Dax911/vanta/tree/main/vanta/vanta-circuits) — host prover + guest program - [`vanta/vanta-verifier-ffi`](https://github.com/Dax911/vanta/tree/main/vanta/vanta-verifier-ffi) — the static library linked into vantad - [`papers/17-zkvm-engineering.md`](https://github.com/Dax911/vanta/blob/main/papers/17-zkvm-engineering.md) — full design rationale - [Vanta: a Bitcoin fork with ZK at consensus](/blog/vanta_zk_privacy_l1/) — the chain the verifier protects - [L1 nullifier sets: enforcing no-double-spend at consensus](/blog/vanta_l1_nullifier_set/) — what the verifier's nullifiers feed into - [SP1 docs](https://docs.succinct.xyz/docs/sp1/introduction) — the prover we ship - [Plonky3 repo](https://github.com/Plonky3/Plonky3) — the proof backend --- # Tauri 2.x sidecars in anger: the ergonomics paper-cuts I had to fix Canonical: https://blog.skill-issue.dev/blog/vanta_tauri_ergonomics/ Description: externalBin wants a target-triple suffix nobody documents loudly enough. The dev resolver walks up parents. Startup must be sequenced. The setup-sidecars.sh + resolve_binary() story for shipping a wallet that runs its own node. Published: 2026-04-13T21:45:02.000Z Tags: vanta, tauri, rust, desktop, sidecar, devenv The [Vanta Desktop walkthrough](/blog/vanta_desktop_tauri_wallet/) is the architectural story: a Tauri 2.x wallet that ships its own full node, supervises three sidecar binaries, and exposes everything through `#[tauri::command]` IPC. That post is the *what.* This post is the *how* — the small, awkward, under-documented ergonomics details that took me a working week to figure out. If you're shipping a Tauri app with sidecar binaries in 2026 and you're hitting walls, this post is the field notes I wish someone had written for me. If you're not, skip it. ## The target-triple suffix The first wall is the one that's most easily missed because it works fine in production and breaks subtly in dev. 
Tauri's `externalBin` config in [`tauri.conf.json`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/tauri.conf.json) declares the sidecar binaries: ```json "externalBin": [ "binaries/vantad", "binaries/vanta-node", "binaries/vanta-cli" ] ``` The bundler does *not* look for `binaries/vantad` literally. It looks for `binaries/vantad-` — that is, the file with the target-triple appended. On Apple Silicon that's `binaries/vantad-aarch64-apple-darwin`. On Intel macOS it's `binaries/vantad-x86_64-apple-darwin`. On Linux it's `binaries/vantad-x86_64-unknown-linux-gnu`. Windows is `binaries/vantad-x86_64-pc-windows-msvc`. If the file is named just `binaries/vantad`, Tauri's bundler emits a moderately cryptic error during `tauri build`. The fix is renaming the file. The discovery process for figuring this out is `tauri build` → fail → google → find a [years-old GitHub issue](https://github.com/tauri-apps/tauri/issues) → realise. The setup script that handles this lives in [`setup-sidecars.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/setup-sidecars.sh) and the load-bearing line is the very first one: ```bash TRIPLE=$(rustc -vV | grep host | awk '{print $2}') ``` `rustc -vV` outputs Rust's verbose version info, which includes a `host: ` line. Awk picks the second field. The variable then suffixes every file copied into `src-tauri/binaries/`: ```bash cp "$VANTAD" "$BINDIR/vantad-$TRIPLE" cp "$ZERANODE" "$BINDIR/vanta-node-$TRIPLE" ``` That's the canonical way to discover the host triple, and it's what every Tauri tutorial buries five paragraphs in. **Do not hardcode the triple.** A Mac developer who switches between aarch64 and x86_64 (e.g., a Rosetta context for testing) will produce one set of binaries from one shell and another from another, and the bundler will pick whichever is on disk *and* matches the active build target — only one of which is correct on any given build. 
The full search for `vantad` in `setup-sidecars.sh` walks well-known locations: ```bash VANTAD="${ZERA_L1_BIN:-}" if [ -z "$VANTAD" ]; then for candidate in \ "../src/vantad" \ "../../src/vantad" \ "/usr/local/bin/vantad" \ ; do if [ -x "$candidate" ]; then VANTAD="$candidate" break fi done fi ``` The `../src/vantad` and `../../src/vantad` are relative paths from `vanta-desktop/` and `vanta-desktop/src-tauri/` respectively. The `/usr/local/bin/vantad` is the canonical install path on macOS. The `ZERA_L1_BIN` env var is the escape hatch for non-default layouts. (The variable name still reads `ZERA_L1_BIN` because of the [zeracoin → vanta rebrand](/blog/vanta_darwin_apple_silicon_build/) — pre-rebrand artefact, on the cleanup list.) ## The dev-mode resolver The bundler problem is solved at *build* time. There's a different problem at *dev* time: when you `cargo run` the Tauri host directly (or `pnpm tauri dev`), the executable lives at `src-tauri/target/debug/vanta-desktop`, not in a `.app` bundle. The sidecars aren't sitting next to the host executable; they're at `src-tauri/binaries/vanta*-`. The host has to discover them at runtime, in both production and dev. The function that does this is `resolve_binary` in [`src-tauri/src/node.rs:225`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/src/node.rs): ```rust pub fn resolve_binary(name: &str, config_path: &str) -> PathBuf { let triple = target_triple(); let suffixed = format!("{name}-{triple}"); if let Ok(exe) = std::env::current_exe() { if let Some(dir) = exe.parent() { // Production: sidecars are next to the exe with triple suffix for candidate in [ dir.join(&suffixed), dir.join(name), ] { if candidate.exists() && candidate.metadata().map(|m| m.len() > 0).unwrap_or(false) { tracing::info!("Found sidecar: {}", candidate.display()); return candidate; } } // Dev mode: the exe is at src-tauri/target/debug/vanta-desktop. // Walk up ancestors looking for a `binaries/` dir containing our binary. 
            for ancestor in dir.ancestors().skip(1) {
                let candidate = ancestor.join("binaries").join(&suffixed);
                if candidate.exists() && candidate.metadata().map(|m| m.len() > 0).unwrap_or(false) {
                    tracing::info!("Found dev sidecar: {}", candidate.display());
                    return candidate;
                }
            }
        }
    }

    // Check explicit config path
    let config = PathBuf::from(config_path);
    if config.exists() && config.metadata().map(|m| m.len() > 0).unwrap_or(false) {
        return config;
    }

    // Fall back to PATH lookup
    PathBuf::from(name)
}
```

The five-tier resolution order is:

1. Same directory as the host exe, with the triple suffix → production bundle.
2. Same directory as the host exe, without suffix → also production, fallback for some bundlers.
3. Walk up the parent chain looking for a `binaries/` directory → dev mode.
4. Explicit path from `WalletConfig` → user override.
5. Bare `name` → `PATH` lookup.

The fallback to PATH is what lets a developer with `vantad` already installed at `/usr/local/bin/vantad` skip the `setup-sidecars.sh` step entirely if they want to. The metadata check for `len() > 0` is paranoia — empty files passing existence checks have caused at least one wasted afternoon.

The `target_triple()` helper picks the right suffix based on `cfg!`:

```rust
pub fn target_triple() -> &'static str {
    if cfg!(target_os = "macos") {
        if cfg!(target_arch = "aarch64") {
            "aarch64-apple-darwin"
        } else {
            "x86_64-apple-darwin"
        }
    } else if cfg!(target_os = "linux") {
        "x86_64-unknown-linux-gnu"
    } else {
        "x86_64-pc-windows-msvc"
    }
}
```

This is intentionally a hardcoded match. We don't support other target triples (yet — Linux ARM is an "if a user complains" item). One honest caveat: as written, the final `else` assumes Windows rather than erroring, so a genuinely unsupported OS would silently get a binary that can't find its sidecars; a `compile_error!` arm for unknown targets is what turns that into a loud build failure.
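The five-tier order can be expressed as a pure function that just enumerates the paths `resolve_binary` would try, in order. This is a sketch for illustration — the real function interleaves existence and file-size checks, and `candidate_paths` is a name I'm inventing here:

```rust
use std::path::{Path, PathBuf};

// Enumerate resolution candidates in priority order, given the directory
// holding the host executable. Purely computational: no filesystem access.
fn candidate_paths(exe_dir: &Path, name: &str, triple: &str, config_path: &str) -> Vec<PathBuf> {
    let suffixed = format!("{name}-{triple}");
    let mut out = vec![
        exe_dir.join(&suffixed), // 1. next to the exe, triple-suffixed (bundle)
        exe_dir.join(name),      // 2. next to the exe, bare name
    ];
    // 3. walk up the ancestors looking for binaries/<name>-<triple> (dev mode)
    for ancestor in exe_dir.ancestors().skip(1) {
        out.push(ancestor.join("binaries").join(&suffixed));
    }
    out.push(PathBuf::from(config_path)); // 4. explicit config override
    out.push(PathBuf::from(name));        // 5. bare name → PATH lookup
    out
}
```

Writing the order down as data like this also makes it trivial to unit-test the priority logic without touching a real filesystem.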
## The signing identity Tauri 2.x's macOS bundler signs every binary in the `.app` with the identity declared in `tauri.conf.json`: ```json "macOS": { "signingIdentity": "474F624D8F3783B4D607CFF2331AD4C6CC26A1B5", "providerShortName": "9HD4Q82U58", "entitlements": "entitlements.plist", "minimumSystemVersion": "10.15" } ``` `signingIdentity` is the SHA1 fingerprint of an Apple Developer ID Application certificate. You can list yours with `security find-identity -v -p codesigning`. The format `474F62...` is a hex fingerprint, not a CN string — Tauri specifically looks up by fingerprint to disambiguate when you have multiple Developer ID certs in keychain. This took me a few false starts to land on; the first version of this config used the human-readable CN ("Developer ID Application: Hayden Porter-Aylor (9HD4Q82U58)") and broke when I had two certs from different teams. `providerShortName` is the Apple Team ID, the same string that goes in the cert's CN. It's needed for notarisation — `xcrun notarytool submit` requires `--team-id` to match this value. `entitlements` points at a separate plist file. Tauri's default entitlements are too permissive for a wallet; ours pins network access (we need it for the L2 P2P), allows JIT (because the WebView needs it), and otherwise denies everything — including microphone, camera, location, and the raft of other Apple-managed entitlements that a finance app should not be requesting. ## Sequenced startup, not parallel The first version of the wallet started both nodes in parallel, hoping the L2 watcher would survive its first few RPC failures while the L1 came up. It didn't. The L2 would crash on its first poll, the supervisor would log "L2 stopped," the user would see "L2 disconnected," and the eventual successful start (after `vantad`'s 30-second startup) would race the user's first attempt to send a transaction. 
The fix is in [`src-tauri/src/lib.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/src/lib.rs) and the structure is five sequenced stages:

```
Stage 1: Start L1 (may adopt)
Stage 2: Wait for L1 RPC up to 60s, exponential backoff
Stage 3: Create / load default wallet via RPC
Stage 4: Start L2 (now that L1 is confirmed reachable)
Stage 5: emit "ready" event to frontend
```

Each stage emits a Tauri event (`startup-stage`) the frontend subscribes to. The user-facing UX is a five-step progress meter that goes "spawning vantad… RPC ready… wallet loaded… spawning vanta-node… ready." On a warm cache the whole flow takes about 4 seconds. On a cold first launch with chainstate to load, it's closer to 12.

The stage-by-stage approach makes failures actionable. If stage 2 times out (L1 RPC didn't come up), the error message points at L1; if stage 4 fails, it points at L2. The pre-stage version of this had a single `start_nodes()` command whose only failure mode was a generic "couldn't start nodes," which was useless for support.

## The NodeManager struct

The supervisor is a single struct in [`src-tauri/src/node.rs:178`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/src/node.rs):

```rust
pub struct NodeManager {
    l1_process: Option<Child>,
    l2_process: Option<Child>,
    /// When true, we detected a pre-existing L1 that we're reusing.
    l1_adopted: bool,
    /// When true, we detected a pre-existing L2 that we're reusing.
    l2_adopted: bool,
    /// Resolved path to vantad binary.
    l1_bin: Option<PathBuf>,
    /// Resolved path to vanta-node binary.
    l2_bin: Option<PathBuf>,
    /// Recent output from L1 process.
    pub l1_logs: LogBuffer,
    /// Recent output from L2 process.
    pub l2_logs: LogBuffer,
    /// Tauri app handle for emitting events.
    app_handle: Option<AppHandle>,
}
```

The fields tell the whole story:

- Two `Option<Child>`s — the live process handles, when we own them.
- Two `bool`s — adopted flags.
When the desktop starts and finds an existing `vantad` already listening on `19332`, it doesn't kill it; it adopts it (PID 0 sentinel) and proceeds. The adoption logic exists because every quit-and-relaunch from a previous version of the app would otherwise produce a "port in use" error. - Two `PathBuf`s — the resolved binary paths, useful for the diagnostic UI ("the wallet is using `vantad` at /Applications/Vanta Wallet.app/Contents/MacOS/vantad-aarch64-apple-darwin"). - Two `LogBuffer`s — 200-line ring buffers per process, served to the frontend on demand for the live console view. - The `AppHandle` — used to emit Tauri events for log lines (so the frontend can render a rolling log view without polling). ## Pipe draining as a load-bearing detail A Bitcoin-Core C++ process logging to stdout will eventually fill the OS's 64 KB pipe buffer if nobody reads it, and then *block on its next `write()`*. `vantad` with `-printtoconsole` is a heavy logger; without an active reader, it deadlocks within minutes. The `drain_pipe` function in [`node.rs:64`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/src/node.rs) is the safety net. Every line goes three places: 1. `tracing::debug!` — the structured log file. 2. `LogBuffer::push` — the in-memory ring buffer for the frontend. 3. `app.emit("node-log", …)` — a Tauri event the frontend subscribes to for live rendering. That last one is the path I want to underline. The first version of the wallet had no live log surface; debugging a "node won't start" required `tail -f` on the log file. The Tauri event channel turned that into a frontend log component that streams stdout in real time, which by itself paid for the IPC complexity ten times over. ## Auto-config for the L1 Before `vantad` starts, the host writes a `vanta.conf` into `{home}/.vanta-desktop/l1/`. The config is hardcoded for the desktop's port plan, points at the seed nodes, and disables Bitcoin-style DNS seeding. 
Excerpt: ```rust let conf = format!( "# Vanta Desktop Wallet — auto-generated config\n\ server=1\n\ daemon=0\n\ txindex=1\n\ listen=1\n\ rpcport={DESKTOP_L1_RPC_PORT}\n\ port={DESKTOP_L1_P2P_PORT}\n\ rpcuser={}\n\ rpcpassword={}\n\ rpcallowip=127.0.0.1\n\ rpcbind=127.0.0.1\n\ dnsseed=0\n\ addnode=64.34.82.145:9333\n\ addnode=66.241.124.138:9333\n\ wallet=default\n\ fallbackfee=0.0001\n", config.rpc_user, config.rpc_pass, ); std::fs::write(&conf_path, &conf)?; ``` Three things in here that are non-obvious: **`txindex=1`.** The L2 watcher needs to look up arbitrary historic transactions to scan for OP_RETURN anchors. Without `txindex`, `getrawtransaction` only works for unspent outputs. With it, every transaction is indexed by txid forever. This costs ~10% extra disk vs a stock node; the L2 work flat-out doesn't function without it. **`dnsseed=0`.** Bitcoin Core auto-discovers peers from a hardcoded list of DNS seeds. Vanta's a separate network with separate seeds; if `dnsseed=1`, vantad will spam `seed.bitcoin.sipa.be` looking for peers it'll never find. Disabling it cuts the startup chatter and the wasted DNS queries. **`addnode=64.34.82.145:9333`.** The Latitude bare-metal seed node, hardcoded into the desktop config. Until the chain has organic peer discovery, this is the bootstrap. (More on this in [the fly+bare-metal post](/blog/vanta_flytoml_latitude_baremetal/).) The user never sees this file unless they go looking. Pinning the config to the desktop's port plan also means the wallet never collides with a standalone `vantad` running on default ports — which matters because some users run both. 
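Back on the sequencing theme: stage 2 of the startup ("wait for L1 RPC up to 60s, exponential backoff") is easy to get subtly wrong. A minimal sketch of the delay schedule as a pure function — the 250 ms floor and 8 s per-attempt cap are assumed constants for illustration, not the shipped values:

```rust
use std::time::Duration;

// Exponential backoff schedule: start small, double each attempt,
// cap the per-attempt wait, and never exceed the total budget.
fn backoff_schedule(total_budget: Duration) -> Vec<Duration> {
    let mut delays = Vec::new();
    let mut delay = Duration::from_millis(250); // assumed floor
    let mut elapsed = Duration::ZERO;
    while elapsed + delay <= total_budget {
        delays.push(delay);
        elapsed += delay;
        // double, but never wait more than 8s between polls (assumed cap)
        delay = (delay * 2).min(Duration::from_secs(8));
    }
    delays
}
```

Keeping the schedule as data (rather than sleeping inline in a loop) means the stage-2 timeout behaviour is testable without ever spawning `vantad`.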
## The Linux/NVIDIA cursed stanza Two lines in [`lib.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/src/lib.rs) earned a comment longer than they are: ```rust #[cfg(target_os = "linux")] { if std::env::var("WEBKIT_DISABLE_DMABUF_RENDERER").is_err() { std::env::set_var("WEBKIT_DISABLE_DMABUF_RENDERER", "1"); } } ``` webkit2gtk on NVIDIA's proprietary driver under Wayland tries to use a DMA-BUF renderer path that crashes with `Error 71 (Protocol error) dispatching to Wayland display`. Disabling it forces software compositing, which is fine. This is the canonical example of "Tauri 2.x ergonomics" actually meaning "the OS will surprise you in ways the web never has." Budget for it. Apps that ship to general users discover bugs that only happen on specific GPU + display server + driver combinations. The fix is usually environmental; the discovery is a weekend. ## What I changed my mind about The big one: **Tauri's sidecar story is the right way to ship a wallet that runs a node.** The alternative — telling users to run `vantad` themselves and pointing the wallet at it — is a non-starter for everybody but engineers like me. The friction of the embedded full-node story turns out to be entirely inside the *developer* (the build process, the signing pipeline, the auto-update story). The user friction is zero. They double-click a DMG and they're a node operator. The smaller one: **the build is more code than the app.** [`build-release.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/build-release.sh) is 200 lines, [`setup-sidecars.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/setup-sidecars.sh) is 60. Combined that's about as much shell as the rest of `src-tauri/` is Rust outside `commands.rs`. The shell is *load-bearing infrastructure*, not glue. Treat it that way and the build is reproducible; treat it as glue and the build will surprise you on every fresh checkout. 
## Further reading - [`vanta/vanta-desktop/src-tauri/src/node.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/src/node.rs) — the supervisor and `resolve_binary` - [`vanta/vanta-desktop/setup-sidecars.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/setup-sidecars.sh) — the binary-rename script - [`vanta/vanta-desktop/src-tauri/tauri.conf.json`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/tauri.conf.json) — the bundle config - [Tauri 2.x sidecar docs](https://tauri.app/v2/develop/sidecar/) — the framework feature this all rides on - [Vanta Desktop: a Tauri wallet that ships its own full node](/blog/vanta_desktop_tauri_wallet/) — the architecture-level companion - [Cross-compiling vantad for darwin](/blog/vanta_darwin_apple_silicon_build/) — the macOS half of the build pipeline --- # Vanta: a Bitcoin fork with ZK at consensus Canonical: https://blog.skill-issue.dev/blog/vanta_zk_privacy_l1/ Description: 42 billion supply. 1-minute blocks. RISC Zero proofs verified at consensus. The opinionated answer to 'why fork Bitcoin in 2026?' is that you're not really forking Bitcoin — you're shipping a different L1 that has Bitcoin's surface area. Published: 2026-04-17T05:52:57.000Z Tags: vanta, bitcoin, zk, risc-zero, consensus, l1 You can read this post as the technical version of [Why I started Zera Labs](/blog/why_i_started_zera_labs/). The strategy lived there. The code lives in [`Dax911/vanta`](https://github.com/Dax911/vanta), and the [README](https://github.com/Dax911/vanta/blob/main/README.md) opens with a sentence that took me a long time to feel comfortable writing: > *Vanta L1 — ZK-privacy Layer 1 blockchain — fork of Bitcoin Core v27.0.* If you spent any time on crypto Twitter in 2024, the words *fork of Bitcoin Core* were code for "vanity chain that nobody serious will run." 
I want to make the case that this fork is different — not because the C++ source tree is different (most of it isn't) but because the consensus rules are.

## What it is

The headline parameters are deliberate departures from Bitcoin:

| Parameter | Vanta | Bitcoin |
|---|---|---|
| Block reward | 100,000 VANTA | 3.125 BTC (post-2024 halving) |
| Block time | **1 minute** | 10 minutes |
| Total supply | ~42 billion VANTA | ~21 million BTC |
| Halving interval | 210,000 blocks (~146 days) | 210,000 blocks (~4 years) |
| Address prefix | `Z` (legacy), `zer1` (bech32) | `1`, `bc1` |
| Network magic | `0x5a454500` (ASCII `"ZEE\0"`) | `0xf9beb4d9` |

A 1-minute block time and a 42-billion supply are not "Bitcoin with a different ticker." They are calibrated to make this chain *feel* like a payments rail rather than a settlement rail. You can confirm a payment in two blocks (2 minutes, ~95% confidence) instead of waiting out Bitcoin's customary three blocks (30 minutes). The 100,000-per-block subsidy makes the unit economics of running a node + miner actually work for a small operator. I have [opinions about small-operator unit economics](/blog/what_running_a_bitcoin_mine_taught_me/) that come from running a [Bitaxe BM1368 against this chain](https://github.com/Dax911/vanta/tree/main/pool) for the past several months.

## The fork strategy: keep what works

The [monorepo structure](https://github.com/Dax911/vanta/blob/main/README.md) is a direct read on the strategy:

```
zl1/
├── src/       # Vanta Core (C++ — Bitcoin Core v27.0 fork)
├── wallet/    # Web wallet (Rust/Axum)
├── txbot/     # Transaction bot (Rust)
├── explorer/  # Block explorer (Node.js — patched btc-rpc-explorer)
├── vanta/     # ZK circuits (Rust/RISC Zero)
├── pool/      # Stratum server (Python)
└── …
```

Bitcoin Core v27.0 is the most-tested codebase on the planet by node-hours. We did not fork it because we thought we could write something better.
We forked it because we wanted **a chain that ships day-one with the same tooling humans have spent fifteen years building around Bitcoin** — wallets that can be ported, RPCs that can be wrapped, block explorers that can be patched. The price of admission is that the C++ surface area is huge and you respect it. We did not fork the *cryptography*. The ZK layer is in the `vanta/` subtree, written in Rust against [RISC Zero's zkVM](https://www.risczero.com), running entirely outside the C++ core. The C++ core verifies one thing: a SHA-256 hash of the proof witness root. That's it. The proof itself is computed and verified in a Rust program that runs as a sidecar. This split means we can change the proof system without forking the chain again, which matters in 2026 because [the proof-system landscape is moving fast](/blog/privacys_broadband_moment/). ## What's at consensus, what's not Here is the part I want to dwell on, because every privacy-coin design eventually crashes into this question. **At consensus** (i.e. nodes will reject blocks that don't satisfy these): 1. **ZK proof-to-UTXO binding.** A spending transaction must include a witness v2 input commitment that matches the proof's public input. The C++ validator verifies the binding before the proof is even consulted; the proof confirms it. 2. **SMT root cross-verification.** Every block has a coinbase commitment to the sparse-Merkle-tree root of the post-block nullifier set. The proof root and the coinbase commitment must match. A miner cannot lie about state. 3. **L1 nullifier set tracking.** The nullifier set is part of consensus state, not a wallet-side hint. Two valid blocks attempting to mine a transaction whose nullifier was spent in either block create a hard chain split. Double-spend prevention is **a property of the chain**, not a property of the wallet. **Not at consensus:** 1. The proof system itself. Today it's Groth16 over the RISC Zero zkVM. 
We can swap to Halo2 or Nova-style recursion in a soft fork by adding a new opcode and grandfathering the old. The chain doesn't know what proof system it's running; it knows how to verify a witness root. 2. Address format. Z-legacy and zer1-bech32 are wallet-side. The chain treats them all as `OP_PUSHBYTES`-style script commitments. 3. Wallet-level shield/unshield UX. The README lists `shield` and `unshield` as commands for moving between transparent and private. Those are wallet conveniences; the chain itself sees commitments and nullifiers, not "shielded" and "unshielded" as states. This split is load-bearing. If your fork tries to put the proof system into consensus, you have two terrible choices when the proof system improves: hard-fork the world, or live with worse cryptography forever. Vanta lives somewhere in between: the *binding* is at consensus, the *proof system* is not. ## The roadmap, annotated The [README's roadmap](https://github.com/Dax911/vanta/blob/main/README.md) is concise; let me unpack the items that are checked. - **[x] Fork Bitcoin Core v27.0** — the easy part. It's a tree copy and a network-magic change. - **[x] Custom chain parameters** — most of the work was rewriting `src/chainparams.cpp` and the genesis-block builder. Bitcoin Core makes you re-mine the genesis block locally; the script is in `contrib/`. - **[x] Solo mining with Bitaxe BM1368** — the [Python Stratum server in `pool/`](https://github.com/Dax911/vanta/tree/main/pool) is a from-scratch Stratum v1 implementation pointed at the local node's RPC. I'll write a separate post on the Bitaxe rig. - **[x] Web wallet + block explorer** — Rust/Axum wallet, patched btc-rpc-explorer for the explorer. The wallet integrates with the ZK circuits; the explorer renders shielded transactions as opaque commitments. - **[x] Transaction bot for mempool activity** — synthetic mempool activity is essential during testnet. 
The Rust txbot generates round-robin spends so you're not staring at empty blocks. - **[x] RISC Zero ZK circuit integration** — the big one. RISC Zero gives us a zkVM where the circuit is just Rust. We don't write R1CS by hand. The witness for a spend is the same Rust struct the wallet uses; the prover takes that struct and emits a proof. - **[x] ZK proof-to-UTXO binding** — wired into `src/script/interpreter.cpp` and the witness-v2 stack. A new `OP_VANTA_VERIFY` opcode pulls the proof root from the witness, hashes it with the input commitment, and compares against the script. - **[x] SMT root cross-verification** — sparse-Merkle-tree state for the nullifier set is materialised in the block header's coinbase. A node that doesn't validate the SMT root rejects the block. - **[x] L1 nullifier set tracking** — the chainstate db now has a nullifier table. Double-spend at L1, see [my next post](/blog/vanta_l1_nullifier_set/). - **[x] Shield/unshield wallet commands** — `vanta-cli shield 1.5` moves 1.5 VANTA from a transparent UTXO into a shielded note. `unshield` does the reverse with a destination address. Both produce normal-looking on-chain transactions; the difference is the witness contents. The two unchecked items are the strategic ones. **Mandatory privacy** as a hard fork (so every transaction is a uniform shielded format and no information leaks from "this user used the shielded pool, this one didn't") is the long-term goal. A **full Rust node rewrite** is the longer-term goal — the C++ tree is fine for now, but a `vantad` written from scratch in Rust against the same RPC contract is the kind of project you can spend three years on without regretting it. ## What I changed my mind about When we started, I wanted to write the chain from scratch in Rust. A clean tree, no Bitcoin baggage, no `boost::` types, modern async, the whole pitch. Two things stopped me: 1. 
**Time.** Writing a UTXO chain from scratch is at least a year of work before you have something nodes will run. We had ~3 months of runway for the L1 proof-of-concept. That's a fork-or-fold decision. 2. **Bitcoin Core's testing infrastructure is the actual product.** The `test/` directory is the most underrated part of the Bitcoin codebase. There are functional tests that cover edge cases nobody on a greenfield team will think of for years. Inheriting that is worth more than most people realise. The compromise we landed on is in the README's last roadmap line: *full Rust node rewrite*. That's the path. Fork now to ship now; rewrite incrementally to own the long term. The Rust ZK sidecar is the first piece of that rewrite. The Rust wallet is the second. ## What's next The next two posts will go deeper: - [L1 nullifier sets: enforcing no-double-spend at the consensus layer](/blog/vanta_l1_nullifier_set/) - Mining VANTA with a Bitaxe BM1368 (forthcoming) If you want the strategic frame, I wrote about the four curves that crossed in 2026 to make this whole thing tractable in [Privacy's broadband moment](/blog/privacys_broadband_moment/). ## Further reading - [`Dax911/vanta` on GitHub](https://github.com/Dax911/vanta) — the codebase - [`Dax911/vanta/README.md`](https://github.com/Dax911/vanta/blob/main/README.md) — chain parameters + roadmap - [Bitcoin Core v27.0 release notes](https://bitcoincore.org/en/releases/27.0/) — the fork base - [RISC Zero docs](https://dev.risczero.com) — the zkVM the proof system runs on - [Sparse Merkle trees: a brief overview](https://eprint.iacr.org/2016/683) — for the SMT root commitment - [Pedersen commitments, in production](/blog/pedersen_commitments_in_production/) — sister piece on commitment schemes --- # Poseidon, by hand and by code Canonical: https://blog.skill-issue.dev/blog/poseidon_by_hand_and_by_code/ Description: Why one of the cheapest hashes in zero-knowledge cryptography also has the strangest insides. 
Derive the S-box, count the constraints, and run a 30-line implementation in the browser.
Published: 2026-04-22T15:00:00.000Z
Tags: cryptography, poseidon, zk, snark, phd, math

import { Mermaid, Sandbox, TradeoffTable, Aside, Quote, RustPlayground } from "@/components/mdx";

A SHA-256 of "abc" inside a SNARK takes about 24,000 R1CS constraints. The same input through Poseidon — properly parameterised — takes about **250**. Two orders of magnitude. That ratio is the entire reason ZERA's [unified shielded pool](/blog/pedersen_commitments_in_production/) ships with consumer-grade UX in 2026. It's also the reason every modern ZK system you can name uses Poseidon, Rescue, or one of their cousins instead of something the cryptographic community has been beating on for twenty years. This post is the long answer to *why*.

## The problem with hashing inside a SNARK

A zero-knowledge SNARK proves you know a witness $w$ such that $C(w) = 0$ for some arithmetic circuit $C$ over a prime field $\mathbb{F}_p$. Every operation in $C$ becomes a constraint, and proof time scales roughly linearly with the number of constraints. The trouble with SHA-256 is that it was designed for CPU efficiency, not arithmetic-circuit efficiency. Its building blocks — XOR, AND, bitwise rotation — are *cheap on a CPU* and *catastrophically expensive in $\mathbb{F}_p$*. A single XOR over 32-bit words requires unpacking each word into 32 individual binary constraints, doing the XOR bit-by-bit, then packing back. SHA-256 has 64 rounds of mixing, and every round does several of these. The constraint cost looks roughly like:

$$
\text{cost}_{\text{SHA-256 in SNARK}} \approx 64 \times (k_{\text{xor}} + k_{\text{and}} + k_{\text{rot}}) \times w
$$

where $w = 32$ bits and the per-operation constants $k$ count constraints *per bit* — roughly one each for an XOR or AND once the words are unpacked, with rotations nearly free as wire relabelings. A dozen or so bit-operations per round puts you around 25k constraints for a 64-byte input — and that's *just the hash*. A real circuit has dozens of these per spend.
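These back-of-the-envelope counts are easy to sanity-check in a few lines. This is a sketch of the rough cost models only — the twelve-bit-operations-per-round figure for SHA-256 is an assumed ballpark, not a measured circuit:

```rust
// Rough constraint models. SHA-256: rounds × word width × ~1 constraint per
// bit-operation (XOR/AND) after bit decomposition; rotations are free.
fn sha256_estimate(rounds: u32, word_bits: u32, bit_ops_per_round: u32) -> u32 {
    rounds * word_bits * bit_ops_per_round
}

// Poseidon: each x^5 S-box costs 3 multiplications; MDS mixing is free.
// Full rounds S-box all t elements, partial rounds S-box one.
fn poseidon_constraints(r_f: u32, r_p: u32, t: u32) -> u32 {
    r_f * 3 * t + r_p * 3
}
```

With the BN254 parameters used later in the post ($R_F = 8$, $R_P = 57$, $t = 3$), `poseidon_constraints` lands on 243, against the ~25k SHA-256 estimate — the two-orders-of-magnitude gap the intro claims.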
This is the gap that hash-friendly arithmetisation closes.

## Poseidon's design: only field operations, all the way down

[Grassi, Khovratovich, Rechberger, Roy, and Schofnegger (2021)](https://eprint.iacr.org/2019/458) had a different idea: design the hash *natively in $\mathbb{F}_p$*. No bits. No bytes. Just field elements all the way down. Poseidon is a permutation-based sponge. The state is $t$ field elements — typically $t = 3$ for hashing two-to-one (input $|$ input $\to$ output) and $t = 5$ for absorbing three field elements at once. The permutation alternates two kinds of rounds:

- **Full rounds** apply an S-box to *every* state element, then mix.
- **Partial rounds** apply an S-box to *one* state element, then mix.

The S-box is the simplest possible non-linear function over a prime field:

$$
S(x) = x^\alpha
$$

with $\alpha$ chosen as the smallest exponent for which $\gcd(\alpha, p - 1) = 1$ (so the map is a bijection). For BN254 — the curve underlying most production ZK pairings, including the one ZERA's SDK uses — $p - 1$ is divisible by 2 and 3, so $\alpha = 5$ is the smallest legal exponent. Poseidon over BN254 ships with $\alpha = 5$.

The full permutation is:

<Mermaid chart={`graph TD
  AC1[+ round constants]
  AC1 --> SB1[S-box: x^5 on all elems]
  SB1 --> M1[MDS matrix mix]
  M1 --> N{round full or partial?}
  N -->|full| AC2[+ round constants]
  N -->|partial| AC3[+ round constants]
  AC2 --> SB2[S-box on all]
  AC3 --> SB3[S-box on first elem only]
  SB2 --> M2[MDS mix]
  SB3 --> M2
  M2 --> O[output state]`}/>

Three primitives, repeated $R_F + R_P$ times: **add round constants** ⊕ **S-box** ⊕ **MDS matrix multiplication**. That's the whole algorithm.

## Counting the constraints

This is where the order-of-magnitude advantage shows up. Each S-box is $x^5 = x^2 \cdot x^2 \cdot x$. In R1CS that's three multiplication constraints (one for $x^2$, one for $x^4 = x^2 \cdot x^2$, one for $x^4 \cdot x = x^5$).
The MDS matrix is a fixed $t \times t$ matrix of constants applied to the state — that's *free* in R1CS because constant multiplications fold into linear combinations and don't generate constraints. So per round:

$$ \text{cost}_{\text{full round}} = 3t, \quad \text{cost}_{\text{partial round}} = 3 $$

Recommended parameters for BN254 with $t = 3$ (hashing two field elements) are $R_F = 8$ full rounds and $R_P = 57$ partial rounds. Total constraint count:

$$ 8 \cdot (3 \cdot 3) + 57 \cdot 3 = 72 + 171 = 243 $$

**Two hundred and forty-three constraints.** For a hash of two field elements (~64 bytes of payload). SHA-256 was 24,000+ for a similar payload. That ratio — about 100× — is the entire ball game. The blast-radius column is doing real work. Poseidon's the one I'm comfortable shipping in [zera-sdk](/blog/zera_sdk_scaffolding/) right now. Rescue and Anemoi are interesting but the cryptanalysis hasn't caught up to the deployment.

## A 30-line Poseidon you can run in the browser

Here's a complete, working Poseidon-128 over BN254, written in TypeScript with `bigint` arithmetic. It's not optimised — production code uses Montgomery form, precomputes S-box squares, and uses constant-time field arithmetic — but it's correct and small enough to read in one sitting.

<Sandbox files={{
  "/index.ts": `
const P = 21888242871839275222246405745257275088548364400416034343698204186575808495617n; // BN254 scalar field
const T = 3;
const RF = 8;
const RP = 57;
// demo-grade round constants — production uses the Grain-LFSR-derived constants from the paper
const round_constants: bigint[] = Array.from({ length: (RF + RP) * T }, (_, i) =>
  BigInt(i + 1) * 3141592653589793238n % P
);
const mds: bigint[][] = [
  [2n, 3n, 1n],
  [1n, 5n, 1n],
  [5n, 7n, 1n],
];

function add(a: bigint, b: bigint) { return (a + b) % P; }
function mul(a: bigint, b: bigint) { return (a * b) % P; }
function pow5(x: bigint) { const x2 = mul(x, x); const x4 = mul(x2, x2); return mul(x4, x); }

function permute(state: bigint[]): bigint[] {
  let s = state.slice();
  let rcIdx = 0;
  const half = RF / 2;
  // first half of full rounds
  for (let r = 0; r < half; r++) {
    s = s.map((v) => add(v, round_constants[rcIdx++]));
    s = s.map(pow5);
    s = mds.map((row) => row.reduce((acc, m, j) => add(acc, mul(m, s[j])), 0n));
  }
  // partial rounds
  for (let r = 0; r < RP; r++) {
    s = s.map((v, i) => i === 0 ? add(v, round_constants[rcIdx++]) : v);
    rcIdx += T - 1; // skip constants for non-first elements
    s[0] = pow5(s[0]);
    s = mds.map((row) => row.reduce((acc, m, j) => add(acc, mul(m, s[j])), 0n));
  }
  // second half of full rounds
  for (let r = 0; r < half; r++) {
    s = s.map((v) => add(v, round_constants[rcIdx++]));
    s = s.map(pow5);
    s = mds.map((row) => row.reduce((acc, m, j) => add(acc, mul(m, s[j])), 0n));
  }
  return s;
}

export function poseidon2(left: bigint, right: bigint): bigint {
  const state = [0n, left % P, right % P];
  return permute(state)[0];
}

// demo: hash two field elements
const a = 13n;
const b = 27n;
const h = poseidon2(a, b);
const out = document.getElementById("out")!;
out.textContent = \`poseidon(\${a}, \${b}) = \${h}\`;
`,
  "/index.html": `<pre id="out">running...</pre>`,
}} />

The thing that's striking when you write this out is how *little* there is. A SHA-256 implementation is hundreds of lines of bit-twiddling. Poseidon is essentially: *add a constant, raise to the fifth power, multiply by a fixed matrix, repeat.*

## Why $\alpha = 5$ specifically

The S-box choice is the most-questioned part of Poseidon. Why not $\alpha = 3$? Or $\alpha = 7$? Two constraints:

1. **Bijection.** $x \mapsto x^\alpha$ is a permutation of $\mathbb{F}_p$ if and only if $\gcd(\alpha, p - 1) = 1$. For BN254, $p - 1 = 2^{28} \cdot 3 \cdot \text{(other stuff)}$, so $\alpha \in \{2, 3, 4\}$ all share a factor with $p - 1$ and produce non-bijective maps. The smallest $\alpha$ that works is **5**.
2. **Algebraic degree.** The whole point of the S-box is to introduce algebraic non-linearity that defeats interpolation attacks. Higher $\alpha$ → more non-linearity → fewer rounds needed. So you want $\alpha$ small enough to be cheap, large enough to need few rounds.

For curves where $\gcd(3, p-1) = 1$, the choice flips to $\alpha = 3$: each S-box costs two multiplications instead of three, but it's lower-degree, so you need more rounds to hit the same security margin. The trade-off is: cheaper per-round but more rounds.

## What I would change in a v2

Three things, if I were re-designing Poseidon for 2027:

1. **Drop the partial-rounds split.** The original design has 8 full + 57 partial rounds; the partial rounds save a lot of constraints but make security analysis harder. [Poseidon2](https://eprint.iacr.org/2023/323) (Grassi, Khovratovich, Roy 2023) keeps a similar structure with cleaner analysis. I'd ship Poseidon2 by default in a fresh deployment.
2.
**Make the MDS matrix circulant.** A circulant MDS — where each row is a rotation of the previous — has identical security properties but lets you exploit FFT-friendly arithmetic. Worth it on the prover side.
3. **Standardise the parameter file format.** Every implementation rolls its own format for round constants. The Circomlib JSON format works, but a CBOR or Cap'n Proto schema would let implementations cross-check parameters in a way that's currently per-vendor. I keep the Circomlib JSON in zera-sdk because compatibility, not because it's the right choice.

## Where this goes in production

Inside [zera-sdk](/blog/zera_sdk_scaffolding/) the Poseidon implementation is `crates/zera-sdk-core/src/hash/poseidon.rs`. It's about 200 lines of safe Rust, written against the [`ff` crate](https://crates.io/crates/ff) for field arithmetic, with the round constants loaded from a JSON file extracted from Circomlib for cross-implementation parity.

```rust
// Skeleton of the production Poseidon-128 over BN254 (in zera-sdk-core)
// The actual implementation imports the constants from a build.rs-emitted
// JSON; this is the structural shape.
use std::ops::{Add, Mul};

#[derive(Clone, Copy, Debug)]
struct Fp(u128); // toy stand-in; production uses ff::PrimeField

impl Add for Fp { type Output = Fp; fn add(self, o: Fp) -> Fp { Fp((self.0 + o.0) % MODULUS) } }
impl Mul for Fp { type Output = Fp; fn mul(self, o: Fp) -> Fp { Fp((self.0 * o.0) % MODULUS) } }

const MODULUS: u128 = 1_000_000_007; // toy
const T: usize = 3;
const RF: usize = 8;
const RP: usize = 57;

fn pow5(x: Fp) -> Fp { let x2 = x * x; let x4 = x2 * x2; x4 * x }

fn permute(mut state: [Fp; T], rc: &[Fp], mds: &[[Fp; T]; T]) -> [Fp; T] {
    let mut idx = 0;
    let half = RF / 2;
    for _ in 0..half {
        for s in state.iter_mut() { *s = *s + rc[idx]; idx += 1; }
        for s in state.iter_mut() { *s = pow5(*s); }
        state = mat_mul(mds, state);
    }
    for _ in 0..RP {
        state[0] = state[0] + rc[idx];
        idx += T;
        state[0] = pow5(state[0]);
        state = mat_mul(mds, state);
    }
    for _ in 0..half {
        for s in state.iter_mut() { *s = *s + rc[idx]; idx += 1; }
        for s in state.iter_mut() { *s = pow5(*s); }
        state = mat_mul(mds, state);
    }
    state
}

fn mat_mul(m: &[[Fp; T]; T], v: [Fp; T]) -> [Fp; T] {
    let mut out = [Fp(0); T];
    for i in 0..T {
        for j in 0..T { out[i] = out[i] + m[i][j] * v[j]; }
    }
    out
}

fn main() {
    println!("see crates/zera-sdk-core/src/hash/poseidon.rs for the real thing");
}
```

## Further reading

- [Poseidon: A New Hash Function for Zero-Knowledge Proof Systems](https://eprint.iacr.org/2019/458) — Grassi, Khovratovich, Rechberger, Roy, Schofnegger (USENIX Security 2021) — the original
- [Poseidon2: A Faster Version of the Poseidon Hash Function](https://eprint.iacr.org/2023/323) — Grassi, Khovratovich, Roy (2023) — what I'd ship in a v2
- [Anemoi: Exploiting the Link between Arithmetisation-Oriented and CCZ-Equivalent Symmetric Designs](https://eprint.iacr.org/2022/840) — Bouvier et al.
(2022) — the next-gen contender - [`Dax911/zera-sdk`](https://github.com/Dax911/zera-sdk) — production Rust implementation - [Pedersen commitments, in production](/blog/pedersen_commitments_in_production/) — sister piece on what we're hashing *to* (commitments) - [Nullifiers without the witchcraft](/blog/nullifiers_without_witchcraft/) — what we use Poseidon to derive (single-use nullifiers) - [Privacy's broadband moment](/blog/privacys_broadband_moment/) — why Poseidon is part of the four-curve crossing in 2026 --- # Stuck Sell, Post-Graduation: Fixing a Trapped-Funds Bug Without a Redeploy Canonical: https://blog.skill-issue.dev/blog/stuck_sell_post_grad/ Description: A graduated launchpad token left users unable to sell. Fix shipped without redeploying the program: a frontend conversion path that withdraws SPL, compresses, then sells through the AMM. Published: 2026-04-19T16:12:10.000Z Tags: zera, solana, light-protocol, compressed-tokens, bug-fix, launchpad The worst kind of bug in DeFi is the one where users can deposit but can't withdraw. ZeraSwap shipped one — quietly — for users holding bonding-curve positions on launchpad tokens that had already graduated. They couldn't sell. They could see the balance, the AMM page existed, the price was real, but every sell attempt was blocked by an `is_active` check on the wrong account. The fix landed at [`6eafc74` — `Fix stuck sell path for users on graduated launchpad tokens`](https://github.com/Dax911/z_trade/commit/6eafc742522038426443b2e77baaddd9fd9af77d) on 2026-04-19. No on-chain redeploy. The whole fix is a frontend orchestration on top of existing instructions. This post is about why the bug existed and why I chose not to fix it on-chain. ## The original sin: internal balances When you bought a token on the bonding curve, the launchpad program didn't mint compressed tokens to your wallet. It accumulated an `internal_balance` on a `UserPosition` PDA. 
This was deliberate — minting compressed tokens for every microcap pump.fun-style trade would have wrecked the cost calculus that makes compressed tokens viable in the first place. Internal balances are a single u64 update in a PDA. Compressed-token mints are a Light Protocol state-tree write. The latter is dramatically more expensive. The trade-off was: at graduation time, the launchpad would convert internal balances to real compressed tokens via a `withdraw_token` + `compress` flow. Anyone who held an internal balance up to that moment got the conversion for free. The bug: the launch program's `sell_token` is gated by `is_active`. After graduation, `is_active = false`. The intended sell path is the AMM. But the AMM expects you to hold real compressed tokens, and a small cohort of users still had `internal_balance > 0` because they hadn't traded since graduation — meaning the conversion never fired for them. > Post-graduation, `sell_token` is blocked by `is_active` check and AMM `swap_tokens_for_sol` burn fails because users hold internal `UserPosition.token_balance` rather than actual compressed tokens. ([6eafc74 commit message](https://github.com/Dax911/z_trade/commit/6eafc742522038426443b2e77baaddd9fd9af77d)) ## Two ways to fix it, picked the second **Option A: redeploy the launchpad program with a `force_convert_on_sell` branch.** This is the obvious fix. It's also the wrong fix. A program redeploy: - Costs me real SOL on mainnet. - Risks a regression on the entire 12-launch live ecosystem. - Requires every active client to re-fetch the IDL. - Can't be reversed cleanly. **Option B: a frontend-only conversion path.** This is what I shipped. Three steps, all using existing on-chain instructions: 1. Call the launchpad's existing `withdraw_token` instruction. It mints SPL tokens from the `internal_balance` to the user's ATA, creating the ATA if needed. 2. Call Light Protocol's `compress` to convert the SPL ATA balance into real compressed tokens. 3. 
Hand control to the existing AMM `swap_tokens_for_sol` flow, which now sees compressed tokens and works as designed. From the diff:

```ts
// sdk/src/launchpad_client.ts
async convertInternalTokensToCompressed(
  user: PublicKey,
  tokenMint: PublicKey,
  amount: bigint,
  compressed: CompressedTokenHelper,
): Promise<string[]> {
  const txSigs: string[] = [];

  // Step 0 (rare): create the Light token pool if it doesn't exist.
  // Has to be its own tx because compress ix build-time requires the pool.
  const poolRegistered = await compressed.isTokenPoolRegistered(tokenMint);
  if (!poolRegistered) {
    const createPoolIx = await compressed.buildCreateTokenPoolInstruction(user, tokenMint);
    txSigs.push(await this.buildAndSendTransaction([createPoolIx]));
  }

  // Atomic tx: ensure ATA, withdraw_token (mint SPL), compress (burn SPL → cToken).
  // (ata / ataInfo are resolved earlier in the diff; elided in this excerpt)
  const convertIxs: TransactionInstruction[] = [];
  if (!ataInfo) convertIxs.push(createAssociatedTokenAccountInstruction(...));
  convertIxs.push(await this.program.methods.withdrawToken(new BN(amount.toString())).accounts({...}).instruction());
  convertIxs.push(await compressed.buildCompressInstruction(user, tokenMint, amount, user, ata));
  txSigs.push(await this.buildAndSendTransaction(convertIxs));

  return txSigs;
}
```

## The compute budget gotcha

Light Protocol operations need more than the default 200K compute units. The same diff bumps every transaction the launchpad client builds:

```ts
// Light Protocol operations need more than the default 200K CUs
transaction.add(
  ComputeBudgetProgram.setComputeUnitLimit({ units: 400_000 }),
);
```

This is the kind of thing that's "obvious" once you've spent half a day staring at `Custom program error: Program failed to complete` logs and finally noticed the CU exhaustion in the simulation output. Mine that lesson once, write it down, never lose it again. ## The follow-up: SDK exports I shipped the conversion path before the SDK exports were in place, which broke the Cloudflare build.
Fix landed in [`4224352` — `Add compress/decompress helpers to CompressedTokenHelper`](https://github.com/Dax911/z_trade/commit/4224352b36723fd3e03c14a4d06e87452c1222d8) eight minutes after the parent commit. The `launchpad_client` was importing `compressed.isTokenPoolRegistered`, `compressed.buildCreateTokenPoolInstruction`, `compressed.buildCompressInstruction` — none of which I'd actually exported on the helper class. Eight minutes is not a flex. CF Pages caught what my local typecheck didn't because I'd checked into the repo without re-running the SDK build. The lesson: any commit that adds a new public method on the SDK has to re-build the SDK barrel. CI for that is on my TODO list. ## Trade-offs **Why not migrate every stuck user automatically with a cron?** Two reasons. First, signing transactions on behalf of users without their explicit click is a regulatory and security minefield. Second, "stuck" is a reversible state — a user *can* trigger the conversion themselves. Forcing it for them spends gas they may not want to spend if they're holding for a longer time horizon than I am. **Why not deprecate internal balances entirely?** Because they're the entire economic argument for the launchpad. Deprecating them means every microcap trade pays Light Protocol state-tree write costs, and the flatness of the bonding curve breaks. The internal-balance design is correct; the conversion path was just incomplete. **Why frontend instead of a relayer service?** Because a relayer service is another piece of infrastructure to operate, monitor, and pay for. The frontend conversion is exactly two transactions worst-case (create pool + atomic convert), entirely user-signed, and it requires zero new servers. ## What this taught me The cheapest fix is the one that doesn't touch on-chain code. If your design lets you compose a fix entirely out of existing instructions on the frontend, it should always win over a redeploy. 
The ZeraSwap design happened to be composable enough that the stuck-sell case had a cheap exit. That wasn't free — it cost me a `state_tree` field I'd been religious about [from the original AMM commit](/blog/zeraswap_compressed_amm/), and it cost me writing the `convertInternalTokensToCompressed` orchestration in the SDK. But it didn't cost a redeploy or a regression test marathon. The other thing this taught me: the moment you have a launchpad with a graduation flow, you have at least three "intermediate" account states that look broken to users. Document every one of them in the admin page's docs tab. I should have done this on day one of [the prediction markets sprint](/blog/prediction_markets_admin/). I did it on day 60, after a Discord ping with the words "I can't sell." ## Further reading - [The bug-fix commit](https://github.com/Dax911/z_trade/commit/6eafc742522038426443b2e77baaddd9fd9af77d) - [The SDK exports follow-up](https://github.com/Dax911/z_trade/commit/4224352b36723fd3e03c14a4d06e87452c1222d8) - [Light Protocol — compress/decompress instructions](https://www.lightprotocol.com/) - [ZeraSwap origin post](/blog/zeraswap_compressed_amm/) - [Solana Compute Budget Program](https://docs.solana.com/developing/programming-model/runtime#compute-budget) --- # Being CEO and still shipping code Canonical: https://blog.skill-issue.dev/blog/being_ceo_and_still_shipping_code/ Description: The CTO-vs-CEO false dichotomy, why I still review every PR that touches the SDK core, and how I use Claude Code plus an MCP server over my own writing to keep technical leverage as the company grows. Published: 2026-04-18T08:00:00.000Z Tags: founders, leadership, ai, mcp, narrative, engineering-culture The advice founders get most often, once you cross the line from "engineer with a side project" to "engineer with a company," is: stop coding. Hire a CTO. Spend your time on customers and capital. Trust your team. The advice is not entirely wrong. It's just incomplete. 
Almost all of it is written by founders whose product was a SaaS dashboard or a marketplace. The product I'm shipping is a cryptographic SDK that has to interoperate, byte-for-byte, with an on-chain Rust program that itself has to interoperate with a Groth16 verifier whose constraint system has to match the prover's circuit. *Stop coding* is a luxury available to founders whose product is forgiving. Mine isn't. So I write code. I review every PR that touches the SDK core. I do not draft a single line of marketing copy without first having shipped something the marketing copy is allowed to be about. And — the part this post is mostly about — I have built an AI tooling layer that lets me hold that posture without becoming the bottleneck. ## The false dichotomy The version of "stop coding" that most founders absorb is something like: every hour you spend on code is an hour you're not spending on customers, capital, or hiring. The math, framed that way, is brutal. Ten hours of code is ten hours of *not closing the next round.* So stop. The math is wrong because the variables don't trade off the way it implies. The hours are not fungible. *My* hours of code are not the same as the next senior engineer's hours of code, in either direction. They are slower, because I context-switch more. They are more strategic, because I see the entire surface. They are more expensive per hour, because mine are also the hours that close customers. They are more *load-bearing*, because the parts of the codebase I touch tend to be the parts that determine whether the system is correct. The right way to do the math is: which hours of code create disproportionately more leverage downstream? For me, those are: - Architectural calls on the SDK boundary. (One hour. Saves the team thirty hours of refactor in two months.) - Reading every PR that touches `zera-core`. (Fifteen minutes per PR. Catches a bug class that would otherwise reach production.) 
- Writing the canonical example file when a new module ships. (Two hours. Replaces a documentation effort that would otherwise be much longer and worse.) - Drafting the technical content the company says, in public, that it stands behind. ([Most of the blog](/blog).) Those four buckets, in aggregate, are maybe 8–12 hours per week. The rest of the week is the actual CEO job. The mistake is picking either *all-code* or *no-code*. The right answer is *load-bearing code only*, and being honest about which is which. ## What I stopped doing For the record — the things I had to give up: - I do not pick up tickets in the SDK that aren't on the load-bearing path. The team is faster at them than I am. - I do not write tests anymore. Test coverage is a habit; tests are a downstream artifact of the habit. The team writes them and writes them well. - I do not own DX polish. Error messages, log formatting, CLI affordances — all owned by people who care more about them than I do at the moment. - I do not do code review on the wallet, the AMM, or the medical demo unless someone explicitly asks. Each of those has an owner whose taste I trust. - I do not personally set up the CI pipeline. (This was a hard one to give up. Give it up anyway.) The pattern: I stopped doing the things that, if I disappeared for a week, the company would still ship correctly. I kept doing the things that, if I disappeared for a week, the company would ship something subtly wrong. ## How AI fills the gap The reason this posture is workable in 2026 and was not workable in, say, 2019, is that the personal leverage you can build on top of an AI coding workflow is *the* difference between a working CEO-IC schedule and one that quietly destroys the company. I run two patterns and I think they're both worth describing. ### Pattern 1: Claude Code as a senior pair Most of my SDK reviews now happen with Claude Code in the loop. I don't mean "I ask the AI if the PR looks good." 
I mean: I read the PR; I write a short prompt summarising what I think is happening and what I'm worried about; the model walks the rest of the codebase to either confirm or refute my worry; I make my call. The leverage isn't in *doing* the review faster. The leverage is in being able to express, in one paragraph, the part of the codebase I'm worried about, and have an agent that can read that paragraph plus the entire rest of the codebase faster than I can. The review I do at the end is the same review I would have done. The *context-loading* is what I outsourced, and context-loading was 80% of the time cost. This works because Claude Code is good enough now to be *boring*. I don't have to phrase things magically. I write the review like I'd write it to a senior colleague. The model reads the whole repo and comes back with the cross-references I need. That's it. That's the workflow. ### Pattern 2: an MCP server over my own writing This is the one I get asked about more. I have three or four years of writing on this blog. There is, in that archive, the answer to most questions a reasonable person might ask me — *what's your stance on X, what was the architecture decision on Y, what's your bullet point on supply-chain attacks for an investor deck.* The thing is, when someone asks me one of those questions, the answer that ends up in my mouth is the *recent* answer, the one I happened to be thinking about that morning. The two-year-old version of me, who probably had a smarter take, doesn't get to vote. So I'm building (`TODO: Dax confirm exact ship date — Q2 2026`) **lib.skill-issue.dev**, an MCP server over the entire archive of this blog plus a small pinned set of references. It's a Cloudflare Worker. It uses Vectorize for retrieval and Workers AI for embedding. It exposes three or four tools to any MCP-aware client: `search`, `fetch`, `summarise`, `cite`. Every Claude Code session I run can hit it. 
Every customer call where I want to find the version of a take I had eighteen months ago can hit it. The point of building this is not that the world needs another personal RAG system. The point is that the company is going to grow, and the founder's writing is the cheapest possible scaling mechanism for *what the founder thinks*. The MCP server is the API I built so my own context isn't bottlenecked on me being awake. Same MCP pattern that ships in [zera-sdk's mcp-server](/blog/zera_sdk_scaffolding/), incidentally. We dogfood our own thesis. ## The team thing A short word on the part that doesn't fit cleanly into either of the patterns above. Being a technical CEO who still ships code only works if the team understands the *boundaries*. The team has to know which parts of the codebase are mine to push back on (the SDK core, the cryptographic surface, the visible API) and which parts are theirs (everything else, increasingly). If I drift into reviewing CI changes, I am *taking work away from people who are good at it*, and the message I'm sending is "your work isn't really yours." That's poison. So I draw a hard line. The architecture diagram of the SDK has my fingerprints on it. The CI pipeline has someone else's. The wallet's UI has someone else's. The Solana program has the person who wrote it. *Mine to push back on* is a small list, deliberately. The list is also visible to the team — they know what it is. This was harder to learn than I'd like to admit. I went through a phase where I was reviewing too many PRs because reviewing felt productive. The team got slower, not faster. The fix was to delete most of my review notifications and explicitly hand ownership of three subsystems to three people. Speed went up. My satisfaction went down for about two weeks and then went up permanently. 
## The shape of a week If you're curious how this actually allocates: a representative week, give or take, is roughly: - 8–12 hours: load-bearing code (per the bullets above) - 6–8 hours: customer + investor calls - 4–6 hours: 1:1s and team async (Linear, GitHub, Slack) - 4–6 hours: hiring loop (sourcing, screens, panels) - 2–4 hours: writing — public posts, [founder letters](/blog/why_i_started_zera_labs/), customer briefs - The rest: triage, email, the long tail of things a CEO has to look at. I don't believe in the heroic 80-hour week, but I do believe in the *consistent* 50–55-hour week, and I think this allocation is roughly that. `TODO: Dax adjust if reality drifts.` ## Why this is the post I keep getting asked for Every founder I talk to who came up technical wants permission to keep coding. The permission is, of course, theirs to give themselves — but the framing in the broader founder canon (the [Hard Thing](https://www.amazon.com/Hard-Thing-About-Things-Building/dp/0062273205), [High Output Management](https://www.amazon.com/High-Output-Management-Andrew-Grove/dp/0679762884)) is mostly written for SaaS founders whose products do not require their hands. Cryptographic infrastructure is not SaaS. The product is correct or it is not. The CEO of a company shipping correctness *should* keep their hands in the codebase. The trick is the AI-augmented workflow that makes the math work — context-loading via Claude Code, archive search via your own MCP server, and a hard list of subsystems where you, personally, are the last review. That's the entire post. Keep coding. Build the AI scaffolding around yourself that lets you keep coding. Stay out of the parts of the codebase that aren't yours. Trust the team on those parts. Be loud about which parts are yours. ## Further reading - [Why I started Zera Labs](/blog/why_i_started_zera_labs/) — the strategic backdrop for any of this to be worth doing. 
- [Building the Zera SDK: Day One](/blog/zera_sdk_scaffolding/) — the canonical "load-bearing code only" session. - [Nuclear reactors taught me to ship software](/blog/nuclear_reactors_taught_me_to_ship/) — where the discipline to draw the boundaries came from. --- # btc-tunnel.sh: SSH-jumping into a remote bitcoind for swap testing Canonical: https://blog.skill-issue.dev/blog/vanta_btc_tunnel_dev_environment/ Description: Three small bash scripts wire the desktop dev environment to a real mainnet bitcoind for atomic-swap testing. Tunneling, RPC wrapping, and an address watcher with auto-reconnect — and why exposing 8332 to the internet is a worse idea than you think. Published: 2026-04-17T05:52:57.000Z Tags: vanta, bitcoin, ssh, tunnel, shell, devenv, rpc The atomic-swap CLI [I wrote about](/blog/vanta_swap_htlc_walkthrough/) needs two RPC endpoints: one for the VANTA chain (easy — there's a `vantad` running on the desktop dev box) and one for the BTC chain (less easy — I'm not running a full Bitcoin Core node on every laptop I develop on). The Bitcoin chain is real-money mainnet, and a Bitcoin Core full node is a 700+ GB and growing footprint that's too big to live on a developer machine in 2026. The answer that landed in commit [`e624a8e7`](https://github.com/Dax911/vanta/commit/e624a8e70) on 2026-04-17 — `desktop: scripts for BTC RPC tunnel + address watcher + rpc helper` — is three tiny shell scripts that forward a remote `bitcoind`'s RPC port to localhost over an SSH jump host, then wrap the JSON-RPC calls so the swap CLI can speak to a real mainnet node from any laptop. This post walks the three scripts, explains why they're written the way they are, and ends with an unkind paragraph about the alternative (just exposing port 8332 directly). 
## The scripts

There are three of them, all in [`vanta/vanta-desktop/scripts/`](https://github.com/Dax911/vanta/tree/main/vanta/vanta-desktop/scripts):

- [`btc-tunnel.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/scripts/btc-tunnel.sh) — set up / tear down / probe the SSH tunnel
- [`btc-rpc.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/scripts/btc-rpc.sh) — make a single JSON-RPC call to the tunneled node
- [`btc-watch.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/scripts/btc-watch.sh) — poll an address for state changes, with auto-reconnect

They are all `bash`, all `set -euo pipefail`, all under 100 lines. Code that looks like 1995. That's the right tool for what they do.

## btc-tunnel.sh: the SSH-jump forward

The architecture: my laptop sits on whatever Wi-Fi I'm on. The `bitcoind` runs at `10.0.1.89` on my home LAN. I can't reach `10.0.1.89` from a coffee shop. I can reach a public-facing jump host (a small VPS on a port I won't share publicly), and the jump host can reach the LAN. `ssh -J jump@public:port lan-host` does the routing. `ssh -L 8332:127.0.0.1:8332 lan-host` does the port forward. Combine them, daemonise the connection with `-f -N -M`, point a control socket at `/tmp/btc-tunnel.sock`, and you have a process you can `up`/`down`/`status` against from any shell. The full flag set, [from the script](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/scripts/btc-tunnel.sh):

```bash
ssh_args=(
  -o StrictHostKeyChecking=accept-new
  -o IdentitiesOnly=yes
  -o ServerAliveInterval=30
  -o ServerAliveCountMax=3
  -o ExitOnForwardFailure=yes
  -i "$BTC_TUNNEL_KEY"
  -J "${JUMP_USER}@${JUMP_HOST}:${JUMP_PORT}"
  -L "${BTC_LOCAL_PORT}:127.0.0.1:${BTC_TUNNEL_RPC_PORT}"
  -S "$SOCKET"
)
```

Five of these flags are load-bearing. Let me unpack them. **`StrictHostKeyChecking=accept-new`.** This is the "trust on first use" mode. It accepts a new host key on first connection but refuses any later mismatch.
The strict setting (`yes`) would require pre-populating `known_hosts`; the permissive setting (`no`) would silently accept any host key, including a man-in-the-middle's. `accept-new` is the right middle ground for an interactive dev tool.

**`IdentitiesOnly=yes`.** Tells SSH to use *only* the key passed in `-i`, not whatever else is in `~/.ssh/`. Without this, SSH will try every key in your agent, exhaust the server's `MaxAuthTries`, and fail with a confusing error.

**`ServerAliveInterval=30` + `ServerAliveCountMax=3`.** Keep-alive every 30 seconds, kill the connection after 3 missed responses. A residential ISP will silently drop idle connections; this keeps the tunnel up for hours of intermittent use.

**`ExitOnForwardFailure=yes`.** If the local port bind fails — say something else is on `:8332` already — exit immediately rather than maintaining a half-broken tunnel that can't actually carry traffic. The default behavior (silently keep the SSH connection up but not the forward) is a great way to spend twenty minutes wondering why your RPC calls hang.

**`-S "$SOCKET"`.** Control socket. Lets a *separate* `ssh` invocation send commands to the same connection (`-O check`, `-O exit`). This is what makes `is_up()` work without parsing `ps` output:

```bash
is_up() {
  ssh -S "$SOCKET" -O check "$BTC_TUNNEL_HOST" >/dev/null 2>&1
}
```

That's the whole "is the tunnel alive" check. SSH manages it; we just ask.
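With the tunnel up, anything on the laptop that can POST to `127.0.0.1:8332` can drive the node — which is the whole point, because the swap CLI is just another JSON-RPC client. A TypeScript sketch of that client side (mine, not from the vanta repo; `rpcCall` and the env-var handling are illustrative, but the envelope and basic-auth shape match what the scripts below send):

```typescript
// Hypothetical client-side counterpart to the shell wrappers: JSON-RPC 1.0
// over HTTP basic auth, pointed at the tunneled local port.
const RPC_URL = process.env.BTC_RPC_URL ?? "http://127.0.0.1:8332";
const RPC_USER = process.env.BTC_RPC_USER ?? "user";
const RPC_PASS = process.env.BTC_RPC_PASS ?? "pass";

function buildRpcBody(method: string, params: unknown[] = []): string {
  // bitcoind speaks JSON-RPC 1.0; the id is echoed back for request matching
  return JSON.stringify({ jsonrpc: "1.0", id: "cli", method, params });
}

function buildAuthHeader(user: string, pass: string): string {
  return "Basic " + Buffer.from(`${user}:${pass}`).toString("base64");
}

async function rpcCall<T>(method: string, params: unknown[] = []): Promise<T> {
  const res = await fetch(RPC_URL, {
    method: "POST",
    headers: {
      "content-type": "text/plain",
      authorization: buildAuthHeader(RPC_USER, RPC_PASS),
    },
    body: buildRpcBody(method, params),
  });
  const { result, error } = (await res.json()) as { result: T; error: unknown };
  if (error) throw new Error(`bitcoind RPC ${method}: ${JSON.stringify(error)}`);
  return result;
}

// usage, with the tunnel up:
// const info = await rpcCall<{ blocks: number }>("getblockchaininfo");
```

Nothing here knows about SSH at all — that's the property the tunnel buys: the transport problem and the RPC problem stay in separate layers.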
## btc-tunnel.sh: the RPC probe Once the tunnel is up, the `status` command goes one step further — it actually makes an RPC call through the tunnel to verify the remote `bitcoind` is reachable and synced: ```bash probe_rpc() { curl -s --max-time 5 --user "${BTC_RPC_USER}:${BTC_RPC_PASS}" \ --data-binary '{"jsonrpc":"1.0","id":"s","method":"getblockchaininfo","params":[]}' \ -H 'content-type:text/plain;' "http://127.0.0.1:${BTC_LOCAL_PORT}/" } ``` The output gets parsed by an inline Python one-liner that prints the chain, block height, header height, sync state, and verification progress in one terse line. Why Python and not `jq`? Because `jq` isn't preinstalled on a fresh macOS, and Python 3 is. Portability wins over elegance here. The `getblockchaininfo` RPC is the standard "is this node alive and what does it know" call. If it returns a coherent JSON body, the tunnel is end-to-end working. If it doesn't, you get a clear error and you know which layer to debug — the SSH connection (tunnel up but RPC dead) or the local port (tunnel down, no RPC at all). ## btc-rpc.sh: the one-shot wrapper This one is so small it fits in the post: ```bash wallet_scoped=0 if [ "${1:-}" = "--wallet" ]; then wallet_scoped=1 shift fi method="${1:?method required}" params="${2:-[]}" path="" [ "$wallet_scoped" = "1" ] && path="/wallet/${BTC_RPC_WALLET}" curl -s --user "${BTC_RPC_USER}:${BTC_RPC_PASS}" \ --data-binary "{\"jsonrpc\":\"1.0\",\"id\":\"cli\",\"method\":\"${method}\",\"params\":${params}}" \ -H 'content-type:text/plain;' \ "${BTC_RPC_URL}${path}" | python3 -m json.tool ``` The whole point is to give me a one-line shorthand for `bitcoind` debugging that doesn't require remembering the JSON-RPC envelope. 
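The envelope the wrapper assembles is worth seeing on its own. A Python sketch of the same construction (a hypothetical helper that mirrors the script's string interpolation):

```python
import json

def build_rpc_call(method, params="[]", wallet=None,
                   base_url="http://127.0.0.1:8332"):
    """Mirror btc-rpc.sh: the body is a JSON-RPC 1.0 envelope, and
    wallet-scoped calls get a /wallet/<name> path suffix
    (the Bitcoin Core >= 0.18 routing convention)."""
    path = "/wallet/%s" % wallet if wallet else ""
    body = ('{"jsonrpc":"1.0","id":"cli","method":"%s","params":%s}'
            % (method, params))
    return base_url + path, body

url, body = build_rpc_call("getbalance", wallet="swap")
assert url.endswith("/wallet/swap")
assert json.loads(body) == {"jsonrpc": "1.0", "id": "cli",
                            "method": "getbalance", "params": []}
```

Note that `params` is spliced in as raw JSON text, exactly as the bash version does; the caller is responsible for passing a valid JSON array.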
From any shell with the tunnel up: ```bash $ btc-rpc.sh getblockchaininfo $ btc-rpc.sh --wallet getbalance $ btc-rpc.sh --wallet getnewaddress '["swap-test","bech32"]' $ btc-rpc.sh getrawtransaction '["", true]' ``` The `--wallet` flag is the difference between core RPCs (chain state, mempool) and wallet-scoped RPCs (balance, send, sign). Bitcoin Core changed the RPC URL convention in v0.18 — wallet RPCs route to `/wallet/`, core RPCs route to `/`. The wrapper handles that distinction by setting `path` and concatenating it onto `BTC_RPC_URL`. The `python3 -m json.tool` at the end is a pretty-printer. Two seconds of latency on the JSON pretty-print is the right amount of overhead for terminal readability. ## btc-watch.sh: the address watcher This is the one I use most. When you're testing an HTLC, you fund a P2WSH output, broadcast the funding transaction, wait for it to confirm, then build the spending transaction. "Wait for it to confirm" is what `btc-watch.sh` automates: ```bash addr="${1:?address required — see usage in header}" log="${2:-/tmp/btc-watch.log}" last_state="" while true; do ensure_tunnel resp=$(rpc listunspent "[0, 9999999, [\"${addr}\"]]" || echo '{}') state=$(echo "$resp" | python3 -c " import sys,json try: r = json.load(sys.stdin).get('result') or [] if not r: print('EMPTY'); sys.exit() parts = ['%s|%.8f|%d' % (u['txid'], u['amount'], u['confirmations']) for u in r] print(';'.join(sorted(parts))) except Exception as e: print('ERR:'+str(e)) ") if [ "$state" != "$last_state" ]; then # log + display the change last_state="$state" fi sleep "$BTC_WATCH_INTERVAL" done ``` Three design choices in here that took longer than they should have to get right. **`listunspent` with `minconf=0`.** Includes mempool. The HTLC funding transaction shows up *first* in the mempool with `confirmations=0`, then gains confirmations as blocks are mined. You want to know about both states. 
The default `listunspent` arguments are `[1, 9999999]` (confirmed-only); we override with `[0, 9999999, [addr]]` to include mempool and filter by address. **State diffing.** The watcher prints when the state *changes*, not on every poll. Otherwise the log is unreadable. The state representation is `txid|amount|confirmations`, joined with `;` and sorted. Sorted because `listunspent` doesn't guarantee output order; without sorting, two consecutive polls of the same UTXO set could produce different state strings. **`ensure_tunnel`.** Before each RPC poll, check that the tunnel's still up. If it's not, try to bring it back up: ```bash ensure_tunnel() { if rpc getblockcount '[]' '' >/dev/null 2>&1; then return 0; fi log_line "rpc unreachable — attempting tunnel up" if [ -x "${HERE}/btc-tunnel.sh" ]; then "${HERE}/btc-tunnel.sh" up || log_line "tunnel up failed" else log_line "btc-tunnel.sh not found next to this script; cannot auto-reconnect" fi sleep 2 } ``` The script is supposed to run for hours during a long swap test. If my coffee shop's Wi-Fi drops and reconnects, the tunnel breaks. Without `ensure_tunnel`, the watcher would silently fail every 10 seconds. With it, the tunnel comes back up automatically and the polling resumes. The first time this saved me a swap test was the moment I knew the script was worth committing. ## On exposing 8332 directly > **WARNING:** Do not put port 8332 on the public internet. Do not put it on a "VPN-only" subnet that you can't audit. Do not assume rate-limiting at your router is enough. If you read tutorials online — I have, you have, we all have — you'll find advice that says "just expose your bitcoind RPC port through your router." This is bad advice, and I'm going to be direct about why. The bitcoind RPC exposes wallet operations behind HTTP basic auth. If `RPCPASSWORD` ever leaks (in a CI log, in a screenshot, in a `.env` file in a git history, in a commit message) the attacker has full access to your wallet. 
They can sign transactions. They can drain your funds. There is no "I locked my wallet" safety net here — the unlock is part of the same RPC and it accepts a passphrase that can also be brute-forced once the connection is open. Even with no wallet, the RPC exposes call patterns that can be used to fingerprint your node, drain its mempool data, and probe for vulnerabilities. Bitcoin Core has a hardening guide for a reason.

The SSH tunnel architecture solves all of this in one move. The RPC port is bound to `127.0.0.1` on the bitcoind host. The only path to it is over an authenticated SSH connection. The jump host doesn't see the RPC traffic — it sees only the encrypted SSH stream. Your laptop talks to the jump host using key-based auth (the `IdentitiesOnly` flag). The RPC password lives in `.env.local` and never leaves your machine.

If you don't have a jump host *available*, the second-best option is to run a separate `bitcoind` in `regtest` mode, just for the development workflow. Mainnet RPC should not be on a public IP. Ever.

## Pulling it together: a swap test

The end-to-end swap-test flow looks like this, on a fresh terminal:

```bash
# 1. Bring up the tunnel.
$ ./btc-tunnel.sh up
tunnel up: 127.0.0.1:8332 -> dax@10.0.1.89:8332

# 2. Sanity check.
$ ./btc-tunnel.sh status
forward: 127.0.0.1:8332 -> dax@10.0.1.89:8332
chain=main blocks=874632 headers=874632 synced=True progress=1.000000

# 3. Generate a fresh receiving address for the swap.
$ ./btc-rpc.sh --wallet getnewaddress '["swap-test","bech32"]'
"bc1qjh9pjnqs5486d08yg4aafdlphwl3rc6ls0lf7w"

# 4. Start watching the address in another window.
$ ./btc-watch.sh bc1qjh9pjnqs5486d08yg4aafdlphwl3rc6ls0lf7w &

# 5. Run the swap CLI from a third window.
$ vanta-swap participate --amount 0.001 --hash ...
```

The watcher logs the funding transaction the moment it hits the mempool, and again every time it gains a confirmation. The CLI broadcasts the spending transaction.
The watcher logs the spend. That whole loop takes about 30 seconds end to end on a happy path on mainnet. Without the tunnel scripts, it'd take twenty minutes per iteration of fumbling with `bitcoin-cli` arguments and SSH commands.

## What I changed my mind about

I wrote the first version of `btc-tunnel.sh` as a one-liner pasted into Notion. It worked. I copy-pasted it dozens of times before I realised that `ssh` would silently exit if the home host was momentarily unreachable, leaving me typing into a tunneled port that wasn't listening to anything. The version that ships does three things the one-liner didn't: it uses a control socket so `is_up` is reliable, it sets `ExitOnForwardFailure` so the script crashes loudly instead of locking up, and it has a `restart` subcommand because manually re-running `up` after a network drop is the kind of friction that makes you stop using the tool.

The general lesson — and this is the kind of thing I'd put in a "scripts I keep around" post — is that the dev tools you use *daily* deserve the same care you give production code. Not the same test coverage. But the same observability. A 60-line bash script with `set -euo pipefail`, control sockets, and a clear `status` mode is a different beast from a 60-line bash script that just `ssh`s and prays.
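As a coda to the state-diffing point above: the watcher's state-string computation as a standalone Python function, same logic as the inline `python3 -c` snippet, easier to test in isolation:

```python
def utxo_state(result):
    """Canonical state string for a listunspent result: sorted
    txid|amount|confirmations parts, ';'-joined. Sorting makes the
    string independent of listunspent's (unspecified) output order."""
    if not result:
        return "EMPTY"
    parts = ["%s|%.8f|%d" % (u["txid"], u["amount"], u["confirmations"])
             for u in result]
    return ";".join(sorted(parts))

a = [{"txid": "aa", "amount": 0.001, "confirmations": 0},
     {"txid": "bb", "amount": 0.002, "confirmations": 1}]
# The same UTXO set in a different order yields the same state string,
# so the watcher only logs genuine changes.
assert utxo_state(a) == utxo_state(list(reversed(a)))
assert utxo_state([]) == "EMPTY"
```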
## Further reading - [`vanta/vanta-desktop/scripts/btc-tunnel.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/scripts/btc-tunnel.sh) — the tunnel script - [`vanta/vanta-desktop/scripts/btc-rpc.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/scripts/btc-rpc.sh) — the RPC wrapper - [`vanta/vanta-desktop/scripts/btc-watch.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/scripts/btc-watch.sh) — the address watcher - [Bitcoin Core's `bitcoin.conf` reference](https://github.com/bitcoin/bitcoin/blob/master/doc/bitcoin-conf.md) — every flag in the default config - [BIP-199 by hand](/blog/vanta_swap_htlc_walkthrough/) — the swap CLI these scripts support - [Vanta Desktop: a Tauri wallet that ships its own full node](/blog/vanta_desktop_tauri_wallet/) — the dev workflow these scripts plug into --- # Block explorers for privacy chains: a Rust indexer for vanta Canonical: https://blog.skill-issue.dev/blog/vanta_explorer_rust_indexer/ Description: Patching btc-rpc-explorer got us to 'works.' Then we wrote vanta-explorer in Rust + React: an Axum backend, SQLite indexer, and a SPA that renders shielded transfers as opaque commitments without lying about what it knows. Published: 2026-04-13T17:34:24.000Z Tags: vanta, rust, explorer, axum, react, privacy When you're forking Bitcoin Core, you can't get away with not having a block explorer. People will ask you for one within hours of finding out the chain exists. So Vanta has had two: the [patched `btc-rpc-explorer`](https://github.com/Dax911/vanta/tree/main/explorer) (Node.js, the original "works in a weekend" answer) and the from-scratch [`vanta-explorer`](https://github.com/Dax911/vanta/tree/main/vanta-explorer) (Rust + React, the "actually models a privacy chain correctly" answer). This post is about how we got from one to the other. 
The interesting question isn't "how do you write an explorer" — that's well-trodden — it's "how do you write a *privacy* explorer that displays opaque commitments without misrepresenting what it knows."

## Phase one: patch btc-rpc-explorer

The first explorer was a 5-day patch on top of [`janoside/btc-rpc-explorer`](https://github.com/janoside/btc-rpc-explorer). The diff is in [`explorer/`](https://github.com/Dax911/vanta/tree/main/explorer) and the work was mostly: rename strings (`bitcoin` → `vanta` everywhere), swap currency labels (BTC → VANTA, sat → zat), point at `vantad` instead of `bitcoind`, fix the mining-template URL, and update the favicon. The 2026-04-13 commit log shows the rebrand pass:

```
de8efe0b explorer: rebrand patches zeracoin -> vanta
```

This explorer was Node.js, shipped a multi-megabyte `node_modules`, and rendered transactions as transparent UTXOs because that's what its templates were built for. Witness v2 commitments showed up in the UI as `value: 0.0` outputs of type `witness_unknown`, which is technically accurate but extremely useless. A user looking at our chain through this explorer saw transactions and concluded "all the value is in coinbase outputs." Wrong, *but the explorer wasn't lying* — it was just showing what its model of "transaction" knew how to show. The real value lived in commitments outside its model.

## Phase two: write a Rust explorer

I started [`vanta-explorer/`](https://github.com/Dax911/vanta/tree/main/vanta-explorer) on 2026-04-13 (`2db4e060 explorer: scaffold vanta-explorer (Rust backend + React web)`). The pitch was "an explorer that knows what a witness v2 commitment is, doesn't pretend transparent volume is the only volume, and gives me a tool I can extend without fighting a Node.js codebase that wasn't designed for it."
The shape: - **Backend** ([`vanta-explorer/backend`](https://github.com/Dax911/vanta/tree/main/vanta-explorer/backend)) — Rust, Axum 0.7, SQLite via `sqlx`, polls `vantad` and `vanta-node` on intervals. Serves `/api/*`. - **Web** ([`vanta-explorer/web`](https://github.com/Dax911/vanta/tree/main/vanta-explorer/web)) — React + Vite + Tailwind. Recharts for hashrate/throughput. SPA with React Router. Runs as static assets served by the Axum backend on the same port. - **Indexer modules**: `l1_poller`, `l2_poller`, `mempool_poller`, `pool_poller`. Each is a tokio task that pulls its source on a fixed interval and writes to SQLite. - **API modules**: `blocks`, `tx`, `address`, `mempool`, `network`, `pool`, `proofs`, `anon`, `l2`, `search`, `tip`, `sse`. Each is its own axum router section. The full backend `Cargo.toml`: ```toml [dependencies] axum = { version = "0.7", features = ["macros"] } tokio = { version = "1", features = ["full"] } tower-http = { version = "0.6", features = ["cors", "trace", "compression-br", "fs"] } sqlx = { version = "0.8", features = ["runtime-tokio", "sqlite", "macros", "migrate", "chrono"] } reqwest = { version = "0.12", features = ["json", "rustls-tls"], default-features = false } serde = { version = "1", features = ["derive"] } chrono = { version = "0.4", features = ["serde"] } async-stream = "0.3" ``` Reqwest with `rustls-tls` to skip the OpenSSL dependency. SQLite with chrono support so timestamps are `DateTime` end-to-end. `async-stream` for SSE — we ship server-sent events for live tip updates so the explorer's homepage updates within a second of a new block. The startup is the canonical four-task pattern: ```rust indexer::spawn_all(state.clone()); let app = api::router(state); let listener = TcpListener::bind(&bind_addr).await?; tokio::select! 
{ res = axum::serve(listener, app) => { res?; } _ = shutdown_signal() => { info!("shutdown requested, exiting"); std::process::exit(0); } } ``` The `std::process::exit(0)` on shutdown is a deliberate cheat. Background pollers and SSE streams are infinite loops; the tokio runtime drop blocks waiting for them to finish, which they never do. Calling exit explicitly when the user hits Ctrl-C makes the explorer shut down in milliseconds instead of however long the runtime decides to wait. Not pretty; it works. ## How shielded transfers are rendered Here's the part I want to be precise about. From the [executive-summary paper](https://github.com/Dax911/vanta/blob/main/papers/01-executive-summary.md): > Every non-coinbase witness v2 output carries `nValue = 0` on L1; the real amount lives inside the note commitment preimage and is never observable on the public ledger. The explorer can read every transaction in every block, but it cannot read amounts on shielded outputs. That's the whole point. So the question for a privacy-explorer designer is: *what does the user see?* Three options were on the table. **Option A: lie.** Pretend the output value is what `getrawtransaction` returns (zero) and label it "0 VANTA." Technically accurate, deeply misleading. **Option B: hide.** Don't show shielded transactions at all. Filter them out of the block view. Cowardly; users can read the raw RPC and see them. **Option C: render the commitment as the artefact.** Show the transaction. Show its inputs and outputs as opaque commitments — 32-byte hex strings that are *what the chain knows*. Show that the proof verified. Don't pretend to know more than that. We picked C. The 2026-04-16 commit `c912fc04 explorer: ZK transfers first-class + genesis scan + proof verification` is when this work landed. 
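Option C boils down to one branch in the output renderer. A toy Python model of that decision (field names are illustrative, not the real API shape):

```python
def render_output(out):
    """Decide what the explorer can honestly display for one output.
    A shielded output's only public artefact is its 32-byte commitment,
    so that's what gets rendered, never a misleading '0.0 VANTA'."""
    if out.get("type") == "witness_v2":
        return {"display": "commitment",
                "commitment": out["commitment"],
                "value": None}  # value is unknowable from chain data
    return {"display": "transparent", "value": out["value"]}

shielded = {"type": "witness_v2", "commitment": "ab" * 32}
row = render_output(shielded)
assert row["value"] is None and row["commitment"] == "ab" * 32
assert render_output({"type": "pubkeyhash", "value": 1.5})["value"] == 1.5
```

The point of the `value: None` is that the UI renders "shielded" rather than "0.0", which is exactly the misrepresentation Option A would commit.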
The `proofs` API endpoint pulls from `vanta-node`'s 500-slot proof event ring buffer (the one the [zkvm engineering paper](https://github.com/Dax911/vanta/blob/main/papers/17-zkvm-engineering.md) describes) and the explorer renders each verified proof with the public-input slots: SMT root, input commitments, nullifiers, output commitments, signed `value_balance`. A user looking at a shielded transaction sees: - the transaction's L1 outputs (mostly `nValue = 0` witness v2 commitments + maybe an OP_RETURN anchor) - a "ZK proof verified" badge - the public inputs from the proof, byte-accurately - the SMT root the proof was verified against - the nullifier (so they can confirm the spend isn't replayed) That's the whole story the chain has for that transaction. The explorer isn't hiding anything; it's rendering the right artefact. ## The L2 poller [`indexer/l2_poller.rs`](https://github.com/Dax911/vanta/tree/main/vanta-explorer/backend/src/indexer) is the module that talks to `vanta-node` instead of `vantad`. It polls the L2 sidecar's REST API on a configurable interval and pulls: - `/status` for SMT root + commitment count + nullifier count - `/proofs/recent` for the proof event ring buffer - per-commitment lookup as the explorer's UI deep-links into specific notes The explorer never tries to *decrypt* notes. The encrypted-note inbox at `vanta-node` is for wallets, not for explorers — only the recipient's secret key can decrypt. The explorer's job is to render the public artefacts and link them. Pool stats come from the `pool_poller` against the public-pool's NestJS API (the 2026-04-13 commit `dbe62058 explorer: map real public-pool NestJS shape for /api/pool` is when that contract got nailed down). The explorer's pool page shows aggregate hashrate, recent shares, and recent block finds — it's a separate data source from L1 because the pool tracks shares and miners, not chain state. 
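The poller pattern itself (fetch a source on an interval, upsert the latest snapshot into SQLite) is small enough to model. A Python sketch; the real code is a tokio task using `sqlx`, and `fetch_status` here is a stand-in:

```python
import sqlite3

def fetch_status():
    # Stand-in for an HTTP GET against the L2 sidecar's /status endpoint.
    return {"smt_root": "00" * 32, "commitments": 42, "nullifiers": 7}

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE l2_status ("
           " id INTEGER PRIMARY KEY CHECK (id = 1),"
           " smt_root TEXT, commitments INT, nullifiers INT)")

def poll_once():
    """One tick of the poll loop: upsert the singleton status row."""
    s = fetch_status()
    db.execute(
        "INSERT INTO l2_status VALUES (1, ?, ?, ?)"
        " ON CONFLICT(id) DO UPDATE SET"
        "   smt_root = excluded.smt_root,"
        "   commitments = excluded.commitments,"
        "   nullifiers = excluded.nullifiers",
        (s["smt_root"], s["commitments"], s["nullifiers"]))
    db.commit()

poll_once()
poll_once()  # idempotent: still one row, holding the latest values
assert db.execute(
    "SELECT COUNT(*), MAX(commitments) FROM l2_status").fetchone() == (1, 42)
```

In the real indexer each module (`l1_poller`, `l2_poller`, `mempool_poller`, `pool_poller`) is this loop with its own fetcher, table, and interval.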
## The polish pass

A bunch of small commits in mid-April were polish:

- `6c374159 explorer: populate l1_txs + real TxDetail` — moved transaction-detail rendering from a placeholder to actual chain data
- `30fe0a04 explorer: persist pool metrics + historic hashrate chart` — historic hashrate via SQLite-backed time-series
- `600d2a03 explorer: code-split recharts via React.lazy` — Recharts is large; lazy-load it so the homepage stays fast
- `96333d42 explorer: client-side Merkle verify tool` — let users paste a transaction id and a Merkle root and verify inclusion locally, without the explorer
- `22698e6e explorer: phase 9 polish + fast backend shutdown` — the `std::process::exit` shutdown trick above

Each of these is a half-day of work. Explorer work is *eternal* polish — there's always one more chart, one more endpoint, one more responsive-layout tweak. I'm choosing to call it done at "users can navigate from a transaction to its proof to its L2 commitment to its receiving address."

## What I would do differently

1. **Don't start with the patched explorer at all.** It got us to "we have an explorer" in under a week, which mattered for the launch story. But the eventual full rewrite was inevitable. If I were doing this again I'd skip phase one and accept a one-week longer runway to launch.
2. **Push more shaping to the client.** The backend still shapes a lot of page-specific data server-side. A more aggressive split (the server is *only* a thin API, the client does *all* of the shaping) would simplify the backend further. The current setup is fine; it could be cleaner.
3. **Move the SQLite into a real time-series database.** SQLite is lovely for the indexed transactional data, but pool metrics + historic hashrate + mempool depth want a TSDB-shaped store (downsampling, retention policies, etc.). On the list, not urgent.
## What I changed my mind about I started building this thinking the privacy aspect would be the hardest part — that getting the UI to render commitments correctly without leaking would be a design conversation. It wasn't. The hardest part was the *boring stuff*: making the SQLite indexer fast enough to keep up with 1-minute blocks while also catching up from a cold start; making React Router not lose its mind when a deep-link lands on a page whose data isn't loaded yet; making the homepage's hashrate chart not janky. The privacy rendering, once we'd decided on Option C, was code. The rest of the explorer is the kind of work that explorers always are. ## Further reading - [`vanta-explorer/backend`](https://github.com/Dax911/vanta/tree/main/vanta-explorer/backend) — the Rust + Axum + SQLite indexer - [`vanta-explorer/web`](https://github.com/Dax911/vanta/tree/main/vanta-explorer/web) — the React + Vite SPA - [`explorer/`](https://github.com/Dax911/vanta/tree/main/explorer) — the original patched btc-rpc-explorer - [Vanta: a Bitcoin fork with ZK at consensus](/blog/vanta_zk_privacy_l1/) — the chain - [The vanta sidecar architecture](/blog/vanta_sidecar_architecture/) — the L2 the explorer reads from - [`janoside/btc-rpc-explorer`](https://github.com/janoside/btc-rpc-explorer) — the upstream we forked for phase 1 --- # iroh in production: encrypted-note gossip on a 1-minute-block chain Canonical: https://blog.skill-issue.dev/blog/vanta_iroh_gossip_in_production/ Description: Why vanta-node uses iroh-gossip for L2 P2P instead of libp2p, what the topic + ALPN setup actually looks like, the GossipMessage shape, and the saturating-decrement bug that taught me an event ordering lesson. 
Published: 2026-04-13T17:46:02.000Z Tags: vanta, iroh, p2p, gossip, rust, quic, l2 import { Mermaid, Sandbox, TradeoffTable, Quote, Aside } from "@/components/mdx"; The L2 sidecar [I wrote about previously](/blog/vanta_sidecar_architecture/) has four jobs: watch L1, serve a REST API, snapshot state, and gossip with peers. The first three are well-trod tokio-task territory. The fourth is the one that actually matters for L2 decentralisation, because if every peer has to fetch encrypted notes from one REST server, that REST server is a centralisation point, and the privacy chain isn't really a privacy chain. This post is the deep dive on the gossip layer specifically. The transport is [iroh.computer](https://iroh.computer) — a pure-Rust QUIC stack with an opinionated NAT-traversal story and a built-in gossip protocol that does most of what we need. The integration code lives in [`vanta/vanta-node/src/gossip.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-node/src/gossip.rs), which is where I'd point you to read first if you want the unvarnished version. ## Why iroh The architecture doc puts the rationale tersely. From [`doc/vanta-architecture.md`](https://github.com/Dax911/vanta/blob/main/doc/vanta-architecture.md): **P2P:** iroh.computer — pure Rust, QUIC-based, NAT traversal, gossip protocol, content-addressed blobs. Chosen over libp2p for simplicity, built-in QUIC + NAT hole-punching, and document sync (useful for offline branch-and-merge). That's the polite version. Let me unpack it with a tradeoff table that does *not* pull punches. The "configuration tax" point is the one I want to underline. libp2p is in principle the right answer; we used it on an earlier prototype. The problem was that *every* libp2p deployment is a snowflake — yamux vs mplex, noise vs tls, mdns vs static seeds, gossipsub v1.0 vs v1.1 — and getting two different libp2p deployments to talk *predictably* across a real residential-NAT network was a recurring time sink. 
iroh ships an opinionated default. There is one transport (QUIC), one ALPN per protocol, and one NAT-traversal story (n0-relay-assisted hole-punching). When it works it works the same way every time. When it fails, the failure modes are bounded and documented.

## Topology

The Vanta L2 gossip topology is one topic per chain, with content-addressed blob references for any payload that's too big for the gossip message-size limit (we cap at 64 KB per message, which is enough headroom for a single encrypted note plus headers).

<Mermaid chart={`graph LR
  subgraph topic["Topic: Vanta/L2/Gossip/v1"]
    P1[vanta-node #1<br/>Berkeley, CA]
    P2[vanta-node #2<br/>Berlin]
    P3[vanta-node #3<br/>Tokyo]
    P4[Wallet's embedded<br/>vanta-node]
  end
  P1 <-->|encrypted notes<br/>+ commitments<br/>+ nullifiers| P2
  P2 <-->|gossip| P3
  P3 <-->|gossip| P4
  P1 <-->|hole-punched| P4
  R[(N0 relay)]
  P1 -. relay if<br/>direct fails .-> R
  P4 -. relay if<br/>direct fails .-> R`}/>

Every node joins the same topic. Every message broadcast on that topic ends up at every other peer (eventually — this is gossip, not multicast, so it's `O(log N)` hops in expectation). The N0 relays are a fallback for peers behind symmetric NATs or other hole-punching-resistant boundaries; once a direct path is found, the relay drops out.

The topic is a SHA-256 of a fixed string in [`gossip.rs:42`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-node/src/gossip.rs):

```rust
fn vanta_topic() -> TopicId {
    use sha2::{Sha256, Digest};
    let mut hasher = Sha256::new();
    hasher.update(b"Vanta/L2/Gossip/v1");
    let hash = hasher.finalize();
    let mut bytes = [0u8; 32];
    bytes.copy_from_slice(&hash);
    TopicId::from_bytes(bytes)
}
```

`Vanta/L2/Gossip/v1`. The `v1` is intentional: when we ship a breaking change to the message format, we'll bump to `v2` and the two networks will simply not see each other. That's the cleanest cross-version migration story we have, and it's a single-line change.

## The message shape

Three message kinds, all bincode-serialised:

```rust
#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum GossipMessage {
    NewCommitment { commitment: Hash },
    NullifierRevealed { nullifier: Hash },
    EncryptedNote(EncryptedNote),
}
```

`Hash` is a 32-byte alias from `vanta_core`. `EncryptedNote` is an opaque ciphertext blob plus a recipient hint that wallets use to do trial-decryption. The ciphertext is encrypted-to-recipient-pubkey using the same envelope scheme [described in the nullifier-set post](/blog/vanta_l1_nullifier_set/) — `vanta-node` cannot decrypt a note even if it tries.

The relevant point is what's *not* here. There's no "request-response" message. There's no "inventory" or "bloom filter" or pull-based sync. iroh-gossip is broadcast-only; if a peer joins late, they catch up via the L1 watcher (which scans block history) and then receive new state via gossip going forward.
Decoupling history-sync from real-time-sync is a simplification: gossip is *always* real-time, history is *always* re-derived from L1. ## The send path Three small fan-out helpers, one private send method, in [`gossip.rs:53`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-node/src/gossip.rs): ```rust impl GossipHandle { pub async fn broadcast_commitment(&self, commitment: Hash) -> Result<()> { let msg = GossipMessage::NewCommitment { commitment }; self.broadcast(&msg).await } pub async fn broadcast_nullifier(&self, nullifier: Hash) -> Result<()> { let msg = GossipMessage::NullifierRevealed { nullifier }; self.broadcast(&msg).await } pub async fn broadcast_encrypted_note(&self, enc: EncryptedNote) -> Result<()> { let msg = GossipMessage::EncryptedNote(enc); self.broadcast(&msg).await } async fn broadcast(&self, msg: &GossipMessage) -> Result<()> { let bytes = bincode::serialize(msg)?; self.sender.broadcast(Bytes::from(bytes)).await?; Ok(()) } } ``` The `GossipHandle` is `Clone` and gets passed to the API server, the L1 watcher, and the swap module. Whoever has the handle can broadcast. The handle is a wrapper around `iroh_gossip::api::GossipSender`, which is a tokio-friendly mpsc-style channel into iroh's outbound queue. `bincode::serialize` is fine here because the message types are all simple plain-old-data with no `#[serde(skip)]` or recursion. The deserialization path (next section) is where the gotchas live. ## The receive path `start()` in [`gossip.rs:88`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-node/src/gossip.rs) is the function that brings up the whole gossip stack. It does five things: 1. Build an `Endpoint` with the `presets::N0` relay configuration. 2. Spawn a `Gossip` instance with a 64 KB max-message-size. 3. Wire a `Router` that accepts inbound gossip connections on the gossip ALPN. 4. Subscribe to the Vanta topic with the user's bootstrap peer list. 5. Spawn a tokio task to drain the inbound stream into `apply_gossip_message`. 
```rust
let endpoint = Endpoint::builder(presets::N0)
    .bind()
    .await?;

let gossip = Gossip::builder()
    .max_message_size(65536)
    .spawn(endpoint.clone());

let router = Router::builder(endpoint.clone())
    .accept(GOSSIP_ALPN, gossip.clone())
    .spawn();

let topic_id = vanta_topic();
let topic = gossip.subscribe(topic_id, peer_ids).await?;
let (sender, mut receiver) = topic.split();
```

The `accept(GOSSIP_ALPN, gossip.clone())` call is what tells the router "any inbound QUIC connection that negotiates the gossip ALPN gets handed to this Gossip instance." iroh multiplexes multiple protocols on one endpoint; today we only run gossip, but the same router could in principle accept blob-sync or document-sync ALPNs.

The receive loop calls `receiver.try_next()` in a tight loop and dispatches each event. There are three event types we care about:

```rust
async fn handle_gossip_event(
    state: &L2State,
    peer_counter: &std::sync::Arc<std::sync::atomic::AtomicUsize>,
    event: iroh_gossip::api::Event,
) {
    use std::sync::atomic::Ordering;
    match event {
        iroh_gossip::api::Event::Received(message) => {
            match bincode::deserialize::<GossipMessage>(&message.content) {
                Ok(msg) => apply_gossip_message(state, msg),
                Err(e) => {
                    tracing::debug!("Failed to deserialize gossip message: {e}");
                }
            }
        }
        iroh_gossip::api::Event::NeighborUp(peer_id) => {
            let n = peer_counter.fetch_add(1, Ordering::Relaxed) + 1;
            tracing::info!("Gossip peer joined: {} (now {n})", peer_id);
        }
        iroh_gossip::api::Event::NeighborDown(peer_id) => {
            peer_counter
                .fetch_update(Ordering::Relaxed, Ordering::Relaxed, |v| {
                    Some(v.saturating_sub(1))
                })
                .ok();
            let n = peer_counter.load(Ordering::Relaxed);
            tracing::info!("Gossip peer left: {} (now {n})", peer_id);
        }
        _ => {}
    }
}
```

The `_ => {}` is loud silence: iroh's Event enum has more variants than we care about (relay-state changes, lurker-mode signals) and we explicitly ignore them.

## The saturating-decrement gotcha

The first version of `NeighborDown` was `peer_counter.fetch_sub(1, Ordering::Relaxed)`.
In a happy path this was fine — every NeighborUp pairs with exactly one NeighborDown, the counter goes up and down, and `/status` shows the right number. In the actual iroh deployment, NeighborDown can fire without a corresponding NeighborUp ever having been observed. (Reasons: the event stream can drop messages under backpressure; a peer can be "down" from this node's perspective before this node has joined the topic enough to consider them "up.") The bug surfaced as `/status` returning `peer_count: 18446744073709551614`. I had wrapped from 0 → `usize::MAX - 1`. Counting backwards in unsigned arithmetic is a strict no. The fix is the `fetch_update` + `saturating_sub` pattern in the snippet above. It's slower than a single atomic op (it's a CAS loop) but it's load-bearingly correct: the counter never goes negative, and on the rare double-down-without-up the counter just stays at its current value. This is the kind of thing you don't notice until production. **TODO: Dax confirm we want to ship `peer_count` over `/status` as a `u32` and saturate there too** — even with the in-memory fix, a 64-bit counter shipped to a frontend could in principle overflow JavaScript's `Number.MAX_SAFE_INTEGER` if something ever went really wrong upstream. ## A toy iroh-shape demo We can't actually run iroh in a Sandbox — iroh isn't WASM-portable, and it wants real UDP sockets. But we *can* simulate the message-flow shape in plain Node, which is sometimes useful for understanding the topology when you read the Rust code. 
<Sandbox
  files={{
    "/index.js": `class Peer {
  constructor(name) {
    this.name = name;
    this.peers = [];
    this.seen = new Set();
  }
  connect(other) {
    this.peers.push(other);
    other.peers.push(this);
  }
  broadcast(msg) {
    this.seen.add(msg.id);
    for (const p of this.peers) {
      if (!p.seen.has(msg.id)) {
        p.seen.add(msg.id);
        console.log(\`  \${p.name}: \${msg.kind} \${msg.id}\`);
        p.broadcast(msg); // recursive flood — gossip is O(log N) in practice
      }
    }
  }
}

const a = new Peer("A");
const b = new Peer("B");
const c = new Peer("C");
a.connect(b);
b.connect(c); // A is not directly connected to C

console.log("A broadcasts NewCommitment(0xdeadbeef)");
a.broadcast({ id: "0xdeadbeef", kind: "NewCommitment" });

console.log("\\nC broadcasts EncryptedNote(0xcafebabe)");
c.broadcast({ id: "0xcafebabe", kind: "EncryptedNote" });

console.log("\\nFinal seen sets:");
for (const p of [a, b, c]) {
  console.log(\`  \${p.name}: [\${[...p.seen].join(", ")}]\`);
}
`,
    "/package.json": `{
  "name": "iroh-shape-demo",
  "version": "1.0.0",
  "main": "index.js"
}
`,
  }}
/>

This is the *shape* of gossip flooding. iroh's actual implementation uses HyParView + Plumtree — more sophisticated, with eager-push trees and lazy-pull repair — but the user-facing semantic is the same: broadcast on a topic, every peer eventually sees the message, exactly once.

## Encrypted notes specifically

The largest message type, `EncryptedNote`, is what wallets actually consume. The flow is:

1. Sender's wallet generates a shielded transaction. Part of the witness is an encrypted ciphertext addressed to the recipient's pubkey.
2. Sender's `vanta-node` (via the desktop app) calls `broadcast_encrypted_note(ciphertext)`.
3. iroh-gossip floods every peer in the topic. Every L2 node — including the recipient's — has the ciphertext in memory.
4. The recipient's wallet calls `/notes/scan` against its local `vanta-node`, which trial-decrypts every recently-seen ciphertext against the wallet's secret key.
5. If a trial decryption succeeds, the wallet has detected a payment.

There is no "addressing." There is no "the recipient asks for their notes." Every peer has every note. Each peer's wallet decides which notes are theirs by trying to decrypt.
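Trial decryption is easy to model because authenticated encryption fails loudly under the wrong key: "is this note mine" is exactly "does decryption succeed". A Python sketch with a toy authenticated scheme (illustrative only, not Vanta's real envelope):

```python
import hashlib
import hmac
import os

# Toy authenticated encryption: a SHA-256-derived XOR stream plus an
# HMAC tag. Stands in for the real encrypt-to-recipient-pubkey envelope.
def seal(key, plaintext):
    nonce = os.urandom(16)
    stream = hashlib.sha256(key + nonce).digest()
    ct = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(key, nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag

def open_note(key, blob):
    nonce, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(key, nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None  # wrong key: MAC check fails, note is not ours
    stream = hashlib.sha256(key + nonce).digest()
    return bytes(c ^ s for c, s in zip(ct, stream))

alice, bob = b"alice-secret", b"bob-secret"
notes = [seal(alice, b"5 VANTA"), seal(bob, b"3 VANTA")]
# Bob scans every gossiped ciphertext; only his own decrypts.
mine = [pt for pt in (open_note(bob, n) for n in notes) if pt is not None]
assert mine == [b"3 VANTA"]
```

The gossip layer never needs to know who a note is for; the "address" is the ability to produce a valid decryption.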
This is the same architectural pattern Zcash sapling uses — a public ciphertext stream with private addressability — and it's why the gossip layer can be totally untrusted: peers see ciphertexts, recipients see plaintext. ## What's not in this implementation A few things to flag, both for honesty and for the next person to read this. **No gossip-layer backpressure.** If a peer publishes 10,000 encrypted notes in a second, every other peer's tokio task for the receive loop has to deserialize all of them. There's no rate limit, no back-off, no "too many pending events" exception. This is fine on a 1-minute-block chain where the pool's submission rate is bounded, but it would be a real problem on a 250 ms-block chain. **No peer reputation.** Every peer is equal. A misbehaving peer (sending malformed messages, spamming) is just ignored on a per-message basis. We don't disconnect them, ban them, or de-prefer them in routing. iroh has the primitives (`endpoint.close_peer`) but we don't use them. **No persistence across restarts.** When `vanta-node` restarts, it forgets every peer it had ever seen and re-bootstraps from the static seed list. This costs ~2 seconds on warm starts. The L1 watcher catches state up from chain history regardless, so this isn't a correctness concern, but a peer cache would shave the startup window. **No multi-topic.** All Vanta nodes are on one topic. We'll need at least mainnet/testnet split when there's a testnet to speak of; right now the topic is `Vanta/L2/Gossip/v1` and that's literally the only topic that exists. **TODO: Dax confirm we add `Vanta/L2/Gossip/regtest` when the regtest deploy lands.** ## What I changed my mind about I'd been libp2p-curious for a long time. The crate is mature, it's used by IPFS and Polkadot, the docs are pretty good. I started the Vanta L2 with a libp2p prototype and it worked. Two things made me switch. 
**The configuration burden is per-developer.** Every new contributor who touches `vanta-node` would need to internalise the libp2p configuration matrix (or worse: would copy-paste it from somewhere and not understand what they were copying). iroh's `presets::N0` is a single import. The cognitive load is bounded. **NAT traversal is solved-default.** libp2p's NAT traversal is a la carte: configure DCUtR, configure STUN, configure relays. iroh's is built in. On a privacy chain whose users include anyone with a residential ISP, NAT traversal is not optional and the failure mode (peer can't be reached) cascades into "wallet stuck waiting for sync." Defaulting it on saved a class of bug I was tired of debugging. The cost of the switch was about a week of integration work. I'd take that trade every time. iroh has bugs (the saturating-decrement story above is one of mine), but they're bugs at a scope I can hold in my head. ## Further reading - [`vanta/vanta-node/src/gossip.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-node/src/gossip.rs) — the file this post walks - [`doc/vanta-architecture.md`](https://github.com/Dax911/vanta/blob/main/doc/vanta-architecture.md) — the rationale for picking iroh - [iroh.computer](https://iroh.computer) — the upstream project - [iroh-gossip on docs.rs](https://docs.rs/iroh-gossip) — the crate API - [The vanta sidecar architecture](/blog/vanta_sidecar_architecture/) — the daemon this gossip layer lives inside - [Cruiser: A Tauri Hookup App on iroh](/blog/cruiser_iroh_gossip_p2p/) — how the same primitives ship in a different product --- # L1 nullifier sets: enforcing no-double-spend at consensus Canonical: https://blog.skill-issue.dev/blog/vanta_l1_nullifier_set/ Description: Most privacy chains track spent notes in a wallet-side index and pray. Vanta puts the nullifier set in chainstate and lets the consensus rules do the praying. Here's why that line moved, and what it costs. 
Published: 2026-04-17T05:52:57.000Z Tags: vanta, zk, nullifier, consensus, bitcoin, utxo This is a follow-up to [Vanta: a Bitcoin fork with ZK at consensus](/blog/vanta_zk_privacy_l1/) and a sibling to [Nullifiers without the witchcraft](/blog/nullifiers_without_witchcraft/). The first post explains the chain. The second explains what nullifiers *are*. This one is about a deliberate, opinionated design decision: **nullifiers in Vanta live at consensus, not at the wallet.** I want to walk through what that means, what alternatives we considered, what it costs, and why the cost is worth paying. ## The problem statement In a shielded UTXO model, every spent note has a deterministic, single-use **nullifier** — a hash that proves to a verifier *"some unspent note has been consumed"* without revealing *which* one. The classic Zcash construction is roughly `Poseidon(note_secret_key, commitment)`. The same secret + the same commitment always produces the same nullifier; revealing it twice means two spends of the same note. The verifier needs to know the global set of nullifiers ever revealed. If the same nullifier appears twice, one of the two spends is invalid. That's how double-spend is detected. The question every chain has to answer: *where does that nullifier set live?* ## Three answers ### Answer 1 — wallet-side index The [original Zcash sapling protocol](https://zips.z.cash/protocol/protocol.pdf) materialises the nullifier set client-side. Every wallet trying to spend reads the chain, builds a local nullifier set, and refuses to construct a transaction whose nullifier already appears. This works. It's also fragile in a way that always made me uncomfortable. A wallet bug — or a malicious wallet — can construct a transaction whose nullifier matches a previous one. The transaction is *valid by chain rules* until the second spend is mined; only then do nodes notice. In practice this means a brief reorg window where a double-spend is technically possible. 
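Concretely, the wallet-side index is just set membership. A minimal sketch, with `DefaultHasher` standing in for Poseidon and every name (`WalletIndex`, `try_spend`) hypothetical:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

// Sketch of the Answer-1 (wallet-side) double-spend guard. The real
// construction hashes with Poseidon over a prime field; DefaultHasher
// here is a stand-in purely for shape.
fn nullifier(note_secret_key: u64, commitment: u64) -> u64 {
    let mut h = DefaultHasher::new();
    (note_secret_key, commitment).hash(&mut h);
    h.finish() // same (key, commitment) -> same nullifier, always
}

/// A wallet refuses to build a spend whose nullifier it has already
/// seen. The whole Answer-1 security story lives in this one `insert`.
struct WalletIndex {
    seen: HashSet<u64>,
}

impl WalletIndex {
    fn new() -> Self {
        WalletIndex { seen: HashSet::new() }
    }
    /// Returns false if this spend would be a double-spend.
    fn try_spend(&mut self, sk: u64, commitment: u64) -> bool {
        self.seen.insert(nullifier(sk, commitment))
    }
}

fn main() {
    let mut wallet = WalletIndex::new();
    assert!(wallet.try_spend(42, 7)); // first spend of the note: ok
    assert!(!wallet.try_spend(42, 7)); // same note again: refused
    assert!(wallet.try_spend(42, 8)); // different commitment: ok
    println!("double-spend refused locally");
}
```

The fragility is visible in the sketch: nothing in the chain's rules forces a wallet to call `try_spend` before broadcasting.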
It also means **node operators can't run a privacy chain without the wallet code.** That's a sociological problem more than a technical one, but it's real. ### Answer 2 — separate nullifier-tracking smart contract / sidechain The Tornado-Cash-on-Ethereum approach. The nullifier set lives in a smart contract. The contract enforces uniqueness as a side effect of every withdraw. The chain itself doesn't know what nullifiers are — it just runs the contract. This works on Ethereum because Ethereum has expressive smart contracts that can hold and mutate large state cheaply (relative to L1 gas). It's a non-starter on a Bitcoin-fork chain because Bitcoin Script doesn't have arbitrary stateful contracts. You could put a precompile in. We didn't want to. ### Answer 3 — chainstate The nullifier set lives in the same database the UTXO set lives in. Validating a block means *(a)* checking script signatures, *(b)* checking the witness ZK proofs, *(c)* checking that no two spent nullifiers in this block (or this block + history) collide. Nodes that don't enforce nullifier-uniqueness accept blocks the rest of the network rejects, and fork off onto an invalid chain. They literally cannot stay in sync. This is what Vanta does. ## Why we picked answer 3 Three reasons. ### Soundness A nullifier collision in chainstate is a *consensus violation,* not a wallet bug. There is no version of the network where the double-spend is "valid for a few blocks until someone notices." Either the block is valid or it's not. The confidence story is the same as Bitcoin's UTXO model: a confirmed transaction is final under the same assumptions every other Bitcoin transaction is final under. This matters for an audience that already understands Bitcoin's finality assumptions. We did not want to introduce a *new* set of finality caveats for the privacy layer.
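Check *(c)* is mechanical. A minimal sketch of the consensus-side rule, with hypothetical names — the real check runs in the C++ validator against the chainstate DB:

```rust
use std::collections::HashSet;

// Sketch of check (c): no nullifier in a block may collide with another
// in the same block or with chain history. Names are hypothetical.
struct Chainstate {
    nullifiers: HashSet<[u8; 32]>,
}

/// Returns Err on the first colliding nullifier; on Ok, the block's
/// nullifiers have been committed to chainstate.
fn connect_block(
    state: &mut Chainstate,
    block_nullifiers: &[[u8; 32]],
) -> Result<(), [u8; 32]> {
    let mut in_block: HashSet<[u8; 32]> = HashSet::new();
    for nf in block_nullifiers {
        // Intra-block collision or historical collision: consensus-invalid.
        if !in_block.insert(*nf) || state.nullifiers.contains(nf) {
            return Err(*nf);
        }
    }
    // Only mutate chainstate once the whole block has passed.
    state.nullifiers.extend(block_nullifiers.iter().copied());
    Ok(())
}

fn main() {
    let mut state = Chainstate { nullifiers: HashSet::new() };
    let a = [1u8; 32];
    let b = [2u8; 32];
    assert!(connect_block(&mut state, &[a, b]).is_ok());
    // Re-revealing `a` in a later block is a consensus violation, not a
    // wallet bug — the block is simply invalid.
    assert_eq!(connect_block(&mut state, &[a]), Err(a));
    println!("duplicate nullifier rejected at consensus");
}
```

Note the ordering: chainstate is only mutated after the whole block validates, so a rejected block leaves the nullifier set untouched.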
### Operator simplicity A node operator running `vantad` doesn't need to also run wallet software, doesn't need to trust an indexer, doesn't need to subscribe to a third-party "nullifier feed." The chain validates itself. This is the same reason most exchanges run their own Bitcoin nodes instead of trusting Blockchain.info: chainstate is the source of truth. ### Wallet flexibility If the chain owns nullifier-uniqueness, wallets become *commodity software.* You can have ten different wallets, three different proof systems, an iOS-native client, a CLI, a hardware-wallet integration — and they all rely on the same chainstate validation. The wallet's job collapses to "construct a valid spending transaction." The chain is the arbiter. ## What it costs Nothing is free. Three real costs: ### Storage Every nullifier ever revealed has to live in chainstate forever. With Poseidon-2 over BN254 the digest is 32 bytes. Vanta is a 1-minute-block chain with 100k VANTA per block; assume a long-term steady state of, say, 5 nullifiers per block (transparent transactions don't burn nullifiers; only shielded spends do). At ~525,600 blocks per year, that's `5 × 525600 × 32 = ~84 MB` of nullifier state per year. After ten years: ~840 MB. Compare to Bitcoin's UTXO set, which is currently ~12 GB. We're well below it. Storage is not the limiting factor. ### Sync time A new node has to download and verify the nullifier history. The verification cost is just a hash check per nullifier (no proof re-verification needed if the block was already validated by the network — the witness root in the coinbase commits to the SMT). At a few microseconds per hash, the hash checks for ten years of history (~26M nullifiers) take a couple of minutes on a modern CPU; it's the SMT maintenance below that actually dominates a full resync. Acceptable. ### Sparse-Merkle-tree maintenance This is the real cost. We commit to the *root* of the nullifier set in every coinbase, so that light clients and SPV-style verifiers don't need the full chainstate to verify a proof.
Maintaining an SMT over a growing set of 32-byte hashes is non-trivial. We use the [`smirk` Rust crate](https://crates.io/crates/smirk) (an SMT library written for exactly this kind of consensus-state use case) and the marginal cost per insert is `O(log n)` hashes — a few hundred microseconds in practice. The implementation lives in [`vanta/` (the Rust subtree)](https://github.com/Dax911/vanta/tree/main/vanta) and the binding into the C++ core happens in `src/validation.cpp` via FFI. **TODO: Dax confirm exact SMT crate name** — `smirk` is what we use today; if we switched to a custom implementation note that here. ## The witness-v2 dance Here's the part that took me longest to get comfortable with. Bitcoin's witness data (segwit) is verified after the script. We needed the ZK proof to be verified after the script too — the script confirms the spender knows the right commitment, the proof confirms the spend is valid under the rules of the shielded pool. Vanta extends segwit to a "witness v2" format that includes: 1. The classic script witness (signatures, etc.). 2. A new `proof_root` field — a 32-byte commitment to the proof's public inputs. 3. A new `nullifier` field — the 32-byte nullifier the spend is consuming. The C++ validator does three checks in order: 1. The script verifies (standard segwit path). 2. The `nullifier` is not already in the chainstate nullifier set. 3. The `proof_root` matches the in-block coinbase's SMT root for this transaction's logical position. The actual ZK proof verification happens **out of process** in the Rust sidecar. The C++ node fires off the proof to a local Unix socket and waits for `ok` or `not ok`. This sounds slow but in practice the prover-side work is what's expensive (4-8 seconds); the verifier-side check is ~30 milliseconds and well-amortised across the block. If the sidecar is unavailable, the node refuses to validate witness-v2 transactions and stays in IBD-style "I'm not caught up" mode. 
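The three-check order can be sketched as code. All names below are hypothetical, and the per-position SMT lookup is simplified to a direct root comparison — the real checks live in `src/validation.cpp`, with the proof verified out of process:

```rust
use std::collections::HashSet;

// Sketch of witness-v2 validation order. Hypothetical types; the real
// proof verification happens in the Rust sidecar over a Unix socket.
struct WitnessV2 {
    nullifier: [u8; 32],
    proof_root: [u8; 32],
}

#[derive(Debug, PartialEq)]
enum Reject {
    BadScript,
    DoubleSpend,
    ProofRootMismatch,
    SidecarDown,
}

fn validate_witness_v2(
    script_ok: bool, // check 1: the standard segwit script path
    chainstate_nullifiers: &HashSet<[u8; 32]>,
    coinbase_smt_root: [u8; 32],
    sidecar_up: bool,
    w: &WitnessV2,
) -> Result<(), Reject> {
    if !script_ok {
        return Err(Reject::BadScript);
    }
    // Check 2: the nullifier must not already be in chainstate.
    if chainstate_nullifiers.contains(&w.nullifier) {
        return Err(Reject::DoubleSpend);
    }
    // Check 3: proof_root must match the coinbase SMT commitment
    // (simplified here to direct equality).
    if w.proof_root != coinbase_smt_root {
        return Err(Reject::ProofRootMismatch);
    }
    // The ZK proof itself is verified by the sidecar; if it's down we
    // refuse rather than accept an unverified shielded spend.
    if !sidecar_up {
        return Err(Reject::SidecarDown);
    }
    Ok(())
}

fn main() {
    let state: HashSet<[u8; 32]> = HashSet::new();
    let w = WitnessV2 { nullifier: [7; 32], proof_root: [9; 32] };
    assert_eq!(validate_witness_v2(true, &state, [9; 32], true, &w), Ok(()));
    assert_eq!(
        validate_witness_v2(true, &state, [0; 32], true, &w),
        Err(Reject::ProofRootMismatch)
    );
    assert_eq!(
        validate_witness_v2(true, &state, [9; 32], false, &w),
        Err(Reject::SidecarDown)
    );
    println!("witness-v2 check order enforced");
}
```

The real check 3 looks up the SMT leaf for the transaction's logical position rather than comparing roots directly.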
Better than silently accepting unverified shielded spends. ## What I changed my mind about I started this design wanting to put proof verification *inside* the C++ validator via a precompile-style C++ binding. That would have meant linking the entire `risc0-zkvm` Rust crate into Bitcoin Core's C++ build, which is — to put it mildly — not a small ask of a Bitcoin Core review process. The out-of-process sidecar pattern was a concession to "we will eventually want to upstream as much of this as possible." A node that talks to a sidecar over a Unix socket is a node that can be ported to the eventual full-Rust rewrite without changing its consensus contract. The sidecar is the ABI; the language behind it can move. I'm still not 100% sold on this trade. The audit surface is a lot bigger when there are two processes. **TODO: Dax confirm whether we end up upstreaming the proof verifier into the core process for v2.** ## Further reading - [`Dax911/vanta`](https://github.com/Dax911/vanta) — the chain - [`vanta/` Rust subtree](https://github.com/Dax911/vanta/tree/main/vanta) — the ZK sidecar - [Vanta: a Bitcoin fork with ZK at consensus](/blog/vanta_zk_privacy_l1/) — the parent post - [Nullifiers without the witchcraft](/blog/nullifiers_without_witchcraft/) — the SDK-side primitive - [Pedersen commitments, in production](/blog/pedersen_commitments_in_production/) — what the commitments commit to - Hopwood, Bowe, Hornby, Wilcox — *Zcash Protocol Specification* (2022 edition) - Buterin et al. — [*Sparse Merkle Trees*](https://eprint.iacr.org/2016/683) --- # What's in vanta/papers — reading 17 design docs in 2026 Canonical: https://blog.skill-issue.dev/blog/vanta_papers_design_doc_tour/ Description: Vanta ships its whitepaper as 17 markdown files in the repo, not a PDF on a marketing page. This is the tour: what each doc covers, which one has the wording bug, and why the docs live next to the code. 
Published: 2026-04-14T19:50:56.000Z Tags: vanta, documentation, whitepaper, design A privacy chain in 2026 cannot ship as a one-page marketing site with a PDF link. The serious audience — auditors, exchanges, regulators, other engineers — needs to read the design without filling out a form. So Vanta's whitepaper isn't a PDF on a website. It's a directory: [`papers/`](https://github.com/Dax911/vanta/tree/main/papers) in the main repo, 17 markdown files, MIT-licensed, version-controlled, diffable. This post is a tour. One paragraph per doc, plus the planning notes that aren't in `papers/` but are in [`planning/`](https://github.com/Dax911/vanta/tree/main/planning). At the end I call out the wording bug the audit flagged but I haven't fixed yet. ## Why design docs in the repo Three reasons. **Diffability.** A change to the chain rules is a commit. The doc that explains the rule change is also a commit. Over time the doc diff and the code diff are in the same git history, so you can check whether the prose ever lagged the code. **No marketing intermediary.** A PDF on a website goes through whoever owns the website. A markdown file in the repo is *the* artefact; nobody can mis-summarize it without me noticing. This matters more than I expected when the audience for the docs is "people who will run nodes" rather than "people who will buy the token." **They're the input to the LLMs that read the codebase.** Increasingly, technical evaluation in 2026 is mediated by AI assistants reading the repo. Markdown in the repo is what those tools index. PDF on a website is not. **TODO: Dax confirm this is still our framing once the docs/ ship has matured.** ## The 17 papers `00-master.md` — index of everything else. If you only read one, this is the one to *not* read; jump to `01`. `01-executive-summary.md` — the headline pitch in long form. 
The opening line is the design: *"Vanta Protocol is the first sovereign Layer 1 blockchain where financial privacy is enforced by consensus on every non-coinbase transaction."* The interesting bits: the explicit refusal of Zcash-style optional shielding, the rule name `bad-vanta-v2-output-nonzero-value` that fires on any v2 output with non-zero `nValue`, the "fast privacy decay" coinbase pattern (one confirmation transparent, private after). `02-technical-whitepaper.md` — the deep dive. Witness v2 layout, the `VantaJournal` struct, the `value_balance` semantics (>0 burns hidden value to L1, <0 mints it from L1, =0 is a pure shielded transfer). The arithmetic of the consensus rule. This is the paper an auditor reads. `03-comparative-technical-analysis.md` — Vanta vs Zcash, vs Monero, vs Penumbra, vs CoinJoin-on-Bitcoin. Honest about where each peer is ahead and where each is behind. Calls out that Penumbra is on the wrong chain base (Cosmos SDK + Tendermint BFT) for our specific bet on PoW. `04-market-analysis.md` — TAM and competitive positioning. Not technical, but useful for understanding the framing. **TODO: Dax confirm the market sizing before I quote it elsewhere.** `05-layer-taxonomy.md` — the taxonomy of L1 / L2 / sidechain / app-layer privacy and where Vanta sits. Most useful for readers who have already absorbed Penumbra, Zcash, Tornado Cash, and want to triangulate. `06-pitch-deck.md` — the slides version. Repeats the executive summary at lower resolution. `07-business-plan.md` — operations, deployment, infra. Mostly internal but in the open repo because there's nothing actually private in a fair-launch chain's business plan. `08-tokenomics.md` — the supply schedule, halvings, fair launch. The numbers are the ones in the [original L1 post](/blog/vanta_zk_privacy_l1/): 100k VANTA per block, 42B total supply, 210k-block halving (~146 days). Zero premine, zero founders allocation. 
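The supply numbers check out under a Bitcoin-style integer halving schedule. A quick sketch — hypothetical code in whole-VANTA units, not the chain's actual subsidy function:

```rust
// Sanity-check the 08-tokenomics numbers. Geometric series:
// 100_000 × 210_000 × (1 + 1/2 + 1/4 + …) → 42B in the limit.
// Integer halving lands slightly under, exactly as Bitcoin's "21M"
// is really 20,999,999.97… BTC.
fn total_supply(initial_subsidy: u64, halving_interval: u64) -> u64 {
    let mut total = 0u64;
    let mut subsidy = initial_subsidy;
    while subsidy > 0 {
        total += subsidy * halving_interval;
        subsidy /= 2; // Bitcoin-style integer halving
    }
    total
}

fn main() {
    let supply = total_supply(100_000, 210_000);
    // Just under the headline 42B figure, per the integer division.
    assert!(supply <= 42_000_000_000 && supply > 41_990_000_000);
    println!("asymptotic supply: {} VANTA", supply);

    // 210,000 one-minute blocks between halvings:
    let days = 210_000.0 / (60.0 * 24.0);
    println!("halving every ~{:.1} days", days);
}
```

A real implementation would track base units (satoshi-style) rather than whole VANTA, but the schedule arithmetic is identical.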
**TODO: Dax confirm we're not making any forward statements about $VANTA price or fundraising; I have *not* read these tokenomics with that lens and I won't quote any forward-looking number.** `09-performance-analysis.md` — block-time, propagation, proof-time benchmarks. Prover takes 30–60s on CPU, verifier ~30ms. Block validation overhead is the verifier cost amortised across block transactions. Acceptable for 1-minute blocks. `10-novelty-analysis.md` — what's new vs prior art. The honest version: very little is new at the *primitive* level (Pedersen commitments, nullifiers, SMTs, all known); the synthesis (mandatory privacy + Bitcoin Core fork + SP1 backend + AuxPoW path) is the contribution. `11-paradigm-research.md` — the broader research positioning. Reads as a literature review. Useful if you want to know where the design borrows ideas from (Zcash, Penumbra, Aleo) and where it deliberately diverges. `12-academic-paper.md` — the conference-paper version. Same content as `02`, formatted to academic conventions. The version we'd submit to a privacy-coin venue if we were doing that. `13-security-model.md` — what the chain protects against, what it doesn't. The "what it doesn't" list is the important part: targeted timing attacks on a single transaction, side-channel leakage through wallet behaviour, an adversary with control of the proof-network if the user uses one. Read this before you build on top. `14-public-roadmap.md` — what's shipped, what's coming. Phase 1/2 complete, Phase 3 in progress (L2 privacy layer with iroh gossip), Phase 4 future (full Rust node rewrite). The dates are deliberate ranges, not commitments. `15-regulatory-framework.md` — how the chain reads to a regulator. **TODO: Dax confirm before I quote any specific position; I am not a lawyer and the regulatory narrative belongs to the legal review, not to me writing a blog post.** `16-use-cases.md` — what people will actually do with the chain. 
Treasury operations, individual savings, atomic-swap liquidity. Honest about which use cases need *more than* mandatory privacy (e.g. payroll, where the recipient set has to be opaque too — that's a wallet UX problem, not a chain problem). `17-zkvm-engineering.md` — the deep dive on SP1, Plonky3, why we picked them, the abstraction layer that makes zkVMs swappable. I cited this paper extensively in [Why we shipped SP1 instead of RISC Zero](/blog/vanta_sp1_zkvm_circuits/). It's the most useful paper for an engineer evaluating Vanta against another zkVM-based chain. ## The planning notes [`planning/`](https://github.com/Dax911/vanta/tree/main/planning) is *not* in `papers/`. It's the loose work-in-progress notes I'm not ready to call canonical. Today there's one file: [`price-discovery-for-private-swaps.md`](https://github.com/Dax911/vanta/blob/main/planning/price-discovery-for-private-swaps.md). That doc is worth a separate post, which I wrote: [Private atomic swaps and the price-discovery problem](/blog/vanta_private_atomic_swaps/). The short version: if either side of an atomic swap is shielded, the rate `btc_amount / vanta_amount` is hidden from observers, which means no public price tape, which means no spot market formation. The doc walks through six options for how price could emerge without compromising the privacy property — voluntary post-trade rate publication, ZK-attested rate proofs, off-chain encrypted order books — and lands on a hybrid recommendation. I'm holding it in `planning/` rather than `papers/` because it's a *design exploration*, not a commitment. The status line at the top says exactly that: "Status: Design exploration, not a commitment. Written 2026-04-17." 
## The wording bug I haven't fixed The repo's [`CLAUDE.md`](https://github.com/Dax911/vanta/blob/main/CLAUDE.md) flags an inconsistency I'm aware of: > Phase 2 papers wording is "code complete, activation pending" but code shows `ALWAYS_ACTIVE` from genesis — wording bug in `papers/01-executive-summary.md` to fix. The executive summary in `01-executive-summary.md` describes some of the privacy rules as "code complete, activation pending." The actual chain has those rules `ALWAYS_ACTIVE` from genesis — they're enforced from block 1, not gated behind a future activation. The doc lags the code. This is the kind of thing that *only* happens when you're rewriting both fast. The fix is a five-minute paragraph edit; I'm calling it out here because the right way to handle a doc-vs-code drift is to say "yep, doc lags, here's the fix" rather than to silently update and hope nobody noticed. **TODO: Dax confirm timing on shipping that fix.** ## Why the docs are the README's older sibling A reader who only reads the [README.md](https://github.com/Dax911/vanta/blob/main/README.md) gets the chain parameters and a roadmap. A reader who reads `papers/` gets the *case* for the chain — why these parameters, why this proof system, why mandatory privacy, why fair launch. The README is for someone who's deciding whether to spend an hour. The papers are for someone who's deciding whether to run a node, port a wallet, list the asset, write a regulatory memo, or audit the cryptography. Different audiences, different artefacts. Both live in the repo. Both are diffable. Both are MIT-licensed. That's the documentation discipline I'm trying to lock in: nothing about how the chain works lives behind a marketing page or a sales rep. 
## Further reading - [`papers/`](https://github.com/Dax911/vanta/tree/main/papers) — the 17 markdown files - [`papers/00-master.md`](https://github.com/Dax911/vanta/blob/main/papers/00-master.md) — the index - [`papers/01-executive-summary.md`](https://github.com/Dax911/vanta/blob/main/papers/01-executive-summary.md) — the headline pitch - [`papers/17-zkvm-engineering.md`](https://github.com/Dax911/vanta/blob/main/papers/17-zkvm-engineering.md) — the SP1/Plonky3 deep dive - [`planning/price-discovery-for-private-swaps.md`](https://github.com/Dax911/vanta/blob/main/planning/price-discovery-for-private-swaps.md) — the swap-price design exploration - [Vanta: a Bitcoin fork with ZK at consensus](/blog/vanta_zk_privacy_l1/) — the practitioner-flavored pitch - [Why we shipped SP1 instead of RISC Zero](/blog/vanta_sp1_zkvm_circuits/) — the post that quotes paper 17 most --- # Private atomic swaps and the price-discovery problem Canonical: https://blog.skill-issue.dev/blog/vanta_private_atomic_swaps/ Description: BTC ↔ VANTA atomic swaps via HTLC are the easy part. If the VANTA leg is shielded, no observer can compute the rate, and no rate means no public price. Walking through six designs and the hybrid recommendation in vanta/planning. Published: 2026-04-17T05:52:57.000Z Tags: vanta, atomic-swaps, htlc, price-discovery, planning The 2026-04-17 commit message — `planning: price-discovery design for private atomic swaps` — is one of the more interesting things in the Vanta repo, because it isn't code. It's a design exploration in [`planning/price-discovery-for-private-swaps.md`](https://github.com/Dax911/vanta/blob/main/planning/price-discovery-for-private-swaps.md), and it's the kind of doc I wish more chains shipped: a problem statement, six options, an honest comparison, a recommendation, and an explicit *"this is not a commitment"* status flag. This post walks through the design. 
The HTLC machinery on the implementation side lives in [`vanta/vanta-swap`](https://github.com/Dax911/vanta/tree/main/vanta/vanta-swap); the policy question is in `planning/`. ## What atomic swaps are, briefly A hash-time-locked contract (HTLC) lets two parties on different chains agree to a swap without trusting each other or a third party. Alice has BTC, Bob has VANTA. They agree to swap. Alice picks a random secret `s`, computes `h = sha256(s)`. They both lock their funds in HTLCs that pay out to *whoever knows `s`* (and refund to the original sender after a timeout, if `s` never gets revealed). The script for the HTLC is short — this is the shape of the script built in [`vanta/vanta-swap/src/htlc.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-swap/src/htlc.rs), with the data pushes shown as `<placeholders>`:

```
OP_IF
  OP_SHA256 <h> OP_EQUALVERIFY <claim_pubkey> OP_CHECKSIG
OP_ELSE
  <timeout> OP_CHECKLOCKTIMEVERIFY OP_DROP <refund_pubkey> OP_CHECKSIG
OP_ENDIF
```

The `IF` branch is "claim with the preimage." The `ELSE` branch is "refund after the timelock." Both are P2WSH-wrapped. The receiver claims by revealing `s` to spend the HTLC; once `s` is on-chain, the other side claims their HTLC using the same `s`. If either side bails, both refund after the timeout. Same `h` on both chains. Same `OP_SHA256`. Both Bitcoin and Vanta speak this script unchanged. That's why the swap implementation in [`vanta/vanta-swap/src/swap.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-swap/src/swap.rs) works against both chains' RPCs with a single `ChainConfig` abstraction. ## The price-discovery problem The swap implementation today is *fully transparent on both sides*. From the planning doc:

> Worth being precise: **the current swap implementation is fully transparent on both sides.**
>
> - `swap.rs` funds the VANTA leg via L1 RPC (`createrawtransaction` → `fundrawtransaction` → `signrawtransactionwithwallet` → `sendrawtransaction`). That's the transparent L1, not the shielded L2.
> - So `vanta_amount` is plainly visible in the P2WSH output on L1.
> - `btc_amount` is visible on Bitcoin. > - Price is therefore **already discoverable today** by anyone scanning matched hashes across the two chains. So the problem is *forward-looking*. Once the VANTA leg moves to a shielded note (commitment + encrypted amount, no visible value on L1), an external observer can: - find the BTC-side HTLC with amount `X` and embedded hash `h` - see *that* a note with commitment tied to `h` exists on VANTA L2, but not its amount `Y` - without `Y`, no `X/Y` rate No rate means no tape. No tape means no public order book. No public order book means no efficient price formation. *That* is the problem. I want to push back on a knee-jerk response that "privacy chains shouldn't have public prices." Of course they should — every market needs a price. The question is *how price emerges without compromising the privacy property*. That's not the same as "should there be a price at all," which is a question I think privacy maximalists sometimes confuse. ## Six options The doc walks through six designs. I'll abbreviate. ### 1. Do nothing — OTC negotiation only Peers find each other on Nostr / Telegram / a forum, agree privately, swap. Zero engineering. Zero price discovery. Hard to bootstrap a market. New users can't tell what a fair rate is. LPs won't come. Pros: trivial, full privacy. Cons: the market doesn't form. ### 2. Voluntary post-trade rate publication After a swap, either party signs a `{rate, timestamp}` statement and posts it to a relay (Nostr, an HTTP aggregator, whatever). An aggregator computes a median or time-bucketed mean. **Crucially: publish the rate, not the size.** Rate is a scalar; it leaks nothing about how much the signer actually traded. Pros: simple, opt-in, amounts stay shielded. Cons: self-reported, trivially fakeable. Anti-spam needs a cost function — proof of recent swap, a small VANTA burn, a reputation-weighted signer set. ### 3. 
ZK-attested rate proofs Use SP1 (already in the consensus stack) to prove: > "I participated in a swap whose hash is `H` (publicly known), and the rate was in `[r − ε, r + ε]`, without revealing either amount." The circuit takes `X`, `Y`, `r` as private witness, publishes `H` and `r` as public output. Anyone can verify the SP1 proof and see a rate without seeing amounts. Pros: cryptographically binding, not self-reported. Cons: non-trivial circuit work; SP1 proof costs (the doc notes the 5070 box is below the 24 GB GPU minimum, so we'd need CPU proving or a remote prover); UX friction. ### 4. Off-chain encrypted order book with HTLC settlement Bisq-style. Orders live in a P2P relay (Tor hidden service, Nostr, Waku). Orders are plaintext (amount, rate, counterparty pubkey) at *posting* time. Match happens, counterparties swap via HTLC, order disappears. Price discovery is from the *order book*, not from chain history. Pros: decouples price discovery (pre-trade order book) from settlement privacy (post-trade on-chain). The doc calls this "arguably the right architecture." Cons: requires a relay layer; orders-in-the-open weakens pre-trade privacy of unfilled orders. ### 5. Trusted LP / market maker Professional MMs run their own nodes, quote two-sided publicly, users trade against them via atomic swap. LPs willingly reveal quotes because that's their business. Pros: realistic bootstrapping path, CEXes already work this way. Cons: centralises price discovery; LPs need KYC/operational reality → potentially a regulatory attack surface. ### 6. Hybrid: opt-in transparent-swap mode Users opt into a "transparent swap" that pins the VANTA leg to L1 (visible). Those swaps contribute to a public price tape. Private traders settle on L2 and free-ride on the tape. Pros: zero new crypto; user-level privacy/contribution choice. Cons: tragedy-of-the-commons. Everyone wants privacy, nobody wants to be the transparent swapper. 
Requires incentive design (fee rebates for transparent swappers?). ## The recommendation The doc lands on a hybrid of #4 and #2: > For a near-term path: **combine #4 (off-chain order book) + #2 (voluntary rate publication)**. Rationale: > > - #4 gives us an actual market — users see bids/asks before committing. > - #2 gives us a historical tape — aggregators compile published rates into OHLC candles. > - Both respect the privacy invariant: amounts stay shielded. > - Both are boring engineering, not new cryptography. We can ship them. > - #3 (ZK rate proofs) is a "do it later if spam becomes a real problem" lever. I agree with this and want to underscore the framing: *boring engineering, not new cryptography*. New cryptography is expensive in the medium term — it has to be audited, the implementation has to land, the wallets have to integrate, the tooling has to mature. An off-chain order book + voluntary rate posts ship in a quarter using existing primitives. The ZK rate-proof option is a clean lever to pull *later*, if the simpler scheme proves insufficient against spam. Worth a moment on #3 specifically. ZK rate proofs are tempting because they're cool. They're also a chunk of circuit work, and the wallet UX gets one more "generate proof" wait. Building it before we know whether voluntary publication produces enough useful data is over-engineering. The principle: **build the simplest thing that could work, instrument it, then add cryptography when the simpler thing demonstrably fails.** ## Open questions the doc flags The planning note ends with five questions I haven't answered yet: 1. **Anti-spam for voluntary publication.** Cost function: proof of recent shielded spend? Small VANTA burn? Reputation-weighted signer? My current bias is "small VANTA burn weighted by chain age" — cheap to publish if you've held VANTA for a while, expensive if you haven't, no operational dependency on a reputation graph. 2. 
**Relay topology.** Nostr (easy, public), Waku, or a Tor hidden-service relay? Probably Nostr to start. **TODO: Dax confirm we want Nostr-first vs a custom relay.**
3. **Quote units.** sats/VANTA or VANTA/BTC? Pick one canonical representation up front and stick it in the whitepaper suite. I lean sats/VANTA because it makes for round numbers at current valuation.
4. **Handling the current transparent swap.** Migration path or permanent second mode? Affects whether the price-discovery design has to handle two worlds. **TODO: Dax confirm.**
5. **Cross-asset routing.** VANTA ↔ X ↔ BTC via multi-hop. Out of scope here, but on the longer-term roadmap.

These are the kinds of open questions that *should* be public. A privacy chain whose policy decisions are made behind closed doors is, sociologically, a chain you can't trust. Putting the design exploration in the open repo means the discussion happens in pull requests, not in a Slack I run.

## What's *not* in this design

A couple of things I want to flag explicitly because they often come up.

**Oracles.** Vanta does not currently feed external prices into on-chain logic. There's no smart-contract platform, so there's no place to feed them *to*. Oracles are an L2 problem; they'll show up if and when programmable shielded contracts ship.

**Loans / derivatives.** Out of scope. Spot atomic swaps are the spot market. DeFi primitives beyond spot are a much larger conversation.

**A unified DEX.** I am skeptical of "one app to rule them all" DEX designs for a privacy chain. Composability is harder when amounts are shielded; the simplest path is probably *multiple* small-surface protocols (atomic swaps for cross-chain, order book for in-chain, AMM only if liquidity demands it).

## What changed my mind about the swap problem

Two things. First, when I started thinking about this, I assumed ZK rate proofs (option 3) were the obvious answer because they're the most cryptographically clean.
They're also the most cryptographically *expensive*. Once I actually thought about the user flow — generate a swap, generate a proof, *then* publish — I realised the friction would crater participation. The voluntary scheme is worse on cryptographic strength but enormously better on participation, and a market with weaker price proofs that more people use is a better market than a strong-proof market that nobody uses.

Second, I underestimated how much of the answer is *just an order book*. Bisq's design has been working for years on exactly this problem (privacy-respecting BTC ↔ fiat). An off-chain encrypted order book with on-chain HTLC settlement is the architecture that *already works* in the wild for a closely-related problem. Reusing it for VANTA ↔ BTC is the smallest delta.

Both of these updates landed *because the planning doc was a pull-out-the-options doc*, not a "here's the design" doc. Writing it forced the comparison.

## Further reading

- [`planning/price-discovery-for-private-swaps.md`](https://github.com/Dax911/vanta/blob/main/planning/price-discovery-for-private-swaps.md) — the doc this post walks through
- [`vanta/vanta-swap`](https://github.com/Dax911/vanta/tree/main/vanta/vanta-swap) — the HTLC implementation
- [Vanta: a Bitcoin fork with ZK at consensus](/blog/vanta_zk_privacy_l1/) — the chain
- [What's in vanta/papers](/blog/vanta_papers_design_doc_tour/) — the canonical-papers tour
- [Bisq's design overview](https://bisq.network/) — the existing implementation of "encrypted order book + on-chain settlement"
- [BIP 199 (HTLC)](https://github.com/bitcoin/bips/blob/master/bip-0199.mediawiki) — the upstream pattern the swap script implements

---

# BIP-199 by hand: a code walk through vanta-swap

Canonical: https://blog.skill-issue.dev/blog/vanta_swap_htlc_walkthrough/
Description: A line-by-line tour of the Rust HTLC state machine that drives BTC ↔ VANTA atomic swaps.
Redeem script bytes, the 2x/1x timelock dance, BIP143 sighash binding, and the witness layout that makes refund and claim routes provably distinct.
Published: 2026-04-13T22:22:23.000Z
Tags: vanta, atomic-swaps, htlc, bitcoin, bip-199, rust

import { Mermaid, RustPlayground, TradeoffTable, Aside } from "@/components/mdx";

The companion to [Private atomic swaps and the price-discovery problem](/blog/vanta_private_atomic_swaps/) is a piece of code, not a planning document. The chain-policy decisions about *how prices form* live upstream in [`planning/price-discovery-for-private-swaps.md`](https://github.com/Dax911/vanta/blob/main/planning/price-discovery-for-private-swaps.md). The actual swap mechanics — the bytes that go on the wire, the script that locks the funds, the witness that unlocks them — live in [`vanta/vanta-swap`](https://github.com/Dax911/vanta/tree/main/vanta/vanta-swap), which landed in commit [`149c1a41`](https://github.com/Dax911/vanta/commit/149c1a419) on 2026-04-13.

This post is a code walk. If you want the policy framing, read the other post first. If you want to understand what an HTLC actually *is* in 350 lines of Rust, you're in the right place.

## What [BIP-199](https://github.com/bitcoin/bips/blob/master/bip-0199.mediawiki) is, in one paragraph

A hash time-locked contract is a Bitcoin output that pays whoever can produce one of two things:

1. The preimage of a public hash (the *claim* path), or
2. The original funder's signature, but only after a block-height locktime has passed (the *refund* path).

That's a four-line redeem script. The protocol around it — generating the secret, picking timelocks, broadcasting in the right order, watching the chain for the preimage reveal — is the [BIP-199 atomic-swap state machine](https://github.com/bitcoin/bips/blob/master/bip-0199.mediawiki). Two parties construct two HTLCs, one on each chain, with the *same* hash and *asymmetric* timelocks. Either both legs settle or both legs refund.
There is no third outcome.

## The timelock math

The whole thing rests on a piece of arithmetic that is one inequality:

$$
t_{\text{now}} < t_{1} < t_{2}
$$

where $t_2$ is the initiator's locktime (longer) and $t_1$ is the participant's locktime (shorter, conventionally $t_1 = t_2 / 2$).

The initiator commits *first*, with the longer timelock. The participant matches with a shorter timelock. When the initiator claims the participant's HTLC (revealing the preimage), the participant has at least $t_1$ left to use that preimage on the initiator's HTLC. If the participant disappears, the initiator waits $t_2$ blocks and refunds. If the initiator disappears, the participant waits $t_1$ and refunds.

The asymmetry matters. If the timelocks were equal, a malicious initiator could refund their own HTLC seconds before the participant claims, racing the participant for the funds. The 2x/1x ratio gives the participant a $t_1$-block buffer to react to the preimage reveal.

In Vanta's CLI this shows up as a `--timelock` flag on `initiate` and a derived value the participant uses, printed as a hint by [`main.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-swap/src/main.rs):

```
The participant should use timelock = {timelock / 2} (half of yours).
```

Half. Not "your locktime minus epsilon." Half. Because the participant has to pick a value that gives the initiator enough time to claim *and* leaves the participant a meaningful refund window if the initiator vanishes.

## The state machine

Two parties, four states.
<Mermaid chart={`stateDiagram-v2
    Created: initiator generates secret + hash
    Created --> Funded_I: initiator broadcasts HTLC on chain A (locktime t2)
    Funded_I --> Funded_P: participant broadcasts HTLC on chain B (locktime t1)
    Funded_P --> Claimed_P: initiator reveals preimage, claims chain B
    Claimed_P --> Claimed_I: participant uses revealed preimage on chain A
    Claimed_I --> [*]: swap complete
    Funded_I --> Refunded_I: t2 expired, no participant
    Funded_P --> Refunded_P: t1 expired, initiator never claimed
    Refunded_I --> [*]: aborted before participant
    Refunded_P --> [*]: aborted after participant`}/>

The two refund paths are the *only* way the swap fails — and a refund is never partial. Either both sides claim — atomic — or both sides refund — atomic. The mid-swap state where exactly one side has settled is unreachable, because the act of claiming chain B *publishes* the preimage on chain B, and chain A's HTLC reads the same preimage. (We'll come back to this.)

The Rust enum that mirrors this is in [`swap.rs:48`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-swap/src/swap.rs):

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
pub enum SwapStatus {
    Created,
    Funded,
    Claimed,
    Refunded,
}
```

Note: there's no `Aborted` or `Failed`. A swap that goes wrong refunds. There is no sad-path state.

## The redeem script, byte by byte

Quoting [`htlc.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-swap/src/htlc.rs):

```
OP_IF
    OP_SHA256 <hash> OP_EQUALVERIFY
    <receiver_pubkey> OP_CHECKSIG
OP_ELSE
    <locktime> OP_CHECKLOCKTIMEVERIFY OP_DROP
    <sender_pubkey> OP_CHECKSIG
OP_ENDIF
```

The `IF` branch is the *claim* path. To take it, the spender pushes:

1. A signature over the spending transaction
2. A 32-byte preimage
3. `0x01` (OP_TRUE — selects the IF branch)
4. The redeem script itself (this is the P2WSH witness convention)

The script then runs: pop `0x01` (truthy → enter IF), `OP_SHA256` the preimage, compare against the embedded `<hash>` with `OP_EQUALVERIFY` (fail if not equal), then `<receiver_pubkey> OP_CHECKSIG` against the signature.
The `ELSE` branch is the *refund* path:

1. A signature
2. The empty byte string (OP_FALSE — selects the ELSE branch)
3. The redeem script

`<locktime> OP_CHECKLOCKTIMEVERIFY OP_DROP` is the BIP-65 incantation: pull `nLockTime` from the spending tx, compare against `<locktime>`, fail if too early. Then `<sender_pubkey> OP_CHECKSIG`.

The Rust that builds this lives in `redeem_script(&self) -> Vec<u8>`. It hand-emits opcodes. Worth quoting — there's no "script library" here, just a `Vec<u8>` that gets pushed on:

<RustPlayground code={`// Simplified excerpt from vanta-swap/src/htlc.rs.
// Hand-emitted Bitcoin script — no abstraction layer.
mod op {
    pub const OP_IF: u8 = 0x63;
    pub const OP_ELSE: u8 = 0x67;
    pub const OP_ENDIF: u8 = 0x68;
    pub const OP_DROP: u8 = 0x75;
    pub const OP_SHA256: u8 = 0xa8;
    pub const OP_EQUALVERIFY: u8 = 0x88;
    pub const OP_CHECKSIG: u8 = 0xac;
    pub const OP_CHECKLOCKTIMEVERIFY: u8 = 0xb1;
}

fn build_redeem(
    hash: [u8; 32],
    receiver_pubkey: &[u8],
    sender_pubkey: &[u8],
    locktime: u32,
) -> Vec<u8> {
    let mut s = Vec::with_capacity(128);
    s.push(op::OP_IF);
    s.push(op::OP_SHA256);
    s.push(0x20); // push 32 bytes
    s.extend_from_slice(&hash);
    s.push(op::OP_EQUALVERIFY);
    s.push(receiver_pubkey.len() as u8);
    s.extend_from_slice(receiver_pubkey);
    s.push(op::OP_CHECKSIG);
    s.push(op::OP_ELSE);
    let lt = encode_script_number(locktime as i64);
    s.push(lt.len() as u8);
    s.extend_from_slice(&lt);
    s.push(op::OP_CHECKLOCKTIMEVERIFY);
    s.push(op::OP_DROP);
    s.push(sender_pubkey.len() as u8);
    s.extend_from_slice(sender_pubkey);
    s.push(op::OP_CHECKSIG);
    s.push(op::OP_ENDIF);
    s
}

fn encode_script_number(n: i64) -> Vec<u8> {
    if n == 0 {
        return vec![];
    }
    let mut absn = n.unsigned_abs();
    let mut r = Vec::new();
    while absn > 0 {
        r.push((absn & 0xff) as u8);
        absn >>= 8;
    }
    if r.last().unwrap() & 0x80 != 0 {
        r.push(0x00);
    }
    r
}

fn main() {
    let hash = [0xaa; 32];
    let recv = [0x02; 33];
    let send = [0x03; 33];
    let script = build_redeem(hash, &recv, &send, 144);
    println!("redeem script len = {} bytes", script.len());
    println!("first opcode = 0x{:02x} (OP_IF)", script[0]);
    println!("second opcode = 0x{:02x} (OP_SHA256)", script[1]);
    println!("last opcode = 0x{:02x} (OP_ENDIF)", script.last().unwrap());
}`}/>

The output is a fixed-shape script of roughly 110 bytes, varying slightly with the locktime encoding. The P2WSH wrapper is the `OP_0 <32-byte sha256(redeem)>` encoding — a version byte, then a 32-byte push — that makes the output script the network sees a 34-byte commitment to the redeem script's hash. `p2wsh_script()` in `htlc.rs` does the wrap:

```rust
pub fn p2wsh_script(&self) -> Vec<u8> {
    let redeem = self.redeem_script();
    let hash = sha256(&redeem);
    let mut script = Vec::with_capacity(34);
    script.push(op::OP_0);
    script.push(0x20); // push 32 bytes
    script.extend_from_slice(&hash);
    script
}
```

The `assert_eq!(p2wsh.len(), 34)` test in [`htlc.rs:198`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-swap/src/htlc.rs) is the safety net for that: anyone reading the test sees the constant the wire format depends on.

## Why P2WSH and not Taproot

Worth a brief note. Taproot is the cool kid in 2026, and a Schnorr-key-aggregation atomic swap could in principle use a single key-path spend that looks indistinguishable from a normal transfer. But the simplest thing that could work is P2WSH. v1 ships P2WSH. The Taproot key-path version is the v2 conversation, which I expect to come up around the same time the shielded-VANTA-leg work lands.

## The sighash dance

This is the part of HTLC code that's easy to get wrong and impossible to debug when you do. The witness script is sighashed differently in segwit than in legacy, and the spending side has to compute the *exact* same sighash the verifier will check.
The relevant code is in [`swap.rs:215`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-swap/src/swap.rs):

```rust
// Sign: BIP143 sighash over the witness program (the redeem script)
let redeem_script = state.contract.redeem_script();
let witness_script = ScriptBuf::from_bytes(redeem_script.clone());
let privkey = PrivateKey::from_wif(privkey_wif).context("invalid WIF private key")?;
let secp = Secp256k1::new();
let mut sighash_cache = SighashCache::new(&spending_tx);
let sighash = sighash_cache
    .p2wsh_signature_hash(0, &witness_script, Amount::from_sat(htlc_value), EcdsaSighashType::All)
    .context("sighash computation failed")?;
let msg = secp256k1::Message::from_digest(sighash.to_byte_array());
let sig = secp.sign_ecdsa(&msg, &privkey.inner);

// DER-encode signature + sighash type byte
let mut sig_bytes = sig.serialize_der().to_vec();
sig_bytes.push(EcdsaSighashType::All as u8);
```

Three things to notice:

**`p2wsh_signature_hash`, not `legacy_signature_hash`.** This is BIP143 — the segwit sighash. It hashes the input value as part of the sighash so a signature that's valid for "spend X satoshis" can never be replayed for "spend Y satoshis." A legacy sighash doesn't include the value, which is why pre-segwit offline signers could be lied to about the amount an input was actually spending.

**`Amount::from_sat(htlc_value)`.** The funding amount has to be exact. Off by one satoshi and the sighash mismatches, the signature is rejected, and the broadcast fails with a generic `mandatory-script-verify-flag-failed` from `bitcoind`. Welcome to the worst error message in cryptocurrency.

**`EcdsaSighashType::All`** — the standard "sign every input and every output." The only time you'd want a different sighash type for an HTLC is if you wanted partial-input flexibility, which atomic swaps don't.
The Rust [`bitcoin` crate](https://docs.rs/bitcoin/0.32) ships `SighashCache`, which precomputes the parts of the sighash that don't change per-input (the input/output digests) so you can sign multiple inputs without redoing the hash. We have one input, so the cache is trivial — but the API is the same and the per-input computation is correct.

## Witness layout: claim vs. refund

The two witnesses look almost identical and they have to be carefully different. Both are stacks; the bottom of the stack is the redeem script, and what's above it controls which branch runs.

Claim witness, from `claim_witness` in [`htlc.rs:97`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-swap/src/htlc.rs):

```rust
pub fn claim_witness(&self, signature: &[u8], preimage: &[u8; 32]) -> Vec<Vec<u8>> {
    vec![
        signature.to_vec(),
        preimage.to_vec(),
        vec![0x01], // OP_TRUE — take the IF branch
        self.redeem_script(),
    ]
}
```

Four items. Bottom-to-top of stack: redeem script, OP_TRUE, preimage, signature. Once the script reveal is consumed (the last witness item *is* the redeem script — the P2WSH convention), execution begins at OP_IF. The next pop is `0x01` → truthy → take the IF branch. The IF branch consumes the preimage with OP_SHA256, compares against the embedded hash via OP_EQUALVERIFY, and then the receiver pubkey + CHECKSIG consumes the signature.

Refund witness, from `refund_witness` in `htlc.rs:107`:

```rust
pub fn refund_witness(&self, signature: &[u8]) -> Vec<Vec<u8>> {
    vec![
        signature.to_vec(),
        vec![], // empty — take the ELSE branch
        self.redeem_script(),
    ]
}
```

Three items: redeem script, *empty bytes* (which Bitcoin script interprets as OP_FALSE), signature. OP_IF pops the empty bytes → falsy → take ELSE. The ELSE branch checks `<locktime> OP_CHECKLOCKTIMEVERIFY` against the spending transaction's `nLockTime`, drops the locktime, then verifies the sender's signature.

Two failure modes are interesting:

**Claim with the wrong preimage.** OP_SHA256 hashes whatever you push. OP_EQUALVERIFY fails. The script aborts with a verification error.
The transaction is rejected. The HTLC is still spendable.

**Refund before locktime expires.** OP_CHECKLOCKTIMEVERIFY pulls `nLockTime` from the spending transaction. If it's less than the embedded locktime, the script aborts. The Rust code in [`swap.rs:266`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-swap/src/swap.rs) preflights this check before broadcast:

```rust
let current_height = rpc::get_block_height(&client)?;
if current_height < state.contract.locktime as u64 {
    anyhow::bail!(
        "Locktime not reached: current height {} < locktime {}. Wait {} more blocks.",
        current_height,
        state.contract.locktime,
        state.contract.locktime as u64 - current_height,
    );
}
```

You could just let the broadcast fail. Better UX is to refuse to construct the transaction in the first place.

## Why the preimage reveal is atomic

Worth dwelling on this because it's the part of HTLC theory that students always blink at. If Alice claims Bob's HTLC by revealing `s`, why does that make Bob's claim of Alice's HTLC inevitable?

Because the preimage `s` is now on Chain B's mempool/blockchain in plaintext. Any node, any explorer, any indexer running on Chain B can extract `s` from the witness stack of Alice's claim transaction. Bob's wallet polls Chain B for the spending of his HTLC, finds `s`, and now has the secret needed to spend Alice's HTLC on Chain A.

The "atomic" property is that revealing `s` to Chain B is *necessarily* publishing it. There is no way to construct a P2WSH spend that hides the witness data — the witness stack is part of the transaction, the transaction is part of the block, the block is gossiped to the network. By the time Alice's claim is mined, Bob already knows.

If Alice never claims (refund path), `s` is never revealed. After $t_1$ blocks, Bob refunds his HTLC. After $t_2$ blocks, Alice refunds hers. Both get their original funds back. No third outcome.
The math: the only way the swap settles partially is if Alice claims chain B and then *somehow* prevents Bob from claiming chain A within the window $t_2 - t_1$. The 2x/1x ratio is what makes that window large enough that Bob's ordinary chain-watching software can detect, parse, and broadcast inside it.

## What the wallet doesn't do (yet)

A fully shielded VANTA leg — where the HTLC's *amount* is hidden — is the missing piece. Today, the value of the P2WSH output on the VANTA chain is plaintext, exactly as it is on Bitcoin. From [the price-discovery post](/blog/vanta_private_atomic_swaps/):

> the current swap implementation is fully transparent on both sides

The plan, gestured at in [`planning/price-discovery-for-private-swaps.md`](https://github.com/Dax911/vanta/blob/main/planning/price-discovery-for-private-swaps.md), is to replace the P2WSH output on the VANTA leg with a witness v2 commitment whose amount is hidden behind a Pedersen blinding. The HTLC pubkey path becomes a shielded-pool note; the claim/refund logic becomes a ZK proof of pubkey ownership + preimage knowledge, instead of a script-level CHECKSIG. Same atomic property; different cryptographic primitive.

That work is real, and it's not in the current `vanta-swap`. The `vanta-swap` we have today is the simplest thing that could possibly work, in 350 lines of Rust, with the same script semantics on both chains. The shielded version is a different post.

## What I changed my mind about

The first version of `htlc.rs` used the `bitcoin::ScriptBuf::builder()` API — the abstraction-layer way of constructing a Bitcoin script. It was 30% shorter and 100% less debuggable. When the OP_CHECKLOCKTIMEVERIFY encoding was wrong (script-number encoding has a sign-bit edge case for negative numbers and for 128 that the builder API didn't surface), I had to rewrite half of it as raw byte pushes anyway to instrument the failure.

The version that ships is the boring `Vec<u8>` with explicit opcode pushes. Every byte is visible.
When something doesn't verify, I read the script in a hex dumper and spot the wrong byte. That's a lower abstraction level than I'd ordinarily reach for, but BIP-199 *is* a wire format, and wire formats want to be visible.

The script-number encoding bug was specifically `encode_script_number(128)` returning `[0x80]` (which Bitcoin script interprets as `-0`) instead of `[0x80, 0x00]` (which encodes positive 128). The test in [`htlc.rs:236`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-swap/src/htlc.rs) is the regression catch:

```rust
assert_eq!(encode_script_number(128), vec![0x80u8, 0x00]);
```

I'd estimate I'd have caught that bug six hours faster if I'd been building the script as bytes from the start.

## Further reading

- [`vanta/vanta-swap/src/htlc.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-swap/src/htlc.rs) — script construction + witness builder + tests
- [`vanta/vanta-swap/src/swap.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-swap/src/swap.rs) — initiate / participate / claim / refund state machine
- [`vanta/vanta-swap/src/main.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-swap/src/main.rs) — CLI surface and the timelock-halving hint
- [BIP-199](https://github.com/bitcoin/bips/blob/master/bip-0199.mediawiki) — the upstream HTLC pattern
- [BIP-143](https://github.com/bitcoin/bips/blob/master/bip-0143.mediawiki) — segwit sighash, the BIP this signing path implements
- [Private atomic swaps and the price-discovery problem](/blog/vanta_private_atomic_swaps/) — the policy framing the implementation rides on
- [Vanta: a Bitcoin fork with ZK at consensus](/blog/vanta_zk_privacy_l1/) — the chain doing the verification

---

# The unified dashboard: collapsing private and transparent into one wallet view

Canonical: https://blog.skill-issue.dev/blog/vanta_unified_dashboard_wallet_ui/
Description: Two pages — one for private balance, one for transparent — taught users to think in two heads.
The 2026-04-17 commit folded them. The wallet now shows one balance, one feed, with the privacy boundary inside the data, not the URL.
Published: 2026-04-17T05:52:57.000Z
Tags: vanta, wallet-ui, react, design, ux, privacy

The 2026-04-17 commit message — `wallet-ui: merge privacy view into unified dashboard + rescan endpoint` — is one of the smallest functional commits in the Vanta repo and one of the most consequential UX decisions. Up to that point the wallet had a `/dashboard` page (transparent UTXOs) and a separate `/privacy` page (shielded notes). Two pages. Two balance numbers. Two transaction feeds. Users — including, embarrassingly, me — would forget which page they were on and wonder why a transaction "didn't arrive" when it had simply landed on the other page.

The fix was to collapse them. This post is about why that decision matters more than it looks, what the new layout does, and the discipline a privacy chain needs to keep when it ships a wallet.

## Why two pages was wrong

The original split came from a literal reading of the architecture. The chain has a transparent layer (the L1 UTXO set) and a privacy layer (the L2 SMT of commitments). The wallet had two stores backing two pages. Easy mental model for a developer.

For a user, this is the wrong frame. A user has *money*, and the money is *in different states*. Transparent and shielded are states the money happens to be in, like "in checking" vs "in savings." A bank doesn't make you flip between two browser tabs for those. The user thinks "what's my balance" and "what came in lately" — not "let me check my transparent feed *and* my shielded feed."

Worse, two pages teaches people that the privacy/transparent boundary is something they have to think about. It is — sometimes. But mostly the wallet should *handle the boundary* and present *the actual account*. The boundary should be visible *inside the data* (each note or UTXO has a privacy badge), not in the URL.
## What the unified dashboard does

The new [`wallet-ui/src/pages/dashboard.tsx`](https://github.com/Dax911/vanta/blob/main/wallet-ui/src/pages/dashboard.tsx) is the page. The structure is roughly:

1. **One balance**, prominently — labelled "Private Balance" because that's what 99% of the value should be once the chain is mature, with the small print being "X VANTA transparent" if the user has any.
2. **L2 status card** — SMT root, commitment count, nullifier count, last block height. Auto-refreshing every 5 seconds. This is the chain's privacy view, surfaced *as a number on the dashboard*, not hidden in a settings page.
3. **Quick actions** — send, receive, sync.
4. **One activity feed** — interleaving transparent transactions and shielded notes by timestamp. Each row has a privacy badge (`ShieldCheck` for shielded transactions, `EyeOff` for incoming shielded notes, `Hash` for transparent) and the badge is *the* indicator of which state the value is in.

The L2 status card auto-fetches on a `setInterval`:

```tsx
useEffect(() => {
  fetchL2Status()
  const id = setInterval(fetchL2Status, 5000)
  return () => clearInterval(id)
}, [])
```

5 seconds is a deliberate cadence. The chain produces blocks every 60 seconds, so SMT root updates land at most once per block. 5 seconds gives the user a perception of "this is live" without hammering the L2 sidecar's REST endpoint.

## The rescan endpoint

The other piece of the commit is a `/api/sync` endpoint (and the matching `sync()` action in the Zustand store) that triggers a re-scan of L1 + L2 against the wallet's keys. The rescan reads:

- the L1 transparent UTXOs the wallet's addresses control
- the L2 encrypted-note inbox, trial-decrypting against the wallet's secret to find shielded notes addressed to it

Before this endpoint, "my balance is wrong" was an unrecoverable error state — the user would have to restart the wallet. With the rescan endpoint, "my balance is wrong" is a button click.
The button reports `{ newNotes, scannedToIndex, balance, unspentCount }` so the user sees something concrete: *"found 2 new notes."*

This is the kind of feature that's invisible until you don't have it, at which point support tickets stack up. Shipping it alongside the unified dashboard was the right pairing — the dashboard makes the user expect their balance to be live; the rescan endpoint backstops them when it isn't.

## The badge discipline

The activity feed shows transparent and shielded events together. Each row gets a privacy badge:

- `ShieldCheck` (purple) — shielded transaction
- `EyeOff` (purple) — incoming shielded note
- `Hash` — transparent transaction
- `Layers` — L2-only event (commitment landed but not yet associated with a wallet note)

The colour discipline is consistent across the wallet: purple is the privacy-feature colour, used for L2 elements and shielded states. Transparent elements use the default text colour. The viewer doesn't have to read a label to know which is which.

This is small. It also took longer than I expected to settle on. Earlier drafts had transparent transactions in green and shielded in purple, on the theory that "green = good, purple = brand colour." That backfired immediately — green coded as "fine, no need to look closer" and purple as "interesting, look closer," when on Vanta the desired hierarchy is the opposite (privacy is the default, transparent is the exception).

The current discipline: **shielded is the unmarked default; transparent is *marked* by being non-shielded.** A row without a special badge isn't transparent; it's shielded. A row with a transparent badge is the exception. Visual weight matches the expected long-run distribution.

## Why this is hard

Privacy-coin wallets have shipped with two-pane "shielded vs transparent" UX for years, and it's mostly *not* their fault. The frame leaks from the chain when the chain treats shielded as a separate pool.
On Zcash, you literally have shielded addresses (`zs1...`) and transparent addresses (`t1...`) — two different address families — and a wallet has to render that. Vanta dodges that frame because the chain treats commitments and UTXOs as two states of the same value, with a single address family (`vnt1...`) on top. That gives the wallet *room* to present a unified view. The wallet has to *take* the room, which is what the dashboard collapse is doing.

The principle: **the wallet's frame should match the chain's frame, not the wallet's data model.** The data model has commitments and UTXOs and an L2 sidecar and an L1 RPC. The frame the user sees should be "money in, money out, what's it doing." The data model is the wallet's problem.

## What I'd ship next

Three things on the list, in priority order.

1. **Per-note privacy decay indicator.** Coinbase rewards land transparent for one confirmation before decaying to private (the "fast privacy decay" pattern in [`papers/01-executive-summary.md`](https://github.com/Dax911/vanta/blob/main/papers/01-executive-summary.md)). The wallet should show, on each row, *whether* a note is in its decay window. If it is, the user gets a "wait one block before spending" hint. Today the wallet doesn't surface this — a power user can read it from the L2 status, but no normal user will.
2. **"Send" with no privacy choice.** The send flow today asks "transparent or private?" — even though the answer is *almost always* private. Make private the default; offer a "transparent send" advanced option behind a disclosure. Most users will never need to know transparent sends exist.
3. **Address book scoped to the wallet.** Privacy-respecting wallets often skip address books because of the linkability concern. Vanta can do this *in the wallet*, since the wallet is the only thing that knows which addresses the user has interacted with. Address book entries don't leak to the chain. This was the user-facing thing missing the longest.
The dashboard collapse is the foundation; these are the next-step UX wins it enables.

## The architectural lesson

The dashboard refactor is small but it's an example of the larger principle that runs through the whole wallet: **the privacy boundary is in the data, not the URL.** Two URLs implies two domains of knowledge a user has to manage. One URL, with private/transparent as a property of each row, implies *the wallet manages this and presents one cohesive thing.*

I want this principle to extend. The settings page shouldn't have a "privacy" tab. The send flow shouldn't have a "privacy" toggle as a primary control. The receive page shouldn't ask the user to choose between a shielded and a transparent address. Privacy is the default; transparent is the exception; everything else is the wallet's job to handle.

Some of those changes are shipped. Some are on the list. The dashboard collapse was the one that mattered most because it landed first and it set the discipline for everything else.
## Further reading

- [`wallet-ui/src/pages/dashboard.tsx`](https://github.com/Dax911/vanta/blob/main/wallet-ui/src/pages/dashboard.tsx) — the unified dashboard
- [`wallet-ui/src/pages/privacy.tsx`](https://github.com/Dax911/vanta/blob/main/wallet-ui/src/pages/privacy.tsx) — the old privacy view (kept for a while as a deep-dive page)
- [`wallet-ui/src/stores/privacy-store.ts`](https://github.com/Dax911/vanta/tree/main/wallet-ui/src/stores) — the Zustand store the dashboard pulls from
- [The vanta wallet HTTP API](/blog/vanta_wallet_axum_api/) — the L1 service the dashboard calls
- [The vanta sidecar architecture](/blog/vanta_sidecar_architecture/) — the L2 service the dashboard calls
- [Vanta Desktop: a Tauri wallet that ships its own full node](/blog/vanta_desktop_tauri_wallet/) — where this dashboard ends up living for end users

---

# The vanta wallet HTTP API: an Axum bridge to vantad RPC

Canonical: https://blog.skill-issue.dev/blog/vanta_wallet_axum_api/
Description: Before the Tauri desktop wallet there was an Axum web wallet. It is a five-route Rust service that wraps vantad's JSON-RPC and serves a single static page. Boring on purpose — and the boring is the point.
Published: 2026-04-13T18:46:45.000Z
Tags: vanta, rust, axum, wallet, api, rpc

The first wallet I shipped for Vanta wasn't a desktop app. It was a [Rust/Axum HTTP service](https://github.com/Dax911/vanta/tree/main/wallet) that wraps `vantad`'s JSON-RPC behind a small REST API and serves a single static HTML page. Five routes, one `Cargo.toml`, one `main.rs`, ~250 lines of Rust. Boring on purpose. The boring is the point — when you're bringing up an L1, the wallet has to be a tool you can debug inside of, not a black box.
This post is a read-along of [`wallet/src/main.rs`](https://github.com/Dax911/vanta/blob/main/wallet/src/main.rs), what the route surface buys you, what's coming next as the desktop app picks up the unified-dashboard work, and what the Axum service is *not* (it's not a key holder; it's a thin bridge).

## The dependency tree

The whole [`Cargo.toml`](https://github.com/Dax911/vanta/blob/main/wallet/Cargo.toml) fits in a screenshot:

```toml
[dependencies]
bitcoin = { version = "0.32", features = ["serde", "rand-std"] }
bitcoincore-rpc = "0.19"
axum = { version = "0.7", features = ["macros"] }
tokio = { version = "1", features = ["full"] }
tower-http = { version = "0.6", features = ["cors", "fs"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
anyhow = "1"
```

That's it. Axum for routing, `bitcoincore-rpc` for the typed RPC client (which works against `vantad` because the RPC contract is unchanged from Bitcoin Core v27.0), `bitcoin` for address parsing, `tower-http` for CORS and static-file serving. No database, no auth middleware, no template engine, no ORM. The wallet is a pass-through — the only real state is *whatever* `vantad` says. This is the choice I want to be loudest about. The temptation when you're forking Bitcoin is to ship a wallet that re-implements everything `vantad` already does. Don't. The wallet's job is *to make `vantad` legible from a browser.*

## The route surface

Five HTTP endpoints, registered in `main()` with the canonical Axum router:

```rust
let app = Router::new()
    .route("/", get(index))
    .route("/api/info", get(get_info))
    .route("/api/transactions", get(get_transactions))
    .route("/api/blocks", get(get_recent_blocks))
    .route("/api/send", post(send_zer))
    .route("/api/address/new", post(new_address))
    .layer(CorsLayer::permissive())
    .with_state(state);
```

Each route maps to one or two RPCs. 
Walking through them:

**`GET /` — the index.** This serves the static HTML+JS page bundled into the binary at compile time via `include_str!("../static/index.html")`. The page calls the four JSON endpoints below. Compile-time bundling is a one-binary deploy story: copy `vanta-wallet`, run it, the UI is *there*.

**`GET /api/info` — wallet + network status.** Five RPCs in one handler:

```rust
let balance = rpc.get_balance(None, None).unwrap_or_default();
let unconfirmed = rpc.get_balances().map(|b| b.mine.untrusted_pending).unwrap_or_default();
let block_count = rpc.get_block_count().unwrap_or(0);
let info = rpc.get_network_info().ok();
let mining = rpc.get_mining_info().ok();
```

Returns `WalletInfo { balance, unconfirmed_balance, block_count, connections, mining_address, difficulty }`. This is the polled-every-5-seconds heartbeat the index page uses.

**`GET /api/transactions` — last 50.** A direct passthrough to `listtransactions`, with a small Rust struct mapping over the result so the JSON the browser sees is stable across `bitcoincore-rpc` upgrades.

**`GET /api/blocks` — recent 10 blocks.** Walks `(height-10..=height)`, calls `getblockhash` and `getblockinfo` for each, returns a `Vec`. The per-block RPC round-trips make this O(n), but n is 10, so it's fine.

**`POST /api/send` — send VANTA.** Takes `{ address, amount }`, parses the address against the network (so `Z`-legacy and `vnt1`-bech32 both work), constructs an `Amount` from the float, and calls `send_to_address`. Errors are wrapped with `BAD_REQUEST` for parse failures and `INTERNAL_SERVER_ERROR` for RPC failures.

**`POST /api/address/new` — fresh receiving address.** Calls `getnewaddress` with an optional label.

That's the entire surface. There is intentionally no `wallet/create`, no key-import, no PSBT signing. Those operations go through `vanta-cli` directly — the wallet user is implicitly a `vantad` user. This is fine for the testnet phase. 
It is *not* fine for shipping to the public, which is why the desktop app exists.

## The settxfee dance

One detail in the `main()` startup that took me longer than it should have:

```rust
let _ = rpc.call::<bool>("settxfee", &[serde_json::json!(0.0001)]);
```

Bitcoin Core's fee estimator uses historical mempool data to predict the fee per byte. On a fresh chain with low traffic, it has no data. The default behaviour when the estimator can't decide is to error out of the send call — *not* to fall back to a default. You discover this the first time you try to send a tx on a fresh chain and get back "fee estimation failed." The fix is `settxfee` at startup with a sane fallback. `0.0001 VANTA/kB` is roughly nothing in real terms (one ten-thousandth of a unit, when each block pays out 100,000 units), but it's enough to satisfy the estimator's "have a fee" check. The same trick is in `txbot/src/main.rs` for the same reason. Bitcoin Core does ship a `-fallbackfee` config option for exactly this case, but it's disabled by default on mainnet-like chains; for now, a one-line workaround at every RPC client's startup.

## Auth, or the lack thereof

The Axum wallet binds to `0.0.0.0:8085` and runs `CorsLayer::permissive()`. Translation: anyone on the network can hit it. There's no token, no password, no rate limit. This is fine **for what it is** — a single-operator tool you run on a host you control, with the assumption that the only consumer is the static page bundled into the same binary. It is not fine for a multi-tenant deployment. The host firewall is the auth boundary. If you put this on the open internet you've made a mistake. The desktop app fixes this by running the equivalent logic in-process via Tauri IPC — there is *no* HTTP listener, so there's nothing for a browser tab on a malicious site to talk to. Read [The vanta sidecar architecture](/blog/vanta_sidecar_architecture/) and [Vanta Desktop](/blog/vanta_desktop_tauri_wallet/) for the longer story on that boundary. 
## What the API doesn't have, and where it goes

The Axum wallet was written *before* the privacy layer was wired in. So it shows transparent UTXOs only. That's why the wallet-ui split exists: there's a [`wallet-ui/`](https://github.com/Dax911/vanta/tree/main/wallet-ui) React app that calls *both* the Axum service and `vanta-node`'s REST API, and renders a unified view that interleaves transparent transactions with shielded notes. The 2026-04-17 commit message that motivated this whole post —

> wallet-ui: merge privacy view into unified dashboard + rescan endpoint

— is what landed when we collapsed the previously-separate `/privacy` page into the `/dashboard` page so users see *one* balance ("private balance") and *one* feed of activity. Behind the scenes the dashboard is calling:

- `GET /api/info` against the Axum wallet for L1 status (block count, connection count)
- `GET /status` against `vanta-node` for the L2 status (commitment count, nullifier count, SMT root)
- `GET /notes` against `vanta-node` for the wallet's shielded note inventory
- `POST /api/sync` (the new rescan endpoint) to trigger a re-scan of L1 + L2 against the wallet's keys

The unified-dashboard logic lives in [`wallet-ui/src/pages/dashboard.tsx`](https://github.com/Dax911/vanta/blob/main/wallet-ui/src/pages/dashboard.tsx). The L2 status card is a five-second auto-refresh that pulls SMT root, commitment count, nullifier count, and last block height, and renders it as four monospace numbers under the "L2 Privacy Layer" header. That's the surface a user sees; behind it are two Rust services and a C++ node.

## Why a separate service instead of merging into vanta-node

A reasonable design question: why does `wallet/` exist at all? Why isn't this one of `vanta-node`'s API endpoints? Two reasons. **Bitcoin-RPC stays as the wallet boundary.** The set of operations the L1 wallet does (send, receive, balance) maps 1:1 to Bitcoin Core RPC calls. 
Wrapping those in a small Axum service means the service is *replaceable* by anything that speaks the same five endpoints — a CLI, a different-language wallet, a hardware-wallet integration. That'd be harder if the L1 wallet primitives were tangled into the L2 sidecar's REST API. **`vanta-node` runs without a wallet.** A node operator who wants to index the chain but doesn't have a wallet on the node — say, a cold-storage setup or an indexer service — should be able to run `vanta-node` cleanly without a transparent-wallet listener implicitly bound. Keeping them separate means each service does one job. The desktop app is the unified frontend that talks to both. The web wallet is the developer/debug frontend that talks to the L1 service. In the medium term I expect the web wallet to be deprecated in favour of "you run the desktop app" — but the Axum service is staying for as long as anyone wants a portable HTTP-shaped wallet.

## What I would do differently

Three things.

1. **Bind to 127.0.0.1 by default.** The current `0.0.0.0:8085` is a footgun for someone who runs this in a non-trusted network without thinking about firewalls. Default to localhost; the user can opt in to LAN exposure with a flag.
2. **Drop `bitcoincore-rpc` for hand-rolled `reqwest`.** The crate is fine but I have hit type-mismatch issues every time `vantad` returns a slightly off-vanilla shape (e.g. our extra `value_balance` field on transactions). Going hand-rolled lets the wallet evolve with the chain without the upstream crate's maintainer in the loop.
3. **Type the receive endpoint against bech32.** Right now `getnewaddress` defaults to whatever the node is configured for (legacy `Z` or bech32 `vnt1`). The wallet should pass `bech32` explicitly so the address format the user sees is consistent.

None of these are urgent. The Axum wallet does its job. It's not the wallet I want to ship to a million users. 
It is the wallet I want behind the wallet I ship to a million users — a debug surface for me, when something is wrong with the chain and I want to talk to it from `curl`. ## Further reading - [`wallet/src/main.rs`](https://github.com/Dax911/vanta/blob/main/wallet/src/main.rs) — the entire Axum service, 250 lines - [`wallet/Cargo.toml`](https://github.com/Dax911/vanta/blob/main/wallet/Cargo.toml) — the dependency tree (small on purpose) - [`wallet-ui/src/pages/dashboard.tsx`](https://github.com/Dax911/vanta/blob/main/wallet-ui/src/pages/dashboard.tsx) — the React dashboard that calls this service - [Vanta Desktop: a Tauri wallet that ships its own full node](/blog/vanta_desktop_tauri_wallet/) — what replaces this for end users - [The vanta sidecar architecture](/blog/vanta_sidecar_architecture/) — how `vanta-node` complements this service - [Bitcoin Core JSON-RPC docs](https://bitcoincore.org/en/doc/27.0.0/) — the upstream contract the Axum service wraps --- # Stratum v1, the from-scratch Python version Canonical: https://blog.skill-issue.dev/blog/vanta_stratum_python_pool/ Description: Solo mining Vanta requires a Stratum server. Public-pool is fine for normal chains; mandatory privacy pushes the pool toward shielded coinbases, encrypted-note submission, and an L2 retry queue. pool/stratum_server.py does it all in stdlib Python. Published: 2026-04-13T17:34:24.000Z Tags: vanta, mining, stratum, python, bitaxe, privacy I wrote about [mining VANTA with a Bitaxe BM1368](/blog/mining_vanta_with_a_bitaxe/) — the hardware, the watts, the difficulty math, why solo mining a privacy fork actually pays off where solo mining Bitcoin in 2026 doesn't. This post is the deeper companion: what the Python Stratum server *does* once you've decided to write one from scratch, and why the privacy chain forced a few changes that wouldn't be required on a vanilla Bitcoin fork. 
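For readers who haven't stared at Stratum v1 before: the whole protocol is newline-delimited JSON-RPC over a long-lived TCP socket. Here is a minimal sketch of that dispatch shape in stdlib Python — the method names (`mining.subscribe`, `mining.authorize`, `mining.submit`) are real Stratum v1, but the extranonce values and handler bodies are illustrative placeholders, not `pool/stratum_server.py`'s actual logic:

```python
# Minimal Stratum v1 dispatch shape: newline-delimited JSON over TCP.
# socketserver.ThreadingTCPServer gives one blocking thread per miner.
import json
import socketserver

def handle_line(line: str) -> dict:
    """Map one Stratum v1 request line to a response dict."""
    req = json.loads(line)
    method, rid = req.get("method"), req.get("id")
    if method == "mining.subscribe":
        # result: [[subscriptions], extranonce1, extranonce2_size]
        return {"id": rid, "result": [[], "deadbeef", 4], "error": None}
    if method == "mining.authorize":
        return {"id": rid, "result": True, "error": None}
    if method == "mining.submit":
        # a real pool rebuilds the header here, checks it against the
        # share/block target, and calls submitblock when it wins
        return {"id": rid, "result": True, "error": None}
    return {"id": rid, "result": None, "error": [20, f"unknown {method}", None]}

class MinerHandler(socketserver.StreamRequestHandler):
    def handle(self):
        for raw in self.rfile:  # one JSON object per line
            resp = handle_line(raw.decode())
            self.wfile.write((json.dumps(resp) + "\n").encode())

def serve(port: int = 3333):
    """Blocking server loop — one thread per connected miner."""
    with socketserver.ThreadingTCPServer(("0.0.0.0", port), MinerHandler) as srv:
        srv.serve_forever()
```

That is genuinely most of the protocol machinery; everything interesting in a real pool lives behind the `mining.submit` branch.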
The whole server is one file: [`pool/stratum_server.py`](https://github.com/Dax911/vanta/blob/main/pool/stratum_server.py). No external dependencies — pure stdlib Python. Around 600 lines. Every line earns its place; this isn't an optimised pool, it's a *correct* pool that I can debug from a terminal at 4 AM.

## Why Python at all

A reasonable thing to ask: "you're shipping Rust everywhere else, why is the pool Python?" Three reasons.

1. **Stratum v1 is a 200-line protocol.** It's JSON over a long-lived TCP connection. `socketserver.ThreadingTCPServer` is exactly the right shape: one thread per connected miner, blocking I/O, no async machinery to argue about.
2. **The interesting work is talking to `vantad` over JSON-RPC and to the L2 sidecar over REST.** Both are HTTP-shaped. `http.client` and `urllib.request` are stdlib. Zero dependency surface.
3. **I can edit the running pool.** When you're debugging a chain at 4 AM and your Bitaxe disconnected, "edit the script and restart" is a faster path than "edit the Rust, recompile, redeploy, kill and restart." Python wins on the iteration loop.

The full upstream story is that Vanta originally used a public-pool fork (Node.js) and the Python server is the *replacement* I wrote when the public-pool fork couldn't handle the privacy-coinbase requirements. That's the part the rest of this post is about.

## Mandatory privacy mining

Vanta v2 chain consensus rejects any non-coinbase transaction that doesn't satisfy the witness-v2 commitment-binding rules. **Coinbase transactions are also required to be witness v2.** From the top of the Stratum server:

```python
SHIELDED_PUBKEY = os.environ.get("SHIELDED_PUBKEY", "").strip()
if not SHIELDED_PUBKEY or len(SHIELDED_PUBKEY) != 64:
    print("[FATAL] SHIELDED_PUBKEY env var is required (32-byte hex, 64 chars).", file=sys.stderr)
    print("        Vanta v2 chain has no transparent mining payouts.", file=sys.stderr)
    sys.exit(1)
```

The pool refuses to start without a shielded pubkey. 
There is no transparent fallback. A miner who points a Bitaxe at this server is paying out into a shielded note from block one. The note-construction code is also worth quoting because it pinned down the on-chain format we ended up shipping:

```python
def create_mining_note(value: int, owner_pubkey: bytes) -> tuple:
    """Create a private note for auto-shielded mining reward.
    Returns (commitment_hash, randomness)."""
    randomness = os.urandom(32)
    # preimage layout reconstructed here — see pool/stratum_server.py
    # for the canonical field order
    preimage = (
        struct.pack('<Q', value)  # note value, little-endian u64
        + owner_pubkey            # 32-byte shielded pubkey
        + randomness              # 32-byte blinding factor
    )
    commitment_hash = hashlib.sha256(preimage).digest()
    return commitment_hash, randomness


def witness_v2_script(commitment_hash: bytes) -> bytes:
    """Build witness v2 scriptPubKey: OP_2 PUSH32 <commitment>."""
    return bytes([0x52, 0x20]) + commitment_hash
```

`OP_2 PUSH32 <commitment>` is the witness-v2 anchor format; the C++ consensus code parses this exactly and uses the pushed 32 bytes as the input commitment when a future spend witness comes through. From the chain's perspective, the coinbase pays into "this commitment" and the value field on the transaction is zero. The pool also adds an OP_RETURN anchor for L2 indexers to find:

```python
def commitment_anchor_script(commitment_hash: bytes) -> bytes:
    """Build OP_RETURN anchor: OP_RETURN PUSH34 <0xbb 0x00 || commitment>."""
    payload = bytes([0xbb, 0x00]) + commitment_hash  # 34 bytes
    return bytes([0x6a]) + encode_varint(len(payload)) + payload
```

`OP_RETURN 0xbb 0x00 <commitment>` is the indexer-side anchor; `vanta-node`'s L1 watcher scans for this byte sequence and feeds matches into the SMT.

## Solo-mining accounting vs pool accounting

Public pools track per-miner shares and pay out at end-of-round based on share contributions. The math is non-trivial: PPLNS, FPPS, score-based, etc. A solo pool doesn't need any of that. Whoever finds the block keeps the whole reward. The Stratum server has miners, but the only thing it does with their share-count is *monitoring*, not accounting. This simplification is huge. There's no payout database, no end-of-round settlement, no fee policy, no withdrawal endpoint. The pool's only persistent state is: 1. The pending-L2-submission queue (`pending_l2_submissions.json`). 2. 
Optionally the local note backup (`SHIELDED_NOTES_FILE`, off by default). Both are JSON files. The pool can be killed, restarted, even moved between machines, and the only state that matters is the on-chain commitment + the L2 sidecar's encrypted-note inbox. The pool host isn't the source of truth for anything user-visible. This is *exactly* how a solo-mining server should work. The complexity of a public pool comes from settling between multiple miners. A solo pool inherits none of that.

## The L2 retry queue

This is the thing that ate a week to get right. The flow:

1. Bitaxe submits a share that meets block difficulty.
2. Pool calls `submitblock` against `vantad`.
3. `vantad` accepts the block.
4. Pool generates the encrypted note for the miner's reward.
5. Pool POSTs the encrypted note to `vanta-node`'s `/submit` endpoint.

What if step 5 fails? The L2 sidecar might be restarting, the network might be flaky, the sidecar might be slow under load. We can't lose the encrypted note — without it, the miner's wallet can't discover the reward. The first version retried in-process, blocking the share-acceptance loop. Bad idea: a slow L2 stalls the whole pool. The second version queued the failed submission to a file and a background thread retried every 30 seconds:

```python
def _retry_worker():
    """Background worker — drains the L2 retry queue every 30 seconds."""
    while True:
        try:
            drain_pending_l2_queue()
        except Exception as e:
            print(f"[SHIELD] retry worker error: {e}")
        time.sleep(30)
```

This is the version that shipped. Failed submissions go to `pending_l2_submissions.json`, get retried until accepted, get removed from the queue. The pool host can be restarted and the queue persists. A subtle detail: this is called *only* on `submitblock` accept, not on every Stratum job-template push. From the comment in `save_shielded_note`:

```python
"""Persist mining note and submit encrypted note to L2 for wallet discovery.

Called ONLY after a winning block is accepted by submitblock — never from the
per-share job-template path, otherwise the L2 SMT fills up with phantom
commitments for templates that never won the PoW race.
"""
```

The first version called this from the job-template path, which is the path that runs every time the pool decides to push fresh work to its connected miners. With a 1-minute block time and longpoll discipline, that's roughly every 1–2 seconds. So the L2 SMT was getting hundreds of phantom commitments per actual block, all for blocks that never won the PoW race. That bug shipped to a testnet for about 6 hours before I noticed; the cleanup involved replaying the L2 from genesis with the fix applied. Don't put non-idempotent side effects in your job-template path.

## Encrypted-note construction

The `encrypt_note_for_recipient` function is the bit that lets a miner's wallet *find* its reward without the chain leaking what was paid:

```python
def encrypt_note_for_recipient(recipient_pubkey: bytes, value: int, asset_type: int,
                               randomness: bytes, commitment: bytes) -> dict:
    """Encrypt note data so the recipient can discover it via L2 sync.
    Matches vanta-core encrypt.rs exactly: domain-separated SHA256 + XOR stream."""
    ephemeral_secret = os.urandom(32)
    ephemeral_pubkey = hash_with_domain(b"Vanta/Ephemeral/v1", ephemeral_secret)
    shared_secret = hash_with_domain(b"Vanta/SharedSecret/v1",
                                     ephemeral_pubkey + recipient_pubkey)
    # plaintext layout and keystream expansion reconstructed here — see
    # pool/stratum_server.py and vanta-core's encrypt.rs for the canonical format
    plaintext = struct.pack('<QI', value, asset_type) + randomness
    keystream = b""
    counter = 0
    while len(keystream) < len(plaintext):
        keystream += hash_with_domain(b"Vanta/NoteCipher/v1",
                                      shared_secret + bytes([counter]))
        counter += 1
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, keystream))
    return {
        "commitment": commitment.hex(),
        "ephemeral_pubkey": ephemeral_pubkey.hex(),
        "ciphertext": ciphertext.hex(),
    }
```

---

# BN254: where the 128 bits went

> This is a working post in my [PhD-by-publication track](/about). The arithmetic is checked against [Barbulescu and Duquesne (2019)](https://eprint.iacr.org/2017/334) for the security level estimates and the [IETF CFRG pairing-friendly curves draft](https://datatracker.ietf.org/doc/draft-irtf-cfrg-pairing-friendly-curves/) for the standardisation status. 
## The minimum cryptography you need A pairing is a bilinear map $$ e : \mathbb{G}_1 \times \mathbb{G}_2 \to \mathbb{G}_T $$ with $\mathbb{G}_1, \mathbb{G}_2$ cyclic groups of prime order $r$ on an elliptic curve $E$, and $\mathbb{G}_T$ a multiplicative subgroup of an extension field $\mathbb{F}_{p^k}$. *Bilinear* means $$ e(a P, b Q) = e(P, Q)^{ab} $$ for any $a, b \in \mathbb{Z}_r$ and generators $P, Q$. That single equation is the entire reason pairing-based cryptography exists — it lets you "multiply in the exponent" across two different groups, which is exactly what Groth16's verification equation needs. Two parameters drive everything. **The embedding degree $k$** is the smallest integer with $r \mid p^k - 1$; it sets the size of the target field $\mathbb{F}_{p^k}$. **The base field characteristic $p$** sets the cost of every operation in $\mathbb{G}_1$. The security of the pairing rests on: - The discrete log problem (DLP) in $\mathbb{G}_1$ and $\mathbb{G}_2$ — protected by Pollard's rho, cost $\sqrt{r}$, so we want $r \approx 2^{256}$ for 128-bit security. - The DLP in $\mathbb{F}_{p^k}^*$ — protected by the **number field sieve**, cost subexponential in $p^k$, so we want $p^k$ large enough that NFS is no easier than $\sqrt{r}$. The trick of pairing-friendly curve design is to find $(p, r, k)$ where both DLPs are hard *and* $p$ is small enough that field operations don't dominate. BN curves use the parameterisation $$ p(t) = 36 t^4 + 36 t^3 + 24 t^2 + 6 t + 1, $$ $$ r(t) = 36 t^4 + 36 t^3 + 18 t^2 + 6 t + 1, $$ with $E: y^2 = x^3 + 3$ defined over $\mathbb{F}_p$ and embedding degree $k = 12$. Pick $t$ such that $p$ and $r$ are both prime, and you get a curve. BN254 is the choice $t = 4965661367192848881$ — an integer carefully chosen so $p$ has 254 bits and $r$ has 254 bits, and so the resulting field arithmetic is reasonably efficient. 
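The parameterisation is cheap to sanity-check with stdlib Python — the bit lengths, the primality of $p$ and $r$, and the identity $p - r = 6t^2$ (for BN curves the trace of Frobenius is $6t^2 + 1$, and $r = p + 1 - \mathrm{tr}$). The Miller-Rabin helper below is my own scratch code, not from any codebase in this series:

```python
# Check the BN parameterisation at BN254's t: bit lengths, primality,
# and the trace relation p - r = 6t^2.
import random

def is_probable_prime(n: int, rounds: int = 20) -> bool:
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % small == 0:
            return n == small
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

t = 4965661367192848881
p = 36*t**4 + 36*t**3 + 24*t**2 + 6*t + 1
r = 36*t**4 + 36*t**3 + 18*t**2 + 6*t + 1

print(p.bit_length(), r.bit_length())              # 254 254
print(is_probable_prime(p), is_probable_prime(r))  # True True
print(p - r == 6*t**2)                             # True
```

Both bit lengths come out at 254, which is the "254" in the name.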
## Where the 128 bits went When BN254 was deployed in 2010-2015, the security argument was: $p^{12} \approx 2^{3048}$, the NFS algorithm at the time required $\approx 2^{128}$ field operations to break the DLP in $\mathbb{F}_{p^{12}}^*$, and Pollard's rho on $\mathbb{G}_1$ required $\approx 2^{127}$. Both legs landed at 128-bit security. Done. Then [Kim and Barbulescu (2016)](https://eprint.iacr.org/2015/1027) introduced **exTNFS**, an extended Tower NFS variant that exploits the structure of $\mathbb{F}_{p^k}$ when $k$ has a non-trivial factorisation (which $k = 12 = 4 \cdot 3$ does). The complexity of NFS dropped, and the [Barbulescu-Duquesne (2019) update](https://eprint.iacr.org/2017/334) re-estimated the security of BN254 at **roughly 100-110 bits** — depending on which constant in the NFS asymptotic you trust. That is the gap. The curve is not broken. The pairing still works. But "BN254 = 128-bit security" was the marketing line, and after 2016 it should have been "BN254 ≈ 100 bits." The honest table:

| Curve | Post-exTNFS security | Blast radius (what's deployed on it) |
|---|---|---|
| BN254 | ~100-110 bits | Ethereum precompile, Solana `alt_bn128_*`, most Groth16 tooling |
| BLS12-381 | ~120-126 bits | Ethereum, Filecoin, Zcash Sapling |
| BLS12-446 | ~128 bits with margin | little production exposure |
| BN462 | ~128 bits, CFRG-recommended | little production exposure |

The blast-radius column is the load-bearing one. **BN254 is not broken in 2026.** A 100-bit security level still costs an attacker $\sim 2^{100}$ field operations, which is not within the budget of any actor we model. But it is also not a curve you start a fresh decade-long deployment on. 
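The two legs of that argument are easy to reproduce with rough log accounting: the generic-attack leg costs $\sqrt{r}$ group operations via Pollard's rho (the leg exTNFS did *not* touch), and the NFS leg attacks the DLP in the target field $\mathbb{F}_{p^{12}}$. A quick back-of-envelope:

```python
# Rough sizes of the two attack legs against BN254. Pollard's rho
# costs ~sqrt(r) group operations; NFS attacks F_{p^12}, a field of
# a bit over 3000 bits.
t = 4965661367192848881
p = 36*t**4 + 36*t**3 + 24*t**2 + 6*t + 1
r = 36*t**4 + 36*t**3 + 18*t**2 + 6*t + 1

rho_bits = r.bit_length() // 2            # ~log2(sqrt(r))
target_field_bits = (p**12).bit_length()  # size of F_{p^12}

print(f"Pollard rho:  ~2^{rho_bits} group ops")
print(f"target field: {target_field_bits}-bit F_p^12")
```

The rho leg still sits at $\sim 2^{127}$; only the NFS leg moved, which is why the post-exTNFS estimate hinges entirely on the target-field analysis.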
## The migration hierarchy

The pairing-friendly curve landscape, drawn as a hierarchy of "what would I deploy next":

```mermaid
flowchart TD
    BN254["BN254 — ~100 bit after exTNFS, today's default"]
    BN254 --> BLS381["BLS12-381 — ~120-126 bit, Ethereum/Filecoin/Sapling"]
    BLS381 --> BLS446["BLS12-446 — clean ~128 bit with margin"]
    BLS381 --> BLS24["BLS24-509 — embedding degree 24, niche"]
    BLS446 --> PQ["Post-quantum candidates: lattice (Falcon/Dilithium), STARK-based"]
    BLS24 --> PQ
    BN254 -.-> PQ
    classDef now fill:#1a1a1a,stroke:#4ade80,color:#4ade80
    classDef next fill:#1a1a1a,stroke:#a3a3a3,color:#e8e8e8
    classDef long fill:#1a1a1a,stroke:#737373,color:#a3a3a3
    class BN254 now
    class BLS381,BLS446,BLS24 next
    class PQ long
```

The bottom row is what kills pairing-based cryptography eventually. Shor's algorithm runs in polynomial time on a sufficiently large quantum computer, the discrete log breaks, and every curve in the diagram above goes to zero overnight. The realistic time horizon for that is *not 2026* — the largest credible quantum factorisation as of last year is still toy-scale — but it is the reason you build a hash function migration story into your protocol from day one. We did this in [zera-sdk](/blog/zera_sdk_scaffolding/) by isolating the curve choice to a single `crates/zera-sdk-core/src/curve.rs` module. A future migration to BLS12-381 is one type alias and a regenerated `.zkey`. A migration to a lattice-based scheme is a bigger lift but the seam is clean.

## Why the IETF still hasn't picked one

The IETF CFRG has been running a pairing-friendly curves working group since 2018. As of [draft 11](https://datatracker.ietf.org/doc/draft-irtf-cfrg-pairing-friendly-curves/), the recommendation lists **BLS12-381 and BN462** as the two curves with 128-bit security after exTNFS. BN254 is explicitly *not* recommended for new deployments — the draft notes:

> The BN curves with smaller parameters such as BN254 should not be used for applications requiring 128-bit security level due to the recent improvements of the number field sieve algorithm. 
> Implementations targeting the 128-bit security level SHOULD use BLS12-381 or BN462.

The reason BN254 is still the production default in 2026 despite this is one part path-dependence (the Ethereum precompile is BN254 and rewriting that is a hard fork) and one part cost (BLS12-381 is roughly 50% slower per pairing and 50% larger per group element). For a privacy pool that is already paying tens of milliseconds per proof, the trade-off is real. The clean argument: BN254 today, BLS12-381 next, lattice-based when the quantum threat becomes credible. That ordering is what every serious protocol designer I've talked to in the last year converges on.

## Pairing arithmetic, by hand

The pairing itself is a Miller-loop algorithm followed by a final exponentiation. It is unreasonable to derive in a blog post — go read [Barreto, Galbraith, Ó hÉigeartaigh, Scott (2007)](https://eprint.iacr.org/2007/077) for the optimal-Ate construction — but the *bilinearity check* is one line and worth seeing:

$$ e([a]P, [b]Q) = e(P, Q)^{ab} $$

The toy below verifies this property. It uses a synthetic group and a synthetic pairing — *not* BN254, because computing a real BN254 pairing in 60 lines of TypeScript is not honest pedagogy. The shape of the relations is real. The numbers are not.

```ts
// Synthetic "pairing" demo: G_1, G_2, G_T are all the multiplicative
// group mod a tiny prime, so the bilinear algebra is checkable by hand.
const Q = 2147483647n;   // 2^31 - 1, prime (tiny on purpose)
const G = 7n;            // a primitive root mod Q
const ORD = Q - 1n;      // group order

function modpow(base: bigint, exp: bigint, m: bigint): bigint {
  let r = 1n;
  base %= m;
  while (exp > 0n) {
    if (exp & 1n) r = (r * base) % m;
    base = (base * base) % m;
    exp >>= 1n;
  }
  return r;
}

// "G_1" and "G_2" are both the same multiplicative group here for the
// demo. In real pairing curves they're DIFFERENT EC subgroups.
function scalarMul(P: bigint, k: bigint): bigint {
  return modpow(P, k, Q);
}

function babyStepGiantStep(target: bigint): bigint {
  // tiny demo helper — only feasible because Q ~ 2^31.
  const m = 1n << 16n;
  const table = new Map<string, bigint>();
  let cur = 1n;
  for (let j = 0n; j < m; j++) {
    table.set(cur.toString(), j);
    cur = (cur * G) % Q;
  }
  const factor = modpow(modpow(G, m, Q), ORD - 1n, Q); // G^{-m}
  let gamma = target;
  for (let i = 0n; i < m; i++) {
    const hit = table.get(gamma.toString());
    if (hit !== undefined) return (i * m + hit) % ORD;
    gamma = (gamma * factor) % Q;
  }
  throw new Error("no log");
}

// "Pairing": we model e(g^a, g^b) = g^{ab} by recovering the exponents.
// Real pairings are far more involved; this is only the algebra.
function pair(P: bigint, R: bigint): bigint {
  const a = babyStepGiantStep(P);
  const b = babyStepGiantStep(R);
  return modpow(G, (a * b) % ORD, Q);
}

const a = 17n, b = 23n;
const P = scalarMul(G, a);
const R = scalarMul(G, b);

const direct = pair(P, R);                       // e(P, R)
const eGG = pair(G, G);
const expected = modpow(eGG, (a * b) % ORD, Q);  // e(G, G)^{ab}

console.log(`e(P, R)      = ${direct}`);
console.log(`e(G, G)^{ab} = ${expected}`);
console.log(`bilinear holds: ${direct === expected}`);

// Mismatched scalars confirm the structure.
console.log(pair(scalarMul(G, 5n), scalarMul(G, 11n)) === modpow(eGG, 55n, Q));
```

The Rust shape of a real pairing — using [arkworks](https://github.com/arkworks-rs) — is much closer to a one-liner once the curve is in scope:

```rust
// Skeleton showing how arkworks expresses bilinearity. Won't compile here
// without the ark-bn254 / ark-ec deps; this is the SHAPE of the production
// code in zera-sdk-core/src/pairing.rs.

// use ark_bn254::{Bn254, Fr, G1Affine, G2Affine};
// use ark_ec::{pairing::Pairing, PrimeGroup};

fn main() {
    // let g1 = G1Affine::generator();
    // let g2 = G2Affine::generator();
    // let a = Fr::from(17u64);
    // let b = Fr::from(23u64);
    //
    // let p1 = (g1 * a).into();
    // let q1 = (g2 * b).into();
    //
    // let lhs = Bn254::pairing(p1, q1);
    // let rhs = Bn254::pairing(g1, g2).pow(&[(a * b).into_bigint().0[0]]);
    //
    // assert_eq!(lhs, rhs); // bilinearity
    //
    // The whole Groth16 verifier reduces to a constant number of these
    // pairings — three for Groth16, plus a multi-pairing-product check.
    println!("see crates/zera-sdk-core/src/pairing.rs for the real code");
}
```

The real implementation lives in [arkworks-rs/algebra](https://github.com/arkworks-rs/algebra) and [supranational/blst](https://github.com/supranational/blst). The latter is what production Ethereum and Solana ZK code links against — `blst` is Supranational's audited BLS12-381 pairing library, with constant-time multi-scalar multiplication that beats anything else in open source.

## What changes if we move to BLS12-381

The migration cost is not the curve. The migration cost is everything that touches the curve. 1. **Re-run the trusted setup.** Groth16 needs a per-circuit setup. Migrating to BLS12-381 means a fresh ceremony for every circuit. That is non-trivial — a Powers-of-Tau ceremony runs for months — but it is also not blocked on cryptography. 2. **Regenerate the verifying keys.** Every on-chain verifier ships a verifying key (a few KB of curve points). Those have to be regenerated and re-deployed. 
On Solana, that's a program upgrade. On Ethereum, that's a fresh contract deploy. 3. **Update every prover.** snarkjs, rapidsnark, the Rust prover in [zera-sdk-core](https://github.com/Dax911/zera-sdk) — all of them. The `ff` and `pairing` crate ecosystem in Rust is curve-generic, so this is an `ark-bn254` → `ark-bls12-381` swap and a recompile. The TypeScript side is harder because circomlibjs is BN254-pinned in places. 4. **Eat the verifier-cost hit.** A BLS12-381 pairing is roughly 50% more expensive than BN254. On Solana that's 1500 extra compute units per pairing-based verification. Multiplied by the four pairings in a typical multi-input transfer proof, that's 6000 CUs — meaningful but absorbable. We've scoped this work for [zera-sdk](/blog/zera_sdk_scaffolding/) v2 but not yet committed to a date. The bet is: BN254 carries us through 2027 deployments comfortably, and the BLS12-381 migration is a clean lift the moment the broader Solana ecosystem standardises on it (the `alt_bls12_381_*` syscalls have been behind cargo feature flags since 2025).

## Where this lands in the stack

In `crates/zera-sdk-core/src/curve.rs` the curve is a single type alias:

```rust
// Currently:
pub type Curve = ark_bn254::Bn254;
pub type Fr = ark_bn254::Fr;
pub type G1 = ark_bn254::G1Affine;
pub type G2 = ark_bn254::G2Affine;

// Future:
// pub type Curve = ark_bls12_381::Bls12_381;
// pub type Fr = ark_bls12_381::Fr;
// ...etc
```

The whole SDK reads from `Curve`, `Fr`, `G1`, `G2`. The migration is a four-line swap and a re-ceremony. The cleanliness is on purpose — see [Building the Zera SDK: Day One](/blog/zera_sdk_scaffolding/) for why we boxed the curve choice the way we did. What I tell people who ask "should I deploy on BN254 or BLS12-381?": deploy on BN254 if you need to compose with the Ethereum precompile or the Solana `alt_bn128_*` syscall *today*, deploy on BLS12-381 if you don't and you want the security headroom. The math is the math. 
The deployment surface is what makes the call. ## Further reading - [Updating Key Size Estimations for Pairings](https://eprint.iacr.org/2017/334) — Barbulescu, Duquesne (Journal of Cryptology 2019) — the post-exTNFS security recompute. - [Extended Tower Number Field Sieve: A New Complexity for the Medium Prime Case](https://eprint.iacr.org/2015/1027) — Kim, Barbulescu (CRYPTO 2016) — the attack that dropped BN254's security. - [Pairing-Friendly Curves (IETF CFRG draft)](https://datatracker.ietf.org/doc/draft-irtf-cfrg-pairing-friendly-curves/) — the standardisation path. - [BLS12-381 For The Rest Of Us](https://hackmd.io/@benjaminion/bls12-381) — Ben Edgington's accessible explainer. - [arkworks-rs/algebra](https://github.com/arkworks-rs/algebra) — the curve-generic Rust implementation we use. - [supranational/blst](https://github.com/supranational/blst) — the production BLS12-381 library. - [Pedersen commitments, in production](/blog/pedersen_commitments_in_production/) — the sister piece on what we commit *with*. - [Poseidon, by hand and by code](/blog/poseidon_by_hand_and_by_code/) — the hash that lives inside circuits over the same curve. --- # Privacy's broadband moment Canonical: https://blog.skill-issue.dev/blog/privacys_broadband_moment/ Description: ZK got fast, hardware got attestable, AI agents started carrying their own wallets, and regulators stopped trying to ban math. Four curves crossed and privacy stopped being a research topic — it became infrastructure. Published: 2026-04-15T08:00:00.000Z Tags: zera, zk, cryptography, strategy, founders There is a phrase we keep using internally at [Zera Labs](https://zeralabs.org): *privacy's broadband moment.* It started as a slide-deck line, the kind of thing you put in front of an investor to explain why a fifteen-year-old idea is suddenly a 2026 product. After a year of saying it I realised it is also the most precise description I have for what is actually happening in the cryptography stack right now. 
Broadband did not arrive because someone invented broadband. It arrived because **four unrelated curves crossed at the same time**: fibre got cheap, video codecs got good, last-mile rights-of-way got resolved, and people stopped thinking of "the internet" as a separate thing they used at a desk. None of those four was sufficient. All four together were inevitable.

Zero-knowledge cryptography is having the same moment. I want to lay out the four curves I see, one at a time, and then say what we are doing about it.

## Curve 1 — proof systems finally got fast

For most of the last decade, "fast ZK" meant Groth16 over BN254 with a trusted setup and proving times measured in seconds for circuits that did anything useful. That was good enough for academic papers and bad enough for products. People shipped in spite of it. Tornado Cash circuits took four-plus seconds to prove on a laptop in 2020. That is not a consumer experience; that is a research demo.

The thing that actually changed in 2024 and 2025 is the boring thing: **hash-friendly arithmetisation went mainstream.** Poseidon (and the Poseidon-2 successor) went from a "cool paper at SAC 2019" to the default ZK-friendly hash inside almost every modern proof system. Once you have a hash that costs ~250 constraints per permutation instead of the ~24,000 that SHA-256 takes inside a SNARK, the entire calculus of "what circuits are practical to prove on a phone" inverts.

The [`zera-sdk` Rust core](https://github.com/Dax911/zera-sdk) ships Poseidon as the only commitment hash. We did not invent that decision; we inherited it. Every serious privacy pool in 2026 made the same call. The reason ZERA can talk about *unified* shielding — one pool that holds USDC and USDT and SOL and `$ZERA` and a dozen other tokens at once — is that the per-note proof cost finally dropped below the threshold where wallet UX would tolerate it.
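To make the constraint arithmetic concrete, here is a back-of-envelope sketch. The per-permutation figures are the rough numbers from the paragraph above; the depth-32 Merkle tree is my illustrative assumption, not a ZERA parameter.

```rust
// Back-of-envelope constraint budget for proving one Merkle membership path.
// The per-hash costs are the rough figures quoted above; the tree depth is
// an illustrative assumption, not a ZERA parameter.
const POSEIDON_CONSTRAINTS: u64 = 250; // ~constraints per Poseidon permutation
const SHA256_CONSTRAINTS: u64 = 24_000; // ~constraints per SHA-256 compression
const TREE_DEPTH: u64 = 32; // hypothetical Merkle tree depth

fn path_cost(per_hash: u64) -> u64 {
    // one hash per tree level on the way to the root
    TREE_DEPTH * per_hash
}

fn main() {
    println!("Poseidon path: {} constraints", path_cost(POSEIDON_CONSTRAINTS)); // 8,000
    println!("SHA-256 path:  {} constraints", path_cost(SHA256_CONSTRAINTS)); // 768,000
    println!("ratio: {}x", SHA256_CONSTRAINTS / POSEIDON_CONSTRAINTS); // 96x
}
```

At these numbers a single membership proof drops from hundreds of thousands of constraints to a few thousand, which is the difference between "prove on a server" and "prove in a wallet."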
I wrote about how this looks at the metal level in [Pedersen commitments, in production](/blog/pedersen_commitments_in_production/) and [Nullifiers without the witchcraft](/blog/nullifiers_without_witchcraft/). Short version: the production implementation is six lines of code per primitive, and the line of code that made it six lines instead of six hundred is the choice of Poseidon.

## Curve 2 — hardware attestation stopped being theatre

The second curve is the one nobody likes to talk about because it sounds like 2014 trusted-execution marketing. But it is real now in a way that it was not.

Apple's Secure Enclave shipped in 2013. For a decade it was a place you stored your fingerprint hash and your Apple Pay tokens. In 2026 it is a place you can ship cryptographic primitives that the OS itself cannot read or steal, *with attested provenance.* Pixel devices have Titan M2. Modern AMD chips have SEV-SNP. ARM TrustZone is everywhere. The attestation chains are documented, the developer APIs are stable enough to build against, and — critically — the *threat model* for what a TEE actually buys you stopped being aspirational.

This matters for the [True Offline Payments](https://zeralabs.org/#features) pillar of ZERA in a way that is hard to overstate. "Offline P2P payments" without a hardware trust anchor is a euphemism for "double-spend forever." With one, it is a sequence-numbered key-attested signature over a note that the rest of the network can verify when they reconcile. The cryptography is the easy part. The cryptography has been ready for a long time. What was not ready until very recently was the assumption that the user has a real TEE in their pocket and that we can tell whether they do.

[Foundry Digital taught me to think like an operator](/blog/what_running_a_bitcoin_mine_taught_me/) — the hardware *is* the system.
ZERA Hardware exists for the same reason mining ASICs exist: when the math is fixed and the silicon is differentiated, infrastructure is where the next decade of value lands.

## Curve 3 — AI agents grew wallets

The third curve is the one I genuinely did not see coming until late 2025. Coinbase shipped [x402](https://www.coinbase.com/developer-platform/discover/protocols/x402) — a stablecoin-payment protocol over HTTP — and the AI agent ecosystem absorbed it within a quarter. Anthropic's MCP standard went from "interesting Anthropic side project" to "ten thousand public servers, ninety-seven million SDK downloads a month" in the same window. Two things that should not have collided collided: **autonomous AI agents now carry their own wallets**, and the protocols they use to pay each other are running on stablecoin rails.

The implication for privacy is not subtle. An autonomous agent that buys a search result for `0.001 USDC` is making a transaction that — under any current rail — is permanently legible to anyone watching the chain. If your agent buys ten thousand search results across an afternoon while it does research for you, the sum of those transactions is a *behavioural signature* of you. Not your agent. *You.* Because the agent is acting on your instructions.

This is the use-case that turned privacy from "a thing crypto people argue about on Twitter" into "a thing every AI platform team will be procuring by Q4." There is no version of an autonomous-agent economy that is also a transparent-by-default payments graph. Either agents acquire privacy primitives, or agents stop being economically rational to operate at scale. We are betting that the first thing happens.

I wrote the threat-model framing for this earlier in the year — see the post on the [x402 honeypot research artifact](/blog/x402_honeypot_disclosure/) for why this is a 2026 problem and not a 2028 one.

## Curve 4 — the regulatory weather changed

I do not love writing about regulation.
I will keep this short. For most of the last decade, "we are building privacy infrastructure" was a sentence you said at a developer conference in Berlin and not at a meeting at the SEC. The Tornado Cash sanctions in 2022, the chilling effect on Nym and Aztec, the post-FTX legislative panic — all of it pushed serious privacy work either offshore or underground.

Two things shifted that. First, the [Fifth Circuit ruling overturning the Tornado Cash sanctions](https://www.fifthcircuit.gov/) in late 2024 re-established that *immutable code is not a sanctioned entity*. Second, the broader 2025-2026 stablecoin clarity work in the US, EU MiCA implementation, and the Hong Kong VASP regime made it possible for compliant venues to handle privacy assets the way they handle any other asset class — with KYC at the edges and pseudonymity in the middle.

ZERA is built **token-agnostic, chain-agnostic, and compliance-aware.** The pool holds USDC. USDC has a freeze function. We do not pretend it does not. The interesting design question stops being "how do we build a system that defies the regulator" and becomes "how do we build a system the regulator can verify *without* the regulator becoming a panopticon."

The answer to that question is zero-knowledge. The reason the answer is finally usable is that curves one through three made it cheap.

## What we are doing about it

Four curves crossing is necessary but not sufficient. Someone has to actually ship the thing. That is what Zera Labs is for. Concretely:

- **One unified shielded pool** instead of one per asset class. The pool is built on Solana for the [account-compression-driven cost model](/blog/zeraswap_compressed_amm/) — Light Protocol's compressed accounts let us amortise the per-note state cost down to something that works at consumer-payment scale.
- **A wallet that does not assume you are sitting at a desk.** [Zera Wallet](https://wallet.zeralabs.org) targets desktop, iOS, and Android *with the same primitives* — the offline-P2P story is real and is the reason we keep saying "digital cash" instead of "private DeFi."
- **An SDK with an MCP server in the box.** Every modern privacy primitive should be callable by an AI agent under a verifiable policy. We made that the default rather than the afterthought. See [Building the Zera SDK: Day One](/blog/zera_sdk_scaffolding/).
- **A research line that publishes.** I am doing a [PhD by publication](/about) in zero-knowledge proof systems while running the company. Every paper has a corresponding production component. Every production component has a paper that would not embarrass me in a peer-review queue.

## The thing I keep telling people

You can be early to the right idea by a decade and watch the wave roll in without you. The question is never "is this the future?" The question is "did the four curves cross *yet*?"

Privacy's four curves crossed in 2026. The next ten years are infrastructure-build. We are going to be a stupid fraction of that infrastructure or none of it, and either way the wave is happening.

If that sounds like the kind of thing you want to be in the middle of, [my calendar is open](https://cal.com/daxts).
## Further reading

- [zeralabs.org](https://zeralabs.org) — product surface
- [Pedersen commitments, in production](/blog/pedersen_commitments_in_production/)
- [Nullifiers without the witchcraft](/blog/nullifiers_without_witchcraft/)
- [Why I started Zera Labs](/blog/why_i_started_zera_labs/)
- [Building the Zera SDK: Day One](/blog/zera_sdk_scaffolding/)
- [What running a Bitcoin mine taught me about cloud margins](/blog/what_running_a_bitcoin_mine_taught_me/)
- Grassi et al., *Poseidon: A New Hash Function for Zero-Knowledge Proof Systems* (USENIX Security 2021)
- Anthropic, *Model Context Protocol Specification* (2025-11-25)

---

# Generating mempool with a Rust txbot

Canonical: https://blog.skill-issue.dev/blog/vanta_txbot_synthetic_mempool/
Description: Empty blocks lie. A new chain whose miners are mining empty templates is not exercising any of the code that fails in production. The txbot is a 200-line Rust loop that round-robins coins through 114 addresses to keep mempool honest.
Published: 2026-04-13T17:39:39.000Z
Tags: vanta, rust, txbot, mempool, testnet

There is a class of bug in a Bitcoin-style chain that you only ever see when the mempool is non-trivially full. Fee-rate accounting, RBF replacement, package relay, mempool eviction policies — all of it is only ever stressed by *real spend pressure*. A new chain whose miners are mining empty templates against an empty mempool is, by definition, not exercising any of that code. So you ship a transaction bot.

The Vanta txbot is [`txbot/src/main.rs`](https://github.com/Dax911/vanta/blob/main/txbot/src/main.rs), a 200-line Rust loop that round-robins random spends across 114 pre-funded Z-addresses on the testnet wallet. This post is a tour of what it does, what it found, and why the synthetic-load approach is non-negotiable when you're bringing up an L1.

## The problem statement

In April 2026 I had a chain that worked.
Bitaxes were finding blocks, the explorer was rendering them, the wallet was sending and receiving. There were also days where the mempool depth was zero for hours at a time, because the only people transacting were me, and I sleep.

A pre-mainnet chain that produces empty blocks is *less debugged* than one with mempool pressure. Things you don't notice when blocks are empty:

- Fee estimation has nothing to estimate against and falls through to `fallbackfee`.
- Coin selection is trivial when there are 12 UTXOs in the wallet. With 1,000 UTXOs across 114 addresses, you start hitting `bnb`-vs-knapsack edge cases.
- Block-template construction never sees competition between transactions. Every fee policy is moot.
- The mempool's eviction policy, ancestor/descendant limits, and policy-vs-consensus split — all untested.

The fix isn't "wait for users." Users come *after* the chain is debugged. The fix is to ship synthetic load and let the chain talk to itself.

## The bot in 200 lines

The configuration up top sets the spend envelope:

```rust
const MAX_SPEND_RATIO: f64 = 0.40;
const MIN_SPEND_RATIO: f64 = 0.05;
const MAX_OUTPUTS: usize = 12;
const MIN_DELAY_MS: u64 = 200;
const MAX_DELAY_MS: u64 = 2000;
```

Every round, the bot picks a random fraction of its current balance between 5% and 40%, splits it into 1–12 random output amounts, picks 1–12 destination addresses uniformly from the address pool, and sends. Then sleeps a random 200ms–2000ms and goes again.

The address pool is hardcoded inline as a `&[&str]` of 114 Z-prefix addresses. They're real addresses owned by the testnet wallet (the bot is running against the wallet RPC), so coins keep round-robining through the same wallet — never net leaving, just churning.

The spend loop is the simplest thing that works:

```rust
loop {
    round += 1;
    let balance = match get_balance(&rpc) {
        Ok(b) => b,
        ...
    };
    if balance < 1.0 {
        std::thread::sleep(Duration::from_secs(10));
        continue;
    }
    let spend_ratio = rng.gen_range(MIN_SPEND_RATIO..=MAX_SPEND_RATIO);
    let total_spend = balance * spend_ratio;
    let num_outputs = rng.gen_range(1..=MAX_OUTPUTS);
    let amounts = random_split(&mut rng, total_spend, num_outputs);
    // ... send each output via sendtoaddress, log txid
    let delay = rng.gen_range(MIN_DELAY_MS..=MAX_DELAY_MS);
    std::thread::sleep(Duration::from_millis(delay));
}
```

`random_split` divides the total spend into `n` pieces by sampling `n-1` uniformly random cut points. This produces uneven splits — most outputs are small, a couple are medium, occasionally one is large. That distribution is *closer to organic spending* than equal splits would be, and it stresses coin selection harder.

`sendtoaddress` is called once per output rather than once per `n`-output transaction. This was a deliberate choice: it produces more transactions per round (which is the point), and it lets the chain pick how it batches them in mempool selection.

## What it actually exercised

The bot ran for weeks against the Latitude testnet. Things it surfaced:

**Fee estimation falls through.** The first time the bot sent a transaction the call returned `"Fee estimation failed."` The fix was the now-canonical `settxfee` at startup with a 0.0001 fallback. Same line is in the [Axum web wallet's main.rs](/blog/vanta_wallet_axum_api/) for the same reason.

**Wallet RPC contention.** When the bot rate is high, multiple `sendtoaddress` calls in flight contend on the wallet's lock. The bot is single-threaded so it's only contending against itself plus whatever else uses the wallet (the web UI, occasional manual sends). The lesson: if you're going to run the bot at high rate, give it a dedicated wallet via `loadwallet`.

**Mempool eviction.** With the bot churning 5–10 transactions per round and a 1-minute block time, mempool depth would creep up during slow blocks and drain on fast blocks.
This was the first time I watched the eviction policy actually run. It's *fine* — Bitcoin Core's mempool is one of the most-tested pieces of state in the codebase — but watching it from the outside helped me build a model of how it behaves at our parameters (1-min blocks, 100k subsidy, low fee floor).

**Pool L2 retry queue.** The 2026-04-13 commit `ops: vanta-node systemd unit + docker compose + pool L2 retry queue` landed a feature where the [Stratum server's L2 submission](https://github.com/Dax911/vanta/blob/main/pool/stratum_server.py) is enqueued for retry when the L2 sidecar is unreachable. We discovered the need for that retry queue because the txbot was generating enough block-finding pressure that the pool was sometimes hitting `submitblock` while the L2 sidecar was being restarted. Without the retry queue, those blocks' encrypted notes would be lost. With it, they get replayed when the L2 comes back. That's the synthetic-load test paying for itself in a feature that ended up in the production pool code.

## Things the bot is *not* designed to be

The bot is a stress generator, not a fuzzer. It does not:

- Construct invalid transactions to test rejection paths. (That's the functional tests in `test/`.)
- Try to double-spend. (The wallet won't let it; the chain wouldn't accept it.)
- Generate shielded transactions. (No SP1 prover in the bot loop. Yet.)
- Negotiate fees adversarially. (Single fixed fallback fee.)

I have *thought* about adding all of these. The shielded-transaction one is the interesting next step. A txbot that includes some fraction of shielded sends would exercise the SMT growth path, the nullifier-set growth path, and the encrypted-note inbox at `vanta-node` — all of which currently only get exercised by manual sends from the wallet. Adding ZK proof generation to the bot loop is the trade-off though. SP1 proofs take 30–60 seconds on CPU, so a bot that does 10 sends per minute can't be all-shielded.
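Back on the transparent path, the cut-point construction behind `random_split` described earlier is small enough to sketch in full. This is my dependency-free approximation — a tiny xorshift generator stands in for the `rand` crate the bot actually uses — not the bot's exact code:

```rust
/// Minimal xorshift64 PRNG so the sketch stays dependency-free;
/// the real bot uses the `rand` crate's `gen_range` instead.
struct Xorshift(u64);

impl Xorshift {
    fn next_f64(&mut self) -> f64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        // top 53 bits as a uniform float in [0, 1)
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Split `total` into `n` non-negative pieces: sample `n-1` uniform cut
/// points in [0, total), sort them, and take the gaps between them.
fn random_split(rng: &mut Xorshift, total: f64, n: usize) -> Vec<f64> {
    let mut cuts: Vec<f64> = (0..n.saturating_sub(1))
        .map(|_| rng.next_f64() * total)
        .collect();
    cuts.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let mut pieces = Vec::with_capacity(n);
    let mut prev = 0.0;
    for c in cuts {
        pieces.push(c - prev);
        prev = c;
    }
    pieces.push(total - prev); // last gap closes the telescoping sum exactly
    pieces
}

fn main() {
    let mut rng = Xorshift(0x9E37_79B9_7F4A_7C15);
    let pieces = random_split(&mut rng, 100.0, 5);
    assert_eq!(pieces.len(), 5);
    assert!((pieces.iter().sum::<f64>() - 100.0).abs() < 1e-9);
    println!("{pieces:?}");
}
```

The skew falls out of the construction for free: sorted uniform cut points cluster, so most gaps are small and the occasional gap is large — exactly the uneven, organic-looking distribution described above.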
**TODO: Dax confirm whether we want a `--shielded-ratio 0.2` flag to mix.**

## Why not synthetic at the protocol level

A reasonable counter-design: instead of running a separate bot, make `vantad` itself emit synthetic transactions in a `regtest`-only mode. Two reasons we didn't:

1. **The bot is real.** Every transaction the bot sends is signed by a real key, broadcast through real RPC, and validated by real consensus. It exercises the same code paths a user transaction would. A built-in synthetic mode is cheaper but it is at risk of taking shortcuts that a real RPC client wouldn't take.
2. **Operational separation.** The bot is a thing I can stop, restart, retarget, or add features to without touching `vantad`. That separation matters; the consensus binary should not contain test-traffic-generation code.

The bot lives in `txbot/`, separate Cargo workspace, separate binary. The cost of that separation is a few extra lines of `bitcoincore-rpc` setup. The benefit is that I can iterate on the bot during a deploy without rebuilding the chain.

## What I would change

A list, in order of priority:

1. **Multiple workers.** A single-threaded bot maxes out around 10 tx/sec because of RPC round-trip latency. A 4-worker version with a shared rng seed would 4x the rate without changing the workload shape. Easy.
2. **Shielded mix.** As above. Adds the SP1 dependency and an L2-sidecar URL to the bot's config; cost is per-tx latency.
3. **Adversarial replacement.** Send a tx, then send a higher-fee replacement before the first confirms. Tests RBF policy. Easy.
4. **Mempool snapshot logging.** After each send, query `getmempoolinfo` and `getmempoolancestors` for the txid. Log the mempool depth and ancestor count. This produces a time series I can graph against block-find events to see how mempool pressure correlates with confirmation latency. Low priority.

The bot is also *load-bearing for the explorer*.
The 2026-04-13 commit `explorer: privacy throughput + anonymity charts (recharts)` shows transaction-count and mempool-depth charts on the explorer dashboard; those charts are flat without the bot running.

## Further reading

- [`txbot/src/main.rs`](https://github.com/Dax911/vanta/blob/main/txbot/src/main.rs) — the entire bot in one file
- [`txbot/Cargo.toml`](https://github.com/Dax911/vanta/blob/main/txbot/Cargo.toml) — dependency-light by design
- [The vanta wallet HTTP API](/blog/vanta_wallet_axum_api/) — sister piece on the Axum wallet that talks to the same RPC
- [Vanta: a Bitcoin fork with ZK at consensus](/blog/vanta_zk_privacy_l1/) — the chain the bot is exercising
- [Mining VANTA with a Bitaxe BM1368](/blog/mining_vanta_with_a_bitaxe/) — the hardware that consumes the bot's mempool pressure
- [Bitcoin Core mempool docs](https://github.com/bitcoin/bitcoin/blob/master/doc/policy/mempool-replacements.md) — the policy surface the bot indirectly tests

---

# Latitude bare-metal primary, Fly.io backup: the deploy story for a 1-min-block chain

Canonical: https://blog.skill-issue.dev/blog/vanta_flytoml_latitude_baremetal/
Description: Vanta v1 went LIVE on a Latitude bare-metal box at 64.34.82.145:9333 with a Fly.io seed fleet as auto-failover. Why a 1-min-block chain hates cold starts, what the fly.toml has to say about it, and the cost math that picks bare metal.
Published: 2026-04-13T21:18:29.000Z
Tags: vanta, deploy, latitude, fly, baremetal, infra

The 2026-04-13 commit `d3d532cc deploy: vanta v1 LIVE on Latitude` is the moment Vanta moved from "regtest on a Mac mini under my desk" to "mainnet on a real-world internet host." The seed node IP — `64.34.82.145:9333` — has been the bootstrap addnode in the [desktop wallet's auto-config](/blog/vanta_tauri_ergonomics/) since that commit.

What the commit message doesn't tell you is that there's a *second* deploy target.
The `fly.toml` in the repo declares an 11-region fleet on Fly.io, hardcoded to an old `zeracoin-seed` app name. That fleet is the *backup* — the failover that the network falls back to when the bare-metal box goes down. Bare metal is primary. Fly is the safety net.

This post is the architecture, the fly.toml walk-through, the cost math that makes bare metal cheaper than equivalent Fly machines, and a candid paragraph about why a 1-minute-block chain particularly hates cold starts.

## The two-tier topology

There's a single primary bare-metal box, and a fleet of small Fly machines. The wallet's auto-config lists *both* IPs for redundancy:

```
addnode=64.34.82.145:9333   # Latitude bare metal — primary
addnode=66.241.124.138:9333 # Fly.io fleet — backup
```

The bitcoind P2P protocol picks whichever it can reach first and rotates if a peer disappears. There's nothing fancy here — Bitcoin Core's peer discovery does the work. The architecture is "primary host, secondary host, network sorts itself out."

The reason for two tiers (and not just two bare-metal boxes, or just a Fly fleet) is *operational*. Bare metal is cheap when you can give it your full attention. Bare metal is brittle when you can't — disk failures happen, ISPs renumber, hardware ages. The Fly fleet is the "I am asleep, the chain stays up" insurance.

## fly.toml, annotated

The full [`fly.toml`](https://github.com/Dax911/vanta/blob/main/fly.toml) is short. The interesting parts are below.

### App name: the rebrand artefact

```toml
app = "zeracoin-seed"
primary_region = "iad"
```

The Fly app is *still* named `zeracoin-seed` — the pre-rebrand name. Renaming a Fly app requires recreating it (you lose the IPs and volumes), and the IPs are baked into the desktop wallet's `addnode` lines. Recreating the app would force a wallet upgrade for every existing user.
The fix lives in commit [`1b72aec6`](https://github.com/Dax911/vanta/commit/1b72aec6c) — `fly: match actual app name (zeracoin-seed) + clamp grace_period` — which is the moment I committed to the rebrand-postponement and updated the deploy script to match the actual app name instead of pretending we'd already migrated. The tradeoff is: ugly artefact in `fly.toml` vs. forcing a migration every existing user has to participate in. The artefact wins.

### Kill signal and timeout

```toml
kill_signal = "SIGTERM"
kill_timeout = "120s"
```

Bitcoin Core flushes its database on shutdown. Get SIGKILL'd mid-flush and you can corrupt chainstate or block files. The 2-minute `kill_timeout` is the window we give Fly's orchestrator to wait before escalating; in practice `vantad` flushes in 10–20 seconds, so 120 is generous insurance.

Fly defaults to a 5-second `kill_timeout`. Five seconds is not enough to flush a UTXO database, full stop. Every Bitcoin-Core deploy I've seen on Fly that didn't override this had at least one chainstate-corruption incident. **Override it.**

### Volumes

```toml
[mounts]
source = "vanta_data"
destination = "/root/.vanta"
```

A persistent volume mounted at `~/.vanta` — the Bitcoin Core data dir. Fly creates one volume per machine (the volume names get auto-numbered: `vanta_data`, `vanta_data_v2`, etc). The volume survives machine restarts; only a `fly volumes destroy` deletes it.

The data dir contains chainstate, blocks, the mempool, the peers cache, and the wallet (if any). On a fresh deploy this is empty and the machine does an initial-block-download from peers; on a restart it picks up where it left off. The volume is what makes "restart a machine" cheap and "destroy a machine" expensive.

### Rolling deploy strategy

```toml
[deploy]
strategy = "rolling"
max_unavailable = 0.25
wait_timeout = "10m"
```

Rolling deploys take at most 25% of the fleet down at once.
With 11 machines spread across 11 regions, that's about 3 machines unavailable during any given deploy. The other 8 keep the network reachable for the wallet's `addnode` lookups.

`wait_timeout = "10m"` gives each machine ten minutes to come back up and pass health checks before the deploy considers it failed. Bitcoin Core sometimes takes that long to verify chainstate at startup, especially on a small machine; default Fly wait_timeout (5m) was tripping us during deploys and leaving the cluster in a partially-deployed state.

### Health checks

```toml
[[services]]
internal_port = 9333
protocol = "tcp"
auto_stop_machines = false
auto_start_machines = true

[[services.ports]]
port = 9333

[[services.tcp_checks]]
interval = "30s"
timeout = "5s"
grace_period = "1m"
```

`auto_stop_machines = false` is intentional. Fly's autostop will spin a machine down after a few minutes of no traffic. A *seed node* with no traffic is suspicious, but it's not "stop the machine" suspicious — peer discovery is bursty, and a seed that's stopped when a wallet starts up is a seed that's not doing its job.

`auto_start_machines = true` lets Fly *start* a stopped machine on a cold TCP connection. This is the safety net for any case where the autostop did fire.

`tcp_checks` is a 30-second TCP-handshake probe against port 9333. If `vantad` dies or wedges, its P2P listener goes away, the TCP check fails, and Fly restarts the machine. The `grace_period = "1m"` is the startup window where we don't penalise a machine for being mid-IBD.

`grace_period` is capped at 1m by Fly — anything higher gets clamped, which is a thing I learned by setting it to 5m and watching the deploy log it as "1m (clamped)." The 1-minute window is enough for a warm restart but not enough for a cold IBD; we work around it by not destroying machines casually.

### Sizing

```toml
[vm]
size = "shared-cpu-1x"
memory = "2gb"
swap_size_mb = 1024
```

`shared-cpu-1x` is Fly's smallest paid tier.
2 GB RAM is bumped from the default 1 GB because `txindex=1` plus the UTXO set needs headroom on a Vanta-sized chain. 1 GB swap is insurance against OOM kills during IBD bursts (specifically: the moment when the UTXO set is being loaded into memory at startup).

This is sized for a *seed* node, not a *miner* node. We don't run mining workloads on Fly. The Bitaxe rig at home is the [actual mining setup](/blog/mining_vanta_with_a_bitaxe/).

## The Latitude box

The bare-metal primary is on [Latitude.sh](https://latitude.sh), a smaller-than-OVH-but-bigger-than-Hetzner bare-metal provider with hourly billing. The spec is a single AMD Ryzen 9, 32 GB ECC RAM, 1 TB NVMe, with a /29 subnet and an unmetered 1 Gbps port. **TODO: Dax confirm the exact tier — I have it as `c2.medium.x86` but want to verify against the Latitude billing dashboard.**

What it runs:

- `vantad` — the L1 node, listening on port 9333 (P2P) and 9332 (RPC, bound to localhost).
- `vanta-node` — the L2 sidecar, listening on port 9380 for the REST API.
- `nginx` — TLS termination for the L2 REST API (port 443 → 9380).
- The Bitaxe pool (port 3333) — the home rig actually plugs into a separate machine, but the *pool stratum server* lives on the Latitude box.
- The vanta-explorer (port 80 → 8080) — block explorer.
- The fly-deploy mirror — a backup of the Fly fleet's deploy state, in case Fly itself goes down for an extended period.

This is more than a "seed node." It's the primary operational deploy of the chain. The Fly fleet is, again, the *seed fallback* — they don't run the explorer or the L2 sidecar. They just keep the P2P network reachable.

## Why a 1-minute-block chain hates cold starts

Worth dwelling on this. On Bitcoin (10-minute blocks), a node that's been off for an hour comes back up and is six blocks behind. Catching up is fast. The chain's "average" block production rate is generous enough that a 60-second startup delay is invisible.
On Vanta (1-minute blocks), an hour off is sixty blocks behind. A 60-second startup is *one full block of latency*. If the seed nodes are slow to come back up, wallet UX degrades visibly: the user opens the wallet, sees "syncing," and waits sixty seconds where Bitcoin would have synced in ten.

> **WARNING:** This is the operational property that makes Fly's autostop *dangerous* for a fast-block chain. A seed node that's been auto-stopped after 30 minutes of idle, then woken up by a wallet's first connection, takes ~15 seconds of cold start. During that 15 seconds, the wallet sees no peers and reports "L1 disconnected." This is a real user-visible regression compared to a warm seed.

The mitigations are stacked:

1. `auto_stop_machines = false` in `fly.toml` — Fly never stops the seeds.
2. The Latitude bare-metal primary handles 99% of the bootstrap traffic, so most wallets never even hit the Fly fleet.
3. The Fly fleet keeps machines warm by *each other's* P2P traffic — bitcoind's peer-keepalive interval is short enough that the machines stay active even with no client traffic.
4. The Latitude box has a [`systemd` unit](https://github.com/Dax911/vanta/blob/main/contrib/init) with `Restart=always` so any local crash recovers in under 10 seconds.

I'd not run a fast-block chain on a serverless-by-default platform. Fly is a great fit because it can be configured to behave like an always-on host. Fly's *defaults* are not.

## Cost math: Latitude vs Fly

Approximate, monthly:

| Component | Latitude (bare metal) | Equivalent Fly |
|---|---|---|
| 1× AMD Ryzen 9 (8c/16t) | ~$140 | shared-cpu-8x: ~$160 |
| 32 GB RAM | included | $80 (32 GB at $2.50/GB) |
| 1 TB NVMe | included | $150 (1 TB at $0.15/GB) |
| 1 Gbps unmetered | included | bandwidth metered, est. $30 |
| **Total per box** | **~$140** | **~$420** |

Latitude's all-included pricing for a single bare-metal box is roughly *one third* the cost of an equivalently-specced Fly machine.
The Fly fleet (11 small seeds at ~$5–$10/month each) costs another ~$80/month combined. So the total bill: Latitude $140 + Fly fleet $80 = ~$220/month for *primary + 11-region failover.* An equivalent Fly-only deploy (one big primary + 11 small seeds) would be ~$500/month for a worse outcome (no actual bare-metal performance for the L2 indexer, no NVMe write-throughput for the chainstate, no dedicated network port).

This is a textbook case for hybrid deploy. The thing you're optimising for cost on (the heavy, always-on workload) goes on bare metal. The thing you're optimising for *availability* on (the geographic-redundancy seed fleet) goes on the platform with built-in geographic distribution.

## A tradeoff table

I keep telling people to do this kind of comparison explicitly, so:

| Option | Cost (1 yr) | Latency to seed | Cold-start risk | Operational burden |
|---|---|---|---|---|
| Bare metal only (Latitude) | ~$1,700 | Variable by region (single PoP) | Low — always on | High if hardware fails |
| Fly fleet only (11 regions) | ~$5,000 | Low (regional anycast) | High if autostop is enabled | Low — managed platform |
| Hybrid (Latitude primary + Fly backup) | ~$2,600 | Low (Fly fronts geographic) | Low (primary always on) | Medium |
| DigitalOcean / Linode dedicated | ~$1,200–$2,000 | Moderate (one PoP per droplet) | Medium | Medium |
| Hetzner dedicated | ~$700–$1,400 | High (mostly EU PoPs) | Low | Medium |

The Hetzner option is genuinely tempting on cost grounds — half the price of Latitude. The reason I didn't pick it for *this* chain is that Hetzner's IP ranges are widely flagged by reputation services as "spam-adjacent" (because they're cheap and hosters use them for everything), and a small-network seed node whose IP gets transiently blocked by some random ISP's anti-spam filter is a problem I do not want.
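The yearly column above is just the monthly math annualised. As a quick sanity-check sketch (all inputs are this post's rounded monthly estimates):

```rust
// Sanity-checking the tradeoff table: yearly figures are the rough
// monthly estimates from this post, annualised.
const LATITUDE_PRIMARY: f64 = 140.0; // bare-metal box, all-included, per month
const FLY_SEED_FLEET: f64 = 80.0; // 11 small Fly seeds combined, per month

fn yearly(monthly: f64) -> f64 {
    monthly * 12.0
}

fn main() {
    let hybrid_monthly = LATITUDE_PRIMARY + FLY_SEED_FLEET;
    println!("hybrid: ${hybrid_monthly}/mo"); // $220/mo
    println!("bare metal only: ~${}/yr", yearly(LATITUDE_PRIMARY)); // 1680 ≈ the ~$1,700 row
    println!("hybrid: ~${}/yr", yearly(hybrid_monthly)); // 2640 ≈ the ~$2,600 row
}
```

Nothing deep here — the point of the table is that the arithmetic is simple enough that there is no excuse for not doing it before picking a provider.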
DigitalOcean's $40/mo "premium intel" droplets would have worked too, but the bandwidth charges add up — DO meters at $0.01/GB above the included amount, and a chain seed serving IBD to fresh nodes can easily push 100 GB/day during a busy period.

## What changes after Phase 4

Phase 4 in the [architecture roadmap](https://github.com/Dax911/vanta/blob/main/doc/vanta-architecture.md) is "full Rust node rewrite using rust-bitcoin stack." When that lands, the deploy story shifts:

- The L2 sidecar and L1 node are *one binary*, not two. Operationally that's a smaller blast radius — one PID to monitor instead of two.
- The Rust node is statically linked and ships as a single ~30 MB binary. Container size collapses.
- We can in principle deploy on smaller Fly machines (256 MB instead of 2 GB) once the C++ is gone.

But Phase 4 is the *future*. The current deploy story is "C++ node + Rust sidecar on bare metal, with a Fly fleet of C++-node-only seeds for failover."

## What I changed my mind about

I started this project assuming Fly was the right deploy target for *everything.* It's a great platform, the developer experience is unmatched on its tier, and the ergonomics of `fly deploy` after years of Kubernetes are genuinely refreshing.

The thing that changed my mind was the cold-start property. A 1-minute-block chain has a different operational profile than a request-response web service. Fly's defaults — autostop, autoresurrect on demand, regional load balancing — are tuned for a workload where 100 ms latency is fine and 5-second cold starts are tolerable. Neither is fine for a chain seed.

Once I'd configured Fly *out of* its defaults — `auto_stop_machines = false`, larger memory, longer kill_timeout, longer wait_timeout — I was running a Fly machine as if it were an always-on box. At which point: an always-on box is what bare metal *is*, at one-third the price, with a real network interface and dedicated NVMe.
The Fly fleet still has a job — geographic redundancy, multi-region warm seeds — that bare metal can't do without a substantial multi-PoP investment. So Fly stays as the backup ring. Latitude is the primary. Both are needed; neither is sufficient. ## Further reading - [`fly.toml`](https://github.com/Dax911/vanta/blob/main/fly.toml) — the Fly config this post walks - [`fly-deploy.sh`](https://github.com/Dax911/vanta/blob/main/fly-deploy.sh) — the multi-region deploy wrapper - [`doc/vanta-architecture.md`](https://github.com/Dax911/vanta/blob/main/doc/vanta-architecture.md) — the infra section in the architecture doc - [`Dockerfile`](https://github.com/Dax911/vanta/blob/main/Dockerfile) — the container both Latitude and Fly run - [Mining Vanta with a Bitaxe BM1368](/blog/mining_vanta_with_a_bitaxe/) — the home-rig side of the operation - [What running a Bitcoin mine taught me](/blog/what_running_a_bitcoin_mine_taught_me/) — the small-operator unit-economics post that informs all of this --- # The MCP server inside zera-sdk Canonical: https://blog.skill-issue.dev/blog/mcp_server_inside_zera_sdk/ Description: Most SDKs ship as a library. zera-sdk also ships as a Model Context Protocol server. Here is why an AI agent should be able to call shielded-pool primitives directly, and how we keep that interface from becoming a footgun. Published: 2026-04-08T16:42:00.000Z Tags: zera, mcp, sdk, ai-agents, rust, typescript When we [scaffolded the SDK monorepo](/blog/zera_sdk_scaffolding/) in early March, the first non-obvious decision was including an [MCP](https://modelcontextprotocol.io) server in the box. Not as an example. Not as a future-work bullet. As a first-class crate alongside the Rust core and the TypeScript surface. Six weeks later it still feels like the right call. Here is the reasoning. 
## What MCP actually is, and what it is not MCP — Model Context Protocol — is Anthropic's open JSON-RPC standard for letting LLM-driven applications call tools, read resources, and surface reusable prompts from any compliant server. By the start of 2026 there were over 10,000 public MCP servers and ~97 million SDK downloads per month across the Python and TypeScript implementations. The standard is in the boring-but-load-bearing phase: every major model vendor speaks it, the spec is on a regular cadence, the working groups have process. What MCP is *not* is "an AI feature." It is a protocol layer. The AI part is incidental. What MCP gives you is a typed, schema-described, discovery-friendly RPC surface that any client — model, CLI, IDE, agent — can connect to and immediately understand without bespoke glue. The most useful frame is *"USB-C for tool calls."* That comparison gets thrown around to the point of cliché but it is also accurate: before USB-C you wrote per-cable glue; after, the cable is part of the device. MCP does the same thing for tool surfaces. The interesting question for an SDK author in 2026 is not *"should I expose an MCP server?"* — that question is settled by the AI-agent-economy curve I wrote about [in the broadband-moment post](/blog/privacys_broadband_moment/). The interesting question is *which* surface to expose, and how to keep it from becoming a footgun. ## The tools the SDK actually exposes The first version of `zera-mcp` shipped four tools and three resources. I want to talk about each one, because the choice of what to expose is more meaningful than the protocol mechanics. ### `search_posts(query, k=5)` Wait, no — that is the *blog's* MCP server, not the SDK's. (Yes, [the blog has one too](/blog/privacys_broadband_moment/), and I am building a longer post about that. Let me get back to the SDK.) The SDK's four tools, as of [commit `e350707`](https://github.com/Dax911/zera-sdk): 1. 
**`compute_commitment(asset, amount, randomness)`** — returns a Poseidon commitment to a `(asset, amount)` pair under a caller-supplied blinding factor. This is the primitive an agent uses to *describe a payment that has not happened yet* — it can hand the commitment to a human for review without ever revealing the amount. 2. **`derive_nullifier(note_secret, commitment)`** — returns the deterministic, single-use nullifier for a previously-committed note. As discussed in [Nullifiers without the witchcraft](/blog/nullifiers_without_witchcraft/), this is the hash that proves a note has been spent without revealing which one. Agents call this during proof generation. 3. **`build_spend_proof(note, recipient, amount)`** — runs the full Groth16 prover for the canonical spend circuit and returns the proof bytes. **This is the only tool that touches the prover.** Doing this in-process via MCP is much better than asking an agent to shell out to a Rust binary; the agent gets a typed, schema-described return value with proof bytes and a public-input vector. 4. **`get_pool_state()`** — read-only resource. Returns the current root hash of the commitment Merkle tree and the count of unspent notes. Agents that want to check whether their proof is still valid against the latest pool state poll this. It is a *resource*, not a tool, in MCP terms — the difference matters for caching and for explaining to the agent that the call is side-effect-free. That is the entire surface: three tools and one read-only resource. No `transfer`, no `withdraw`, no `set_owner`. **An agent can compose payments, prove them, and inspect pool state. It cannot move funds without a human signing the resulting transaction.** That asymmetry is deliberate and I will defend it for as long as MCP exists. 
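The read-and-compute-only shape of that surface can be sketched in a few lines. This is a hedged illustration, not the actual `zera-mcp` code: `toyHash` stands in for Poseidon so the block is self-contained, the tool names mirror the list above, and every other identifier here is hypothetical.

```typescript
// Sketch of a compute-only tool surface. The asymmetry rule is encoded in the
// entry type: `mutatesPoolState` can only ever be the literal `false`, so a
// state-changing tool cannot even be registered.
type ToolHandler = (args: Record<string, bigint | string>) => bigint;

interface ToolEntry {
  handler: ToolHandler;
  mutatesPoolState: false; // literal type — `true` is unrepresentable
}

// Stand-in hash so the sketch runs anywhere (the real SDK uses Poseidon).
const toyHash = (xs: bigint[]): bigint =>
  xs.reduce((acc, x) => (acc * 1000003n + x) % ((1n << 61n) - 1n), 7n);

const tools: Record<string, ToolEntry> = {
  compute_commitment: {
    handler: ({ asset, amount, randomness }) =>
      toyHash([BigInt(asset), BigInt(amount), BigInt(randomness)]),
    mutatesPoolState: false,
  },
  derive_nullifier: {
    handler: ({ note_secret, commitment }) =>
      toyHash([BigInt(note_secret), BigInt(commitment)]),
    mutatesPoolState: false,
  },
};

// There is deliberately no `transfer` to dispatch to: authority over funds
// stays with the wallet, entirely outside this surface.
function call(name: string, args: Record<string, bigint>): bigint {
  const entry = tools[name];
  if (!entry) throw new Error(`unknown tool: ${name}`);
  return entry.handler(args);
}
```

The point of the sketch is the type of `ToolEntry`: a compromised agent can call anything in the table as often as it likes, and the worst it gets back is a hash the human can audit.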
## The asymmetry rule Every time I add a tool to `zera-mcp`, I run it through a single test: > *If the agent is compromised — adversarial prompts, model jailbreak, supply-chain payload in the tool-calling library — what is the worst it can do?* If the answer is "compute a commitment that the human can audit," fine. If the answer is "move funds," not fine. The line is **whether the tool has unilateral authority to change pool state.** The current SDK MCP draws that line at proof construction. The proof itself is just a bunch of bytes; submitting it to the chain still requires a transaction signed by a wallet that the agent does not have direct authority over. This is the same threat model I argued for in the [x402 honeypot disclosure post](/blog/x402_honeypot_disclosure/) and in [Rusty Pipes](/blog/rusty_pipes/) before that. You assume the agent is compromised. You design the surface so a compromised agent cannot drain the pool. Everything else is detail. ## What it looks like when an agent uses it Concretely, here is the flow when a user asks Claude (or ChatGPT, or any MCP-enabled client) to *"send $50 of USDC to alice.sol from my shielded balance, but show me the commitment first":* 1. The agent calls `get_pool_state()` to fetch the current Merkle root. 2. The agent picks an unspent note from the user's local wallet that is `≥ $50`. 3. The agent calls `compute_commitment(USDC, $50, fresh_randomness)` to construct the visible commitment. 4. The agent surfaces the commitment to the human in a message that says, in effect, *"here is the commitment for the $50 send to alice.sol; proceed?"* 5. The human approves. 6. The agent calls `derive_nullifier(...)` and `build_spend_proof(...)` and gets back the spend witness. 7. The agent hands the proof to the *wallet* — not to the chain — and the wallet signs and submits the transaction. The wallet has policy: it will not co-sign a proof whose public inputs have not been displayed to the human in step 4. 
Step 7 is where the privilege boundary lives. The MCP tools never touch a private key. They never broadcast a transaction. They are pure compute against pool state. ## Why this generalises I have argued for this pattern in three places now: the SDK's MCP server, the blog's MCP server, and the [proposed `lib.skill-issue.dev`](/blog/why_i_started_zera_labs/) personal MCP that exposes my writing as queryable resources. The pattern is the same in all three: > **Expose typed read + compute primitives. Do not expose state-changing authority. Push every authority decision back through the human or through a wallet that has its own policy.** If we are entering a decade where AI agents are going to be calling cryptographic primitives, this is the boundary that needs to hold. The cryptography is finally ready, the protocols are finally ready, and the *interface design* is the part that is still up for grabs. I would rather we set the precedent now than discover the right shape after the first six-figure agent-driven drain. ## What I changed my mind about When I first started writing `zera-mcp` I assumed I would expose the prover as a *resource* (cacheable, repeatable) rather than a *tool* (potentially side-effecting). The ZK community talks about provers as deterministic functions — given the same witness, you get the same proof — so it felt natural to treat them like a read. I changed my mind after watching an agent hammer the prover during testing. **The prover is computationally side-effecting even if it is mathematically pure.** Eight seconds of CPU per call adds up fast when an agent is in a loop. Resources in MCP are aggressively cached by clients; tools are not. By moving the prover behind a tool I forced the client to think about whether to call it again. Worth it. 
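The step-7 wallet policy described above — co-sign only what the human has seen — can also be sketched. This is an illustrative toy under assumed names (`PolicyWallet`, `recordApproval`, `signAndSubmit` are all hypothetical, not the real wallet's API):

```typescript
// Toy wallet policy: refuse to sign any proof whose public inputs were not
// previously displayed to and approved by the human.
interface SpendProof {
  proofBytes: Uint8Array;
  publicInputs: { commitment: bigint; amount: bigint; recipient: string };
}

class PolicyWallet {
  private approved = new Set<string>(); // keys of human-approved public inputs

  private key(p: SpendProof["publicInputs"]): string {
    return `${p.commitment}|${p.amount}|${p.recipient}`;
  }

  // Called by the UI after the human reviews the commitment (steps 4–5).
  recordApproval(inputs: SpendProof["publicInputs"]): void {
    this.approved.add(this.key(inputs));
  }

  // Called by the agent with the finished proof (step 7).
  signAndSubmit(proof: SpendProof): string {
    if (!this.approved.has(this.key(proof.publicInputs))) {
      throw new Error("policy: public inputs were never shown to the human");
    }
    return "signed-tx-placeholder"; // a real wallet would sign and broadcast
  }
}
```

Because the check keys on the public inputs rather than the proof bytes, an agent cannot swap in a different recipient or amount after approval: changing either produces a key the wallet has never seen.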
## Further reading - [zera-sdk on GitHub](https://github.com/Dax911/zera-sdk) — the actual code - [Building the Zera SDK: Day One](/blog/zera_sdk_scaffolding/) — origin - [Nullifiers without the witchcraft](/blog/nullifiers_without_witchcraft/) - [Pedersen commitments, in production](/blog/pedersen_commitments_in_production/) - [Privacy's broadband moment](/blog/privacys_broadband_moment/) - [Model Context Protocol specification](https://modelcontextprotocol.io/specification) (Anthropic, current draft) - The MCP working-group meeting notes are on the spec repo and worth a quarterly skim --- # Range proofs in 80 lines: Pedersen commitments and a tiny Bulletproof Canonical: https://blog.skill-issue.dev/blog/range_proofs_in_80_lines/ Description: How a Bulletproof actually compresses a range proof to logarithmic size. Derive the inner-product argument from scratch, run a toy prover/verifier in the browser, and pick the right range-proof primitive for 2026. Published: 2026-04-08T16:00:00.000Z Tags: cryptography, bulletproofs, pedersen, range-proof, zk, phd, math import { Mermaid, Sandbox, TradeoffTable, Aside, Quote, RustPlayground } from "@/components/mdx"; A confidential transaction has to prove one annoying little thing: that the hidden amount is non-negative and bounded. Without that, an attacker can mint coins out of thin air by committing to a "negative balance" that wraps around the field. The cryptographic primitive that does the proving is the **range proof**, and the question of which range proof to ship in 2026 is — surprisingly — still live. This post does three things: 1. Derives the inner-product argument that makes Bulletproofs short. 2. Walks an 80-line, runnable toy Bulletproof prover/verifier in the browser. 3. Maps the trade-offs between Bulletproofs, classical range proofs, and SNARK-based range proofs onto the deployment surface I keep hitting in [zera-sdk](/blog/zera_sdk_scaffolding/). 
It's a sibling piece to [Pedersen commitments, in production](/blog/pedersen_commitments_in_production/). Read that one first if "Pedersen" still feels like a textbook word to you. This one assumes you know what `C = a·G + b·H` is and want to know what to do with it. ## What a range proof has to do The setup. A prover holds a value $v$ and a blinding factor $\gamma$, and publishes a Pedersen commitment $$ V = v \cdot G + \gamma \cdot H $$ with $G, H$ independent generators of an elliptic-curve group of prime order $q$. The hiding property of $V$ comes from $\gamma$ being uniformly random; the binding property comes from $G$ and $H$ being independent (no known $\beta$ with $H = \beta G$). The prover then wants to convince a verifier that $v$ lies in some range, typically $[0, 2^n)$ for $n = 32$ or $n = 64$. Crucially, $v$ stays hidden. The verifier learns *only* the fact that the committed value is in range. The naive proof of "$v \in [0, 2^n)$" is to commit bit-by-bit: write $v = \sum_{i=0}^{n-1} a_i \cdot 2^i$ with $a_i \in \{0,1\}$, commit to each $a_i$, and prove each $a_i (a_i - 1) = 0$. That works. It takes $O(n)$ commitments and $O(n)$ proof size, which is what Maxwell's Confidential Transactions shipped in 2015 (in Blockstream's Elements sidechain, not Bitcoin mainnet) and which is roughly **2.5 KB per transaction** at $n = 64$. [Bünz, Bootle, Boneh, Poelstra, Wuille, and Maxwell (2018)](https://eprint.iacr.org/2017/1066) — the Bulletproofs paper — got that down to about **672 bytes**, with no trusted setup, by replacing the linear blob with a logarithmic-size inner-product argument. The compression ratio is roughly 4× over the naive bit-commitment scheme, and it gets better as the range grows. ## The inner-product argument, derived The whole game in Bulletproofs is the inner-product argument (IPA). Forget range proofs for a paragraph. 
The IPA proves the following: **Statement.** Given commitments $P \in \mathbb{G}$ and $\mathbf{G}, \mathbf{H} \in \mathbb{G}^n$, plus a scalar $c \in \mathbb{F}_q$, the prover knows vectors $\mathbf{a}, \mathbf{b} \in \mathbb{F}_q^n$ such that $$ P = \langle \mathbf{a}, \mathbf{G} \rangle + \langle \mathbf{b}, \mathbf{H} \rangle \quad \text{and} \quad \langle \mathbf{a}, \mathbf{b} \rangle = c. $$ The naive proof is to send $\mathbf{a}$ and $\mathbf{b}$ — that's $2n$ scalars. The IPA gets it to $2 \log_2 n$ group elements plus two scalars. The trick is recursion. Split each vector in half: $\mathbf{a} = (\mathbf{a}_L \,|\, \mathbf{a}_R)$, same for $\mathbf{b}, \mathbf{G}, \mathbf{H}$. The prover sends two cross-terms: $$ L = \langle \mathbf{a}_L, \mathbf{G}_R \rangle + \langle \mathbf{b}_R, \mathbf{H}_L \rangle, \quad R = \langle \mathbf{a}_R, \mathbf{G}_L \rangle + \langle \mathbf{b}_L, \mathbf{H}_R \rangle. $$ The verifier responds with a random challenge $x \in \mathbb{F}_q^*$. Both parties then compute folded vectors of half the length: $$ \mathbf{a}' = x \cdot \mathbf{a}_L + x^{-1} \cdot \mathbf{a}_R, \quad \mathbf{b}' = x^{-1} \cdot \mathbf{b}_L + x \cdot \mathbf{b}_R, $$ and the verifier folds the generators in the dual direction: $$ \mathbf{G}' = x^{-1} \cdot \mathbf{G}_L + x \cdot \mathbf{G}_R, \quad \mathbf{H}' = x \cdot \mathbf{H}_L + x^{-1} \cdot \mathbf{H}_R. $$ The new commitment is $$ P' = x^2 \cdot L + P + x^{-2} \cdot R, $$ and you can check by direct expansion that $P' = \langle \mathbf{a}', \mathbf{G}' \rangle + \langle \mathbf{b}', \mathbf{H}' \rangle$ exactly when the original $P$ relation held. Recurse on $(\mathbf{a}', \mathbf{b}', \mathbf{G}', \mathbf{H}', P')$. After $\log_2 n$ rounds, the vectors are length 1 and the prover just sends the two remaining scalars. That's the entire IPA in seven lines of math. Total proof size: $2 \log_2 n$ group elements (the $L_i$ and $R_i$ from each round) + 2 final scalars. 
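The "check by direct expansion" above can be executed numerically. This sketch swaps group elements for scalars mod a small prime — so "$\cdot G$" becomes field multiplication; the algebra is identical, the hardness is gone — and confirms that $P' = x^2 L + P + x^{-2} R$ matches $\langle \mathbf{a}', \mathbf{G}' \rangle + \langle \mathbf{b}', \mathbf{H}' \rangle$ after one fold. All names are mine, not from any library.

```typescript
// One round of the IPA fold, over scalars mod Q instead of a curve group.
const Q = 2147483647n; // 2^31 - 1, a prime
const mod = (x: bigint) => ((x % Q) + Q) % Q;
const mul = (a: bigint, b: bigint) => mod(a * b);
const add = (a: bigint, b: bigint) => mod(a + b);
const dot = (a: bigint[], b: bigint[]) =>
  a.reduce((s, ai, i) => add(s, mul(ai, b[i])), 0n);
const inv = (a: bigint): bigint => {
  // Fermat inverse: a^(Q-2) mod Q
  let r = 1n, base = mod(a), e = Q - 2n;
  while (e > 0n) { if (e & 1n) r = mul(r, base); base = mul(base, base); e >>= 1n; }
  return r;
};

function foldOnce(a: bigint[], b: bigint[], G: bigint[], H: bigint[], x: bigint) {
  const h = a.length / 2, xi = inv(x);
  const lo = <T,>(v: T[]) => v.slice(0, h);
  const hi = <T,>(v: T[]) => v.slice(h);
  const comb = (u: bigint[], w: bigint[], cu: bigint, cw: bigint) =>
    u.map((ui, i) => add(mul(cu, ui), mul(cw, w[i])));
  // Cross terms and folded vectors, exactly as in the derivation above.
  const L = add(dot(lo(a), hi(G)), dot(hi(b), lo(H)));
  const R = add(dot(hi(a), lo(G)), dot(lo(b), hi(H)));
  return {
    L, R,
    a: comb(lo(a), hi(a), x, xi),   // a' = x·aL + x⁻¹·aR
    b: comb(lo(b), hi(b), xi, x),   // b' = x⁻¹·bL + x·bR
    G: comb(lo(G), hi(G), xi, x),   // G' = x⁻¹·GL + x·GR
    H: comb(lo(H), hi(H), x, xi),   // H' = x·HL + x⁻¹·HR
  };
}

// Check P' = x²·L + P + x⁻²·R equals ⟨a',G'⟩ + ⟨b',H'⟩ for arbitrary inputs.
const a = [3n, 5n, 7n, 11n], b = [2n, 4n, 6n, 8n];
const G = [13n, 17n, 19n, 23n], H = [29n, 31n, 37n, 41n];
const x = 12345n, x2 = mul(x, x), xi2 = inv(x2);
const P = add(dot(a, G), dot(b, H));
const f = foldOnce(a, b, G, H, x);
const Pprime = add(add(mul(x2, f.L), P), mul(xi2, f.R));
const check = add(dot(f.a, f.G), dot(f.b, f.H));
console.log(Pprime === check); // the identity holds for any nonzero challenge x
```

Since the identity is an algebraic consequence of the fold definitions, it holds for any vectors and any nonzero challenge — which is exactly why the verifier can pick $x$ at random after seeing $L$ and $R$.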
At $n = 64$, that's 12 group elements + 2 scalars ≈ 448 bytes at 32 bytes apiece.

<Mermaid chart={`graph TD
  S1[split into halves] --> X1[prover sends L_1, R_1]
  X1 --> C1[verifier sends challenge x_1]
  C1 --> F1[fold to length n/2]
  F1 --> R{length 1?}
  R -->|no| S1
  R -->|yes| F[send final a, b]
  F --> V[verifier checks single point]
`}/>

## From IPA to range proof in two reductions The range proof reduces to the IPA in two steps. **Step 1: bit decomposition as a vector identity.** Write $v = \langle \mathbf{a}_L, 2^{\mathbf{n}} \rangle$ where $\mathbf{a}_L \in \{0,1\}^n$ is the bit decomposition and $2^{\mathbf{n}} = (1, 2, 4, \dots, 2^{n-1})$. Define $\mathbf{a}_R = \mathbf{a}_L - \mathbf{1}^n$ (so each $a_{R,i} \in \{0, -1\}$). The conjunction "$v \in [0, 2^n)$" becomes the vector identities $$ \mathbf{a}_L \circ \mathbf{a}_R = \mathbf{0}^n, \quad \mathbf{a}_L - \mathbf{a}_R = \mathbf{1}^n, \quad \langle \mathbf{a}_L, 2^{\mathbf{n}} \rangle = v. $$ The first identity (Hadamard product is zero) is exactly the bit constraint $a_i (a_i - 1) = 0$ rewritten. **Step 2: collapse three vector identities to one inner product.** The verifier samples challenges $y, z$. The prover constructs polynomials $$ \mathbf{l}(X) = (\mathbf{a}_L - z \cdot \mathbf{1}^n) + \mathbf{s}_L \cdot X, $$ $$ \mathbf{r}(X) = \mathbf{y}^n \circ (\mathbf{a}_R + z \cdot \mathbf{1}^n + \mathbf{s}_R \cdot X) + z^2 \cdot 2^{\mathbf{n}}, $$ with $\mathbf{s}_L, \mathbf{s}_R$ random blinding vectors. The inner product $t(X) = \langle \mathbf{l}(X), \mathbf{r}(X) \rangle$ is a quadratic in $X$, and the constant term $t_0$ collapses to $$ t_0 = z^2 \cdot v + \delta(y, z), \quad \delta(y, z) = (z - z^2) \langle \mathbf{1}^n, \mathbf{y}^n \rangle - z^3 \langle \mathbf{1}^n, 2^{\mathbf{n}} \rangle. $$ The verifier knows $\delta(y, z)$ (it's all public scalars) and knows $V$ (the commitment to $v$), so it can check the $t_0$ equation against $V$. 
The prover then runs the IPA on $\mathbf{l}(x)$ and $\mathbf{r}(x)$ for a fresh challenge $x$, and *that* is what gets compressed to $\log_2 n$. The whole construction is one Pedersen commitment, two challenges, two polynomial-coefficient commitments, and an IPA. It fits in a paragraph and runs in a browser.

## The 80-line toy

This is a runnable Bulletproof-style range proof for $n = 4$ (so $v \in [0, 16)$). It is intentionally small. It uses scalar arithmetic in a tiny prime field instead of an elliptic-curve group, which means it demonstrates the *protocol shape* but provides zero cryptographic security. Read it for the algebra, not the hardness.

<Sandbox
  files={{
    "/index.ts": `
const Q = 2147483647n; // 2^31 - 1, a tiny prime field (toy only)
const N = 4;
const mod = (x: bigint) => ((x % Q) + Q) % Q;
const add = (a: bigint, b: bigint) => mod(a + b);
const sub = (a: bigint, b: bigint) => mod(a - b);
const mul = (a: bigint, b: bigint) => mod(a * b);

function inv(a: bigint): bigint {
  // Fermat: a^(Q-2) mod Q
  let r = 1n, base = mod(a), e = Q - 2n;
  while (e > 0n) {
    if (e & 1n) r = mul(r, base);
    base = mul(base, base);
    e >>= 1n;
  }
  return r;
}

const dot = (a: bigint[], b: bigint[]) =>
  a.reduce((s, ai, i) => add(s, mul(ai, b[i])), 0n);
const had = (a: bigint[], b: bigint[]) => a.map((ai, i) => mul(ai, b[i]));

// Fiat-Shamir: deterministic challenge from a transcript.
async function challenge(transcript: string): Promise<bigint> {
  const buf = new TextEncoder().encode(transcript);
  const h = await crypto.subtle.digest("SHA-256", buf);
  let x = 0n;
  for (const b of new Uint8Array(h)) x = (x << 8n) | BigInt(b);
  return mod(x) || 1n; // never zero
}

// PROVER -----------------------------------------------------------
async function prove(v: bigint) {
  if (v < 0n || v >= 1n << BigInt(N)) throw new Error("out of range");

  // Bit decomposition.
  const aL = Array.from({ length: N }, (_, i) => (v >> BigInt(i)) & 1n);
  const aR = aL.map((b) => sub(b, 1n));
  const ones = new Array(N).fill(1n);
  const twos = Array.from({ length: N }, (_, i) => 1n << BigInt(i));

  // Sanity: aL . 2^n == v, aL o aR == 0, aL - aR == 1
  console.log("aL . 2^n =", dot(aL, twos), " (should be", v, ")");
  console.log("aL o aR =", had(aL, aR), " (should be all 0)");

  // Verifier challenges (Fiat-Shamir over the public commitment v).
  const y = await challenge(\`y|\${v}\`);
  const z = await challenge(\`z|\${v}|\${y}\`);

  // y-vector
  const yN = Array.from({ length: N }, (_, i) => {
    let r = 1n;
    for (let k = 0; k < i; k++) r = mul(r, y);
    return r;
  });

  // l(x), r(x) at x=1 (one round of IPA — toy)
  const lVec = aL.map((a) => sub(a, z));
  const rVec = had(yN, aR.map((a) => add(a, z)))
    .map((ri, i) => add(ri, mul(mul(z, z), twos[i])));

  // Inner product t = <l, r>
  const t = dot(lVec, rVec);

  // delta(y,z) = (z - z^2) <1, y^n> - z^3 <1, 2^n>
  const z2 = mul(z, z), z3 = mul(z2, z);
  const sum1y = dot(ones, yN), sum1_2n = dot(ones, twos);
  const delta = sub(mul(sub(z, z2), sum1y), mul(z3, sum1_2n));

  // The relation: t == z^2 * v + delta(y,z)
  const expected = add(mul(z2, v), delta);

  return { t, expected, lVec, rVec, y, z };
}

// VERIFIER ---------------------------------------------------------
function verify(p: Awaited<ReturnType<typeof prove>>) {
  const ok1 = p.t === p.expected;
  const ok2 = dot(p.lVec, p.rVec) === p.t;
  return { ok1, ok2, ok: ok1 && ok2 };
}

// IPA fold (one round, demonstrating the recursion pattern) --------
function ipaFoldOnce(a: bigint[], b: bigint[], x: bigint) {
  const half = a.length / 2;
  const xi = inv(x);
  const aP = a.slice(0, half).map((v, i) => add(mul(x, v), mul(xi, a[half + i])));
  const bP = b.slice(0, half).map((v, i) => add(mul(xi, v), mul(x, b[half + i])));
  // The cross terms L, R have the same inner product as the original.
  return { aP, bP };
}

// DEMO -------------------------------------------------------------
(async () => {
  const out = document.getElementById("out")!;
  const lines: string[] = [];

  for (const v of [0n, 7n, 15n]) {
    const p = await prove(v);
    const r = verify(p);
    lines.push(\`v=\${v.toString().padStart(2)} t=\${p.t.toString().slice(0, 18)}... ok=\${r.ok}\`);
  }

  // One-shot IPA fold to demonstrate the recursion shrinks the vectors.
  const a = [3n, 5n, 7n, 11n], b = [2n, 4n, 6n, 8n];
  const x = await challenge("ipa|demo");
  const folded = ipaFoldOnce(a, b, x);
  lines.push("");
  lines.push(\`ipa fold: a length \${a.length} -> \${folded.aP.length}\`);
  lines.push(\`          b length \${b.length} -> \${folded.bP.length}\`);

  // Out-of-range case should fail (negative -> wraps if we let it).
  try {
    await prove(-1n);
    lines.push("ERROR: prover accepted v=-1");
  } catch (e) {
    lines.push(\`prover rejected v=-1: \${(e as Error).message}\`);
  }

  out.textContent = lines.join("\\n");
})();
`,
    "/index.html": `<pre id="out">running...</pre>
`, }} /> The shape is the thing. A real Bulletproof replaces my BigInt scalars with elliptic-curve points (typically Ristretto or BN254 G1), runs the IPA recursively to length 1 instead of just one fold, and uses Fiat-Shamir over a transcript that includes every public group element. The protocol stays under 700 bytes for $n = 64$, and the verifier cost stays at $O(n)$ multiplications (the prover dominates at $O(n \log n)$). ## Choosing a range proof in 2026 The trade-off space has settled enough to write down honestly. The pattern: if you're already running a SNARK for the privacy proof, embed the range check inside it and pay nothing extra. If you don't have a SNARK and you want short proofs without a trusted setup, Bulletproofs are the right answer. The naive bit-commitment scheme is what you ship when you don't trust the cryptanalysis of either and you're willing to pay 2.5 KB per transaction. STARKs are aspirational for transfers and the right tool for rollups. In [zera-sdk](/blog/zera_sdk_scaffolding/), the range check on `amount` is a 64-bit decomposition inside the Groth16 transfer circuit. Cost: 64 R1CS constraints (one per bit), zero additional bytes on chain. The Bulletproof would have been 672 bytes per spend, which on Solana at 5,000 lamports per byte adds up faster than the constraint cost in the prover. ## What I'd reach for, and when The framing I keep coming back to: range proofs are a **feature** of a privacy system, not a product. The product is the privacy pool. The range proof exists because, without it, the pool is exploitable. Pick the one that disappears most quietly into the rest of your system. For [the unified shielded pool](/blog/pedersen_commitments_in_production/) on Solana, the SNARK-embedded approach wins for compute units and bytes. For a chain that doesn't already have a SNARK, Bulletproofs are the line where the cryptography costs roughly the same per-byte on chain as a multisig and you stop arguing about it. 
For anything post-quantum, STARKs are the only answer — the discrete-log assumption everything else here leans on collapses against a quantum adversary, and Bulletproofs go down with it. Bulletproofs greatly improve on the linear (in the bit length of the range) proof size of confidential transactions. They are also a drop-in replacement for the range proofs used in Monero and other confidential-transaction systems, requiring no trusted setup and relying only on the discrete-logarithm assumption. The 80-line toy at the top of this post is the entire algebraic core of the Bulletproofs paper, with the elliptic curve removed. Once you see that the inner-product argument is just *fold the vector in half, prove a smaller statement*, the rest of the construction is bookkeeping. ## Further reading - [Bulletproofs: Short Proofs for Confidential Transactions and More](https://eprint.iacr.org/2017/1066) — Bünz, Bootle, Boneh, Poelstra, Wuille, Maxwell (IEEE S&P 2018) — the original. - [Bulletproofs+: Shorter Proofs for a Privacy-Enhanced Distributed Ledger](https://eprint.iacr.org/2020/735) — Chung, Han, Hwang, Kim, Lee (2020) — the ~15% smaller refinement. - [dalek-cryptography/bulletproofs](https://github.com/dalek-cryptography/bulletproofs) — the canonical Rust implementation; constant-time, audited. - [Pedersen commitments, in production](/blog/pedersen_commitments_in_production/) — sister piece on what we're committing *to*. - [Poseidon, by hand and by code](/blog/poseidon_by_hand_and_by_code/) — sister piece on the hash function inside our circuits. - [Privacy's broadband moment](/blog/privacys_broadband_moment/) — why these primitives shipped together in 2026. - [`Dax911/zera-sdk`](https://github.com/Dax911/zera-sdk) — production Rust implementation of the range-check-inside-Groth16 path. 
--- # Nullifiers without the witchcraft Canonical: https://blog.skill-issue.dev/blog/nullifiers_without_witchcraft/ Description: Nullifier Generation is on the ZERA front page next to Pedersen Commitments and Zero-Knowledge Proofs. The Rust + TypeScript implementations are six lines apiece. Here is what they actually do, and why the design borrows from Zcash. Published: 2026-04-02T15:30:00.000Z Tags: zera, cryptography, nullifier, poseidon, zcash, zk, solana The [zeralabs.org](https://zeralabs.org) front page lists three "Cryptographic Innovations": **Pedersen Commitments**, **Zero-Knowledge Proofs**, **Nullifier Generation**. I wrote about [the first one](/blog/pedersen_commitments_in_production/) already. The second one is what makes the protocol work at all — Groth16 over BN254, the fast lane that lets ZK leave the laboratory. This post is the third. Nullifier Generation sounds like a wizard's incantation. In practice, on a privacy chain, it is the most boring possible thing: a hash, with an exact and well-known input set, computed at exactly one moment in the lifecycle of a note. The reason it gets a top-line marketing slot is not because the math is exotic. It's because nullifiers are the entire reason a privacy pool can prevent double-spending without revealing which note got spent. They are the load-bearing trick. If you understand them, you understand UTXO-style ZK. ## What a nullifier is, in one sentence A nullifier is a hash of two things — a secret only the owner of a note knows, and the on-chain commitment of that note — published once, when the note is spent, so the chain can refuse a second spend without learning anything else about the note. That sentence has every piece. The owner has a secret. The chain has a commitment. The owner spends, reveals the hash of (secret, commitment), and the chain stamps "spent" next to that hash. If anyone else, ever, tries to spend the same note, they will produce the same hash. The chain notices and rejects. 
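That one sentence can be mocked in a few lines. This is an illustrative toy, not the SDK's code: `toyHash` stands in for Poseidon, and `spend`/`seenNullifiers` are names I made up for the sketch.

```typescript
// The chain's entire double-spend defense, as a toy: remember every
// nullifier ever published, reject repeats, learn nothing else.
const toyHash = (xs: bigint[]): bigint =>
  xs.reduce((acc, x) => (acc * 1000003n + x) % ((1n << 61n) - 1n), 17n);

const seenNullifiers = new Set<bigint>(); // the chain's only spend record

function spend(secret: bigint, commitment: bigint): "accepted" | "rejected" {
  // nullifier = H(secret, commitment): deterministic for the owner,
  // unpredictable for anyone without the secret.
  const nullifier = toyHash([secret, commitment]);
  if (seenNullifiers.has(nullifier)) return "rejected"; // same note, second try
  seenNullifiers.add(nullifier);
  return "accepted";
}
```

Note what the set stores: hashes, not notes. A second spend of the same note reproduces the same hash and bounces; the chain never learns which commitment any entry corresponds to.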
The reason this matters: in a transparent UTXO system (Bitcoin, original Solana SPL), the chain knows which UTXO got spent because it sees the input. In a shielded system, the chain doesn't know which note got spent — that's the whole point of the privacy layer — so we need a way for the chain to refuse double-spends *without learning the identity of the spent note*. Nullifiers are that way. ## The Zcash inheritance This is not a ZERA invention. The nullifier construction goes back to Zcash Sprout (2016) and the Sapling upgrade (2018), and the [Zcash protocol specification](https://zips.z.cash/protocol/protocol.pdf) is still the canonical reference. In Sapling, the nullifier of a note is `PRF^nf(nk, ρ)` where `nk` is the nullifier-deriving key (derived from the spending key material) and `ρ` is a per-note nonce derived from the note's commitment. The construction has two essential properties: 1. **Deterministic given the secret material.** The same note always produces the same nullifier, so a second spend is detectable. 2. **Unlinkable without the secret material.** An observer who sees the commitment cannot derive the nullifier; only the owner of the spending key can. ZERA's construction is the same idea, simplified for the deployment surface. Sapling has a richer key tree (`ask`/`nsk`/`nk`/`ovk`/`ivk`) because it ships viewing keys, expiry windows, and a separate proof-spend key. ZERA's MVP keeps the same roles inside one `secret` field per note. If the protocol grows a viewing-key abstraction (and it will — see the wallet's HKDF-derived viewing keys in [the v3 wallet post](/blog/zera_wallet_v3_zkp/)), the nullifier construction can absorb that without breaking, because the input set is `Poseidon(secret, commitment)` and `secret` is the part that gets specialised. ## The six lines of TypeScript Open [`packages/sdk/src/note.ts`](https://github.com/Dax911/zera-sdk/blob/main/packages/sdk/src/note.ts) in the SDK and search for `computeNullifier`. 
The whole function is:

```ts
/**
 * Compute the nullifier for spending a note.
 *
 *     nullifier = Poseidon(secret, commitment)
 */
export async function computeNullifier(
  secret: bigint,
  commitment: bigint,
): Promise<bigint> {
  return poseidonHash([secret, commitment]);
}
```

That's it. Two field elements in, one field element out, one Poseidon call in the middle. The accompanying tests in [`note.test.ts`](https://github.com/Dax911/zera-sdk/blob/main/packages/sdk/src/note.test.ts) are equally bare:

```ts
describe("computeNullifier", () => {
  it("returns a deterministic bigint", async () => {
    const note = createNote(100n, 1n);
    const commitment = await computeCommitment(note);
    const a = await computeNullifier(note.secret, commitment);
    const b = await computeNullifier(note.secret, commitment);
    expect(a).toBe(b);
  });

  it("different secrets produce different nullifiers", async () => {
    const note1 = createNote(100n, 1n);
    const note2 = createNote(100n, 1n);
    const commitment = await computeCommitment(note1);
    const n1 = await computeNullifier(note1.secret, commitment);
    const n2 = await computeNullifier(note2.secret, commitment);
    expect(n1).not.toBe(n2);
  });
});
```

The first test asserts determinism — same inputs, same output, every time. The second asserts independence — two notes with the same `(amount, asset)` but different secrets must produce different nullifiers, otherwise the privacy property collapses. The Rust mirror lives at [`crates/zera-core/src/note.rs`](https://github.com/Dax911/zera-sdk/blob/main/crates/zera-core/src/note.rs) — same shape, same Poseidon, same input order. The whole point of having two implementations under one cross-validated test vector ([see the cryptography doc](https://github.com/Dax911/zera-sdk/blob/main/docs/CRYPTOGRAPHY.md)) is that the host language never matters. JS in the wallet, Rust in the on-chain program, Rust-via-Neon in Node consumers — all three pipelines have to agree on the byte representation of `Poseidon(secret, commitment)`. 
They do, because the test vectors say so on every CI run.

## Why `secret` and `blinding` are different fields

The note struct from [`note.ts`](https://github.com/Dax911/zera-sdk/blob/main/packages/sdk/src/note.ts) has two random fields:

```ts
return {
  amount,
  asset,
  secret: randomFieldElement(),
  blinding: randomFieldElement(),
  memo: memo ?? [0n, 0n, 0n, 0n],
};
```

I noted this in [the Pedersen post](/blog/pedersen_commitments_in_production/) but it's worth restating in nullifier context: the `secret` is what the nullifier depends on. The `blinding` is what gives the *commitment* its hiding property. They are separated because they fail differently.

If `blinding` leaks (say, via a buggy memo encryption scheme), the worst case is that the commitment becomes enumerable for small `amount` spaces. Bad, but recoverable. If `secret` leaks, the nullifier becomes predictable, which means an attacker can stamp the chain with the nullifier *before* the legitimate owner does, and the legitimate spend gets rejected as a double-spend. This is the worst possible failure mode in a privacy pool: the note becomes unspendable.

Sampling them as independent 248-bit field elements means an attacker who compromises one does not get the other for free. The cost is ~62 bytes of additional state per note. The benefit is decorrelating the two failure modes that would otherwise chain.

## The lifecycle, in one diagram

```
1. CREATE (off-chain)
   note = createNote(amount, asset)
   ├── secret   = randomFieldElement()   // private, kept by owner
   └── blinding = randomFieldElement()   // private, kept by owner

2. COMMIT (on-chain, via deposit or transfer-output)
   commitment = Poseidon(amount, asset, secret, blinding, memo[0..3])
   --> commitment is appended to the on-chain Merkle tree at leafIndex

3. HOLD (off-chain, in the wallet)
   Owner stores { note, commitment, leafIndex, nullifier? } locally.
   Nullifier may be precomputed but is NOT yet on-chain.

4. SPEND (on-chain, via withdraw or transfer-input)
   nullifier = Poseidon(secret, commitment)
   proof = Groth16(
     public:  nullifier, root, recipientHash, amount, asset
     private: secret, blinding, memo, leafIndex, merkle_path
   )
   --> on-chain program checks:
       a. proof verifies under verifying_key
       b. nullifier_pda(nullifier) does not yet exist
       c. root matches a recent on-chain root
   --> if all pass, program creates nullifier_pda(nullifier).

5. REJECT (any future spend attempt with the same nullifier)
   nullifier_pda(nullifier) exists
   --> program returns "DoubleSpendDetected" without ever learning
       which note it was.
```

The reject step is the magic. The on-chain program does not know which note is being respent. It does not know which leaf in the Merkle tree the nullifier corresponds to. It only knows that a PDA seeded by the nullifier hash already exists, and it refuses to recreate it. From [`packages/sdk/src/pda.ts`](https://github.com/Dax911/zera-sdk/blob/main/packages/sdk/src/pda.ts), the seed shape is `["nullifier", nullifierBytes32]`:

```ts
export const NULLIFIER_SEED = "nullifier";
```

Each nullifier on the chain is a 32-byte BN254 field element packed into a PDA. PDAs are cheap on Solana, but they are not free, and the rent-exempt minimum balance for a tiny PDA is the actual cost of "stamping the chain with a nullifier." It is sub-cent on devnet and mainnet alike. That is the cost of double-spend protection in this design.

## What the circuit actually proves

The transfer circuit (one-input, two-output) and the withdraw circuit (one-input, one-recipient-hash) both compute the nullifier *inside the circuit* from the witness and assert equality with the public input.
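Before looking at the real prover input, here is a toy, off-circuit version of that recompute-and-compare check. `toyHash` is a stand-in for Poseidon and every name is illustrative; the real check runs as constraints inside the Groth16 circuit, not as TypeScript:

```typescript
// Off-circuit sketch: recompute commitment and nullifier from the private
// witness, then compare against the claimed public nullifier hash.
// `toyHash` is a stand-in for Poseidon; names are illustrative.
const toyHash = (xs: bigint[]): bigint =>
  xs.reduce((acc, x) => (acc * 131n + x) % (2n ** 61n - 1n), 7n);

interface Witness {
  amount: bigint;
  asset: bigint;
  secret: bigint;
  blinding: bigint;
  memo: bigint[];
}

function checkNullifierBinding(w: Witness, nullifierHash: bigint): boolean {
  const commitment = toyHash([w.amount, w.asset, w.secret, w.blinding, ...w.memo]);
  const nullifier = toyHash([w.secret, commitment]);
  return nullifier === nullifierHash; // the circuit asserts this equality
}

const w: Witness = { amount: 100n, asset: 1n, secret: 42n, blinding: 77n, memo: [0n, 0n, 0n, 0n] };
const good = toyHash([w.secret, toyHash([w.amount, w.asset, w.secret, w.blinding, ...w.memo])]);
console.log(checkNullifierBinding(w, good));      // true
console.log(checkNullifierBinding(w, good + 1n)); // false
```

The point of the sketch is the data flow: the nullifier is never an input the prover can choose freely; it is recomputed from `secret` and the commitment, so a mismatched public value simply fails.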
From [`packages/sdk/src/prover.ts`](https://github.com/Dax911/zera-sdk/blob/main/packages/sdk/src/prover.ts):

```ts
const input = {
  // Public
  root: tree.root.toString(),
  nullifierHash: nullifierHash.toString(),
  recipient: recipientHash.toString(),
  amount: note.amount.toString(),
  asset: note.asset.toString(),
  // Private
  secret: note.secret.toString(),
  // ...
};
```

The circuit's predicate, in pseudocode:

```
1. computed_commitment = Poseidon(amount, asset, secret, blinding, memo[0..3])
2. computed_nullifier  = Poseidon(secret, computed_commitment)
3. assert computed_nullifier == nullifierHash   (public input)
4. assert Merkle(root, leafIndex, path) == computed_commitment
5. assert amount, asset bind to public inputs
```

That's the whole privacy proof. The chain learns the nullifier and the new output commitments. It does not learn the amount inside, the asset, the original commitment, or the leaf index. The nullifier is the only piece of identifying information leaked, and the only thing it identifies is *itself* — there is no on-chain link from nullifier back to commitment without breaking the hash.

This is also why the `secret`-as-witness matters. If the nullifier could be derived from public information alone, anyone could replay it, and the privacy story would collapse. Because the secret is sampled per-note and is part of both the commitment witness and the nullifier preimage, only the holder of the secret can produce the proof. That binding is what stops one user from frontrunning another's spend.

## What an attack looks like, briefly

There are exactly two things an attacker can try, and they both lose.

**Attack 1: precompute someone's nullifier and stamp the chain first.** Requires `secret`. Without it, you can't compute `Poseidon(secret, commitment)`. The note's `secret` is sampled with 248 bits of CSPRNG entropy and reduced mod the BN254 prime, so brute force is not on the table.
Mitigation: the keystore in [the wallet](/blog/zera_wallet_v3_zkp/) keeps the secret in Rust, behind a ChaCha20-Poly1305 layer derived from an Argon2id-hardened password, and never lets it touch JavaScript.

**Attack 2: replay a nullifier from a previous valid spend.** This is the "spam the chain with old nullifiers" attack. It loses immediately, because the on-chain program checks for PDA existence on every spend, and an existing PDA is exactly the signal "this nullifier has been seen before, reject." There is no clever ordering that gets around this — PDA creation is monotonic: once a nullifier PDA exists, it is never deleted.

The thing that's not in the threat model: a global attacker who can correlate metadata about *when* spends happen. That's a network-layer problem, not a cryptographic one. Tor-style mixing, relayer rotation, and the [voucher / private-cash](https://github.com/Dax911/zera-sdk/blob/main/packages/sdk/src/voucher.ts) flow are all answers to that, and they are deliberately layered on top of the nullifier system rather than baked into it.

## Why this matters for the marketing pillar

[zeralabs.org](https://zeralabs.org) ships "Anti-Double Spending" as one of its six pillars, alongside True Offline Payments, Cryptographic Privacy, Perfect Divisibility, Secure Enclaves, and Solana Speed. Anti-Double Spending and Cryptographic Privacy *both* live or die on this construction. The pillar is real because the construction is real. It is not a stitched-together promise that turns into a complicated multi-party signing scheme later. It is one Poseidon hash and one PDA, and it has been the right answer since 2016. The boring answer is the right answer.

The marketing word is "Nullifier Generation." The implementation is six lines. The reason it sits next to Pedersen Commitments on the front page is that without it, the privacy pool is just a private deposit box you can drain twice.
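The reject mechanism above can be modeled in a few lines, with an in-memory set standing in for the per-nullifier PDAs. Names here are illustrative, not the SDK's API:

```typescript
// Toy model of the on-chain reject step. A Set stands in for the
// per-nullifier PDAs; the real program checks account existence instead.
const seenNullifiers = new Set<string>();

type SpendResult = "Accepted" | "DoubleSpendDetected";

function trySpend(nullifier: bigint): SpendResult {
  const key = nullifier.toString(16).padStart(64, "0"); // 32-byte hex, like the PDA seed
  if (seenNullifiers.has(key)) {
    // The program learns only that this nullifier exists -- not which note it was.
    return "DoubleSpendDetected";
  }
  seenNullifiers.add(key); // analogous to creating nullifier_pda(nullifier)
  return "Accepted";
}

console.log(trySpend(42n)); // Accepted
console.log(trySpend(42n)); // DoubleSpendDetected
```

The set membership check is the entire anti-double-spend logic; everything else in the protocol exists to make sure only the note's owner can produce a valid nullifier in the first place.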
## Further reading - [zeralabs.org](https://zeralabs.org) — the "Cryptographic Innovations" section names this construction. - [`packages/sdk/src/note.ts`](https://github.com/Dax911/zera-sdk/blob/main/packages/sdk/src/note.ts) — `computeNullifier` and friends. - [`packages/sdk/src/note.test.ts`](https://github.com/Dax911/zera-sdk/blob/main/packages/sdk/src/note.test.ts) — the determinism and independence tests. - [`packages/sdk/src/prover.ts`](https://github.com/Dax911/zera-sdk/blob/main/packages/sdk/src/prover.ts) — where the nullifier becomes a public input. - [`packages/sdk/src/pda.ts`](https://github.com/Dax911/zera-sdk/blob/main/packages/sdk/src/pda.ts) — the `NULLIFIER_SEED` constant and PDA derivation. - [Zcash Protocol Specification](https://zips.z.cash/protocol/protocol.pdf) — the prior art this construction is descended from. - [Pedersen commitments, in production](/blog/pedersen_commitments_in_production/) — the sibling-cryptography post. - [Building the Zera SDK: Day One](/blog/zera_sdk_scaffolding/) — where these primitives first landed in the SDK monorepo. - [Why I started Zera Labs](/blog/why_i_started_zera_labs/) — the founding letter that sets up why these primitives are the line where ZK leaves the lab. --- # Pedersen commitments, in production Canonical: https://blog.skill-issue.dev/blog/pedersen_commitments_in_production/ Description: ZERA marketing says "Pedersen Commitments" on the cryptography page. The SDK ships Poseidon. Both are right — and the gap between them is the whole story of what shipping ZK in 2026 actually looks like. Published: 2026-04-01T20:36:33.000Z Tags: zera, cryptography, pedersen, poseidon, bn254, rust, zk The [ZERA Labs site](https://zeralabs.org) lists three "Cryptographic Innovations" on the front page: **Pedersen Commitments, Zero-Knowledge Proofs, Nullifier Generation.** If you read the SDK, you will not find a function called `pedersenCommit`. You will find `computeCommitment` and behind it a Poseidon hash. 
The first time someone asked me to reconcile the two, I gave a bad answer. This post is the answer I should have given.

A Pedersen commitment, in the textbook sense, is `C = a·G + b·H` where `G` and `H` are independent elliptic-curve generators, `a` is the value, and `b` is the blinding factor. The construction is **homomorphic** (add two commitments and you get a commitment to the sum of their values), **perfectly hiding** (to anyone who does not know the blinding factor, the commitment reveals nothing about the value), and **computationally binding under the discrete-log assumption**. Confidential Transactions, proposed for Bitcoin, used Pedersen commitments. So did Zcash Sapling for its value commitments. They are the canonical "I'm hiding a number" primitive in the ZK literature.

What ZERA ships is not that. What ZERA ships is a **Poseidon-based commitment** — a hash-based commitment that hides the same set of fields (`amount, asset, secret, blinding, memo[0..3]`) and is binding under the collision resistance of Poseidon. The marketing copy keeps the word "Pedersen" because that's the term of art for the *role* — a hiding, binding commitment to a confidential note. The implementation is the right primitive for the deployment target, which is Solana, which has a `sol_poseidon` syscall, which means Poseidon costs us a few thousand compute units where Pedersen would cost us hundreds of thousands.

This post walks the why, the what, and the receipts.

## What we actually wanted from "Pedersen"

Strip the construction down to the requirement. A note commitment in a shielded pool has to be:

1. **Hiding.** Given the on-chain commitment, no observer can recover the amount, secret, blinding, asset, or memo.
2. **Binding.** Once posted, the depositor cannot later "open" the commitment to a different note.
3. **Cheap inside a circuit.** The prover needs to recompute the commitment from the private inputs and assert equality with the public input. Every constraint there shows up in proving time and `.zkey` size.
4.
**Cheap on-chain.** The settlement layer recomputes hashes whenever the Merkle tree advances. If that primitive is expensive, every deposit is expensive. Pedersen on `bn254` G1 nails (1) and (2) but blows (3) and (4). Each scalar multiplication inside a Groth16 circuit is hundreds of constraints. On-chain, you'd be paying for elliptic-curve group ops on every leaf hash. Solana's compute-unit budget is generous but not infinite, and the on-chain Merkle tree is the hottest piece of state in the protocol. Poseidon flips that. It's a permutation-based hash specifically designed for ZK circuits — `x^5` S-boxes, eight full rounds, partial rounds chosen for the field. The 2-to-1 variant we use for Merkle nodes costs us *dozens* of constraints, not hundreds. And on-chain, Solana provides it as a syscall that sips compute units. The hiding/binding properties come from collision-resistance of the hash and the fresh random `blinding` factor on every note. So the engineering choice was: keep the *role* of a Pedersen commitment, swap the *primitive* for one that fits the deployment surface. Cypherpunk purity loses to compute units every time. ## The Rust core that everything else has to agree with The canonical implementation lives in [`crates/zera-core/src/note.rs`](https://github.com/Dax911/zera-sdk/blob/main/crates/zera-core/src/note.rs). The crate documentation is intentionally clinical: ```rust //! Note primitives for the ZERA shielded pool. //! //! A **Note** represents a confidential UTXO inside the pool. It carries an //! amount, asset identifier, a secret (private key material), a blinding //! factor, and an optional 4-element memo field. //! //! The note commitment is computed as: //! //! commitment = Poseidon(amount, asset, secret, blinding, memo[0..3]) //! //! The nullifier is: //! //! 
nullifier = Poseidon(secret, commitment)
```

The shape of the `Note` struct enforces the contract:

```rust
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize, BorshSerialize, BorshDeserialize)]
pub struct Note {
    /// Token amount in the smallest denomination (e.g. USDC lamports).
    pub amount: u64,
    /// Asset identifier — typically `pubkey_to_field_bytes(mint.to_bytes())`.
    pub asset: [u8; 32],
    /// Secret key material (random 32 bytes). **Must be kept private.**
    pub secret: [u8; 32],
    /// Blinding factor for the Pedersen-like commitment (random 32 bytes).
    pub blinding: [u8; 32],
    /// Optional 4-element memo field (each element 32 bytes).
    pub memo: [[u8; 32]; 4],
}
```

Two things to notice. First, the doc comment for `blinding` literally says *"Pedersen-like."* That's the gap I described in the intro, written into the source for anyone who knows enough to look. Second, the secret and the blinding are sampled separately. They serve different roles: `secret` derives the nullifier, `blinding` derives the hiding property. If they were the same value, an attacker who learned the nullifier preimage would also unmask the amount. Sampling them independently is the cheap way to keep those failure modes from chaining.

The compute function:

```rust
pub fn compute_commitment(note: &Note) -> Result<[u8; 32]> {
    let amount_fr = Fr::from(note.amount);
    let asset_fr = Fr::from_be_bytes_mod_order(&note.asset);
    let secret_fr = Fr::from_be_bytes_mod_order(&note.secret);
    let blinding_fr = Fr::from_be_bytes_mod_order(&note.blinding);
    let memo0_fr = Fr::from_be_bytes_mod_order(&note.memo[0]);
    let memo1_fr = Fr::from_be_bytes_mod_order(&note.memo[1]);
    let memo2_fr = Fr::from_be_bytes_mod_order(&note.memo[2]);
    let memo3_fr = Fr::from_be_bytes_mod_order(&note.memo[3]);

    let inputs = [
        amount_fr, asset_fr, secret_fr, blinding_fr,
        memo0_fr, memo1_fr, memo2_fr, memo3_fr,
    ];

    let h = poseidon_hash(&inputs)?;
    Ok(field_to_bytes32_be(&h))
}
```

That is the entire commitment.
Eight field elements, one Poseidon, 32 bytes out. The `Fr::from_be_bytes_mod_order` is the unglamorous load-bearing call — it reduces a 32-byte big-endian array into the BN254 scalar field by modular reduction, which is the only way to ensure the JavaScript SDK and the Rust crate agree on the byte representation of a value that might exceed the field. The Solana on-chain program does the same thing in the same direction. Get the endianness wrong and your prover and your verifier disagree silently, which is the kind of bug that costs an audit cycle.

## Why four implementations of the same hash exist

If you grep the SDK, you find Poseidon implemented (or wrapped) four times:

- `crates/zera-core/src/poseidon.rs` — Rust, via [`light-poseidon`](https://crates.io/crates/light-poseidon) `new_circom`.
- `packages/sdk/src/crypto/poseidon.ts` — TypeScript, via `circomlibjs`.
- `crates/zera-neon/` — Neon binding so Node can call the Rust core.
- The on-chain program — Solana's `sol_poseidon` syscall.

That's four entry points to the same hash function, and they all have to produce the same 32 bytes for the same inputs or the protocol falls over. The reason for the proliferation is platform: snarkjs in the browser wants a JS hash, the on-chain program wants a syscall, the Rust core wants no JS dependencies, and Node consumers benefit from native performance. The SDK's `docs/CRYPTOGRAPHY.md` enumerates the cross-validation:

> All four are verified to produce the same output for known test vectors:
>
> ```
> Poseidon(0, 0) = 14744269619966411208579211824598458697587494354926760081771325075741142829156
> Poseidon(1, 2) = 7853200120776062878684798364095072458815029376092732009249414926327459813530
> ```

Those two test vectors are the cheapest possible smoke test that the four implementations agree at the byte level. They are run in CI on every commit.
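The reduction every implementation has to agree on can be sketched in a few lines: interpret 32 bytes as a big-endian integer, then reduce mod the BN254 scalar-field prime. The constant below is the standard BN254 `Fr` modulus; the function name is illustrative, not the SDK's:

```typescript
// Sketch of the from_be_bytes_mod_order contract: big-endian fold,
// then modular reduction into the BN254 scalar field. Illustrative only.
const BN254_FR =
  21888242871839275222246405745257275088548364400416034343698204186575808495617n;

function fromBeBytesModOrder(bytes: Uint8Array): bigint {
  let acc = 0n;
  for (const b of bytes) acc = (acc << 8n) | BigInt(b); // big-endian fold
  return acc % BN254_FR; // reduce into the scalar field
}

// A 32-byte value at or above the modulus wraps around instead of
// round-tripping -- the cross-language behavior the test vectors pin down.
const allFF = new Uint8Array(32).fill(0xff);
console.log(fromBeBytesModOrder(allFF) < BN254_FR); // true
```

Both the fold direction (big-endian) and the reduction have to match bit-for-bit across all four pipelines, which is why the vectors are asserted in CI rather than assumed.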
If any of them drift — different parameter set, different endianness, different round constants — the Vitest run goes red instantly and we don't ship.

## The hiding argument, written out once

The reason a hash with a fresh random blinding factor is hiding has nothing to do with Poseidon being magical. It's the same argument that justifies any hash-based commitment. Given `H(amount, asset, secret, blinding, memo)` and the value `amount`, the attacker has to find `(secret', blinding', memo')` such that `H(amount, asset, secret', blinding', memo') == commitment`. Because Poseidon is collision-resistant and the input space of `(secret, blinding)` is `2^254 × 2^254`, this is computationally infeasible. Without `blinding`, the commitment would be enumerable for small amount spaces — an attacker could precompute `H(0, asset, ...)`, `H(1, asset, ...)`, and so on. With it, that precomputation is infeasible.

The binding argument is the dual: to "open" the commitment to a different `amount'`, the attacker has to find a Poseidon collision. This reduces to the same hardness assumption. This is the same contract the textbook Pedersen commitment provides, with a different cryptographic primitive backing it. The marketing word "Pedersen" is therefore not wrong, just collapsed. The role is identical. The construction is platform-appropriate.

## What Poseidon costs us

Poseidon is younger than SHA-256 and has received less cryptanalytic attention. The SDK's [SECURITY.md](https://github.com/Dax911/zera-sdk/blob/main/docs/SECURITY.md) is honest about this:

> Poseidon has been analyzed extensively in the academic literature. No practical attacks are known for the parameter sets used by circomlib. However, Poseidon is relatively new compared to SHA-256 and has received less cryptanalytic attention.

That's the right tone. The construction is sound, the parameter set is the one the entire ZK ecosystem uses, and the cryptanalysis pipeline is active and global.
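The enumerability point from the hiding argument can be made concrete with a toy. `toyHash` is a stand-in for Poseidon and the numbers are illustrative; the real blinding factor is a ~248-bit random field element, far outside any dictionary:

```typescript
// Toy demonstration of why the blinding factor matters: without it, a
// commitment over a small amount space can be recovered by brute force.
const toyHash = (xs: bigint[]): bigint =>
  xs.reduce((acc, x) => (acc * 131n + x) % (2n ** 61n - 1n), 7n);

// Try every small amount until the commitment matches (a dictionary attack).
function enumerateAmount(commitment: bigint, maxAmount: bigint): bigint | null {
  for (let a = 0n; a <= maxAmount; a++) {
    if (toyHash([a]) === commitment) return a;
  }
  return null;
}

// Unblinded: the amount falls out immediately.
console.log(enumerateAmount(toyHash([1337n]), 10_000n)); // 1337n

// Blinded: the same dictionary no longer matches anything.
console.log(enumerateAmount(toyHash([1337n, 123456789123456789n]), 10_000n)); // null
```

The dictionary scales with the amount space, not the blinding space, which is exactly why a fresh random blinding per note converts an enumerable commitment into a hiding one.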
But Poseidon is a hash function in motion, and we should expect adjustments — Reinforced Concrete, Rescue, the next variant — to land over the next few years. The SDK is structured so the hash function is a single Rust module and a single TypeScript module. If we ever have to migrate, it's a contained change with a clear cross-validation surface. The other thing Poseidon costs us, less obvious: it removes the homomorphic property of textbook Pedersen. You cannot add two Poseidon commitments and get a commitment to the sum. That property is what made Pedersen useful for *aggregate* confidential transactions in older protocols. ZERA does not need it, because the value-conservation check is enforced *inside the transfer circuit* (`inAmount == outAmount1 + outAmount2`), not by adding commitments outside the circuit. Different design point, different primitive. ## Why this matters for what ZERA is If you read the [Why I started Zera Labs](/blog/why_i_started_zera_labs/) letter, the founding bet is that ZK is finally fast enough, cheap enough, and verifiable enough to leave the laboratory. The "cheap enough" leg is exactly the trade-off this post describes. We do not get to ship a privacy pool to mainstream users at 1¢ per transfer if we spend 200,000 compute units per Merkle node hash. Poseidon is the engineering choice that turns ZK from a research demo into a checkout button. The ZERA Labs front page says "Pedersen Commitments" because the audience is people who want to know we have hiding/binding commitments to confidential notes. The SDK ships Poseidon because that's the implementation that makes the commitment cheap. Both are true, and the gap between them is the part of the work nobody sees. ## Further reading - [zeralabs.org](https://zeralabs.org) — Cryptographic Innovation pillar (Pedersen Commitments / Zero-Knowledge Proofs / Nullifier Generation). 
- [zera-sdk `crates/zera-core/src/note.rs`](https://github.com/Dax911/zera-sdk/blob/main/crates/zera-core/src/note.rs) — the canonical Rust implementation. - [zera-sdk `docs/CRYPTOGRAPHY.md`](https://github.com/Dax911/zera-sdk/blob/main/docs/CRYPTOGRAPHY.md) — the cross-implementation invariant spec. - [zera-sdk `docs/SECURITY.md`](https://github.com/Dax911/zera-sdk/blob/main/docs/SECURITY.md) — threat model + cryptographic assumptions. - [light-poseidon crate](https://crates.io/crates/light-poseidon) — Rust implementation we depend on. - [Building the Zera SDK: Day One](/blog/zera_sdk_scaffolding/) — where the four-implementation invariant first landed. - [Why I started Zera Labs](/blog/why_i_started_zera_labs/) — the "fast enough, cheap enough" thesis. - [Building A Better Cryptocurrency](/blog/a_better_crypto/) — the privacy thesis these primitives implement. --- # 144 Tests and a Surfpool Devnet Canonical: https://blog.skill-issue.dev/blog/zera_sdk_test_suite/ Description: How the Zera SDK got from "scaffolded" to "trustable" — a 144-test Vitest suite, a Surfpool-forked devnet running on a Latitude box, and a quickstart that actually works. Published: 2026-03-31T14:40:54.000Z Tags: zera, typescript, testing, vitest, sdk, devnet, surfpool, solana > "add comprehensive test suite for sdk (144 tests)" That's [`80927`](https://github.com/Dax911/zera-sdk/commit/809274f5d2f8d3708cb09f6a353fec889994d59c), 2026-03-31. Three weeks after [the day-one scaffolding](/blog/zera_sdk_scaffolding/) shipped, the Zera SDK had 13 test files and 144 individual test cases, all passing under Vitest. Twenty-four hours after that, [`e350707`](https://github.com/Dax911/zera-sdk/commit/e350707ba47247f1ec1feac439267d11848bfde6) added a working hosted devnet, a quickstart guide, and the first end-to-end demo. This post is about the bridge between "the code exists" and "you can use it without reading the source." 
## The shape of the test suite The 13 test files mirror the SDK's 13 modules: ``` constants.test.ts crypto/keccak.test.ts crypto/poseidon.test.ts merkle-tree.test.ts note-store.test.ts note.test.ts pda.test.ts prover.test.ts tx/deposit.test.ts tx/transfer.test.ts tx/withdraw.test.ts utils.test.ts voucher.test.ts ``` The reason there's exactly one test file per source file: it's the easiest possible discipline to enforce. Open `note.ts`, see `note.test.ts`, expect coverage. Open `prover.ts`, see `prover.test.ts`, expect coverage. The moment you start putting "shared utility tests" in `helpers.test.ts`, you lose the ability to look at a file and know whether it's tested. ## A test that catches a regression I couldn't have predicted From [`merkle-tree.test.ts`](https://github.com/Dax911/zera-sdk/blob/809274f5d2f8d3708cb09f6a353fec889994d59c/packages/sdk/src/merkle-tree.test.ts): ```ts describe("MerkleTree", () => { it("initializes empty hashes correctly", async () => { const tree = await MerkleTree.create(SMALL_HEIGHT); // emptyHashes[0] should be 0 (empty leaf) expect(tree.emptyHashes[0]).toBe(0n); // emptyHashes[1] should be hash(0, 0) const expected1 = await poseidonHash2(0n, 0n); expect(tree.emptyHashes[1]).toBe(expected1); }); it("root of empty tree matches the top-level empty hash", async () => { const tree = await MerkleTree.create(SMALL_HEIGHT); let expected = 0n; for (let i = 0; i < SMALL_HEIGHT; i++) { expected = await poseidonHash2(expected, expected); } expect(tree.getRoot()).toBe(expected); }); }); ``` The "empty hashes" test is the one I'm proudest of. The empty-tree root is one of the most important invariants in a privacy pool: every fresh pool starts with this value, and the on-chain program initializes its tree to this value. If the off-chain SDK and the on-chain program disagree by a single bit, the very first deposit fails because the witness path doesn't reconcile to the root the program wrote at init. 
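The invariant those two tests pin down can be cross-checked with a stand-in 2-to-1 hash: the iterated empty-hash shortcut must equal a naive fold over all-zero leaves. `hash2` here is a toy, not Poseidon, and all names are illustrative:

```typescript
// Toy check of the empty-tree-root invariant: iterating hash(e, e) up the
// height must equal naively hashing a full tree of zero leaves.
const M = 2n ** 61n - 1n;
const hash2 = (l: bigint, r: bigint): bigint => (l * 131n + r * 137n + 7n) % M;

// Shortcut: the root of an empty height-h tree is h iterated hashes of 0.
function emptyRoot(height: number): bigint {
  let e = 0n; // an empty leaf is 0
  for (let i = 0; i < height; i++) e = hash2(e, e);
  return e;
}

// Naive version: materialize 2^height zero leaves and fold level by level.
function emptyRootNaive(height: number): bigint {
  let level: bigint[] = new Array(2 ** height).fill(0n);
  while (level.length > 1) {
    const next: bigint[] = [];
    for (let i = 0; i < level.length; i += 2) next.push(hash2(level[i], level[i + 1]));
    level = next;
  }
  return level[0];
}

console.log(emptyRoot(4) === emptyRootNaive(4)); // true
```

The equality holds because every leaf is identical, so every level is uniform; that is the algebra that lets a height-4 test stand in for the height-24 production tree.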
Without this test, that bug shows up the first time a real user tries to deposit. With this test, it shows up in CI in 220ms. ## Test height ≠ production height Note the constant at the top of the same file: ```ts const SMALL_HEIGHT = 4; ``` The production `TREE_HEIGHT = 24`. Building a tree of height 24 in tests is doable but slow — Poseidon over 16M empty-hash slots means tens of seconds per test. Height 4 is 16 leaves. The properties under test (root recomputation, leaf indexing, witness path consistency) are agnostic to height. Test the small case in milliseconds, trust the algebra to scale. ## Devnet via Surfpool The next major commit is [`e350707`](https://github.com/Dax911/zera-sdk/commit/e350707ba47247f1ec1feac439267d11848bfde6) on 2026-04-01: **`add devnet infrastructure, quickstart guide, and fix shielded pool program ID`**. This is the commit where I stopped saying "tests pass" and started saying "you can run this." The devnet is a Surfpool-forked mainnet — a 1:1 fork of Solana mainnet state with Light Protocol ZK Compression and the Zera shielded pool program deployed on top. From [`devnet/SETUP.md`](https://github.com/Dax911/zera-sdk/blob/e350707ba47247f1ec1feac439267d11848bfde6/devnet/SETUP.md): | Service | URL | Description | |---------|-----|-------------| | Solana RPC | `http://64.34.82.145:18899` | JSON-RPC (forked from mainnet) | | WebSocket | `ws://64.34.82.145:18900` | Real-time subscriptions | | Surfpool Studio | `http://64.34.82.145` | Dashboard UI (basic auth) | Why Surfpool over `solana-test-validator`? Two reasons: 1. **It forks mainnet state.** The shielded pool needs to interact with real USDC and SPL token mints. A vanilla test-validator would have me re-creating those by hand. Surfpool just snapshots them. 2. **Light Protocol's whole stack is already deployed on mainnet.** Forking gives me the real programs at the real addresses, not stubs. A Latitude box hosts the public devnet 24/7. 
Local devnets work too: ```bash cd devnet surfpool start --manifest-file-path ./txtx.yml \ --rpc-url "https://api.mainnet-beta.solana.com" ``` `txtx.yml` contains the deploy runbooks. `accounts_dump/zera_pool.json` and `zera_pool.so` are the snapshot of the on-chain pool program's state. The whole devnet boots in under 30 seconds on a fresh box. ## The bug the devnet caught The same commit message says **"fix shielded pool program ID."** That bug is the entire reason this commit exists. The SDK's `SHIELDED_POOL_PROGRAM_ID` constant in `constants.ts` was wrong — it pointed at a stale program ID from an early devnet deploy. Every transaction the SDK built was sent to a program that didn't exist anywhere. Tests pass because tests use mocked PDAs. Devnet caught it the moment a real `buildDepositTransaction` got submitted. This is the point of having a devnet at all. Unit tests will tell you that your math is consistent. They will not tell you that your program ID is wrong. Only an end-to-end submission against a real cluster catches that. ## What this taught me The test-to-deploy gap is the most expensive interval in any SDK's lifecycle. You can have 144 passing tests and still ship a constant pointing at the wrong program. The fix is not "more unit tests." The fix is one end-to-end test that submits a real transaction to a real cluster and asserts on the response. Surfpool made that possible without a public faucet, without a public RPC, without leaking devnet state to the world. The other thing this taught me: a 144-test suite for a ~3000-line SDK is roughly the right ratio. Less and you can't refactor with confidence. Much more and you're testing the language. Vitest's parallel runner means the whole suite finishes in ~2 seconds locally; CI runs it on every PR and the latency cost of a regression stays close to zero. ## Trade-offs **Why Vitest over Jest?** Native ESM, Vite-aligned config, faster start time. 
Jest's ESM story has improved but it still feels like a port. Vitest is the default if your project is already in the Vite/Bun half of the ecosystem.

**Why ship a hosted devnet at all?** Because partners and collaborators are not going to install Surfpool on day one. Giving them an HTTP endpoint that's already up is the difference between "I'll try it next week" and "I'm trying it right now."

**Why basic auth on the Studio dashboard?** Because it's a debug UI, not a public service, and exposing the validator state to anonymous internet traffic is a slow rug.

## Further reading

- [The 144-test commit](https://github.com/Dax911/zera-sdk/commit/809274f5d2f8d3708cb09f6a353fec889994d59c)
- [Devnet + quickstart commit](https://github.com/Dax911/zera-sdk/commit/e350707ba47247f1ec1feac439267d11848bfde6)
- [Surfpool documentation](https://surfpool.run)
- [Vitest](https://vitest.dev/)
- [Day-one SDK scaffolding](/blog/zera_sdk_scaffolding/) — what these tests are testing.

---

# Building the ZERA Wallet for desktop, iOS, and Android

Canonical: https://blog.skill-issue.dev/blog/zera_wallet_three_platforms/
Description: Three platforms, one shielded pool, one design system. The trade-offs of building a wallet that has to feel like cash on a phone, like a tool on a laptop, and the same on both.
Published: 2026-03-25T05:13:33.000Z
Tags: zera, wallet, react, typescript, mobile, ux

Most wallet posts start with the cryptography. This one starts with the part that is harder. The cryptography is solved. We have [Pedersen commitments](/blog/pedersen_commitments_in_production/), [nullifiers](/blog/nullifiers_without_witchcraft/), Groth16 proofs that run in human-tolerable time, and an [SDK with the right asymmetric MCP surface](/blog/mcp_server_inside_zera_sdk/).
The hard problem is the one that does not appear in any cryptography paper: **how do you make a wallet that feels like cash on a phone, like a tool on a laptop, and the same product on both?** This is the part of [Zera Wallet](https://wallet.zeralabs.org) that nobody quotes the marketing copy of. It is also the part that takes the most code. ## Three platforms is two too many — except it isn't The temptation when you launch a wallet in 2026 is to ship "mobile first" and let the desktop experience be a responsive cousin. There is a real argument for this: the median crypto user holds their assets on a phone, the offline-P2P story is a phone story (you tap two phones, you don't tap two laptops), and the mobile design constraints force discipline. We almost did that. The reason we didn't is the user we kept seeing in customer development: the *operator.* Operators are the people who run the treasury for a Zera-using business, who hold the cold-storage keys, who reconcile the books at end-of-quarter. They live on laptops. They want a wallet that gives them dense information — a real table of unspent notes, sortable, filterable, exportable to CSV. They are not an edge case; they are the customer who pays. So we ended up with the same product on three platforms with deliberately different information density: - **Mobile** — single-task, gesture-driven, big tap targets, NFC pairing for offline P2P, Face ID / biometrics gate on every send. - **Desktop** — multi-pane, keyboard-first, dense tables, hardware-key signing, multi-account view, CSV export. - **iOS / Android** — same Mobile UX, native share sheets, native NFC stack, platform-specific Secure Enclave integration. The thing that makes this tractable is that the *primitives* underneath are identical. Same shielded pool. Same SDK. Same Merkle tree. The wallet is just three different lenses over the same state. 
## The reference UX lives in `zera-wallet-demo` Before the production wallet got a single line of code, the [`zera-wallet-demo`](https://github.com/Dax911/zera-wallet-demo) repo was running. It was — and still is — the canonical reference for what the wallet should *feel* like. From [the v3 ZKP commit log](/blog/zera_wallet_v3_zkp/) it is clear we iterated on the mental model in the demo dozens of times before committing to the production shape. The demo's package.json is a fair list of the bets we made: ```json "dependencies": { "@solana/wallet-adapter-react": "^0.15.35", "@solana/wallet-adapter-react-ui": "^0.9.35", "@solana/web3.js": "^1.98.0", "framer-motion": "^11.11.17", "lucide-react": "^0.468.0", "react": "^19.0.0", "react-router-dom": "^7.1.0", "sonner": "^1.7.1", "tailwind-merge": "^2.6.0", "zeraswap-sdk": "workspace:*" } ``` A few of those choices deserve specific defence. ### React 19 was not the easy call React 19 was barely a year old when we started, and the wallet ecosystem on Solana is full of libraries that were tested against React 17 and 18 and quietly assume hooks behave a specific way. We took the upgrade hit because the Server Components story changes how we think about *this is sensitive data, do not render it client-side* — even though we mostly use it from the client side, the discipline of marking which components touch keys and which do not made the security model cleaner. ### `framer-motion` for trust signals You do not normally find a high-end animation library in a wallet codebase. We use it for one specific thing: the "send confirmed" state. The transition between *"you have authorised this send"* and *"this send is final on chain"* is a moment of maximum user anxiety. A jarring instant flip from a button to a checkmark looks like the app glitched. A 350ms eased fade-in with the prior state visible underneath, settling into a green check, looks like the app is doing something. 
The animation is the *trust signal.* `framer-motion` makes that easy to ship and impossible to do badly. We honour `prefers-reduced-motion` everywhere. The animation is decoration, not load-bearing. ### `sonner` for toasts Most toast libraries on React are ugly or overengineered. `sonner` is what happens when someone with taste shipped a toast library and called it done. The fact that it stacks gracefully and gets out of the way is the entire pitch. ### `lucide-react` for icons, no exceptions Across the entire Zera codebase — wallet, SDK, [zera-med](/blog/zera_med_zk_fhir/), [zeraswap](/blog/zeraswap_compressed_amm/), even [this blog](/blog/why_i_started_zera_labs/) — every single icon is from Lucide. One pack, one stroke weight, one optical alignment. This is the kind of decision that costs nothing to make in week one and is impossible to retrofit in year two. ## The mobile drawer, ported You can see the design language travel between repos in the [responsive HUD work on `zera_med_demo`](/blog/zera_med_responsive_hud/) — the mobile drawer that ships in `zera-wallet-demo` is the same component, give or take a tag, that we shipped in the medical-records demo two months earlier. That is what a design system is *for.* Not the Tailwind tokens, not the icon pack, but the muscle memory of "we have already solved 'phone with a sidebar that needs to also work on desktop.'" ## What the production wallet adds The demo is the lab. The production wallet adds three things the demo deliberately does not: 1. **Hardware-key signing** — Ledger and (TODO: Dax confirm — Trezor support is in progress) — for the operator desktop case. The demo signs entirely in the browser; the production app refuses to broadcast a transaction whose proof was constructed without a hardware-signed approval. 2. **Native iOS / Android shells** — TODO: Dax confirm exact framework choice (Tauri vs. React Native vs. native). 
The demo runs in a browser; the production app needs Secure Enclave access and the platform NFC stack, which means a real native shell. 3. **Compliance hooks** — for the venues that need them. ZERA is token-agnostic and venue-flexible. The wallet has a clean integration point for permissioned KYC layers without making them mandatory for the protocol. Reasonable people can disagree about how much compliance belongs at the wallet edge; we ship the surface and let the customer choose. ## The question I get asked the most > *Why a wallet at all? Isn't ZERA an SDK story?* The SDK is for developers. The wallet is for everyone else. **You cannot ship privacy as a primitive that only protocol engineers can integrate.** If we want a unified shielded pool to be the default for stablecoin transfers in 2027, the on-ramp has to be a wallet you can hand to your accountant, your sister, and an autonomous AI agent — and it has to feel obvious to all three. The wallet is the product. The SDK is the *contract.* ## Further reading - [zera-wallet-demo on GitHub](https://github.com/Dax911/zera-wallet-demo) — the reference UX - [wallet.zeralabs.org](https://wallet.zeralabs.org) — production landing - [Zera Wallet v3 ZKP](/blog/zera_wallet_v3_zkp/) — the commit log - [The MCP server inside zera-sdk](/blog/mcp_server_inside_zera_sdk/) - [A Privacy Demo That Works on a Phone](/blog/zera_med_responsive_hud/) — sibling design work - Solana Foundation, *Wallet Standard* — the React-side wallet-adapter contract --- # Zera Wallet v3: ZK Proofs in a Tauri Webview Canonical: https://blog.skill-issue.dev/blog/zera_wallet_v3_zkp/ Description: A Tauri 2 desktop wallet that proves Groth16 in the browser, persists encrypted notes locally, talks NFC to physical bearer cards, and never lets the private key out of Rust. Published: 2026-03-24T15:45:10.000Z Tags: zera, wallet, tauri, rust, react, zk, groth16, nfc The Zera SDK is the engine. The wallet is the car. 
Three weeks after the SDK shipped, I started building the v3 desktop wallet — Tauri 2 + React 18, with a Rust keystore that never lets the seed touch JavaScript and a webview that runs Groth16 provers in WebAssembly. The initial commit is [`39b5518`](https://github.com/Dax911/zera-wallet-demo/commit/39b55182b349da8896cd841dad753bb162ddcc48) on 2026-03-24. The follow-up — the one that made the wallet actually do anything — is [`660283f` — `ZKP core, real data layer, wallet unlock, note scanning`](https://github.com/Dax911/zera-wallet-demo/commit/660283fe9a16d7f1a471cdf06542f5592bf8ba9f) the same day. The third commit, [`d061813`](https://github.com/Dax911/zera-wallet-demo/commit/d061813aa98d83aa7dfcb59f0a7ce7c5ef3993d2) on 2026-03-25, added P2P send + NFC bearer cards. Three commits, ~3000 lines of meaningful code, full privacy stack. This post is about what's load-bearing in those three commits. ## The trust model: Rust holds the key The hardest design decision in any Tauri wallet is *where the private key lives*. The naive thing is to load it into JavaScript, sign in JS, send. The naive thing leaks the key the first time anything in the JS supply chain ([Rusty Pipes](/blog/rusty_pipes/), say) gets compromised. The right thing is `keystore.rs`: ```rust // src-tauri/src/keystore.rs #[derive(Debug, Clone, Serialize, Deserialize)] pub struct WalletFile { pub version: u32, pub salt: String, // Argon2 salt, hex pub nonce: String, // ChaCha20 nonce, hex pub ciphertext: String, // Encrypted payload (JSON: { seed, entropy }) pub pubkey: String, // base58, unencrypted for display before unlock } struct WalletPayload { seed: String, // 64-byte BIP39 seed for mnemonic, 32-byte raw key otherwise entropy: String, // 16-byte entropy for 12-word recovery key_type: String, // "mnemonic" or "raw_key" } ``` The seed lives in `$APPDATA/zera/wallet.enc`, encrypted with ChaCha20-Poly1305 under a key derived from the user's password via Argon2id. 
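The wallet-file shape above can be sketched end to end with Node's stdlib. This is an illustration of the flow, not the production Rust path: field names mirror `WalletFile`, and scrypt stands in for Argon2id, which the Node stdlib doesn't ship:

```typescript
import { scryptSync, randomBytes, createCipheriv, createDecipheriv } from "node:crypto";

// Derive a 32-byte key from the password. The real keystore uses Argon2id;
// scrypt is the stdlib stand-in here.
function deriveKey(password: string, salt: Buffer): Buffer {
  return scryptSync(password, salt, 32);
}

function encryptSeed(seed: Buffer, password: string) {
  const salt = randomBytes(16);
  const nonce = randomBytes(12); // ChaCha20-Poly1305 uses a 96-bit nonce
  const cipher = createCipheriv("chacha20-poly1305", deriveKey(password, salt), nonce, {
    authTagLength: 16,
  });
  // Append the Poly1305 auth tag so decryption can verify integrity.
  const ciphertext = Buffer.concat([cipher.update(seed), cipher.final(), cipher.getAuthTag()]);
  return {
    salt: salt.toString("hex"),
    nonce: nonce.toString("hex"),
    ciphertext: ciphertext.toString("hex"),
  };
}

function decryptSeed(
  file: { salt: string; nonce: string; ciphertext: string },
  password: string,
): Buffer {
  const data = Buffer.from(file.ciphertext, "hex");
  const tag = data.subarray(data.length - 16);
  const body = data.subarray(0, data.length - 16);
  const decipher = createDecipheriv(
    "chacha20-poly1305",
    deriveKey(password, Buffer.from(file.salt, "hex")),
    Buffer.from(file.nonce, "hex"),
    { authTagLength: 16 },
  );
  decipher.setAuthTag(tag);
  // Throws on a wrong password or tampered ciphertext — the AEAD property.
  return Buffer.concat([decipher.update(body), decipher.final()]);
}
```

The AEAD tag is what makes "wrong password" indistinguishable from "tampered file": both fail authentication rather than yielding garbage seed bytes.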
The pubkey is stored in plaintext so the unlock screen can show "Unlock wallet ABC123..." before the user types anything. The frontend never sees the seed. Ever. Sign requests go through a Tauri command:

```rust
#[tauri::command]
pub async fn sign_and_send_transaction(/* ... */) -> Result<String, String> {
    // Decrypt seed using the in-memory unlock key, sign tx, send to RPC,
    // return the signature (base58). Seed is zeroized at end of scope.
}
```

If the frontend gets compromised, the worst it can do is request signatures. It cannot exfiltrate the key.

## Importing keys from Phantom and Solflare without breaking the trust model

The keystore had to handle three import paths from day one:

1. **Generate a new wallet** — fresh BIP39 mnemonic, derive seed, encrypt, store.
2. **Import a 12/24-word mnemonic** — same as above but seeded by user input.
3. **Import a raw private key** — base58 from Phantom, base64 from Solflare, base58 from `solana-keygen`. The raw 32-byte key goes in `seed` with `key_type = "raw_key"` so the unlock path knows not to treat it as BIP39 entropy.

The viewing-key derivation is web-wallet-compatible — the same HKDF schedule the original web wallet used, so a user can import the same seed and see the same shielded notes. That backwards-compat constraint cost me a day; without it the wallet would have been quietly incompatible with the SDK's `MemoryNoteStore` semantics.

## Groth16 in a webview

The wallet ships [the same circuit files as the web wallet](https://github.com/Dax911/zera-wallet-demo/tree/660283fe9a16d7f1a471cdf06542f5592bf8ba9f/public/circuits): `deposit.wasm`, `deposit_final.zkey`, `withdraw.wasm`, `withdraw_final.zkey`, `transfer.wasm`, `transfer_final.zkey`, plus `relayed_withdraw` variants. The Tauri webview loads them statically, runs `snarkjs.groth16.fullProve`, gets a proof + public signals out, and hands them back to Rust to format for Solana.
The split is intentional: - **JS proves.** Because snarkjs is the canonical, audited Groth16 prover for circomlib circuits. - **Rust signs.** Because the seed lives there. The tx flow is therefore: ``` JS: build inputs → snarkjs.fullProve → proof + publicSignals JS: send to Tauri command with proof, commitment, recipient Rust: decrypt seed → build solana tx (using SDK builders) → sign → send Rust: return signature to JS JS: on success, append note to encrypted note store ``` Snarkjs is heavy — about 30s on a cold proof, 5–8s warm — but the alternative is "ship a Rust-native Groth16 prover," which is a multi-week project of its own and which would still need to consume the same `.zkey` artifacts. ## Notes are private. Notes are also a database. A privacy wallet without a note store is just a key manager. Every shielded transaction produces output notes that *only the recipient can decrypt*, and the recipient has to scan the chain to find them. The wallet ships [`src/lib/noteEncryption.ts`](https://github.com/Dax911/zera-wallet-demo/blob/660283fe9a16d7f1a471cdf06542f5592bf8ba9f/src/lib/noteEncryption.ts), which implements ECDH + nacl.box (XSalsa20-Poly1305). The plaintext format is versioned and binary-packed: ```ts // v2: single note — 169 bytes plaintext // [0x02][amount u64 LE][secret 32B][blinding 32B] // [asset 32B][commitment 32B][nullifier 32B] // v3: note pair — 145 bytes plaintext // [0x03][amt1 u64 LE][secret1 32B][blinding1 32B] // [amt2 u64 LE][secret2 32B][blinding2 32B] // Used for splits where both outputs go to the same key. // Packing two notes into one nacl.box saves 265 bytes on-chain. const BINARY_V2_LEN = 1 + 8 + 32 + 32 + 32 + 32 + 32; // 169 const BINARY_V3_LEN = 1 + 72 + 72; // 145 ``` Why not JSON? Two reasons: 1. **Bytes are cheap on Solana, JSON is expensive.** Every byte you encrypt is a byte you store on-chain (or in an encrypted memo). 169 binary bytes compress to about 80% the size of equivalent JSON. 2. 
**Format versioning is robust.** A leading tag byte (0x02, 0x03) lets older wallets recognize unsupported formats and fall back gracefully instead of decrypting garbage.

## Note persistence

The thing nobody warns you about with privacy wallets: **if you lose your local note store, you can only recover funds by scanning the on-chain Merkle tree with your viewing key.** That scan is slow, expensive in RPC calls, and has to be done from scratch every time. So the wallet auto-persists notes to disk:

```ts
// src/lib/notePersistence.ts
const NOTES_FILE = "zera/notes.json";
const NFC_FILE = "zera/nfc-cards.json";

export async function saveNotesToDisk(notes: any[]): Promise<void> {
  await mkdir("zera", { baseDir: BaseDirectory.AppData, recursive: true })
    .catch(() => {});
  await writeTextFile(NOTES_FILE, JSON.stringify(notes, null, 2), {
    baseDir: BaseDirectory.AppData,
  });
}
```

Notes auto-save on every change and load on startup. The encrypted-at-rest version of this is on the roadmap; for v3 the notes file is plain JSON in `$APPDATA`, which assumes the user trusts their own machine. The next iteration wraps it in the same ChaCha20 layer the keystore uses.

## NFC bearer cards

The wallet's most futuristic feature — and the one most likely to feel like sci-fi to anyone who hasn't used it — is NFC bearer cards. From the `d061813` commit message:

> NFC page: real shielded notes, arbitrary amounts, custom mint, write pool notes to tags, read tags back into pool

The model: take an unspent shielded note from your pool, serialize the encrypted plaintext into an NFC tag's NDEF record, hand the physical card to someone. They tap it on their wallet, the wallet pulls the encrypted blob, decrypts it with their viewing key, and the note becomes theirs. No on-chain transaction at all. The note's nullifier is only revealed when the recipient eventually spends it.
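The payload on the tag is the same versioned binary note format from `noteEncryption.ts`. A hypothetical pack/unpack for the v2 single-note layout, with field order taken from the comment above (type and function names are mine, not the repo's):

```typescript
// v2 single-note layout: [0x02][amount u64 LE][secret][blinding][asset][commitment][nullifier]
const BINARY_V2_LEN = 1 + 8 + 32 * 5; // 169 bytes

interface NoteV2 {
  amount: bigint;
  secret: Uint8Array;     // 32 bytes
  blinding: Uint8Array;   // 32 bytes
  asset: Uint8Array;      // 32 bytes
  commitment: Uint8Array; // 32 bytes
  nullifier: Uint8Array;  // 32 bytes
}

function packNoteV2(n: NoteV2): Uint8Array {
  const buf = new Uint8Array(BINARY_V2_LEN);
  buf[0] = 0x02; // version tag: lets old wallets reject formats they don't know
  new DataView(buf.buffer).setBigUint64(1, n.amount, true); // u64, little-endian
  let off = 9;
  for (const field of [n.secret, n.blinding, n.asset, n.commitment, n.nullifier]) {
    buf.set(field, off);
    off += 32;
  }
  return buf;
}

function unpackNoteV2(buf: Uint8Array): NoteV2 {
  if (buf.length !== BINARY_V2_LEN || buf[0] !== 0x02) {
    throw new Error("unsupported note format");
  }
  const amount = new DataView(buf.buffer, buf.byteOffset).getBigUint64(1, true);
  const at = (i: number) => buf.slice(9 + 32 * i, 9 + 32 * (i + 1));
  return { amount, secret: at(0), blinding: at(1), asset: at(2), commitment: at(3), nullifier: at(4) };
}
```

Everything here is fixed-width, which is why the version byte is enough: there is no length prefix to parse before you know whether you can parse at all.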
This is the "physical cash" path I'd been sketching since the [a better cryptocurrency](/blog/a_better_crypto/) post a year earlier, and the [m0n3y voting proposal](/blog/m0n3y_naming_a_dream/). The wallet shipped it as a real button. PC/SC and Proxmark3 hardware are both supported in `src-tauri/`.

## Trade-offs

**Why Tauri instead of Electron?** Because Electron ships a 200MB Chrome runtime and its security model has been a moving target for years. Tauri's webview + minimal-IPC model gives me the trust boundary I need (Rust ↔ JS) for free.

**Why snarkjs in JS instead of a Rust prover?** Because snarkjs is the audited canonical prover for circomlib circuits. Rolling my own Rust prover would have shifted weeks of audit risk onto a Rust crate that nobody else uses.

**Why plain JSON note persistence in v3?** Because the alternative was holding the wallet release for an encrypted-at-rest design pass that was already a TODO. v3 ships now; encryption-at-rest of the note store ships in v3.1.

**Why ship a viewing-key compatibility layer with the web wallet?** Because the only thing worse than a privacy wallet you can't import into is a privacy wallet that *silently* doesn't import the same notes. Compatibility is a design constraint that has to be in v1 of any new client.

## What this taught me

The trust boundary of a wallet is the most expensive surface in the project. Every subsystem you build either reinforces it (Rust holds the seed; JS sees ciphertexts) or breaks it (JS reads the keystore; key escrow services). v3 reinforced. The cost: ~30% of the codebase is the IPC plumbing. The benefit: a [Rusty Pipes](/blog/rusty_pipes/) compromise of the JS supply chain doesn't lose anyone's funds.
## Further reading - [zera-wallet-demo on GitHub](https://github.com/Dax911/zera-wallet-demo) - [Initial v3 commit](https://github.com/Dax911/zera-wallet-demo/commit/39b55182b349da8896cd841dad753bb162ddcc48) - [ZKP core + real data layer](https://github.com/Dax911/zera-wallet-demo/commit/660283fe9a16d7f1a471cdf06542f5592bf8ba9f) - [P2P send + NFC bearer notes](https://github.com/Dax911/zera-wallet-demo/commit/d061813aa98d83aa7dfcb59f0a7ce7c5ef3993d2) - [Tauri 2.x docs](https://v2.tauri.app/) - [snarkjs](https://github.com/iden3/snarkjs) — the Groth16 prover this wallet ships in JS. - [Building the Zera SDK day one](/blog/zera_sdk_scaffolding/) — the engine this wallet drives. --- # x402 Vector 2: partial-signing instruction injection Canonical: https://blog.skill-issue.dev/blog/x402_partial_signing_injection/ Description: The x402 client builds and partially signs the entire VersionedTransaction. A facilitator that validates structure but not bytes can co-sign a tx with extra clawback / drain instructions appended after the legitimate transfer. Published: 2026-03-23T18:00:00.000Z Tags: security, x402, solana, transaction-injection, research import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx"; The trust split in x402 is unusual. The **client** builds the entire VersionedTransaction. The **facilitator** signs as feePayer and submits. The facilitator pays gas; the client picks the recipient, the amount, the compute budget — *and the rest of the instructions*. Most facilitators validate that the tx contains a `TransferChecked` for the agreed-upon (mint, amount, recipient). They do not always validate that **nothing else** is in the tx. That's the bug. Post 3 of the [x402 attack surface series](/series/x402-attack-surface/). 
## The trust gap

A typical x402 transaction looks like:

```
[0] ComputeBudgetProgram::SetComputeUnitLimit(40_000)
[1] ComputeBudgetProgram::SetComputeUnitPrice(5)
[2] TokenProgram::TransferChecked(amount=1000, mint=USDC, src, dst)
```

The facilitator's `/verify` endpoint typically:

1. Decodes the tx.
2. Checks `feePayer == self.address`.
3. Loops through instructions; finds the `TransferChecked`; asserts amount + recipient match.
4. Returns 200.

What the facilitator usually does **not** check:

- The presence of *additional* instructions after the transfer.
- Whether the recipient ATA belongs to a token-2022 mint with a transfer hook that calls back into a malicious program.
- Whether the `mint` field of the `TransferChecked` matches the protocol-spec'd USDC mint *byte-for-byte* (the SPL token program checks the mint's pubkey but doesn't enforce a particular mint).

## Three injection patterns

### Pattern A: clawback via post-transfer CPI

Append an instruction that calls a custom program. Solana's authority model constrains what that instruction can do: the client signs only with their own key, so appended instructions cannot exercise the *facilitator's* authority. They can:

- Use the client's keypair (signing already authorized).
- Use any account the client controls.
- Burn compute units the facilitator pays for.

**Useful injection #1 (Pattern A1):** Burn the facilitator's CU budget. Append a 30k-CU compute-burning instruction (e.g., a no-op loop in a custom program). The transfer succeeds at 5k CU; the burn adds 30k CU; the facilitator pays for 35k CU instead of 5k. Per-tx gas drain magnified ~7×.

**Useful injection #2 (Pattern A2):** Force a fail *after* the transfer succeeds.
If the facilitator's verify path checks the transfer instruction is present but doesn't simulate the whole tx, an instruction that asserts a false condition (e.g., a custom `assert_value_equal(0, 1)`) **fails the entire transaction** and rolls back the transfer. The client's PAYMENT-RESPONSE looks valid (signed, submitted), the facilitator paid gas, but no token actually moved. Combined with the [settlement race](/blog/x402_settlement_race_condition/), this is monetisable.

### Pattern B: token-2022 transfer hook

If the destination ATA is a token-2022 mint *with a transfer hook program*, every transfer to that ATA triggers a CPI into the hook program — which runs with the privileges of the SPL Token-2022 invoker. The client controls the destination. If the client picks a destination ATA on a mint with a hostile transfer hook, the hook runs after the transfer with arbitrary code. The protocol spec says "use USDC", but spec compliance is enforced by the facilitator's validator, not by Solana itself.

### Pattern C: minimum-amount string trick

Combined with [Vector 9 (amount string parsing)](/blog/x402_amount_string_parser/): the PAYMENT-REQUIRED says `"1000"` (= $0.001). The client encodes `"01000"` in the SPL transfer — `"1000"` with a leading zero, which the validator's `parseInt` accepts. The attack lives in any divergence between what the validator's string parser computes and the raw `u64` base-unit amount the SPL program actually moves: lenient parsers, legacy leading-zero handling, rounding on parse. Some validators round; some don't. Mismatch = pay less than required.

## PoC sketch

```rust
// Pseudocode — see repo
fn craft_malicious_tx(facilitator: &Pubkey, client: &Keypair, amount: u64) -> VersionedTransaction {
    let mut ixs = vec![
        compute_budget::set_unit_limit(40_000),
        compute_budget::set_unit_price(5),
        spl_token::transfer_checked(...),
    ];

    // Inject a CU-burner that fires AFTER the transfer.
    ixs.push(custom_program::cu_burn_30k());

    let blockhash = recent_blockhash();
    let msg = Message::new_with_blockhash(&ixs, Some(facilitator), &blockhash);
    // Two signature slots: [0] = facilitator (feePayer), [1] = client.
    let mut tx = VersionedTransaction {
        signatures: vec![Signature::default(); 2],
        message: msg.into(),
    };

    // Client signs; the facilitator's slot stays default until /settle.
    tx.signatures[1] = client.sign_message(&tx.message.serialize());
    tx
}
```

## Mitigations

The fix is small but architecturally pointed:

1. **Whitelist the instruction prefix.** The facilitator's `/verify` should require the instruction list to be *exactly* `[ComputeBudgetSetUnitLimit, ComputeBudgetSetUnitPrice, TransferChecked]`, no extras. Reject anything with a 4th instruction.
2. **Pin the compute unit limit.** Don't accept client-supplied CU budgets above 5k. Inject your own.
3. **Pin the mint.** Don't accept any mint in the transfer; require an exact match against the facilitator's allowlist (`USDC mainnet only`).
4. **Simulate before signing.** Run `simulateTransaction` against the partially-signed tx before adding the feePayer signature. If sim fails or returns unexpected logs, reject.
5. **Reject token-2022 mints with hooks** unless the hook program is on an allowlist.

(1) and (2) together kill Pattern A, including the CU-burn variant. (3) and (5) together kill Pattern B. (4) is good defense in depth.

## Why the spec hasn't fixed this

Probably because the original x402 design assumes the client is benign — they're paying for content, why would they sabotage their own payment? The threat model that breaks this is "the client is also the merchant" or "the client is also a competing facilitator" or just "the client is a researcher". Once you accept that the spec must work against malicious clients, Pattern A1 (CU burn) is the single highest-impact fix.
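Mitigation 1 is small enough to sketch. The decoded-instruction shape and names here are assumptions, not x402 reference code; a real facilitator would decode these from the `VersionedTransaction` message before comparing:

```typescript
// Sketch: require the instruction list to be exactly the expected
// three-instruction prefix. Anything appended is rejected outright.
interface DecodedInstruction {
  programId: string; // base58 program id
  kind: string;      // e.g. "SetComputeUnitLimit", "TransferChecked"
}

const COMPUTE_BUDGET = "ComputeBudget111111111111111111111111111111";
const TOKEN_PROGRAM = "TokenkegQfeZyiNwAJbNbGKPFXCWuBvf9Ss623VQ5DA";

const EXPECTED: Array<[programId: string, kind: string]> = [
  [COMPUTE_BUDGET, "SetComputeUnitLimit"],
  [COMPUTE_BUDGET, "SetComputeUnitPrice"],
  [TOKEN_PROGRAM, "TransferChecked"],
];

function verifyInstructionList(ixs: DecodedInstruction[]): boolean {
  // Length check alone closes Pattern A: a 4th instruction means rejection.
  if (ixs.length !== EXPECTED.length) return false;
  return ixs.every(
    (ix, i) => ix.programId === EXPECTED[i][0] && ix.kind === EXPECTED[i][1],
  );
}
```

Note that this is a positional whitelist, not a "contains a TransferChecked" scan — the difference between the two is exactly the injection surface described above.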
## Bibliography - [Dax911/x402_mal/research/instruction-injection/](https://github.com/Dax911/x402_mal/tree/main/research) - Solana Token-2022 Transfer Hook: [docs.solanalabs.com](https://spl.solana.com/token-2022/extensions#transfer-hook) - ComputeBudgetProgram: [solana.com/docs](https://solana.com/docs/core/transactions/runtime#compute-units) Previous: [Settlement race ←](/blog/x402_settlement_race_condition/) · Next: [Facilitator gas drain →](/blog/x402_facilitator_gas_drain/) --- # x402 Vector 1: settlement race condition Canonical: https://blog.skill-issue.dev/blog/x402_settlement_race_condition/ Description: Coinbase x402's verify→settle pipeline isn't atomic. A client can submit the same PAYMENT-SIGNATURE to multiple facilitators in parallel, or race the facilitator with a direct on-chain submission. Double-spend within blockhash validity (~60s). Published: 2026-03-22T18:00:00.000Z Tags: security, x402, solana, race-condition, research import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx"; The cleanest vulnerability in the x402 protocol — also the one that's easiest to fix and the one most likely to bite production deployments. This post walks through the settlement race in detail, gives a PoC layout, and lists the mitigations that'd close it. Post 2 of the [SOLMAL series](/series/x402-attack-surface/) on x402. ## The bug The x402 settlement flow has two server-side calls: 1. `POST /verify { tx, expected }` — facilitator returns 200 if the partially-signed tx is well-formed and pays the right amount. 2. `POST /settle { tx }` — facilitator co-signs as feePayer and submits the tx to Solana. In every reference implementation I've seen, these are independent HTTP handlers with no shared lock. **A signed PAYMENT-SIGNATURE can be submitted to `/settle` more than once.** The facilitator will: - Re-co-sign with feePayer (idempotent — same tx hash). - Re-submit to Solana RPC. 
Solana itself deduplicates — the second submission of an already-confirmed tx returns `AlreadyProcessed`. Fine. But there's a window between *the client submitting tx_a* and *Solana confirming tx_a* during which a different transaction *signed with the same client key* (tx_b, with a different blockhash or a different recipient ATA, and therefore a different signature) can also be settled. The client paid once; the merchant believed they were paid; the underlying ledger says otherwise.

## Three concrete scenarios

### Scenario 1: parallel facilitator submission

If a network has multiple facilitators (Coinbase plus third parties), the client can:

1. Build the PAYMENT-SIGNATURE.
2. POST to facilitator A's `/settle`.
3. POST the same payload to facilitator B's `/settle` 50ms later.
4. Both facilitators submit. The first to land on Solana wins; the loser sees `AlreadyProcessed`.

Result: only one tx settles on-chain, but both facilitators did the settlement work, and **both believed they had successfully settled the payment** (depending on RPC client timing). Some facilitator implementations cache `/settle` responses by request hash; others cache by tx signature; others don't cache. The cache discrepancy is the monetisable bit.

### Scenario 2: client-side bypass

1. Client receives `PAYMENT-REQUIRED` with feePayer=facilitator.
2. Client builds the PAYMENT-SIGNATURE.
3. Client **also** builds a *different* tx, signed with the same client key, paying a different recipient ATA — say, a second ATA the client controls.
4. Client submits the second tx directly to Solana RPC.
5. Client submits the PAYMENT-SIGNATURE to `/settle`.

If Solana confirms the *second* tx first (because the facilitator's RPC is in a different region with higher latency), the merchant's real settlement fails. The client never paid the merchant; the client paid themselves. The merchant might still grant access if they don't watch the on-chain confirmation tightly.

### Scenario 3: rapid replay

Submit the same `PAYMENT-SIGNATURE` to the same facilitator 50 times in 1 second.
If the facilitator's `/settle` handler doesn't lock-and-dedupe on the request payload hash, every call submits to Solana. 49 of 50 will fail with `AlreadyProcessed`, but during the racing window some may compute against stale state and reach unexpected outcomes (rent reclaims, ATA-init double-fee, etc.). ## PoC structure The repo contains a Rust harness in [research/race-spammer/](https://github.com/Dax911/x402_mal/tree/main/research): ```rust // Pseudocode — see repo for the runnable version. async fn race_test(facilitator: &Url, client: &Keypair) -> RaceResult { let req = build_payment_request(client); let sig = build_payment_signature(&req, client); let handles: Vec<_> = (0..50).map(|_| { let url = facilitator.clone(); let s = sig.clone(); tokio::spawn(async move { settle(&url, s).await }) }).collect(); futures::future::join_all(handles).await } ``` 50 parallel `/settle` calls. Count: how many got HTTP 200? How many led to a confirmed Solana tx? How many cost the facilitator gas? ## Mitigations The fix is small but it does have to be coded: 1. **Atomic verify+settle.** Combine the two endpoints, or have `/settle` re-run verify under a lock keyed by the tx signature. 2. **Per-signature dedup.** Cache settled tx signatures in Redis / KV with TTL = blockhash validity (~60s) + safety margin. Reject duplicate `/settle` calls with HTTP 409. 3. **Confirmation polling.** `/settle` should not return until the tx is confirmed (level=`processed` minimum, ideally `confirmed`). Currently most implementations return on RPC submit, not on confirmation. 4. **Per-client rate limit on `/settle`.** Even with dedup, a malicious client can create N distinct signatures. Limit per-IP and per-client-key. Of these, (2) is the easy win. KV cache keyed by signature, TTL of 90 seconds. Stops scenarios 1 and 3 dead. ## What this means for x402 deployments If you're operating an x402 facilitator: implement (2) before going live. 
The TTL needs to be longer than blockhash validity to cover the late-replay edge case. Use Cloudflare Workers KV, AWS DynamoDB, Redis — anything with sub-100ms eventual consistency. If you're a merchant integrating x402: don't grant content access until the facilitator's `/settle` returns AND your own RPC poll confirms the tx. The current spec lets merchants act on the facilitator's word; the spec needs an explicit "signature confirmed at slot S" field, and merchants need to poll until they see that slot ≤ current_slot - 32 (final). ## Bibliography - [Dax911/x402_mal SOLMAL.md](https://github.com/Dax911/x402_mal/blob/main/SOLMAL.md) — full threat model - Solana Foundation. *Transaction confirmation levels.* - Coinbase Developer Platform. *x402 specification.* Previous: [Series intro ←](/blog/x402_attack_surface_intro/) · Next: [Partial-signing instruction injection →](/blog/x402_partial_signing_injection/) --- # x402 Vector 3: facilitator gas drain Canonical: https://blog.skill-issue.dev/blog/x402_facilitator_gas_drain/ Description: x402 facilitators pay all transaction fees and the spec defines no per-client rate limit. A flood of valid-looking transactions that fail at maximum compute-unit consumption is a per-request economic attack on the facilitator. Published: 2026-03-21T18:00:00.000Z Tags: security, x402, solana, dos, economic-attack, research import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx"; The x402 protocol has a fee model where the **facilitator pays gas**. This is the entire UX win — AI agents don't need SOL to make payments. It's also the entire economic attack surface. Post 4 of the [x402 attack surface series](/series/x402-attack-surface/). ## The economics For each settled x402 transaction, the facilitator pays: - 5,000 lamports base fee (Solana minimum, ~$0.001 at SOL=$200). - `compute_unit_limit × compute_unit_price` priority fee (configurable, max 5 microlamports/CU per spec, max 40,000 CU). 
- Worst-case priority fee: 40,000 CU × 5 microlamports/CU = 200,000 microlamports = 0.2 lamports — negligible next to the base fee.

So per tx, facilitator outflow is bounded at roughly the base fee, ~$0.001. That's fine for legitimate traffic. It's not fine when an attacker generates valid-looking PAYMENT-SIGNATUREs at 1000 req/sec.

## The attack: maximum-CU failure

Three flavors of failing transaction that maximally hurt the facilitator:

### Flavor A: legitimate-looking transfer that fails post-CU-burn

Recall from [Vector 2](/blog/x402_partial_signing_injection/): the client controls the instruction list. Append a custom-program call that:

1. Burns 35,000 CU in a no-op loop.
2. Asserts `False`, failing the entire tx.

Outcome: the facilitator pays the 5,000-lamport base fee plus the priority fee — Solana charges fees on failed transactions too. The tx reverts. The merchant gets nothing. Repeat at scale.

### Flavor B: valid-but-rejected mint

Specify the wrong mint in the SPL `TransferChecked`. The instruction validates client-side because the client controls the bytes. The instruction fails on-chain because the mint pubkey doesn't match the ATA. Solana's runtime evaluates the instruction before it can detect the failure — so the facilitator pays for the CU consumed.

### Flavor C: ATA derivation mismatch

The client supplies a destination ATA that derives from `(owner=A, mint=USDC)` but specifies `(owner=B, mint=USDC)` in the instruction. Solana's `transfer_checked` verifies the ATA is consistent with the supplied owner+mint and rejects. CU consumed: full instruction cost.

All three flavors share the property: **the facilitator's `/verify` returns 200 (validators check structure, not bytes)**, the facilitator pays gas to settle, the tx reverts on-chain, the merchant doesn't get paid, and the attacker has cost the facilitator money for nothing in return.

## Quantification

A single attacker on a residential Comcast connection can sustain ~100 req/sec to a single facilitator endpoint. At ~$0.001/tx:

- 100 req/sec × 60 sec/min × $0.001/tx = **$6/min in facilitator gas**
- Across an 8-hour business day: **$2,880/day**

Multiple attackers behind different IPs (e.g., a botnet of 1,000) push the daily cost toward $3M. Facilitator margins on legitimate x402 traffic are pennies per transaction. A sustained gas-drain attack at that scale burns the facilitator's runway in days.

## Why the spec doesn't address this

The spec assumes a "trusted client" model. AI agents operate semi-autonomously and **don't have an incentive to attack the facilitator they're paying** — except when they do. Specific incentive structures that make this attractive:

1. A competitor (rival facilitator) wants the target out of business.
2. A nation-state actor wants to disrupt agentic-economy infrastructure.
3. A researcher (this writer) wants to demonstrate the bug.
4. An AI agent that's been adversarially prompted to drain its own facilitator's funds.

Threat (4) is the one I find most interesting. An LLM that's been jailbroken via prompt injection could — at no cost to itself — execute the gas-drain attack against the facilitator, which is operationally what x402 was designed to make easy.

## Mitigations

In rough order of effectiveness:

1. **Per-client rate limit on `/settle`.** The facilitator must enforce N transactions per client-keypair per minute. A default of ~10 sounds fine; it can be raised for trusted clients via API key.
2. **CU budget cap.** The facilitator overrides the client's `set_unit_limit` and `set_unit_price` instructions, pinning them to ≤5,000 CU and 1 microlamport/CU. This caps the priority-fee component of the outflow (the 5,000-lamport base fee is fixed regardless).
3. **Pre-flight simulation.** Before adding the feePayer signature, run `simulateTransaction`. If sim returns `Err`, reject before paying gas. Shifts cost to a quick simulation call.
4. **Reputation-based throttle.** Track each client-keypair's settlement success ratio. Drop clients with under a 50% success rate to a lower rate limit.
5.
**Stake-or-pay deposits.** Out-of-band: clients deposit a small SOL bond with the facilitator. Failed transactions debit from the bond. Removes the asymmetric-cost property entirely. (1) is the bare minimum. (3) is the most operationally complex but also the most thorough. (5) is the Real Fix but requires protocol changes. ## What I'd do if I were operating an x402 facilitator ```python # Pseudocode for the verify+settle endpoint @app.post("/settle") async def settle(req: SettleRequest): client_pk = extract_client_pubkey(req.tx) # 1. Per-client rate limit if not await rate_limit.allow(client_pk, max=10, window=60): return 429, "rate_limited" # 2. Re-validate (don't trust /verify) if not validate_tx(req.tx, expected_mint=USDC, max_cu=5_000): return 400, "invalid_tx" # 3. Pre-flight simulate sim = await rpc.simulate_transaction(req.tx) if sim.err is not None: # Failed simulation = don't pay gas return 400, "sim_failed" # 4. Add feePayer sig + submit signed = sign_with_feepayer(req.tx) sig = await rpc.send_transaction(signed) # 5. Wait for confirmation before returning success await rpc.confirm_transaction(sig, level="confirmed") return 200, {"signature": sig} ``` Cost of all this: ~50ms added latency per settlement, plus one extra RPC call. Worth it. ## Bibliography - [Dax911/x402_mal/research/gas-drain-bench/](https://github.com/Dax911/x402_mal/tree/main/research) - Solana Compute Unit Pricing: [solana.com/docs](https://solana.com/docs/core/transactions/runtime#compute-units) - Cloudflare KV rate limiting patterns Previous: [Partial-signing injection ←](/blog/x402_partial_signing_injection/) · Next: [AI-agent wallet drain →](/blog/x402_ai_agent_wallet_drain/) --- # SOLMAL: the x402 attack surface (series intro) Canonical: https://blog.skill-issue.dev/blog/x402_attack_surface_intro/ Description: Mapping the attack surface of Coinbase's x402 micropayment protocol on Solana. 
Series intro covering the verify→settle pipeline, the actor model, the 9 vectors, and the responsible-disclosure timeline.
Published: 2026-03-20T18:00:00.000Z
Tags: security, x402, solana, ai-agents, research

import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx";

Coinbase shipped [x402](https://x402.org/) — a micropayment protocol that piggybacks on HTTP 402 (Payment Required) — in late 2025. It is, on paper, brilliant: AI agents pay for API access via stablecoin micropayments embedded in HTTP headers, the merchant doesn't need to run a payment processor, and a third-party "facilitator" sponsors gas on Solana so the agent doesn't need any SOL. In early 2026 I spent a few weeks staring at the protocol. I came away with **9 distinct attack vectors** plus a meta-finding about AI-agent wallets that is, I think, the single biggest risk. This post is the series opener.

## The protocol in 30 seconds

Three actors: **client** (an AI agent with a Solana wallet), **resource server** (the API the agent wants to call), **facilitator** (validates payments, sponsors gas, settles on-chain).

<Mermaid chart={`sequenceDiagram
    C->>S: GET /endpoint
    S-->>C: 402 + PAYMENT-REQUIRED header
    C->>C: Build partial VersionedTransaction
    Note over C: client signs, feePayer = facilitator
    C->>S: GET + PAYMENT-SIGNATURE header
    S->>F: /verify (is this tx valid?)
    F-->>S: 200 ok
    S->>F: /settle (co-sign as feePayer, submit)
    F->>F: Submit to Solana
    F-->>S: 200 + signature
    S-->>C: 200 + content + PAYMENT-RESPONSE
`}/>

The Solana-specific bits:

- Client builds a `VersionedTransaction` with an SPL `TransferChecked` instruction.
- `feePayer` is the facilitator's address.
- Client only **partially signs** (their key); the facilitator adds the feePayer signature.
- USDC mint: `EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v`. 6 decimals, so `"1000"` = $0.001.
- Compute budget capped at 40,000 CU, max 5 microlamports/CU.
- Blockhash valid ~60 seconds (151 slots).

## The 9 vectors

| # | Vector | Severity | Post |
|---|--------|----------|------|
| 1 | Settlement race condition | High | [walkthrough](/blog/x402_settlement_race_condition/) |
| 2 | Transaction manipulation (partial signing) | High | [walkthrough](/blog/x402_partial_signing_injection/) |
| 3 | Facilitator gas drain | Medium | [walkthrough](/blog/x402_facilitator_gas_drain/) |
| 4 | Blockhash replay window | Medium | (post 6) |
| 5 | Facilitator impersonation via feePayer field | Medium | (post 6) |
| 6 | AI-agent wallet exploitation | **High** | [walkthrough](/blog/x402_ai_agent_wallet_drain/) |
| 7 | Header injection / parsing bugs | Medium | (post 6) |
| 8 | ATA derivation manipulation | Medium | (post 6) |
| 9 | Amount-string parsing | Medium | [walkthrough](/blog/x402_amount_string_parser/) |

The five posts in this series cover Vectors 1, 2, 3, 6, 9 in detail. Vectors 4, 5, 7, 8 are noted in the [SOLMAL.md](https://github.com/Dax911/x402_mal/blob/main/SOLMAL.md) research log and will land as a sweep post once I've written PoCs for each.

## What's load-bearing about each vector

**Vector 1 (Settlement Race).** The verify→settle pipeline isn't atomic.
A client can submit the same `PAYMENT-SIGNATURE` to multiple facilitators in parallel, or race the facilitator's submission with a conflicting transaction posted directly to Solana. Settlement double-execution lasts as long as the blockhash is valid (~60s). **Vector 2 (Partial Signing).** The client builds the entire transaction. The facilitator validates structure but typically doesn't audit *every byte* of every instruction. A malicious client appends extra instructions — a token-2022 hook, a clawback, an arbitrary CPI — that fire after the transfer. **Vector 3 (Facilitator Gas Drain).** The protocol specifies no per-client rate limit on the facilitator. Crafted transactions that **fail validation in the worst possible way** (consuming maximum CU before reverting) are still paid for by the facilitator. Economic DoS. **Vector 6 (AI-Agent Wallet).** The agent has a programmatic keypair and auto-approves payments below a price threshold. A service that starts at $0.001/req and ramps to $0.10/req over 1000 requests drains the wallet *without ever crossing the threshold*. The threshold check is done per-request, not per-session, not per-vendor. **Vector 9 (Amount Parsing).** Amounts in x402 are JSON strings like `"1000"`. Different implementations parse `"1000"` vs `"1e3"` vs `" 1000 "` vs `"+1000"` vs `"01000"` differently. Mismatch between facilitator's validator and Solana's actual transfer = monetisable. ## Disclosure posture This is **public research** against an open protocol with multiple independent implementations. I did not test against any specific facilitator without permission. The PoCs target a mock facilitator I wrote in the [research/](https://github.com/Dax911/x402_mal/tree/main/research) tree. For specific vendor implementations: - I have not contacted Coinbase. The protocol is open; the bugs are in the spec, not in any single implementation. - If your team operates an x402 facilitator and any of this looks live in your code: please email me. 
Bridge: [haydenaylor911@gmail.com](mailto:haydenaylor911@gmail.com).
- I'll honour a 90-day embargo if you have a remediation plan.

## What's coming in the series

5 deep-dive posts on the highest-impact vectors:

1. [Settlement race condition](/blog/x402_settlement_race_condition/) — Vector 1, double-spend within blockhash validity
2. [Partial-signing instruction injection](/blog/x402_partial_signing_injection/) — Vector 2, append-and-execute
3. [Facilitator gas drain](/blog/x402_facilitator_gas_drain/) — Vector 3, economic DoS
4. [AI-agent wallet drain](/blog/x402_ai_agent_wallet_drain/) — Vector 6, slow-burn pricing
5. [Amount-string parser fuzzing](/blog/x402_amount_string_parser/) — Vector 9, JSON-numeric edge cases

Plus a sweep post for Vectors 4, 5, 7, 8 once the PoCs land.

## Bibliography

- Coinbase Developer Platform. *x402 Specification.* https://x402.org/
- HTTP/1.1: Semantics. *RFC 7231 §6.5.2 (402 Payment Required).*
- Solana Foundation. *VersionedTransaction documentation.*
- [Dax911/x402_mal](https://github.com/Dax911/x402_mal) — research repo; SOLMAL.md is the threat-model log.

Next in the series: [Settlement race condition →](/blog/x402_settlement_race_condition/)

---

# Building the Zera SDK: Day One

Canonical: https://blog.skill-issue.dev/blog/zera_sdk_scaffolding/
Description: Sixteen commits in fourteen minutes. The first day of the @zera-labs/sdk monorepo — Rust core via neon-rs, TypeScript scaffolding, Poseidon, Merkle trees, ZK provers, and an MCP server for AI agents.
Published: 2026-03-05T21:54:29.000Z
Tags: zera, typescript, rust, sdk, zk, poseidon, mcp, solana

> "init monorepo structure"

That commit message — [`af8cc28`](https://github.com/Dax911/zera-sdk/commit/af8cc28644e055bebc6e6688c3b7d534aca5b202), 2026-03-05T21:54:29Z — is when the Zera SDK began.
Sixteen commits and fourteen minutes later, the scaffolding was done: a Rust crate, a Neon native binding, a TypeScript SDK with Poseidon + Merkle + provers + transaction builders, and an MCP server. The whole arc is visible on [the commit log](https://github.com/Dax911/zera-sdk/commits/main) — all sixteen commits inside a fourteen-minute window, every commit doing exactly one thing. This post is about how the day-one scaffolding was structured, why I split it into 16 atomic commits, and what each piece actually does.

## The shape of the monorepo

Four packages from the start:

```
packages/
  zera-core/     # Rust crate — circuit-aligned crypto primitives
  zera-bindings/ # Neon-rs node bindings exposing zera-core to JS
  sdk/           # @zera-labs/sdk — TypeScript SDK
  mcp-server/    # @zera-labs/mcp-server — MCP tools for AI agents
```

The reason `zera-core` exists in Rust is that the on-chain Solana program is also in Rust, and the SDK has to compute Poseidon commitments and Groth16 proof formatting *exactly* the way the on-chain verifier does. JS and Rust agreeing on a 254-bit field element is the kind of thing that goes wrong silently. Moving the canonical implementation to Rust and exposing it to JS via Neon kept the two halves bitwise consistent.

The TypeScript SDK is what 95% of users will touch. The MCP server is the bet that the next class of "user" will be an AI agent, not a human in a wallet popup.
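That "goes wrong silently" failure is concrete enough to pin down in a test. Here is a minimal Python sketch of the invariant a cross-language consistency test has to enforce: one canonical byte encoding for field elements on both sides of the FFI. The 32-byte little-endian choice and the function names are my illustration, not necessarily what zera-core actually uses.

```python
# Sketch of the invariant a JS <-> Rust consistency test pins down: one
# canonical byte encoding for BN254 field elements, with out-of-range
# values rejected loudly instead of silently reduced.

BN254_PRIME = 21888242871839275222246405745257275088548364400416034343698204186575808495617

def fe_to_bytes(x: int) -> bytes:
    """Encode a field element canonically; reject out-of-range values."""
    if not 0 <= x < BN254_PRIME:
        raise ValueError("not a canonical field element")
    return x.to_bytes(32, "little")

def fe_from_bytes(b: bytes) -> int:
    """Decode, refusing non-canonical (>= p) encodings."""
    x = int.from_bytes(b, "little")
    if x >= BN254_PRIME:
        raise ValueError("non-canonical encoding")
    return x

# The property both halves must agree on: encode/decode round-trips
# bit-exactly for every element, including ones far above 53 bits.
x = BN254_PRIME - 1
assert fe_from_bytes(fe_to_bytes(x)) == x
```

A cross-language test then feeds the same random elements through the Rust encoder and this reference and asserts byte equality, which is exactly the kind of check atomic scaffolding makes easy to bolt on later.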
## Atomic commits as a design discipline If you scan the [commit log](https://github.com/Dax911/zera-sdk/commits/main), you'll see this pattern from `af8cc28` through `e350707`: ``` af8cc286 init monorepo structure 7ba37e6e add zera-core rust crate d605b930 add neon-rs node bindings for zera-core 4aaa8def add ts sdk package scaffolding + crypto primitves 26f77955 add note mgmt, merkle tree, pda helpers, utils f713bb3f add zk prover + voucher system cd518d5a add transaction builders for deposit/withdraw/transfer 0debc6af add tree state client for fetching merkle tree from chain f4beda30 add ZeraClient high-level wrapper + NoteStore 27786470 update barrel exports w new modules d150f829 add mcp server package for ai agent integration 98bc0a88 add core sdk documentation dc2937ae add agentic integration guide + use cases af03daf2 add examples, security doc, and current status analysis ``` Each commit is one logical concept and nothing else. The reason this matters: when you're scaffolding a 200-file SDK in a single session, the sane way to bisect a regression two months later is to `git revert` a single concept. If the Merkle tree breaks, you revert `26f77955`. If the prover wires wrongly, you revert `f713bb3f`. If you ship the whole thing as one mega-commit, you can't isolate. It also makes the SDK reviewable. There's no "Initial commit (12,000 lines)." You can read it in 14 minutes the way I wrote it in 14 minutes. ## The crypto layer The cryptographic foundation is in [`packages/sdk/src/crypto/poseidon.ts`](https://github.com/Dax911/zera-sdk/blob/4aaa8def935c617cb447040bb6cb6f22aeefbf4e/packages/sdk/src/crypto/poseidon.ts). Poseidon is the hash function we use everywhere — for note commitments, for Merkle nodes, for nullifiers. It's circuit-friendly, which means it's cheap to prove inside a Groth16 circuit. SHA-256 inside a circuit is *thousands* of constraints. Poseidon is dozens. 
```ts
import { buildPoseidon } from "circomlibjs";

let poseidonInstance: any = null;

export async function getPoseidon(): Promise<any> {
  if (!poseidonInstance) poseidonInstance = await buildPoseidon();
  return poseidonInstance;
}

export async function poseidonHash(inputs: bigint[]): Promise<bigint> {
  const poseidon = await getPoseidon();
  const hash = poseidon(inputs.map((v: bigint) => poseidon.F.e(v)));
  return BigInt(poseidon.F.toObject(hash));
}

export async function poseidonHash2(left: bigint, right: bigint): Promise<bigint> {
  return poseidonHash([left, right]);
}
```

The singleton is load-bearing. `buildPoseidon` initializes WASM that takes ~80ms cold. If every Merkle node hash had to spin that up, building a tree with `TREE_HEIGHT = 24` would take 30 seconds.

## Notes are bigints all the way down

From [`types.ts`](https://github.com/Dax911/zera-sdk/blob/4aaa8def935c617cb447040bb6cb6f22aeefbf4e/packages/sdk/src/types.ts):

```ts
export interface Note {
  amount: bigint;
  asset: bigint;
  secret: bigint;
  blinding: bigint;
  memo: [bigint, bigint, bigint, bigint];
}

export interface StoredNote extends Note {
  commitment: bigint;
  nullifier: bigint;
  leafIndex: number;
}
```

Every field is a `bigint`. The reason: every field has to be reducible mod the BN254 prime to enter a circuit, and that's a 254-bit operation. JS `Number` is 53 bits. Using `bigint` from day one means every constant in the SDK is correct as written:

```ts
export const BN254_PRIME = BigInt(
  "21888242871839275222246405745257275088548364400416034343698204186575808495617",
);
```

The cost of `bigint` everywhere is that you can't `Math.max` your way out of a comparison. The benefit is that you can never lose a low bit by accident.
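The 53-bit ceiling is easy to see concretely. Python floats are the same IEEE-754 doubles as a JS `Number`, so the failure mode demos in a few lines:

```python
# Why bigint everywhere: a float64 (what a JS Number is) carries 53 bits of
# mantissa, so a 254-bit field element loses its low bits without any error.

BN254_PRIME = 21888242871839275222246405745257275088548364400416034343698204186575808495617

rounded = int(float(BN254_PRIME))  # simulate routing the value through a JS Number
print(rounded == BN254_PRIME)      # False: the low bits are silently gone

# The same failure in miniature: 2**53 + 1 is the first integer a double can't hold.
print(float(2**53) == float(2**53 + 1))  # True: two different ints, one float
```

No exception is raised at any point, which is exactly why the corruption is "silent" and why the constants above are written as `BigInt` strings rather than numeric literals.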
## `createNote`: the most important six lines

```ts
import { randomBytes } from "crypto";

function randomFieldElement(): bigint {
  const bytes = randomBytes(31); // 248 bits – safely below the 254-bit prime
  const value = BigInt("0x" + bytes.toString("hex"));
  return value % BN254_PRIME;
}

export function createNote(
  amount: bigint,
  asset: bigint,
  memo?: [bigint, bigint, bigint, bigint],
): Note {
  return {
    amount,
    asset,
    secret: randomFieldElement(),
    blinding: randomFieldElement(),
    memo: memo ?? [0n, 0n, 0n, 0n],
  };
}
```

The note's `secret` is what derives the nullifier. If you can predict it, you can predict the nullifier, and your transaction is forensically linkable. Sampling 248 bits and reducing mod the BN254 prime is the standard recipe; sampling 256 bits would bias the distribution slightly toward small field elements after the modular reduction.

## Transaction builders: the SDK's actual surface area

From [`tx/deposit.ts`](https://github.com/Dax911/zera-sdk/blob/cd518d5ace208ebebf5852ed38c8dff11b6d23b4/packages/sdk/src/tx/deposit.ts):

```ts
export function buildDepositTransaction(params: DepositParams): Transaction {
  const { payer, mint, amount, commitment, proof, publicInputs, programId } = params;

  // Derive PDAs
  const [poolConfig] = derivePoolConfig(mint, programId);
  const [merkleTree] = deriveMerkleTree(mint, programId);
  const [vault] = deriveVault(mint, programId);
  // ...
}
```

Three transaction builders: `buildDepositTransaction`, `buildWithdrawTransaction`, `buildTransferTransaction`. Each one consumes a Groth16 proof + commitment, derives the right PDAs, and returns an unsigned `Transaction`. The signing is intentionally not the SDK's job. That's the wallet's job, and embedding signing in an SDK is what gives you a tarball of leaked keys six months later.

## ZeraClient: the high-level wrapper

By the time we got to `f4beda30 — add ZeraClient high-level wrapper + NoteStore`, the lower-level pieces were composable enough that one class could orchestrate them.
The wrapper takes a config: ```ts export interface ZeraClientConfig { rpcUrl: string; programId?: string; circuits: { deposit: CircuitPaths; withdraw: CircuitPaths; transfer: CircuitPaths; }; noteStore?: NoteStore; cacheEndpoint?: string; } ``` …and exposes one method per high-level operation. `client.deposit(amount, mint)`, `client.withdraw(commitment)`, `client.transfer(amount, recipient)`. Behind each method is the pipeline: fetch tree state → load relevant circuit WASM → prove → build tx → return unsigned `Transaction` for the wallet to sign. `NoteStore` is an interface with one default in-memory implementation and a contract that says "if you persist notes, you're responsible for not leaking them." Most consumers will plug an encrypted file backend. The wallet demo plugs Tauri's filesystem with Argon2id-derived keys; we'll get to that in [the Zera Wallet v3 post](/blog/zera_wallet_v3_zkp/). ## MCP: betting on agents The most experimental thing on day one was `@zera-labs/mcp-server`. From [`packages/mcp-server/src/index.ts`](https://github.com/Dax911/zera-sdk/blob/d150f8294dca2bdcfd4f3b38da53b346aef64773/packages/mcp-server/src/index.ts): ```ts const server = new McpServer({ name: "zera-protocol", version: "0.1.0" }); server.tool( "zera_deposit", "Deposit USDC into the ZERA shielded pool. Funds become private and untraceable after deposit.", { amount: z.number().positive().describe("Amount of USDC to deposit (e.g., 100.50)"), memo: z.string().optional().describe("Optional memo for your records (stored privately, never on-chain)"), }, async ({ amount, memo }) => { /* … */ }, ); ``` If the only thing that talks to your protocol is wallets, your TAM is "humans who installed an extension." If MCP-connected agents can also call your protocol, your TAM is "every Claude/Cursor/Cline session anyone runs." That's a 100× delta. The bet is cheap — `mcp-server` is one ~400-line file plus the SDK it depends on. 
If agents end up *not* using zk-shielded pools, I lose 400 lines. If they do, I get there first.

## Trade-offs

**Why circomlibjs instead of a hand-rolled Poseidon?** Because circomlib is the canonical implementation that the circuits are written against. Re-implementing Poseidon for the host is exactly the kind of "I'll save 50ms" choice that fails an end-to-end test in week three.

**Why Neon instead of WASM for `zera-core`?** Because the SDK ships to Node and to a Tauri webview, both of which natively support `.node` files. WASM would have meant another loader, another fetch, another async boundary. Neon is one `require`.

**Why ship the MCP server in the same monorepo?** Because the moment you give it its own repo, it falls behind on SDK changes. Same monorepo, same `pnpm-workspace.yaml`, same lockfile. One `pnpm install` and you're done.

## What this taught me

Atomic commits are the difference between an SDK that's reviewable and an SDK that's trusted. Every dependency relationship in the scaffolding above is one-directional and one-commit-at-a-time. That's why the 144-test suite ([`809274f5`](https://github.com/Dax911/zera-sdk/commit/809274f5d2f8d3708cb09f6a353fec889994d59c)) that landed three weeks later could be written without rewriting any of the underlying code — see [the next post](/blog/zera_sdk_test_suite/).

## Further reading

- [zera-sdk on GitHub](https://github.com/Dax911/zera-sdk)
- [The 16-commit scaffolding sequence](https://github.com/Dax911/zera-sdk/commits/main)
- [circomlibjs — Poseidon implementation](https://github.com/iden3/circomlibjs)
- [Neon — Rust ↔ Node bindings](https://neon-bindings.com/)
- [Model Context Protocol](https://modelcontextprotocol.io/) — the spec MCP servers implement.
- [Building A Better Cryptocurrency](/blog/a_better_crypto/) — the privacy thesis these primitives implement.
--- # Cruiser: A Tauri Hookup App on iroh, Geohash-Bucketed Presence, and Why P2P Dating Is Actually Fine Canonical: https://blog.skill-issue.dev/blog/cruiser_iroh_gossip_p2p/ Description: A Tauri 2 + React + iroh-gossip dating app where peers find each other by geohash, broadcast presence on a topic-per-bucket, and DM each other with consent signals — all without a central server. The architecture is the product. Published: 2026-02-26T15:15:06.000Z Tags: cruiser, tauri, iroh, p2p, gossip, rust, geohash, solana The dating app market in 2026 is two things: dystopian centralized platforms (Match Group's stable: Tinder, Hinge, OkCupid, etc.) and crypto-bro-coded alternatives that promise decentralization but ship a Mongo cluster behind the API. Neither is what the queer community I built Cruiser for actually wanted. They wanted **a dating app where the only servers were the participants' own devices**, where presence was bucketed by location without any single party seeing all locations, and where the wallet was the identity but the wallet wasn't a custody surface. Cruiser shipped on 2026-02-26 in [`4cecbd4 — Cruiser: P2P hookup app — Phases 1–26`](https://github.com/Dax911/cruiser/commit/4cecbd4cf64fe2bdcd44f0aa3b6db83b1ebd3a05). 89 files. ~20,000 lines. Tauri 2.x for the desktop wrapper, React 18 + Zustand for the UI, [iroh-gossip](https://github.com/n0-computer/iroh) for the P2P transport, Solana for the wallet/payment rails. The full mono-commit is the result of 26 phases of design + implementation that I'd been working on locally, then squashed into one commit before pushing. This post is about the gossip-presence architecture in particular, because that's where the "no servers" promise actually has to be defended. ## The geohash topic split iroh-gossip is a publish/subscribe protocol over a peer-mesh, with content-addressed `TopicId`s. Every peer that subscribes to a `TopicId` joins the same gossip mesh and exchanges messages. 
The naive thing is to use *one* topic for the whole app — `cruiser/v1` — and broadcast every presence announce to every peer. This is a privacy disaster. It means every peer sees every other peer's broadcast, including their location. The architecture that ships in Cruiser is per-geohash topics:

```rust
// src-tauri/src/gossip_presence.rs
const AREA_TOPIC_PREFIX: &str = "cruiser/area/v1/";

let topic_bytes = location::topic_from_geohash(AREA_TOPIC_PREFIX, geohash6);
let topic_id = TopicId::from_bytes(topic_bytes);
```

The topic is `cruiser/area/v1/<geohash6>`. **Every 6-character geohash bucket is its own topic.** A geohash6 covers approximately a 1.2 km × 0.6 km area — small enough to be a single neighborhood, large enough to have actual users in it. Two peers join the same topic only if they're in the same geohash6 bucket.

This is the privacy architecture in one decision: **you can only see the presence of peers who chose to be visible in the same geographic bucket as you.** A peer in San Francisco can't see a peer in Berlin's gossip. Even within a city, a peer in the Mission can't see a peer in the Castro, because those are different geohash6 buckets.

The cost of this design: peers walk between geohash6 boundaries (you cross a street, you're in a new bucket). The app handles this by *leaving* the old topic and *joining* the new one whenever the user's geohash6 changes. That's the lifecycle the `ActiveArea` struct manages:

```rust
pub struct ActiveArea {
    pub topic_id: TopicId,
    pub geohash6: String,
    pub sender: Arc<GossipSender>,
    broadcast_handle: JoinHandle<()>,
    receive_handle: JoinHandle<()>,
    reaper_handle: JoinHandle<()>,
}

impl ActiveArea {
    pub fn leave(self) {
        self.broadcast_handle.abort();
        self.receive_handle.abort();
        self.reaper_handle.abort();
    }
}
```

`leave` aborts all three Tokio tasks — broadcast loop, receive loop, peer-cache reaper — and drops the topic subscription. The next location update triggers a `join_gossip_area` for the new geohash, and the cycle repeats.
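The only "protocol" here is that every peer independently derives the same 32-byte `TopicId` from the same bucket string, with no coordination server. A hedged Python sketch of that property — the real derivation lives in Cruiser's Rust `location` module, and the SHA-256 choice below is my stand-in for whatever it actually uses:

```python
import hashlib

AREA_TOPIC_PREFIX = "cruiser/area/v1/"

def topic_from_geohash(prefix: str, geohash6: str) -> bytes:
    """Derive a 32-byte topic ID from a geohash bucket. Any two peers that
    compute the same prefix+bucket string land on the same gossip topic."""
    return hashlib.sha256((prefix + geohash6).encode()).digest()

# Two peers in the same ~1.2 km x 0.6 km bucket derive the same topic...
a = topic_from_geohash(AREA_TOPIC_PREFIX, "9q8yyk")
b = topic_from_geohash(AREA_TOPIC_PREFIX, "9q8yyk")
assert a == b

# ...while a peer one bucket over lands on a completely different mesh.
c = topic_from_geohash(AREA_TOPIC_PREFIX, "9q8yym")
assert a != c

print(len(a))  # 32
```

The hash also makes the topic ID opaque: seeing a `TopicId` on the wire doesn't reveal the geohash unless you already know the bucket and can recompute it.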
## The three tasks per area

Every joined area spawns:

1. **A broadcast task** that sends a `PresenceAnnounce` (your profile snippet, your endpoint ID, your tags) every 30 seconds.
2. **A receive task** that handles incoming announces and updates a local `PeerCache`.
3. **A reaper task** that runs every 60 seconds and evicts peers that haven't announced in 90 seconds.

```rust
const BROADCAST_INTERVAL_SECS: u64 = 30;
const REAPER_INTERVAL_SECS: u64 = 60;
```

The 30s broadcast / 90s eviction (= reap if no announce in 3 broadcasts) gives you a "go offline within 90s" guarantee. If a peer disconnects from WiFi, walks out of range, or quits the app, every other peer's view of them ages out within a minute and a half. No central server needed to mark them offline.

This is the *whole* mechanism for "who's online right now in your area." There is no `presence-server.cruiser.app/online`. The presence is the gossip itself.

## What an announce looks like

```rust
// src-tauri/src/presence.rs (simplified)
#[derive(Serialize, Deserialize, Clone)]
pub struct PresenceAnnounce {
    pub endpoint_id: String,  // iroh node ID — the P2P address
    pub geohash6: String,     // your bucket (intentionally redundant for receivers)
    pub display_name: String,
    pub avatar_hash: String,  // CID of avatar image; sender mirrors via iroh-blobs
    pub bio_short: String,    // ~80 chars max
    pub energy: String,       // a profile field — "🔥 high energy", "🌙 chill", etc.
    pub tags: Vec<String>,    // user-set tags for search/filter
    pub last_seen_ms: u64,    // sender's local clock at announce time
    pub signature: String,    // ed25519 sig over the rest, by the user's identity key
}
```

A few interesting design calls:

**Avatar by CID, not inlined.** The avatar is a hash; the actual image bytes are fetched via [iroh-blobs](https://docs.rs/iroh-blobs/) on a separate transport from the gossip topic. Inlining the avatar would balloon every announce to ~50KB and make the gossip topic unreasonably noisy. CID + lazy fetch is ~256 bytes per announce.
**`signature` is the integrity surface.** Every announce is signed by the user's identity key. A peer receiving an announce verifies the signature before adding it to the peer cache. Without this, anyone could broadcast an announce claiming to be anyone else; with it, an impostor announce is detected and dropped.

**`last_seen_ms` is the announcer's clock.** Not a synchronized clock. The receiver uses this for "rough freshness" but not for anti-replay — anti-replay is handled by the iroh-gossip layer's own message dedup based on content hash + topic.

## Direct messages: a separate topic per pair

DMs work the same way, with a different topic shape. From `src-tauri/src/gossip_dm.rs`:

```
cruiser/dm/v1/<sorted-endpoint-ids>
```

The endpoint IDs of the two peers are sorted lexicographically and concatenated. Both peers compute the same topic ID. Joining the topic establishes a 2-peer gossip mesh. Messages are encrypted with `nacl.box` (XSalsa20-Poly1305) using the peers' x25519 keys, derived from their ed25519 identity keys.

The threat model:

- **An eavesdropper on the gossip mesh** sees the topic ID (which is opaque without the endpoint IDs that produced it) and ciphertext. They learn nothing about the participants or content.
- **A passive observer of the iroh DHT** sees the two endpoint IDs subscribing to a common topic, which leaks "these two people are in a DM" but not the content. Acceptable; DMs in any system leak metadata at this level.
- **A man-in-the-middle** can't insert messages, because they're encrypted with `nacl.box` keyed to the receiver's pubkey. They can't silently drop messages either: there are no acks, but a gap in the message ordering would be visible to the receiver.

The DM topic also handles tips, consent signals, location sharing, typing indicators, read receipts, and emoji reactions. All of those are just message variants in the same encrypted topic — there's no separate channel for them.
The reason: a separate channel for "I'm typing" would itself leak the metadata "person A is typing to person B" without authentication. Folding everything into the encrypted DM topic eliminates that side channel. ## Why iroh-gossip and not libp2p I evaluated three P2P stacks before landing on iroh: - **libp2p (Rust):** the de-facto standard. Powerful, but operationally heavy — DHT, NAT traversal, transports, and a non-trivial topology config. It's overkill for a single-purpose app. - **GossipSub (libp2p):** the gossip protocol within libp2p. Closer to what I needed, but still requires the full libp2p stack as host. - **iroh + iroh-gossip:** purpose-built for "P2P Rust app needs gossip." Smaller surface area, batteries-included relay/DHT/NAT-traversal via iroh's hosted public infrastructure. Subjectively faster to ship. iroh hosts a public relay infrastructure (`relay.iroh.network`) that handles NAT traversal and STUN-style address discovery. Most home users are behind NAT, so without relay infrastructure most P2P apps don't work in practice. iroh's relay is opt-in and free for development; that's what I used. The trade-off: **iroh is younger than libp2p**, the API surface is still moving, and the network effects are smaller (fewer peer apps to interop with). For Cruiser this is fine — there are no peer apps it needs to interop with — but for a project that wanted to join the existing libp2p universe, iroh would be the wrong call. ## CoreLocation, IP fallback, and the geolocation rabbit-hole The whole gossip architecture above is meaningless without the user's actual location. Browser `navigator.geolocation` doesn't work in Tauri's macOS WKWebView (wry auto-denies the permission). The follow-up commit [`d2b9cc8 — Phase 27: Native CoreLocation for macOS`](https://github.com/Dax911/cruiser/commit/d2b9cc8) is where I solved that, and it's [its own post](/blog/cruiser_corelocation_objc2/). Worth noting here: the system has *three* fallback layers for location: 1. 
Native CoreLocation (macOS) / GeoClue2 (Linux) / Windows.Devices.Geolocation (Windows). Best accuracy.
2. IP-based geolocation via ipinfo.io. Used when native services are unavailable or denied.
3. Manual override (you type your geohash6 into a settings field). Used for testing and for users who don't want their actual location used.

Each layer feeds the same `geohash6` value to `join_gossip_area`. The peer doesn't care how the geohash was computed; they care that the geohash is honest and stable.

## What "Phase 1–26" means

The mono-commit covers 26 design phases. A non-exhaustive sample of what each phase added:

- Phase 1–3: Identity (ed25519 key + Solana pubkey).
- Phase 4–6: Profile (avatar, bio, energy, tags).
- Phase 7–9: Gossip presence (the architecture above).
- Phase 10–12: DM chat (encrypted, with media + tips + consent signals).
- Phase 13: Block list.
- Phase 14: Favorites.
- Phase 15: Notifications.
- Phase 16: Themes.
- Phase 17: Search.
- Phase 18: Onboarding (the new-user flow).
- Phase 19–21: Chat management (delete threads, relative timestamps, profile peek).
- Phase 22–25: Dev tools (seed peers for local testing, SOL airdrop UI).
- Phase 26: The final polish + the squash into one commit.

The reason to squash 26 phases into a single commit is that the local development repo had 200+ commits with messages like `wip` and `fix wallet sig` and `actually now it works`, and that's not a public history. The squash gives readers a single coherent diff that says "this is what shipped." The cost: you lose the ability to bisect within Phase 1–26. The benefit: you don't subject the public to a noisy 200-commit history.

## What I'd do differently

**The PeerCache should be persistent.** Right now, when you restart the app, you lose the in-memory peer cache and have to wait 30s for the next broadcast cycle to repopulate. Persisting it (and re-validating on next announce) would make the first second of app launch feel responsive instead of empty.
**The geohash6 boundary needs hysteresis.** Crossing a geohash boundary triggers a topic-leave / topic-join cycle. If you walk along the boundary you can flap between buckets every few seconds. The fix is to wait for a few consecutive readings on the new bucket before switching, or to subscribe to *both* buckets while in transition. Neither is implemented in the initial commit; both are easy add-ons. **The signature scheme should bind to the topic.** Right now an announce signed for topic A could be replayed on topic B by an adversary who controls a relay. Including the topic ID in the signed payload would prevent that. Easy fix; on the to-do list. ## Trade-offs **Why a desktop app first instead of mobile?** Because Tauri's desktop story was mature in 2025 and the iOS / Android mobile bindings were still beta. The [Phase 29 iOS commit](/blog/cruiser_ios_xcode_cloud/) shipped iOS support a few weeks later; Android is still pending. **Why Tauri instead of Electron?** Same reason as the [Zera Wallet v3](/blog/zera_wallet_v3_zkp/): smaller bundle, sane Rust↔JS IPC, and the Rust side can hold long-running background tasks (gossip loops, location service) without spinning up a separate process. **Why a per-pair DM topic instead of a single shared "DMs" topic?** Because per-pair topics are the right scope for routing — only the two participants subscribe — whereas a shared topic would require every peer to receive every DM and filter by recipient. That's both wasteful and a metadata leak. **Why no central reputation/abuse system?** Because the moment you ship a central reputation system, the system is no longer P2P. The Cruiser approach is: every peer maintains their own block list, locally. Abuse is mitigated by the absence of a global directory — you can only be discovered by people in your geohash6, so the attack surface is bounded by your physical area. 
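For what it's worth, the boundary hysteresis described under "What I'd do differently" is only about a dozen lines. A sketch in Python of the consecutive-readings approach (the class, the names, and the three-reading threshold are all mine, for illustration):

```python
# Hedged sketch of geohash-boundary hysteresis: only leave/join gossip
# topics after K consecutive readings agree on a new bucket, so walking
# along a boundary doesn't flap the topic subscription.

class BucketTracker:
    def __init__(self, initial: str, k: int = 3):
        self.active = initial  # bucket whose gossip topic we're currently joined to
        self.candidate = None  # bucket we've started seeing readings for
        self.streak = 0
        self.k = k

    def observe(self, geohash6: str) -> bool:
        """Feed one location reading; returns True when it's time to switch topics."""
        if geohash6 == self.active:
            self.candidate, self.streak = None, 0
            return False
        if geohash6 == self.candidate:
            self.streak += 1
        else:
            self.candidate, self.streak = geohash6, 1
        if self.streak >= self.k:
            self.active, self.candidate, self.streak = geohash6, None, 0
            return True
        return False

t = BucketTracker("9q8yyk")
# Flapping along the boundary: no topic churn.
print([t.observe(g) for g in ["9q8yym", "9q8yyk", "9q8yym", "9q8yyk"]])
# [False, False, False, False]

# A real move: three consecutive readings flip the bucket exactly once.
print([t.observe(g) for g in ["9q8yym", "9q8yym", "9q8yym"]])
# [False, False, True]
```

The dual-subscription variant (stay joined to both buckets during the transition) trades a little extra gossip traffic for zero presence gaps; this version trades a few seconds of lag for simplicity.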
## What this taught me

P2P-as-architecture is mostly *constraints management*: deciding what state is allowed to be global (almost nothing), what state is allowed to be partial (peer caches, ephemeral), and what state is fully local (your block list, your profile, your settings). Once you've drawn those lines, the rest of the design falls out.

The other thing I learned is that **iroh deserves more attention.** It's the smallest dependency I've ever shipped that supports a real P2P product. Most P2P stacks are 50,000-line behemoths. iroh-gossip + iroh-net + iroh-blobs is enough infrastructure for a real app and the code surface is comprehensible.

## Further reading

- [Cruiser on GitHub](https://github.com/Dax911/cruiser)
- [The Phase 1–26 mono-commit](https://github.com/Dax911/cruiser/commit/4cecbd4cf64fe2bdcd44f0aa3b6db83b1ebd3a05)
- [iroh on n0.computer](https://www.iroh.computer/) — the P2P stack.
- [`iroh-gossip` docs](https://docs.rs/iroh-gossip/) — the pub/sub layer.
- [Cruiser CoreLocation post](/blog/cruiser_corelocation_objc2/) — how the geolocation layer works.
- [Cruiser iOS + Xcode Cloud](/blog/cruiser_ios_xcode_cloud/) — the App Store push.
- [Cruiser+ landing page](/blog/cruiser_site_satori_poster/) — the marketing surface.

---

# Why I started Zera Labs

Canonical: https://blog.skill-issue.dev/blog/why_i_started_zera_labs/
Description: Three things became true in the same year — ZK got fast enough, Solana got cheap enough, and AI agents needed verifiable money. Sitting at the intersection felt like a ship date, not a thesis.
Published: 2026-02-20T08:00:00.000Z
Tags: founders, zera, zk, solana, ai, narrative, founder-letter

This is the post I keep wanting to skip. It's the founding letter — the one where I'm supposed to explain, in clean prose, why a perfectly happy senior IC with a security-research side hustle decided to incorporate a thing and put his name on the door. I have started writing it five times.
The other four versions all went a little too hard on the *grand cryptographic destiny of the human race* angle, which is not the kind of post I'd respect if I read it in someone else's feed. So here is the version that survived: three things became true at roughly the same time, in the same year, and sitting at the intersection of those three things felt much more like a ship date than a thesis. That's the whole pitch. The rest of this letter is just walking the three legs of the tripod.

## Leg one: ZK got fast enough to be boring

I have been reading zk papers for, depending on how you count, six or seven years. The thing about zk papers is that the *math* doesn't get faster — the math has been there since Goldwasser, Micali, and Rackoff's 1985 paper. What gets faster is the *engineering*. Better proving systems (Groth16 → PLONK → Halo2 → STARKs → folding schemes). Better hashes inside circuits (Pedersen → Poseidon → Reinforced Concrete). Better hardware (CPU SIMD → GPUs → FPGAs → the inevitable ASIC). Better libraries (snarkJS → arkworks → halo2 → Lurk → Risc Zero).

In 2018, you could prove a non-trivial program in a circuit and submit it to Ethereum, but you needed a research lab and a friend at a hardware accelerator company. In 2024, you could prove a non-trivial program in a circuit on a laptop in a few seconds and submit it to a chain that didn't price proof verification like a war crime. In 2026, the prover is fast enough that **a wallet can do it on the user's machine for a normal interactive payment** without the user noticing.

That last sentence is the entire reason ZK leaves the lab. The bar for "leaves the lab" is ruthless. It isn't "research demo at Devcon." It's: a non-technical user, on their existing laptop, opens a wallet, clicks Send, waits less than a coffee sip, and a Groth16 proof has gone over the wire to settle the transaction. Until that is true, ZK lives in conferences and academic papers. Once that is true, ZK eats a chunk of the financial system.
That is the point we are at right now. I built [zera-sdk](/blog/zera_sdk_scaffolding/) and the [Zera Wallet v3](/blog/zera_wallet_v3_zkp/) to be the first products to ship after that line was crossed. Not after the line will be crossed, after some round, after some grant. After. It already happened. We are mostly waiting for the rest of the industry to notice.

## Leg two: Solana stopped being a gas-fee story

I came up at ConsenSys. I love the EVM the way you love a complicated relative — deeply, suspiciously, with a lot of patience. But the EVM was not designed for a world in which a privacy-preserving deposit costs you a single-digit number of cents and a transfer costs less. The EVM was designed for a world in which compute is precious and you charge by the opcode.

Solana is the opposite design point. Compute is cheap, throughput is high, parallelism is the default, and — critically — Light Protocol's compressed-token primitive lets you push almost the entire account state of a token into an off-chain Merkle tree. The savings are not marginal. They are something like 5000× per token. I spent a weekend porting a notional AMM to Solana for the first time and the gas numbers came out so low I assumed I had a math error. I did not. The chain is just that much cheaper.

I wrote about the implications in [ZeraSwap: An AMM for Compressed Tokens](/blog/zeraswap_compressed_amm/). The short version: when the per-account-state cost of a token drops by three and a half orders of magnitude, every assumption you had about the *granularity* of tokenisation has to be re-examined. You can have one token per medical record. One token per receipt. One token per proof. The cost of "putting it on chain" stops being a budgeting decision and starts being a *naming* decision.

ZK is the privacy half. Compressed tokens are the bandwidth half. If you have both, you have the substrate I would have wanted for [the cryptocurrency we should have built](/blog/a_better_crypto/).
## Leg three: AI agents need verifiable money

This is the leg that tipped me from "interesting hobby" to "I'm doing this full-time." It's also the leg most people get wrong, so I want to walk it carefully.

If you have not played with the Model Context Protocol yet, the elevator version is: an AI agent (Claude, Cursor, Cline, your custom thing) connects to a *server* that exposes tools the agent can call. The server might be a calendar. The server might be a database. The server might be — and here is where it gets interesting — a wallet.

In 2025 a lot of teams glued LLMs to wallets and discovered, predictably, that the result was funny but not safe. Funny because LLMs are very confident; not safe because wallets, being unverified pieces of state, can be lied to in ways the model has no way to verify. The result was a small wave of "agent steals the demo wallet's testnet ETH" videos that everyone enjoyed and then forgot about.

The fix isn't smaller models or more guardrails. The fix is **verifiable cryptographic state**. If the agent asks the server "do I have the right to spend this note?", the server should be able to produce a proof that the agent can verify *locally*, with the same trust model the chain itself uses. Not a screenshot. Not an oracle. A Groth16 proof that the agent's runtime checks against the same verifying key the chain holds.

This is the reason `@zera-labs/mcp-server` exists, and it's the reason it shipped on the [first day](/blog/zera_sdk_scaffolding/) of the SDK rather than as a v2 feature. If agents are going to interact with money — and the rate at which the next generation of agentic products is being shipped tells me they are — they need the same cryptographic verifiability that human users now expect from a wallet. The MCP layer is the agent's wallet. The SDK below it is the cryptographic verifiability. The chain underneath is the settlement.

You don't have to believe the agent thesis is going to be huge.
You only have to believe it isn't going to be zero. The MCP server is, on the day this letter ships, less than 500 lines of code. If the bet is wrong, I lose 500 lines of code. If it's right, the SDK ships into a market that is roughly 100× larger than the human-wallet market.

## Why a company instead of more posts

People who have read me for a while know I do most of my thinking out loud, in writing, on this blog. There's an obvious version of all the above that's just *more posts about it*. Why a whole company.

Two reasons. First: the surface area is larger than one person. The SDK alone is a Rust crate, a Neon binding, a TypeScript SDK, a prover, an MCP server, three transaction builders, a Surfpool devnet, a 144-test Vitest suite, and a documentation surface. The wallet is its own product. The AMM is its own product. The medical demo is its own product. The design system is its own product. I cannot ship that on weekends. Nobody can.

Second: the work is more credible inside a company. When the SDK lands an audit, that audit lands on Zera Labs, not on "some guy with a blog." When the first integration partner asks who's accountable if the prover regresses, the answer is "Zera Labs," not "I'll get to it Tuesday." When a customer asks for a SOC 2, the answer is "we're working on it" instead of laughter. The legal and operational scaffolding is part of the product.

I want to be clear about what I'm *not* claiming. I'm not claiming the team is huge. I'm not claiming we've raised a round. I'm not claiming we have customers I can name. I am claiming we have a working SDK, a working wallet, a working AMM, a working medical demo, a working devnet, and a Design System we use across the company. The rest is sequencing.

## The animating principle

Every company has one sentence that explains what it is willing to be embarrassed about and what it is willing to be loud about.
The one I keep coming back to for Zera Labs is:

> *We build cryptographic infrastructure that is fast enough, cheap enough, and verifiable enough to leave the laboratory. Everything else is taste.*

"Fast enough" is the ZK leg. "Cheap enough" is the Solana / compressed-token leg. "Verifiable enough" is the agentic leg. "Everything else is taste" means the design system, the documentation, the tone of the blog, the choice of dependencies, the way we write commit messages, the way we run incident response. None of those things are in the trade-off space. They are the part where the company has to be the company.

If any of the three legs of the tripod were missing, this would be a research lab or a side project. All three are present. The thing to do, then, is to ship.

## Where to next

If you want the technical receipts:

- [Building the Zera SDK: Day One](/blog/zera_sdk_scaffolding/) — the 14-minute session that put the foundation in.
- [144 Tests and a Surfpool Devnet](/blog/zera_sdk_test_suite/) — the bridge from "the code exists" to "you can use it."
- [ZeraSwap: An AMM for Compressed Tokens](/blog/zeraswap_compressed_amm/) — the bandwidth half.
- [Zera Wallet v3](/blog/zera_wallet_v3_zkp/) — the user-facing half.
- [ZK-FHIR](/blog/zera_med_zk_fhir/) — the proof we can do this for things other than money.

If you want the personal receipts: [Nuclear reactors taught me to ship software](/blog/nuclear_reactors_taught_me_to_ship/) and [What running a Bitcoin mine taught me about cloud margins](/blog/what_running_a_bitcoin_mine_taught_me/) are the two prior chapters. This one is chapter three.

If you want to *use* the work — `dax@skill-issue.dev`. The calendar's [here](https://cal.com/daxts).

That's the founding letter. Now I have to go ship the next thing.
---

# Prediction Markets, LP Locks, and an Admin Page That Doesn’t Suck

Canonical: https://blog.skill-issue.dev/blog/prediction_markets_admin/
Description: How I bolted CPMM prediction markets onto ZeraSwap, locked LP for graduated tokens, and built a 5-tab admin panel before the first malicious actor showed up.
Published: 2026-02-18T19:31:55.000Z
Tags: zera, solana, anchor, prediction-markets, cpmm, admin, governance

A week after [the AMM shipped](/blog/zeraswap_compressed_amm/) I had two open feature requests from people who were actually using it:

1. "I want to bet on whether $TOKEN graduates by Friday."
2. "Why doesn't the launchpad lock LP after graduation? You're going to get rugged."

Both fair. Both were addressed in [`16aa30d` — `Add prediction markets, LP locking, graduation flow, comprehensive admin, and USD pricing`](https://github.com/Dax911/z_trade/commit/16aa30d3ed2f552f743886a647ba1fc7f4773aed) on 2026-02-18. 55 files changed. Let's unpack the parts that actually matter.

## Prediction markets as a CPMM

A prediction market is just a CPMM with two outcome reserves instead of a token + SOL pair. From [`sdk/src/prediction_math.ts`](https://github.com/Dax911/z_trade/blob/16aa30d3ed2f552f743886a647ba1fc7f4773aed/sdk/src/prediction_math.ts):

```ts
// CPMM: shares_out = outcome_reserves * sol_after_fee
//                  / (other_reserves + sol_after_fee)
export function calcBuyOutcome(
  solIn: bigint,
  yesReserves: bigint,
  noReserves: bigint,
  outcome: "yes" | "no",
  feeBps: bigint,
): { sharesOut: bigint; fee: bigint } {
  const fee = (solIn * feeBps) / BPS_DENOMINATOR;
  const solAfterFee = solIn - fee;
  const outcomeReserves = outcome === "yes" ? yesReserves : noReserves;
  const otherReserves = outcome === "yes" ? noReserves : yesReserves;
  if (outcomeReserves === 0n) return { sharesOut: 0n, fee };
  const sharesOut =
    (outcomeReserves * solAfterFee) / (otherReserves + solAfterFee);
  return { sharesOut, fee };
}

// YES price = no_reserves / (yes_reserves + no_reserves)
export function calcOutcomePrice(yesReserves, noReserves) {
  const total = yesReserves + noReserves;
  if (total === 0n) return { yesPrice: 0.5, noPrice: 0.5 };
  // ...
}
```

The trick is the *price*. In a YES/NO CPMM the price of YES is just the ratio of NO reserves to total reserves. That's because if YES is "expensive" (lots of YES shares already sold), there's less YES reserve left, and the next dollar buys you fewer YES shares. The math is symmetric.

I picked CPMM over LMSR because:

- The LP doesn't need to subsidize liquidity. Whoever creates the market puts up real SOL on both sides and earns the fees.
- It uses literally the same `x*y=k` engine as ZeraSwap's swap path, so I could reuse the slippage and `MathOverflow` checks I'd already debugged.
- Resolution is a single instruction that drains the losing side into the protocol and pays out the winning side proportionally.

Six instructions on-chain: `create_market`, `buy_outcome`, `sell_outcome`, `resolve_market`, `claim_winnings`, `void_market` (plus protocol fee collection). The void path is the safety valve — if the resolution oracle disappears or the market becomes ambiguous, the admin can void and refund pro-rata.

## LP locking: the part that actually makes graduation safe

Before this commit, when a launchpad token graduated to a real ZeraSwap pool, the LP tokens were minted to the launchpad authority and that was that. Nothing stopped the launch creator from yanking liquidity 30 seconds later. Classic rug.

The fix is `LpLock` PDA + `lock_liquidity` and `extend_lock` instructions, and a check in `remove_liquidity` that consults the lock state. Now graduation locks the launch's LP for a configurable window.
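One plausible shape for that `remove_liquidity` check, sketched in TypeScript rather than the actual Anchor constraint. The field names and the partial-unlock policy are assumptions for illustration, not the program's verbatim logic:

```typescript
// Hypothetical mirror of the on-chain LpLock check. The real version is a
// constraint inside the Anchor remove_liquidity instruction.
interface LpLock {
  pool: string;            // pool this lock applies to
  lockedAmount: bigint;    // LP tokens under lock
  unlockTimestamp: number; // unix seconds when the lock expires
}

function assertCanRemove(
  lock: LpLock | null,
  lpToBurn: bigint,
  lpBalance: bigint,
  nowUnix: number,
): void {
  // No lock, or lock expired: anything goes.
  if (lock === null || nowUnix >= lock.unlockTimestamp) return;
  // Assumed policy: only the portion of the balance above the locked
  // amount may be withdrawn early.
  const unlocked =
    lpBalance > lock.lockedAmount ? lpBalance - lock.lockedAmount : 0n;
  if (lpToBurn > unlocked) throw new Error("LpStillLocked");
}
```

`extend_lock` would then be the one-way ratchet: it may only move `unlockTimestamp` forward, never back.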
If you want to be a serious launch, you opt into a longer lock; the frontend surfaces the lock duration as a trust signal on the explore page.

I shipped a related quality-of-life thing the same day in [`a02f672` — `lower graduation to 50 SOL`](https://github.com/Dax911/z_trade/commit/a02f67287a25ef3ce76117d6d592337002cb99a9). 85 SOL was the original threshold and nobody could actually graduate a token at $15K worth of bonding-curve liquidity. 50 SOL turned out to be the floor where a real microcap launch could clear graduation.

## The 5-tab admin page

The same commit ships a five-tab admin panel: `Overview / Launchpad / AMM / Markets / Docs`. The reason this is its own thing is not vanity — it's that a Solana program with five separate config PDAs and three separate fee vaults *cannot be safely operated from a CLI*. You will misread a hex address. You will paste the wrong network. You will pause production thinking it's devnet.

Each tab carries:

- All three vault balances with USD denomination (SOL/USD pulled from CoinGecko via `SolPriceContext` polling).
- "Initialize PDA" buttons for any config that hasn't been bootstrapped on the current cluster.
- Per-launch / pool / market fee collection, plus a "collect all" bulk button.
- The void-market button on the prediction tab, behind a confirm modal, because the void path is irreversible.

I ended up needing this faster than I expected. The very next day I shipped [`557d314` — `Add migrate_config instruction for safe account resizing`](https://github.com/Dax911/z_trade/commit/557d314bd4c9d045823dbd8e6301742338f14ca6) and [`f673b22` — `Add config migration UI to admin page`](https://github.com/Dax911/z_trade/commit/f673b226a34dff77a35ccaf0db1c064112b528fb). The trigger: I'd added a `min_market_liquidity` field to `PredictionConfig` without bumping the account size, and existing configs on devnet couldn't take the update.
The admin page detected old-format accounts via a length comparison and surfaced a "Migrate Config" button. `migrate_config` does what its name says — resizes the account, copies the old data, writes the new field. The trick I missed the first time, fixed in [`6d044f7`](https://github.com/Dax911/z_trade/commit/6d044f7efcb3c4debc36fa33d68518748ed04158): when growing a PDA you have to fund the lamport difference via a System Program CPI transfer, not by directly debiting the user's lamports inside the program. Anchor will let you write the second one. The runtime will reject it. Welcome to Solana.

## Trade-offs

**Why CPMM and not parimutuel pools?** Because parimutuel doesn't give you a price until resolution. CPMM lets traders see "YES is at 67¢" continuously. That's the entire UX of a prediction market. If you can't show a price, your users are going to ask why they shouldn't just use Polymarket.

**Why void-market behind admin only?** Because the alternative is "anybody can vote to void a market they're losing" and that destroys the incentive to make confident bets. The market creator stakes the liquidity; the protocol admin holds the void key. The doc tab on the admin panel makes that policy explicit.

**Why an admin page in a "decentralized" project?** Because the project isn't decentralized yet. I'm not going to pretend it is. The admin keys exist; they're documented; they will be migrated to a multisig, and eventually to TW-TVV-style governance ([described in the m0n3y origin post](/blog/m0n3y_naming_a_dream/)). Lying about that today doesn't make it true tomorrow.

## What this taught me

The smart-contract surface of a Solana product compounds non-linearly. ZeraSwap had three PDAs and one fee vault. Adding prediction markets and LP locks brought it to seven PDAs and three fee vaults. The cost of ad-hoc admin tooling exploded. The 5-tab admin page paid for itself in the first hour after deploy when I needed to bulk-collect fees from 12 launches.
## Further reading

- [The full prediction-markets commit](https://github.com/Dax911/z_trade/commit/16aa30d3ed2f552f743886a647ba1fc7f4773aed)
- [`prediction_math.ts`](https://github.com/Dax911/z_trade/blob/16aa30d3ed2f552f743886a647ba1fc7f4773aed/sdk/src/prediction_math.ts)
- [`migrate_config` instruction (the safe-resize fix)](https://github.com/Dax911/z_trade/commit/6d044f7efcb3c4debc36fa33d68518748ed04158)
- [Polymarket](https://polymarket.com/) — the UX target nobody on Solana matches yet
- [LMSR vs CPMM market makers](https://www.eecs.harvard.edu/cs286r/courses/fall12/papers/Hanson_LMSR.pdf) — the paper that justifies LMSR for thin markets

---

# Five Commits to Get an OG Image Out of a Cloudflare Worker

Canonical: https://blog.skill-issue.dev/blog/og_pngs_cf_workers/
Description: A 24-minute slog where I got dynamic OG PNG generation to work on Cloudflare Pages Functions. The bug is WebAssembly. The fix is a build-time WASM import.
Published: 2026-02-15T17:14:55.000Z
Tags: cloudflare, workers, wasm, og-image, svg, solana, devops

The OG image is the thing that decides whether your link gets clicked on Twitter, Discord, or Telegram. If you ship a Solana DEX without per-token OG images, your share buttons are wallpaper. If you ship them as SVG, half the social platforms render them as blank cards because half the social platforms don't render SVG. So you ship them as PNG. Which means you generate them on the edge. Which means you call into WebAssembly from a Cloudflare Pages Function. Which means [you bang your head against the wall five commits in a row](https://github.com/Dax911/z_trade/commits/main/?after=cb14990c6fadb4abe5e111cd716b3bd08a528ae9+47).

This post is a real-time receipt of that head-banging from 2026-02-15 between 17:03 and 17:30 UTC.
## The problem

The function in question lived at [`functions/og/default.ts`](https://github.com/Dax911/z_trade/blob/962d55c629ce56324bf9cef135d5aeac76f4c2d9/functions/og/default.ts) — a Cloudflare Pages Function that takes a token mint, builds a stylized SVG card with live AMM stats, and converts it to a PNG with [`svg2png-wasm`](https://www.npmjs.com/package/svg2png-wasm). The conversion is the hard part. Everything else is sed-replacing tokens into a template string.

The naive thing is what I shipped first in [1bac3bb — `Convert OG images from SVG to PNG`](https://github.com/Dax911/z_trade/commit/1bac3bbc1173ddf95a964c394858ca7192ce28ac):

```ts
import { initialize, createSvg2png } from "svg2png-wasm";

const wasmRes = await fetch(WASM_URL);
const wasm = await wasmRes.arrayBuffer();
await initialize(wasm);
```

This works locally. This works on Vercel. This does not work on Cloudflare Workers.

## Stage 1: dynamic import → static import (17:03)

[81d3f16 — `Fix OG PNG: use static import for svg2png-wasm instead of dynamic import`](https://github.com/Dax911/z_trade/commit/81d3f16e965ef683dc48c1bb748852c7fcca112c). CF's bundler doesn't bundle dynamic imports the same way it bundles static imports. Static import. Move on.

## Stage 2: self-fetch → unpkg (17:06)

[e2b0c76 — `Fix OG PNG: fetch WASM from unpkg CDN instead of self-fetch`](https://github.com/Dax911/z_trade/commit/e2b0c76a5e9dea8a425b768fe196a28315d16fa7). I had been serving the `.wasm` file from `app/public/` and fetching it via `fetch(env.url + "/svg2png_wasm_bg.wasm")`. CF Workers cannot fetch from themselves the way Node servers can — the request loops or 503s depending on the moon phase. I switched to unpkg's CDN. That worked, but introduced a runtime dependency on a third party. We come back to that.

## Stage 3: see the actual error (17:09)

[102f485 — `debug: show OG PNG error details instead of silent fallback`](https://github.com/Dax911/z_trade/commit/102f48575a2bb7cd6fc8e08013d1a6c43cb1f117).
Two hours into a deploy fight and you realize you've been catching the error and rendering the SVG fallback. Take the catch out. Suffer. The error: `WebAssembly.instantiate() of bytes from request body is not allowed in this Worker`.

CF Workers block `WebAssembly.instantiate()` from raw bytes. Not deprecated. Not slow. Just *blocked*. They want you to use a build-time `import` so the WASM binary becomes a real module they can compile during deploy, not at runtime in your handler. This is a real security stance — they don't want Worker code instantiating arbitrary blobs at runtime — but it's not great when your library (`svg2png-wasm`) is built around a fetch-and-init pattern.

## Stage 4: build-time WASM import (17:12)

[962d55c — `Fix OG PNG: use build-time WASM import for CF Workers compatibility`](https://github.com/Dax911/z_trade/commit/962d55c629ce56324bf9cef135d5aeac76f4c2d9). This is the actual fix:

```ts
// @ts-ignore — CF Workers WASM import (compiled at build time)
import wasmModule from "./svg2png.wasm";

let svg2pngConverter: Svg2png | null = null;
let initPromise: Promise<void> | null = null;

async function ensureSvg2png(): Promise<Svg2png> {
  if (svg2pngConverter) return svg2pngConverter;
  if (!initPromise) {
    initPromise = (async () => {
      await initialize(wasmModule);
      svg2pngConverter = createSvg2png();
    })();
  }
  await initPromise;
  return svg2pngConverter!;
}
```

You commit `svg2png.wasm` (~2MB) inside the Functions directory. CF picks it up at deploy time, treats it as a Worker-managed module, and binds the import to a real `WebAssembly.Module`. `initialize(wasmModule)` then takes a `Module` instead of bytes, which is the pre-compiled path that CF allows.

## Stage 5: directory math (17:14)

[9ccab18 — `Fix WASM import path`](https://github.com/Dax911/z_trade/commit/9ccab18e0f7f52d23feadbcac0d8033031c6e848). The per-token endpoint lives at `functions/og/token/[mint].ts`. The wasm I committed lives at `functions/og/svg2png.wasm`. The relative import was wrong. `../svg2png.wasm`. Done.
## Stage 6: fonts don't ship with the bundle (17:30)

[1c91af7 — `Fix OG images: register Inter + JetBrains Mono fonts for svg2png-wasm`](https://github.com/Dax911/z_trade/commit/1c91af7994df8330f75553a004a3819ce1def75e). Same idea. `svg2png-wasm` rasterizes text by looking up the font registered in its own runtime, not the host's. The OG card uses Inter and JetBrains Mono. If you don't `registerFont(await loadFontBytes())` for both before calling the converter, your text rasterizes as `□□□□`. Hilarious in test environments. Catastrophic on a public DEX.

## What this actually looked like deployed

The card is a `1200x630` SVG composed inline in TypeScript. The interesting part is the data fetch — I'm pulling live pool reserves from the cached market-data API I'd shipped one commit earlier in [5627d4d — `Add edge-cached market data API`](https://github.com/Dax911/z_trade/commit/5627d4d099cff09e708e01ae0a0c77248d714e5f), so the OG card always reflects the *current* price, capped to the cache TTL. That's the entire reason this had to live on the edge: a static image generated at build time would show stale prices forever.

## Trade-offs

**Why not use [`@vercel/og`](https://vercel.com/docs/functions/og-image-generation)?** Because we're on CF Pages, and Vercel's OG library is bound to React + Satori in a way that's genuinely hard to extract. `svg2png-wasm` is 4 dependencies and one WASM file. The cost of "just write the SVG yourself" turned out to be lower than I expected.

**Why commit the wasm file to git?** It's 2MB. My repo is not a museum. I'd rather have a deterministic deploy that doesn't depend on unpkg being up.

**Why not pre-render on cron and serve static PNGs?** Because there are 50+ tokens at any given moment, and pre-rendering all of them on a cron is busywork that wastes cycles 99.9% of the time. The right shape is "render on cache miss, serve from cache for 24h." Which is what shipped.
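The render-on-miss shape is worth seeing in code. A hedged sketch with a plain `Map` standing in for the platform cache, so the control flow runs anywhere (the deployed Function uses Cloudflare's cache, not this):

```typescript
// Generic render-on-miss cache with a TTL. The `render` callback is where
// the SVG build + svg2png conversion would go; a counter shows how rarely
// it actually fires.
const TTL_MS = 24 * 60 * 60 * 1000; // serve from cache for 24h

interface CacheEntry {
  png: Uint8Array;
  renderedAtMs: number;
}

function makeOgCache(render: (mint: string) => Uint8Array) {
  const store = new Map<string, CacheEntry>();
  let renders = 0; // instrumentation: how often we actually rasterize

  return {
    get(mint: string, nowMs: number): Uint8Array {
      const hit = store.get(mint);
      if (hit && nowMs - hit.renderedAtMs < TTL_MS) return hit.png; // hit
      renders += 1; // miss or expired: re-render and refresh the entry
      const png = render(mint);
      store.set(mint, { png, renderedAtMs: nowMs });
      return png;
    },
    rendersSoFar: () => renders,
  };
}
```

Fifty-plus tokens, one render each per day worst case — versus a cron that rasterizes everything whether anyone shares a link or not.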
## What this taught me

Cloudflare's WASM contract is *real* and you cannot work around it. The error message is clear once you stop swallowing it. The ecosystem of WASM libraries is mostly written assuming Node-style runtime fetch, so half of the porting work is going to be "convince this library to take a `WebAssembly.Module` instead of a `BufferSource`." Some libraries refuse to accept that as a PR; in those cases you write a thin wrapper or you fork.

Five commits in 24 minutes is not a flex. It's a confession that the only way I could solve this was to ship to production and let the runtime tell me what was wrong, because there is no other place that runs this stack the way Cloudflare does. CI didn't catch it. Local `wrangler pages dev` didn't catch it. Production caught it in 30 seconds.

## Further reading

- [Cloudflare Workers — WebAssembly modules](https://developers.cloudflare.com/workers/runtime-apis/webassembly/)
- [`svg2png-wasm` on npm](https://www.npmjs.com/package/svg2png-wasm)
- [The full sequence of commits on z_trade between 17:03–17:30 UTC](https://github.com/Dax911/z_trade/commits/main/?since=2026-02-15)
- [ZeraSwap origin post](/blog/zeraswap_compressed_amm/) — the project this OG card is for.

---

# ZeraSwap: An AMM for Compressed Tokens

Canonical: https://blog.skill-issue.dev/blog/zeraswap_compressed_amm/
Description: Initial commit of the first compressed-token AMM on Solana — Anchor program, x*y=k math, SOL/cToken pairs, and the cyberpunk launchpad UI that grew up around it.
Published: 2026-02-10T21:03:36.000Z
Tags: zera, solana, anchor, amm, light-protocol, compressed-tokens, rust

> "Initial ZeraSwap: compressed token AMM for Solana"

That's the [first commit on z_trade](https://github.com/Dax911/z_trade/commit/b088fe8bf3eb8c1047712abb53d865fd3ac93db3), dropped at 2026-02-10T21:03:36Z. It's also, as far as I'm aware, the first AMM where the token side of every pool is a Light Protocol compressed token instead of an SPL token.
That's not an accident; that's the entire pitch. Solana compressed tokens (`@lightprotocol/compressed-token`) cost roughly 1/5000th of SPL tokens to mint and transfer at scale, because the account state lives in a Merkle tree off-chain instead of a 165-byte SPL token account on-chain. That's incredible for token launches, terrible for AMMs — because every existing AMM expects to hold token accounts. So if you want compressed tokens to actually be useful as economic objects, you need an AMM that natively takes them.

## The Anchor program

Seven instructions. From [`programs/zeraswap/src/lib.rs`](https://github.com/Dax911/z_trade/blob/b088fe8bf3eb8c1047712abb53d865fd3ac93db3/programs/zeraswap/src/lib.rs):

```rust
#[program]
pub mod zeraswap {
    use super::*;

    pub fn initialize_protocol(ctx, fee_recipient, lp_fee_bps, protocol_fee_bps) -> Result<()> { ... }
    pub fn create_pool(ctx, initial_sol, initial_tokens) -> Result<()> { ... }
    pub fn add_liquidity(ctx, sol_amount, token_amount, min_lp_out) -> Result<()> { ... }
    pub fn remove_liquidity(ctx, lp_amount, min_sol_out, min_tokens_out) -> Result<()> { ... }
    pub fn swap_sol_for_tokens(ctx, sol_in, min_tokens_out) -> Result<()> { ... }
    pub fn swap_tokens_for_sol(ctx, tokens_in, min_sol_out) -> Result<()> { ... }
    pub fn collect_fees(ctx) -> Result<()> { ... }
}
```

Constants ([`constants.rs`](https://github.com/Dax911/z_trade/blob/b088fe8bf3eb8c1047712abb53d865fd3ac93db3/programs/zeraswap/src/constants.rs)):

```rust
pub const DEFAULT_LP_FEE_BPS: u16 = 20;       // 0.20%
pub const DEFAULT_PROTOCOL_FEE_BPS: u16 = 5;  // 0.05%
pub const MAX_FEE_BPS: u16 = 1000;            // 10% max total
pub const MINIMUM_LIQUIDITY: u64 = 1_000;     // locked forever on first deposit
pub const MINIMUM_SOL_RESERVES: u64 = 10_000; // 0.00001 SOL
```

The math is `x*y=k`, the same constant-product curve Uniswap v1 shipped in 2018. There's a reason every L1 AMM eventually defaults to this: its edge cases are all well known, so you don't discover new ones in production.
From [`instructions/swap.rs`](https://github.com/Dax911/z_trade/blob/b088fe8bf3eb8c1047712abb53d865fd3ac93db3/programs/zeraswap/src/instructions/swap.rs):

```rust
// Constant product:
// tokens_out = token_reserves * sol_in_after_fee
//            / (sol_reserves + sol_in_after_fee)
let tokens_out = (pool.token_reserves as u128)
    .checked_mul(sol_in_after_fee as u128)?
    .checked_div(
        (pool.sol_reserves as u128).checked_add(sol_in_after_fee as u128)?,
    )? as u64;

require!(tokens_out >= min_tokens_out, ZeraSwapError::SlippageExceeded);
require!(tokens_out < pool.token_reserves, ZeraSwapError::ReservesDrained);
```

I wrote it `u128`-promoted for the multiply, then cast back to `u64` after the divide, because `u64 * u64` overflows roughly the moment any pool gets serious volume. Nothing exciting; just the kind of detail that bites you exactly once.

## What's *actually* novel

The thing I had to figure out wasn't the curve. It was state trees. Each pool gets its own `state_tree: Pubkey` field in the [`Pool`](https://github.com/Dax911/z_trade/blob/b088fe8bf3eb8c1047712abb53d865fd3ac93db3/programs/zeraswap/src/state.rs) struct:

```rust
#[account]
pub struct Pool {
    pub token_mint: Pubkey,
    pub lp_mint: Pubkey,
    pub sol_vault: Pubkey,
    /// Dedicated state tree for this pool's compressed token operations
    pub state_tree: Pubkey,
    pub sol_reserves: u64,
    pub token_reserves: u64,
    pub lp_supply: u64,
    // ...
}
```

Light Protocol's compressed token operations need an explicit `state_tree` reference. If you forget that, the compress/decompress CPI just silently lands the tokens in someone else's tree, and your pool can never reconstruct them. Five days of staring at logs taught me to put `state_tree` directly on the `Pool` account at creation time and never touch it again.
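The same promotion trick is what any client-side quote helper has to mirror, with `bigint` playing the role of `u128`. A hedged sketch (not the SDK's actual export; names and the fee parameter are illustrative):

```typescript
// Constant-product quote mirroring the on-chain math: take the fee first,
// floor-divide like the program does, and enforce the same reserve invariant.
const BPS = 10_000n;

function quoteSolForTokens(
  solIn: bigint,
  solReserves: bigint,
  tokenReserves: bigint,
  totalFeeBps: bigint, // LP fee + protocol fee, e.g. 25n for 0.25%
): bigint {
  const solAfterFee = solIn - (solIn * totalFeeBps) / BPS;
  // tokens_out = token_reserves * sol_after_fee / (sol_reserves + sol_after_fee)
  const tokensOut =
    (tokenReserves * solAfterFee) / (solReserves + solAfterFee);
  if (tokensOut >= tokenReserves) throw new Error("ReservesDrained");
  return tokensOut;
}
```

Because `bigint` floor-divides exactly like the program's `checked_div`, a quote computed here matches the on-chain amount bit-for-bit, which is what makes `min_tokens_out` slippage bounds safe to derive client-side.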
## Five days later: the cyberpunk launchpad The next major commit is [b6b6fa5 — `Add shared AMM vault, launchpad, pools, transfers, cyberpunk UI`](https://github.com/Dax911/z_trade/commit/b6b6fa50c6f9678f69375067b33379d99feeff49) on 2026-02-15. This is where the AMM stopped being a barebones swap and started being a launchpad — bonding curves, internal `UserPosition.token_balance` accounting, a graduation flow at 50 SOL of bonding-curve liquidity, and the cyan/purple cyberpunk frontend that ended up being the project's identity. The launchpad is conceptually a separate Anchor program that buys/sells against a virtual reserve (think pump.fun) until a token "graduates" to a real ZeraSwap AMM pool. The curve uses a base reserve to bootstrap price discovery. From the same day, I shipped both [`f3f71f3` and `d01b4683`](https://github.com/Dax911/z_trade/commit/d01b4683d109af3dc58f48aaf7344d463700de55) lowering graduation from 85 → 50 SOL after the first paper trade made it obvious 85 was too high — nobody graduates a token if they need to spend $15K to do it. ## The quality-of-life shift The most under-appreciated commit of that February sprint is [cb14990 — `Fix RPC spam: pause polling on hidden tabs`](https://github.com/Dax911/z_trade/commit/cb14990c6fadb4abe5e111cd716b3bd08a528ae9). The whole repo had been making 46–94 RPC calls/min to Helius. New worst case after the fix: 12 calls/min on the active tab, 0 on hidden tabs. The hook is six lines of meaningful code: ```ts // app/src/hooks/useVisibleInterval.ts function onVisibilityChange() { if (document.hidden) { stop(); } else { savedCallback.current(); // fire immediately on re-show start(); } } document.addEventListener("visibilitychange", onVisibilityChange); ``` A free tier of Helius is 100k calls/day. A tab open for 24 hours at 94 calls/min burns through that in 18 hours. This bug was costing me real money. The fix shipped 12 days into the project. 
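The budget math behind that claim, as a quick sketch (the Helius numbers are the ones quoted above):

```typescript
// How many hours a fixed daily RPC call budget survives at a constant polling rate.
function hoursUntilExhausted(dailyBudget: number, callsPerMinute: number): number {
  return dailyBudget / (callsPerMinute * 60);
}
```

At the old worst case of 94 calls/min, the 100k/day free tier dies in about 17.7 hours, less than one forgotten tab-day. At the fixed 12 calls/min, the same budget lasts roughly 139 hours.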
## Trade-offs **Why not use an existing AMM SDK?** Because none of them know what to do with `@lightprotocol/compressed-token`. Orca, Raydium, Meteora — every one of them assumes SPL token accounts. By the time you've patched their account derivation, you've written your own program anyway. **Why x\*y=k instead of concentrated liquidity?** Because the AMM is a graduation target for the launchpad, not a yield-farming venue. The launch flow guarantees pools start with deep, balanced reserves. Concentrated liquidity in that environment is just a way to price-impact yourself. If somebody serious comes along and wants to bring real liquidity, they can fork the program; the math is 30 lines. **Why two fees (LP + protocol)?** Because I don't trust myself to skim the protocol fee out of LP revenue post-hoc. Putting the protocol fee on a separate counter from the start was cheap then and saved me a `migrate_config` ([`6d04415`](https://github.com/Dax911/z_trade/commit/6d044f7efcb3c4debc36fa33d68518748ed04158)) later — well, *almost* saved me. We'll get to that. ## What this taught me Compressed tokens are an unfair advantage for whoever ships first, because the entire DEX ecosystem on Solana is built on the assumption that "token" = "SPL Token Account." Light Protocol changed that assumption. The block of code most people miss is keeping a `state_tree` field on every pool — once you've done that, everything else is x\*y=k and being kind to your RPC provider. ## Further reading - [z_trade on GitHub](https://github.com/Dax911/z_trade) - [Initial ZeraSwap commit](https://github.com/Dax911/z_trade/commit/b088fe8bf3eb8c1047712abb53d865fd3ac93db3) - [Light Protocol — compressed tokens](https://www.lightprotocol.com/) - ["Building A Better Cryptocurrency"](/blog/a_better_crypto/) — the stance on protocol-level fee design that informed `MAX_FEE_BPS = 1000`. - [Stuck Sell, Post-Graduation](/blog/stuck_sell_post_grad/) — the bug this design eventually wrote me a check for. 
--- # ZK-FHIR: A Medical Demo That Doesn’t Leak Patients Canonical: https://blog.skill-issue.dev/blog/zera_med_zk_fhir/ Description: Building a RISC Zero zkVM gateway for FHIR-shaped medical records — proofs over private patient data, zero-knowledge insurance claims, and HIV/STI compartmentalization. Published: 2026-02-11T06:29:06.000Z Tags: zera, zk, risc-zero, fhir, healthcare, privacy, cloudflare-pages The whole `zera_med_demo` repo exists because someone asked me, "if your privacy chain is real, prove it works for something other than crypto bros." Fair. So I spent a weekend building a working RISC Zero zkVM gateway for FHIR-shaped medical records. The MVP shipped at [commit 8ae0a7a — `Zera Medical ZK-FHIR Gateway MVP`](https://github.com/Dax911/zera_med_demo/commit/8ae0a7a64096376893206187e61e2c9f295a9050) on 2026-02-11. Full-stack: React frontend, Express + SQLite backend, real RISC Zero zkVM in `zkvm/`. Nine proof operations, every one of them running through an actual guest program — none of this "we'll mock the proof" demo nonsense. ## The shape of the problem FHIR is healthcare's answer to "data interoperability." The thing FHIR does not do is privacy. If a hospital sends FHIR records to an insurer to back a claim, the insurer learns the entire record. If a researcher queries an aggregate, the institution sending data has to trust the researcher's de-identification. ZK lets you flip that. The prover holds the private record. The verifier learns only what the proof's public outputs reveal. Everything else stays on the prover's side of the airgap. 
The MVP defined nine operations, each with a strict private/public split: ```rust // zkvm/methods/guest/src/main.rs match operation.as_str() { "record_commit" => run_record_commit(), "access_verify" => run_access_verify(), "aggregate_query" => run_aggregate_query(), "insurance_claim" => run_insurance_claim(), "consent_grant" => run_consent_grant(), "consent_revoke" => run_consent_revoke(), "emergency_access" => run_emergency_access(), "prior_auth" => run_prior_auth(), "compliance_audit" => run_compliance_audit(), _ => panic!("Unknown operation: {}", operation), } ``` The model: every guest reads private inputs (the patient record, the credential, the consent), commits exactly the public outputs the use case needs, and nothing else. `record_commit` for example is just a content-addressed handle — the journal carries `commitment_hash`, `patient_id_hash`, `record_type`, `resource_count`, `data_hash`. The actual conditions and observations never leave the prover. ## `access_verify`: the boring proof that justifies the whole thing If you only have the patience for one operation, it's this one. Doctor wants to read patient X. The hospital has a credential, the patient has signed a consent, and someone has to verify — without revealing the contents of the record — that the access was valid. From [`zkvm/methods/guest/src/main.rs`](https://github.com/Dax911/zera_med_demo/blob/8ae0a7a64096376893206187e61e2c9f295a9050/zkvm/methods/guest/src/main.rs): ```rust let credential_valid = !input.credential.role.is_empty() && !input.credential.institution.is_empty() && input.credential.valid_until >= input.current_timestamp; let consent_valid = input.consent.grantee_id == input.credential.accessor_id && input.consent.purpose == input.purpose && input.consent.valid_from <= input.current_timestamp && input.consent.valid_until >= input.current_timestamp; let authorized = credential_valid && consent_valid; ``` Boring. That's the point. The boring part is the predicate. 
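To see how small the predicate really is, here is the same authorization check mirrored in TypeScript (a sketch: field names follow the Rust excerpt above, camel-cased):

```typescript
// Mirrors the guest's credential + consent predicate. Nothing cryptographic
// happens here; in the real system this logic runs inside the zkVM and only
// the resulting bit is published.
interface Credential { role: string; institution: string; accessorId: string; validUntil: number }
interface Consent { granteeId: string; purpose: string; validFrom: number; validUntil: number }

function isAuthorized(cred: Credential, consent: Consent, purpose: string, now: number): boolean {
  const credentialValid =
    cred.role !== "" && cred.institution !== "" && cred.validUntil >= now;
  const consentValid =
    consent.granteeId === cred.accessorId &&
    consent.purpose === purpose &&
    consent.validFrom <= now &&
    consent.validUntil >= now;
  return credentialValid && consentValid;
}
```

Four field comparisons and two range checks. Any web developer could review it, which is exactly what you want from the trusted core of an access-control proof.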
The interesting part is that `input.patient_record` — which the predicate doesn't even read — never leaves the zkVM. The verifier learns: - Was access authorized? (a single bit) - What role accessed it? (`Doctor`, `Researcher`, `Insurer`) - A nullifier: ```rust let mut nullifier_hasher = Sha256::new(); nullifier_hasher.update(&input.credential.accessor_id); nullifier_hasher.update(&record_hash); nullifier_hasher.update(&input.current_timestamp); let nullifier = hex::encode(nullifier_hasher.finalize()); ``` The nullifier prevents the same access from being double-counted in audits. The record hash binds the access to a specific record without revealing it. That's the whole shape of every other operation in the demo. ## The detour: insurance claims that compartmentalize by carrier The next interesting commit is [c65cab8 — `Add ZKP visualization modal, HIV/STI data, insurer selectors`](https://github.com/Dax911/zera_med_demo/commit/c65cab8954ddc0a3ba7b308a58b36078497d34f9) on 2026-02-11. Three things landed at once: 1. **The ZK proof modal** — a full-screen animated panel that walks the user through `Private Data → RISC Zero zkVM → Proof Output`, with a comparison panel showing what the verifier sees vs. what the prover holds. Educational. People who've never touched a Groth16 receipt before will sit through 90 seconds of animation if it's pretty. 2. **HIV/STI data**. ICD-10 codes B20 (HIV disease), Z21 (asymptomatic HIV), Hep B/C, syphilis, gonorrhea, chlamydia, herpes, HPV. Plus viral load, CD4, PCR observations. ARVs: Biktarvy, Triumeq, Descovy PrEP. This is the data category that destroys lives when it leaks. So obviously this is the category the demo has to handle, or the demo is decorative. 3. **Insurer compartmentalization**. Each insurer's view is filtered to its own members. Aetna users don't see UnitedHealth records. 
The demo enforces this in the SQLite layer, but the ZK guest enforces it cryptographically — `insurance_claim` commits the insurer's identity in the journal, and the seed data is stamped with insurer membership. This isn't theoretical. Compartmentalization is the only reason this kind of demo isn't a HIPAA disaster waiting to happen. ## Cloudflare Pages: the dumb part of any full-stack demo Three of the six commits in the repo are deploy fixes. [c59509d — `Fix Cloudflare build: track src/data/types.ts`](https://github.com/Dax911/zera_med_demo/commit/c59509d3a6419944cb60cf6b1758dddc6f98b791), [2efff06 — `Add missing HospitalResult type`](https://github.com/Dax911/zera_med_demo/commit/2efff06c6d21c4a38fcb97d509a5b08bae5c039f), [1d0c2e2 — `Add wrangler.jsonc for Cloudflare Pages static asset deploy`](https://github.com/Dax911/zera_med_demo/commit/1d0c2e28a3c6a09381632cd9c6ca8155a6515d39). This is the part of every demo nobody writes about. You build a beautiful zk pipeline, you ship it to a static host, the host's build environment doesn't have a TypeScript file you forgot to track, and three commits later your gitignore is shorter and you've learned not to put `src/data/types.ts` in `.gitignore`. Real life. ## What this taught me The fact that I had to ship the *demo* before anyone took the privacy claim seriously is a recurring theme. People do not believe a chain is private because the white paper says so. They believe it because they can click a button labeled "Run Insurance Claim Proof" and watch the modal split private inputs from public outputs in real time. That modal is the most expensive component in the repo. It is also the only one that materially changed how the demo lands. The other thing this taught me: RISC Zero is unreasonably good for "let me prove a JavaScript-like predicate over JSON-shaped private data without learning to write Circom." The guest is just Rust. The verifier is a single library call. 
If your team's bottleneck is "we can't hire a circuit engineer for one demo," reach for a zkVM before you reach for snarkjs. ## Further reading - [zera_med_demo on GitHub](https://github.com/Dax911/zera_med_demo) — the whole repo. - [Initial MVP commit](https://github.com/Dax911/zera_med_demo/commit/8ae0a7a64096376893206187e61e2c9f295a9050) — full guest + host implementation. - [RISC Zero zkVM docs](https://dev.risczero.com/) — what `env::commit` actually does. - [HL7 FHIR spec](https://www.hl7.org/fhir/) — the data shape this demo is hiding. - [Building A Better Cryptocurrency](/blog/a_better_crypto/) — same privacy thesis, different vertical. --- # A Privacy Demo That Works on a Phone: Mobile Drawer, HUD Offsets, and Real Breach Data Canonical: https://blog.skill-issue.dev/blog/zera_med_responsive_hud/ Description: Bolting a mobile drawer onto the Zera Med ZK-FHIR demo without breaking the desktop sidebar, fixing AnimatePresence warnings, and updating PrivacyChallenge with 2024-2025 breach data. Published: 2026-02-11T22:48:22.000Z Tags: zera-med, react, tailwind, responsive, accessibility, framer-motion, demo The unspoken rule of demo apps is that they're built for laptops. You'd never demo a healthcare privacy product from a phone. You'd plug the laptop into a projector and run it from a 13" screen. Real users wouldn't be on a phone, the dataset has columns that don't fit on mobile, and you've shipped a desktop-only experience without thinking about it. But every demo I've done in 2026 has had at least one person in the room pulling up the URL on their phone *while I'm presenting*. They're checking the responsive design. They're clicking around in the half-attention you'd give a panel discussion. If the phone experience falls apart, that person walks away with the impression that the product falls apart, regardless of how clean the laptop view is. 
[`bb9bb51 — Add responsive layout with mobile drawer, centered content, and accuracy updates`](https://github.com/Dax911/zera_med_demo/commit/bb9bb51) on 2026-02-11 was the day I bolted on real mobile support. Six files changed, +4969 lines, three new pages. Let's look at what mattered.

## The mobile drawer pattern

The desktop nav is a fixed left sidebar. The mobile nav is a hamburger that slides a drawer in from the left. The trick is doing both with the same component tree:

```tsx
// Sidebar.tsx (excerpt — markup simplified, class names and animation values elided)
const isMobile = useMediaQuery('(max-width: 1023px)')
const [drawerOpen, setDrawerOpen] = useState(false)

return (
  <>
    {/* Mobile header — only on small screens */}
    {isMobile && (
      <header /* fixed, h-14, top-0 */>
        <button onClick={() => setDrawerOpen(true)} aria-label="Open menu">☰</button>
        <span>Zera Med</span>
      </header>
    )}

    {/* Sidebar — fixed left on desktop, slide-in drawer on mobile */}
    <AnimatePresence>
      {(!isMobile || drawerOpen) && (
        <motion.aside /* slide-in/out x transition */>
          {/* nav links */}
        </motion.aside>
      )}
    </AnimatePresence>

    {/* Backdrop — only when drawer is open on mobile */}
    {isMobile && drawerOpen && (
      <motion.div
        /* fixed inset-0 backdrop */
        onClick={() => setDrawerOpen(false)}
        initial={{ opacity: 0 }}
        animate={{ opacity: 1 }}
        exit={{ opacity: 0 }}
      />
    )}
  </>
)
```

Three things to call out:

**`useMediaQuery` — not just `window.innerWidth`.** I added a tiny hook in this commit:

```tsx
// useMediaQuery.ts
export function useMediaQuery(query: string): boolean {
  const [matches, setMatches] = useState(() =>
    typeof window !== 'undefined' && window.matchMedia(query).matches
  )
  useEffect(() => {
    const mq = window.matchMedia(query)
    const onChange = (e: MediaQueryListEvent) => setMatches(e.matches)
    mq.addEventListener('change', onChange)
    return () => mq.removeEventListener('change', onChange)
  }, [query])
  return matches
}
```

The reason `window.innerWidth` is wrong: it doesn't subscribe to changes. You'd need a manual `resize` listener with debouncing. `matchMedia` with `addEventListener('change')` is the platform-native way and it's both faster (no JS resize event spam during drag-resize) and less code.

**`{(!isMobile || drawerOpen) && ...}`.** The mount/unmount logic. On desktop, the sidebar is always present. On mobile, it's only present when the drawer is open. This is what `AnimatePresence` needs to wrap correctly — the component literally unmounts when the drawer closes, which triggers the slide-out exit animation.

**Body scroll lock.** Not in the snippet but in the full diff: when the drawer is open on mobile, `document.body.style.overflow = 'hidden'` to prevent the underlying page from scrolling under the drawer. Without this, the drawer is open, the user starts scrolling, and the *page behind the drawer* scrolls instead of the drawer's contents. UX bug from hell.
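The scroll-lock decision is worth factoring into a pure function so it's testable outside the DOM. A sketch (the hook wiring is an assumption, not the repo's exact code):

```typescript
// Pure decision: what should document.body.style.overflow be?
function bodyOverflow(isMobile: boolean, drawerOpen: boolean): 'hidden' | '' {
  return isMobile && drawerOpen ? 'hidden' : '';
}

// In the component (React sketch):
//   useEffect(() => {
//     document.body.style.overflow = bodyOverflow(isMobile, drawerOpen);
//     return () => { document.body.style.overflow = ''; };
//   }, [isMobile, drawerOpen]);
```

The cleanup function matters as much as the set: if the component unmounts while the drawer is open, the body stays unscrollable forever.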
## Sticky HUDs and the mobile-header offset

The Zera Med demo has "HUD panels" that stick to the top of the page on each route — they show the current role (Patient/Doctor/Insurer/etc.) and a quick action menu. On desktop, they sit at `top: 0`. On mobile, the page has a 56px header at `top: 0` already, so the HUDs need to slide down by 56px:

```jsx
{/* HUD markup simplified; the offset classes are the point */}
<div className="sticky top-14 lg:top-0">
  {/* role indicator + quick actions */}
</div>
```

Tailwind's `top-14` is `3.5rem` = 56px. `lg:top-0` overrides for `lg+` viewports where the mobile header isn't rendered. Two utility classes, exactly the right offset, no media-query logic in the component. This is the kind of thing that's easy to miss until the demo opens on a phone and the HUD is hidden behind the mobile header. Then you spend ten minutes debugging because everything looks fine in dev tools' "responsive" mode, where the mobile header *is* shown but the layout is otherwise desktop. The fix is one className. Finding the bug is the project.

## Tight grids that collapse gracefully

The dashboards have grids like `grid-cols-4` and `grid-cols-6` for layouts of metric cards. On a 320px-wide phone, four cards across is 80px each, which is unreadable. The solution is per-breakpoint cols:

```jsx
{/* card markup reconstructed; component name illustrative */}
<div className="grid grid-cols-1 sm:grid-cols-2 lg:grid-cols-4 gap-4">
  {metrics.map(m => <MetricCard key={m.id} metric={m} />)}
</div>
```

This is the standard Tailwind approach — it's not novel — but applying it to *every grid* in the demo took a careful pass. Some grids in the original were `grid-cols-4` (no breakpoint prefix), which forced four-across on every viewport. The diff replaced 12 such grids with breakpoint-aware variants. The mental model I use: **`grid-cols-N` should always have a `<breakpoint>:grid-cols-K` partner unless you've intentionally decided "this layout is mobile-only" or "this layout never goes below 4 across."** The default of "this works on 1280px-wide screens and breaks below" is the desktop-blinkered version of the same component.

## Fixing the `AnimatePresence` warning

```
Warning: Each child in a list should have a unique "key" prop. Or alternatively when using AnimatePresence: AnimatePresence requires every child to have a unique `key` prop, even when only one child is rendered.
```

Anyone who's used Framer Motion has seen this. The PrivacyChallenge component had this exact bug — a single conditionally-rendered `<motion.div>` inside `<AnimatePresence>` with no key prop. The fix:

```jsx
<AnimatePresence mode="wait">
  {currentLab && (
    <motion.div key={currentLab.id} /* enter/exit animation props */>
      {/* lab content */}
    </motion.div>
  )}
</AnimatePresence>
```

The `key={currentLab.id}` is what tells AnimatePresence that "this is a *different* element when `currentLab.id` changes," and triggers the exit animation of the old one and the enter animation of the new one. Without the key, Framer Motion sees the same element with new props and skips the exit/enter cycle. The result is content swapping with no transition, plus the warning in console. `mode="wait"` is the other half: it tells Framer to wait for the exit animation to complete before mounting the next child. Without it, exit and enter happen simultaneously and the layout flashes during the crossover. This is in the docs. It's still the most common framer-motion mistake in the wild. The fix is two lines. Everyone gets bitten by it once.

## The PrivacyChallenge accuracy update

The most important part of this commit isn't the responsive plumbing.
It's the data:

> PrivacyChallenge: accuracy updates with 2024-2025 breach data and citations

The PrivacyChallenge is a four-level interactive component where the user plays "data broker" trying to re-identify anonymized records. Each level uses a real-world re-identification attack (k-anonymity failure, demographic triangulation, ZIP+DOB+sex matching, free-text leakage), and each level cites a real published breach. Before this commit, the citations were dated 2017–2020 — peer-reviewed but stale. After this commit, the citations include:

- The 2024 Change Healthcare ransomware attack (100M+ records).
- The 2024 Snowflake/AT&T breach (109M+ wireless customers).
- The 2025 Ascension Health breach (5.6M patients).
- The 2025 LabCorp / Synnovis crossover incidents.

Every breach in the citation list is real, dated within 24 months of the demo, and verifiable via public reporting. Why does this matter? Because the audience for this demo is healthcare buyers — IT directors, compliance officers, hospital CTOs — and they all know about the 2024 Change Healthcare breach. It cost UnitedHealth billions of dollars in damages and direct response costs. Every healthcare buyer's threat model has been re-shaped by it. **A privacy demo that doesn't reference the breach the audience just lived through is a demo that hasn't done its homework.** The same is true of the other items. A 2017 breach is academic; a 2024 breach is "this could happen to my hospital next quarter." The credibility of the demo is the credibility of its references.

## Trust Score formula fix and Level 4 RNG removal

Two smaller fixes in the same commit, both addressing demo failure modes:

**Trust Score formula.** The demo computes a "Trust Score" (0–100) showing how identifiable a record is after the user's deanonymization attempts. The original formula had an integer-division bug that produced 0 for any score below 1.0. The fix was switching to floating-point math.
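The shape of that bug, in an illustrative sketch (not the repo's actual formula; the re-identification direction and scaling are assumptions):

```typescript
// Buggy: truncate the ratio to an integer first, scale second.
// Any ratio below 1.0 collapses to 0 before the * 100 ever happens.
function trustScoreBuggy(reidentified: number, total: number): number {
  return Math.trunc(reidentified / total) * 100;
}

// Fixed: scale in floating point, round once at the end.
function trustScoreFixed(reidentified: number, total: number): number {
  return Math.round((reidentified / total) * 100);
}
```

Ordering is the whole bug: truncation before scaling destroys every fractional score, truncation after scaling only loses sub-point precision.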
Tiny diff, big visible difference — instead of every level showing "Trust Score: 0," the levels now show "Trust Score: 12 / 47 / 73 / 89" depending on how successful the user's attack was. **Level 4 always awards 3 stars.** The original Level 4 had an RNG-based reward — sometimes you got 3 stars for completing it, sometimes 2 stars, dependent on a `Math.random()` check. This was the wrong design. **A demo cannot have non-deterministic UX**, because if the demo person hits "the bad random roll" in front of a buyer, the buyer thinks the product is buggy. Removing the RNG and always awarding 3 stars on completion is the right call. The interactive challenge isn't a casino; it's a learning experience. The lesson: **deterministic demos beat dynamic demos every time.** If you want randomization, save it for the production app. ## What I'd do differently **The mobile drawer should have a swipe-to-close.** Right now you tap the backdrop or the close button. A swipe-left would be more native. Framer Motion's `drag` API would do it in 10 lines. **The HUD's `top-14` is hardcoded.** A CSS custom property `--mobile-header-height: 3.5rem` set on the body would let the HUD position itself relative to the *real* header height, not a magic number that goes wrong if the header ever changes. **The `useMediaQuery` hook should default to a server-safe value.** As written, the hook returns `false` on SSR, which would cause a flash if this demo ever ran with hydration. The Zera Med demo is pure CSR so it doesn't hit this, but the hook is a re-usable building block I should harden. ## Trade-offs **Why not use a router-aware drawer library?** Because the demo only has one drawer, on one page. Adding `vaul` or `@radix-ui/react-dialog` for one drawer is overkill. Framer Motion's `motion.aside` with hand-rolled state is 60 lines of code and zero new dependencies. 
**Why responsive at the design-token level (Tailwind classes) instead of CSS-in-JS?** Because Tailwind's responsive utilities are inline-readable. `lg:top-0` reads like "on lg+, top is 0," which is faster to skim than a styled-components prop spread across multiple breakpoints. The cost is verbosity; the benefit is grep-ability. **Why update breach citations instead of removing them?** Because the citations are the strongest argument the demo makes. Removing them would weaken the privacy case from "here's why this matters, citing real recent breaches" to "trust me, privacy matters." The harder pitch. ## What this taught me A demo that doesn't survive a phone is a demo that loses one in three viewers, even when the phone-watcher is a passive observer. Responsive design isn't optional even for desktop-target apps; it's the cost of admission for any web-shipped product. The accuracy/citation work taught me that **demo data quality is the demo.** The same modal animation, with stale 2017 breach data, is a less compelling product than the same modal with 2024 breach data. The cryptography is the same. The conviction in the audience is different. ## Further reading - [The bb9bb51 commit](https://github.com/Dax911/zera_med_demo/commit/bb9bb51) — the diff this post is about. - [Zera Med ZK-FHIR origin](/blog/zera_med_zk_fhir/) — the project this is bolted onto. - [ZkProofModal post](/blog/zera_med_zk_proof_modal/) — the animation pattern this commit also tweaks. - [Framer Motion AnimatePresence docs](https://www.framer.com/motion/animate-presence/) — the canonical docs for the warning I fixed. - [Tailwind responsive design docs](https://tailwindcss.com/docs/responsive-design) — the breakpoint prefixes I leaned on. - [HHS Breach Portal](https://ocrportal.hhs.gov/ocr/breach/breach_report.jsf) — the source of the 2024–2025 breach data the PrivacyChallenge cites. 
---

# Zera Janitor: Closing Solana Dust Accounts in Leptos WASM

Canonical: https://blog.skill-issue.dev/blog/zera_janitor_leptos_wasm/
Description: A Solana program + Leptos 0.7 frontend that scans your wallet for empty SPL token accounts, batches up to 25 closes per transaction via CPI, and pays you back 95% of the rent. The fee path is the actual interesting part.
Published: 2026-02-10T20:24:09.000Z
Tags: solana, rust, leptos, wasm, cpi, spl-token, side-quest

Solana has a fee model that punishes inactivity: every account on the network owes a rent deposit proportional to its data size, and most of those accounts are SPL token accounts (165 bytes, ~0.002 SOL of rent each). A wallet that has interacted with a hundred different airdrops and DEX pools accumulates a hundred token accounts holding zero balance. They sit there forever unless you `closeAccount` them, which costs you the cognitive overhead of figuring out which ones are dust and the fee of one transaction per close. The collective sleeping rent across all dusty Solana wallets is in the tens of millions of dollars. A clean-up tool is an obvious value capture. The catch: cleaning isn't free. You still need to *send* the transactions, and naive 1-account-per-tx flows hit the network limit immediately. That's the project I shipped on 2026-02-10 in [`7aeb309 — Initial implementation of Zera Janitor`](https://github.com/Dax911/SolFetc_rs/commit/7aeb309) — a Rust workspace with three crates:

1. **`shared/`** — common constants (program ID, vault seed, fee BPS).
2. **`program/`** — on-chain Solana program with one instruction (`BatchClean`) that closes up to 25 token accounts via CPI in a single tx.
3. **`app/`** — Leptos 0.7 client-side WASM frontend that scans the wallet, lets you select accounts, and submits batched transactions through a JS shim.
This post is about why each crate looks the way it does — particularly the fee-split economics on-chain and the CSR-WASM-with-JS-shim hybrid for transaction signing.

## The on-chain economics

The interesting part of `program/src/processor.rs` is *not* the close loop. It's what happens after:

```rust
// 5. Calculate rent collected
let lamports_after = vault.lamports();
let rent_collected = lamports_after
    .checked_sub(lamports_before)
    .ok_or(JanitorError::Overflow)?;
msg!("Rent collected: {} lamports", rent_collected);

// 6. Split: fee to treasury, remainder to user
let fee = rent_collected
    .checked_mul(FEE_BPS)
    .ok_or(JanitorError::Overflow)?
    .checked_div(BPS_DENOMINATOR)
    .ok_or(JanitorError::Overflow)?;
let user_payout = rent_collected
    .checked_sub(fee)
    .ok_or(JanitorError::Overflow)?;

// 7. Direct lamport transfer (vault is program-owned PDA)
**vault.try_borrow_mut_lamports()? -= fee + user_payout;
**treasury.try_borrow_mut_lamports()? += fee;
**user.try_borrow_mut_lamports()? += user_payout;
```

`FEE_BPS = 500` and `BPS_DENOMINATOR = 10_000`, so the fee is 5% and the user keeps 95%. Each closed account returns ~2,039,280 lamports of rent; if you close 25 in one batch you collect ~51M lamports (~0.051 SOL), the program keeps ~2.5M, and the user gets ~48.5M. Three things to note:

**The user signs once.** `process_batch_clean` walks the remaining `accounts` slice and assumes everything past the first four (user, vault, treasury, token program) is a token account to close. The CPI is `invoke_signed` because the *vault* (program PDA) signs as the destination of each `closeAccount`. The user only has to authorize the outer transaction, not each individual close. That's the whole point of the batch.

**The fee path is direct lamport math.** Step 7 does `**vault.try_borrow_mut_lamports()? -= fee + user_payout`. This is *only* legal because the vault is a program-owned PDA, and Solana lets a program directly mutate lamports on accounts it owns.
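Plugged into concrete numbers, the split works out like this (a TypeScript `bigint` sketch mirroring the on-chain math; the constants are the ones quoted above):

```typescript
const FEE_BPS = 500n;                 // 5%
const BPS_DENOMINATOR = 10_000n;
const RENT_PER_ACCOUNT = 2_039_280n;  // lamports returned per closed 165-byte token account

function feeSplit(accountsClosed: bigint): { fee: bigint; userPayout: bigint } {
  const rentCollected = RENT_PER_ACCOUNT * accountsClosed;
  const fee = (rentCollected * FEE_BPS) / BPS_DENOMINATOR;
  return { fee, userPayout: rentCollected - fee };
}
```

A full 25-account batch collects 50,982,000 lamports; the treasury keeps 2,549,100 and the user gets 48,432,900, which are the ~0.051 SOL / ~2.5M / ~48.5M figures above.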
If we tried this on the user's account we'd panic. If we tried it on the treasury (someone else owns it), the runtime would reject the transaction. The PDA-as-vault pattern is what makes the fee-split possible without a CPI to the system program.

**Checked arithmetic everywhere.** `checked_sub`, `checked_mul`, `checked_div` instead of `-`, `*`, `/`. On a Solana program, an integer overflow in non-checked arithmetic in release mode wraps silently. Wrapping a fee calculation gives an attacker an arithmetic vector. Every program written for production should use `checked_*` math even when the values are bounded by a 64-bit balance. The cost is cheap — a few extra CUs per op — and the alternative is worse.

## Why batched at 25?

The Solana transaction size limit is 1232 bytes. Each `closeAccount` CPI requires the destination's `AccountMeta` and the token account's `AccountMeta`, plus the inner instruction data. After accounting for the four base accounts (user/vault/treasury/token program) and the outer `BatchClean` instruction header, you can fit ~25 token accounts per transaction before bumping into the byte limit. The frontend respects this:

```rust
const MAX_ACCOUNTS_PER_TX: usize = 25;

let chunks: Vec<Vec<_>> = selected_accounts
    .chunks(MAX_ACCOUNTS_PER_TX)
    .map(|c| c.to_vec())
    .collect();

for chunk in &chunks {
    let num = chunk.len() as u8;
    let ix_data = build_batch_clean_data(num);
    // build metas, sign, send
}
```

If you select 100 dusty accounts in the UI, this fans out to 4 transactions. The user signs each one in their wallet. They all hit the same `BatchClean` instruction and the same fee-split logic.

## The Leptos 0.7 frontend, rendered client-side

Leptos is the Rust SolidJS-style framework — fine-grained reactive primitives, server-or-client rendering, compiles to WASM. For Janitor I went pure CSR (`app/Trunk.toml` set up for `--release`-mode WASM bundle), because the only thing the frontend needs to do is: 1. Connect to a wallet via JS shim. 2.
Scan token accounts via Solana RPC (HTTP, no need for a server). 3. Build instruction data in pure Rust. 4. Hand the instruction off to a JS shim for signing. 5. Display tx status. There's no server-side data, no SSR benefits. CSR + WASM keeps the deploy as static files on Cloudflare Pages. The Leptos contexts are how state is shared:

```rust
// Concrete type parameters are illustrative; the repo defines its own state structs.
let wallet = expect_context::<Signal<Option<WalletInfo>>>();
let accounts = expect_context::<ReadSignal<Vec<TokenAccount>>>();
let selected = expect_context::<RwSignal<HashSet<String>>>();
let set_processing = expect_context::<WriteSignal<bool>>();
```

If you've used SolidJS this is identical: `Signal` for reactive state, `ReadSignal`/`WriteSignal` split, `expect_context` to pull from a parent. The benefit over JS Solid is that the entire pipeline — RPC parsing, instruction encoding, vault PDA derivation — is in Rust, type-checked, with `?` propagation for errors. The Leptos UI code feels like 1:1 SolidJS in JSX-via-macro form.

## The JS shim is a load-bearing concession

I really wanted to do this entirely in Rust/WASM, no JS. I couldn't. The reason:

```rust
#[wasm_bindgen]
extern "C" {
    #[wasm_bindgen(js_name = zeraSignAndSend, catch)]
    async fn zera_sign_and_send(
        instruction_bytes: &[u8],
        account_metas: JsValue,
        blockhash: &str,
        rpc_url: &str,
    ) -> Result<JsValue, JsValue>;
}
```

This is an FFI into a JS function called `zeraSignAndSend` defined in the page's `<script>` tag.

[Click Here for Original](https://x.com/jamonholmgren/status/1751409105644982694?s=20)

---

## Utility Types: Exclude

In TypeScript, `Exclude` is a built-in utility type. It is not a keyword but a predefined type in the TypeScript standard library. The `Exclude` utility type is used to create a new type by excluding one set of types from another. Here's a simplified explanation:

```typescript
type Exclude<T, U> = T extends U ? never : T;
```

`Exclude<T, U>` produces a type that includes all the types from `T` that are not assignable to `U`. It utilizes conditional types to filter out types.
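A quick sanity check of that standard behavior (the union members here are arbitrary):

```typescript
// Exclude removes the union members assignable to the second type argument.
type Letters = "a" | "b" | "c";
type WithoutA = Exclude<Letters, "a">; // "b" | "c"

// Value-level witness: "a" would no longer type-check here.
const stillAllowed: WithoutA = "b";
```

Because conditional types distribute over unions, `Exclude` evaluates the `T extends U` test once per member of `T`, keeping the members where the answer is no.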
In the context of the original question and the code samples, `Exclude` appears as part of a type-level computation where it is employed to increment `Low` by 1 while building a range of numbers. To clarify, `Exclude` is not being used in the standard way here; the name is being leveraged creatively as a specific implementation detail, not as the documented behavior of the built-in utility type.

---

## Understanding Recursive Types: Range

So the answer given was actually:

```typescript
type Range<Low extends number, High extends number> =
  Low extends High ? never : Low | Range<Exclude<Low, High>, High>;
```

The `Range` type is a recursive type that generates a union of numbers from `Low` up to `High`. When `Low` equals `High`, the recursion gracefully ends with `never`. Otherwise, it forms a union of `Low` and the result of calling `Range` with `Low` incremented by 1 and `High` unchanged.

## Decoding the Magic of Exclude

```typescript
type Exclude<Low extends number, High extends number> =
  Low extends High ? never : Low + 1;
```

Now, let's unravel the mysteries of this custom `Exclude`. It is the piece that increments `Low` by 1. (Note that `Low + 1` is pseudocode: TypeScript has no arithmetic at the type level, so real implementations express the increment with tuple-length tricks.) It becomes instrumental in crafting types like our beloved `ZeroToHundred`, a union encompassing the numbers from 0 to 100.

```typescript
type ZeroToHundred = Range<0, 100>;
```

But why the incrementation? The answer lies in TypeScript's type system and its ability to generate unions through recursive types. When `Exclude` is employed, it lets TypeScript construct a union spanning the numbers between `Low` and `High`, with `Low` stepping up by 1 at each level of the recursion. This design leads to cleaner, more concise type definitions: with the help of this incrementing `Exclude`, we can generate a comprehensive range of numbers without the need to explicitly list each individual one.
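For completeness, here is one way the pattern actually compiles: a sketch that expresses the increment via tuple lengths. The helper names `BuildTuple` and `PlusOne` are my own, not from the original answer:

```typescript
// A tuple with exactly N elements, built one `unknown` at a time.
type BuildTuple<N extends number, Acc extends unknown[] = []> =
  Acc["length"] extends N ? Acc : BuildTuple<N, [...Acc, unknown]>;

// N + 1, read off as the length of an (N+1)-element tuple.
type PlusOne<N extends number> = [...BuildTuple<N>, unknown]["length"] & number;

type Range<Low extends number, High extends number> =
  Low extends High ? never : Low | Range<PlusOne<Low>, High>;

type ZeroToFive = Range<0, 6>; // 0 | 1 | 2 | 3 | 4 | 5 (High itself is excluded)
```

The same shape scales up to `Range<0, 100>`, though very large ranges can run into TypeScript's recursion-depth limits.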
--- ## Empowering Efficient Type Definitions In summary, `Exclude` is the unsung hero that facilitates the incremental dance of `Low` in TypeScript's `Range` and other recursive types. ```typescript // Example usage: const numberInRange: ZeroToHundred = 42; // Valid, as 42 is in the range 0 to 100 const outsideRange: ZeroToHundred = 150; // Error, as 150 is outside the range 0 to 100 ``` This approach not only enhances efficiency but also provides manageability, especially when dealing with expansive ranges of numbers. So, the next time you encounter a recursive type in your TypeScript journey, embrace the enchantment of `Exclude`. Let it be your guide to crafting elegant and powerful type definitions. Happy coding, fellow TypeScript enthusiasts! 🤖 --- # Introducing the Milk V Canonical: https://blog.skill-issue.dev/blog/introducing_milkv/ Description: Milk-V Duo is an ultra-compact embedded development platform. It can run Linux and RTOS, providing a reliable, low-cost, and high-performance platform for professionals, industrial ODMs, AIoT enthusiasts, DIY hobbyists, and creators. Published: 2024-07-12T00:00:00.000Z Tags: risc v, risc-v, risc, isa, open-source, architecture, customizable, embedded ## Introducing the Milk V: Unlocking the Power of RISC-V The Milk V is a series of innovative products designed to harness the potential of RISC-V, an open-source instruction set architecture (ISA) that is revolutionizing the world of embedded systems. Developed by Milk-V, a company dedicated to providing high-quality RISC-V products, these devices cater to developers, enterprises, and consumers alike, promoting the growth of the RISC-V ecosystem. ## Models in the Milk V Series The Milk V series includes several models, each tailored to meet specific needs and applications: 1. **Milk-V Duo**: This model features dual cores up to 1GHz (optional RISC-V/ARM), up to 512MB of memory, and a 1TOPS@INT8 TPU. 
It integrates wireless capabilities with Wi-Fi 6/BT 5 and comes equipped with a USB 2.0 HOST interface and a 100Mbps Ethernet port. The Duo supports dual cameras (2x MIPI CSI 2-lane) and MIPI video output (MIPI DSI 4-lane).
2. **Milk-V Duo S**: This variant of the Duo offers dual cores up to 1GHz (optional RISC-V/ARM), up to 256MB of memory, and a 1TOPS@INT8 TPU. It is capable of running both Linux and RTOS simultaneously and features rich I/O interfaces.
3. **Milk-V Jupiter**: This Mini-ITX motherboard is equipped with a RISC-V processor, making it an ideal choice for those looking to leverage the benefits of RISC-V in their projects.

### The RISC-V Advantage

The RISC-V architecture offers several advantages over proprietary ISAs like ARM. Since RISC-V is open source and free to use, manufacturers do not need to pay licensing fees, making it a cost-effective option. This openness also fosters innovation and collaboration, as anyone can contribute to the development of RISC-V.

### Conclusion

The Milk V series is a testament to the growing popularity of RISC-V in the embedded systems market. With its range of models, Milk-V provides developers with the tools they need to harness the power of RISC-V and create innovative solutions. As the RISC-V ecosystem continues to expand, the Milk V series is poised to play a significant role in shaping the future of embedded systems development.

## References

- Milk-V. (n.d.). Milk-V | Embracing RISC-V with us.
- Reddit. (2022, December 19). RISC-V vs. ARM embedded software perspective.
- Milk-V. (n.d.). Introduction | Milk-V.
- RISC-V International. (2024, July 2). Introducing the Mini-ITX motherboard 'Milk-V Jupiter' equipped with a RISC-V processor.
- NW Engineering LLC. (2022, July 28). Overview of RISC-V in Embedded Systems Development.
---

# Nix-flakes and Bun

Canonical: https://blog.skill-issue.dev/blog/nixos_bunjs/
Description: Small update to my development flow and focus. How to get up and running with Bun.js in NixOS.
Published: 2024-06-30T14:09:00.000Z
Tags: nixos, bun.js, nix-flakes, javascript, astro.js, development, declarative, environment

Since taking up some extra cyber-security and hacking courses I have been focusing more on Linux development. As a result I have picked up NixOS, and as a JavaScript developer I have to say I have fallen in love with the declarative nature of my environment on NixOS. It has been a blast building VMs and my own cluster out of NixOS configurations from scratch. I have also moved my spare laptop over to NixOS and have been daily-driving it in lieu of my M2 MacBook Air.

## Developing with NixOS

Many readers will notice that this blog is built with Astro.js and Bun.js, so let's talk about my experience adding flakes to this project and getting it up and running on my NixOS machine.

### Setting up my Development Environment

When you're new to flakes it can be overwhelming to know where to start. Luckily Nix provides a simple way to get going. By running the command:

```bash
nix flake init
```

you get a new file called `flake.nix`, which, just like `package.json`, declaratively tells the OS what tools and versions are needed for development. I went ahead and replaced the default `flake.nix` with the following.
```nix
{
  description = "Basic flake for Astro.js and Bun.js project";

  inputs = {
    nixpkgs.url = "github:nixos/nixpkgs?ref=nixos-unstable";
    flake-utils.url = "github:numtide/flake-utils";
  };

  outputs = { self, nixpkgs, flake-utils }:
    flake-utils.lib.eachDefaultSystem (system:
      let
        pkgs = nixpkgs.legacyPackages.${system};
      in
      {
        devShells.default = pkgs.mkShell {
          buildInputs = with pkgs; [ bun nodejs ];
          shellHook = ''
            echo "Astro.js with Bun.js development environment"
            echo "Run 'bun create astro' to create a new Astro project"
          '';
        };
      });
}
```

Cool right? It's not super crazy and is decently readable. Notice the file extension `.nix`: Nix comes with its own DSL for declarative configuration. To learn more about the syntax, visit this page about the [Nix language](https://nix.dev/tutorials/nix-language.html).

#### Description

```nix
description = "Basic flake for Astro.js and Bun.js project";
```

Purpose: Provides a brief description of what the flake is for. This description is shown when you run commands like `nix flake metadata`.

#### Inputs

This part of the configuration declares the dependencies the flake needs.

```nix
inputs = {
  nixpkgs.url = "github:nixos/nixpkgs?ref=nixos-unstable";
  flake-utils.url = "github:numtide/flake-utils";
};
```

I grabbed these two main dependencies:

- `nixpkgs`: The Nix Packages collection, fetched from the nixos-unstable branch on GitHub.
- `flake-utils`: A utility library for working with flakes, fetched from GitHub.

These are two of the most common deps you will see in most flakes.

#### Outputs

```nix
outputs = { self, nixpkgs, flake-utils }:
  flake-utils.lib.eachDefaultSystem (system:
    let
      pkgs = nixpkgs.legacyPackages.${system};
    in
    {
      devShells.default = pkgs.mkShell {
        buildInputs = with pkgs; [ bun nodejs ];
        shellHook = ''
          echo "Astro.js with Bun.js development environment"
          echo "Run 'bun create astro' to create a new Astro project"
        '';
      };
    });
```

Outputs define what the flake produces.
The outputs function takes the inputs (`self`, `nixpkgs`, and `flake-utils`) and returns an attribute set. Here, we use `flake-utils.lib.eachDefaultSystem` to create outputs for each supported system (e.g., x86_64-linux, aarch64-linux).

- **`let` block**:

```nix
let
  pkgs = nixpkgs.legacyPackages.${system};
in
```

Purpose: Defines a local variable `pkgs` that refers to the Nix packages for the current system.

- **`devShells.default`**:

```nix
devShells.default = pkgs.mkShell {
  buildInputs = with pkgs; [ bun nodejs ];
  shellHook = ''
    echo "Astro.js with Bun.js development environment"
    echo "Run 'bun create astro' to create a new Astro project"
  '';
};
```

Purpose: Creates a development shell environment, which is useful for guaranteeing a consistent setup.

- `buildInputs`: Specifies the packages to include in the shell environment. Here, we include `bun` and `nodejs`.
- `shellHook`: A script that runs when you enter the development shell. It prints a message to the console.

This setup ensures that anyone using this flake gets a consistent development environment with the necessary tools for working on an Astro.js project using Bun.js. It no longer matters where in the world you are or what hardware you run: as long as the machine can evaluate flakes and reach the internet to fetch the dependencies, the dev environment will build and run.

---

# How Random is a Local LLM? A Rust Benchmark with Redis

Canonical: https://blog.skill-issue.dev/blog/ai37_llm_random_numbers/
Description: A Rust harness that asks Ollama models for "a random number between 1 and 100" thousands of times, parses every response with regex, stores results in Redis, and pits them against a real RNG. Spoiler: 42 wins.
Published: 2024-04-25T02:54:54.000Z
Tags: rust, llm, ollama, redis, benchmark, rng, regex, side-quest

There's a piece of folk knowledge in the LLM crowd that says: ask any chatbot for "a random number between 1 and 100" enough times, and you'll see a clear bias toward the same handful of numbers. 7. 17. 42. 73. The exact set varies by model, but the bias is robust across most LLMs. I'd seen the screenshots on Twitter. I had a half-day in April 2024 and a Mac Mini running Ollama. So I built a benchmark, `ai37`, to actually measure it. The whole project lives at [Dax911/ai37](https://github.com/Dax911/ai37), and the commit that turned it from "demo" into "actually a benchmark" is [`fc5c80c — :sparkles: Rust rng`](https://github.com/Dax911/ai37/commit/fc5c80c) on 2024-04-25. This post is about what the harness looks like, why I built it in Rust instead of a 20-line Python script, and what I learned from running it.

## The shape of the experiment

The premise is simple enough to write on the back of a napkin:

1. Pick a question. (`"Generate a random number between 1 and 100, inclusive. Reply with only the number."`)
2. Pick a model. (`openhermes:latest`, `llama2-uncensored:latest`, etc.)
3. Send the prompt 1,000+ times.
4. Parse the response. Extract the first integer between 2 and 99.
5. Store the response, the parsed number, the model, the response time, and the timestamp in Redis.
6. Aggregate.

You could write all of that in a Python notebook in fifteen minutes. The reason I wrote it in Rust is that step 3 is the bottleneck: Ollama serves one inference at a time per model, and even on M1 hardware a single completion takes 1–4 seconds. To get a meaningful sample size in reasonable wall-clock time you have to fan out across multiple concurrent requests, manage a Redis connection pool, and not let one slow model stall the whole run. Tokio + reqwest + a `MultiplexedConnection` to Redis got me to ~1,000 prompts in under three minutes.
The Python equivalent would have been a thousand-prompt script that ran for an hour.

## The harness

From [`src/main.rs`](https://github.com/Dax911/ai37/blob/fc5c80c/src/main.rs), this is the result struct:

```rust
#[derive(Debug)]
struct ApiQueryResult {
    request_id: u64,
    endpoint_url: String,
    question: String,
    response_time: u128,
    http_status_code: u16,
    response_body: String,
    error_message: Option<String>,
    chosen_number: Option<i32>,
    model: String,
    request_datetime: DateTime<Utc>,
    contained_additional_text: bool,
}
```

Every field on this struct exists because at some point I lost data and wished I had it. `response_body` is verbatim what the model said. `chosen_number` is what the regex extracted. `contained_additional_text` is the binary flag for "did the model say only `42` or did it say `Sure! Here's your number: 42`."

The reason `chosen_number` is an `Option<i32>` and not just an `i32` is the most important design choice in the whole harness: **sometimes the model doesn't reply with a number at all**. `llama2-uncensored` once replied to me with `"I cannot generate a random number for you, as I am an AI language model designed to provide informational and educational responses..."` That's not a refusal in the safety sense — that's the model genuinely not understanding what's being asked. The harness has to record that and not crash.

## Regex was the right call here

```rust
fn extract_number_from_response(response: &str) -> Option<i32> {
    let re = Regex::new(r"\d+").unwrap();
    let mut numbers: Vec<i32> = Vec::new();
    for cap in re.captures_iter(response) {
        if let Some(number_str) = cap.get(0) {
            if let Ok(number) = number_str.as_str().parse::<i32>() {
                if number >= 2 && number <= 99 {
                    numbers.push(number);
                }
            }
        }
    }
    numbers.into_iter().next()
}
```

There are three subtle things in this 14-line function:

1. **Find every integer**, not just the first. Models will sometimes say `"between 2 and 99... I'd say 73."` — three numbers, the third one is the answer. You have to examine all of them.
2.
**Filter to the valid range** (2–99 inclusive). This eliminates `"1"` from `"between 1 and 100"` if the model just echoed the prompt back. It also eliminates `"100"`, because the prompt says *exclusive* in some variants. The boundary numbers are the most common false positives.
3. **Take the first survivor.** Counter-intuitively this is the right heuristic, because most models that emit multiple integers do so as `"between [LOW] and [HIGH], my answer is [N]"`. `[LOW]` and `[HIGH]` are both filtered out by the range check; `[N]` survives. The first survivor is the answer.

Could you parse this with a more sophisticated NER pipeline? Sure. Could you fine-tune a small classifier? Sure. But this is a benchmark of LLM randomness, not a benchmark of how clever I can be at extracting numbers from text. The dumber the parser, the easier it is to defend the conclusion.

## Storing in Redis was load-bearing

Each result becomes a Redis hash with a unique key:

```rust
let unique_key = format!(
    "rust-basic-rng:{}:{}",
    Utc::now().timestamp_millis(),
    number
);
let data = vec![("number", number.to_string())];
let _: () = con.hset_multiple(&unique_key, &data).await?;
```

The key shape, `<prefix>:<timestamp_ms>:<number>`, means I can:

- `KEYS rust-basic-rng:*` to list every result from the control RNG.
- `KEYS *:1714013094:*` to list every model's response in a 1-ms window (used for "did models converge in time?" analysis).
- `HGETALL <key>` to recover the full record.

This is *not* the right schema for a real database. There's no compound index, no fast `WHERE number = 42` query without scanning every key. But Redis on a Mac Mini doing a `KEYS *` over 5,000 entries is still a sub-100ms operation, and the entire dataset fits comfortably in memory. The bigger reason for Redis is that I wanted to *resume the run* if my laptop hibernated. Streaming straight to a CSV would have meant losing in-flight inference if the script crashed.
Redis takes the writes out-of-process; a crash loses at most one inference's worth of data. ## The control: a real RNG I added the control in this exact commit: ```rust async fn generate_and_store_random_numbers( con: &mut MultiplexedConnection, n: usize, min: i32, max: i32, ) -> redis::RedisResult<()> { let mut rng = rand::thread_rng(); for _ in 0..n { let number = rng.gen_range(min..=max); let unique_key = format!( "rust-basic-rng:{}:{}", Utc::now().timestamp_millis(), number ); // ... } Ok(()) } ``` Why bother including a `rand::thread_rng()` baseline? Because **a benchmark with no baseline isn't a benchmark, it's an anecdote.** The story "LLMs say 42 too often" is only meaningful if you also know what a real RNG's frequency distribution looks like over the same number of trials. With 1,000 trials over 98 distinct values, a uniform RNG will produce a frequency-of-mode that's *also non-uniform* — the most common number will still appear ~3× more often than the least common, just by chance. You need that baseline to say "the LLM bias is real" instead of "the LLM happened to produce a non-uniform sample." The control RNG isn't there because anyone questions whether `rand::thread_rng()` is uniform. It's there because the comparison statistic only works if both arms are sampled the same way. ## The `analyze.py` companion The same commit added a small Python script for the actual stats: ``` analyze.py | 46 ++++++++++++++++++++++++++++++++++++++++++++++ ``` (Yes, the leading space in the filename is real. I never noticed; `git` accepted it; nobody depends on it; the commit immortalized it.) `analyze.py` opens Redis, scans the keys for each model, builds a Counter, normalizes to frequency, and pretty-prints the top 10 most-common numbers per model. That's it. The script is 46 lines and it's where the actual scientific output came from. Rust did the data collection; Python did the stats. The right tool for each job. 
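The aggregation half is simple enough to sketch. This is not the original `analyze.py` (the function name and key-handling details here are my assumption), but it shows the shape of the per-model stats, working from keys already pulled out of Redis:

```python
from collections import Counter

def top_numbers(keys: list[str], prefix: str, n: int = 10) -> list[tuple[int, float]]:
    """Frequency table for one model's keys, shaped like 'prefix:timestamp_ms:number'."""
    numbers = [
        int(key.rsplit(":", 1)[1])          # the number is the last key segment
        for key in keys
        if key.startswith(prefix + ":")     # keep only this model's keys
    ]
    counts = Counter(numbers)
    total = sum(counts.values())
    # Normalize counts to frequencies and keep the n most common.
    return [(num, c / total) for num, c in counts.most_common(n)]

keys = [
    "openhermes:1714013094000:42",
    "openhermes:1714013095000:42",
    "openhermes:1714013096000:17",
    "rust-basic-rng:1714013097000:58",
]
print(top_numbers(keys, "openhermes", n=2))  # 42 leads with frequency 2/3
```

The real script would feed `keys` from a Redis `SCAN`; everything after that point is plain `collections.Counter` work.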
## What the data showed I'm not going to publish the raw numbers because the runs I have are from 2024 against ollama models that have since been retrained, and I don't trust the conclusions to generalize to today's checkpoints. But the qualitative finding matched the folk knowledge: - **Both Ollama models I tested were significantly biased toward 7, 17, 42, 73, 77.** - **The Rust RNG was uniform** in the chi-square sense at 1,000 samples (p > 0.05). - **`llama2-uncensored` had a worse bias than `openhermes`** in the sense that its mode-frequency was higher (the most common number appeared more often as a fraction of total samples). - **Both LLMs avoided multiples of 10** — `30`, `50`, `60` were under-represented relative to `33`, `47`, `61`. My theory: models have learned that "round numbers don't sound random," so they overcorrect away from them. The most-common-overall LLM answer was 42. Of course it was 42. ## What this taught me The technical thing I learned was that **regex parsing is fine** for almost any LLM output extraction problem if you constrain the output range tightly. I'd been reaching for JSON-mode prompts and structured-output APIs for things that a 14-line `\d+` regex would solve. The bigger thing was about benchmarking discipline: **"is this thing biased?" is not a yes/no question without a baseline.** Half the AI Twitter takes I read in 2024 were claims of LLM bias against an implicit baseline of "perfectly uniform behavior," which no statistical process exhibits at finite sample sizes. The boring controls are what make the spicy claims defensible. If you want a Rust harness for benchmarking any local model, [ai37 is the template](https://github.com/Dax911/ai37). It's 200 lines of Rust, a 46-line Python analyzer, and a Redis dependency. Add a model, change the regex, change the question. The architecture survives. ## Trade-offs **Why Ollama instead of OpenAI/Anthropic?** Cost. 5,000 inferences at 4¢ each is $200 for a science-fair experiment. 
Ollama on a Mac Mini is the per-watt cost of leaving a laptop on overnight. **Why Redis instead of SQLite?** Resilience to mid-run crashes. SQLite would also work; the schema is trivial. The reason I went Redis is I had it running for another project (the Rust pipeline part of [Building A Better Cryptocurrency](/blog/a_better_crypto/)) and adding a hash schema was 5 lines. **Why filter to 2–99 instead of allowing the boundary?** Because half the failure modes of LLMs are "echoing the prompt back." Filtering 1 and 100 out cleanly distinguishes "the model picked an answer" from "the model parroted the question." You lose two valid sample values; you gain a much cleaner dataset. ## Further reading - [ai37 on GitHub](https://github.com/Dax911/ai37) — the harness, the analyzer, the (lost) Redis dump. - [Ollama](https://ollama.ai/) — the local-LLM runner I benchmarked against. - [`rand` crate docs](https://docs.rs/rand/) — `thread_rng().gen_range(...)` is what makes the control arm honest. - [Building A Better Cryptocurrency](/blog/a_better_crypto/) — the project Redis was already running for. --- # Blazingly Fast Drinks: A Repo I Made For The Bit Canonical: https://blog.skill-issue.dev/blog/glug_blazingly_fast_drinks/ Description: A Clerk + Next.js + Expo turborepo I called "glug" with the description "Blazingly Fast Drinks". The README never mentioned drinks. The repo description carried the entire joke. Published: 2024-03-19T17:09:37.000Z Tags: turborepo, clerk, nextjs, expo, trpc, side-quest, shitpost > **Repo description:** Blazingly Fast Drinks > > **README first line:** `# Glug the PMG drink app with Clerk, Next.js, and Expo` That's [`Dax911/glug`](https://github.com/Dax911/glug). The whole joke is on the GitHub repo card. The README is sober and explanatory. The description is unhinged. This is my favourite kind of repo — a piece of public infrastructure where the only humour I'm allowed is the 350-character box on the listing page. 
Today I want to talk about the [`9b188bc — More context`](https://github.com/Dax911/glug/commit/9b188bc) commit on 2024-03-19. It's small. It changed nine files. It's the moment I committed to a project that exists for one joke and a turborepo template. ## What was Glug, briefly Glug was supposed to be a drink-tracking app for **PMG** — Phi Mu Gamma, my college fraternity. The premise: a phone app where brothers log drinks, the chapter sees aggregate stats, and there's a cross-platform Next.js dashboard for the chapter president to look at. Nothing privacy-respecting, nothing on-chain, nothing remotely interesting from a security perspective. A drink counter. But "drink counter" doesn't justify the stack I shipped: ``` apps/ expo/ # React Native via Expo SDK nextjs/ # Next.js 13 dashboard packages/ api/ # tRPC v10 router db/ # Prisma schema + types ``` This is the [`create-t3-turbo`](https://github.com/t3-oss/create-t3-turbo) layout. I'd been using it on every personal project that quarter — same router, same Prisma schema, same Clerk auth, same `apps/expo` + `apps/nextjs` split. The actual purpose of `glug` was to *practice the stack*. The drinks were incidental. ## The "More context" commit Here's the diff that mattered ([`9b188bc`](https://github.com/Dax911/glug/commit/9b188bc)): ``` .vscode/settings.json | 8 ++ README.md | 12 ++++ apps/nextjs/src/pages/index.tsx | 12 +++- bun.lockb | (binary) packages/api/src/context.ts | 40 +++++++-- packages/api/src/router/auth.ts | 9 +++ packages/api/src/router/index.ts | 10 ++++- packages/api/src/router/post.ts | 25 ++++++- packages/db/prisma/schema.prisma | 57 +++++++++++++-- ``` The Prisma schema is the only file with anything approaching design content. Everything else is "I added the auth router import" and "I switched to bun." But the schema reveals what the project was *actually* trying to do: - A `User` table seeded by Clerk's `userId`. - A `Drink` table with `(user, timestamp, type, abv)`. 
- A `Session` rollup table, time-bucketed. - An attempt at a `Timebox` table — the README has an "Additional Specs" line at the bottom that reads `Will have timeboxing need to find a way to put that in the DB w the current schema`. That last line is the thesis of every personal project I started in 2024: **"will have timeboxing."** I was trying to use my fraternity drink-tracker as a way to think about how to bound a session in time, because the same problem was sitting in three other repos I never finished. ## Why the description was the joke GitHub repo descriptions are 350 characters of cold static text on a search result. They appear in every list view, in every fork chart, in every `gh repo list dax911`. They show up *everywhere* the repo name does. If you treat them as proper marketing copy you get "A drink-tracking application for greek-letter organizations using modern TypeScript tooling." Nobody has ever clicked on a repo because the description said that. Whereas "Blazingly Fast Drinks" is the only Rust-meme rendering of "drinks app" possible. It implies: - The drinks are blazing. - The drinks are fast. - I am taking this very seriously. - I am taking this not at all seriously. The phrase is recognisably a Reddit `/r/rust` cliche. Drink-tracking is not Rust. The description is fully in conflict with the actual stack — the repo is `tRPC + Prisma + Expo`, *zero* Rust. The collision is the joke. I think a lot about how much you can get away with by putting humour in the metadata of a serious-looking artifact. Repo descriptions, npm `description` fields, git tag annotations, package `keywords`, the `version` field on a `package.json` you set to `0.0.69` — these are all places where the comedy is invisible to anyone who isn't already there. They're not in the way. They don't hurt the project. They're the public-facing payoff of a project you're never going to finish. ## What this taught me about side-quests Glug never shipped. 
I never wrote the timeboxing code. The fraternity never used it. I'm not even sure I told anyone in the chapter about it. That's not what side-quests are for. What glug *did* teach me: 1. **t3-turbo is the right scaffold for a TS monorepo prototype**, even if you never finish anything in it. I went on to clone this exact layout into [tauri-clerk-auth](https://github.com/Dax911/tauri-clerk-auth) and into the early scaffolds of what became [zera-wallet-demo](/blog/zera_wallet_v3_zkp/). The muscle memory of "tRPC router → Prisma model → Expo screen" is something I can do at midnight without thinking. 2. **Bun was already eating Node's lunch** by March 2024. The diff that landed in this commit replaced `pnpm` with `bun` and shrunk install time on a fresh clone from 90s to 12s. I haven't used `pnpm` for a side-quest since. 3. **A repo with a joke description gets star drift.** People still arrive in `glug` from search occasionally. They open it, see the README, see Clerk + Expo + Prisma, and leave. The description got them to click. The README didn't deserve them. ## The "PMG" footnote For anyone Googling: yes, PMG is Phi Mu Gamma. Yes, it's a real fraternity. No, this app was never the official chapter tool. There's a Google Sheet that beat me to market by approximately fifteen years. The drink counter is a Google Sheet. The drink counter has always been a Google Sheet. Every digital tool that has tried to replace the drink counter has failed because the Google Sheet is already deployed, already shared, already has a hundred entries from 2009 still in it. You can't ship faster than `sheets.new`. The right insight, retrospectively, was to ship a tRPC dashboard that *imported the Sheet*. Which I never did. Which is fine. ## What the side-quest tells you Side-quests are how you maintain stack fluency. You don't need to ship them. You don't need to scope them. You don't need to tell anyone about them. 
What you need is a place to type out the boilerplate so that next time you start a *real* project the boilerplate doesn't slow you down. Glug also taught me the second-order lesson that became my entire 2026: when a side-quest crosses paths with a real product idea, you should let the side-quest die and start the product. I never finished the timeboxing logic in glug. The same problem reappeared in [Cruiser's gossip presence](/blog/cruiser_iroh_gossip_p2p/) — when does a peer cease to be "active" if their last announce is 30s old? The answer in glug would have been "expire from the rollup if no row in 60s." The answer in Cruiser is "evict from the cache if no announce in 90s." Same architecture. Different domain. That's the gift of a side-quest you stop. You harvest the architecture for the next thing. Glug was the architectural garden bed; Cruiser is the tree that grew out of it. ## Further reading - [glug on GitHub](https://github.com/Dax911/glug) — the description still says Blazingly Fast Drinks. - [create-t3-turbo](https://github.com/t3-oss/create-t3-turbo) — the scaffold I've copied into a dozen projects. - [Cruiser P2P origin post](/blog/cruiser_iroh_gossip_p2p/) — where the timeboxing instinct eventually landed. - [The PMG Google Sheet](https://en.wikipedia.org/wiki/Phi_Mu_Gamma) — not actually linked here. The Sheet is private. The joke is. ---