preprint

Asymmetric Tool Surfaces for AI-Agent Cryptographic Primitives

Hayden 'Dax' Porter-Aylor
Zera Labs / Independent

Abstract

We argue that SDKs exposing cryptographic primitives to autonomous AI agents must obey an explicit asymmetry rule: read and pure-compute operations may be exposed without privilege, but state-changing authority must remain behind an out-of-band human or hardware confirmation. We formalise this rule, instantiate it for a shielded-pool zero-knowledge SDK exposing its surface via the Model Context Protocol, and identify the structural reasons the asymmetric design is tractable for cryptographic SDKs while intractable for general-purpose RPC surfaces. The contribution is a small, opinionated discipline that renders adversarial-prompt and supply-chain compromise of the agent layer non-catastrophic.

  • zero-knowledge
  • MCP
  • AI agents
  • SDK design
  • threat modelling
  • privilege separation

1. Introduction

The deployment posture of cryptographic software development kits has changed faster than the threat model that governs them. Until recently, an SDK was a library — a process-local API consumed by an application written by a human, in a language understood by that human, after the human had read enough of the source to form an intuition for what the calls actually did. The principal threat was a buggy caller, and the principal mitigation was clear documentation, robust types, and careful examples.

The introduction of the Model Context Protocol (Anthropic, 2024) and the subsequent rapid adoption of MCP servers across cryptographic tooling in 2025 and 2026 have changed the principal. The principal is now, with non-negligible probability, an autonomous large-language-model (LLM) agent calling the SDK on behalf of a user who may not understand the call’s semantics. The agent is potentially compromised by prompt injection (Greshake et al., 2023), by tool-poisoning of the upstream MCP server, or by an adversarial input embedded in the agent’s working context. The “lethal trifecta” identified by Willison (Willison, 2025) — private data, untrusted input, external communication — is a default property of an MCP-connected wallet stack.

This paper makes a structural argument: cryptographic SDKs that expect to be invoked by AI agents should obey an asymmetric tool-surface discipline, exposing only read and compute primitives over MCP and reserving state-changing authority for an out-of-band, human-in-the-loop, or hardware-enforced confirmation channel. The discipline is narrow and opinionated. It rules out a number of conveniences that look attractive in a normal SDK. It also makes adversarial compromise of the agent layer survivable rather than catastrophic.

The remainder of the paper is organised as follows. Section 2 establishes the threat model and notation. Section 3 defines the asymmetry rule and its decision procedure. Section 4 gives a worked example: an instantiation of the rule for zera-mcp, the MCP surface of a production shielded-pool SDK. Section 5 discusses extensions and limitations. Section 6 concludes.

2. Threat model

Let $\mathcal{S}$ denote a cryptographic SDK, and let $\mathcal{T} = \{t_1, \dots, t_n\}$ denote the finite set of tools $\mathcal{S}$ exposes via an agent-facing protocol such as MCP. Each tool $t_i$ has a typed signature $t_i : \mathbb{I}_i \to \mathbb{O}_i$ and an effect signature $\mathsf{eff}(t_i) \in \{\bot, \mathsf{read}, \mathsf{compute}, \mathsf{write}\}$.

We adopt the following adversary $\mathcal{A}$:

  • $\mathcal{A}$ may control the agent’s prompt context, including any tool-result text the agent has read.
  • $\mathcal{A}$ may inject text into any external content the agent fetches.
  • $\mathcal{A}$ does not control any signing key held by a wallet or hardware-security module operating outside the agent.
  • $\mathcal{A}$ does not control the network’s consensus rules.

Under this adversary, we require that for any execution trace $\sigma$ produced by the agent calling tools in $\mathcal{T}$, the on-chain state $\Sigma$ reachable from $\sigma$ should differ from the on-chain state reachable under the trivial empty execution only via transactions that were explicitly co-signed by a wallet $W$ outside $\mathcal{A}$’s control.

In words: an adversary that captures the agent must not be able to move funds. They may be able to spend compute, retrieve information the user already had access to, and produce mathematical objects (commitments, proofs) that are inert until co-signed.

3. The asymmetry rule

We say a tool surface $\mathcal{T}$ is asymmetric under separation $W$ if for every $t_i \in \mathcal{T}$,

$$\mathsf{eff}(t_i) \in \{\mathsf{read}, \mathsf{compute}\} \;\;\Longleftrightarrow\;\; t_i \in \mathcal{T}_{\text{exposed}}$$

and the residual $\mathcal{T} \setminus \mathcal{T}_{\text{exposed}}$ — the tools whose effect signature is $\mathsf{write}$ — are reachable only through $W$.

The decision procedure that produces $\mathcal{T}_{\text{exposed}}$ is the following:

For each candidate tool $t$, ask: if the agent is fully compromised by $\mathcal{A}$, what is the worst-case state delta $\Delta\Sigma$ a single invocation of $t$ can induce? If $\Delta\Sigma = \emptyset$ — that is, if the call is observationally pure modulo the agent’s own resources — admit $t$ to $\mathcal{T}_{\text{exposed}}$. Otherwise, route the function behind $W$.
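The decision procedure above can be sketched as a simple partition over declared effect signatures. This is an illustrative Python sketch, not the zera-mcp implementation; the tool names and effect labels are taken from the paper’s running example, and the dictionary of candidates is a stand-in for whatever registry an SDK actually maintains.

```python
# Hypothetical decision procedure: partition a candidate tool surface into the
# MCP-exposed set and the wallet-gated residual, using each tool's declared
# worst-case effect signature. Names and effects are illustrative.
CANDIDATES = {
    "compute_commitment": "compute",
    "derive_nullifier":   "compute",
    "build_spend_proof":  "compute",
    "get_pool_state":     "read",
    "submit_transaction": "write",   # non-empty state delta: broadcast authority
    "unlock_key":         "write",   # implicit write: grants signing authority
}

def partition(candidates: dict[str, str]) -> tuple[set[str], set[str]]:
    """Admit a tool iff a fully compromised agent invoking it induces an
    empty on-chain state delta, i.e. eff(t) is read or compute."""
    exposed = {t for t, eff in candidates.items() if eff in ("read", "compute")}
    gated = set(candidates) - exposed   # reachable only through the wallet W
    return exposed, gated

exposed, gated = partition(CANDIDATES)
```

Because the check inspects one tool at a time, it is exactly as local as the paper claims: adding or removing other tools never changes an individual admission decision.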

Two consequences follow. First, the procedure is composable: if every individual tool is observationally pure under $\mathcal{A}$, any agent-driven composition of them remains observationally pure under $\mathcal{A}$, because composition does not synthesise authority. Second, the procedure is local: the decision for $t_i$ does not depend on the other tools in $\mathcal{T}$, which makes it tractable for SDK authors to apply at PR review time.

The asymmetry rule does not eliminate every class of attack. An $\mathcal{A}$ that captures the agent can still, in principle, mislead the human into approving a transaction that they would not have approved given full information. The rule cannot fix social engineering. What it forbids is an architecture in which the agent has unilateral capability to drain user funds without human action.

3.1 Why this is tractable for cryptographic SDKs

Cryptographic SDKs are unusual among software libraries in that the boundary between computation and authority is unusually crisp. Constructing a Pedersen commitment (Pedersen, 1992) does not move money; submitting a transaction does. Computing a Poseidon hash (Grassi et al., 2021) does not move money; signing the resulting witness with a private key does. Producing a Groth16 proof (Groth, 2016) does not move money; broadcasting that proof to a chain does. Each of these primitives admits a clean factorisation along the read/compute vs. write axis.

Compare this to a general-purpose enterprise SDK, where the same call may simultaneously query, mutate, and authorise. In that setting the asymmetry rule is too aggressive, because its first effect is to forbid the very calls users want to make. Cryptographic SDKs escape this critique because the underlying mathematics already separates the two.

3.2 What the rule rules out

It is worth being explicit about which conveniences the asymmetry rule eliminates, because each is a convenience a typical SDK author will want.

  1. One-shot transfer tools. A tool of the form transfer(asset, amount, to) is the canonical write effect. The asymmetry rule forbids exposing it on the agent surface even if the SDK author has access to a signing key — the right place for that key is a wallet, not the SDK process.
  2. Privileged read tools that proxy write authority. A tool like revoke_pending_proof(proof_id) looks like a read but actually mutates the SDK’s local state in a way that affects future write operations. The rule treats it as a write.
  3. Implicit authority via cached secrets. A tool that “remembers” a recent unlock of a key for some grace period is an implicit write — the agent has effective signing authority for the duration. The rule treats the unlock itself as a write.

4. Worked example: zera-mcp

We instantiate the asymmetry rule for zera-mcp, the MCP server bundled with the zera-sdk shielded-pool toolkit. The SDK exposes a small surface: four tools and three resources, all conforming to the asymmetry rule.

4.1 Tools admitted to $\mathcal{T}_{\text{exposed}}$

| Tool | Signature | Effect |
| --- | --- | --- |
| compute_commitment | $(\mathsf{asset}, \mathsf{amount}, r) \to \mathsf{Commit}$ | $\mathsf{compute}$ |
| derive_nullifier | $(\mathsf{sk}, \mathsf{Commit}) \to \mathsf{Nullifier}$ | $\mathsf{compute}$ |
| build_spend_proof | $(\mathsf{note}, \mathsf{recipient}, \mathsf{amount}) \to \pi$ | $\mathsf{compute}$ |
| get_pool_state | $\bot \to (\mathsf{root}, n_{\text{unspent}})$ | $\mathsf{read}$ |

compute_commitment is the additively-homomorphic Pedersen commitment $C = g^v h^r$ over the BN254 curve (Barreto & Naehrig, 2006), parameterised by the asset and the amount under a caller-supplied blinding factor $r$. Its output binds the value but reveals nothing about it. The function is observationally pure: $\mathcal{A}$ invoking it in any order with any inputs cannot produce a state delta on chain.
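The hiding and homomorphic properties can be demonstrated with a toy instantiation. This sketch works in the multiplicative group modulo a prime with hand-picked generators; it is cryptographically worthless (the real SDK works over BN254, and in practice $h$ must be derived so that nobody knows $\log_g h$), but the algebra is the same.

```python
# Toy Pedersen commitment C = g^v * h^r mod p, for illustration only.
# Parameters are NOT secure: the real construction lives on the BN254 curve
# and h must be sampled so its discrete log w.r.t. g is unknown.
P = 2**127 - 1          # a Mersenne prime; illustrative modulus
G, H = 5, 7             # stand-in generators

def compute_commitment(value: int, blinding: int) -> int:
    """C = g^v * h^r mod p: binds value, hides it under the blinding factor."""
    return (pow(G, value, P) * pow(H, blinding, P)) % P

# Hiding: the same value under different blinding factors looks unrelated.
c1 = compute_commitment(42, blinding=1234)
c2 = compute_commitment(42, blinding=5678)
assert c1 != c2

# Additive homomorphism: C(v1, r1) * C(v2, r2) == C(v1 + v2, r1 + r2).
lhs = (compute_commitment(3, 10) * compute_commitment(4, 20)) % P
rhs = compute_commitment(7, 30)
assert lhs == rhs
```

Note that nothing in the function touches any state outside its arguments, which is exactly what admits it to the exposed surface.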

derive_nullifier produces $\mathsf{nf} = \mathrm{Poseidon}_2(\mathsf{sk}, C)$, the deterministic single-use nullifier for the note committed to by $C$. Disclosure of $\mathsf{nf}$ proves that some note has been consumed without revealing which one — and the nullifier is single-use precisely because the chainstate enforces uniqueness, not because the SDK does. The SDK is computing a hash; the chain is what makes the hash matter.
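A minimal sketch of the derivation, with SHA-256 standing in for Poseidon (Poseidon is chosen in the real circuit for its arithmetisation-friendliness inside ZK proofs, which does not matter here):

```python
import hashlib

# Nullifier sketch: nf = H(sk, C). SHA-256 is a stand-in for the Poseidon
# hash used by the real circuit; the structure, not the hash, is the point.
def derive_nullifier(sk: bytes, commitment: bytes) -> bytes:
    return hashlib.sha256(sk + commitment).digest()

# Deterministic: the same (sk, C) always yields the same nullifier, which is
# what lets the chain enforce single-use by rejecting duplicate nullifiers.
nf1 = derive_nullifier(b"secret-key", b"commitment-bytes")
nf2 = derive_nullifier(b"secret-key", b"commitment-bytes")
assert nf1 == nf2

# Computing nf is observationally pure: nothing on chain changes until a
# wallet co-signs and broadcasts a transaction that reveals it.
```

The comment in the last two lines restates the division of labour: the SDK computes, the chain enforces.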

build_spend_proof runs the canonical Groth16 prover for the shielded-spend relation, producing a proof bytestring $\pi$ together with a public-input vector. The proof is a mathematical object: it does not move funds. It is inert until packaged into a transaction, signed by a wallet, and submitted to the chain. We take the deliberate position — defended in §4.3 — that the prover is a tool rather than a resource, despite being deterministic.

get_pool_state returns the most recent commitment-tree root and an aggregate count of unspent notes. It is plainly $\mathsf{read}$.

4.2 Tools excluded from $\mathcal{T}_{\text{exposed}}$

The following functions exist in the SDK but are not exposed via MCP. They live behind the wallet $W$:

  • submit_transaction(tx) — broadcasting authority. Lives in the wallet.
  • unlock_key(passphrase) — private-key access. Lives in the hardware-backed keystore.
  • set_pool_endpoint(url) — changes which chain the SDK queries; implicit write because it can redirect future proof construction. Lives in user-mediated config.

A reader will note that submit_transaction is, mechanically, exactly the kind of call an agent often wants to make. We argue that mechanical convenience is the wrong frame: the question is whether the agent should ever have unilateral authority to broadcast. We take the position that it should not.

4.3 Why the prover is a tool, not a resource

In MCP, resources are read-only, cacheable, and side-effect-free; tools are explicit operations the model decides to invoke. A first-pass reading of build_spend_proof makes it look like a resource: the prover is mathematically deterministic, and given the same witness one obtains the same proof modulo the prover’s random tape (the randomisers Groth16 uses to blind the proof elements). Resources fit deterministic functions cleanly.

This reading is wrong in practice. A prover is computationally side-effecting: an order-of-magnitude latency cost (4–8 seconds in our benchmarks for a Groth16 spend proof on consumer hardware\note{TODO: empirical validation — tighten with measured BN254 prover numbers from the zera-sdk-core benchmark suite once it lands.}) makes it unsafe to cache aggressively and re-invoke silently. Resources in MCP are meant to be cheap; a four-to-eight-second resource invoked in an agent loop will exhaust user patience and platform budget long before it exhausts the abstract semantics of the call. The taxonomy, in short, is sensitive to economic facts, not just mathematical ones.

We therefore route the prover behind a tool call, which forces the agent to deliberate about whether to re-invoke it, and we maintain the rule: the prover is a tool, but it is a compute tool, not a write tool.

5. Discussion

5.1 Composability across SDKs

Because the rule is local — admission of $t_i$ depends only on $t_i$’s effect signature, not on the rest of $\mathcal{T}$ — it composes across multiple SDKs that an agent might call in sequence. An agent that holds an MCP connection to zera-mcp (a shielded-pool SDK) and a separate MCP connection to a generic web-search service can have arbitrary cross-call interaction without the asymmetric SDK losing its property: read+compute calls are still read+compute calls.

5.2 Multi-step authority

A common objection: if every authority decision is human-mediated, common multi-step flows (recurring shielded payments, scheduled deposits) become user-hostile. We acknowledge the friction. The asymmetry rule does not forbid wallets from offering delegated authority — a wallet can have its own internal policy that allows the agent to broadcast pre-authorised transactions matching a signed schedule — but it requires the policy to live in the wallet, not in the SDK’s MCP surface.
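A wallet-internal delegation policy of the kind described above might look like the following sketch. The field names, the ScheduleEntry structure, and the matching semantics are all hypothetical; the only claim being illustrated is that the check runs inside the wallet, against material the wallet holds, and never appears on the SDK’s MCP surface.

```python
# Hypothetical wallet-side delegation policy: the wallet pre-authorises a
# signed schedule and checks each agent-proposed broadcast against it.
# Structures and field names are illustrative, not a real wallet API.
from dataclasses import dataclass

@dataclass(frozen=True)
class ScheduleEntry:
    asset: str
    max_amount: int
    recipient: str

def wallet_allows(tx: dict, schedule: list[ScheduleEntry]) -> bool:
    """True iff the proposed transaction matches a pre-authorised entry."""
    return any(
        tx["asset"] == entry.asset
        and tx["amount"] <= entry.max_amount
        and tx["recipient"] == entry.recipient
        for entry in schedule
    )

schedule = [ScheduleEntry("ZERA", 10, "addr-merchant")]
assert wallet_allows({"asset": "ZERA", "amount": 5, "recipient": "addr-merchant"}, schedule)
assert not wallet_allows({"asset": "ZERA", "amount": 50, "recipient": "addr-merchant"}, schedule)
```

Because the schedule lives in the wallet process, auditing it means auditing the wallet, exactly as the next paragraph argues.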

This matters because the wallet is a privileged process with its own user-interface affordances and its own threat model; the SDK’s MCP surface is, by hypothesis, talking to a potentially-compromised agent. The two are not interchangeable. Delegation policy living in the wallet is auditable through the wallet’s own surface; delegation policy living in the SDK is exfiltrable along with the rest of the agent’s prompt context.

5.3 What the rule cannot prevent

The rule does not prevent a compromised agent from:

  1. Constructing a valid-looking commitment whose underlying note belongs to the attacker, then surfacing it to the user as if it were a recipient-supplied address.
  2. Using get_pool_state to time when the user’s wallet is most likely to approve transactions and clustering its social-engineering attempts there.
  3. Producing a stream of proofs that exhaust the user’s prover budget without ever broadcasting.

Defences against (1) rely on display-layer integrity in the wallet (the wallet must show what it is signing). Defences against (2) and (3) rely on rate-limiting and on the wallet’s policy. None of these are eliminated by the asymmetry rule; we claim only that the rule prevents the catastrophic outcome — direct loss of funds — and reduces the rest to known classes of attack with known mitigations.

5.4 Empirical evidence

We have operated zera-mcp against a population of agents during internal red-team exercises since early 2026.\note{TODO: empirical validation — quantify with concrete adversarial-test counts once the red-team report is publishable.} No agent has been able to induce a state delta absent a wallet co-signature, which is the property the asymmetry rule was constructed to deliver. This is consistent with the formal argument above and is necessary but not sufficient evidence: the absence of an exploit during red-teaming does not constitute a security proof.

5.5 Relation to existing privilege-separation work

The asymmetry rule is, at one level of abstraction, an instance of the principle of least authority — a discipline with a long pedigree in operating systems and capability-based security. The contribution of this paper is not the principle itself but the observation that cryptographic SDKs admit a particularly clean factorisation along the principle’s axis, and that the factorisation matches the read/compute vs. write taxonomy already present in the MCP specification.

The closest analogue in the smart-contract literature is the separation of view and state-modifying functions enforced at the language level by Solidity’s view/pure qualifiers and by the EVM’s STATICCALL opcode. A view function in Solidity cannot mutate state, and the EVM enforces this by reverting any attempt to do so from a static context. The proposal here is that the MCP layer of a cryptographic SDK should adopt the same discipline, with the SDK author — not the protocol — enforcing that exposed tools have view-or-pure semantics.

There is a subtle but important difference between the EVM’s static-call discipline and the MCP rule we propose. The EVM enforces the no-write half at runtime through opcode semantics: any attempt to mutate state from a static context reverts, and the compiler additionally requires that a pure function neither reads nor writes chain state while a view function merely never writes it. MCP has no such enforcement, and consequently the discipline must be carried by the SDK author at design time. This makes the rule a code-review artifact rather than a runtime guarantee. We accept this trade-off because the alternative — embedding effect semantics into MCP itself — would expand the protocol’s surface area in ways that are unlikely to be adopted by upstream maintainers in the near term.

5.6 Failure modes specific to ZK SDKs

A handful of failure modes are unique to zero-knowledge SDKs and worth registering explicitly.

The first is witness exfiltration. A shielded-spend witness contains the secret key of the consumed note. If build_spend_proof is implemented naïvely, the witness is held in agent-accessible memory long enough that a compromised agent can extract it. We mitigate this by passing the secret key to the prover via an out-of-band channel (a wallet-managed pipe) rather than as an MCP tool argument; the agent constructs the recipe for a spend, but the wallet supplies the secret material at proof-construction time. The agent never sees the secret. This requires careful API design: build_spend_proof accepts a handle for the note (a non-secret identifier), and the wallet resolves the handle to the actual key material.
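The handle-based API shape can be sketched as follows. The keystore dict, the handle format, and the resolver hook are all hypothetical; the design point is only that the agent passes a non-secret identifier and the wallet injects the key material out of band.

```python
# Handle-based prover API sketch: the agent supplies a non-secret note handle;
# the wallet resolves it to key material out of band, so the witness never
# appears as an MCP tool argument. All names are illustrative.
WALLET_KEYSTORE = {"note-7f3a": b"spend-key-bytes"}   # wallet-private, never on MCP

def build_spend_proof(note_handle: str, recipient: str, amount: int,
                      resolve=WALLET_KEYSTORE.get) -> dict:
    sk = resolve(note_handle)          # out-of-band resolution by the wallet
    if sk is None:
        raise KeyError("unknown note handle")
    # ...the real prover would consume sk here; we return only non-secret outputs...
    return {"proof": f"pi({note_handle},{recipient},{amount})",
            "handle": note_handle}

out = build_spend_proof("note-7f3a", "addr-recipient", 3)
assert "spend-key-bytes" not in str(out)   # secret material never leaves the call
```

A compromised agent can still request proofs it should not, but it can no longer read the witness that backs them.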

The second is commitment confusion. A compromised agent can construct a valid Pedersen commitment to a value the user did not authorise, and then surface it to the user as if it were the agreed-upon commitment. The asymmetry rule does not prevent this — compute_commitment is observationally pure regardless of which value it commits to. The mitigation lives in the wallet’s display layer: the wallet must independently re-derive the commitment for the user-confirmed value and refuse to co-sign if the agent-supplied commitment does not match. This is an instance of the broader principle that user confirmation should be a function of values the user can read, not of opaque cryptographic objects whose contents are obscured.
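The re-derivation check is mechanical once stated. This sketch reuses the toy Pedersen parameters from §4.1’s illustration (insecure, illustrative only): the wallet recomputes the commitment from the values the user confirmed on screen and refuses to co-sign on mismatch.

```python
# Display-layer mitigation sketch: before co-signing, the wallet independently
# re-derives the commitment from user-confirmed values and refuses when the
# agent-supplied commitment differs. Toy parameters, not the BN254 deployment.
P, G, H = 2**127 - 1, 5, 7

def commit(value: int, blinding: int) -> int:
    return (pow(G, value, P) * pow(H, blinding, P)) % P

def wallet_cosign_check(agent_commitment: int, confirmed_value: int,
                        confirmed_blinding: int) -> bool:
    """Co-sign only if the commitment matches what the user actually confirmed."""
    return agent_commitment == commit(confirmed_value, confirmed_blinding)

honest = commit(25, 999)
swapped = commit(2500, 999)   # agent committed to a value the user never saw
assert wallet_cosign_check(honest, 25, 999)
assert not wallet_cosign_check(swapped, 25, 999)
```

This is the "confirmation over readable values" principle in code: the user confirms 25, the wallet derives the opaque object itself.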

The third is proof-system parameter pinning. If the prover circuit can be selected at MCP-call time — for example, by accepting a circuit identifier as an argument — a compromised agent can request a proof against a circuit that does not enforce the constraints the user expected. The mitigation is to remove the choice from the agent: the wallet pins the circuit set, and the SDK refuses to construct proofs against circuits the wallet has not whitelisted. We adopt this in zera-mcp and recommend it as a default for any SDK exposing a configurable prover.
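Circuit pinning reduces to a whitelist check at proof-construction time. The circuit identifiers below are invented for illustration; the shape of the guard is the point.

```python
# Circuit-pinning sketch: the wallet whitelists circuit identifiers, and the
# SDK refuses to construct proofs against anything else, removing the choice
# from the agent. Identifiers are made up for illustration.
PINNED_CIRCUITS = frozenset({"shielded-spend-v2"})

def build_proof(circuit_id: str, witness: dict) -> dict:
    if circuit_id not in PINNED_CIRCUITS:
        raise PermissionError(f"circuit {circuit_id!r} not pinned by wallet")
    # ...real prover would run here against the pinned circuit...
    return {"circuit": circuit_id, "proof": "pi"}

assert build_proof("shielded-spend-v2", {})["circuit"] == "shielded-spend-v2"
```

With the whitelist held by the wallet, a compromised agent cannot steer proving toward a circuit with weakened constraints; at worst it can request proofs the wallet will decline to co-sign.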

6. Conclusion

The Model Context Protocol is now the default tool-calling surface for AI agents (Anthropic, 2024). Cryptographic SDKs that expose themselves via MCP inherit a new principal — the agent — and a new threat model in which the agent may be compromised by adversarial input. We have argued that a small, structural discipline — the asymmetry rule — addresses the most catastrophic class of compromise and is tractable to apply because cryptographic primitives admit a natural read/compute vs. write factorisation that general-purpose SDKs do not.

The rule is opinionated, narrow, and rules out conveniences. We claim that none of those conveniences are worth the catastrophic blast radius they introduce.

References

Anthropic. (2024). Model Context Protocol Specification. https://modelcontextprotocol.io/specification
Barreto, P. S. L. M., & Naehrig, M. (2006). Pairing-Friendly Elliptic Curves of Prime Order. In Selected Areas in Cryptography (SAC 2005) (Vol. 3897, pp. 319–331). Springer. https://doi.org/10.1007/11693383_22
Grassi, L., Khovratovich, D., Rechberger, C., Roy, A., & Schofnegger, M. (2021). POSEIDON: A New Hash Function for Zero-Knowledge Proof Systems. 30th USENIX Security Symposium (USENIX Security 21), 519–535. https://www.usenix.org/conference/usenixsecurity21/presentation/grassi
Greshake, K., Abdelnabi, S., Mishra, S., Endres, C., Holz, T., & Fritz, M. (2023). Not What You’ve Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection. arXiv preprint arXiv:2302.12173. https://arxiv.org/abs/2302.12173
Groth, J. (2016). On the Size of Pairing-based Non-interactive Arguments. Advances in Cryptology – EUROCRYPT 2016, Part II, 9666, 305–326. https://doi.org/10.1007/978-3-662-49896-5_11
Pedersen, T. P. (1992). Non-Interactive and Information-Theoretic Secure Verifiable Secret Sharing. Advances in Cryptology – CRYPTO ’91, 576, 129–140. https://doi.org/10.1007/3-540-46766-1_9
Willison, S. (2025). The Lethal Trifecta: When LLM Agents Combine Private Data, Untrusted Content, and External Communication. Simonwillison.Net (Essay). https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/