RFC 003: Rusty Pipes detection heuristic
A runtime + install-time heuristic for flagging suspect Rust binaries shipped via npm. Builds on the Rusty Pipes posts; aimed at a CI gate or an `npm audit`-class tool.
Summary
The Rusty Pipes series documented a class of npm supply-chain attacks where a maintainer ships a small, ostensibly benign JavaScript package whose postinstall (or, more sneakily, runtime require) loads a precompiled Rust .node / .dylib / .so binary, and that binary does the malicious work — out of view of every JS-aware static analysis tool. The follow-up post (Rust in Peace) walked through how to actually build such a payload, and the exploit post showed it working end-to-end.
This RFC defines a heuristic for flagging suspect Rust binaries in npm packages. It is not a malware scanner. It is a “this package contains a native artefact whose provenance does not match its claims” detector, suitable for running as a CI gate or as a one-shot CLI before npm install.
Motivation
Pure-JS supply-chain attacks (event-stream, ua-parser-js, colors, faker) are detectable by static analysis: the payload is JS, you can grep it, you can run it under a sandbox. Native attacks are not: a .node file is opaque to every JS-ecosystem tool. The maintainer can ship a benign JS surface, a malicious binary, and npm audit is silent about both.
The Rusty Pipes attack surface, in order of decreasing audit visibility:
postinstallscript that downloads a binary. Detectable: the script is JS, you can read it. Mitigation:npm install --ignore-scriptsis a known posture.postinstallscript that compiles from source. Detectable: same as above. Plus, the build is reproducible from source.- Prebuilt binary committed to the package. Opaque. The package contains a
.nodeor platform-specific binary, thepackage.jsonmain(or arequire()deeper in the tree) loads it, and the binary’s behaviour is invisible to JS-aware tools. - Conditional binary load. The binary is loaded only when a specific environment is detected — the runtime is Node 20, the platform is
linux-x64, the host hasgitinstalled, etc. Even dynamic analysis undernpm installmay not exercise the bad path.
This RFC focuses on (3) and (4). The goal is not to prove a binary is malicious — that is undecidable — but to make shipping an unsanctioned native binary noisy enough that maintainers and downstream auditors notice.
Detailed design
The heuristic runs in two modes:
- Install-time (preferred): hooks into the package install pipeline (a wrapper around
npm/pnpm/yarninstall, or a pre-publish hook in CI). - Audit-time (always available): scans an extracted package tarball or a
node_modulestree.
In both modes the input is a package directory. The output is a verdict: clean, notable, flagged. Notable and flagged emit human-readable findings.
Signal 1: native artefact present without declared binary field
Packages that legitimately ship native code (@napi-rs/*, node-gyp projects, neon-rs projects) declare so in package.json — typically a binary field, an engines.node constraint, an optionalDependencies set of platform packages, or a napi field.
If the package contains any of *.node, *.dylib, *.so, *.dll, *.wasm and none of the legitimacy markers above is present, that is a Signal 1 hit. Severity: notable.
False-positive class: legacy packages predating the napi/neon conventions. Mitigation: maintain a small allowlist of packages that ship native artefacts the old-fashioned way (e.g. fsevents, bufferutil, utf-8-validate).
Signal 2: native artefact’s claimed source not present
If a package ships *.node but no Cargo.toml, binding.gyp, build.rs, src/lib.rs, or equivalent, the binary cannot be rebuilt from the package. That is suspicious because:
- A user who wants to audit the binary cannot reproduce it.
- A user who wants to harden their build by re-compiling from source cannot.
- The npm registry has no way to verify the binary matches the source.
Severity: notable. Promoted to flagged if combined with Signal 4 or Signal 5.
Signal 3: binary architecture mismatch
A linux-x64 package containing a darwin-arm64 binary is at minimum a packaging mistake; at worst it is a maintainer accidentally publishing their dev-machine binary in place of the CI-built one. The heuristic uses file(1) (or its magic-bytes Node port) to read the Mach-O / ELF / PE header and compare against the expected platform from the os and cpu fields in package.json and the platform-suffixed package name.
Severity: notable. Promoted to flagged if the actual architecture is not declared by any optionalDependencies entry, because at that point the binary is for an architecture the package does not claim to support.
Signal 4: binary imports suspicious symbols
nm-style symbol enumeration (or for stripped binaries, string extraction) for known network-egress, process-spawn, and crypto-stealer symbols:
socket,connect,getaddrinfo,https?_request_*— outbound network from a binary that does not advertise itself as a network client is a strong signal.execve,posix_spawn,CreateProcess— process spawn.keytar,wallet.dat,id_rsa,~/.aws,~/.config/solana,id.json(Solana wallet path) — string constants pointing at known credential store paths.fetch_proxy_settings,BrowserData— Chromium-profile-stealer markers.
Severity: flagged. The presence of any one of these in a package whose stated job is “format dates” or “parse YAML” is the actual smoking gun.
Signal 5: binary is dynamically loaded by a non-obvious code path
Walk the JS surface of the package looking for require(...), import(...), and process.dlopen calls whose argument resolves at runtime to a native artefact. If the load is gated by:
- An environment variable check (
if (process.env.NODE_ENV === 'production') require('./bin/x.node')) - A platform check whose set of supported platforms is suspiciously narrow (only
linux-x64) - A version check on Node, OpenSSL, or a peer dep
the heuristic flags it as notable. The Rusty Pipes exploit specifically used a Node-version gate to avoid running in CI matrices that test on multiple Node versions.
Severity: notable. Combined with Signal 4, flagged.
Signal 6: maintainer churn at the boundary of a binary-shipping release
This signal needs registry metadata, not just the package contents. The heuristic looks at:
- The set of npm publishers who have ever published the package.
- Whether the most recent publisher is publishing for the first time.
- Whether the most recent release is the first to ship a
*.nodefile. - Whether the version bump that introduced the binary was a patch-level bump (i.e. the maintainer who is shipping native code for the first time is doing so on a
1.4.7 → 1.4.8release, not a major).
Hit-conjunction (new publisher) ∧ (first binary-shipping release) ∧ (patch bump) is flagged. This is the exact attacker pattern from the exploit post.
Reporting
Each finding is a JSON object:
{
"package": "[email protected]",
"signal": "S4",
"severity": "flagged",
"evidence": {
"binary": "lib/native/linux-x64/colorstring.node",
"symbols": ["getaddrinfo", "execve", "wallet.dat"]
},
"explainer": "https://docs.example.com/rusty-pipes/S4"
}
The CLI exits non-zero on any flagged finding so it can be wired into CI directly.
Alternatives considered
A1: Sandbox postinstall and execute the binary in a tracer
Run the package install + a require() of the package’s main inside a seccomp/sandbox that records syscalls.
Pros: dynamic — sees what the binary actually does at runtime.
Cons: defeated by environment-gated payloads (Signal 5). Also expensive: every npm install would take seconds longer. And: a sandbox bypass is itself a research target.
Status: complementary, not alternative. The dynamic analysis is a great addition to the static heuristics; it is not a substitute, because a binary that does nothing in a sandbox could still be malicious in production.
A2: Reproducible-build attestation
Require every package shipping a .node to ship a SLSA-style attestation pointing at the source commit and the build environment.
Pros: long-term correct answer. Cons: requires ecosystem-wide adoption. A heuristic that works today on packages that lack attestation is needed in the meantime.
Status: this RFC is the heuristic; SLSA is the future. They coexist.
A3: Block all .node files in packages that do not declare native code
The strict version of Signal 1.
Pros: maximally defensive.
Cons: breaks legacy packages, breaks experimentation, breaks bufferutil-class packages that pre-date napi.
Status: rejected as a default; available as an opt-in --strict mode.
Drawbacks
- False positives are real. Legitimate native packages that don’t follow napi conventions will trip Signal 1. The allowlist needs maintenance.
- Symbol enumeration on stripped binaries is weak. Strings extraction catches the
wallet.dat-class signal but not thegetaddrinfo-class signal if the binary statically links a custom DNS resolver. Mitigation: combine with Signal 6 (publisher churn). - Signal 6 needs registry metadata. That requires a privileged or rate-limited path to the npm registry. Audit-time-on-tarball mode cannot compute Signal 6.
Open questions
- What is the exit-code policy? Should
flaggedalways exit non-zero, or should there be a--no-failflag for CI surveys that want to inventory rather than gate? Leaning toward “exit non-zero by default,--no-failfor inventory mode”. - Where does this live? As a standalone CLI? An
npm auditplugin? A pre-publish hook for maintainers to self-check? Probably all three, but the standalone CLI ships first. - How does it handle
optionalDependenciesplatform packages? A package like@foo/native-linux-x64is a sibling package containing only the binary. The heuristic needs to scan those, not just the parent. Plumbing TBD. - Should it also flag pure-JS obfuscation patterns? Out of scope for this RFC — that is a different attack class, well-covered by existing tools.
Adoption
This RFC is draft. The progression to proposed requires:
- A reference implementation that runs each signal against a corpus of known-good packages (top-1000 npm) and the reproductions from the Rusty Pipes series. Target false-positive rate <1% on the corpus, true-positive rate >95% on the reproductions.
- A worked example showing the heuristic flagging the Rusty Pipes exploit binary while not flagging
@napi-rs/canvas. - A documented allowlist process.
The progression to accepted requires a shipping CLI and at least one CI integration in an external project.
Security considerations
- The heuristic is a target. Once attackers know what we look for, they will adapt. Signal 4 in particular is easily evaded by symbol stripping plus indirect call patterns. Signal 6 cannot be evaded without compromising additional npm accounts, which is the actual attacker work-factor we want to raise.
- Running the heuristic itself is a potential attack surface. It walks tarballs and reads binaries. An attacker could craft a malformed
.nodeto exploit our binary parser. Mitigation: use a hardened parser library (e.g.goblin’s pure-Rust object-file parser) and run the audit in a process with no network access. - Reporting to a central service is out of scope. A future version might centralise findings to track ecosystem health, but that introduces telemetry concerns and is not in this RFC.
References
- Rusty Pipes — the original threat model.
- Rust in Peace: How to Hijack Node.js with a Single Require — the build of the malicious package.
- Rusty Pipes Exploit — the working end-to-end exploit.
- npm RFC 0017: package signing — adjacent ecosystem work.
- SLSA build levels — the long-term provenance answer.