One Vulnerable MCP Server, 100,000 Pulls: The Supply Chain Attack Class Nobody Is Indexing
The Phoenix Security Malware Package Intelligence corpus, published in June 2026, indexes 657 malicious package-versions across 59 supply chain attack campaigns between June 2024 and June 2026. The campaigns span npm, PyPI, VS Code extensions, and — the fastest-growing category — AI agent dependencies: MCP servers, agent skills, LLM tool packs, marketplace plugins. A single vulnerable PostgreSQL MCP server image was pulled more than 100,000 times before its supply chain compromise was disclosed.
The number is the crisis. A traditional npm package compromise affects the applications that import the package. An MCP server compromise affects every agent that registers the server as a tool. The agent's blast radius is the union of every tool it has ever pulled, plus every transitive dependency of every tool, plus every configuration file the tool reads, plus every remote endpoint the tool calls. The supply chain for an AI agent is not a list of packages. It is a graph of capabilities, each node of which can subvert the agent.
The organizations that will defend their AI agents against supply chain compromise in 2026 are the ones that recognize the supply chain as a first-class security domain, treat MCP servers and tool packs as code dependencies with provenance requirements, and build the verification machinery before the next campaign lands. The Trend Micro "Pwning Agentic AI" research from May 2026 demonstrated that the attacks are not theoretical. The vulnerable PostgreSQL MCP server used in the proof-of-concept was pulled more than 100,000 times from a public registry. Each pull is a potential compromise.
Why the AI Agent Supply Chain Is Different
The traditional software supply chain is well understood. A package is published to a registry; consumers declare a dependency on the package; the package's code is incorporated into the consumer's build; the consumer ships a binary that includes the package's behavior. The security boundary is the package's code: a malicious package executes malicious code, and the malicious behavior is observable through code review, runtime analysis, or behavioral detection.
The AI agent supply chain is structurally different. The "package" (an MCP server, an agent skill, a tool pack) is not code that the agent incorporates. It is a service that the agent connects to. The agent's runtime calls the package's endpoints; the package executes in its own process; the agent receives the result. The malicious behavior may execute entirely within the package's process, never touching the agent's code. Code review of the agent finds nothing, because the agent's code is not what is compromised. The package is.
The difference has three operational consequences.
Static analysis misses the attack surface. The agent's source code may be clean; the malicious behavior is in the third-party service. Static analysis tools that scan the agent's code do not see the package's behavior. The attack surface is invisible to the security tooling that the organization already has.
Runtime analysis has to observe the package, not just the agent. To detect malicious behavior, the runtime must observe what the package actually does: what endpoints it calls, what data it exfiltrates, what commands it executes. Because the package runs in its own process, the practical instrumentation point is the agent's tool invocation boundary, where requests are sent to the package and its responses are received and acted upon.
The supply chain is a graph, not a list. An MCP server may depend on another MCP server; a tool pack may load skills from a marketplace; a marketplace may pull packages from a registry. The transitive graph can be deep. The agent's effective dependency graph includes every node in the transitive closure, not just the direct dependencies declared in configuration.
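The transitive closure described above can be sketched as a breadth-first walk over a dependency map. This is a minimal illustration, not a real resolver: the `DEPENDS_ON` map and every tool name in it are hypothetical, and a production graph would have to be assembled from registry metadata and observed runtime behavior.

```python
from collections import deque

# Hypothetical dependency edges: each node maps to the tools, packages,
# or endpoints it pulls in. All names are illustrative only.
DEPENDS_ON = {
    "agent": ["postgres-mcp", "skills-pack"],
    "postgres-mcp": ["pg-driver", "telemetry-endpoint"],
    "skills-pack": ["marketplace-skill-a", "postgres-mcp"],
    "marketplace-skill-a": ["registry-pkg-x"],
}


def effective_dependencies(root, edges):
    """Return every node reachable from root (the transitive closure)."""
    seen = set()
    queue = deque(edges.get(root, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already visited; this also guards against cycles
        seen.add(node)
        queue.extend(edges.get(node, []))
    return seen


direct = set(DEPENDS_ON["agent"])
transitive = effective_dependencies("agent", DEPENDS_ON)
hidden = sorted(transitive - direct)  # nodes that no agent config declares
```

In this toy graph the agent declares two dependencies but effectively trusts six nodes; the four in `hidden` are the ones a configuration-only review would not surface.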
The traditional software supply chain defenses — SBOM, dependency scanning, vulnerability monitoring — extend to AI agent supply chains, but only with significant adaptations. The SBOM must include tool packages, not just code packages. The dependency scanning must cover tool registries, not just package registries. The vulnerability monitoring must include MCP server CVEs, not just library CVEs.
The Attack Classes in 2026
The Phoenix Security corpus identifies four attack classes that account for the majority of 2026 AI agent supply chain compromises.
Class 1: Malicious MCP server with hidden capabilities. The server is published to a public registry, appears legitimate (correct name, plausible description, working documentation), but contains hidden functionality: a backdoor endpoint, a credential exfiltration routine, a sandbox escape. The agent's developer installs the server as a tool; the agent's runtime calls the server; the server's hidden functionality executes in the server's process, exfiltrating whatever the agent sends to it.
Class 2: Rug-pull MCP server. The server starts legitimate. After a defined period (often after the install base has grown to a meaningful size) or after a specific trigger (a calendar date, a version bump), the server's behavior changes. A new version adds malicious functionality; the auto-update mechanism installs the new version; the agent now calls a malicious service. The Microsoft research on MCP tool poisoning (covered in the Facio analysis from June 2026) documented this pattern in detail.
Class 3: Dependency confusion in tool registries. An attacker publishes a tool to a third-party registry under the same name as a legitimate tool in the official or internal registry. The agent's configuration references the tool by name; the resolver gives the third-party registry's package precedence over the legitimate one (a dependency confusion attack); the agent connects to the attacker's server instead of the legitimate one.
Class 4: Configuration injection in agent skills. The agent's skill files (the instructions, the tool definitions, the parameter schemas) are loaded from a repository. An attacker compromises the repository (or the maintainer's account) and modifies the skill files. The agent loads the modified files; the modified instructions steer the agent toward malicious actions; the modified tool definitions expose additional attack surface. The skill is the supply chain attack.
Each class requires a different defensive control. A malicious MCP server is caught by provenance verification and runtime behavioral analysis. A rug-pull server is caught by tool pinning and update verification. Dependency confusion is caught by registry allowlisting. Configuration injection is caught by skill file integrity checking.
The AI-BOM: Extending the SBOM
The Software Bill of Materials (SBOM) is the standard artifact for software supply chain transparency. The AI-BOM — the AI Bill of Materials — extends the SBOM concept to AI agent dependencies.
The AI-BOM includes not just the code packages the agent depends on, but the tool packages, the MCP servers, the marketplace skills, the model checkpoints, the prompt templates, and the configuration files. Each entry in the AI-BOM includes: the package name, version, source registry, content hash, signature, license, last-vetted timestamp, and known vulnerabilities.
The AI-BOM is the artifact that allows an organization to answer the question: "what tools does this agent depend on, and what is the supply chain risk?" Without the AI-BOM, the answer requires manual investigation across dozens of registries, repositories, and configuration files. With the AI-BOM, the answer is a single query.
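To make the "single query" idea concrete, here is a minimal sketch of an AI-BOM entry and one risk query over it. The field names follow the list in the text; the `kind` field, the 90-day staleness threshold, and all sample values (names, hosts, hashes) are assumptions for illustration and do not follow any published AI-BOM schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


@dataclass(frozen=True)
class AIBOMEntry:
    """One dependency record, using the fields listed in the text."""
    name: str
    version: str
    kind: str                      # assumed field: "mcp-server", "skill", ...
    source_registry: str
    content_hash: str
    signature: Optional[str]       # None if the publisher did not sign
    license: str
    last_vetted: datetime
    known_vulnerabilities: Tuple[str, ...] = ()


def risky_entries(bom, now, max_age=timedelta(days=90)):
    """Entries that are unsigned, stale, or carry known vulnerabilities."""
    return [
        e for e in bom
        if e.signature is None
        or e.known_vulnerabilities
        or now - e.last_vetted > max_age
    ]


# Placeholder data; hashes and signatures are not real values.
now = datetime(2026, 6, 30, tzinfo=timezone.utc)
bom = [
    AIBOMEntry(name="postgres-mcp", version="1.4.2", kind="mcp-server",
               source_registry="registry.example.internal",
               content_hash="sha256:placeholder-a", signature="sig-placeholder",
               license="MIT",
               last_vetted=datetime(2026, 6, 1, tzinfo=timezone.utc)),
    AIBOMEntry(name="report-skill", version="0.3.0", kind="skill",
               source_registry="marketplace.example.com",
               content_hash="sha256:placeholder-b", signature=None,
               license="Apache-2.0",
               last_vetted=datetime(2026, 1, 15, tzinfo=timezone.utc)),
]
flagged = [e.name for e in risky_entries(bom, now)]
```

Here `flagged` contains only `report-skill`, which is both unsigned and last vetted more than 90 days before `now`.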
The AI-BOM is becoming a regulatory requirement. The EU AI Act's high-risk system provisions, the NIST AI RMF agentic extensions, and several national cybersecurity agencies have begun referencing AI-BOM disclosure in their guidance. By 2027, "list every MCP server and tool source your agent depends on" will be a routine line on enterprise security questionnaires. Organizations that cannot produce the AI-BOM will lose deals; organizations that can produce it will have a competitive advantage.
The Six Defensive Controls
A production AI agent supply chain defense has six controls. Each addresses one or more of the attack classes above.
1. Tool registry allowlisting. The agent's runtime maintains an allowlist of approved registries from which tool packages can be installed. Tools outside the allowlist cannot be loaded. The allowlist includes the official MCP server registry, the organization's internal tool registry, and any third-party registries that have been vetted. The allowlist is enforced at the runtime, not at the configuration; the agent cannot load a tool from a non-allowlisted registry even if the configuration references it.
2. Tool pinning and signature verification. Each tool package is pinned to a specific version and content hash, and carries a cryptographic signature from its publisher. The hash and signature are verified before the tool is loaded. A tool whose hash does not match the recorded value, or whose signature does not verify, is rejected. The pinning prevents silent updates; the signature verification prevents tampering.
3. Provenance verification with attestation. Beyond signatures, the tool's provenance is verified through attestation: the tool was built by the claimed publisher, from the claimed source, in the claimed environment. Attestation frameworks (in-toto, SLSA, sigstore) provide the cryptographic chain from source to binary. A tool without verified attestation is treated with reduced trust.
4. Sandbox isolation for tool execution. Tool execution occurs in a sandbox with no access to host resources, secrets, or unrelated network destinations. The sandbox is the containment boundary; unless a malicious tool defeats the sandbox itself, it cannot affect the agent's runtime or other systems.
5. Runtime behavioral baselining. Each tool has a behavioral baseline: which endpoints it calls, what data it accesses, what patterns of behavior it exhibits. The runtime monitor observes the tool's actual behavior and flags deviations. A tool that suddenly starts reading credential files, or that begins calling a new external endpoint, is flagged for review.
6. Configuration integrity checking. Agent skill files, tool definitions, and configuration files are integrity-checked at load time. The integrity check uses cryptographic hashes recorded at deployment time. A modified configuration is detected and rejected before the agent loads it.
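Controls 1, 2, and 6 share a common shape: check the source against an allowlist, then compare a content hash against a value recorded at vetting time. The sketch below shows that shape using only the standard library. It is an assumption-laden illustration: the registry hostnames, tool names, and `PINS` table are hypothetical, and it checks a SHA-256 pin only. Real signature and attestation verification (for example with sigstore tooling) is deliberately left out.

```python
import hashlib
import hmac
from urllib.parse import urlparse

# Hypothetical allowlist and pin table; hostnames and tool names are
# placeholders, not real registries.
ALLOWED_REGISTRIES = {"registry.example.internal", "mcp.example.org"}

PINS = {
    # (tool name, version) -> SHA-256 recorded when the tool was vetted
    ("postgres-mcp", "1.4.2"): hashlib.sha256(b"vetted artifact bytes").hexdigest(),
}


class ToolRejected(Exception):
    """Raised when a tool fails the allowlist or integrity check."""


def check_tool(name, version, source_url, artifact):
    """Raise ToolRejected unless the artifact passes both checks."""
    host = urlparse(source_url).hostname
    if host not in ALLOWED_REGISTRIES:
        raise ToolRejected(f"registry not allowlisted: {host}")

    expected = PINS.get((name, version))
    if expected is None:
        raise ToolRejected(f"no pin recorded for {name}=={version}")

    actual = hashlib.sha256(artifact).hexdigest()
    # compare_digest performs a constant-time comparison of the digests
    if not hmac.compare_digest(actual, expected):
        raise ToolRejected(f"content hash mismatch for {name}=={version}")
```

A call such as `check_tool("postgres-mcp", "1.4.2", "https://registry.example.internal/postgres-mcp", artifact_bytes)` returns silently when both checks pass and raises otherwise. The same hash comparison applies to skill files and tool definitions at load time, with the pin table keyed by file path instead of by tool version.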
These six controls together form the supply chain defense. The controls overlap in coverage (a rug-pull server may be caught by tool pinning, flagged by behavioral baselining, or contained by sandbox isolation); the overlap is a feature, not a bug. The result is defense-in-depth.
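Behavioral baselining (control 5) can be illustrated as a comparison between a recorded baseline and the events a monitor observes. This is a rough sketch under stated assumptions: the `(kind, target)` event format, the `Baseline` fields, and all hostnames and paths are hypothetical, and the prefix match on file paths is naive (a real monitor would normalize paths first).

```python
from dataclasses import dataclass, field


@dataclass
class Baseline:
    """Behavior recorded for a tool while it was being vetted."""
    endpoints: set = field(default_factory=set)
    path_prefixes: tuple = ()


def deviations(baseline, observed_events):
    """Return the observed events that fall outside the baseline.

    Each event is a (kind, target) pair; kind is "network" or "file".
    """
    flagged = []
    for kind, target in observed_events:
        if kind == "network":
            ok = target in baseline.endpoints
        elif kind == "file":
            # str.startswith accepts a tuple; an empty tuple matches nothing
            ok = target.startswith(baseline.path_prefixes)
        else:
            ok = False  # unknown event kinds are treated as deviations
        if not ok:
            flagged.append((kind, target))
    return flagged


baseline = Baseline(
    endpoints={"db.example.internal:5432"},
    path_prefixes=("/tmp/postgres-mcp/",),
)
events = [
    ("network", "db.example.internal:5432"),
    ("file", "/tmp/postgres-mcp/cache.json"),
    ("file", "/home/agent/.aws/credentials"),
    ("network", "collector.example.net:443"),
]
alerts = deviations(baseline, events)
```

With these sample events, `alerts` holds the credential file read and the call to the unfamiliar endpoint, the two deviations the text names as triggers for review.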
The Runtime Implementation
The runtime is the right place to implement the supply chain defense for two reasons.
The runtime sees the tool invocation. When the agent calls a tool, the runtime is in the call path. The runtime can verify the tool's signature, check the tool's behavioral baseline, and enforce the sandbox boundary — all at the moment of invocation. The verification is in the critical path, not in a separate security scan that may be bypassed.
The runtime produces the AI-BOM. The runtime records every tool that has been registered, every tool that has been called, and every version that has been loaded. The recording is the source of the AI-BOM. The runtime can produce the AI-BOM on demand, or continuously emit AI-BOM updates to a security information and event management (SIEM) system.
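Both reasons can be seen in one small wrapper that sits in the call path. The sketch below is hypothetical and is not Facio's API or any real runtime's interface: `ToolGateway`, its method names, and the `verify` callback are invented for illustration. It shows verification happening at invocation time and the audit records doubling as the raw material for an AI-BOM.

```python
import json
from datetime import datetime, timezone


class ToolGateway:
    """Hypothetical call-path wrapper: every tool call goes through invoke()."""

    def __init__(self, verify):
        self._verify = verify   # callable(name, version) -> bool
        self._tools = {}        # name -> (version, callable)
        self._events = []       # append-only audit records

    def _record(self, event, name, version):
        self._events.append({
            "event": event,
            "tool": name,
            "version": version,
            "at": datetime.now(timezone.utc).isoformat(),
        })

    def register(self, name, version, fn):
        self._tools[name] = (version, fn)
        self._record("registered", name, version)

    def invoke(self, name, *args, **kwargs):
        version, fn = self._tools[name]
        # Verification runs on every call, before the tool is executed.
        if not self._verify(name, version):
            self._record("blocked", name, version)
            raise PermissionError(f"{name}=={version} failed verification")
        self._record("called", name, version)
        return fn(*args, **kwargs)

    def ai_bom(self):
        """Distinct (tool, version) pairs seen so far, as a JSON string."""
        seen = sorted({(e["tool"], e["version"]) for e in self._events})
        return json.dumps([{"tool": t, "version": v} for t, v in seen])
```

In use, `verify` would wrap whatever pin, signature, or attestation check the runtime applies, and the event list could be streamed to a SIEM rather than held in memory.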
Facio (the HITL-first agent runtime) implements the supply chain defense at the runtime layer. Tool registration includes signature verification and provenance attestation; tool execution occurs in a sandboxed environment; behavioral baselining is part of the runtime monitor; configuration integrity checking is enforced at skill load time; and the AI-BOM is emitted as part of the audit trail. The implementation is the architectural commitment to supply chain security as a runtime concern, not a build-time concern.
Placet.io (the HITL inbox and messenger) complements Facio at the human review points. When a tool's behavior deviates from the baseline, when a new tool version is detected, when a provenance verification fails — Placet.io delivers the review request with full context.
The Open Question: Tool Marketplace Trust
The tool marketplace model — ClawHub, OpenClaw, the various agent skill marketplaces — is the open question in 2026 supply chain security. The marketplaces accelerate tool distribution; they also accelerate tool compromise. A malicious tool published to a marketplace can reach thousands of agents within hours.
The marketplaces are responding with verification programs, but the verification is post-hoc: tools are reviewed after publication, not before. Pre-publication verification would require the marketplace to perform deep code review and runtime analysis on every tool, which does not scale to marketplace volumes.
The defense for organizations consuming marketplace tools is the same as for any third-party dependency: provenance verification, sandbox isolation, behavioral baselining. The marketplace is the distribution channel; the runtime is the trust boundary.
The open question will not be answered in 2026. The marketplace ecosystem will continue to evolve, the verification programs will mature, and the supply chain attacks will continue. The defense is the runtime, not the marketplace. The runtime that enforces the six controls is the runtime that survives the supply chain.
The Bottom Line
The AI agent supply chain is the fastest-growing supply chain attack category in 2026. The Phoenix Security corpus documents 59 campaigns across two years, with 657 malicious package-versions indexed. A single vulnerable PostgreSQL MCP server was pulled more than 100,000 times before disclosure. The agent's blast radius is the union of every tool it has ever pulled — a graph of capabilities, each node a potential compromise vector.
The defense is six controls at the runtime layer: tool registry allowlisting, tool pinning and signature verification, provenance verification with attestation, sandbox isolation, runtime behavioral baselining, and configuration integrity checking. The controls extend the traditional software supply chain defenses to the AI agent domain, with adaptations for the unique properties of tool-based dependencies.
The AI-BOM is the artifact that makes the defense auditable. The AI-BOM catalogs every tool package, every MCP server, every marketplace skill, every model checkpoint, and every configuration file that the agent depends on. The AI-BOM is becoming a regulatory requirement; by 2027, it will be a routine line on enterprise security questionnaires.
The organizations that will defend their AI agents against supply chain compromise in 2026 are the ones that treat the supply chain as a first-class security domain, implement the six controls at the runtime layer, and produce the AI-BOM as a matter of course. The alternative is the next campaign, the next 100,000 pulls of a compromised server, and the next incident report that names the agent as the breach vector.
Facio (the HITL-first agent runtime) implements the supply chain defense. Placet.io (the HITL inbox and messenger) delivers the human review workflow at the supply chain decision points. Together, they are the architecture for supply chain security in the age of AI agents.
Further reading:
- Phoenix Security: Supply Chain Attacks 2026 — npm, PyPI, VS Code, AI Agents
- Trend Micro: Pwning Agentic AI — Your AI Agent Is Already Compromised
- Safeguard: AI Agent Supply Chain Attacks — 2026 Trend Watch
- CyberDesserts: AI Agent Security Risks 2026 — MCP, OpenClaw & Supply Chain
- Tool Poisoning Is the New Prompt Injection: The MCP Attack Class Hiding in Plain Sight
- AI Agent Runtime Guardrails: Why Policy at the Model Layer Fails and Policy at the Execution Layer Wins