MCP Spotlight: Cloudflare API MCP Server — 2 Tools, 2,500+ Endpoints, One Search-and-Execute Pattern
Server: @cloudflare/mcp-server-api by Cloudflare
Tools: 2 (search, execute) · License: MIT · Transport: stdio
Coverage: 2,500+ Cloudflare API endpoints — DNS, Workers, KV, R2, D1, Pages, Firewall, Load Balancers, Stream, Images, AI Gateway, Vectorize, Access, Zero Trust, Email Routing, Queues, Durable Objects, Hyperdrive, Analytics
Auth: Cloudflare API Token with scope-minimal permissions
Docs: developers.cloudflare.com/.../cloudflare/servers-for-cloudflare
GitHub: github.com/cloudflare/mcp
Every major cloud platform has dozens of product APIs, hundreds of endpoints per product, and thousands of configuration parameters. The "wrap the entire API in MCP" problem usually gets solved by exposing hundreds of tools — one per endpoint, one per resource type, one per action. The agent gets a 300-tool dump in its context, performance degrades, and the user gives up.
Cloudflare took the opposite bet: expose 2,500+ API endpoints through exactly 2 tools, search() and execute(). The agent searches for the right endpoint, then executes the API call. The MCP tool surface stays at 2 regardless of how many Cloudflare products exist.
This is the design pattern every "wrap the whole platform" MCP server should copy. It's also the cleanest example I've seen of API-as-MCP-via-search — the same approach that Anthropic uses internally for Claude's tool-use with large API surfaces.
The Two-Tool Pattern
| Tool | What It Does |
|---|---|
| search | Search the Cloudflare API schema by keyword, resource type, or product. Returns the matching endpoint, parameters, and example payloads. |
| execute | Execute a Cloudflare API call with the method, URL, and parameters. Returns the API response. |
That's the entire MCP surface. The agent's mental model is:
1. "I need to do X with Cloudflare."
2. search("X") → endpoint, parameters, schema
3. execute(endpoint, params) → result
No 300-tool dump. No tool-name discovery. No per-product configuration. Just search and execute.
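The loop above can be sketched in a few lines of Python. This is a toy model, not the server's implementation: the `Endpoint` class, the three-entry `CATALOG`, and the keyword matching are all assumptions made for illustration, and `execute` only resolves the path instead of sending an HTTP request.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    method: str
    path: str
    description: str


# Hypothetical three-entry catalog; a real server would index the full API schema.
CATALOG = [
    Endpoint("GET", "/zones", "List zones"),
    Endpoint("POST", "/zones/{zone_id}/dns_records", "Create DNS record"),
    Endpoint("GET", "/accounts/{account_id}/workers/scripts", "List Workers"),
]


def search(query: str) -> list:
    """Return catalog entries whose description contains every query word."""
    words = query.lower().split()
    return [e for e in CATALOG if all(w in e.description.lower() for w in words)]


def execute(endpoint: Endpoint, params: dict) -> dict:
    """Stand-in for the HTTP call: fill in the path and echo the request."""
    return {"method": endpoint.method, "url": endpoint.path.format(**params)}


# Step 1: "I need to create a DNS record." Step 2: search. Step 3: execute.
match = search("dns record")[0]
result = execute(match, {"zone_id": "abc123"})
print(result)
```

The point of the sketch is that the agent only ever sees two callables; everything product-specific lives in the data that `search` reads.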
Why This Pattern Works
The "wrap the whole platform" problem has three failure modes that the two-tool pattern avoids:
1. Context bloat
A 300-tool tools/list response burns tens of thousands of tokens before the user even says "hi." On long-running agent sessions, the model has to re-read those tool descriptions every turn. The 2-tool surface stays at a constant ~1,000 tokens regardless of Cloudflare's product portfolio.
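A rough back-of-the-envelope calculation shows where "tens of thousands of tokens" can come from. The per-tool figure below is an assumption chosen for illustration, not a measurement; only the 300-tool count and the ~1,000-token estimate come from the text.

```python
# All figures are illustrative assumptions, not measurements.
TOKENS_PER_TOOL = 150      # assumed average size of one tool description plus schema
PER_ENDPOINT_TOOLS = 300   # the "300-tool dump" from the text
TWO_TOOL_SURFACE = 1_000   # the ~1,000-token estimate for search + execute


def listing_cost(tool_count: int, tokens_per_tool: int = TOKENS_PER_TOOL) -> int:
    """Tokens consumed by one tools/list payload under the assumed average."""
    return tool_count * tokens_per_tool


per_endpoint = listing_cost(PER_ENDPOINT_TOOLS)
ratio = per_endpoint // TWO_TOOL_SURFACE
print(per_endpoint)  # 45000
print(ratio)         # 45
```

Under these assumptions the per-endpoint design costs about 45 times more context than the two-tool surface before any work starts.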
2. Tool name discovery
When a tool is named cloudflare_workers_kv_namespace_write_key_value_pair_v4, the agent has to guess the right name from a giant registry. With the search-tool approach, the agent queries for what it needs at the moment it needs it, using a natural-language description. The agent doesn't memorize tool names; it searches the schema.
3. Schema drift
When Cloudflare adds a new endpoint, a traditional MCP server needs a new release to expose it. With the two-tool pattern, the endpoint just appears in the search() results. The MCP server doesn't need to update for every new API endpoint — only for breaking schema changes.
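One way to picture why new endpoints need no server release: if search reads the API schema as data at call time, adding an endpoint changes the data and nothing else. The sketch below assumes a tiny OpenAPI-like dictionary named `spec`; the structure and the substring matching are hypothetical simplifications.

```python
# Hypothetical, OpenAPI-like spec: path -> method -> summary.
spec = {
    "/zones": {"get": "List zones"},
    "/zones/{zone_id}/dns_records": {"post": "Create DNS record"},
}

TOOLS = ("search", "execute")  # the tool surface never changes


def search(query: str) -> list:
    """Scan the spec at call time and return (METHOD, path) matches."""
    q = query.lower()
    return [
        (method.upper(), path)
        for path, methods in spec.items()
        for method, summary in methods.items()
        if q in summary.lower()
    ]


before = search("queue")
# The platform ships a new endpoint: only the spec data changes.
spec["/accounts/{account_id}/queues"] = {"post": "Create queue"}
after = search("queue")
print(before, after, len(TOOLS))
```

The tool list is the same tuple before and after; only the search results differ.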
The Coverage: Every Cloudflare Product
The two tools cover the entire Cloudflare API:
| Product | What You Can Do |
|---|---|
| DNS | Create, update, delete records; import/export zones; DNSSEC |
| Workers | Deploy scripts, manage bindings, set secrets, view logs |
| KV | Namespaces, keys, values, TTL, metadata, list operations |
| R2 | Buckets, objects, multipart uploads, presigned URLs, lifecycle |
| D1 | Databases, tables, query, migrations, export |
| Pages | Projects, deployments, custom domains, env vars |
| Firewall | WAF rules, rate limits, IP lists, bot management |
| Load Balancers | Pools, origins, monitors, geographic steering |
| Stream | Videos, live streams, captions, thumbnails, analytics |
| Images | Variants, transformations, upload, delivery |
| AI Gateway | Routes, logs, caching, rate limits, fallbacks |
| Vectorize | Indexes, vectors, query, namespaces, metadata |
| Access | Applications, policies, identity providers, groups |
| Zero Trust | Tunnels, devices, networks, routes |
| Email Routing | Addresses, rules, catch-all, send emails |
| Queues | Producers, consumers, DLQ, batch processing |
| Durable Objects | Namespaces, instances, storage, alarms |
| Hyperdrive | Configs, connections, query |
| Analytics | GraphQL analytics, zone analytics, worker analytics |
The agent doesn't need to know which product a task belongs to. "Add a DNS record for staging.example.com" → search → DNS endpoint → execute. "Deploy a Worker" → search → Workers endpoint → execute. Same two tools, same pattern.
The Search Tool in Action
When the agent calls search(), it gets structured results:
{
  "endpoint": "PUT /accounts/{account_id}/workers/scripts/{script_name}",
  "description": "Upload a Worker script",
  "parameters": {
    "account_id": "string (required)",
    "script_name": "string (required, alphanumeric and dashes only)",
    "body": {
      "main_module": "string (JS or WASM)",
      "bindings": "array of KV, D1, R2, secret bindings",
      "compatibility_date": "string (YYYY-MM-DD)",
      "compatibility_flags": "array of strings"
    }
  },
  "example_payload": "..."
}
The agent reads the schema, fills in the parameters, and calls execute(). The result is the API response. No magic, no abstraction, no "Cloudflare-specific" syntax.
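The "fill in the parameters" step can be sketched as a small helper that turns a search result's endpoint string into a concrete request. `build_request` and its return shape are hypothetical; the base URL is Cloudflare's public v4 API base. The example uses a JSON DNS call rather than the Worker upload, which is a multipart request and would need more than this sketch handles.

```python
import json
import re
from typing import Optional
from urllib.parse import quote

API_BASE = "https://api.cloudflare.com/client/v4"
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def build_request(endpoint: str, path_params: dict, body: Optional[dict] = None) -> dict:
    """Turn an endpoint string like "POST /zones/{zone_id}/..." into a request dict."""
    method, template = endpoint.split(" ", 1)
    # Fail early if the agent left a required path parameter unfilled.
    missing = [name for name in PLACEHOLDER.findall(template) if name not in path_params]
    if missing:
        raise ValueError(f"missing path parameters: {missing}")
    # URL-encode each value so it cannot alter the path structure.
    path = PLACEHOLDER.sub(lambda m: quote(str(path_params[m.group(1)]), safe=""), template)
    return {
        "method": method,
        "url": API_BASE + path,
        "body": json.dumps(body) if body is not None else None,
    }


request = build_request(
    "POST /zones/{zone_id}/dns_records",
    {"zone_id": "abc123"},
    {"type": "CNAME", "name": "staging.example.com", "content": "example.pages.dev"},
)
print(request["method"], request["url"])
```

Validating placeholders before the call means an incomplete search-to-execute handoff fails locally instead of producing a confusing API error.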
Facio Integration
{
  "mcpServers": {
    "cloudflare": {
      "command": "npx",
      "args": ["-y", "@cloudflare/mcp-server-api"],
      "env": {
        "CLOUDFLARE_API_TOKEN": "${credentials.CLOUDFLARE_API_TOKEN}",
        "CLOUDFLARE_ACCOUNT_ID": "${credentials.CLOUDFLARE_ACCOUNT_ID}"
      }
    }
  }
}
Facio's audit trail captures every search() and execute() call with the query, the matched endpoint, the parameters, and the response. For a team running Cloudflare at scale, this is the complete operational record: "Agent created DNS record at 14:32 UTC, deployed Worker at 14:35 UTC, rotated KV secret at 14:38 UTC."
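To make the idea of a per-call record concrete, here is a minimal sketch of one audit entry serialized as a JSON line. The field names and the `audit_record` helper are hypothetical and do not describe Facio's actual log format.

```python
import json
from datetime import datetime, timezone


def audit_record(tool: str, arguments: dict, status: int, at: datetime) -> str:
    """Serialize one tool call as a JSON line (hypothetical field names)."""
    return json.dumps(
        {
            "timestamp": at.astimezone(timezone.utc).isoformat(timespec="seconds"),
            "tool": tool,
            "arguments": arguments,
            "status": status,
        },
        sort_keys=True,
    )


line = audit_record(
    "execute",
    {"method": "POST", "url": "/zones/abc123/dns_records"},
    200,
    datetime(2026, 1, 15, 14, 32, tzinfo=timezone.utc),
)
print(line)
```

One line per call keeps the trail easy to append to and easy to filter by tool, time, or status.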
For HITL workflows, the agent can call any Cloudflare API, so gating is determined by two things: the scope of the API token and how destructive the action is.
| Token Scope | Capability | Suggested Gate |
|---|---|---|
| Zone:Read | List zones, read DNS records | None (autonomous) |
| Zone:Edit | Create/update DNS records | Soft confirm (review the record) |
| Zone:Edit (delete scope) | Delete DNS records | Hard confirm (downstream breakage risk) |
| Workers Scripts:Read | List workers, read logs | None (autonomous) |
| Workers Scripts:Edit | Deploy new workers | Hard confirm (production deployment) |
| Account:Read | List all resources | None (autonomous) |
| Account:Edit (global) | Anything | Hard confirm (catastrophic risk) |
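Because every action flows through execute(), a gate can be chosen from the request itself. The following is a minimal sketch of such a policy; the `Gate` enum, the `gate_for` function, and the path check are assumptions for illustration, not a documented Facio API.

```python
from enum import Enum


class Gate(Enum):
    NONE = "autonomous"
    SOFT = "soft confirm"
    HARD = "hard confirm"


def gate_for(method: str, path: str) -> Gate:
    """Pick a confirmation level from the HTTP method and path (illustrative rules)."""
    method = method.upper()
    if method in ("GET", "HEAD"):
        return Gate.NONE  # reads run autonomously
    if method == "DELETE":
        return Gate.HARD  # deletions risk downstream breakage
    if "/workers/scripts" in path:
        return Gate.HARD  # writing a Worker script is a production deployment
    return Gate.SOFT      # other writes get a lightweight review


print(gate_for("GET", "/zones").value)
print(gate_for("POST", "/zones/abc123/dns_records").value)
print(gate_for("DELETE", "/zones/abc123/dns_records/rec1").value)
```

The policy is deliberately conservative: anything it does not recognize as a read requires at least a soft confirmation.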
The Cloudflare API Token model is scope-minimal by default — you can issue a token that has read-only access to a single zone, write access to a specific KV namespace, or admin access to a single Worker. Combined with Facio's destructive-operation gating, this creates a defense-in-depth model:
- Layer 1: API Token scope — the token can only reach the resources the user permitted
- Layer 2: HITL confirmation — the destructive operations require human review
- Layer 3: Audit trail — every call is logged with context
For teams running multi-account Cloudflare setups, the pattern is one MCP server per Cloudflare account (or per persona), each with a scoped token. The agent works in the right account with the right scope, the audit trail is per-account, and the HITL gating is per-operation.
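A one-server-per-account layout can be generated rather than hand-written. The sketch below reuses the command, package name, and environment variable names from the configuration shown earlier; the `cloudflare-<label>` server names and the per-account credential naming scheme are hypothetical.

```python
import json


def mcp_config(accounts: dict) -> dict:
    """Build one MCP server entry per Cloudflare account label."""
    servers = {}
    for label, account_id in accounts.items():
        # Hypothetical convention: one scoped token credential per account label.
        token_ref = "${credentials.CLOUDFLARE_API_TOKEN_" + label.upper() + "}"
        servers[f"cloudflare-{label}"] = {
            "command": "npx",
            "args": ["-y", "@cloudflare/mcp-server-api"],
            "env": {
                "CLOUDFLARE_API_TOKEN": token_ref,
                "CLOUDFLARE_ACCOUNT_ID": account_id,
            },
        }
    return {"mcpServers": servers}


config = mcp_config({"prod": "prod_account_id", "staging": "staging_account_id"})
print(json.dumps(config, indent=2))
```

Keeping each account behind its own server entry means a token scoped for staging can never be used against production by mistake.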
Quickstart
# 1. Create a Cloudflare API Token
# https://dash.cloudflare.com/profile/api-tokens
# Choose a template (e.g., "Edit zone DNS") or build a custom token
# 2. (Optional) Install the MCP server globally; the npx -y command in step 3 also fetches it on demand
npm install -g @cloudflare/mcp-server-api
# 3. Configure your MCP client
{
  "mcpServers": {
    "cloudflare": {
      "command": "npx",
      "args": ["-y", "@cloudflare/mcp-server-api"],
      "env": {
        "CLOUDFLARE_API_TOKEN": "your_token",
        "CLOUDFLARE_ACCOUNT_ID": "your_account_id"
      }
    }
  }
}
# 4. First prompts
# "List all my DNS zones"
# "Add a CNAME record for staging.example.com pointing to my Pages deployment"
# "Deploy a new Worker that rate-limits /api to 100 req/min per IP"
# "Show me the last 100 log lines for my Worker 'api-gateway'"
# "Create a new R2 bucket called 'user-uploads' with a 30-day lifecycle rule"
# "Set up a Vectorize index for my RAG app with 768 dimensions"
Use Cases
DNS management: "List all my zones, find ones that don't have a DMARC record, and add one with a default policy." Search → DMARC endpoint → execute. Mass DNS hygiene in a single prompt.
Worker deployment: "Deploy this Worker code to my account, with the new KV binding." Search → Workers scripts PUT → execute. The agent handles the multipart upload, the binding configuration, the compatibility date.
R2 bucket setup: "Create a new R2 bucket for user uploads, configure a 90-day lifecycle rule, and generate a presigned URL for the first upload." Three search() + execute() cycles.
Security audit: "List every WAF rule across all my zones, show me which ones are disabled, and enable any that look like good defaults." Cross-zone query, surface the disabled rules, fix them.
Vectorize setup: "Create a Vectorize index with 1536 dimensions (for OpenAI embeddings), tell me the index ID, and add 100 sample vectors from this dataset." Search → Vectorize endpoints → batch execute.
Email routing: "Add a new email routing rule that forwards hello@example.com to my personal Gmail." Search → Email Routing → execute.
Zero Trust tunnel: "Set up a new Cloudflare Tunnel for our internal admin app, configure the access policy, and give me the connector token." Multi-endpoint coordination through one conversational surface.
Cost optimization: "List all my Workers, R2 buckets, and D1 databases, and tell me which ones haven't been accessed in 30 days. Draft a cleanup plan." Cross-product query, surface the candidates, propose the action.
Incident response: "Our Workers API is returning 500s. Pull the last 500 log lines, group by error type, and find the common cause." Search → Workers logs → execute → analyze.
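As an example of the analysis step between execute() calls, the DNS hygiene use case reduces to a small filter over record lists. This sketch assumes the records were already fetched and that each one carries `type`, `name`, and `content` fields; the `zones_missing_dmarc` helper and the sample data are hypothetical.

```python
def zones_missing_dmarc(records_by_zone: dict) -> list:
    """Return zones with no TXT record at _dmarc.<zone> that starts with v=DMARC1."""
    missing = []
    for zone, records in records_by_zone.items():
        has_dmarc = any(
            r["type"] == "TXT"
            and r["name"].lower() == f"_dmarc.{zone}".lower()
            and r["content"].strip('"').lower().startswith("v=dmarc1")
            for r in records
        )
        if not has_dmarc:
            missing.append(zone)
    return missing


# Hypothetical sample data standing in for per-zone DNS record listings.
sample = {
    "example.com": [
        {"type": "A", "name": "example.com", "content": "192.0.2.1"},
        {"type": "TXT", "name": "_dmarc.example.com", "content": "v=DMARC1; p=none"},
    ],
    "example.org": [
        {"type": "TXT", "name": "example.org", "content": "v=spf1 -all"},
    ],
}
needs_dmarc = zones_missing_dmarc(sample)
print(needs_dmarc)  # ['example.org']
```

Each zone in the result would then get one more search and execute cycle to create the missing record, ideally behind a soft confirmation.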
The Pattern: API-as-MCP-via-Search
Cloudflare's two-tool pattern is the most important design lesson of 2026 in the MCP ecosystem. For any platform with hundreds or thousands of endpoints, the right MCP design isn't to expose every endpoint as a tool. It's to:
- Expose a search tool that returns the relevant endpoint, schema, and example
- Expose an execute tool that makes the API call
- Let the agent compose search + execute as needed
The benefits compound:
- Constant tool surface: tools/list is always 2 entries, no matter how big the API grows
- No schema drift: new endpoints appear in search results automatically
- Better agent performance: the model doesn't need to reason over 300 tool descriptions
- Lower token cost: only the relevant endpoint schema is loaded per call
- Universal API coverage: the entire Cloudflare API is reachable, not a curated subset
For vendors thinking about their own MCP server, this is the template. For platform teams evaluating MCP servers, the two-tool pattern is a strong signal that the vendor understands the protocol.
Bottom Line
The Cloudflare API MCP Server is the cleanest example of API-as-MCP-via-search in the ecosystem. Two tools, 2,500+ endpoints, one pattern, no tool-surface bloat, no schema drift, no per-product configuration.
For any team running production infrastructure on Cloudflare, this is the missing layer that turns "agent helps me manage my Cloudflare setup" from a 50-script Bash incantation into a single conversational surface. The agent searches the API, executes the calls, and the audit trail captures everything.
For the broader MCP ecosystem, the two-tool pattern is the design lesson every large-platform MCP server should study. Don't ship 300 tools. Ship search and execute, and let the agent compose.
npx -y @cloudflare/mcp-server-api and your agent has the entire Cloudflare platform.
MCP Spotlight is a series covering servers that give AI agents real capabilities. Every server is evaluated for architecture design, ecosystem impact, and integration fit with Facio's HITL-first agent runtime.