
Engineering · Jun 9, 2026

MCP Spotlight: AI Design Blueprint — The 10 Principles Your Agent Should Refuse to Ship Without

AI Design Blueprint exposes 13 read-only MCP tools covering 10 production-tested principles for safe, observable, and steerable agent UX — with 96+ implementation examples and a paid Blueprint Readiness Score for repo-level governance review.

Tags: MCP Server · AI Design Blueprint · Governance · Agent UX · HITL · AI Agents


Server: aidesignblueprint-mcp by AI Design Blueprint
License: MIT · Public tools: 13 · Free anonymous access · Pro tier for review tooling
MCP Tracker: glama.ai/mcp/servers/aidesignblueprint/ai-design-blueprint-doctrine
Docs: aidesignblueprint.com/en/for-agents

The reason most agents fail in production isn't the model. It's that nobody defined what "good" looks like before the agent shipped. Teams iterate on prompts, swap out models, debate tool schemas — but no one's measuring the system against a shared standard of governable behavior.

AI Design Blueprint addresses this with a doctrine of 10 production-tested principles, 96+ implementation examples, a public MCP endpoint your agent can query for guidance, and a Blueprint Readiness Score (0–100, graded A–F) that a repository can be measured against. The MCP endpoint is free, anonymous, and read-only. The Architect validation tool, which produces the score, is the paid tier.

The 10 Principles in One Glance

| # | Principle | The Failure It Names |
| --- | --- | --- |
| P1 | Treat delegated work as a system, not a conversation | The chat IS the coordination, but no run-state surface, no approval queue, no pause/resume. Work doesn't outlive a single message. |
| P2 | Make background work perceptible | Silent background failure: agent runs in the background, user sees a stale UI, crash leaves no recovery path. Disabled buttons are not state. |
| P3 | Expose meaningful operational state | State loss on refresh: user reloads, agent forgets. No durable run record, no resumable session, no audit trail. |
| P4 | Steer without restarting | Restart-heavy UX destroys context and trust. Redirection should be a state change at the run level, not a new prompt. |
| P5 | Surface ownership, blockers, and merge risk | Two sessions, three minutes apart, edited the same row. One silently overwrote the other. Conflict happens after the second session starts. |
| P6 | Make hand-offs, approvals, and blockers explicit | Agent fires irreversible actions (payments, posts, sends, deploys) under model output alone. The approval gate is decoration, not load-bearing. |
| P7 | Match confirmation intensity to action reversibility | Same UX for "delete draft" and "delete account": the irreversibility axis collapses. |
| P8 | Run, don't request, when work is observable | When the user can already see the work happen, asking permission is friction, not safety. |
| P9 | Treat tool output as evidence, not conclusion | Agent reads a number, accepts it, repeats it. The number was wrong, but the agent had no reason to verify. |
| P10 | Bound the blast radius of any single action | One failed action in an unbounded pipeline cascades. Smaller units, shorter dependencies, faster recovery. |

Read those principles against your last three production incidents and at least two will likely map cleanly. That's the design point: these aren't abstract ideals but named failure modes with working implementations.
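To make one of these concrete, here is a minimal sketch of P7 (match confirmation intensity to action reversibility). The three reversibility categories, the three confirmation levels, and the mapping between them are illustrative assumptions, not taken from the doctrine itself.

```python
from enum import Enum


class Reversibility(Enum):
    """How hard an action is to undo (hypothetical categories)."""
    TRIVIAL = 1       # e.g. deleting a draft that can be restored
    RECOVERABLE = 2   # undo is possible but costly
    IRREVERSIBLE = 3  # e.g. deleting an account


class Confirmation(Enum):
    """Confirmation intensity shown to the user (hypothetical levels)."""
    NONE = "run immediately, offer undo"
    SOFT = "inline confirm"
    HARD = "typed confirmation plus explicit approval"


# Assumed policy: confirmation intensity grows with irreversibility.
_POLICY = {
    Reversibility.TRIVIAL: Confirmation.NONE,
    Reversibility.RECOVERABLE: Confirmation.SOFT,
    Reversibility.IRREVERSIBLE: Confirmation.HARD,
}


def confirmation_for(reversibility: Reversibility) -> Confirmation:
    """Return the confirmation level for an action's reversibility."""
    return _POLICY[reversibility]
```

With a policy like this, "delete draft" and "delete account" can no longer share the same UX by accident, because each action must declare its reversibility before it can be confirmed.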

The 13 Public MCP Tools

All 13 public tools are anonymous-allowed, read-only, and require no credentials.

Discovery

| Tool | Returns |
| --- | --- |
| principles.list(cluster?) | All 10 principles, optionally filtered by cluster |
| clusters.list() | The 4 principle clusters (orchestration, state, handoffs, etc.) |
| principles.get(slug) | Full principle page with rationale, failure mode, examples |
| clusters.get(slug) | All principles in a cluster, with cross-references |
| assets.list() | All exported assets, badges, and doctrine files |
| guides.list() | All runtime architecture guides |
guides.list()All runtime architecture guides

Search

| Tool | Returns |
| --- | --- |
| principles.search(query, limit?) | Principles matching a natural-language query |
| examples.search(query, principle_ids?, difficulty?, library?, limit?) | Curated implementation examples by library (LangChain, CrewAI, custom), difficulty, or principle mapping |
| guides.search(query, limit?) | Runtime architecture guides by topic |
| examples.get(slug) | A specific implementation example with full code |

Single-item retrieval

| Tool | Returns |
| --- | --- |
| principles.get(slug) | Already listed under Discovery; repeated here because the principle page is the unit of doctrine |
| guides.get(slug) | Full guide body |

Two opt-in signal tools

| Tool | When to call |
| --- | --- |
| signals.report(event_type, ...) | Only after the user has clearly expressed that something was useful. Never automatic. Once per session max. |
| signals.feedback(...) | Only when the user explicitly asks to leave feedback. Never prompted. |

The signal tools are opt-in by design: the doctrine includes its own constraint about not silently exfiltrating value signals from agent sessions. The same privacy posture that governs your own agent should govern any agent that reads the doctrine.
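One way a client could enforce the signals.report rules on its own side is a small per-session gate. This is a sketch under assumptions: the `SignalGate` class and the `user_expressed_value` flag are hypothetical names, and how your agent detects that the user found something useful is left to you.

```python
from dataclasses import dataclass


@dataclass
class SignalGate:
    """Tracks whether signals.report may still be called this session."""
    reported: bool = False

    def may_report(self, user_expressed_value: bool) -> bool:
        """Allow a report only on an explicit user signal, once per session."""
        if self.reported or not user_expressed_value:
            return False
        self.reported = True
        return True
```

The agent would call `may_report(...)` before every signals.report call and skip the call whenever it returns False.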

The 4 Clusters

The 10 principles organize into 4 clusters, each addressing a class of failure:

  • Orchestration Visibility — how the agent shows what it's doing (P1, P2, P4, P5)
  • State & Continuity — how state survives across reloads, sessions, and async work (P3, P8)
  • Handoffs & Approvals — how the agent passes control to humans and other agents (P6, P7)
  • Evidence & Blast Radius — how the agent validates output and bounds damage (P9, P10)

When you query clusters.list() and then call examples.search(query="background work visibility"), you traverse the cluster → principle → implementation graph that the doctrine was designed to support.
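The cluster assignments above can be written down as a plain lookup table, which also makes it easy to check that every principle belongs to a cluster. This is a local sketch built from the list in this article; the cluster names are used as keys for illustration, and the slugs the server actually returns from clusters.list() may differ.

```python
# Cluster membership as listed in this article.
CLUSTERS = {
    "Orchestration Visibility": ["P1", "P2", "P4", "P5"],
    "State & Continuity": ["P3", "P8"],
    "Handoffs & Approvals": ["P6", "P7"],
    "Evidence & Blast Radius": ["P9", "P10"],
}


def cluster_of(principle_id: str) -> str:
    """Return the cluster a principle belongs to."""
    for cluster, members in CLUSTERS.items():
        if principle_id in members:
            return cluster
    raise KeyError(principle_id)


def uncovered(total: int = 10) -> list:
    """Return the principle IDs that no cluster claims."""
    assigned = {p for members in CLUSTERS.values() for p in members}
    return [f"P{i}" for i in range(1, total + 1) if f"P{i}" not in assigned]
```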

The Architect Validation Tool (Pro Tier)

The public tools are read-only. The architect.validate tool is the review surface — a Pro/Teams tier feature that scores a real repository against the doctrine:

// Call:
{
  "implementation_context": "<code context or PR diff>",
  "principle_ids": ["P2", "P6", "P8"],  // optional focus
  "private_session": true                 // skip server-side logging
}

// Returns:
{
  "run_id": "abc-123",
  "score": 82,                           // 0–100
  "grade": "B",
  "badge_url": "https://aidesignblueprint.com/api/badge/run/abc-123.svg",
  "review_url": "https://aidesignblueprint.com/en/readiness-review/abc-123",
  "deltas": [...]                        // regression diff vs. previous run
}

The private_session: true flag is worth highlighting: a single call can opt out of all server-side logging. The doctrine includes a principle about evidence handling — the tool itself follows the same standard.
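A response shaped like the one above could feed a simple pre-deploy gate. The sketch below only reads the `score` field shown in the sample; the `passes_gate` function, the threshold of 80, and the idea of comparing against a stored previous score are assumptions for illustration, not part of the tool.

```python
from typing import Optional


def passes_gate(result: dict, min_score: int = 80,
                previous_score: Optional[int] = None) -> bool:
    """Pass if the score meets the threshold and has not dropped."""
    score = result["score"]
    if score < min_score:
        return False
    # Treat any drop against the previous run as a regression.
    if previous_score is not None and score < previous_score:
        return False
    return True


# Sample fields copied from the response shown in this article.
sample = {"run_id": "abc-123", "score": 82, "grade": "B"}
```

A CI job might call `passes_gate(sample, previous_score=...)` and fail the build on False, then point reviewers at the `review_url` for details.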

Facio Integration

Anonymous read access — zero config:

{
  "mcpServers": {
    "blueprint": {
      "url": "https://aidesignblueprint.com/mcp"
    }
  }
}

No API key. No account. The 13 public tools appear immediately. Your agent can call principles.search(query="approval boundary") mid-conversation when reasoning about whether an action needs HITL gating.
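Under the hood, an MCP tool invocation is a JSON-RPC 2.0 request with the method `tools/call`. The sketch below only builds that payload, using the tool name and argument from this article. It does not open a connection; a real client also performs the MCP initialize handshake first, which an MCP client library normally handles for you.

```python
import json
from itertools import count

# Monotonic request IDs, as JSON-RPC requires an id per request.
_ids = count(1)


def tool_call(name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as JSON-RPC 2.0."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


request = tool_call("principles.search", {"query": "approval boundary"})
```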

For HITL-first workflows, the principle mapping is the value. Your agent already knows it should ask for approval before destructive actions — but why, and which pattern? The doctrine is the shared language:

Agent: "About to call /api/orders/[id]/refund — that's a payment action."
Agent: [calls principles.search(query="approval boundary payment")]
Blueprint: "P6, P7. See examples/refund-confirmation-modal and examples/soft-confirm-pattern."
Agent: [implements modal, surfaces reversibility, returns to user]

Facio's audit trail captures which principle was consulted, which example was cited, and what the agent built. The doctrine supplies the shared language for why an action is gated; Facio records the evidence of what was applied.
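As an illustration of what such a record might hold, here is a sketch of one consultation entry for the refund exchange above. The `ConsultationRecord` type and its field names are hypothetical and are not Facio's actual schema; the values are copied from the dialogue.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ConsultationRecord:
    """One doctrine lookup and what the agent did with it (hypothetical)."""
    action: str        # the action the agent was about to take
    query: str         # what it asked the doctrine
    principles: tuple  # principle IDs returned
    examples: tuple    # example slugs cited
    outcome: str       # what the agent built


record = ConsultationRecord(
    action="/api/orders/[id]/refund",
    query="approval boundary payment",
    principles=("P6", "P7"),
    examples=("examples/refund-confirmation-modal",
              "examples/soft-confirm-pattern"),
    outcome="implemented modal, surfaced reversibility",
)

# asdict() gives a plain dict that is easy to serialize into a log.
entry = asdict(record)
```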

For Pro-tier users, the badge system creates a public artifact:

[![AI Design Blueprint](https://aidesignblueprint.com/api/badge/run/abc-123.svg)](https://aidesignblueprint.com/en/readiness-review/abc-123)

Place it in your README. Visitors see the score and grade. Click through to the public readiness review page. This is governance made visible.
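If you generate the README snippet in CI, the badge line can be derived from the `run_id` alone. This sketch assumes the badge and review URLs always follow the pattern shown in the example above; `badge_markdown` is a hypothetical helper name.

```python
BASE = "https://aidesignblueprint.com"


def badge_markdown(run_id: str) -> str:
    """Build the README badge line for a given validation run."""
    badge = f"{BASE}/api/badge/run/{run_id}.svg"
    review = f"{BASE}/en/readiness-review/{run_id}"
    return f"[![AI Design Blueprint]({badge})]({review})"
```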

Quickstart

# 1. Add the public MCP endpoint to your client
{
  "mcpServers": {
    "blueprint": {
      "url": "https://aidesignblueprint.com/mcp"
    }
  }
}

# 2. Run the first proof call
clusters.list()

# 3. Run the second proof call
examples.search(query="orchestration visibility steering", limit=3)

# 4. Production prompts
# "Search the doctrine for examples of background work visibility"
# "What's the principle for approval boundaries on destructive actions?"
# "Find runtime architecture guides for parallel agent sessions"
# "What does P3 say about state loss on refresh?"

Use Cases

Governance baseline for new agent projects: Before writing the first prompt, your agent calls principles.list() and you have a 10-item checklist of failure modes to design against. No more "we forgot about approval gates."

HITL pattern selection: When you need to decide between hard confirmation, soft confirm, or no confirmation for an action, principles P6–P8 frame the choice, and the doctrine's 96+ examples span LangChain, CrewAI, and custom implementations. The agent surfaces the right pattern for your context.

Pre-deploy review: The architect.validate tool scores your repo against the doctrine. Score goes down between releases? The regression diff shows what regressed.

Cross-team alignment: Engineering, product, and design all reference the same principle numbers ("P6 isn't met on the refund flow") instead of debating terminology.

Badge-as-evidence: The public readiness review page is shareable evidence that your agent meets a recognized governance standard.

Bottom Line

AI Design Blueprint gives your agent a shared language for governable behavior — 10 principles, 96+ examples, 13 public MCP tools, and a paid review tier that scores your repo on a 0–100 scale. For teams shipping production agents, the doctrine is the difference between "we have a HITL pattern" and "we have a named, validated, cited HITL pattern."

At https://aidesignblueprint.com/mcp, your agent can start consulting the doctrine in 60 seconds. The Pro tier is for the audit badge and the validation API. The free tier is everything you need to make governance a first-class concern in your agent design.


MCP Spotlight is a series covering servers that give AI agents real capabilities. Every server is evaluated for tool quality, governance relevance, and integration fit with Facio's HITL-first agent runtime.