MCP Spotlight: ArcadeDB — The Multi-Model Database With a Built-In MCP Brain
Server: ArcadeDB by ArcadeData · Stars: 896 · License: Apache 2.0 · Language: Java 21+ (LLJ, Low Level Java) · MCP Transport: Streamable HTTP (built into database engine) · Latest release: v26.3.1 · Last updated: May 24, 2026
What It Does
Most database MCP servers are external wrappers — a separate process that sits between your AI agent and the database, translating requests back and forth. More components to deploy, more latency, more attack surface.
ArcadeDB takes the opposite approach: the MCP server runs inside the database engine itself. When you start ArcadeDB, the MCP server is already there. No npx, no external service, no glue code.
And it's not just any database. ArcadeDB is a multi-model DBMS — graph, document, key/value, time series, vector embeddings, search engine, and geospatial — all in one engine. Your AI agent can query using SQL, Cypher, Gremlin, GraphQL, or even the MongoDB query language. The database engine picks the best execution path regardless of which language you use.
Created by Luca Garulli, the original founder of OrientDB (acquired by SAP), ArcadeDB was written from scratch using "Low Level Java" techniques — mechanical sympathy, minimal GC pressure, parallel query execution. It runs on everything from a Raspberry Pi to a multi-node cloud cluster.
Why It Matters for Agent Engineering
Database integration is one of the most requested MCP capabilities — and also one of the riskiest to get wrong. The failure modes are serious:
- Stale schema awareness: External wrappers cache schemas. The LLM generates queries against a schema that changed 20 minutes ago.
- Permission bypass through clever prompts: Text-based permission checks fail against creatively worded queries.
- Query language lock-in: Your agent can only query in SQL, but your data is best expressed as a graph traversal.
- Multi-component deployment: Database + MCP wrapper + auth proxy + monitoring. Four things that can break.
ArcadeDB's built-in approach addresses all four:
- Real-time schema introspection: The LLM always sees the live schema, not a cache
- Semantic query analysis: ArcadeDB parses queries through its AST — no keyword-matching permission checks
- Polyglot queries: The same MCP interface accepts SQL, Cypher, Gremlin, GraphQL, and MongoDB queries
- Single-component deployment: The MCP server is the database. Zero additional services.
The MCP Tool Set
ArcadeDB exposes five MCP tools — all running directly against the database engine:
| Tool | Purpose |
|---|---|
| list_databases | Discover available databases filtered by user permissions |
| explore_schema | Introspect types, properties, constraints, indexes, and inheritance |
| query_data | Execute read-only queries (SQL, Cypher, Gremlin, GraphQL, MQL) |
| execute_command | Run DDL/DML with per-operation permission flags |
| server_status | Version, available languages, cluster health |
The key design decision: query_data is read-only. Even if an AI hallucinates a DROP TABLE, the engine parses the statement, recognizes it as a write operation, and rejects it. Writes require the separate execute_command tool, which has its own granular permission flags.
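As a rough illustration of how a client addresses these tools, the sketch below builds the JSON-RPC 2.0 envelope that MCP uses for tool invocation (`tools/call`). The tool name comes from the table above; the argument names (`database`, `language`, `query`) and the `crm` database are assumptions made for this example, so check the input schema the server returns from `tools/list` before relying on them.

```python
import json
from itertools import count

_ids = count(1)  # monotonically increasing JSON-RPC request ids


def tools_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request envelope (JSON-RPC 2.0)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Assumption: the argument names and the "crm" database are hypothetical.
read_request = tools_call(
    "query_data",
    {
        "database": "crm",
        "language": "sql",
        "query": "SELECT name FROM Customer LIMIT 5",
    },
)

print(json.dumps(read_request, indent=2))
```

A write would use the same envelope with `execute_command` as the tool name, which is what lets the server apply a different set of permission flags to it.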
Granular Security That Actually Works
Connecting an AI to your production database should make any engineer nervous. ArcadeDB answers with a permission model in which MCP is off by default and every class of operation has its own flag:
| Permission | Default | Purpose |
|---|---|---|
| enabled | false | MCP is off until you explicitly turn it on |
| allowReads | true | SELECT, MATCH, and read-only queries |
| allowInsert | false | INSERT and CREATE |
| allowUpdate | false | UPDATE operations |
| allowDelete | false | DELETE operations |
| allowSchemaChange | false | CREATE TYPE, ALTER TYPE, DDL |
| allowAdmin | false | Server administration |
| allowedUsers | ["root"] | Limit MCP access to specific users |
The critical detail: these aren't keyword filters. ArcadeDB parses every MCP query into an AST with its native parser and determines the actual operation type from that tree. A query that tries to embed a write operation inside a CTE or subquery is still classified as a write, because the nested operation is visible in the tree.
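To make the flag semantics concrete, here is a minimal sketch of a permission gate with the defaults from the table above. It is not ArcadeDB's implementation: the class and method names are hypothetical, and the set of operation types is assumed to have already been extracted from the parsed query. The point it illustrates is that a statement is allowed only if every operation found in it is allowed, so a read that wraps a delete is rejected.

```python
from dataclasses import dataclass, field
from enum import Enum


class Op(Enum):
    READ = "read"
    INSERT = "insert"
    UPDATE = "update"
    DELETE = "delete"
    SCHEMA_CHANGE = "schema_change"
    ADMIN = "admin"


@dataclass
class McpPermissions:
    # Defaults mirror the permission table; names are hypothetical.
    enabled: bool = False
    allow_reads: bool = True
    allow_insert: bool = False
    allow_update: bool = False
    allow_delete: bool = False
    allow_schema_change: bool = False
    allow_admin: bool = False
    allowed_users: list = field(default_factory=lambda: ["root"])

    def permits(self, user: str, ops: set) -> bool:
        """Allow a statement only if MCP is on, the user is listed,
        and every operation type found in the statement is enabled."""
        if not self.enabled or user not in self.allowed_users:
            return False
        flags = {
            Op.READ: self.allow_reads,
            Op.INSERT: self.allow_insert,
            Op.UPDATE: self.allow_update,
            Op.DELETE: self.allow_delete,
            Op.SCHEMA_CHANGE: self.allow_schema_change,
            Op.ADMIN: self.allow_admin,
        }
        return all(flags[op] for op in ops)


perms = McpPermissions(enabled=True)
print(perms.permits("root", {Op.READ}))             # plain read
print(perms.permits("root", {Op.READ, Op.DELETE}))  # read with a nested delete
```

With MCP enabled and everything else at its default, the plain read passes and the read with a nested delete does not.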
For human-in-the-loop (HITL) workflows in Facio, this means you can confidently enable read access for your agent and route any write commands through the approval queue.
Connecting ArcadeDB to Facio
Step 1: Start ArcadeDB
docker run --rm -p 2480:2480 -p 7687:7687 \
-e arcadedb.server.rootPassword=playwithdata \
arcadedata/arcadedb:latest
ArcadeDB ships as a single Docker container with all modules included. The MCP endpoint is at http://localhost:2480/api/v1/mcp.
Step 2: Create an API Token
Open ArcadeDB Studio at http://localhost:2480, navigate to Security → API Tokens, and create a token with read access for MCP use.
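Before wiring the endpoint into an agent, it can help to see what a raw request looks like. The sketch below only constructs the HTTP request, using Python's standard library, the endpoint URL from Step 1, and the bearer-token header format used in the Facio configuration. `YOUR_API_TOKEN` is a placeholder, and the helper name is hypothetical. Note that a real MCP session normally starts with an `initialize` exchange, so a server may refuse `tools/list` sent on its own.

```python
import json
import urllib.request
from typing import Optional

MCP_URL = "http://localhost:2480/api/v1/mcp"  # endpoint from Step 1


def build_mcp_request(
    token: str,
    method: str,
    params: Optional[dict] = None,
    request_id: int = 1,
) -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC POST for the MCP endpoint."""
    body = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        body["params"] = params
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP clients advertise both response types.
            "Accept": "application/json, text/event-stream",
            "Authorization": f"Bearer {token}",
        },
    )


req = build_mcp_request("YOUR_API_TOKEN", "tools/list")

# To send it (needs a running server and a valid token):
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```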
Step 3: Register in Facio
{
  "action": "add",
  "name": "arcadedb",
  "config": {
    "command": "npx",
    "args": [
      "mcp-remote",
      "http://localhost:2480/api/v1/mcp",
      "--header",
      "Authorization: Bearer ${credentials.ARCADEDB_API_TOKEN}"
    ]
  }
}
Step 4: Start Asking Questions
With ArcadeDB connected, your agent can handle queries across all data models:
- Graph: "Which customers are connected through shared suppliers?"
- Document: "Find all invoices where the total exceeds €5,000"
- Vector: "Find documents semantically similar to this contract clause"
- Time Series: "Show me the 7-day rolling average of API response times"
- Geospatial: "Which warehouses are within 50km of this customer?"
The agent picks the right query language for each question — Cypher for graph traversals, SQL for aggregations, or a combination across models.
Multi-Model: One Database, Every Paradigm
ArcadeDB's multi-model architecture deserves context. It's not five databases glued together — it's one engine that understands multiple data models natively:
| Model | Protocol/Language Support |
|---|---|
| Graph | Cypher (openCypher), Gremlin (TinkerPop 3.7), OrientDB SQL |
| Document | MongoDB driver + MongoDB Query Language, OrientDB SQL |
| Key/Value | Redis driver (subset) |
| Search Engine | Full-text indexing, fuzzy search |
| Time Series | InfluxDB Line Protocol, Prometheus remote_write/read, PromQL |
| Vector | Native vector embeddings, ANN search |
| Geospatial | geo.* SQL functions, spatial indexes |
| Relational | PostgreSQL wire protocol |
Plus 70+ built-in graph algorithms — pathfinding, centrality, community detection, link prediction, graph embeddings — available through any query interface.
For an AI agent, this means the database adapts to the question, not the other way around. A fraud detection query might combine graph traversal (shared accounts between flagged users), vector similarity (similar transaction patterns), and time-series analysis (velocity anomalies) — all in one query, executed by one engine.
Production Patterns
Graph RAG with Native Vector Support
ArcadeDB combines graph traversal and vector similarity in a single database. For retrieval-augmented generation:
- Documents are stored with vector embeddings
- Relationships between documents form a knowledge graph
- The agent queries semantically ("find documents about contract termination") while traversing the graph ("...and show me all related precedent cases")
No external vector database. No separate graph database. One query, one engine.
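The retrieval pattern itself is simple enough to sketch in memory. The toy below is not ArcadeDB code and uses none of its APIs: it ranks documents by cosine similarity to a query vector, then follows graph edges from the best matches to pull in related documents. The document names, three-dimensional embeddings, and edge list are invented for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def graph_rag(query_vec, embeddings, edges, top_k=1):
    """Return (seed documents by similarity, neighbours reached via edges)."""
    ranked = sorted(
        embeddings,
        key=lambda doc: cosine(query_vec, embeddings[doc]),
        reverse=True,
    )
    seeds = ranked[:top_k]
    related = []
    for doc in seeds:
        for neighbour in edges.get(doc, []):
            if neighbour not in seeds and neighbour not in related:
                related.append(neighbour)
    return seeds, related


# Hypothetical toy data: embeddings plus a tiny knowledge graph.
embeddings = {
    "termination_clause": [0.9, 0.1, 0.0],
    "payment_terms": [0.1, 0.9, 0.0],
    "case_2019": [0.0, 0.2, 0.9],
}
edges = {"termination_clause": ["case_2019"]}

seeds, related = graph_rag([1.0, 0.0, 0.0], embeddings, edges)
print(seeds, related)
```

Here the precedent case is not similar to the query at all and is found only through its edge to the matching clause, which is the part a standalone vector store cannot do.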
HITL for Schema Changes
With Facio's human-in-the-loop review, ArcadeDB's granular permissions create a natural workflow:
Agent workflow:
1. User: "Add a 'risk_score' property to the Customer type"
2. Agent calls execute_command → blocked: allowSchemaChange is false
3. Agent presents the DDL in Facio's HITL approval card
4. Human approves → agent retries with elevated permissions
5. Schema change logged in Facio's audit trail
The agent can propose changes instantly and safely — no risk of unintended modifications.
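The control flow of that workflow can be sketched as follows. Everything here is a stand-in: `FakeDatabase`, `PermissionDenied`, and `run_with_approval` are hypothetical names, the approval callback plays the role of the HITL card, and the way permissions are temporarily elevated is an assumption, since the mechanism is not specified above. The DDL string is illustrative.

```python
from dataclasses import dataclass, field


class PermissionDenied(Exception):
    pass


@dataclass
class FakeDatabase:
    """Stand-in for the database; only models the schema-change flag."""
    allow_schema_change: bool = False
    applied: list = field(default_factory=list)

    def execute_command(self, ddl: str) -> None:
        if not self.allow_schema_change:
            raise PermissionDenied("allowSchemaChange is false")
        self.applied.append(ddl)


def run_with_approval(db, ddl, approve, audit_log) -> bool:
    """Try the command; if blocked, ask a human, then retry once elevated."""
    try:
        db.execute_command(ddl)
        audit_log.append(("applied", ddl))
        return True
    except PermissionDenied:
        audit_log.append(("blocked", ddl))
    if not approve(ddl):
        audit_log.append(("rejected", ddl))
        return False
    db.allow_schema_change = True  # assumption: elevation is temporary
    try:
        db.execute_command(ddl)
    finally:
        db.allow_schema_change = False  # always drop the elevated flag
    audit_log.append(("approved_and_applied", ddl))
    return True


ddl = "CREATE PROPERTY Customer.risk_score DOUBLE"
db, log = FakeDatabase(), []
ok = run_with_approval(db, ddl, approve=lambda stmt: True, audit_log=log)
print(ok, log)
```

Resetting the flag in a `finally` block keeps the elevation scoped to the single approved command, even if that command fails.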
Multi-Language Analytics
ArcadeDB's polyglot query support lets agents use the right tool for each sub-task:
-- Agent-generated: combine graph traversal with time-series aggregation
SELECT
customer.name,
centrality.betweenness(customer) as influence_score,
ts.avg(metrics.api_latency) as avg_latency_7d
FROM Customer
WHERE geo.within(warehouse.location, "circle(52.52,13.40,50km)")
AND vector.similarity(customer.preferences, :query_vector) > 0.85
The agent doesn't need to know the best execution strategy — ArcadeDB's query optimizer handles the multi-model execution path.
Comparison: Built-In vs. External MCP Servers
| Feature | ArcadeDB (Built-In) | External Wrapper | Direct SDK Integration |
|---|---|---|---|
| Schema freshness | ✓ Live | — Cached | — Manual |
| Query language breadth | ✓ 5+ languages | — Usually SQL-only | — Depends on SDK |
| Permission model | ✓ AST-semantic | — Text-based | — App-level |
| Deployment overhead | ✓ Zero | — Extra service | — Custom code |
| Multi-model queries | ✓ Native | — Not supported | — Manual orchestration |
| Audit trail integration | ✓ Built-in | — External | — DIY |
For agents that need real database access — not mock queries, not cached schemas — the built-in approach eliminates the most common failure modes.
Key Takeaways
- Built-in, not bolted-on: The MCP server runs inside ArcadeDB's engine — zero deployment overhead, always available
- Multi-model, one interface: Your agent queries graphs, documents, vectors, and time series through the same MCP endpoint
- Security by semantic analysis: AST-level query parsing catches write operations regardless of how cleverly they're disguised
- Fine-grained permissions: Each operation type (read, insert, update, delete, schema change, admin) has independent enable/disable
- Real-time schema awareness: The agent always sees the live schema, eliminating stale-query failures
- Graph-native: 70+ built-in graph algorithms enable complex relationship queries without data movement
- Open source, forever: Apache 2.0 with structural governance guarantees — created by the founder of OrientDB
ArcadeDB: arcadedb.com · GitHub: github.com/ArcadeData/arcadedb · MCP Blog: arcadedb.com/blog · Docker: hub.docker.com/r/arcadedata/arcadedb · Facio MCP docs: facio.bot/docs/mcp