MCP Spotlight: Playwright MCP — The Canonical Browser Agent, 90K Stars Strong
Server: @playwright/mcp by Microsoft
Stars: 90K (core Playwright) · Downloads: 220M npm/month · Tools: 24 · License: Apache 2.0
MCP Tracker: glama.ai/mcp/servers/microsoft/playwright-mcp
Docs: playwright.dev/docs/getting-started-mcp
If there's one MCP server that defines the category, it's Playwright MCP: Microsoft's first-party bridge between the Model Context Protocol and the Playwright browser automation engine, with 24 tools, 90,000 GitHub stars (on core Playwright), and 220 million npm downloads per month as of June 2026. This is the tool that made "agent, open a browser" a standard workflow.
The Accessibility Tree Revolution
Playwright MCP doesn't drive the page from screenshots. It uses the accessibility tree, a structured representation of the page that exposes element roles, text content, and interactive states, so the model never has to interpret a single pixel:
- heading "todos" [level=1]
- textbox "What needs to be done?" [ref=e5]
- listitem:
  - checkbox "Toggle Todo" [ref=e10]
  - text: "Buy groceries"
Your agent reads this. It types into ref=e5. It clicks ref=e10. No vision model. No pixel matching. No ambiguity.
This approach has three advantages over screenshot-based agents:
- Deterministic: Element references are exact. No "click the blue button near the top" guesswork.
- Fast: Accessibility trees are kilobytes, not megabytes. Parsing is instant.
- LLM-agnostic: Any model that can follow structured instructions works. No vision capability required.
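To make the "exact reference" point concrete, here is a minimal sketch of how a client could pull refs out of a snapshot like the one above. The line grammar is inferred from that sample only, and `parseSnapshot` and `findRef` are hypothetical helpers, not part of Playwright MCP; real snapshots may carry more attributes.

```javascript
// Sample snapshot, copied from the example above.
const sample = [
  '- heading "todos" [level=1]',
  '- textbox "What needs to be done?" [ref=e5]',
  '- listitem:',
  '  - checkbox "Toggle Todo" [ref=e10]',
  '  - text: "Buy groceries"',
].join("\n");

// Parse lines shaped like `- role "name" [ref=eN]`.
// Assumption: this grammar is inferred from the sample, not from a spec.
function parseSnapshot(snapshot) {
  const linePattern = /^\s*-\s+([a-z]+)(?:\s+"([^"]*)")?(.*)$/;
  const nodes = [];
  for (const line of snapshot.split("\n")) {
    const match = linePattern.exec(line);
    if (!match) continue; // skip lines that do not look like "- role ..."
    const [, role, name, rest] = match;
    const refMatch = /\[ref=([^\]]+)\]/.exec(rest);
    nodes.push({
      role,
      name: name ?? null,
      ref: refMatch ? refMatch[1] : null,
    });
  }
  return nodes;
}

// Look up the ref for an element by exact role and accessible name.
function findRef(nodes, role, name) {
  const hit = nodes.find((n) => n.role === role && n.name === name);
  return hit ? hit.ref : null;
}

const nodes = parseSnapshot(sample);
console.log(findRef(nodes, "textbox", "What needs to be done?"));
```

Because the lookup is an exact match on role and name, a miss returns `null` instead of a best guess, which is the behavior a screenshot-based agent cannot offer.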
The 24 Tools
Playwright MCP covers the complete browser interaction surface:
| Category | Tools |
|---|---|
| Navigation | browser_navigate, browser_navigate_back, browser_resize |
| Page Interaction | browser_click, browser_type, browser_fill_form, browser_select_option, browser_hover, browser_drag |
| Snapshot & Screenshot | browser_snapshot, browser_take_screenshot |
| Keyboard & Mouse | browser_press_key, browser_type |
| Dialogs | browser_handle_dialog |
| Tabs | browser_tabs, browser_close |
| Console & Network | browser_console_messages, browser_network_requests |
| Mocking | browser_route |
| Storage & Cookies | browser_save_storage_state, browser_restore_storage_state, browser_cookies_list, browser_cookies_set, browser_cookies_delete |
| Code Execution | browser_run_code_unsafe |
Plus the install tool (browser_install) and browser_wait_for for timing-dependent interactions.
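On the wire, each of these tools is invoked through MCP's `tools/call` request. Below is a minimal sketch of what a client might send for a click. It assumes `browser_click` accepts an `element` description and a `ref` taken from the latest snapshot; those argument names are an assumption here, so check the tool schema the server publishes. `buildToolCall` is a hypothetical helper.

```javascript
// Build a JSON-RPC 2.0 request for MCP's tools/call method.
let nextId = 1;
function buildToolCall(name, args) {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Assumed argument names (element, ref); verify against the server's schema.
const clickRequest = buildToolCall("browser_click", {
  element: "Toggle Todo checkbox",
  ref: "e10",
});

console.log(JSON.stringify(clickRequest, null, 2));
```

Most MCP clients build this envelope for you; the sketch is only meant to show that a tool call is a small structured message keyed by a ref, not a coordinate.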
The CLI vs MCP Tradeoff
Microsoft ships two interfaces for browser automation from AI agents, and the distinction matters:
- MCP: Best for exploratory, iterative, or persistent workflows. The agent maintains continuous browser context, taking new snapshots after each interaction and reasoning about page state. Higher token cost (the accessibility tree is re-read after every action), but richer introspection.
- CLI + SKILLs: Introduced via the newer Playwright CLI, this path uses concise CLI invocations instead of MCP tool schemas. Better for coding agents that need to balance browser automation with large codebases in limited context windows. Lower token cost, less introspection.
For Facio's HITL-first runtime, the MCP path aligns better — the accessibility snapshots feed directly into the decision trail, giving reviewers full visibility into what the agent saw at each step.
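The token-cost side of the tradeoff can be estimated with back-of-the-envelope arithmetic. The sketch below assumes roughly 4 characters per token, which is a common rule of thumb and not a real tokenizer, and it assumes one full snapshot is re-read after every action. `estimateSnapshotTokens` is a hypothetical helper and the input numbers are illustrative only.

```javascript
// Rough heuristic: about 4 characters per token. An approximation, not a tokenizer.
const CHARS_PER_TOKEN = 4;

// Estimate the tokens spent re-reading snapshots across a session.
// One snapshot per action, so the cost grows linearly with the action count.
function estimateSnapshotTokens(snapshotChars, actions) {
  const tokensPerSnapshot = Math.ceil(snapshotChars / CHARS_PER_TOKEN);
  return tokensPerSnapshot * actions;
}

// Illustrative inputs: an 8,000-character snapshot and a 10-action session.
console.log(estimateSnapshotTokens(8000, 10));
```

If an estimate like this eats a large share of your context window, that is the signal to consider the CLI path for that workload.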
The Unsafe Code Tool
For interactions that go beyond individual tool calls, browser_run_code_unsafe runs arbitrary Playwright JavaScript in the server process:
// Agent asks: "Verify the todo count"
async (page) => {
  const count = await page.getByTestId('todo-count').textContent();
  return count;
}
This is RCE-equivalent, so only enable it for trusted MCP clients. The tradeoff is real: without it, interactions are limited to what individual tools can express; with it, you get arbitrary Playwright at a security cost.
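One way to enforce the "trusted clients only" rule is to filter the tool list on the host side before exposing it to an agent. This is a minimal sketch of that idea under stated assumptions: `allowedTools` and the `clientIsTrusted` flag are hypothetical host-side constructs, not a Playwright MCP feature.

```javascript
// Tools that should never reach an untrusted client.
const UNSAFE_TOOLS = new Set(["browser_run_code_unsafe"]);

// Return the subset of tools a client may see.
// tools: array of { name } objects, e.g. from a tools/list response.
function allowedTools(tools, clientIsTrusted) {
  if (clientIsTrusted) return tools.slice();
  return tools.filter((tool) => !UNSAFE_TOOLS.has(tool.name));
}

const advertised = [
  { name: "browser_click" },
  { name: "browser_snapshot" },
  { name: "browser_run_code_unsafe" },
];

console.log(allowedTools(advertised, false).map((tool) => tool.name));
```

Filtering at the host keeps the decision out of the model's hands: an untrusted agent cannot call a tool it was never shown.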
Profile Modes
Playwright MCP supports three profile strategies:
| Mode | Behavior | Use case |
|---|---|---|
| Persistent (default) | Login state and cookies saved to disk per-project. Automatically scoped by workspace hash. | Long-running sessions; auth persistence |
| Isolated (--isolated) | Browser profile in memory only. Discarded on close. | One-off tasks; CI; testing |
| Extension (--extension) | Connects to your running browser via the Playwright MCP Bridge extension. | Using your existing logged-in browser; zero re-auth |
The extension mode is the killer feature for real-world use: connect to your browser with all your sessions intact, and the agent works with your actual logged-in state.
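The three modes map onto server arguments in a straightforward way. As a sketch, a small helper could generate the client config entry per mode. The flag names come from the table above and the `mcpServers` shape is the common MCP client config format; `playwrightServerConfig` itself is a hypothetical helper.

```javascript
// Flags per profile mode, taken from the table above.
const MODE_FLAGS = new Map([
  ["persistent", []], // default: no extra flag
  ["isolated", ["--isolated"]],
  ["extension", ["--extension"]],
]);

// Build an mcpServers entry for the given profile mode.
function playwrightServerConfig(mode, headless = false) {
  const flags = MODE_FLAGS.get(mode);
  if (!flags) throw new Error(`Unknown profile mode: ${mode}`);
  const args = [
    "@playwright/mcp@latest",
    ...(headless ? ["--headless"] : []),
    ...flags,
  ];
  return { mcpServers: { playwright: { command: "npx", args } } };
}

console.log(JSON.stringify(playwrightServerConfig("isolated", true), null, 2));
```

Generating the entry from one table of flags keeps local and CI configs from drifting apart.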
Facio Integration
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
The server runs headed by default (you see the browser) with a persistent profile (auth survives restarts), and Firefox, WebKit, and Edge are available via --browser. For CI:
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest", "--headless", "--isolated"]
    }
  }
}
With Facio's audit trail, every browser_snapshot and browser_click is captured. Reviewers see exactly what the agent saw and did — structured accessibility trees, not screenshots. This makes HITL review of browser automation deterministic and auditable.
Quickstart
# One-click install for VS Code
code --add-mcp '{"name":"playwright","command":"npx","args":["@playwright/mcp@latest"]}'
# Or Claude Code
claude mcp add playwright npx @playwright/mcp@latest
# First prompt after connection:
# "Navigate to https://demo.playwright.dev/todomvc
# and add three todo items: 'milk', 'eggs', 'bread'."
Ready to test in 30 seconds. No API keys. No accounts. Just a Node.js 18+ runtime.
Bottom Line
Playwright MCP is the canonical browser agent — 90K stars, 220M monthly downloads, and built by the team that defined browser automation. The accessibility tree approach makes interactions deterministic, fast, and vision-model-independent. The extension mode lets agents work with your real browser state. And the persistent profile means you configure auth once and the agent remembers it across sessions.
If your agent needs to interact with the web, this is where you start.
MCP Spotlight is a series covering servers that give AI agents real capabilities. Every server is evaluated for tool quality, engineering approach, and integration fit with Facio's HITL-first agent runtime.