MCP Spotlight: WebDriverIO MCP — Browser + Mobile App Automation, One Server, 34 Tools
Server: @wdio/mcp by WebDriverIO
Stars: 1.3k+ · License: MIT · Tools: 34 · Latest: v3.2.4 (updated today)
MCP Tracker: glama.ai/mcp/servers/webdriverio/mcp
Docs: webdriver.io/docs/mcp
Most browser automation MCP servers stop at the browser. WebDriverIO's official MCP server goes further: the same 34 tools control Chrome, Firefox, Edge, and Safari — plus iOS and Android native apps via Appium, all through one unified interface.
Install in one line:
npx -y @wdio/mcp@latest
What Sets It Apart
WebDriverIO MCP is mobile-first by design. Unlike browser-only alternatives, it supports iOS simulators, Android emulators, and real devices from day one. It's built on the battle-tested WebDriverIO framework — provenance matters when your agent is running headless automation in CI.
| Capability | Details |
|---|---|
| Browsers | Chrome (headed/headless), Firefox, Edge, Safari |
| Mobile platforms | iOS (XCUITest), Android (UiAutomator2) — simulators, emulators, and real devices |
| Transport | stdio (default) + optional HTTP for clients that can't launch subprocesses |
| Element detection | Cross-platform: CSS selectors, XPath, accessibility ID, iOS predicates, UiAutomator |
| Session recording | Every tool call automatically recorded, exportable as runnable WebDriverIO JS |
| Device emulation | Apply mobile/tablet presets (iPhone 15, Pixel 7) to simulate responsive layouts |
| BrowserStack | Built-in support for real iOS/Android devices and browser matrices in the cloud |
The Tool Set
The 34 tools break down into seven categories:
Session Management (4 tools):
start_session, launch_chrome (with remote debugging), close_session (with detach support), emulate_device
Navigation & Page Interaction (7 tools):
navigate, get_elements (viewport-filtered, paginated), get_accessibility_tree (role-filtered), get_screenshot (auto-resized ≤1MB), get_tabs, scroll, execute_script
Element Interaction (3 tools):
click_element, set_value, switch_frame / switch_tab
Mobile Gestures (3 tools):
tap_element, swipe (directional), drag_and_drop
Context Switching (2 tools):
get_contexts, switch_context — seamless native ↔ webview transitions in hybrid apps
Device Control (7 tools):
get_app_state, rotate_device, lock_device / unlock_device, get_geolocation, set_geolocation, show_keyboard / hide_keyboard, press_key
Cookie & BrowserStack (8 tools):
get_cookies, set_cookie, delete_cookies, upload_app, list_apps, list_devices, take_healing_snapshot, get_session_logs
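Each tool listed above is invoked by the client through MCP's standard `tools/call` JSON-RPC method. The sketch below shows what such a request might look like for `navigate`. The `tools/call` envelope follows the MCP specification, but the `url` argument key is an assumption, since the tool's input schema is not shown here; the helper name `buildToolCall` is hypothetical.

```typescript
// Shape of a JSON-RPC 2.0 request for MCP's `tools/call` method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextRequestId = 1;

// Build a `tools/call` request for a named tool (hypothetical helper).
function buildToolCall(
  name: string,
  args: Record<string, unknown> = {}
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// `navigate` comes from the tool list; the `url` key is assumed, not confirmed.
const navigateCall = buildToolCall("navigate", { url: "https://webdriver.io" });

// This string is what a client would write to the server over stdio or HTTP.
const serializedCall = JSON.stringify(navigateCall);
```

In practice an MCP client library builds this envelope for you; the point is only that all 34 tools share one request shape and differ in `name` and `arguments`.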
Architecture: The Bridge Model
WebDriverIO MCP acts as a protocol bridge between AI assistants and automation engines:
┌─────────────┐   MCP (stdio)    ┌──────────────┐
│  AI Agent   │ ◄──────────────► │  @wdio/mcp   │
└─────────────┘                  └──────┬───────┘
                                        │ WebDriverIO API
                   ┌────────────────────┼───────────────────┐
                   │                    │                   │
            Chrome/Firefox         Appium iOS        Appium Android
            (W3C WebDriver)        (XCUITest)        (UiAutomator2)
- Single-session model: one active browser or app session at a time, state maintained globally across tool calls
- Auto-detach: sessions with noReset: true automatically detach on close, preserving state
- Smart element detection: on mobile, parses XML page source in 2 HTTP calls instead of 600+ traditional queries, generating multiple locator strategies per element
- HTTP transport option: for clients that can't launch subprocesses (OpenAI Codex secure mode, llama.cpp), the server supports HTTP mode on any port
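To make "multiple locator strategies per element" concrete, here is a minimal sketch of how candidate selectors could be derived from one node of an Android page-source dump. It is not the server's actual implementation: the `AndroidNode` shape, the function name, and the ordering of candidates are all assumptions. The selector strings themselves use WebdriverIO's documented syntax (`~` for accessibility ID, `android=` for UiAutomator, plain XPath).

```typescript
// Attributes as they might appear on one node of an Android page-source XML dump.
// This shape is hypothetical; the real parser's data model is not documented here.
interface AndroidNode {
  className: string;    // e.g. "android.widget.Button"
  resourceId?: string;  // the "resource-id" attribute
  contentDesc?: string; // the "content-desc" attribute (accessibility ID)
  text?: string;        // the visible "text" attribute
}

// Return candidate selectors in WebdriverIO selector syntax.
// The order (accessibility ID, UiAutomator, XPath fallback) is an assumed preference.
// Note: values containing double quotes would need escaping, omitted in this sketch.
function locatorCandidates(node: AndroidNode): string[] {
  const out: string[] = [];
  if (node.contentDesc) {
    out.push(`~${node.contentDesc}`);
  }
  if (node.resourceId) {
    out.push(`android=new UiSelector().resourceId("${node.resourceId}")`);
  }
  if (node.text) {
    out.push(`android=new UiSelector().text("${node.text}")`);
  }
  // XPath fallback: qualify by resource-id if present, else by text, else class only.
  const predicate = node.resourceId
    ? `[@resource-id="${node.resourceId}"]`
    : node.text
      ? `[@text="${node.text}"]`
      : "";
  out.push(`//${node.className}${predicate}`);
  return out;
}

const loginButton: AndroidNode = {
  className: "android.widget.Button",
  resourceId: "com.myapp:id/login",
  contentDesc: "Login",
  text: "Log in",
};
const loginLocators = locatorCandidates(loginButton);
```

Because every node's attributes come from a single page-source fetch, candidates for the whole screen can be computed locally without a per-element round trip, which is the saving the bullet above describes.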
Session Recording: The Audit Trail Angle
Every tool call is automatically recorded and exportable as runnable WebDriverIO JavaScript. For teams using Facio as their agent runtime, this creates a natural handoff:
- Your agent automates a test flow via WebDriverIO MCP
- Facio captures every tool call in its audit trail
- The session recording exports as executable JS you can commit to your test suite
This closes the loop from agent-driven exploration to deterministic, repeatable CI tests — with full traceability at every step.
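As a rough illustration of that export step, the sketch below turns a list of recorded tool calls into WebdriverIO script lines. The `RecordedCall` shape and the argument keys (`url`, `selector`, `value`) are assumptions, and the server's real export format may differ. The emitted calls (`browser.url`, `$(...).click()`, `$(...).setValue()`) are standard WebdriverIO APIs.

```typescript
// One recorded tool call. The shape and argument keys are hypothetical.
interface RecordedCall {
  tool: string;
  args: Record<string, string>;
}

// Quote a value as a JavaScript string literal.
const quote = (s: string | undefined): string => JSON.stringify(s ?? "");

// Map one recorded call to a line of WebdriverIO code.
function toWdioLine(call: RecordedCall): string {
  switch (call.tool) {
    case "navigate":
      return `await browser.url(${quote(call.args["url"])});`;
    case "click_element":
      return `await $(${quote(call.args["selector"])}).click();`;
    case "set_value":
      return `await $(${quote(call.args["selector"])}).setValue(${quote(call.args["value"])});`;
    default:
      // Tools without a mapping in this sketch are kept as comments.
      return `// unsupported in this sketch: ${call.tool}`;
  }
}

// Join the generated lines into a script body.
function exportScript(calls: RecordedCall[]): string {
  return calls.map(toWdioLine).join("\n");
}

const recordedSession: RecordedCall[] = [
  { tool: "navigate", args: { url: "https://webdriver.io" } },
  { tool: "click_element", args: { selector: "a=Get Started" } },
  { tool: "get_screenshot", args: {} },
];
const exportedScript = exportScript(recordedSession);
```

The generated lines would still need to sit inside a test body (for example a Mocha `it` block) before they can be committed to a suite.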
Facio Integration
{
  "mcpServers": {
    "wdio-mcp": {
      "command": "npx",
      "args": ["-y", "@wdio/mcp@latest"]
    }
  }
}
For BrowserStack real-device testing:
{
  "mcpServers": {
    "wdio-mcp": {
      "command": "npx",
      "args": ["-y", "@wdio/mcp@latest"],
      "env": {
        "BROWSERSTACK_USERNAME": "${credentials.BROWSERSTACK_USERNAME}",
        "BROWSERSTACK_ACCESS_KEY": "${credentials.BROWSERSTACK_ACCESS_KEY}"
      }
    }
  }
}
For HTTP transport (Facio agents running in containerized environments that can't spawn subprocesses):
npx @wdio/mcp --http --port 3000 --allowedOrigins "*"
Then configure the MCP endpoint as http://localhost:3000/mcp.
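For a quick sanity check of the HTTP endpoint, a client's first message is an MCP `initialize` request. The sketch below only builds that request. It assumes the server's HTTP mode follows the MCP Streamable HTTP transport (a single POST endpoint that may answer with JSON or an event stream), which this article does not confirm. The client name, version, and the `initializeRequest` helper are placeholders.

```typescript
// Plain description of an HTTP request, so the sketch stays independent of any HTTP client.
interface HttpRequestSpec {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the MCP `initialize` request for a given endpoint (hypothetical helper).
function initializeRequest(endpoint: string): HttpRequestSpec {
  return {
    url: endpoint,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // Streamable HTTP clients advertise both response types.
      Accept: "application/json, text/event-stream",
    },
    body: JSON.stringify({
      jsonrpc: "2.0",
      id: 1,
      method: "initialize",
      params: {
        protocolVersion: "2025-03-26", // an MCP spec revision; the server may negotiate another
        capabilities: {},
        clientInfo: { name: "example-client", version: "0.0.1" }, // placeholder values
      },
    }),
  };
}

const initRequest = initializeRequest("http://localhost:3000/mcp");
```

The resulting `url`, `method`, `headers`, and `body` can be handed to any HTTP client, such as `fetch`. Note that `--allowedOrigins "*"` accepts requests from any origin, so it is best limited to local evaluation.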
Quickstart Examples
Browser automation:
"Open Chrome headless and navigate to https://webdriver.io. Take a screenshot, find all visible links in the nav bar, and check if the 'Get Started' button is present."
Mobile web testing with device emulation:
"Start a Chrome session, emulate an iPhone 15, navigate to our checkout page, and take a screenshot at 390×844."
Native iOS app testing:
"Start my iOS app on the iPhone 15 simulator. Tap the login button, type 'test@example.com' into the email field, swipe up to scroll to the submit button, and take a screenshot."
Hybrid app context switching:
"Launch the app on Android. Check available contexts, switch to WEBVIEW_com.myapp, find the search input, and type 'test query'."
BrowserStack on real device:
"Start a BrowserStack session on a Samsung Galaxy S23 running Android 13, upload my app .apk, install it, and run through the signup flow."
When to Choose WebDriverIO MCP
WebDriverIO MCP is the right choice when:
- You need mobile + browser from a single MCP server, not two separate ones
- You're already in the WebDriverIO ecosystem and want session recordings to feed into your test suite
- You need real-device testing via BrowserStack without switching toolchains
- Your agent needs Appium-level device control (geolocation, rotation, keyboard, app lifecycle) — not just viewport emulation
For pure browser-only workflows, simpler alternatives exist. But for cross-platform test automation driven by an AI agent, WebDriverIO MCP is one of the most complete options available today.
Bottom Line
WebDriverIO MCP packs 34 tools across browsers and mobile platforms into a single, actively maintained package from a team that's been building test automation for over a decade. The session recording feature — combined with Facio's built-in audit trail — creates a clean path from agent-driven exploration to reproducible CI tests.
With npx -y @wdio/mcp@latest, evaluation needs no setup beyond the config: add the JSON above and ask your agent to open a browser.
MCP Spotlight is a series covering servers that give AI agents real capabilities. Every server is evaluated for tool quality, cross-platform reach, and integration fit with Facio's HITL-first agent runtime.