Engineering · May 30, 2026

MCP Spotlight: WebDriverIO MCP — Browser + Mobile App Automation, One Server, 34 Tools

The official WebDriverIO MCP server gives AI agents 34 tools to automate Chrome, Firefox, Edge, and Safari browsers — plus iOS and Android apps via Appium — all through a unified interface. Session recording, device emulation, and BrowserStack support included.

Tags: MCP Server · WebDriverIO · Browser Automation · Mobile Testing · Appium · AI Agents

Server: @wdio/mcp by WebDriverIO
Stars: 1.3k+ · License: MIT · Tools: 34 · Latest: v3.2.4 (updated today)
MCP Tracker: glama.ai/mcp/servers/webdriverio/mcp
Docs: webdriver.io/docs/mcp

Most browser automation MCP servers stop at the browser. WebDriverIO's official MCP server goes further: the same 34 tools control Chrome, Firefox, Edge, and Safari — plus iOS and Android native apps via Appium, all through one unified interface.

Install in one line:

npx -y @wdio/mcp@latest

What Sets It Apart

WebDriverIO MCP is mobile-first by design. Unlike browser-only alternatives, it supports iOS simulators, Android emulators, and real devices from day one. It's built on the battle-tested WebDriverIO framework — provenance matters when your agent is running headless automation in CI.

Capability	Details
Browsers	Chrome (headed/headless), Firefox, Edge, Safari
Mobile platforms	iOS (XCUITest), Android (UiAutomator2) — simulators, emulators, and real devices
Transport	stdio (default) + optional HTTP for clients that can't launch subprocesses
Element detection	Cross-platform: CSS selectors, XPath, accessibility ID, iOS predicates, UiAutomator
Session recording	Every tool call automatically recorded, exportable as runnable WebDriverIO JS
Device emulation	Apply mobile/tablet presets (iPhone 15, Pixel 7) to simulate responsive layouts
BrowserStack	Built-in support for real iOS/Android devices and browser matrices in the cloud

The Tool Set

The 34 tools break down into seven categories:

Session Management (4 tools): start_session, launch_chrome (with remote debugging), close_session (with detach support), emulate_device

Navigation & Page Interaction (7 tools): navigate, get_elements (viewport-filtered, paginated), get_accessibility_tree (role-filtered), get_screenshot (auto-resized ≤1MB), get_tabs, scroll, execute_script

Element Interaction (3 tools): click_element, set_value, switch_frame / switch_tab

Mobile Gestures (3 tools): tap_element, swipe (directional), drag_and_drop

Context Switching (2 tools): get_contexts, switch_context — seamless native ↔ webview transitions in hybrid apps

Device Control (7 tools): get_app_state, rotate_device, lock_device / unlock_device, get_geolocation, set_geolocation, show_keyboard / hide_keyboard, press_key

Cookie & BrowserStack (8 tools): get_cookies, set_cookie, delete_cookies, upload_app, list_apps, list_devices, take_healing_snapshot, get_session_logs
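As a quick sanity check on the breakdown above, the per-category counts can be tallied in a few lines. This is only an arithmetic sketch: the category names and counts are copied from the list, and the variable names are made up for illustration.

```typescript
// Category names and per-category counts are copied from the breakdown above.
const toolCounts: Record<string, number> = {
  "Session Management": 4,
  "Navigation & Page Interaction": 7,
  "Element Interaction": 3,
  "Mobile Gestures": 3,
  "Context Switching": 2,
  "Device Control": 7,
  "Cookie & BrowserStack": 8,
};

// Count the categories and sum the tools they contain.
const categoryCount = Object.keys(toolCounts).length;
const totalTools = Object.values(toolCounts).reduce((sum, n) => sum + n, 0);

console.log(`${categoryCount} categories, ${totalTools} tools`);
// prints "7 categories, 34 tools"
```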

Architecture: The Bridge Model

WebDriverIO MCP acts as a protocol bridge between AI assistants and automation engines:

┌─────────────┐    MCP (stdio)    ┌──────────────┐
│  AI Agent   │ ◄──────────────►  │  @wdio/mcp   │
└─────────────┘                   └──────┬───────┘
                                         │ WebDriverIO API
                    ┌────────────────────┼───────────────────┐
                    │                    │                   │
               Chrome/Firefox       Appium iOS          Appium Android
               (W3C WebDriver)      (XCUITest)          (UiAutomator2)

Single-session model: one active browser or app session at a time, state maintained globally across tool calls
Auto-detach: sessions with noReset: true automatically detach on close, preserving state
Smart element detection: on mobile, parses XML page source in 2 HTTP calls instead of 600+ traditional queries, generating multiple locator strategies per element
HTTP transport option: for clients that can't launch subprocesses (OpenAI Codex secure mode, llama.cpp), the server supports HTTP mode on any port
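The single-session and auto-detach behavior described above can be pictured with a small state holder. This is a minimal sketch, not the server's actual implementation: SingleSessionStore, SessionOptions, and CloseAction are hypothetical names, and rejecting a second start_session while one is active is an assumption (the real server may replace the session instead).

```typescript
// Hypothetical types: none of these names come from @wdio/mcp itself.
interface SessionOptions {
  platform: "browser" | "ios" | "android";
  noReset?: boolean;
}

type CloseAction = "detached" | "terminated" | "no-session";

class SingleSessionStore {
  private active: SessionOptions | null = null;

  // Assumption: starting a session while one is active is rejected,
  // which keeps exactly one browser or app session alive at a time.
  start(options: SessionOptions): void {
    if (this.active !== null) {
      throw new Error("A session is already active; close it first.");
    }
    this.active = options;
  }

  // Every tool call resolves the same globally held session.
  current(): SessionOptions {
    if (this.active === null) {
      throw new Error("No active session.");
    }
    return this.active;
  }

  // Sessions started with noReset: true are detached rather than
  // terminated, so app or browser state is left in place.
  close(): CloseAction {
    if (this.active === null) return "no-session";
    const action: CloseAction = this.active.noReset ? "detached" : "terminated";
    this.active = null;
    return action;
  }
}

const store = new SingleSessionStore();
store.start({ platform: "android", noReset: true });
const firstClose = store.close(); // "detached"
store.start({ platform: "browser" });
const secondClose = store.close(); // "terminated"
console.log(firstClose, secondClose);
```

Holding the session in one place is what lets later tool calls such as click_element or get_screenshot omit any session identifier.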

Session Recording: The Audit Trail Angle

Every tool call is automatically recorded and exportable as runnable WebDriverIO JavaScript. For teams using Facio as their agent runtime, this creates a natural handoff:

Your agent automates a test flow via WebDriverIO MCP
Facio captures every tool call in its audit trail
The session recording exports as executable JS you can commit to your test suite

This closes the loop from agent-driven exploration to deterministic, repeatable CI tests — with full traceability at every step.
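To make the export idea concrete, here is a rough sketch of how recorded tool calls could be turned into WebDriverIO code. The RecordedCall shape, the argument names (url, selector, value), and the helper functions are all assumptions for illustration; the server's real recording format and exporter may look different. The emitted commands (browser.url, $(...).click, $(...).setValue) are standard WebDriverIO API.

```typescript
// Hypothetical shape of one recorded tool call; the real recording format may differ.
interface RecordedCall {
  tool: "navigate" | "click_element" | "set_value";
  args: Record<string, string>;
}

// Map one recorded call to a line of WebDriverIO code.
function toWdioLine(call: RecordedCall): string {
  // JSON.stringify gives a safely quoted string literal.
  const q = (s: string | undefined): string => JSON.stringify(s ?? "");
  switch (call.tool) {
    case "navigate":
      return `await browser.url(${q(call.args.url)});`;
    case "click_element":
      return `await $(${q(call.args.selector)}).click();`;
    case "set_value":
      return `await $(${q(call.args.selector)}).setValue(${q(call.args.value)});`;
  }
}

// Join the generated lines into one script body, in call order.
function exportScript(calls: RecordedCall[]): string {
  return calls.map((call) => toWdioLine(call)).join("\n");
}

const recording: RecordedCall[] = [
  { tool: "navigate", args: { url: "https://webdriver.io" } },
  { tool: "set_value", args: { selector: "#email", value: "test@example.com" } },
  { tool: "click_element", args: { selector: "#submit" } },
];

const script = exportScript(recording);
console.log(script);
```

The generated lines would still need to be wrapped in a test block and reviewed before being committed to a suite.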

Facio Integration

{
  "mcpServers": {
    "wdio-mcp": {
      "command": "npx",
      "args": ["-y", "@wdio/mcp@latest"]
    }
  }
}

For BrowserStack real-device testing:

{
  "mcpServers": {
    "wdio-mcp": {
      "command": "npx",
      "args": ["-y", "@wdio/mcp@latest"],
      "env": {
        "BROWSERSTACK_USERNAME": "${credentials.BROWSERSTACK_USERNAME}",
        "BROWSERSTACK_ACCESS_KEY": "${credentials.BROWSERSTACK_ACCESS_KEY}"
      }
    }
  }
}

For HTTP transport (Facio agents running in containerized environments that can't spawn subprocesses):

npx @wdio/mcp --http --port 3000 --allowedOrigins "*"

Then configure the MCP endpoint as http://localhost:3000/mcp.
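For orientation, this is roughly what a tool invocation looks like on the wire once a client is connected. MCP messages are JSON-RPC 2.0, and tools are invoked with the tools/call method. The sketch only builds the request body. It assumes the server accepts such messages at the endpoint above, that the client has already completed the MCP initialize handshake, and that the navigate tool takes a url argument; buildToolCall and ToolCallRequest are hypothetical names.

```typescript
// JSON-RPC 2.0 envelope that MCP uses for tool invocations.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Build one tools/call request with a fresh request id.
function buildToolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// "navigate" is one of the tools listed earlier; the "url" argument name is an assumption.
const request = buildToolCall("navigate", { url: "https://webdriver.io" });
const body = JSON.stringify(request);
console.log(body);
```

In practice an MCP client library handles this framing, so agents rarely construct these messages by hand.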

Quickstart Examples

Browser automation:

"Open Chrome headless and navigate to https://webdriver.io. Take a screenshot, find all visible links in the nav bar, and check if the 'Get Started' button is present."

Mobile web testing with device emulation:

"Start a Chrome session, emulate an iPhone 15, navigate to our checkout page, and take a screenshot at 390×844."

Native iOS app testing:

"Start my iOS app on the iPhone 15 simulator. Tap the login button, type 'test@example.com' into the email field, swipe up to scroll to the submit button, and take a screenshot."

Hybrid app context switching:

"Launch the app on Android. Check available contexts, switch to WEBVIEW_com.myapp, find the search input, and type 'test query'."

BrowserStack on real device:

"Start a BrowserStack session on a Samsung Galaxy S23 running Android 13, upload my app .apk, install it, and run through the signup flow."

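Behind each prompt, the agent ends up issuing a sequence of tool calls. As an illustration, the hybrid app prompt above might decompose as follows. The tool names come from the tool list earlier in this article, but the step order, every argument name and value, and the PlannedStep and unknownTools helpers are assumptions, not documented behavior.

```typescript
// One planned step: a tool name plus hypothetical arguments.
interface PlannedStep {
  tool: string;
  args: Record<string, unknown>;
}

// A plausible plan for the hybrid app prompt; argument names are assumptions.
const hybridAppPlan: PlannedStep[] = [
  { tool: "start_session", args: { platform: "android" } },
  { tool: "get_contexts", args: {} },
  { tool: "switch_context", args: { context: "WEBVIEW_com.myapp" } },
  { tool: "get_elements", args: {} },
  { tool: "set_value", args: { selector: "input[type=search]", value: "test query" } },
];

// Subset of the tool names listed in this article, used to sanity-check a plan.
const knownTools = new Set<string>([
  "start_session", "close_session", "navigate", "get_elements",
  "get_contexts", "switch_context", "click_element", "set_value",
  "tap_element", "swipe", "get_screenshot",
]);

// Return any tool names in the plan that are not in the known set.
function unknownTools(plan: PlannedStep[]): string[] {
  return plan.map((step) => step.tool).filter((tool) => !knownTools.has(tool));
}

console.log(unknownTools(hybridAppPlan).length === 0 ? "plan uses known tools" : "plan has unknown tools");
```

A check like this is useful when reviewing recorded sessions, since a typo in a tool name shows up before anything touches a device.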
When to Choose WebDriverIO MCP

WebDriverIO MCP is the right choice when:

You need mobile + browser from a single MCP server, not two separate ones
You're already in the WebDriverIO ecosystem and want session recordings to feed into your test suite
You need real-device testing via BrowserStack without switching toolchains
Your agent needs Appium-level device control (geolocation, rotation, keyboard, app lifecycle) — not just viewport emulation

For pure browser-only workflows, simpler alternatives exist. But for cross-platform test automation driven by an AI agent, WebDriverIO MCP is the most complete option available today.

Bottom Line

WebDriverIO MCP packs 34 tools across browsers and mobile platforms into a single, actively maintained package from a team that's been building test automation for over a decade. The session recording feature — combined with Facio's built-in audit trail — creates a clean path from agent-driven exploration to reproducible CI tests.

With npx -y @wdio/mcp@latest, evaluation needs no setup beyond the config above: add the JSON and ask your agent to open a browser.

MCP Spotlight is a series covering servers that give AI agents real capabilities. Every server is evaluated for tool quality, cross-platform reach, and integration fit with Facio's HITL-first agent runtime.
