Engineering · Jun 16, 2026

MCP Spotlight: Site-Shot — Give Your Agent Eyes on Any Web Page, With Ad & Cookie-Banner Removal Built In

Site-Shot MCP gives AI agents 2 tools to screenshot any web page with real Chromium rendering, country proxies, and ad & cookie-banner removal built in by default — returning cleaner images that cost fewer vision tokens.

Tags: MCP Server · Site-Shot · Screenshots · Agent Vision · Web Scraping · AI Agents

Server: site-shot-mcp by Site-Shot
Tools: 2 · License: MIT · Transport: stdio (NPX) · Render: Real Chromium, country proxies
MCP Tracker: glama.ai/mcp/servers/site-shot/site-shot-mcp
Docs: site-shot.com · Pricing

Most vision-capable agents stumble on the same first task: open a web page and tell me what's there. The page is 4,000 pixels tall, half of it is a cookie banner, a quarter is ads, and the content the user actually cares about starts at scroll-depth 600. By the time the model has decoded the whole image, you've burned a third of your context window on chrome.

Site-Shot MCP is the first screenshot server I've seen that ships ad and cookie-banner removal as tool parameters, not as a post-processing step. It exposes two tools over one HTTP API, renders with real Chromium through country proxies, and returns cleaner images, which means fewer vision tokens and faster agent runs.

The Two Tools

Tool               | What It Does
capture_screenshot | Screenshot a web page (viewport by default, optionally full page)
capture_full_page  | Same as capture_screenshot with full_page: true baked in

Both tools return the screenshot as an MCP image — the model sees it directly, no base64 decoding dance, no local file handling.
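
To make the return shape concrete: the MCP specification represents an image result as a content item with type "image", a base64 data string, and a mimeType. Most MCP clients decode this for you, which is why the model never deals with base64 itself. The sketch below shows what that decoding amounts to, assuming a result shaped like a tools/call response. The extract_images helper and the sample payload are hypothetical, not part of Site-Shot.

```python
import base64


def extract_images(tool_result: dict) -> list[tuple[str, bytes]]:
    """Return (mime_type, raw_bytes) for every image item in an MCP tool result."""
    images = []
    for item in tool_result.get("content", []):
        # Tool results may mix text and image items; keep only the images.
        if item.get("type") == "image":
            images.append((item["mimeType"], base64.b64decode(item["data"])))
    return images


# Hypothetical payload: only the 8-byte PNG signature, not a real screenshot.
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
sample_result = {
    "content": [
        {
            "type": "image",
            "mimeType": "image/png",
            "data": base64.b64encode(PNG_MAGIC).decode("ascii"),
        }
    ]
}

images = extract_images(sample_result)
```

If you ever need the file on disk (for an archive, say), the bytes in each tuple can be written out as-is.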

The Parameters That Matter

Every Site-Shot call supports a focused set of parameters that solve the real problems agents hit on the web:

Param                              | Type   | Default     | Why It Matters
url                                | string | required    | The page to capture
full_page                          | bool   | false       | Capture entire scrollable page (up to 20,000px)
width / height                     | number | 1280 / 1024 | Viewport / device size for responsive testing
format                             | string | png         | png or jpeg
block_ads                          | bool   | true        | Remove ads from the image: fewer tokens, cleaner content
block_cookie_banners               | bool   | true        | Remove cookie consent popups, so the page is what the user actually sees after dismissing them
country                            | string | -           | Country proxy: "Germany" auto-sets IP, language, timezone, geolocation
language / time_zone / geolocation | string | -           | Manual overrides for each
wait_ms                            | number | -           | Wait before capture (SPAs, animations, lazy-loaded content)
max_height                         | number | 20000       | Cap full-page height to avoid pathological pages

The two flags that change the economics of agent vision:

  • block_ads: true (the default). Strips banner ads, sidebar promos, and sponsored content from the rendered image. The model sees the article, not the ad-tech.
  • block_cookie_banners: true (the default). Removes the "We use cookies" overlay that obscures the page on first visit. The model sees what the user sees after they've dismissed the banner.

Both are on by default, which means a stock capture_screenshot call already returns a clean, focused image. This is the right default — vision models are expensive, and wasting tokens on chrome is wasted money.
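
As a minimal sketch of how those parameters and defaults might look from client code, the Python below mirrors the table in a dataclass and wraps the arguments in an MCP tools/call request (JSON-RPC 2.0). CaptureArgs and tools_call are hypothetical helper names, the integer types are an assumption (the table only says "number"), and the language, time_zone, and geolocation overrides are left out for brevity.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class CaptureArgs:
    """Hypothetical mirror of the parameter table; defaults copied from it."""
    url: str
    full_page: bool = False
    width: int = 1280
    height: int = 1024
    format: str = "png"
    block_ads: bool = True
    block_cookie_banners: bool = True
    country: Optional[str] = None
    wait_ms: Optional[int] = None
    max_height: int = 20000

    def to_arguments(self) -> dict:
        if self.format not in ("png", "jpeg"):
            raise ValueError("format must be 'png' or 'jpeg'")
        # Omit unset optional fields so the server applies its own behavior.
        return {k: v for k, v in asdict(self).items() if v is not None}


def tools_call(name: str, arguments: dict, request_id: int = 1) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


default_args = CaptureArgs(url="https://example.com").to_arguments()
request = tools_call("capture_screenshot", default_args)
```

In practice the agent's MCP client builds this envelope for you. The point of the sketch is that a call with only url set already carries block_ads and block_cookie_banners as true.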

Country Proxies: See the Web From Anywhere

The country parameter is the killer feature for international agents:

"Take a screenshot of https://www.spiegel.de from a German IP"
→ capture_screenshot(url="https://www.spiegel.de", country="Germany")

"Screenshot the Apple homepage as seen from Japan"
→ capture_screenshot(url="https://apple.com/jp/", country="Japan")

"Show me what Google.de looks like vs google.com"
→ capture_screenshot(url="https://www.google.de", country="Germany")
→ capture_screenshot(url="https://www.google.com", country="US")

The country proxy auto-sets the IP address, the Accept-Language header, the timezone, and the geolocation. The page renders as a local user would see it. For agents doing international market research, localization QA, or geo-restriction testing, this is the entire workflow in one parameter.
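
For a multi-country comparison, one way to organize the work is to build the argument set per country up front and key each capture by its country, so results can be paired later. The sketch below assumes only the url and country parameters described above. localized_captures is a hypothetical helper, and the country names are the ones used in the examples in this post.

```python
def localized_captures(url: str, countries: list[str], **overrides) -> dict[str, dict]:
    """Map each country to the capture_screenshot arguments for that locale.

    Extra keyword arguments (for example full_page=True) are applied to every capture.
    """
    return {
        country: {"url": url, "country": country, **overrides}
        for country in countries
    }


plan = localized_captures("https://example.com", ["Germany", "Japan"], full_page=True)
for country, arguments in plan.items():
    print(country, arguments)
```

Each dictionary in the plan is what an agent would pass as the arguments of one capture_screenshot call.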

Real Chromium, Not Headless Puppeteer

Site-Shot runs real Chromium in a managed environment, rather than a stripped-down headless Puppeteer setup or a slim render-as-HTML service. This matters for three reasons:

  1. JavaScript-heavy pages render correctly — React SPAs, Vue apps, SvelteKit sites all load and execute before capture
  2. Web fonts and CSS animations are present — the screenshot is visually accurate, not a sanitized approximation
  3. wait_ms actually works — you can wait for animations, lazy-loaded images, or async data fetches to complete

For agents that need to inspect SPAs, web apps, and modern JS-driven sites (which is most of the web in 2026), the rendering fidelity is the differentiator.

Facio Integration

{
  "mcpServers": {
    "site-shot": {
      "command": "npx",
      "args": ["-y", "site-shot-mcp"],
      "env": {
        "SITESHOT_API_KEY": "${credentials.SITESHOT_API_KEY}"
      }
    }
  }
}

Facio's audit trail captures every screenshot call with the URL, the parameters, the country proxy, the format, and the resulting image metadata. For agents performing competitive intelligence, market research, or visual QA workflows, this creates a complete record: "agent screenshotted the German Apple homepage at 14:32 UTC, blocked ads and cookie banners, returned a 1280×800 PNG."

For HITL workflows, the screenshot call is inherently read-only — it returns an image, it doesn't mutate any state. The human reviews the screenshot after the agent surfaces it, and decides whether the agent should take the next action (refund, click, purchase, write up). The MCP is the read-side; the human is the action-side; Facio captures both.

Quickstart

# 1. Get a Site-Shot API key
#    https://www.site-shot.com/pricing/

# 2. Add the MCP to your client
{
  "mcpServers": {
    "site-shot": {
      "command": "npx",
      "args": ["-y", "site-shot-mcp"],
      "env": {
        "SITESHOT_API_KEY": "your_api_key"
      }
    }
  }
}

# 3. Restart your MCP client (Claude Desktop, Cursor, etc.)
#    MCP servers load at launch, not on hot-reload.

# 4. First prompts
# "Take a screenshot of https://news.ycombinator.com and tell me the top 3 stories"
# "Screenshot the Apple homepage as seen from Germany — full page, 1440px wide"
# "Capture https://www.spiegel.de from a German IP, wait 2 seconds for the page to finish loading, block the cookie banner"
# "Take a full-page screenshot of our competitor's pricing page and compare it to ours"
# "Screenshot our landing page in mobile (375×667) and desktop (1440×900) viewports"
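
If your client config already lists other MCP servers, step 2 is a merge rather than a paste. Here is a small Python sketch of that merge, assuming the config is a JSON object with a top-level mcpServers key as shown above. add_site_shot and the "other-server" entry are hypothetical, and where your client stores its config file is left to its own documentation.

```python
import copy
import json
import os


def add_site_shot(config: dict, api_key: str) -> dict:
    """Return a copy of an MCP client config with the site-shot server added."""
    updated = copy.deepcopy(config)  # leave the caller's config untouched
    servers = updated.setdefault("mcpServers", {})
    servers["site-shot"] = {
        "command": "npx",
        "args": ["-y", "site-shot-mcp"],
        "env": {"SITESHOT_API_KEY": api_key},
    }
    return updated


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}

# Read the key from the environment, falling back to the placeholder above.
merged = add_site_shot(existing, os.environ.get("SITESHOT_API_KEY", "your_api_key"))
print(json.dumps(merged, indent=2))
```

Working on a copy keeps the original config intact, so you can review the result before writing it back.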

Use Cases

Competitive intelligence: "Screenshot our top 3 competitors' homepages from a US IP at full page width. Tell me the value propositions and pricing they lead with." Agent makes 3 calls, gets 3 clean images, reasons across them.

Localization QA: "Screenshot our app from Germany, France, Japan, and Brazil. Verify that the localized strings are showing correctly and no English fallback text appears." 4 calls with different country values.

Visual regression testing: "Take a full-page screenshot of our staging environment at 1440px wide, then compare it to last week's screenshot. Highlight any visual changes." Pair Site-Shot with a diff tool to surface regressions.

Market research: "Screenshot the top 5 e-commerce sites in our category from a US IP. What's the dominant design pattern for the product card?" Agent captures, the model reasons across the images.

Lead enrichment: "Take a screenshot of the prospect's homepage. Use the image to write a personalized first-touch email referencing their current value prop and product positioning." Image → visual context → personalized copy.

Web archive screenshots: "Capture https://example.com/pricing as it looks right now, full page, and save the image for our competitive archive." On-demand visual snapshots of any public web page.

Ad-blocking verification: "Take two screenshots of the same page — one with block_ads: true and one with block_ads: false. The difference shows what ad-tech is on the page." Useful for ad ops and privacy audits.

Bottom Line

Site-Shot MCP is the first screenshot server I've seen that ships with ad and cookie-banner removal as defaults, and that single design choice changes the economics of agent vision: fewer tokens per image, cleaner context for the model, faster runs.

Two tools, real Chromium rendering, country proxies for international captures, and a single SITESHOT_API_KEY config block. For any agent that needs to see the web — competitive intelligence, market research, localization QA, visual regression, lead enrichment — this is the missing layer.

npx -y site-shot-mcp and your agent can see any page, from any country, with the chrome stripped out.


MCP Spotlight is a series covering servers that give AI agents real capabilities. Every server is evaluated for tool design, output quality, and integration fit with Facio's HITL-first agent runtime.