Facio's Web Research Stack: How web_search and web_fetch Turn AI Agents Into Live Research Engines
The reasoning capabilities of AI agents have advanced dramatically in 2026. The data layer they reason over has not. An agent trained on data with a January 2026 cutoff cannot tell you what changed in the React docs last Tuesday, what a competitor's pricing page says today, or what regulators announced yesterday.
Firecrawl's recent research on agentic web search frames this as the "search-reason loop": the agent doesn't search once and reason once. It searches, reads, updates what it knows, and searches again with better questions. The loop continues until the agent has enough coverage or hits a budget limit.
Facio's web_search and web_fetch tools are the building blocks of this loop, built into the runtime as first-class tools rather than external services the agent has to integrate with. Here's how the architecture works and what it enables.
The Two-Tool Web Research Architecture
Facio splits web research into two distinct operations, each optimized for its specific role:
web_search for discovery. The agent provides a query, and the tool returns a list of relevant sources — titles, URLs, and snippets. The agent uses these results to decide which pages are worth reading in full. No content fetching, no browser rendering — just the discovery layer, fast and token-efficient.
web_fetch for content. The agent provides a URL, and the tool extracts the page's readable content (HTML converted to clean markdown or text) and returns it for the agent to process. The agent reads the actual content of the pages it identified as relevant — without needing browser automation, JavaScript execution, or headless Chrome.
This separation mirrors how human researchers work: search engines to find candidates, full reads to extract detail. The two operations have different costs, different latencies, and different output formats. Facio's tools reflect that.
# Step 1: Find candidate sources
web_search(query="browser automation AI agents 2026 landscape", count=10)
# Returns: 10 URLs with titles and snippets
# Step 2: Read the most relevant ones in full
web_fetch(url="https://zylos.ai/research/2026-browser-automation-landscape")
# Returns: Clean markdown of the page content
The Search-Reason Loop in Practice
The arXiv paper "From Web Search towards Agentic Deep Research" describes a four-stage evolution of how AI systems access information:
- Keyword matching with static ranked results
- LLMs answering from training data with no retrieval
- RAG adding a retrieval step before generation
- Agentic deep research with search-reason loops that adapt in real time
Facio's web tools are designed for stage 4. The agent isn't doing one-shot search and answer. It's iterating:
Initial query: "What are the top open-source MCP servers in 2026?"
web_search → 8 results, mostly blog posts
Refined query: "MCP servers production-ready enterprise 2026"
web_search → 5 results, GitHub repos and case studies
web_fetch → reads the top 3 results in full
Synthesis: "Based on 3 in-depth reads, the top production-ready MCP servers are..."
Verification query: "browserbase MCP server limitations"
web_search → finds a known issue
Updated synthesis: "with the caveat that X has a Y limitation in production"
The agent reads, updates its understanding, and queries again. The loop continues until the research is complete — bounded by token budget, search result count, or the agent's own quality threshold.
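The iteration described above can be sketched as a bounded loop. This is a minimal illustration, not Facio's implementation: the `search`, `fetch`, `refine`, and `enough` callables are hypothetical stand-ins for the runtime tools and the agent's own judgment, and counting one unit per tool call is an assumption.

```python
from typing import Callable, Dict, List, Set

# Hypothetical signatures: stand-ins for the runtime's web_search and web_fetch tools.
SearchFn = Callable[[str], List[Dict[str, str]]]  # query -> [{"url": ..., "title": ...}]
FetchFn = Callable[[str], str]                    # url -> readable page text


def search_reason_loop(
    question: str,
    search: SearchFn,
    fetch: FetchFn,
    refine: Callable[[str, List[str]], str],
    enough: Callable[[List[str]], bool],
    max_calls: int = 10,
) -> List[str]:
    """Alternate searching and reading until coverage is enough or the budget is spent."""
    notes: List[str] = []
    seen: Set[str] = set()
    calls = 0
    query = question
    while calls < max_calls and not enough(notes):
        results = search(query)
        calls += 1
        for result in results:
            # Stop reading as soon as the budget is gone or coverage is sufficient.
            if calls >= max_calls or enough(notes):
                break
            url = result["url"]
            if url in seen:
                continue  # never pay twice for the same page
            seen.add(url)
            notes.append(fetch(url))
            calls += 1
        # Ask a better question based on what has been read so far.
        query = refine(question, notes)
    return notes
```

Passing the stopping rule in as `enough` keeps the agent's quality threshold separate from the hard `max_calls` limit, so either one can end the loop.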
Why web_fetch and the Browser Tools Are Not the Same Thing
Facio also has a full browser_* tool suite for interactive web work. The natural question is: why have a separate web_fetch when the agent can navigate to a page with browser_navigate?
The answer is specialization:
| | web_fetch | browser_* |
|---|---|---|
| Best for | Static content, articles, docs, pricing pages | Interactive workflows, logins, form submissions, multi-step navigation |
| Output format | Clean markdown/text | Accessibility tree + screenshots |
| JavaScript | Not executed (faster, simpler) | Executed (handles JS-heavy sites) |
| Authentication | Public pages only | Persistent sessions with cookies |
| Token cost | Low (clean text extraction) | Higher (snapshot + accessibility tree) |
| Speed | Fast (HTTP request + extraction) | Slower (page load + render) |
For a research task where the agent needs to read 20 articles, web_fetch is the right tool. For a research task where the agent needs to log into a SaaS dashboard and extract data, browser_* is the right tool. The agent picks the right tool based on the task, not by defaulting to the more powerful one for every job.
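The selection rule in the table can be written as a small decision function. The sketch below is illustrative only: `PageNeeds` and its fields are hypothetical names, and the real agent makes this choice through reasoning rather than a hard-coded rule.

```python
from dataclasses import dataclass


@dataclass
class PageNeeds:
    """What a task requires from a page (hypothetical fields for illustration)."""
    requires_login: bool = False
    requires_javascript: bool = False
    requires_interaction: bool = False  # clicks, form submissions, multi-step navigation


def choose_tool(needs: PageNeeds) -> str:
    """Return the cheaper tool unless the task needs something only the browser provides."""
    if needs.requires_login or needs.requires_javascript or needs.requires_interaction:
        return "browser_navigate"
    return "web_fetch"
```

The default is the cheap path: a page with no special requirements goes to web_fetch, and the browser is chosen only when one of the three conditions holds.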
Token-Efficient Content Extraction
web_fetch converts HTML to readable markdown, stripping the noise: navigation, ads, scripts, styles, and cookie banners. The output is what a human would see if they read the page with all the chrome removed:
web_fetch(url="https://example.com/article", extractMode="markdown")
The agent gets:
# Article Title
The article body, with formatting preserved. Links are inline...
## Section Heading
Paragraphs of actual content, not sidebar widgets or footer navigation.
The default maxChars of 50,000 is enough for most articles. For longer pages, the agent can pass a custom maxChars to control context window consumption. The agent decides how much of the page it needs — the first 5,000 characters for a summary, or the full 50,000 for a deep read.
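To make the effect of a character limit concrete, here is one way such a cap could behave. How web_fetch actually truncates is not described in this article, so the paragraph-boundary cut and the `clip_content` helper are assumptions for illustration.

```python
def clip_content(markdown: str, max_chars: int = 50_000) -> str:
    """Keep at most max_chars characters, preferring to cut at a paragraph break."""
    if len(markdown) <= max_chars:
        return markdown
    clipped = markdown[:max_chars]
    cut = clipped.rfind("\n\n")
    # Fall back to a hard cut if there is no paragraph break in the kept window.
    return clipped[:cut] if cut > 0 else clipped
```

Cutting at a paragraph break avoids handing the agent a sentence that stops mid-thought, at the cost of returning slightly fewer characters than the limit allows.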
When to Use Each Combination
Facio's research stack has three primary patterns:
Pattern 1: Search + Read (Most Common)
1. web_search(query="...") → find candidate sources
2. web_fetch(url=...) → read the top 3-5 results in full
3. Synthesize findings
The standard research workflow. Used for blog posts, market research, competitive analysis, technical documentation lookup.
Pattern 2: Read Directly (Known Source)
1. web_fetch(url="https://specific-source.com/page") → skip search, go straight to known source
When the agent already knows which source it needs (from memory, from a previous research session, from a user-provided URL), it skips the search step and reads directly. Faster, fewer tokens.
Pattern 3: Search + Browser + Fetch (Deep Research)
1. web_search(query="...") → find candidate sources
2. browser_navigate(url=...) for sites that require login → authenticated read
3. web_fetch(url=...) for public sources → fast read
4. Synthesize
For research that mixes public sources (articles, documentation) with authenticated sources (paid databases, internal dashboards). The agent uses web_fetch for the easy reads and browser_* for the authenticated reads — minimizing the expensive tool.
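The routing step in this pattern can be sketched as a partition over the candidate URLs. The `plan_reads` helper and the idea of a known set of authenticated hosts are hypothetical; in practice the agent may only discover that a page needs a login after trying it.

```python
from typing import Dict, Iterable, List, Set
from urllib.parse import urlparse


def plan_reads(urls: Iterable[str], authenticated_hosts: Set[str]) -> Dict[str, List[str]]:
    """Split URLs into cheap public reads and browser-based authenticated reads."""
    plan: Dict[str, List[str]] = {"web_fetch": [], "browser_navigate": []}
    for url in urls:
        host = urlparse(url).hostname or ""
        tool = "browser_navigate" if host in authenticated_hosts else "web_fetch"
        plan[tool].append(url)
    return plan
```

Everything not explicitly marked as authenticated falls through to web_fetch, which keeps the expensive tool to the minimum set of pages.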
Integration with the Memory System
Research findings should be remembered. Facio's web research tools integrate naturally with the memory system:
1. web_search + web_fetch → research findings
2. edit_file(MEMORY.md) → save key findings as durable facts
3. write_file(output/research-report.md) → save the full synthesis as a deliverable
The research is both immediately actionable (the agent uses the findings to make a decision) and durably stored (memory and output files preserve the work for future sessions).
For example: an agent researching MCP server options can save the analysis to output/mcp-server-comparison.md and update MEMORY.md with "We chose Server X for our production deployment — see output/mcp-server-comparison.md for the full evaluation." Future sessions know the decision was made, can find the analysis, and don't re-research the same ground.
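The save-and-point pattern above amounts to two file operations. The sketch below uses plain Python file I/O in place of the runtime's edit_file and write_file tools; `record_research` is a hypothetical helper, and the one-line pointer format is an assumption.

```python
from pathlib import Path


def record_research(workdir: Path, report_name: str, report: str, decision: str) -> Path:
    """Write the full report under output/ and append a one-line pointer to MEMORY.md."""
    report_path = workdir / "output" / report_name
    report_path.parent.mkdir(parents=True, exist_ok=True)
    report_path.write_text(report, encoding="utf-8")

    memory_path = workdir / "MEMORY.md"
    pointer = f"- {decision} See output/{report_name} for the full evaluation.\n"
    # Append rather than overwrite so earlier durable facts are preserved.
    with memory_path.open("a", encoding="utf-8") as memory:
        memory.write(pointer)
    return report_path
```

Keeping the long analysis in the output file and only a short pointer in memory means future sessions pay a few tokens to learn the decision exists, and read the full report only when they need it.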
Quota and Cost Management
Live web research can be expensive — every search and fetch costs tokens, and the search-reason loop multiplies the consumption. Facio manages this through the same token budget system that governs all agent activity:
- The agent has a per-session iteration budget (visible in runtime context)
- Each web_search and web_fetch counts against the budget
- The agent learns to be efficient: tight queries, targeted fetches, focused iteration
The result: agents that research thoroughly when needed but don't burn tokens on exhaustive searches for tasks that only need a quick check. The search-reason loop runs until the agent has enough coverage or the budget runs out, and then the agent synthesizes from what it has gathered.
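The accounting behind a per-session budget can be illustrated with a simple counter. `CallBudget` is a hypothetical helper, and charging one unit per tool call is an assumption; the article does not specify how Facio weighs searches against fetches.

```python
class CallBudget:
    """Counts tool calls against a fixed per-session limit (hypothetical helper)."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0

    @property
    def remaining(self) -> int:
        return self.limit - self.used

    def try_spend(self, cost: int = 1) -> bool:
        """Reserve `cost` units if available; return False without spending otherwise."""
        if cost > self.remaining:
            return False
        self.used += cost
        return True
```

A caller would check `try_spend()` before each search or fetch and move to synthesis as soon as it returns False.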
What the Web Research Stack Doesn't Do
- No paywall bypass. web_fetch works for public content. For paywalled sources, the agent needs authenticated browser sessions.
- No deep web / dark web. These tools access the public web via standard search engines. Specialized sources require other tools.
- No real-time streaming. Search results and fetched content are point-in-time snapshots. For real-time data (live stock prices, current weather), agents use specialized APIs or cron-scheduled fetches.
- No automatic fact-checking. The agent reads what it finds and reasons over it. Cross-referencing and source quality assessment are the agent's responsibility.
Bottom Line
AI agents that can't access the live web are answering questions about a world that has moved on. Training data is stale the moment it's published, and even the most recent model checkpoints miss developments from the last few weeks — exactly the time period when freshness matters most for business, research, and decision-making.
Facio's web_search and web_fetch tools give agents direct access to the current web — search for sources, fetch clean content, reason over fresh data, and iterate. The search-reason loop runs in the same tool surface as every other runtime primitive: no external service, no separate API integration, no browser overhead for static content.
Because an agent that can reason but can't research is only half-useful. The other half is being able to ask "what's the latest?" and get an answer based on what the world looks like right now.
See the web research documentation for search query patterns, content extraction options, and integration with the search-reason loop.