Product · Jun 5, 2026

Facio's Built-in Log System: How read_logs Makes Agent Execution Auditable in Real Time

When an AI agent makes a mistake at 4 AM, you need to know what happened — not wait for a human to grep through server logs. Facio's read_logs tool gives agents access to their own persistent execution log, with level filtering, time-range queries, and regex search. The agent diagnoses its own failures. Here's how the architecture works and why self-auditability matters for production autonomy.

Log System · Auditability · Self-Diagnostics · Agent Operations · Observability

When an AI agent makes a mistake at 4 AM, the worst-case scenario is silence. The agent failed, you don't know why, and the only path to diagnosis is a human SSH-ing into a server to grep through application logs. By the time someone reads the logs, the context is cold and the damage might already be compounding.

Facio solves this by giving agents direct access to their own persistent execution log via the read_logs tool. The agent can query its own history — filtered by severity, time range, or regex pattern — and use that information to diagnose failures, detect patterns, and self-correct. Here's how the architecture works.

The Architecture: Persistent, Queryable, Agent-Accessible

Facio maintains a persistent log file that records every tool call, every error, every state change — across sessions, cron jobs, and sub-agent executions. The log is:

  • Persistent. Survives agent restarts and session boundaries. A cron job's log entry from last Tuesday is as accessible as the current session's output.
  • Structured. Every entry has a timestamp, severity level (DEBUG through CRITICAL), and source context.
  • Queryable. The read_logs tool supports level filtering, time-range queries (durations like 5m or ISO timestamps), and regex pattern matching.
  • Agent-accessible. The agent queries its own logs the same way it queries memory or reads files, with no separate monitoring dashboard required.

# What errors happened in the last hour?
read_logs(level="ERROR", since="1h")

# What did the cron job at midnight produce?
read_logs(since="2026-06-05T00:00:00Z", grep="cron")

# Any CRITICAL events in the last 24 hours?
read_logs(level="CRITICAL", since="1d")

The agent doesn't just produce log entries — it reads them back and acts on what it finds.
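To make "structured" concrete, here is a minimal Python sketch of splitting one entry into its timestamp, severity, and source fields. The line layout, the LogEntry type, and the parse_line helper are assumptions made for illustration; this post does not document Facio's actual on-disk format.

```python
import re
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Hypothetical layout: "<ISO timestamp> <LEVEL> [<source>] <message>".
# Facio's real log format may differ.
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<level>DEBUG|INFO|WARNING|ERROR|CRITICAL)\s+"
    r"\[(?P<source>[^\]]+)\]\s+(?P<message>.*)$"
)


@dataclass(frozen=True)
class LogEntry:
    timestamp: datetime
    level: str
    source: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Parse one raw line; return None if it does not fit the assumed layout."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    try:
        # Older Python versions reject a trailing "Z", so normalize it first.
        ts = datetime.fromisoformat(match["ts"].replace("Z", "+00:00"))
    except ValueError:
        return None
    return LogEntry(ts, match["level"], match["source"], match["message"])


entry = parse_line(
    "2026-06-05T03:59:12Z ERROR [mcp.weather-api] connection refused after 3 retries"
)
print(entry)
```

Keeping the fields separate is what lets the severity, time, and pattern filters work independently of one another.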

Self-Diagnosis: The Agent as Its Own Ops Engineer

The most powerful use of read_logs is self-diagnosis. When the agent detects a failure — a tool call returned an error, an MCP server is unresponsive, a cron job produced unexpected output — it can query the log history to understand what happened.

1. Agent's tool call returns an error.
2. Agent calls read_logs(level="ERROR", since="30s") to see the failure context.
3. Log shows: "MCP server 'weather-api' connection refused after 3 retries."
4. Agent cross-references with recent warnings: read_logs(level="WARNING", since="10m", grep="weather")
5. Log shows repeated timeout warnings for the same server over the last 10 minutes.
6. Agent concludes: server is degraded, not a transient error.
7. Agent calls manage_mcp(action="disable", name="weather-api"), logs the action, continues the workflow.

No human intervened. The agent diagnosed a production issue, traced it through warning history, isolated the failing component, and continued in degraded mode — all using a tool that reads its own execution record.

This is the difference between an agent that fails and waits for help, and an agent that fails and figures out why.
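The seven steps above can be sketched as a small decision function. This is an illustration under stated assumptions, not Facio's implementation: read_logs and manage_mcp are passed in as plain callables, their return shapes (a list of raw lines, and nothing) are guesses, and the warning_threshold cutoff for "degraded" is a hypothetical parameter.

```python
import re
from typing import Callable, List


def diagnose_mcp_failure(
    server: str,
    read_logs: Callable[..., List[str]],
    manage_mcp: Callable[..., None],
    warning_threshold: int = 3,  # hypothetical cutoff for "degraded"
) -> str:
    """Return "degraded" (and disable the server) or "transient"."""
    pattern = re.escape(server)  # treat the server name as a literal, not a regex

    # Steps 2-3: look at the immediate failure context.
    errors = read_logs(level="ERROR", since="30s", grep=pattern)
    if not errors:
        return "transient"

    # Steps 4-5: cross-reference with the recent warning history.
    history = read_logs(level="WARNING", since="10m", grep=pattern)

    # Steps 6-7: repeated trouble means degraded; isolate the component.
    if len(history) >= warning_threshold:
        manage_mcp(action="disable", name=server)
        return "degraded"
    return "transient"


# Stand-ins for the real tools, so the sketch runs on its own.
def fake_read_logs(level: str, since: str, grep: str) -> List[str]:
    error = "ERROR MCP server 'weather-api' connection refused after 3 retries"
    if level == "ERROR":
        return [error]
    return ["WARNING weather-api request timed out"] * 4 + [error]


disabled = []


def fake_manage_mcp(action: str, name: str) -> None:
    disabled.append((action, name))


verdict = diagnose_mcp_failure("weather-api", fake_read_logs, fake_manage_mcp)
print(verdict, disabled)
```

Passing the tools in as arguments keeps the decision logic testable without a live agent runtime.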

The Filter Surface: Finding Signal in the Noise

Production agent logs are noisy. A single complex workflow can generate hundreds of entries across DEBUG, INFO, WARNING, and ERROR levels. read_logs has three independent filter axes to cut through the noise:

By severity (level). Skip DEBUG and INFO entries when diagnosing a failure. Start at WARNING or ERROR. The level parameter acts as a minimum threshold: level="WARNING" returns WARNING, ERROR, and CRITICAL entries.

By time (since). Accepts both ISO timestamps ("2026-06-05T09:00:00Z") and human-readable durations ("5m", "2h", "1d"). "What went wrong in the last 5 minutes?" is a natural query that doesn't require the agent to compute timestamps.

By pattern (grep). Regex matching against raw log lines. Filter to entries containing a specific tool name ("browser_navigate"), a specific MCP server ("weather-api"), or a specific error code ("ECONNREFUSED").
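As a rough sketch of what the level and since arguments imply, the helpers below resolve a minimum severity into the set of matching levels and turn a duration or ISO timestamp into an absolute cutoff. The names levels_at_or_above and resolve_since, and the set of duration units (s, m, h, d), are assumptions based on the examples in this post rather than documented behavior.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import List

LEVELS = ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"]
UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def levels_at_or_above(minimum: str) -> List[str]:
    """Treat the level argument as a minimum threshold."""
    return LEVELS[LEVELS.index(minimum.upper()):]


def resolve_since(since: str, now: datetime) -> datetime:
    """Turn "5m", "2h", "1d" or an ISO timestamp into an absolute cutoff."""
    match = re.fullmatch(r"(\d+)([smhd])", since)
    if match:
        return now - timedelta(**{UNITS[match[2]]: int(match[1])})
    # Not a duration: assume an ISO 8601 timestamp, normalizing a trailing "Z".
    return datetime.fromisoformat(since.replace("Z", "+00:00"))


now = datetime(2026, 6, 5, 9, 0, tzinfo=timezone.utc)
print(levels_at_or_above("WARNING"))
print(resolve_since("5m", now).isoformat())
print(resolve_since("2026-06-05T09:00:00Z", now).isoformat())
```

Resolving durations against a single "now" value means the agent never has to compute timestamps itself, which is the convenience the since parameter is meant to provide.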

Combined, the three filters zero in on exactly the relevant context:

read_logs(
    level="WARNING",
    since="1h",
    grep="rate.limit|timeout"
)

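This returns all WARNING-or-higher entries from the last hour whose raw line matches "timeout" or "rate limit". Because "." matches any single character, "rate-limit" and "rate_limit" match as well. It is the kind of query you'd run when investigating a slowdown.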
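The combined query can be pictured as three independent predicates applied to each entry. The sketch below runs them over an in-memory list of (timestamp, level, raw line) tuples; that tuple shape and the filter_entries name are assumptions for illustration, and the real tool presumably reads from the persistent log instead.

```python
import re
from datetime import datetime, timedelta, timezone

LEVELS = ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"]


def filter_entries(entries, level, cutoff, grep):
    """Keep entries at or above `level`, at or after `cutoff`, matching `grep`."""
    floor = LEVELS.index(level)
    pattern = re.compile(grep)
    return [
        (ts, lvl, line)
        for ts, lvl, line in entries
        if LEVELS.index(lvl) >= floor      # severity axis
        and ts >= cutoff                   # time axis
        and pattern.search(line)           # pattern axis
    ]


now = datetime(2026, 6, 5, 10, 0, tzinfo=timezone.utc)
entries = [
    (now - timedelta(minutes=90), "WARNING", "WARNING [http] rate limit hit"),
    (now - timedelta(minutes=30), "INFO", "INFO [http] timeout budget reset"),
    (now - timedelta(minutes=20), "WARNING", "WARNING [http] rate-limit backoff"),
    (now - timedelta(minutes=10), "ERROR", "ERROR [browser] navigation timeout"),
    (now - timedelta(minutes=5), "ERROR", "ERROR [fs] permission denied"),
]

hits = filter_entries(
    entries, "WARNING", now - timedelta(hours=1), r"rate.limit|timeout"
)
for _, _, line in hits:
    print(line)
```

Each of the three rejected entries fails exactly one axis: the first is too old, the second is below the severity floor, and the last does not match the pattern.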

Use Cases: When the Agent Reads Its Own Logs

Pattern 1: Cron Job Verification

A cron job runs at 2 AM to generate a daily report. At 8 AM, the agent verifies it worked:

read_logs(since="2026-06-05T02:00:00Z", grep="daily-report")

If the log shows the job completed successfully, no action needed. If it shows errors or no entries at all, the agent triggers a retry or alerts the human — before the missing report becomes a problem.
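A minimal sketch of that verification step follows. It assumes read_logs hands back a list of raw lines with the severity visible in each line; check_cron_run and its three status strings are hypothetical names, and the choice between retrying and alerting is left to the caller.

```python
import re
from typing import List

# Assumes the severity appears as a standalone word in each raw line.
SEVERE = re.compile(r"\b(ERROR|CRITICAL)\b")


def check_cron_run(lines: List[str]) -> str:
    """Classify the lines returned for one scheduled job."""
    if not lines:
        return "missing"   # no entries at all: the job may never have started
    if any(SEVERE.search(line) for line in lines):
        return "failed"
    return "ok"


print(check_cron_run([]))
print(check_cron_run(["2026-06-05T02:00:41Z INFO [cron] daily-report completed"]))
print(check_cron_run(["2026-06-05T02:00:17Z ERROR [cron] daily-report upload failed"]))
```

Treating an empty result as its own status matters here: a job that never ran produces no error lines, so checking only for errors would report a false success.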

Pattern 2: Failure Pattern Detection

Over time, the agent can recognize recurring failures:

read_logs(level="ERROR", since="7d", grep="ECONNREFUSED")

# Returns: 14 entries over 7 days, all from "weather-api" MCP server
# Pattern: connection refused every ~12 hours
# Action: replace or repair the server

The agent doesn't need a separate monitoring system to tell it a server is flapping. It reads its own logs, finds the pattern, and acts.
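One way to turn a week of matching entries into "which server, how often" is to count by source and take the median gap between occurrences. The sketch assumes each entry has already been reduced to a (timestamp, source) pair; summarize_failures is a hypothetical helper, and the sample data is synthetic, built to mirror the 14-entry example above.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone
from statistics import median


def summarize_failures(events):
    """events: (timestamp, source) pairs in any order."""
    by_source = Counter(source for _, source in events)
    times = sorted(ts for ts, _ in events)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    typical_gap = median(gaps) if gaps else None  # None with fewer than 2 events
    return by_source, typical_gap


# Synthetic data: one failure every 12 hours for 7 days.
start = datetime(2026, 5, 29, 0, 0, tzinfo=timezone.utc)
events = [(start + timedelta(hours=12 * i), "weather-api") for i in range(14)]

by_source, typical_gap = summarize_failures(events)
print(by_source.most_common(1), typical_gap)
```

The median is used instead of the mean so that a single long quiet period does not hide an otherwise regular failure rhythm.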

Pattern 3: Post-Mortem Context

After a workflow failure, the agent reconstructs the timeline:

# What happened in the 2 minutes leading up to the failure?
read_logs(since="2026-06-05T09:58:00Z")

# Returns the full execution trace: tool calls, responses, warnings, errors
# Agent can identify the exact step where the chain broke

This turns "the workflow failed" into "the workflow failed at step 7 because the browser session timed out after step 5 took longer than expected" — actionable diagnosis, not a vague error message.
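Locating "the exact step where the chain broke" can be as simple as finding the first ERROR-or-worse entry in the trace and keeping a few entries before it. The sketch below assumes the trace is a chronologically ordered list of (level, message) pairs; first_failure and the sample messages are illustrative, not real Facio output.

```python
def first_failure(entries, context=3):
    """Return (index, preceding entries plus the failure) or (None, [])."""
    for index, (level, _) in enumerate(entries):
        if level in ("ERROR", "CRITICAL"):
            start = max(0, index - context)
            return index, entries[start:index + 1]
    return None, []


# Illustrative trace, oldest entry first.
trace = [
    ("INFO", "step 5: browser_navigate started"),
    ("WARNING", "step 5: browser_navigate took longer than expected"),
    ("INFO", "step 6: extract table"),
    ("ERROR", "step 7: browser session timed out"),
    ("INFO", "workflow aborted"),
]

index, window = first_failure(trace, context=2)
for level, message in window:
    print(level, message)
```

The warning two entries before the error is the useful part of the window: it is what links the step 7 timeout back to the slow step 5.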

Pattern 4: Compliance Reporting

For regulated environments, the agent can produce an audit-ready execution summary:

read_logs(level="INFO", since="1d")

Filtered and summarized, this becomes: "In the last 24 hours, this agent executed 47 tool calls across 3 sessions, encountered 2 warnings (both transient network timeouts, self-resolved), and 0 errors." That's a compliance officer's summary, generated by the agent from its own logs — not pieced together from disparate monitoring systems.
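The counting behind such a summary is straightforward once entries carry a level, a session identifier, and an event kind. Those three fields, the "tool_call" kind, and the summarize helper are assumptions for this sketch; the sample data is synthetic and shaped to reproduce the figures quoted above.

```python
from collections import Counter


def summarize(entries):
    """entries: (level, session_id, kind) tuples from the reporting window."""
    levels = Counter(level for level, _, _ in entries)
    sessions = {session for _, session, _ in entries}
    tool_calls = sum(1 for _, _, kind in entries if kind == "tool_call")
    errors = levels["ERROR"] + levels["CRITICAL"]
    return (
        f"{tool_calls} tool calls across {len(sessions)} sessions, "
        f"{levels['WARNING']} warnings, {errors} errors"
    )


# Synthetic data: 47 tool calls spread over 3 sessions, plus 2 warnings.
entries = [("INFO", f"session-{i % 3}", "tool_call") for i in range(47)]
entries += [("WARNING", "session-1", "network_timeout")] * 2

summary = summarize(entries)
print(summary)
```

Judgments such as "transient" or "self-resolved" are not derivable from counts alone, so in practice the agent would add them after reading the individual warning entries.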

Integration with the Audit Trail

read_logs is the operational layer of Facio's observability architecture. The compliance layer is the audit trail — the append-only record of every tool call, every human approval, and every state change.

The two layers serve different purposes:

                | read_logs                                  | Audit trail
Purpose         | Operational diagnosis and self-correction  | Compliance and traceability
Access pattern  | Agent queries at runtime                   | Human reviews for audits
Granularity     | Per-entry with severity levels             | Per-event with accountability metadata
Retention       | Rolling, configurable                      | Long-term, append-only

The agent uses read_logs to stay healthy. The compliance officer uses the audit trail to stay compliant. Both draw from the same underlying execution record — just at different levels of abstraction for different audiences.

What read_logs Doesn't Do

The log system has deliberate constraints:

  • The agent cannot modify logs. The log file is append-only, and read_logs is a read operation. The agent can diagnose from the record but cannot alter it, which preserves the integrity of the execution record that the audit trail also draws on.
  • The agent cannot read other agents' logs. Each agent's log is scoped to its own execution context. Cross-agent correlation requires the audit trail or external aggregation.
  • Raw log files are not directly accessible. The agent uses the read_logs tool, not read_file against the log path. The tool enforces the filtering, pagination, and access controls.

Bottom Line

Observability for AI agents isn't just about dashboards and alerts for humans. It's about giving the agent the ability to understand its own execution — to read its own history, diagnose its own failures, and act on what it finds.

Facio's read_logs tool gives agents that capability: a queryable, filterable, persistent record of everything they've done. No separate monitoring infrastructure. No "wait for a human to check the logs." The agent queries, diagnoses, and self-corrects — at 4 AM, without waking anyone up.

Because an autonomous agent that can't read its own logs isn't fully autonomous. It's just waiting to fail.


See the log system documentation for filter syntax, severity level configuration, and integration patterns with the audit trail.
