
Security · Jun 24, 2026

When Your AI Agent Goes Rogue at 3 AM: The Runtime Forensics Playbook for the First 72 Hours

You get paged at 3 AM. The agent is calling APIs at 10x normal rate, your SIEM is silent, your audit trail has the first 200 events but not the last 8,000, and the compliance officer needs an incident timeline by 9 AM. The agent runtimes that survive this are the ones built for it.

Runtime Forensics · Incident Response · Audit Trail · Agent Observability · 72-Hour Playbook

The page comes at 3:14 AM. A PagerDuty alert with subject "agent-rates-anomaly" — one of your production agents is calling the customer database at ten times its baseline rate. The CFO has approved nothing unusual. The change log shows no new deployments. The on-call engineer opens the SIEM dashboard, which is green. The agent's logs show the first 200 events of the session, then stop. The compliance officer has been looped in and needs an incident timeline by 9 AM. The postmortem deadline is 48 hours later.

This is not a hypothetical. Versions of this incident have played out at financial services firms, healthcare providers, and SaaS companies throughout 2025 and 2026. The pattern is consistent: a runtime anomaly is detected, the audit trail is incomplete, the forensic timeline takes days to assemble, and the postmortem identifies gaps in the logging architecture without which the incident would not have been possible. The agent runtimes that survive this scenario are the ones whose forensic architecture was designed before the incident, not after.

This post is the playbook for the first 72 hours of an agent incident, and the runtime forensics architecture that makes the playbook work. The 72-hour window is the difference between a contained incident with a clear postmortem and an uncontrolled breach with regulators asking questions the security team cannot answer.

Why Most Agent Audit Trails Fail Under Incident Load

The audit trail is the artifact that determines whether the incident is manageable. In traditional software, the audit trail is the application log combined with the system log, both of which are designed to be retained, queryable, and complete. In AI agent deployments, the audit trail is often a partial artifact that fails in three specific ways.

The trail is sampled, not captured. The agent generates too many events to log every one. The team configured a 1% sampling rate to keep storage costs down. The 1% rate is fine for operational monitoring; it is useless for incident response, where the specific events that explain the incident may be in the unsampled 99%. The cost optimization that seemed prudent in 2025 is the investigation gap in 2026.
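To put numbers on the sampling gap, here is a small back-of-the-envelope sketch. It assumes uniform, independent sampling of events, which is a simplification; the function names and the "5 relevant events" scenario are illustrative and not taken from any particular logging product.

```python
def expected_captured(total_events: int, sample_rate: float) -> float:
    """Expected number of events that survive uniform random sampling."""
    return total_events * sample_rate


def prob_all_missed(relevant_events: int, sample_rate: float) -> float:
    """Probability that none of the relevant events were sampled,
    assuming each event is sampled independently."""
    return (1.0 - sample_rate) ** relevant_events


# Assumed scenario: 8,000 events in the session, 5 of which explain the incident.
captured = expected_captured(8000, 0.01)
missed = prob_all_missed(5, 0.01)
print(f"expected events captured: {captured:.0f}")
print(f"chance that all 5 key events are missing: {missed:.1%}")
```

Under these assumptions, the trail would most likely contain none of the five events the responder actually needs.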

The trail is in the agent's context, not in the runtime. The agent's reasoning is preserved in the LLM's context window. The context window is volatile; once it fills, older reasoning is evicted. The evicted reasoning is gone. The runtime that captures the context snapshot at each tool invocation has the data; the framework that relies on the model to log its own reasoning does not. When the incident is reconstructed, the model's "why" is unrecoverable.

The trail is not tamper-evident. The agent's logs are stored in a database that the agent's process can write to. A compromised agent can modify its own logs, retroactively rewriting the incident timeline. The audit trail that the postmortem relies on may not reflect the actual sequence of events. The incident timeline is no longer trustworthy.

These three failure modes are structural. They are not implementation bugs that better logging libraries can fix. The audit trail architecture is wrong, and the fix is at the design layer.

The First Hour: Triage and Containment

The first hour of the incident is triage. The goal is to stop the bleeding, preserve the evidence, and establish a controlled environment for investigation. Three things happen in parallel.

Containment: pause or kill the agent. The on-call engineer reaches the kill switch — or discovers that there is no kill switch, which is itself a finding. A runtime with a properly designed kill switch can pause the agent immediately. The agent's tool invocations are suspended. The agent's in-flight actions are either allowed to complete or rolled back, depending on the architecture. The agent is no longer taking new actions, but its state is preserved for forensic analysis.
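A minimal sketch of the pause behavior described above, assuming every tool invocation passes through a single choke point. `KillSwitch`, `ToolGateway`, and `AgentPausedError` are hypothetical names for illustration; rollback of in-flight actions is not modeled.

```python
import threading


class AgentPausedError(RuntimeError):
    """Raised when a tool call is attempted while the agent is paused."""


class KillSwitch:
    """Thread-safe pause flag that an operator can flip from outside the agent."""

    def __init__(self) -> None:
        self._paused = threading.Event()

    def pause(self) -> None:
        self._paused.set()

    def resume(self) -> None:
        self._paused.clear()

    @property
    def paused(self) -> bool:
        return self._paused.is_set()


class ToolGateway:
    """Hypothetical choke point that every tool invocation passes through."""

    def __init__(self, switch: KillSwitch) -> None:
        self._switch = switch
        # State is kept, not cleared, when the agent is paused.
        self.state = {"completed_calls": []}

    def invoke(self, tool_name, tool_fn, *args):
        if self._switch.paused:
            raise AgentPausedError(f"{tool_name} blocked: agent is paused")
        result = tool_fn(*args)
        self.state["completed_calls"].append(tool_name)
        return result
```

The design point is that the switch sits in the runtime, outside the agent's control, and that pausing leaves `state` intact for the snapshot that follows.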

Evidence preservation: snapshot the runtime. The agent's memory, its recent context windows, the audit trail, the policy decisions, the tool invocation history — all of it is snapshotted to a write-once location. The snapshot is timestamped and cryptographically signed. The snapshot is the evidence; subsequent investigation reads from the snapshot, not from the live system, which may continue to be modified.
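As a rough illustration of a timestamped, verifiable snapshot, the sketch below uses an HMAC over a canonical JSON encoding. This is a stand-in: an HMAC is a symmetric integrity check, and a production design would more likely use an asymmetric signature with a key the agent cannot reach. The write-once storage itself is not modeled, and `runtime_state` is assumed to be JSON-serializable.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone


def _canonical(payload: dict) -> bytes:
    # Stable encoding so the same payload always produces the same bytes.
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()


def build_snapshot(runtime_state: dict, signing_key: bytes) -> dict:
    """Wrap the runtime state with a UTC timestamp and an integrity tag."""
    payload = {
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "state": runtime_state,
    }
    tag = hmac.new(signing_key, _canonical(payload), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": tag}


def verify_snapshot(snapshot: dict, signing_key: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    expected = hmac.new(
        signing_key, _canonical(snapshot["payload"]), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, snapshot["signature"])
```

Investigators would call `verify_snapshot` before reading anything from the snapshot, so that any later modification is caught before it reaches the timeline.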

Stakeholder notification: page the right people. The on-call security engineer, the agent's owner, the compliance officer, and the legal team are notified. The notification is via a verified channel, not via the agent itself. The agent is not used to communicate about the incident; the agent is the subject of the incident.

The first hour ends with the bleeding stopped, the evidence preserved, and the right people aware. The next phase is investigation.

The First 24 Hours: Investigation and Timeline Reconstruction

The investigation phase is where the audit trail architecture is tested. The questions are specific: what did the agent do, when, in what order, with what inputs, and why? The audit trail must answer each question with timestamps, source attribution, and a chain of custody.

Tool invocation timeline. The audit trail records every tool the agent called, the arguments, the policy decision that authorized or denied the call, the response, and the timestamp. The timeline is queryable. The incident responder can see, at a glance, the rate anomaly that triggered the page: the customer database calls, the timestamps, the arguments that look anomalous in retrospect. The timeline answers the "what" question.
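One way to surface the rate anomaly from such a timeline is to bucket calls per minute and compare against a baseline. The event shape (`tool`, `ts` as an ISO 8601 string) and the 10x factor are assumptions for this sketch, not a schema from any specific runtime.

```python
from collections import Counter
from datetime import datetime


def calls_per_minute(events, tool):
    """Count calls to one tool, bucketed by the minute in which they occurred."""
    counts = Counter()
    for event in events:
        if event["tool"] == tool:
            ts = datetime.fromisoformat(event["ts"])
            counts[ts.replace(second=0, microsecond=0)] += 1
    return counts


def anomalous_minutes(counts, baseline_per_minute, factor=10.0):
    """Return the minutes whose call count is at least factor x baseline."""
    threshold = baseline_per_minute * factor
    return sorted(minute for minute, n in counts.items() if n >= threshold)
```

The earliest minute returned is a candidate for where the anomaly began, and the events inside that minute are the first place to look for suspicious arguments.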

Input taint chain. The audit trail records, for each tool call, the input sources that influenced the call. The taint chain answers the "why" question: what content did the agent process that led it to make this call? For a prompt injection incident, the taint chain traces back to the specific document, email, or API response that contained the injected instructions. The taint chain is the evidence of the attack vector.
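The trace-back can be sketched as a walk over a provenance graph. The sketch assumes the runtime records, for each value, the identifiers of the values it was derived from; the `derived_from` mapping and the identifier format are hypothetical.

```python
def trace_taint(call_id, derived_from):
    """Walk back from a tool call to the root inputs that influenced it.

    derived_from maps a node id to the ids of the nodes it was derived from.
    A node with no recorded parents is treated as an original input source.
    """
    roots, stack, seen = [], [call_id], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        parents = derived_from.get(node, [])
        if not parents and node != call_id:
            roots.append(node)
        stack.extend(parents)
    return sorted(roots)


# Illustrative provenance: the call used a summary built from a PDF and a prompt.
provenance = {
    "call:42": ["ctx:summary"],
    "ctx:summary": ["doc:invoice.pdf", "user:prompt"],
}
print(trace_taint("call:42", provenance))
```

In a prompt injection investigation, the root list is the shortlist of documents, emails, or API responses to inspect for injected instructions.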

Policy decision log. The audit trail records, for each tool call, the policy that was evaluated, the inputs to the policy evaluation, and the decision. The policy decision log answers the "was it caught" question: did the policy engine see the call and authorize it, or did the call bypass the policy? A bypass indicates a configuration gap or a vulnerability; an authorization indicates the policy itself was insufficient.
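The "was it caught" question reduces to a lookup per executed call. The sketch below assumes decisions are keyed by call id and carry an `outcome` of `allow` or `deny`; the third category (a denied call that still executed) is an extra case added here for completeness, not one the text above names.

```python
def classify_executed_call(call_id, decisions):
    """Classify an anomalous call that is known to have executed.

    decisions maps call id -> {"outcome": "allow" | "deny", ...}.
    """
    decision = decisions.get(call_id)
    if decision is None:
        # The policy engine never saw the call.
        return "bypass"
    if decision["outcome"] == "allow":
        # The policy saw the call and let it through.
        return "policy-insufficient"
    # The policy said no, yet the call ran anyway.
    return "enforcement-failure"
```

Running this over every call in the anomalous window splits the findings into configuration gaps to close and policy rules to tighten.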

Identity and authorization context. The audit trail records the agent's identity, the user's identity (if a human triggered the task), the task context, the credentials used, and the scope of those credentials. The identity context answers the "what permissions" question: what could the agent do, and was the action within scope? An action outside scope is a privilege escalation indicator; an action within scope suggests the scope itself was granted too broadly.
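A minimal sketch of the scope check, assuming scopes are plain strings such as `customers:read`. The scope names and the finding labels are illustrative only.

```python
FINDINGS = {
    False: "privilege-escalation-indicator",   # action was outside granted scope
    True: "review-scope-breadth",              # action was permitted; was the grant too wide?
}


def scope_finding(action_scope: str, granted_scopes: set) -> str:
    """Map an anomalous action to the kind of follow-up it calls for."""
    return FINDINGS[action_scope in granted_scopes]


granted = {"customers:read", "tickets:write"}
print(scope_finding("customers:read", granted))
print(scope_finding("customers:export", granted))
```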

Reasoning trace (where available). The runtime may capture the agent's reasoning at each decision point — the LLM context, the model's interpretation, the alternative actions considered, the selection rationale. The reasoning trace answers the "what was the model thinking" question. The trace is the most fragile part of the audit trail; it depends on the runtime's architecture for capturing and preserving the model's reasoning, not just its outputs.

The investigation produces a timeline. The timeline is the artifact that the postmortem, the compliance officer, and the regulator will all read. The quality of the timeline is the quality of the audit trail architecture.

The 24-72 Hour Window: Root Cause, Containment, Disclosure

The third phase is where the incident becomes a learning event. The root cause is identified, the containment measures are solidified, the disclosure is prepared, and the architectural changes that will prevent recurrence are scoped.

Root cause analysis. The timeline supports the root cause analysis. The analysis identifies the entry point (the poisoned input, the compromised credential, the model vulnerability), the propagation path (how the compromise spread through the agent's actions), and the detection gap (why the anomaly was not caught earlier). The root cause is not a single bug; it is a chain of decisions that allowed the incident to occur.

Containment measures. Beyond pausing the agent, containment may include: revoking the agent's credentials; rotating any secrets the agent had access to; reviewing and tightening the tool allowlist; updating the policy engine with new rules; notifying downstream systems of the compromise; deploying the agent to a clean environment with restored configuration. Each measure is logged in the incident response system.

Disclosure preparation. The compliance officer and the legal team prepare disclosures. For incidents subject to regulatory notification (GDPR Art. 33, NIS2, SEC cyber disclosure rules, sector-specific regulations), the disclosure timeline is calculated from the incident timeline. The audit trail is the source of the disclosure. An incomplete audit trail produces an incomplete disclosure; an incomplete disclosure produces regulatory consequences.

Architectural changes. The postmortem identifies the architectural changes that will prevent recurrence. Common changes: tightening the policy engine's rules; adding new input taint patterns; increasing the audit trail capture rate; implementing additional circuit breakers; restricting the agent's tool access; adding human review at the affected decision boundary. Each change is scoped, scheduled, and tracked.

The 72-hour window closes with the incident contained, the postmortem drafted, the disclosure prepared, and the architectural changes queued. The audit trail's quality is what made each of these steps possible.

What the Audit Trail Must Capture

The audit trail that supports the 72-hour playbook is not the audit trail that most agent deployments produce. The required capture set is:

Per-event capture, not sampling. Every tool invocation, every policy decision, every model call is captured. The cost of comprehensive capture is not zero, but it is far less than the cost of an investigation that cannot answer its own questions. Storage costs of $50–$200 per agent per month are typical for comprehensive capture at production volumes.

Tamper-evident storage. The audit trail is stored in a write-once, cryptographically signed format. Each event is hashed; the hash chain is anchored to an external timestamp authority. A compromise of the agent's runtime cannot modify the audit trail retroactively. The integrity of the evidence is preserved.
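The hash chain idea can be sketched in a few lines. This is a simplified illustration using SHA-256 over a canonical JSON encoding; it models neither the write-once storage nor the external timestamp authority, and the entry layout is an assumption.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def event_hash(prev_hash: str, event: dict) -> str:
    """Hash an event together with the hash of the entry before it."""
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode()).hexdigest()


def append_event(chain: list, event: dict) -> None:
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"event": event, "prev_hash": prev, "hash": event_hash(prev, event)})


def verify_chain(chain: list) -> int:
    """Return the index of the first broken entry, or -1 if the chain is intact."""
    prev = GENESIS
    for i, entry in enumerate(chain):
        if entry["prev_hash"] != prev or entry["hash"] != event_hash(prev, entry["event"]):
            return i
        prev = entry["hash"]
    return -1
```

Editing any stored event changes its recomputed hash, so `verify_chain` points at the first entry that no longer matches.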

Full input capture. The complete inputs to each tool call are captured, not summaries. The taint chain analysis requires the actual content that influenced the call, not the agent's description of it. For large inputs (documents, API responses), the capture includes the hash and a reference to the storage location; the full content is available for forensic analysis.
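A sketch of the inline-versus-reference split for captured inputs. The 4 KiB threshold, the `blob://` reference format, and the dictionary standing in for write-once object storage are all assumptions made for illustration.

```python
import hashlib

INLINE_LIMIT_BYTES = 4096  # assumed threshold; tune to the deployment


def capture_input(content: bytes, blob_store: dict) -> dict:
    """Record an input in full: inline when small, by hash and reference when large."""
    digest = hashlib.sha256(content).hexdigest()
    record = {"sha256": digest, "size": len(content)}
    if len(content) <= INLINE_LIMIT_BYTES:
        record["inline"] = content.decode("utf-8", errors="replace")
    else:
        blob_store[digest] = content  # stand-in for write-once object storage
        record["ref"] = f"blob://{digest}"
    return record
```

Either way, the exact bytes the agent saw remain recoverable, and the hash lets an investigator confirm that the stored content is the content that was captured.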

Reasoning capture where feasible. The runtime's architecture determines what reasoning can be captured. Architectures that capture the model context at each tool invocation, or that use reasoning-specific log formats, produce more complete traces. Architectures that rely on the model to log its own reasoning produce partial traces. The architectural choice is made before the incident, not during it.

Cross-system correlation. The audit trail is correlated with broader SIEM telemetry, with the agent's deployment history, and with upstream service logs. The correlation is what makes the timeline coherent. A tool call to a customer database is correlated with the network traffic record, with the database's own audit log, and with the customer service platform's session record. The correlation is the cross-system evidence.
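A simple form of this correlation is a time-window join between the agent's tool call events and another system's audit rows. The record shapes and the two-second window are assumptions; real correlation would usually also match on a request or session identifier where one exists.

```python
from datetime import datetime, timedelta


def correlate(agent_events, db_audit_rows, window_seconds=2.0):
    """Pair each agent event with database audit rows that fall within the window."""
    window = timedelta(seconds=window_seconds)
    pairs = []
    for event in agent_events:
        event_ts = datetime.fromisoformat(event["ts"])
        matches = [
            row["id"]
            for row in db_audit_rows
            if abs(datetime.fromisoformat(row["ts"]) - event_ts) <= window
        ]
        pairs.append((event["id"], matches))
    return pairs
```

An agent event with no matching row on the other side is itself worth investigating, since it suggests one of the two logs is incomplete.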

Retention aligned with compliance requirements. The retention period is determined by the regulatory requirements, not by the storage budget. GDPR-relevant events are retained for the required period; sector-specific requirements (HIPAA, PCI-DSS, FINRA) impose additional retention obligations. The retention is a policy, not a default.

These six requirements define the audit trail architecture that supports the 72-hour playbook. The architecture is a runtime design choice, not a logging library choice.

The Runtime Forensics Architecture

The runtime that supports the playbook has five architectural properties. Each property is a design decision that the runtime's vendor or in-house team makes.

1. Comprehensive event capture at the execution layer. Every tool call, every model call, every policy decision is captured at the execution layer — the same layer where the policy is enforced (covered in the Facio analysis from June 2026). The capture is comprehensive, not sampled. The execution layer is the right place because it sees the event as it happens, with full context, before any sampling or summarization can drop information.

2. Tamper-evident storage with external anchoring. The captured events are stored in a write-once format, with each event's hash chained to the previous event's hash, and the chain anchored to an external timestamp authority at defined intervals. The external anchor is what makes the chain tamper-evident: an attacker who modifies an event in the chain must also modify the anchor, which is held by a third party.

3. Per-event input taint marking. Each captured event records the taint marks of its inputs. The taint marks propagate from the original untrusted sources through every transformation the agent performs. The audit trail is queryable by taint: show me every tool call whose input was derived from content with taint = untrusted-web. The taint-aware query is the forensic primitive that reconstructs the attack chain.

4. Policy decision log integrated with event log. The policy decisions are not a separate log; they are part of the same event log. Every tool call event includes the policy evaluated, the inputs to the policy, the decision, and the rationale. The integration is what allows the audit trail to answer the "was it caught" question directly, without correlating separate logs.

5. Human review workflow with full context. When the policy engine routes an action to human review, the review request includes the full event context: the tool call, the arguments, the policy decision, the agent's reasoning, the taint chain, and the user. Placet.io (the human-in-the-loop, or HITL, inbox and messenger) delivers the review request with the context attached. The human's decision is logged in the same event log.

Facio (the HITL-first agent runtime) implements all five properties at the platform level. The audit trail is comprehensive, tamper-evident, taint-aware, policy-integrated, and Placet.io-connected. The architecture is what makes the 72-hour playbook executable.
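To make properties 1 through 4 concrete, here is one possible shape for a single append path that captures every call, keeps the policy decision in the same record, stores taint marks, chains hashes, and anchors the chain head at intervals. It is a sketch under stated assumptions, not a description of how Facio implements them: `ForensicLog` and its fields are hypothetical, and a plain dictionary stands in for the third-party timestamp authority.

```python
import hashlib
import json


class ForensicLog:
    """Hypothetical append-only event log with hash chaining and periodic anchors."""

    def __init__(self, anchor_interval: int, external_anchors: dict) -> None:
        self.entries = []
        self.anchor_interval = anchor_interval
        self.external_anchors = external_anchors  # stand-in for a third party

    @staticmethod
    def _link(prev_hash: str, event: dict) -> str:
        body = json.dumps(event, sort_keys=True)
        return hashlib.sha256((prev_hash + body).encode()).hexdigest()

    def append(self, tool, arguments, policy, taint) -> str:
        prev = self.entries[-1]["hash"] if self.entries else "0" * 64
        event = {
            "seq": len(self.entries) + 1,
            "tool": tool,
            "arguments": arguments,
            "policy": policy,          # decision lives in the same record
            "taint": sorted(taint),    # taint marks of the call's inputs
        }
        digest = self._link(prev, event)
        self.entries.append({"event": event, "hash": digest})
        if event["seq"] % self.anchor_interval == 0:
            self.external_anchors[event["seq"]] = digest
        return digest

    def by_taint(self, mark: str) -> list:
        """Taint-aware query: every event whose inputs carried this mark."""
        return [e["event"] for e in self.entries if mark in e["event"]["taint"]]

    def mismatched_anchors(self) -> list:
        """Recompute the chain from stored events and compare with the anchors."""
        prev, bad = "0" * 64, []
        for entry in self.entries:
            prev = self._link(prev, entry["event"])
            seq = entry["event"]["seq"]
            if seq in self.external_anchors and self.external_anchors[seq] != prev:
                bad.append(seq)
        # Anchors that point past the end of the log indicate truncation.
        bad.extend(s for s in self.external_anchors if s > len(self.entries))
        return sorted(bad)
```

Because the anchors live outside the log, rewriting or truncating stored events shows up as a mismatch at the next anchor, even if the attacker recomputes every hash inside the log.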

Common Pitfalls in the First 72 Hours

The pitfalls that the playbook must anticipate:

Pitfall 1: Modifying the live system during investigation. The on-call engineer, in the rush to understand the incident, modifies the agent's configuration, restarts the agent, or clears the agent's memory. The live modifications destroy the evidence. The investigation must work from the snapshot, not from the live system. The live system is read-only until the snapshot is complete.

Pitfall 2: Trusting the agent's self-report. The agent may have logging output that summarizes its recent actions. The summary is the agent's interpretation, not the runtime's record. The summary may be incomplete, may be misleading, or may itself be compromised. The audit trail from the runtime is the source of truth; the agent's self-report is supplementary at best.

Pitfall 3: Incomplete timeline reconstruction. The investigation produces a timeline based on the audit trail, and the timeline is incomplete because the audit trail is incomplete: sampling dropped the relevant events, or the capture started after the incident began. The pitfall is presenting that timeline as if it were complete. The postmortem should name the gap and state plainly what is known and what is not.

Pitfall 4: Disclosure timeline calculated from a wrong start. The compliance officer calculates the disclosure timeline from the moment the alert fired. The alert may have fired hours after the incident actually began. The audit trail's first event that shows anomalous behavior is the correct start; the alert is a downstream signal. The disclosure timeline must be calculated from the trail, not from the alert.

Pitfall 5: Postmortem focuses on the agent, not the architecture. The postmortem identifies the specific input that triggered the incident and recommends "better input validation." The recommendation misses the architectural lesson: the audit trail was inadequate, the policy engine was insufficient, the kill switch was missing or too slow. The architectural changes are the ones that prevent recurrence; the input-specific changes are bandages.

The pitfalls are familiar from traditional incident response, with one new twist: the agent's role in the incident. The agent may be the attacker, the victim, the witness, and the investigator, all at once. The roles must be kept separate. The agent's output is evidence; the agent's reasoning is evidence; the agent's continued operation during the investigation is a risk, not an asset.
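For Pitfall 4, the gap between the first anomalous event in the trail and the moment the alert fired can be computed directly. The event shape and the `is_anomalous` predicate are assumptions for this sketch; which of the two timestamps starts a given regulatory clock is a question for counsel, and the code only surfaces both.

```python
from datetime import datetime


def incident_start(events, is_anomalous):
    """Timestamp of the earliest event the predicate flags, or None."""
    flagged = [datetime.fromisoformat(e["ts"]) for e in events if is_anomalous(e)]
    return min(flagged) if flagged else None


def detection_lag(events, is_anomalous, alert_ts):
    """How long the anomaly ran before the alert fired, or None if nothing is flagged."""
    start = incident_start(events, is_anomalous)
    if start is None:
        return None
    return datetime.fromisoformat(alert_ts) - start
```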

The Bottom Line

The first 72 hours of an AI agent incident are the hours that determine whether the incident is a contained learning event or an uncontrolled breach. The audit trail architecture is what makes the difference. The architecture is a runtime design choice: comprehensive event capture at the execution layer, tamper-evident storage with external anchoring, per-event input taint marking, integrated policy decision logging, and human review workflows with full context.

The organizations that survive the 3 AM page are the ones whose audit trail architecture was built before the incident. The runtime that captures every event, marks every input, anchors every hash, and routes every high-blast-radius action to a human reviewer is the runtime that produces a 72-hour postmortem instead of a 72-day investigation.

The alternative is the 3 AM page with a partial audit trail, a sampled event log, storage that is not tamper-evident, and a compliance officer who cannot answer the regulator's questions. The choice is the runtime. The choice is made before the incident. Make it now.

Facio (the HITL-first agent runtime) is the runtime. Placet.io (the HITL inbox and messenger) is the human review workflow. Together, they provide the audit trail architecture that turns a 3 AM page from a crisis into a procedure.

