Human-in-the-loop · Jun 22, 2026

Reviewer Context: The State That Makes HITL Decisions Good Across Sessions and Handoffs

Most HITL articles focus on individual decisions. This one focuses on the decision-maker. The reviewer's context is more important than the reviewer's credentials — and most HITL systems destroy context by design. Here's the design pattern for building reviewer context that survives across sessions, escalations, and handoffs.

HITL · Reviewer Context · Agent Architecture · State Management · Human Oversight

Most HITL articles focus on the decision. This one focuses on the decision-maker, and on the context that person needs to make decisions that are not just locally correct but systemically good.

Most HITL systems destroy context by design. The reviewer sees one action. They approve or reject. They move to the next action. They have no memory of what came before, no understanding of what's coming after, no awareness of the broader system state. The resulting decision may be locally correct but globally wrong, and the system never learns the difference.

The fix is reviewer context — the persistent state about the reviewer, the workflow, the customer, the agent's history, and the system's current state that travels with the review and survives across sessions, escalations, and handoffs. Without it, every review is isolated. With it, every review is informed.

This is the missing layer in most HITL systems. It's also the most important one for moving HITL from "approve the action in front of me" to "make decisions that improve the system over time."


The Reviewer Context Problem

Consider a typical HITL review:

Monday, 9:00am. Reviewer A logs in. Sees a refund approval request. Approves.

Monday, 9:03am. Reviewer A sees a refund approval request. Approves.

Monday, 9:06am. Reviewer A sees a refund approval request. Approves.

And so on: twelve approvals in 60 minutes. All locally reasonable. The reviewer is doing their job.

What the reviewer cannot see:

  • That this customer's account was flagged last week for excessive refund requests
  • That the previous three reviewers approved refunds for the same customer in the last 48 hours
  • That the agent's prompt was updated 6 hours ago and the new prompt has a known accuracy issue with this customer's action type
  • That the customer's tier was downgraded last month and the refund threshold should now be lower
  • That the 12 refunds before those were all approved by automated review, with no human involved

Each individual decision looks reasonable. The cumulative effect is leakage. The reviewer has no way to know — the system doesn't tell them.

This is the context destruction problem. The review interface shows a single action in isolation. The audit trail records the single decision. The reviewer's mental model is bounded by what the interface shows. The system has all the context; the reviewer has none of it.


The Four Context Layers

The fix is to surface four layers of context to the reviewer — each at the right time, with the right depth, integrated into the review interface.

Layer 1: The Reviewer's Own History

The reviewer's own recent decisions. What did they approve, what did they reject, what patterns are visible in their own activity?

  • The 12 refunds they just approved (so they can see they're in a pattern)
  • The override rate of their last 100 decisions
  • The time-of-day distribution of their reviews (so they can see if they're fatigued)
  • The actions they escalated to senior reviewers (so they can see if they're escalating appropriately)

The reviewer's own history is the most underrated context. A reviewer who can see "I just approved 12 refunds in 60 minutes" is more likely to slow down. A reviewer who can see "my override rate this week is 18% vs. my 30-day average of 8%" is more likely to ask why.
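As a rough sketch of how these self-history signals could be computed, assume each past decision is stored as a timestamp plus an outcome. The `Decision` record and `reviewer_snapshot` function are hypothetical names for illustration, and "override" is assumed here to mean any outcome other than a plain approval.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Decision:
    """One past review decision by a single reviewer (hypothetical schema)."""
    at: datetime
    outcome: str  # "approved", "rejected", or "modified"


def override_rate(decisions: list[Decision]) -> float:
    """Share of decisions where the reviewer did not simply approve."""
    if not decisions:
        return 0.0
    overrides = sum(1 for d in decisions if d.outcome != "approved")
    return overrides / len(decisions)


def reviewer_snapshot(decisions: list[Decision], now: datetime) -> dict:
    """Summarize the reviewer's recent activity for display next to the action."""
    last_hour = [d for d in decisions if now - d.at <= timedelta(hours=1)]
    last_week = [d for d in decisions if now - d.at <= timedelta(days=7)]
    last_30d = [d for d in decisions if now - d.at <= timedelta(days=30)]
    return {
        "approved_last_hour": sum(1 for d in last_hour if d.outcome == "approved"),
        "override_rate_7d": override_rate(last_week),
        "override_rate_30d": override_rate(last_30d),
    }
```

Showing the 7-day rate next to the 30-day rate is what lets a reviewer notice a drift in their own behavior rather than a single number with no baseline.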

Layer 2: The Customer's / Resource's History

The entity the action targets. The customer, the record, the resource — what's their history with the system?

  • The customer's full interaction history (not just this interaction)
  • The customer's account state (tenure, tier, flags, recent changes)
  • The customer's recent agent interactions (was this customer in a different workflow 6 hours ago?)
  • The customer's profile in the reviewer's domain (are they a known good actor, a known risk, or unknown?)

The customer's history is the context that makes a refund approval different for a 5-year customer vs. a 1-week customer. The reviewer should see this without having to look it up in a separate system.
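One way to picture the "known good, known risk, or unknown" distinction is a small summary object with a classification rule. This is only a sketch: the `CustomerContext` fields, the `standing` function, and both thresholds are illustrative assumptions, not values taken from any real policy.

```python
from dataclasses import dataclass, field


@dataclass
class CustomerContext:
    """Customer facts shown beside the action (hypothetical fields)."""
    tenure_days: int
    tier: str
    flags: list[str] = field(default_factory=list)
    refunds_last_48h: int = 0


def standing(ctx: CustomerContext) -> str:
    """Classify the customer as 'known_risk', 'known_good', or 'unknown'.

    The cutoffs (3 refunds in 48 hours, 365 days of tenure) are placeholders.
    """
    if ctx.flags or ctx.refunds_last_48h >= 3:
        return "known_risk"
    if ctx.tenure_days >= 365:
        return "known_good"
    return "unknown"
```

Risk signals are checked before tenure on purpose, so a long-standing account with an active flag is not shown as a known good actor.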

Layer 3: The Agent's Recent Performance

The agent's history with this type of action. Is the agent generally correct on this action type, or has something changed?

  • The agent's override rate on this action type (last 30 days)
  • The agent's confidence distribution on this action type
  • The model version in use (was it just changed?)
  • The recent pattern in similar actions (are similar actions being approved or rejected?)

The agent's performance is the context that tells the reviewer how much weight the agent's stated confidence deserves. A reviewer who knows the agent's override rate on this action type is 2% can trust a high-confidence action. A reviewer who knows the override rate is 15% will look more carefully.
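A minimal sketch of the override-rate signal, split by model version so that a recent model change shows up as a visible jump. The tuple layout `(action_type, model_version, overridden)` and the function name are assumptions made for this example.

```python
from collections import defaultdict


def override_rate_by_model(reviews, action_type):
    """Override rate for one action type, split by model version.

    `reviews` is an iterable of (action_type, model_version, overridden)
    tuples, where `overridden` is True when a human changed the outcome.
    """
    totals = defaultdict(int)
    overrides = defaultdict(int)
    for a_type, model, overridden in reviews:
        if a_type != action_type:
            continue
        totals[model] += 1
        overrides[model] += int(overridden)
    return {model: overrides[model] / totals[model] for model in totals}
```

A single blended rate would hide exactly the case the reviewer most needs to see: a new model version that is overridden far more often than the one it replaced.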

Layer 4: The System's Current State

The broader system state. What's happening right now that might affect this review?

  • Concurrent actions on the same resource (another agent is doing X to this customer right now)
  • The active policy version (was the policy just changed?)
  • The reviewer's queue state (queue depth, expected wait times, escalation backlog)
  • System-level anomalies (recent spike in this action type, recent model degradation alert)

The system's state is the context that catches the things individual reviews would miss. A reviewer who knows another agent is also acting on the same customer can flag a potential conflict. A reviewer who knows the policy changed 6 hours ago knows to be more careful.
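The two checks in that paragraph, concurrent actions and a recent policy change, could be sketched as follows. The function name, the `in_flight` shape, and the 24-hour "recent" window are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def system_state_alerts(resource_id, in_flight, policy_activated_at, now,
                        recent_change_window=timedelta(hours=24)):
    """Return human-readable alerts for the review panel.

    `in_flight` is a list of (resource_id, description) pairs for actions
    currently executing or awaiting review elsewhere.
    """
    alerts = []
    for rid, description in in_flight:
        if rid == resource_id:
            alerts.append(f"Concurrent action on {resource_id}: {description}")
    age = now - policy_activated_at
    if age <= recent_change_window:
        hours = age.total_seconds() / 3600
        alerts.append(f"Policy changed {hours:.0f}h ago")
    return alerts
```

An empty list is a meaningful result here: it lets the interface state "no concurrent actions, no recent policy change" explicitly instead of leaving the reviewer to guess.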


The Architecture: Persistent Reviewer Context

The four context layers don't survive naturally in a typical HITL system. They have to be designed in. The architecture has three components:

Component 1: Context Store

A persistent store of reviewer context, keyed by reviewer ID, customer ID, action type, and time. The store captures the events that matter for context — reviewer's decisions, customer's history, agent's performance, system state at decision time.

The store is not just an audit log. It is queryable, indexed state that the review interface can read in real time. The store is updated as events happen, not batched at end of day.
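To make "queryable, indexed, written as events happen" concrete, here is a small sketch built on Python's standard `sqlite3` module. The table layout, the `ContextStore` class, and its method names are assumptions; a production store would likely use a different database and a richer schema.

```python
import json
import sqlite3
import time


class ContextStore:
    """Append-only event store, queryable at review time (illustrative schema)."""

    KEYS = ("reviewer_id", "customer_id", "action_type")

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            """CREATE TABLE IF NOT EXISTS events (
                   ts REAL NOT NULL,
                   reviewer_id TEXT,
                   customer_id TEXT,
                   action_type TEXT,
                   kind TEXT NOT NULL,
                   payload TEXT NOT NULL
               )"""
        )
        # One (key, time) index per lookup key keeps review-time reads fast.
        for col in self.KEYS:
            self.db.execute(
                f"CREATE INDEX IF NOT EXISTS idx_{col} ON events ({col}, ts)"
            )

    def record(self, kind, payload, *, reviewer_id=None, customer_id=None,
               action_type=None, ts=None):
        """Write one event immediately rather than in an end-of-day batch."""
        self.db.execute(
            "INSERT INTO events VALUES (?, ?, ?, ?, ?, ?)",
            (ts if ts is not None else time.time(), reviewer_id, customer_id,
             action_type, kind, json.dumps(payload)),
        )
        self.db.commit()

    def recent(self, column, value, since_ts):
        """Events for one key since a timestamp, oldest first."""
        if column not in self.KEYS:
            raise ValueError(f"unsupported key: {column}")
        rows = self.db.execute(
            f"SELECT ts, kind, payload FROM events "
            f"WHERE {column} = ? AND ts >= ? ORDER BY ts",
            (value, since_ts),
        )
        return [(ts, kind, json.loads(payload)) for ts, kind, payload in rows]
```

The column name in `recent` is checked against a fixed allow-list before it is interpolated into the SQL, while the values themselves go through bound parameters.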

Component 2: Context Aggregator

At review time, the aggregator pulls the relevant context for the specific review. The aggregator knows which context matters for which action type:

  • For a refund review: customer's refund history, agent's refund accuracy, recent refunds to similar customers, the reviewer's own recent refund decisions
  • For a code change review: agent's recent PRs, the security reviewer's history with this code area, recent related changes
  • For a data deletion review: customer's full history, regulatory flags, concurrent actions, the policy's specific requirements for this action type

The aggregator returns a structured context object — the four layers, each with the relevant data, sized for fast display.
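The aggregator can be sketched as a lookup from action type to a list of layer providers. Everything named here (`ContextAggregator`, `Provider`, the `stub` helper, the shape of the review dict) is hypothetical; real providers would query the context store instead of returning canned data.

```python
from typing import Callable

# Each provider takes the review request and returns the data for one layer.
Provider = Callable[[dict], dict]


class ContextAggregator:
    def __init__(self, providers: dict[str, Provider],
                 layers_by_action: dict[str, list[str]]):
        self.providers = providers
        self.layers_by_action = layers_by_action

    def build(self, review: dict) -> dict:
        """Return {layer_name: data} for the layers this action type needs."""
        layers = self.layers_by_action.get(review["action_type"], [])
        return {name: self.providers[name](review) for name in layers}


def stub(layer_name: str) -> Provider:
    """Placeholder provider standing in for a real store query."""
    return lambda review: {"layer": layer_name, "for": review["customer_id"]}


refund_layers = ["customer_history", "agent_performance", "reviewer_history"]
aggregator = ContextAggregator(
    providers={name: stub(name) for name in refund_layers},
    layers_by_action={"process_refund": refund_layers},
)
context = aggregator.build(
    {"action_type": "process_refund", "customer_id": "C-48291"}
)
```

Keeping the action-to-layer mapping as data rather than code means a new action type can be given its context requirements without touching the aggregator itself.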

Component 3: Context-Aware Review Interface

The review interface displays the context, integrated with the action being reviewed. The context is not a separate page the reviewer has to navigate to. The context is part of the action display:

┌─────────────────────────────────────────────────┐
│ [Action] Refund $487 to customer #C-48291        │
│                                                  │
│ ▼ Customer History (3 items)                     │
│   1. Tenure 3.2y, no prior refunds               │
│   2. Tier 2 customer (last upgrade: 8mo ago)     │
│   3. Recent support interactions: 4 in last 30d  │
│                                                  │
│ ▼ Agent Performance on process_refund            │
│   Last 30d override rate: 4.2%                   │
│   Confidence on similar actions: 0.84 avg        │
│   Model: claude-opus-4-7 (active 12d)            │
│                                                  │
│ ▼ Your Recent Decisions on process_refund        │
│   Last hour: 12 approved, 0 rejected             │
│   Last 24h: 47 approved, 1 rejected (2.1%)       │
│   This is your 13th in 64 minutes                │
│                                                  │
│ ▼ System State                                   │
│   Queue: 8 reviews pending, 5-min SLA            │
│   Policy version: v4.2.1 (active 6d)             │
│   No concurrent actions on this customer         │
└─────────────────────────────────────────────────┘

The reviewer sees, in one display, the action + the four context layers. The display is structured to be readable in 5 seconds (at a glance) and 30 seconds (for full context). The reviewer doesn't have to leave the review to gather context.


Context Handoffs: When Reviews Escalate

The context problem compounds when reviews are escalated or handed off. Reviewer A starts the review. Reviewer B takes over. Reviewer C makes the final decision. Each handoff risks losing context.

The fix: context travels with the review, not with the reviewer.

When Reviewer A hands off to Reviewer B, the handoff includes:

  • The action being reviewed
  • The context Reviewer A had (all four layers)
  • What Reviewer A did (approved, rejected, modified, deferred)
  • Why Reviewer A made that decision (their stated reasoning)
  • What Reviewer A couldn't determine (gaps in context, questions that arose)

When Reviewer C makes the final decision, they see the full chain: the action, the context, Reviewer A's verdict, Reviewer B's verdict, the escalation reason. The decision is informed by the full history, not just the current state.

This requires the context store to support handoffs. Each handoff is an event in the store. The next reviewer reads the full event chain. The audit trail records the handoffs and the context each reviewer had.
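A handoff chain of this kind might look like the sketch below, where each handoff is an appended record and the next reviewer reads the whole chain. The `Handoff` and `Review` classes and their fields are assumptions that mirror the handoff contents listed above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Handoff:
    """One reviewer's contribution to a review (hypothetical schema)."""
    reviewer_id: str
    verdict: str                 # "approved", "rejected", "modified", "deferred"
    reasoning: str
    open_questions: tuple[str, ...] = ()
    context_seen: dict = field(default_factory=dict)  # snapshot of the layers


@dataclass
class Review:
    action: str
    chain: list[Handoff] = field(default_factory=list)

    def hand_off(self, entry: Handoff) -> None:
        """Append to the chain; earlier entries are never edited."""
        self.chain.append(entry)

    def briefing(self) -> list[str]:
        """What the next reviewer reads before deciding."""
        lines = [f"Action: {self.action}"]
        for step, h in enumerate(self.chain, start=1):
            lines.append(f"{step}. {h.reviewer_id}: {h.verdict} ({h.reasoning})")
            lines.extend(f"   open: {q}" for q in h.open_questions)
        return lines
```

Storing `context_seen` on each entry is what lets the audit trail answer "what did this reviewer know at the time" rather than only "what did they decide".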


The Anti-Pattern: Context-Free Reviews

The most common HITL design is the context-free review:

Reviewer sees: action + summary
Reviewer decides: approve/reject
Reviewer logs: decision

This is the design that produces rubber-stamp approvals, inconsistent decisions, and reviewers who miss systemic issues. Each decision may be locally correct, but it's made in a vacuum.

The context-free design is also the cheapest to build. No context store, no aggregator, no context-aware interface. The reviewer just sees what the agent proposed. The system records the decision.

The cheap design is the wrong design. The reviewer is the most expensive part of HITL — and the design that ignores their context makes them less effective, not more.


The Right-Sized Context: Not Everything, Always

Surfacing all four context layers for every review would overwhelm reviewers. The right pattern is right-sized context — the right context for the action type and risk profile.

For a $20 refund to a known good customer, the reviewer's primary context is the customer's tenure and the agent's recent accuracy. The other layers are noise.

For a $5,000 refund to a new account, the reviewer needs the full four layers — the customer's history, the agent's confidence, the reviewer's own recent pattern, the system state.

For a data deletion in a regulated environment, the reviewer needs the full context, the policy version, and the concurrent action check. The reviewer needs to know what the system knows, not what the interface happens to surface.

The manifest encodes which context layers are required per action type:

actions:
  process_refund_low:
    classification: autonomous
    required_context: []  # No context needed

  process_refund_medium:
    classification: sampled
    required_context: [customer_history, agent_performance]

  process_refund_high:
    classification: synchronous
    required_context: [customer_history, agent_performance, reviewer_history, system_state]

  delete_customer_data:
    classification: synchronous
    # All four layers, plus three extra checks
    required_context: [customer_history, agent_performance, reviewer_history, system_state, policy_version, concurrent_actions, regulatory_flags]

The classification engine and the context engine read from the same manifest. The classification determines the action's review pattern; the required_context list determines what the reviewer sees. Both are encoded in the same version-controlled file.
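Reading both answers from one manifest entry could be sketched like this. The dict below copies two entries from the manifest as plain Python so the example needs no YAML parser; `KNOWN_LAYERS` and `review_plan` are hypothetical names.

```python
KNOWN_LAYERS = frozenset({
    "customer_history", "agent_performance", "reviewer_history", "system_state",
    "policy_version", "concurrent_actions", "regulatory_flags",
})

# Two entries from the manifest, written as a dict (assumed parse result).
MANIFEST = {
    "process_refund_medium": {
        "classification": "sampled",
        "required_context": ["customer_history", "agent_performance"],
    },
    "process_refund_high": {
        "classification": "synchronous",
        "required_context": ["customer_history", "agent_performance",
                             "reviewer_history", "system_state"],
    },
}


def review_plan(action_type, manifest=MANIFEST):
    """Return (classification, layers) for an action, rejecting unknown layers."""
    entry = manifest[action_type]
    layers = list(entry["required_context"])
    unknown = sorted(set(layers) - KNOWN_LAYERS)
    if unknown:
        raise ValueError(f"{action_type}: unknown context layers {unknown}")
    return entry["classification"], layers
```

Validating layer names when the manifest is read means a typo fails loudly at load time instead of silently showing the reviewer less context than the policy requires.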


Where Facio Fits

Facio's context store and aggregator are built into the runtime. Every reviewer's decision updates the store. Every action's evaluation pulls the relevant context. The audit trail records not just the decision but the context the reviewer had at decision time.

Placet.io's review interface is context-aware by default. The four context layers are integrated into the review display — the customer history, the agent's performance, the reviewer's own pattern, the system state. The reviewer doesn't navigate to context — the context is in the review.

The handoff pattern is built in. When a review escalates, the context travels. The next reviewer sees the full chain — the previous reviewer's verdict, the context they had, the reasoning they gave. The decision chain is preserved.

The combination means the reviewer is informed, not isolated. The reviewer makes decisions that are locally correct and systemically good. The HITL system improves over time because the reviewer has the context to improve it.


Key Takeaways

  • The reviewer's context matters more than the reviewer's credentials — most HITL systems destroy context by design
  • Four context layers — reviewer's own history, customer's history, agent's recent performance, system's current state
  • The review interface must display the context, integrated with the action — not on a separate page the reviewer has to navigate to
  • Context must travel with the review, not with the reviewer — escalation and handoffs preserve the full chain
  • The anti-pattern is the context-free review — the cheap design produces rubber-stamps, inconsistent decisions, and systemic blindness
  • Right-sized context per action type — not everything for every review, but the right context for the action's risk profile
  • The classification engine and the context engine share the same manifest — classification determines pattern, context engine determines what the reviewer sees
  • Facio + Placet.io are context-aware by default — the four layers are integrated into the review, the handoffs preserve the chain

Sources: The reviewer context design draws on decision science research (Gary Klein, Sources of Power: How People Make Decisions), the established patterns of clinical decision support systems (where context-aware interfaces measurably improve diagnostic accuracy), and production patterns from HITL deployments where context-aware review interfaces showed 30–50% improvement in override rate relevance.