Human-in-the-loop · Jul 5, 2026

The HITL Boundary Problem: Where Exactly Does the Agent Stop and the Human Start?

Most HITL posts assume the boundary between agent and human is obvious. It's not. The boundary is contested, contextual, and constantly shifting. The agent does some preparation, the human does some finalization, the agent does some cleanup. Where exactly does oversight end and execution begin? The boundary problem is the deepest architectural challenge in HITL — and the one most teams don't even realize they're facing.

Tags: HITL, Boundaries, Architecture, Agent Design, Human Oversight

Most HITL blog posts assume the boundary between agent and human is obvious. The agent proposes an action. The human reviews. The action executes. The boundary is at the review step. Simple. Clean.

It's not simple. It's not clean. The boundary is contested, contextual, and constantly shifting.

The agent does preparation that looks like oversight. The human does execution that looks like preparation. The agent does cleanup that looks like execution. The system makes decisions that look like agent work but are actually policy work. The boundary moves depending on the action type, the customer's history, the reviewer's expertise, the time of day, the regulatory regime.

This is the boundary problem — the deepest architectural challenge in HITL, and the one most teams don't realize they're facing. Most teams think they're building a system where "the agent proposes, the human decides, the system executes." They're actually building a system where the boundaries are negotiated continuously, and the negotiation is the architecture.

This post is about the boundary problem. What the boundary actually is. Why it's not what you think. How to design the HITL system to accommodate the shifting boundary. And what happens when the boundary is wrong.


The False Simplicity of the Boundary

The textbook HITL diagram shows a clean handoff:

Agent → Proposal → Human → Decision → System → Execution

The boundaries are at the proposal and the execution. The agent works until it produces a proposal. The human reviews the proposal and decides. The system executes the decision. Three components, three handoffs, clean separation.

The textbook diagram is wrong. The actual HITL system looks like this:

Agent: prepare context → analyze situation → draft proposal → refine proposal
        ↑                                                    ↓
System: maintain policy, aggregate history, validate constraints
        ↑                                                    ↓
Human: review context → question reasoning → modify proposal → approve
        ↑                                                    ↓
Agent: refine based on feedback → execute approved action → post-execute cleanup
        ↑                                                    ↓
System: monitor execution, log outcome, escalate if needed

The agent does preparation, drafting, refinement, post-execution cleanup. The human does review, questioning, modification, approval. The system does policy, validation, monitoring, escalation. The boundaries are nowhere — or rather, they are everywhere. The work is distributed across all three participants continuously.


The Five Boundary Types in HITL

The HITL system has five distinct boundary types. Each is contested in different ways.

Boundary 1: The Reasoning Boundary

Where does the agent's reasoning end and the human's reasoning begin? The agent reasons about the action's inputs and outputs. The human reasons about the agent's reasoning. But the human also reasons about the action itself — sometimes disagreeing with the agent, sometimes extending the agent's reasoning, sometimes replacing it.

The reasoning boundary is contested because reasoning is invisible. The agent's reasoning is internal to the LLM. The human's reasoning is internal to the reviewer. The system can record the reasoning, but it can't enforce where one ends and the other begins.

The boundary is moved by:

  • The interface — does the interface show the agent's reasoning?
  • The training — is the reviewer trained to reason about the action or the agent's reasoning?
  • The policy — does the policy require the human to reason from scratch or to validate the agent's reasoning?

Boundary 2: The Action Boundary

Where does the action proposal end and the execution begin? The agent proposes the action, the human approves it, and the system executes it. But between proposal and execution, both the agent and the human can modify the action, so the action that finally executes may be neither the one the agent originally proposed nor the one the human first reviewed.

The action boundary is contested because modification is invisible. The system logs "approved with modification by human reviewer." But the modification can be substantial — the human may rewrite the entire action, change parameters, alter the tone, restructure the response. The executed action may differ significantly from the proposed action.

The boundary is moved by:

  • The interface — does the interface make modification easy or hard?
  • The training — is the reviewer trained to modify the action or to accept/reject?
  • The policy — does the policy allow modification or require binary accept/reject?
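One way to make modification visible is to log a field-level diff between the proposed action and the executed one, instead of a bare "approved with modification" flag. The following is a minimal sketch, assuming actions are flat dictionaries. The function `diff_action` and the field names are hypothetical and only illustrate the idea.

```python
from typing import Any


def diff_action(proposed: dict[str, Any], executed: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return {field: (proposed_value, executed_value)} for every field that differs."""
    missing = object()  # sentinel: lets us tell "field absent" apart from "field is None"
    changes: dict[str, tuple[Any, Any]] = {}
    for key in sorted(proposed.keys() | executed.keys()):
        before = proposed.get(key, missing)
        after = executed.get(key, missing)
        if before != after:
            # Absent fields are reported as None in the diff.
            changes[key] = (
                None if before is missing else before,
                None if after is missing else after,
            )
    return changes


# Hypothetical refund action: the reviewer rewrote only the reason text.
proposed = {"type": "refund", "amount": 487, "reason": "Item arrived damaged."}
executed = {"type": "refund", "amount": 487, "reason": "We're sorry your item arrived damaged."}
changes = diff_action(proposed, executed)
```

Storing `changes` next to the approval record lets an auditor see how far the executed action drifted from the proposal.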

Boundary 3: The Context Boundary

Where does the agent's context end and the human's context begin? The agent uses retrieved documents, prior interactions, system data. The human uses the interface, the agent's context, the customer's history, the policy. The contexts overlap but are not identical.

The context boundary is contested because context is selective. The system shows the human a subset of the agent's context. The human may seek additional context not provided by the system. The agent may use context the human never sees (e.g., internal tool calls, retrieved documents not surfaced).

The boundary is moved by:

  • The interface — what context is shown to the human?
  • The retrieval — what context does the agent use?
  • The policy — what context is required for review?
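The gap between the two contexts can be measured if every context item carries an identifier. This is a small sketch under that assumption. The identifiers and the helper `context_gap` are hypothetical.

```python
def context_gap(agent_context: set[str], shown_to_human: set[str]) -> set[str]:
    """IDs of context items the agent used that the reviewer never saw."""
    return agent_context - shown_to_human


# Hypothetical context item IDs for one proposal.
agent_context = {"doc:refund-policy", "tool:order-lookup", "history:ticket-1"}
shown_to_human = {"doc:refund-policy", "history:ticket-1"}
unseen = context_gap(agent_context, shown_to_human)
```

A policy could then require that `unseen` be empty for high-risk action types, or that the interface at least tells the reviewer how many items were not surfaced.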

Boundary 4: The Decision Boundary

Where is the decision actually made: at the human's approval, at the system's validation, or at the agent's final preparation? The decision is supposed to sit at the human's approval, but the system also makes decisions (timeout behavior, fallback actions, escalation chains), and so does the agent (post-execution cleanup, retry logic).

The decision boundary is contested because decisions are distributed. The human's approval is one decision. The system's validation is another. The agent's final preparation is another. The cumulative decision is the sum of all three, but no single entity made it.

The boundary is moved by:

  • The policy — what actions require human approval?
  • The system — what fallbacks exist for human non-response?
  • The agent — what decisions can the agent make after approval?
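The fallback question is a good place to see distributed decisions in code. The sketch below records who actually made the decision when the human does not respond in time. The names (`Decision`, `resolve`) and the default fallback of escalating are assumptions for illustration, not a prescribed policy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Decision:
    outcome: str     # e.g. "approve", "reject", or "escalate"
    decided_by: str  # "human" or "system"


def resolve(
    human_outcome: Optional[str],
    waited_s: float,
    timeout_s: float,
    fallback: str = "escalate",
) -> Optional[Decision]:
    """Return the decision and its author, or None while still waiting for the human."""
    if human_outcome is not None:
        return Decision(human_outcome, "human")
    if waited_s >= timeout_s:
        # The human never answered: the system's fallback is the decision.
        return Decision(fallback, "system")
    return None
```

Recording `decided_by` keeps a timeout fallback from being mistaken for a human approval later in the audit trail.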

Boundary 5: The Accountability Boundary

Where does accountability begin and end? The agent is accountable for its reasoning. The human is accountable for their decision. The system is accountable for its enforcement. The deployer is accountable for the system's configuration. The vendor is accountable for the model.

The accountability boundary is contested because accountability is shared. Every action has multiple contributors. The failure of any one contributor would have changed the outcome. The "responsible party" is not obvious — it depends on what failed.

The boundary is moved by:

  • The legal framework — how does the jurisdiction distribute liability?
  • The audit trail — what does the trail attribute to whom?
  • The policy — who is named as accountable?

Why the Boundaries Shift

The boundaries shift because the participants have different incentives and capabilities. The agent wants to do more (it can be more efficient). The human wants to do less (they are overloaded). The system wants to enforce more (it has oversight responsibility). The deployer wants to spend less (cost optimization).

The shifts follow predictable patterns:

Pattern 1: Scope Creep Into the Human

The agent does less, the human does more. The reason: the agent is uncertain about its outputs. The system routes more actions to the human. The human becomes the primary decision-maker. The agent becomes the assistant.

This shift is often unconscious. The system designers notice "the override rate is high" and conclude "the agent needs more human oversight." They increase the human's scope. The human becomes the safety net for an increasingly uncertain agent.

Pattern 2: Scope Creep Into the Agent

The human does less, the agent does more. The reason: the human is overloaded. The system graduates action types to autonomous. The agent becomes the primary decision-maker. The human becomes the auditor.

This shift is also often unconscious. The system designers notice "the queue is overflowing" and conclude "we need to reduce human load." They graduate action types to autonomous. The agent takes over. The human reviews samples instead of every action.

Pattern 3: Boundary Friction

The boundary between agent and human creates friction. The human reviews actions that don't need review. The agent waits for approvals that don't change the outcome. The system incurs latency for decisions that don't matter.

Teams reduce boundary friction by removing the boundary entirely (autonomy), removing review for a whole action type (graduation), or reducing the review depth (sampling). Each reduction shifts the boundary further toward the agent.
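Sampling is the easiest of these to sketch. The version below picks actions for review by hashing the action ID, so the same action always gets the same answer and the choice can be reproduced during an audit. Hash-based sampling is one possible technique chosen for this illustration. The function name and parameters are hypothetical.

```python
import hashlib


def needs_review(action_id: str, sample_rate: float) -> bool:
    """Deterministically select roughly `sample_rate` of all actions for human review."""
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0 and 1")
    digest = hashlib.sha256(action_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")  # integer in [0, 2**64)
    # Compare as integers to avoid float rounding at the edges of the range.
    return bucket < int(sample_rate * 2**64)
```

A rate of 1.0 reviews everything and 0.0 reviews nothing, so the same function covers the whole range from synchronous review to full autonomy.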

Pattern 4: Boundary Inversion

The agent reviews the human, not the other way around. The reason: the human's decision is the input to the agent. The agent's review of the human's decision determines whether the action executes.

This inversion happens when the system is sophisticated enough to evaluate the human's decision. The agent's review is faster and more consistent than the human's review. The agent becomes the gate for the human's gate.

Pattern 5: Boundary Collapse

The boundary disappears entirely. The agent and the human operate as a unit. The human is a node in the agent's reasoning. The agent is a node in the human's intuition.

This collapse happens when the system is deeply integrated with the human. The human's input is part of the agent's context. The agent's output is the human's decision. The boundary is conceptual, not operational.


The Boundary Failure Modes

When the boundaries are wrong, the system fails in specific ways:

Failure Mode 1: The Agent Has Too Much

The boundary is shifted too far toward the agent. The human reviews too little. The agent takes actions that should have been reviewed. The failures aren't caught.

This is the graduated autonomy failure — the agent earns trust but the trust is not calibrated. The agent takes actions it shouldn't take. The human isn't there to catch them.

Failure Mode 2: The Human Has Too Much

The boundary is shifted too far toward the human. The agent waits for approvals that the agent could make autonomously. The system is slow. The reviewers are overloaded.

This is the drift into theater: the human becomes the bottleneck, the reviews become routine rubber stamps, and oversight exists in name only.

Failure Mode 3: The Boundary Is Unclear

The system has no clear rule for which actions are the agent's and which are the human's. The boundary is negotiated case by case. The negotiation is inconsistent. The reviewers don't know what to expect.

This is the classification engine problem: right-sizing the review for each action type was never designed in, so the boundary stays implicit.

Failure Mode 4: The Boundary Is Static

The system has a clear rule but never changes it. The action types, thresholds, and review patterns are fixed. The agent improves, the policy doesn't adjust, the review becomes miscalibrated.

This is the model versioning problem — the boundary is correct for one model version but stale for the next.

Failure Mode 5: The Boundary Is Visible

The boundary is exposed in a way that allows gaming. The reviewers learn which actions to approve and which to reject. The agent learns which actions to propose to which reviewers. The system optimizes for the boundary, not for the decision.

This is the gaming-the-gate failure: the visible boundary becomes a target for optimization.


The Boundary Management Architecture

The HITL system that manages boundaries well has five components:

Component 1: Explicit Boundary Definition

Every action type has an explicit boundary definition in the manifest. The boundary defines:

  • What the agent does (preparation, draft proposal, refinement, post-execution)
  • What the human reviews (proposal, refinement, modification, decision)
  • What the system enforces (validation, escalation, monitoring)

The boundary is recorded in the manifest. The boundary is versioned. The boundary is reviewable.
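A manifest entry of this kind could be modeled as a small versioned record. The sketch below is an illustrative assumption about the shape of such an entry, not the format of any particular product. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundaryDefinition:
    action_type: str
    version: int
    agent_does: tuple[str, ...]
    human_reviews: tuple[str, ...]
    system_enforces: tuple[str, ...]

    def owners_of(self, step: str) -> list[str]:
        """List every participant the manifest assigns to a step (there may be several)."""
        owners = []
        if step in self.agent_does:
            owners.append("agent")
        if step in self.human_reviews:
            owners.append("human")
        if step in self.system_enforces:
            owners.append("system")
        return owners


# Hypothetical entry for a refund action, using the step names from the list above.
refund = BoundaryDefinition(
    action_type="customer_refund",
    version=3,
    agent_does=("preparation", "draft proposal", "refinement", "post-execution"),
    human_reviews=("proposal", "refinement", "modification", "decision"),
    system_enforces=("validation", "escalation", "monitoring"),
)
```

Here `refund.owners_of("refinement")` returns both the agent and the human, which is the overlap the manifest is meant to make explicit instead of leaving it to case-by-case negotiation.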

Component 2: Boundary Calibration

The boundary is calibrated to the action's risk profile, the agent's reliability, and the human's expertise. The calibration is dynamic — the boundary adjusts as the participants' capabilities change.

The calibration uses the continuous feedback loop — the override rate, the error rate, the reviewer pattern, the agent's performance.
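A calibration step could look roughly like the sketch below, which turns review records into an override rate and an error rate and suggests a direction for the boundary. The thresholds, the minimum sample size, and all names are assumptions chosen for illustration. Real values would have to come from the action's risk profile.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ReviewRecord:
    overridden: bool  # the reviewer changed or rejected the proposal
    error: bool       # the executed action was later found to be wrong


def rates(records: Sequence[ReviewRecord]) -> tuple[float, float]:
    """Return (override_rate, error_rate); both are 0.0 for an empty history."""
    n = len(records)
    if n == 0:
        return 0.0, 0.0
    return sum(r.overridden for r in records) / n, sum(r.error for r in records) / n


def suggest(
    records: Sequence[ReviewRecord],
    max_override: float = 0.05,  # assumed threshold
    max_error: float = 0.01,     # assumed threshold
    min_samples: int = 100,      # assumed minimum evidence
) -> str:
    """Suggest 'tighten', 'loosen', or 'hold' for the review boundary."""
    if len(records) < min_samples:
        return "hold"  # not enough evidence to move the boundary
    override_rate, error_rate = rates(records)
    if error_rate > max_error:
        return "tighten"
    if override_rate <= max_override:
        return "loosen"
    return "hold"
```

The output is only a suggestion. Whether a person or the system acts on it is itself a boundary decision.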

Component 3: Boundary Negotiation

When the boundary is contested (the human disagrees with the agent, the system disagrees with the human, the deployer disagrees with the system), the negotiation is structured. The escalation paths are clear. The decision authority is explicit.

The negotiation produces a decision. The decision is recorded. The pattern of decisions informs the next calibration.

Component 4: Boundary Visibility

The boundary is visible to the deployer, the reviewer, and the agent — but not in a way that enables gaming. The deployer sees the boundary's current state. The reviewer sees what they're responsible for. The agent knows what it can do autonomously.

The visibility is structural (the manifest), not probabilistic (the pattern): participants learn the boundary from what is declared, not by inferring it from past approvals. The boundary is in the code, not in the behavior.

Component 5: Boundary Evolution

The boundary evolves over time. The action types graduate from synchronous review to sampled to autonomous. The review patterns adapt to the agent's improvement. The human's role shifts as the system matures.

The evolution is structured, not ad-hoc. The graduation criteria are documented. The review process is consistent. The system's maturity is measurable.
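The graduation path can be written down as a small state machine. In this sketch an action type moves one step at a time toward autonomy when its documented criteria are met. The rule that an incident drops it straight back to synchronous review is an added assumption, not something the stages above require. All names are hypothetical.

```python
from enum import IntEnum


class ReviewMode(IntEnum):
    SYNCHRONOUS = 0  # every action is reviewed before execution
    SAMPLED = 1      # a sample of actions is reviewed
    AUTONOMOUS = 2   # no routine review


def next_mode(current: ReviewMode, meets_criteria: bool, incident: bool) -> ReviewMode:
    """Advance at most one step per evaluation; fall back fully on an incident."""
    if incident:
        return ReviewMode.SYNCHRONOUS  # assumed demotion rule
    if meets_criteria and current < ReviewMode.AUTONOMOUS:
        return ReviewMode(current + 1)
    return current
```

Advancing one step per evaluation keeps an action type from jumping from full review to no review on the strength of a single good measurement window.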


The Boundary in Practice: A Concrete Example

Consider a customer refund action:

Stage 1: Preparation (Agent)

The agent aggregates the customer's history, the original purchase, the refund request. The agent drafts a refund proposal with the amount, the reason, the expected customer experience. The agent's reasoning is captured.

Stage 2: Validation (System)

The system validates the proposal against the policy. The policy says refunds above $1,000 require senior reviewer approval. The refund amount is $487, so the policy routes the proposal to a tier 1 reviewer.
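The routing rule in this stage is small enough to sketch directly. The threshold comes from the example. The function name, the tier labels, and the use of `Decimal` for money are illustrative assumptions.

```python
from decimal import Decimal

SENIOR_THRESHOLD = Decimal("1000")  # refunds above this amount go to a senior reviewer


def route_refund(amount: Decimal) -> str:
    """Return the reviewer tier for a refund amount."""
    if amount <= 0:
        raise ValueError("refund amount must be positive")
    # "Above $1,000" is read strictly: exactly 1000 still goes to tier 1.
    return "senior" if amount > SENIOR_THRESHOLD else "tier1"


tier = route_refund(Decimal("487"))
```

Whether a refund of exactly $1,000 belongs to tier 1 or to the senior reviewer is the kind of edge the policy text should state outright. The sketch takes "above" literally.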

Stage 3: Review (Human)

The tier 1 reviewer sees the proposal, the customer's history, and the agent's reasoning. The reviewer modifies the proposal, adjusting the reason text to be more empathetic, and then approves the modified proposal.

Stage 4: Execution (Agent)

The agent executes the modified refund. The refund is processed. The customer is notified.

Stage 5: Post-Execution (System)

The system monitors the refund. The customer's response is tracked. The outcome is logged.

Stage 6: Audit (All)

The audit trail records the agent's preparation, the system's validation, the human's review and modification, the agent's execution, the system's monitoring. Every boundary is preserved. Every contribution is attributed.

The boundary moves through the six stages. The agent prepares and executes. The human reviews and modifies. The system validates and monitors. The deployer configures the boundary. The vendor provides the model. Each participant contributes within their boundary. Each contribution is recorded.


The Boundary and the Audit Trail

The audit trail is the proof of the boundary's correctness. The trail records:

  • What the agent did, with what reasoning, in what context
  • What the system enforced, with what policy, at what thresholds
  • What the human decided, with what modification, in how much time
  • What was executed, with what parameters, in what order

The audit trail shows the boundary in action. The trail is the proof that the agent stopped at the right point, the human started at the right point, the system enforced at the right point. The trail is the boundary made visible, accountable, and verifiable.
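As a rough illustration, the trail for the refund example could be a list of attributed entries that can be grouped by participant. The entry fields and the helper `contributions` are hypothetical. A production trail would also need timestamps and tamper protection, which this sketch leaves out.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditEntry:
    stage: str   # e.g. "preparation", "review"
    actor: str   # "agent", "system", or "human"
    detail: str  # free-text summary of what was done


def contributions(trail: list[AuditEntry]) -> dict[str, list[str]]:
    """Group the recorded stages by the participant they are attributed to."""
    by_actor: dict[str, list[str]] = defaultdict(list)
    for entry in trail:
        by_actor[entry.actor].append(entry.stage)
    return dict(by_actor)


# The refund example, one entry per stage.
trail = [
    AuditEntry("preparation", "agent", "drafted refund proposal for $487"),
    AuditEntry("validation", "system", "below $1,000 threshold, routed to tier 1"),
    AuditEntry("review", "human", "rewrote reason text, approved"),
    AuditEntry("execution", "agent", "processed refund, notified customer"),
    AuditEntry("monitoring", "system", "tracked customer response, logged outcome"),
]
summary = contributions(trail)
```

Grouping by actor answers the attribution question directly: for any executed action, it shows which participant did what, and in what order.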


The Cultural Shift: The Boundary Is the Architecture

The teams that get HITL right treat the boundary as the architecture. The boundary is not a side effect of the implementation — it is the implementation. The boundary defines the system. The system is the boundary.

The cultural shift is from "where does the agent end and the human start" to "where should the agent end and the human start, and how do we make sure the boundary is right."

The teams that don't make this shift treat the boundary as an accident. The boundary emerges from the implementation. The implementation reflects the team's preferences. The preferences change. The boundary drifts. The system degrades.


Where Facio Fits

Facio's policy engine is the boundary definition. The manifest specifies, for each action type, what the agent does, what the human reviews, what the system enforces. The boundary is in the code.

Placet.io's review interface is the boundary surface. The interface shows what the human is responsible for. The modification is structured. The decision is captured. The boundary is visible to the reviewer.

The continuous calibration is the boundary management. The override rate, the error rate, the reviewer pattern, the agent's reliability — all measured. The boundary is calibrated dynamically.

The audit trail is the boundary proof. Every contribution is recorded. The boundary is verifiable. The system's behavior is defensible.

The boundary is the architecture. Facio is designed for the boundary.


Key Takeaways

  • The HITL boundary is not what the textbook shows — it's contested, contextual, constantly shifting across five dimensions
  • Five boundary types: reasoning, action, context, decision, accountability — each contested differently
  • Five shifts: scope creep into human, scope creep into agent, boundary friction, boundary inversion, boundary collapse
  • Five failure modes: agent too much, human too much, boundary unclear, boundary static, boundary visible (gameable)
  • The boundary management architecture: explicit definition, calibration, negotiation, visibility, evolution
  • The boundary is the architecture — the teams that get HITL right treat the boundary as the architecture, not a side effect
  • The audit trail is the boundary proof — every contribution is recorded, every boundary is preserved, every participant is attributed
  • Facio + Placet.io are designed around the boundary — the manifest defines it, the interface surfaces it, the calibration manages it, the audit trail proves it

Sources: The HITL boundary analysis draws on activity theory (the distribution of work between actors), distributed cognition research (how cognitive work is shared between humans and machines), the documented patterns of human-AI collaboration in production systems during 2025-2026, and the established architectural patterns for human-in-the-loop control systems.
