
Human Review for AI Agents

Five approval patterns for production agents: pre-action, confidence-based, sampled, tiered, and post-action review.

Yaz Caleb · January 31, 2026 · 15 min

The goal is controlled autonomy: agents that operate independently within defined boundaries, escalate when necessary, and integrate human judgment where it adds the most value. This is not about slowing agents down. An agent that auto-approves 95% of actions and routes 5% to a human is faster than one that requires approval for everything, and far safer than one that requires approval for nothing.

The EU AI Act (Article 14) requires that high-risk AI systems can be "effectively overseen by natural persons" while they are in use. GDPR Article 22 grants individuals the right not to be subject to a decision "based solely on automated processing" that produces legal or similarly significant effects. SOC 2 CC7.2 calls for monitoring system components for anomalies. For many consequential AI workflows, human review is therefore not optional; it is also the practical control that keeps high-impact actions from becoming silent automation.

Five Approval Patterns

Not every action needs the same level of oversight. Production human-review systems use a mix of patterns, applied based on the risk and context of each tool call:

1. Pre-Action Approval

The agent proposes a tool call, pauses execution, and waits for a human to approve or deny before proceeding. This is the highest-friction pattern and should be reserved for consequential actions: fund transfers, data deletion, contract execution, or communications with legal implications.

pre_action_approval.yaml
# Pre-action: agent stops and waits before executing
- tool: transfer_funds
  action: require_approval
  approval:
    channel: workspace  # appears in Veto approval queue
    timeout: 300s  # 5 min to respond
    escalation: deny  # no response = denied
    context_shown:  # what the reviewer sees
      - tool_name
      - arguments
      - session_history  # full conversation context
      - risk_score

2. Confidence-Based Routing

Route to human review only when the agent's confidence falls below a threshold. For tasks like customer request classification, the agent handles clear-cut cases on its own and escalates ambiguous ones. This keeps throughput high while catching edge cases.

confidence_routing.yaml
# Confidence-based: route low-confidence decisions to humans
- tool: classify_ticket
  conditions:
    - match:
        context.confidence: ">= 0.9"
      action: allow
    - match:
        context.confidence: ">= 0.7"
      action: allow
      logging:
        level: full  # log for later review
        flag_for_review: true # async review queue
    - match:
        context.confidence: "< 0.7"
      action: require_approval
      approval:
        channel: approval_channel
        timeout: 600s
        escalation: queue  # keep in queue, do not auto-deny
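
The same routing can be sketched in plain Python. This is an illustration of the threshold logic only, not Veto's implementation: the function and type names are hypothetical, the rules are assumed to be evaluated top to bottom with the first match winning, and out-of-range scores are assumed to fail closed.

```python
from dataclasses import dataclass

# Thresholds copied from the YAML above; everything else is illustrative.
AUTO_ALLOW = 0.9
ALLOW_AND_FLAG = 0.7


@dataclass(frozen=True)
class Route:
    action: str            # "allow" or "require_approval"
    flag_for_review: bool  # True when the decision joins the async review queue


def route_by_confidence(confidence: float) -> Route:
    """Return the first matching rule, checked from the highest threshold down."""
    if not 0.0 <= confidence <= 1.0:
        # Malformed score (including NaN): send to a human rather than guess.
        return Route("require_approval", False)
    if confidence >= AUTO_ALLOW:
        return Route("allow", False)
    if confidence >= ALLOW_AND_FLAG:
        return Route("allow", True)
    return Route("require_approval", False)
```

Rule order matters here: a score of 0.95 also satisfies ">= 0.7", so the stricter rule has to be checked first.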

3. Sampled Approval

Review 100% of high-risk actions, but sample only 5-20% of low-risk actions for human review. This catches drift, validates agent behavior over time, and supports audit requirements without creating a bottleneck. The key insight: sampled review is not meant to catch every bad action. It is meant to maintain calibration and detect systematic failures.

sampled_approval.yaml
# Sampled: review a percentage of low-risk actions
- tool: send_internal_email
  conditions:
    - match:
        arguments.recipients_count: "<= 5"
      action: allow
      sampling:
        rate: 0.10  # review 10% of allowed actions
        channel: workspace
        async: true  # do not block the agent
    - match:
        arguments.recipients_count: "> 5"
      action: require_approval
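
One way to pick the sampled 10% is to hash a stable action ID, so the same action always gets the same answer and the selection can be reproduced during an audit. The sketch below assumes that approach; the function names and the idea of an action ID are hypothetical, and Veto may select samples differently.

```python
import hashlib


def is_sampled(action_id: str, rate: float) -> bool:
    """Deterministically select roughly `rate` of action IDs for review."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    digest = hashlib.sha256(action_id.encode("utf-8")).digest()
    # Compare the first 64 bits of the hash against a 64-bit cutoff.
    bucket = int.from_bytes(digest[:8], "big")
    return bucket < int(rate * 2**64)


def review_mode(recipients_count: int, action_id: str, rate: float = 0.10) -> str:
    """Mirror the YAML above: large sends block, small sends are sampled."""
    if recipients_count > 5:
        return "require_approval"
    return "allow_and_sample" if is_sampled(action_id, rate) else "allow"
```

Random sampling works too; the hashed variant simply makes it possible to explain later why a given action was or was not reviewed.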

4. Tiered Escalation

Route actions through progressively higher approval levels based on risk classification. Tier 1 goes to front-line reviewers with fast SLAs. Tier 2 goes to team leads. Tier 3 goes to domain specialists or executives. If a tier times out, escalate to the next level rather than auto-denying; only when the final tier also times out does the request fall back to a default outcome (deny, in the example below).

tiered_escalation.yaml
# Tiered: progressively higher approval authority
- tool: modify_customer_contract
  action: require_approval
  approval:
    tiers:
      - level: 1
        reviewers:
          - role: account_manager
        timeout: 1800s  # 30 min
        response_window: respond_within

      - level: 2
        reviewers:
          - role: team_lead
          - role: legal_ops
        timeout: 3600s  # 1 hour
        response_window: respond_within

      - level: 3
        reviewers:
          - role: vp_sales
          - role: general_counsel
        timeout: 86400s  # 24 hours
        response_window: respond_within

    final_escalation: deny
    notify_on_escalation:
      channel: pagerduty
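
The escalation walk itself is short. The sketch below is an assumption about how such a policy could be evaluated, not Veto's engine: `Tier`, `escalate`, and the `ask_tier` callback are hypothetical names, and the callback stands in for whatever actually notifies reviewers and waits out the tier's timeout.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Tier:
    level: int
    roles: Tuple[str, ...]
    timeout_s: int


# Mirrors the three tiers in the YAML above.
TIERS = (
    Tier(1, ("account_manager",), 1800),
    Tier(2, ("team_lead", "legal_ops"), 3600),
    Tier(3, ("vp_sales", "general_counsel"), 86400),
)

# Hypothetical callback: asks one tier and returns True (approved),
# False (denied), or None if the tier's timeout elapsed with no answer.
AskTier = Callable[[Tier], Optional[bool]]


def escalate(
    tiers: Sequence[Tier],
    ask_tier: AskTier,
    final_escalation: str = "deny",
) -> Tuple[str, Optional[int]]:
    """Walk the tiers in order; return (outcome, level that decided)."""
    for tier in tiers:
        answer = ask_tier(tier)
        if answer is None:
            continue  # timed out: hand off to the next tier
        return ("allow" if answer else "deny", tier.level)
    # Every tier timed out: apply the configured fallback.
    return (final_escalation, None)
```

An explicit denial at any tier ends the walk; only silence moves the request upward.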

5. Post-Action Review

Allow the action immediately but queue it for asynchronous human review. If the reviewer flags an issue, the system can trigger a rollback or corrective action. This pattern works for reversible operations where speed matters more than pre-approval.

post_action_review.yaml
# Post-action: allow immediately, review after
- tool: update_customer_record
  action: allow
  post_action:
    review_required: true
    channel: workspace
    reviewer_pool:
      - role: data_quality
    rollback_enabled: true
    rollback_tool: revert_customer_record
    review_sla: 4hours
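
The mechanics behind this pattern can be sketched as a small queue that remembers how to undo each action. Everything here is hypothetical (class names, the `apply` and `undo` callables); it only illustrates the allow, review, then roll back flow described above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class PendingReview:
    action_id: str
    undo: Callable[[], None]  # how to revert this specific action


@dataclass
class PostActionQueue:
    pending: Dict[str, PendingReview] = field(default_factory=dict)

    def execute(self, action_id: str, apply: Callable[[], None],
                undo: Callable[[], None]) -> None:
        """Run the action immediately, then park it for asynchronous review."""
        apply()
        self.pending[action_id] = PendingReview(action_id, undo)

    def resolve(self, action_id: str, flagged: bool) -> str:
        """Record the reviewer's verdict; roll back if the action was flagged."""
        review = self.pending.pop(action_id)
        if flagged:
            review.undo()
            return "rolled_back"
        return "confirmed"


# Usage: update a record, then have a reviewer flag the change.
record = {"email": "old@example.com"}
queue = PostActionQueue()
queue.execute(
    "a1",
    apply=lambda: record.update(email="new@example.com"),
    undo=lambda: record.update(email="old@example.com"),
)
outcome = queue.resolve("a1", flagged=True)
```

The pattern only holds if the undo step is captured at execution time; an action with no reliable undo belongs under pre-action approval instead.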

The Approval Queue: Veto Workspace

When an agent's tool call triggers require_approval, it appears in the Veto workspace's approval queue. Reviewers see the decision context: the tool being called, the arguments, the conversation history that led to the call, the policy that flagged it, and the risk score. They can approve, deny, or modify the arguments before approval.

Approvals can also be routed to the configured review channel, PagerDuty, or any webhook. The agent's execution pauses (with a configurable timeout) until a decision is made.

Implementation: The protect() + wait_for_approval() Pattern

The following pattern adds human-review approval workflows to a Claude agent:

human_review_implementation.py
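import os
import anthropic
from veto import Veto, Decision

# AsyncAnthropic keeps the model call from blocking the event loop
# inside the async function below.
client = anthropic.AsyncAnthropic()
veto = Veto(api_key=os.environ["VETO_API_KEY"], project="customer-ops-agent")

# TOOLS (the tool schemas) and execute_tool() (an async dispatcher) are
# application-specific and must be defined elsewhere.

async def run_agent_with_human_review(user_message: str, context: dict):
    messages = [{"role": "user", "content": user_message}]

    while True:
        response = await client.messages.create(
            model=os.environ["ANTHROPIC_MODEL"],
            max_tokens=4096,
            tools=TOOLS,
            messages=messages,
        )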

        if response.stop_reason != "tool_use":
            return response

        tool_blocks = [b for b in response.content if b.type == "tool_use"]
        tool_results = []

        for block in tool_blocks:
            decision = veto.protect(
                tool=block.name,
                arguments=block.input,
                context=context,
            )

            if decision.action == Decision.ALLOW:
                result = await execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(result),
                })

            elif decision.action == Decision.DENY:
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": f"BLOCKED: {decision.reason}",
                    "is_error": True,
                })

            elif decision.action == Decision.APPROVAL_REQUIRED:
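                # Agent execution pauses here.
                # A notification fires to the configured channel.
                # The reviewer sees decision context in the Veto workspace.
                # Note: the call below is synchronous, so it also blocks the
                # event loop; run it in a worker thread (for example with
                # asyncio.to_thread) if other coroutines must keep running.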
                approval = veto.wait_for_approval(
                    decision_id=decision.id,
                    timeout=decision.approval_timeout,
                )

                if approval.granted:
                    # Reviewer approved: execute with the original
                    # or modified arguments
                    final_args = approval.modified_arguments or block.input
                    result = await execute_tool(block.name, final_args)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": str(result),
                    })
                elif approval.timed_out:
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": "Approval timed out: action not taken",
                        "is_error": True,
                    })
                else:
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": f"DENIED by {approval.reviewer}: {approval.reason}",
                        "is_error": True,
                    })

            else:
                # Unrecognized decision: fail closed, and still return a
                # tool_result so every tool_use block gets an answer.
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": f"BLOCKED: unrecognized decision {decision.action!r}",
                    "is_error": True,
                })

        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})
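
The listing above uses TOOLS and execute_tool without defining them. A minimal sketch of what they might look like follows. The schema layout (name, description, input_schema) is the format the Anthropic Messages API expects for tools; the classify_ticket handler, its keyword logic, and the HANDLERS registry are placeholders invented for illustration.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

# One tool schema in the shape the Messages API expects.
TOOLS = [
    {
        "name": "classify_ticket",
        "description": "Assign a support ticket to a category.",
        "input_schema": {
            "type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"],
        },
    }
]


async def classify_ticket(text: str) -> str:
    # Placeholder logic; a real handler would call a classifier or service.
    return "billing" if "invoice" in text.lower() else "general"


# Hypothetical registry mapping tool names to async handlers.
HANDLERS: Dict[str, Callable[..., Awaitable[Any]]] = {
    "classify_ticket": classify_ticket,
}


async def execute_tool(name: str, arguments: Dict[str, Any]) -> Any:
    """Dispatch a tool call to its handler, rejecting unknown tool names."""
    handler = HANDLERS.get(name)
    if handler is None:
        raise ValueError(f"Unknown tool: {name}")
    return await handler(**arguments)
```

Keeping the dispatcher separate from the policy check means the approval logic never needs to know how a tool is implemented, only its name and arguments.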

Calibrating Your Thresholds

The most common mistake in human-review systems is over-routing. If 50% of actions require approval, your reviewers develop alert fatigue and start rubber-stamping everything. The goal is to route less than 10% of total actions to human review, but route 100% of materially high-risk actions.

Route aggressively at first (require approval for anything uncertain), then loosen thresholds as you build confidence in the agent's behavior. Veto's decision records give you the data to calibrate: look at approval rates, denial rates, and the reasons for each. If a particular tool call is approved 99% of the time, it is a candidate for automatic approval with sampled review.
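
As a rough sketch of that calibration step, the function below computes per-tool approval rates from exported decision records and lists tools that clear a 99% bar. The record shape (a tool name plus an approved flag) is an assumption about what an export might contain, and the minimum sample size is an added safeguard that the text above does not specify.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def approval_rates(records: Iterable[Tuple[str, bool]]) -> Dict[str, Tuple[float, int]]:
    """Map each tool to (approval rate, number of reviewed requests)."""
    counts = defaultdict(lambda: [0, 0])  # tool -> [approved, total]
    for tool, approved in records:
        counts[tool][0] += int(approved)
        counts[tool][1] += 1
    return {tool: (a / n, n) for tool, (a, n) in counts.items()}


def auto_approve_candidates(
    records: Iterable[Tuple[str, bool]],
    min_rate: float = 0.99,
    min_samples: int = 200,  # assumed floor so a handful of approvals is not enough
) -> List[str]:
    """Tools approved often enough, over enough reviews, to consider loosening."""
    return sorted(
        tool
        for tool, (rate, n) in approval_rates(records).items()
        if n >= min_samples and rate >= min_rate
    )
```

A tool that appears in this list is a candidate for sampled review, not proof that it is safe; the denial reasons for the remaining 1% are worth reading before changing the policy.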

Timeout Strategies

What happens when nobody reviews an approval request? Your timeout strategy depends on the consequence of the action:

  • Default deny: For irreversible actions (fund transfers, data deletion). If no one reviews it, the action does not happen. The agent is told the action was denied and must inform the user.
  • Escalate: For time-sensitive actions (customer-facing responses). If the primary reviewer does not respond, escalate to the next tier. Keep escalating until someone responds or the final timeout is reached.
  • Default allow: For low-risk, reversible actions where the cost of delay exceeds the cost of a mistake. Auto-approve after timeout but flag for post-action review.
  • Queue: For non-urgent actions. Keep the request in the queue indefinitely. The agent tells the user "this action is pending review" and moves on to other tasks.
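
The four strategies reduce to a small decision table. The sketch below encodes them with hypothetical names; it also assumes that an escalation with no tier left fails closed, which the list above leaves open.

```python
from enum import Enum
from typing import NamedTuple


class TimeoutStrategy(Enum):
    DEFAULT_DENY = "deny"
    ESCALATE = "escalate"
    DEFAULT_ALLOW = "allow"
    QUEUE = "queue"


class TimeoutOutcome(NamedTuple):
    execute: bool      # should the action run now?
    status: str        # what the agent reports back
    post_review: bool  # should the action be flagged for later review?


def on_timeout(strategy: TimeoutStrategy, has_next_tier: bool = False) -> TimeoutOutcome:
    """Decide what happens when an approval request goes unanswered."""
    if strategy is TimeoutStrategy.DEFAULT_ALLOW:
        # Auto-approve, but flag for post-action review.
        return TimeoutOutcome(True, "auto_approved", True)
    if strategy is TimeoutStrategy.QUEUE:
        # Stay pending; the agent moves on to other tasks.
        return TimeoutOutcome(False, "pending_review", False)
    if strategy is TimeoutStrategy.ESCALATE and has_next_tier:
        return TimeoutOutcome(False, "escalated", False)
    # DEFAULT_DENY, or ESCALATE with no tier left (assumed to fail closed).
    return TimeoutOutcome(False, "denied", False)
```

Only one branch ever executes the action without a human decision, which makes the default-allow cases easy to audit.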

Regulatory Alignment

For high-impact workflows, human review can be a legal, contractual, or assurance requirement across multiple frameworks:

  • EU AI Act, Art. 14: Requires that high-risk AI systems can be "effectively overseen by natural persons," including the ability to "intervene in the operation of the high-risk AI system or interrupt the system."
  • GDPR, Art. 22: Grants the right to "obtain human intervention on the part of the controller" for solely automated decisions that produce legal or similarly significant effects.
  • SOC 2, CC8.1: Requires that changes to infrastructure, data, software, and procedures be authorized, tested, and approved before implementation, which in practice means documented approval workflows.
  • SOX Section 404: Requires internal controls over financial reporting; in practice these commonly include approval workflows for transactions above materiality thresholds.

Veto's approval logs provide the evidence record these frameworks often require: who approved what, when, with what context, and what the outcome was.

First governed call

Adding human review to an existing agent takes two changes: set action: require_approval on the tools that need it, and add the wait_for_approval() call in your tool execution path. The Veto workspace handles the reviewer UI, notifications, and decision records.

Sign up and add approval workflows to your agent, or read about EU AI Act evidence to understand the regulatory requirements in detail.
