AI Agent Incident Patterns to Watch

The recurring incidents are not just bad answers; they involve authorized credentials executing the wrong side effect.

What changed

The old guardrail conversation centered on prompts, filters, and model behavior. Production agents have a second problem: they can call tools that move money, expose data, update records, deploy code, and message customers.

Old question	Better production question
Will the model say the right thing?	Can the proposed action run under policy?
Did we log the chat?	Did we record the decision before execution?
Can a human review the output?	Can a human stop the side effect before it happens?

Practical takeaway

Pick one protected action, wrap it in an authorization check, define the allow and review thresholds, and export the decision record. That is a stronger response than publishing generic thought leadership.

first-protected-action.yaml

protected_action:
  tool: release_payment
  allow_when:
    amount_usd: "<= 500"
  require_approval_when:
    amount_usd: "> 500"
  deny_when:
    tenant_mismatch: true

FAQ

What should a team authorize before production agent actions?

Authorize the exact tool name, arguments, actor, tenant, environment, and review requirement before the side effect reaches the upstream system.

Why not rely on prompts for this?

Prompts guide model behavior, but they do not reliably stop a tool dispatch. Runtime authorization sits after the model proposes an action and before the tool executes.
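
The placement described above, after the proposal and before execution, can be sketched as a small dispatch wrapper. This is an illustration rather than any particular framework's API: guarded_dispatch, ActionBlocked, and the authorize callback are hypothetical names, and the verdict strings are assumed to match the policy file.

```python
from typing import Any, Callable, Dict


class ActionBlocked(Exception):
    """Raised when a proposed tool call is not allowed to execute."""


def guarded_dispatch(
    tool_name: str,
    arguments: Dict[str, Any],
    tools: Dict[str, Callable[..., Any]],
    authorize: Callable[[str, Dict[str, Any]], str],
) -> Any:
    """Run a model-proposed tool call only if the policy verdict is 'allow'."""
    # Unregistered tools are blocked outright instead of being looked up.
    if tool_name not in tools:
        raise ActionBlocked(f"unknown tool {tool_name!r}")
    # The verdict is computed before the tool is touched.
    verdict = authorize(tool_name, arguments)
    if verdict != "allow":
        # 'deny' and 'require_approval' both stop the side effect here;
        # a real system would queue the latter for human review.
        raise ActionBlocked(f"{tool_name} blocked with verdict {verdict!r}")
    return tools[tool_name](**arguments)
```

The point of the design is that the tool function is never invoked on a non-allow verdict, regardless of what the prompt or the model output says.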

What evidence should the page produce?

Keep a decision record with the actor, tool, arguments summary, policy version, verdict, reviewer when required, timestamp, and source system context.
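
One possible shape for such a record, sketched in Python with only the standard library. The field names mirror the list above; DecisionRecord, make_record, export_record, and the JSON export format are assumptions for illustration, not a prescribed schema.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class DecisionRecord:
    """Hypothetical decision record written before the tool executes."""
    actor: str
    tool: str
    arguments_summary: str
    policy_version: str
    verdict: str
    reviewer: Optional[str]  # set only when a review was required
    timestamp: str           # ISO 8601, UTC
    source_context: str


def make_record(actor: str, tool: str, arguments_summary: str,
                policy_version: str, verdict: str, source_context: str,
                reviewer: Optional[str] = None) -> DecisionRecord:
    """Build a record stamped with the current UTC time."""
    return DecisionRecord(
        actor=actor,
        tool=tool,
        arguments_summary=arguments_summary,
        policy_version=policy_version,
        verdict=verdict,
        reviewer=reviewer,
        timestamp=datetime.now(timezone.utc).isoformat(),
        source_context=source_context,
    )


def export_record(record: DecisionRecord) -> str:
    """Serialize the record as one JSON line for an audit log."""
    return json.dumps(asdict(record), sort_keys=True)
```

Summarizing the arguments rather than storing them verbatim keeps sensitive values such as account numbers out of the log while still showing what was decided.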
