Claude Agent Guardrails: Anthropic SDK Security
Secure Claude agents with runtime authorization: Anthropic SDK code, the protect() pattern, YAML policies, and decision records for SOC 2 and GDPR.
Claude agents are high-impact because Claude is unusually willing to act. Give Claude a set of tools and a goal, and it will chain tool calls across dozens of turns to achieve that goal. This is what gives Claude agents operational value. The same autonomy makes them unbounded without runtime authorization. A Claude agent with access to a file system, shell, and API credentials has the same blast radius as a junior engineer with root access and no code review process.
Anthropic's Claude Agent SDK includes a hooks system that serves this purpose: tool_use decisions need external checkpoints. This guide covers the authorization stack for Claude agents, from the Messages API's tool_use flow to the Agent SDK's hook system, with YAML policies, TypeScript and Python implementations, and audit evidence that maps to SOC 2 and GDPR requirements.
Claude's tool_use Flow
Claude's tool calling works through the Messages API. You define tools as JSON schemas in the tools parameter. Claude decides when to call a tool by returning a tool_use content block in its response. Your application executes the tool and sends the result back as a tool_result block. The conversation continues until Claude stops requesting tools.
The critical gap sits between Claude returning a tool_use block and your application executing it. In most tutorials and sample code, that execution is unconditional: whatever Claude asks for, the application does. The protect() pattern inserts a policy evaluation at exactly that point.
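To make the gap concrete, here is a minimal sketch of the step that turns one assistant turn's tool_use blocks into tool_result blocks, with a check in between. It uses plain dicts shaped like Messages API content blocks and makes no network calls. The names `handle_tool_use`, `authorize`, `handlers`, and `workspace_only` are hypothetical stand-ins for illustration, not part of any SDK.

```python
from typing import Any, Callable

# Hypothetical stand-in for a policy check: returns (allowed, reason).
Authorize = Callable[[str, dict[str, Any]], tuple[bool, str]]


def handle_tool_use(
    content_blocks: list[dict[str, Any]],
    handlers: dict[str, Callable[..., Any]],
    authorize: Authorize,
) -> list[dict[str, Any]]:
    """Turn the tool_use blocks of one assistant turn into tool_result blocks."""
    results = []
    for block in content_blocks:
        if block.get("type") != "tool_use":
            continue  # text blocks need no tool_result
        allowed, reason = authorize(block["name"], block["input"])
        if allowed:
            output = handlers[block["name"]](**block["input"])
            results.append({
                "type": "tool_result",
                "tool_use_id": block["id"],
                "content": str(output),
            })
        else:
            # Every tool_use still needs a matching tool_result, so a denial
            # is reported back to the model as an error result.
            results.append({
                "type": "tool_result",
                "tool_use_id": block["id"],
                "content": f"BLOCKED: {reason}",
                "is_error": True,
            })
    return results


def workspace_only(tool: str, args: dict[str, Any]) -> tuple[bool, str]:
    """Toy policy (illustrative only): allow reads under /workspace/."""
    if tool == "read_file" and args.get("path", "").startswith("/workspace/"):
        return True, ""
    return False, "outside permitted scope"


assistant_turn = [
    {"type": "text", "text": "Let me check two files."},
    {"type": "tool_use", "id": "toolu_a", "name": "read_file",
     "input": {"path": "/workspace/notes.txt"}},
    {"type": "tool_use", "id": "toolu_b", "name": "read_file",
     "input": {"path": "/etc/shadow"}},
]
tool_results = handle_tool_use(
    assistant_turn,
    handlers={"read_file": lambda path: f"<contents of {path}>"},
    authorize=workspace_only,
)
```

Without the `authorize` call, both reads would run. With it, the second block comes back as an error result that Claude can see and react to, and the conversation stays well-formed because each tool_use id still has a matching tool_result.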
The protect() Pattern
Wrap governed tool_use blocks in a Veto protect() call before executing the tool. The pattern is identical in Python and TypeScript:
import os

import anthropic
from veto import Veto, Decision

# Async client, so the awaited Messages API call does not block the event loop.
client = anthropic.AsyncAnthropic()
veto = Veto(api_key=os.environ["VETO_API_KEY"], project="claude-agent")

TOOLS = [
    {
        "name": "read_file",
        "description": "Read a file from the filesystem",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "File path to read"}
            },
            "required": ["path"],
        },
    },
    {
        "name": "execute_command",
        "description": "Run a shell command",
        "input_schema": {
            "type": "object",
            "properties": {
                "command": {"type": "string", "description": "Shell command"}
            },
            "required": ["command"],
        },
    },
    {
        "name": "call_api",
        "description": "Make an HTTP API request",
        "input_schema": {
            "type": "object",
            "properties": {
                "url": {"type": "string"},
                "method": {"type": "string", "enum": ["GET", "POST", "PUT", "DELETE"]},
                "body": {"type": "object"},
            },
            "required": ["url", "method"],
        },
    },
]


async def run_claude_agent(user_message: str, user_id: str):
    messages = [{"role": "user", "content": user_message}]

    while True:
        response = await client.messages.create(
            model=os.environ["ANTHROPIC_MODEL"],
            max_tokens=4096,
            tools=TOOLS,
            messages=messages,
        )

        if response.stop_reason != "tool_use":
            return response

        tool_blocks = [b for b in response.content if b.type == "tool_use"]
        tool_results = []

        for block in tool_blocks:
            # The policy check sits between Claude's request and execution.
            decision = veto.protect(
                tool=block.name,
                arguments=block.input,
                context={
                    "user_id": user_id,
                    "model": response.model,  # model that produced this turn
                    "session_id": response.id,
                    "turn_count": len(messages) // 2,
                },
            )

            if decision.action == Decision.ALLOW:
                # execute_tool is the application's own dispatcher (not shown).
                result = await execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(result),
                })
            elif decision.action == Decision.DENY:
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": f"BLOCKED: {decision.reason}",
                    "is_error": True,
                })
            elif decision.action == Decision.APPROVAL_REQUIRED:
                approval = veto.wait_for_approval(
                    decision_id=decision.id,
                    timeout=decision.approval_timeout,
                )
                if approval.granted:
                    result = await execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": str(result),
                    })
                else:
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": f"DENIED: {approval.reason}",
                        "is_error": True,
                    })
            else:
                # Fail closed on any unrecognized action. Every tool_use block
                # still needs a matching tool_result in the next user message.
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": "BLOCKED: unrecognized policy decision",
                    "is_error": True,
                })

        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})

Claude Agent SDK Hooks
Anthropic's Claude Agent SDK (the framework behind Claude Code) includes a hooks system that fits this use case. Hooks are lifecycle callbacks that fire at specific points in the agent loop; the SDK's documented events include PreToolUse and PostToolUse, which run before and after each tool call. The example below uses before_tool_call and after_tool_call as simplified names for those two points. Veto integrates at the pre-tool hook to authorize governed tool invocations before they execute.
import { Agent, AgentHooks } from "@anthropic-ai/agent-sdk";
import { Veto, Decision } from "veto";

const veto = new Veto({ apiKey: process.env.VETO_API_KEY!, project: "claude-code-agent" });

const authorizationHooks: AgentHooks = {
  async before_tool_call({ toolName, toolInput, context }) {
    const decision = await veto.protect({
      tool: toolName,
      arguments: toolInput,
      context: {
        userId: context.userId,
        sessionId: context.sessionId,
        agentId: context.agentId,
      },
    });

    if (decision.action === Decision.DENY) {
      return {
        abort: true,
        result: `BLOCKED: ${decision.reason}`,
      };
    }

    if (decision.action === Decision.APPROVAL_REQUIRED) {
      const approval = await veto.waitForApproval({
        decisionId: decision.id,
        timeout: decision.approvalTimeout,
      });
      if (!approval.granted) {
        return {
          abort: true,
          result: `DENIED by ${approval.reviewer}: ${approval.reason}`,
        };
      }
    }

    return { abort: false };
  },

  async after_tool_call({ toolName, toolInput, result, context }) {
    await veto.logExecution({
      tool: toolName,
      arguments: toolInput,
      result: typeof result === "string" ? result : JSON.stringify(result),
      context: { sessionId: context.sessionId },
    });
  },
};

const agent = new Agent({
  model: process.env.ANTHROPIC_MODEL!,
  tools: [readFile, executeCommand, callApi],
  hooks: authorizationHooks,
});

YAML Policies for Claude Agents
Claude agents have characteristic tool patterns: file system access, shell execution, API calls, and code generation. Policies should reflect the specific risks of each:
name: claude-agent
description: "Runtime authorization for Claude-powered agents"

# Regex patterns containing backslashes are single-quoted: YAML double-quoted
# strings reject escapes such as \s and \.

rules:
  # File access: allow project directory, deny system paths
  - tool: read_file
    conditions:
      - match:
          arguments.path: "^/workspace/"
        action: allow
      - match:
          arguments.path: "^(/etc/|/var/|/usr/|/sys/|/proc/)"
        action: deny
        reason: "System path access denied"
      - match:
          arguments.path: '\.(env|pem|key|secret|credentials)$'
        action: deny
        reason: "Sensitive file access denied"

  # Shell commands: allow read-only commands, gate destructive ones
  - tool: execute_command
    conditions:
      # Deny patterns first: a chained command such as `echo ok && rm -rf /`
      # must not match the read-only allow rule.
      - match:
          arguments.command: '(rm\s+-rf|dd\s|mkfs|chmod\s+777|curl.*\|.*sh)'
        action: deny
        reason: "Blocked command pattern"
      - match:
          arguments.command: '^(ls|cat|grep|find|wc|head|tail|echo)\s'
        action: allow
      - match:
          arguments.command: '^(git|npm|pip|cargo)\s'
        action: allow
        logging:
          level: full
      - match:
          arguments.command: ".*"
        action: require_approval
        approval:
          channel: workspace
          timeout: 120s

  # API calls: scope by domain and method
  - tool: call_api
    conditions:
      - match:
          arguments.method: "GET"
          arguments.url: '^https://(api\.approved\.example\.com|internal\.service)'
        action: allow
      - match:
          arguments.method: "(POST|PUT|DELETE)"
        action: require_approval
        approval:
          channel: approval_channel
          timeout: 300s
      - match:
          arguments.url: ".*"
        action: deny
        reason: "External API access not permitted"

default_action: deny

logging:
  level: full
  retention: 1year

TypeScript Implementation: Full Agent
import Anthropic from "@anthropic-ai/sdk";
import { Veto, Decision } from "veto";

const anthropic = new Anthropic();
const veto = new Veto({ apiKey: process.env.VETO_API_KEY!, project: "claude-ts-agent" });

const tools: Anthropic.Tool[] = [
  {
    name: "read_file",
    description: "Read a file from the filesystem",
    input_schema: {
      type: "object" as const,
      properties: { path: { type: "string" } },
      required: ["path"],
    },
  },
  {
    name: "execute_command",
    description: "Run a shell command",
    input_schema: {
      type: "object" as const,
      properties: { command: { type: "string" } },
      required: ["command"],
    },
  },
];

async function runAgent(userMessage: string, userId: string) {
  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: userMessage },
  ];

  while (true) {
    const response = await anthropic.messages.create({
      model: process.env.ANTHROPIC_MODEL!,
      max_tokens: 4096,
      tools,
      messages,
    });

    if (response.stop_reason !== "tool_use") return response;

    const toolBlocks = response.content.filter(
      (b): b is Anthropic.ToolUseBlock => b.type === "tool_use"
    );
    const toolResults: Anthropic.ToolResultBlockParam[] = [];

    for (const block of toolBlocks) {
      const decision = await veto.protect({
        tool: block.name,
        arguments: block.input as Record<string, unknown>,
        context: { userId, sessionId: response.id },
      });

      if (decision.action === Decision.ALLOW) {
        const result = await executeTool(block.name, block.input);
        toolResults.push({
          type: "tool_result",
          tool_use_id: block.id,
          content: String(result),
        });
      } else if (decision.action === Decision.DENY) {
        toolResults.push({
          type: "tool_result",
          tool_use_id: block.id,
          content: `BLOCKED: ${decision.reason}`,
          is_error: true,
        });
      } else {
        const approval = await veto.waitForApproval({
          decisionId: decision.id,
          timeout: decision.approvalTimeout,
        });
        toolResults.push({
          type: "tool_result",
          tool_use_id: block.id,
          content: approval.granted
            ? String(await executeTool(block.name, block.input))
            : `DENIED: ${approval.reason}`,
          is_error: !approval.granted,
        });
      }
    }

    messages.push({ role: "assistant", content: response.content });
    messages.push({ role: "user", content: toolResults });
  }
}

Decision Record: What Gets Logged
Every protect() call against a Claude agent tool_use block produces a decision record. Here is what a single denied file access looks like in the log:
{
  "record_id": "aud_cl_9x8w7v6u5t4s",
  "timestamp": "2026-04-04T14:22:08.331Z",
  "event_type": "tool_call_decision",
  "identity": {
    "agent_id": "claude-agent-v2",
    "model": "ANTHROPIC_MODEL",
    "session_id": "msg_01XYZ000",
    "triggered_by": {
      "user_id": "user_892",
      "email": "dev@approved.example"
    }
  },
  "tool_call": {
    "tool": "read_file",
    "arguments": {
      "path": "/etc/shadow"
    }
  },
  "policy_evaluation": {
    "policy_name": "claude-agent",
    "rule_matched": "rule_1_system_path_deny",
    "conditions_evaluated": [
      {"condition": "path starts with /workspace/", "result": false},
      {"condition": "path starts with /etc/", "result": true}
    ],
    "decision": "deny",
    "reason": "System path access denied"
  },
  "compliance_metadata": {
    "soc2_controls": ["CC6.1", "CC6.3", "CC7.2"],
    "retention_policy": "1year"
  }
}

SOC 2 Mapping for Claude Agent Control Evidence
Claude agent decision records can be mapped to SOC 2 Trust Services Criteria. The two most relevant controls:
- CC6.1 (Logical Access Controls): Every tool_use decision record includes the user identity, agent identity, model, session ID, and the policy that governed the decision. Your auditor can trace a governed agent action back to the human who initiated the session and the policy that authorized (or denied) it.
- CC6.3 (Access Authorization): YAML policies define per-tool, per-context authorization rules. Denial records show enforcement. Policy version history shows who changed what and when. The combination answers the auditor's question: "What was the agent authorized to do, and was that authorization appropriate?"
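As a rough illustration of how that tracing might work, the sketch below groups decision records by the control IDs they carry, so each control maps to the actions, users, and policies behind it. It assumes records follow the shape of the decision record shown earlier. The function name `index_by_control` and both sample records are hypothetical.

```python
from collections import defaultdict
from typing import Any


def index_by_control(records: list[dict[str, Any]]) -> dict[str, list[dict[str, str]]]:
    """Group decision records by the SOC 2 control IDs they are tagged with."""
    evidence: dict[str, list[dict[str, str]]] = defaultdict(list)
    for record in records:
        # Keep only the fields an auditor needs to trace an action to a person
        # and a policy.
        summary = {
            "record_id": record["record_id"],
            "user_id": record["identity"]["triggered_by"]["user_id"],
            "agent_id": record["identity"]["agent_id"],
            "tool": record["tool_call"]["tool"],
            "policy": record["policy_evaluation"]["policy_name"],
            "decision": record["policy_evaluation"]["decision"],
        }
        for control in record["compliance_metadata"]["soc2_controls"]:
            evidence[control].append(summary)
    return dict(evidence)


# Hypothetical sample records, trimmed to the fields used above.
records = [
    {
        "record_id": "aud_example_1",
        "identity": {"agent_id": "claude-agent-v2",
                     "triggered_by": {"user_id": "user_892"}},
        "tool_call": {"tool": "read_file", "arguments": {"path": "/etc/shadow"}},
        "policy_evaluation": {"policy_name": "claude-agent", "decision": "deny"},
        "compliance_metadata": {"soc2_controls": ["CC6.1", "CC6.3"]},
    },
    {
        "record_id": "aud_example_2",
        "identity": {"agent_id": "claude-agent-v2",
                     "triggered_by": {"user_id": "user_114"}},
        "tool_call": {"tool": "call_api", "arguments": {"method": "GET"}},
        "policy_evaluation": {"policy_name": "claude-agent", "decision": "allow"},
        "compliance_metadata": {"soc2_controls": ["CC6.1"]},
    },
]

evidence = index_by_control(records)
# Denials tagged CC6.3 are the enforcement evidence for access authorization.
denials = [e for e in evidence["CC6.3"] if e["decision"] == "deny"]
```

An index like this lets you answer a control-by-control evidence request without re-reading raw logs, though a real export would also need the timestamps and policy version history.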
For GDPR, each decision record includes the data subject context (when applicable) and the decision explanation. These support Articles 13-15 (information about and access to automated decision-making) and, when approvals are configured, the Article 22 right to human intervention.
First governed call
Adding authorization to a Claude agent is one protect() call per tool_use block. Adding hooks to a Claude Agent SDK agent is one configuration object. The policies live in YAML, each governed call emits a decision record, and the evidence is ready to inspect.
Sign up and secure your Claude agent. Claude integration docs cover the API surface, and our SOC 2 evidence guide maps every audit control to Veto features.