Claude Agent Guardrails: Anthropic SDK Security

Secure Claude agents with runtime authorization: Anthropic SDK code, the protect() pattern, YAML policies, and decision records for SOC 2 and GDPR.

Yaz Caleb · February 21, 2026 · 14 min

Claude agents are high-impact because Claude is unusually willing to act. Give Claude a set of tools and a goal, and it will chain tool calls across dozens of turns to achieve that goal. This is what gives Claude agents operational value. The same autonomy makes them unbounded without runtime authorization. A Claude agent with access to a file system, shell, and API credentials has the same blast radius as a junior engineer with root access and no code review process.

Anthropic's Claude Agent SDK includes a hooks system for exactly this reason: tool_use decisions need external checkpoints. This article walks the authorization stack for Claude agents from the Messages API's tool_use flow up to the Agent SDK's hook system, covering YAML policies, Python and TypeScript implementations, and audit evidence that maps to SOC 2 and GDPR requirements.

Claude's tool_use Flow

Claude's tool calling works through the Messages API. You define tools as JSON schemas in the tools parameter. Claude decides when to call a tool by returning a tool_use content block in its response. Your application executes the tool and sends the result back as a tool_result block. The conversation continues until Claude stops requesting tools.
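To make the shapes concrete, here is a minimal sketch of the two blocks involved, written as plain Python dictionaries. The field names follow the Messages API tool-use format; the tool ID, the file path, and the build_tool_result helper are hypothetical and exist only for illustration.

```python
from typing import Any


def build_tool_result(tool_use_id: str, content: str, is_error: bool = False) -> dict[str, Any]:
    """Build the tool_result block that answers one tool_use block."""
    block: dict[str, Any] = {
        "type": "tool_result",
        "tool_use_id": tool_use_id,  # must echo the id of the tool_use block
        "content": content,
    }
    if is_error:
        block["is_error"] = True
    return block


# What Claude returns when it wants a tool run (values are made up).
tool_use_block = {
    "type": "tool_use",
    "id": "toolu_example_123",
    "name": "read_file",
    "input": {"path": "/workspace/README.md"},
}

# What the application sends back in the next user message.
tool_result_block = build_tool_result(tool_use_block["id"], "# Project README")
next_user_message = {"role": "user", "content": [tool_result_block]}
```

Every tool_use block in an assistant turn needs a matching tool_result in the following user message, which is why a denied call is still answered, with is_error set, rather than dropped.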

The critical gap is between the second and third steps: Claude returns a tool_use block, and your application executes it. In most tutorials and sample code, that execution is unconditional: whatever Claude asks for, the application does. The protect() pattern inserts a policy evaluation at exactly that point.

The protect() Pattern

Wrap governed tool_use blocks in a Veto protect() call before executing the tool. The pattern is identical in Python and TypeScript:

claude_protect_pattern.py
import os
import anthropic
from veto import Veto, Decision

client = anthropic.Anthropic()
veto = Veto(api_key=os.environ["VETO_API_KEY"], project="claude-agent")

TOOLS = [
    {
        "name": "read_file",
        "description": "Read a file from the filesystem",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "File path to read"}
            },
            "required": ["path"],
        },
    },
    {
        "name": "execute_command",
        "description": "Run a shell command",
        "input_schema": {
            "type": "object",
            "properties": {
                "command": {"type": "string", "description": "Shell command"}
            },
            "required": ["command"],
        },
    },
    {
        "name": "call_api",
        "description": "Make an HTTP API request",
        "input_schema": {
            "type": "object",
            "properties": {
                "url": {"type": "string"},
                "method": {"type": "string", "enum": ["GET", "POST", "PUT", "DELETE"]},
                "body": {"type": "object"},
            },
            "required": ["url", "method"],
        },
    },
]

async def run_claude_agent(user_message: str, user_id: str):
    messages = [{"role": "user", "content": user_message}]

    while True:
        response = client.messages.create(
            model=os.environ["ANTHROPIC_MODEL"],
            max_tokens=4096,
            tools=TOOLS,
            messages=messages,
        )

        if response.stop_reason != "tool_use":
            return response

        tool_blocks = [b for b in response.content if b.type == "tool_use"]
        tool_results = []

        for block in tool_blocks:
            decision = veto.protect(
                tool=block.name,
                arguments=block.input,
                context={
                    "user_id": user_id,
                    "model": response.model,  # the model that produced this turn, not a literal placeholder
                    "session_id": response.id,
                    "turn_count": len(messages) // 2,
                },
            )

            if decision.action == Decision.ALLOW:
                result = await execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(result),
                })
            elif decision.action == Decision.DENY:
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": f"BLOCKED: {decision.reason}",
                    "is_error": True,
                })
            elif decision.action == Decision.APPROVAL_REQUIRED:
                approval = veto.wait_for_approval(
                    decision_id=decision.id,
                    timeout=decision.approval_timeout,
                )
                if approval.granted:
                    result = await execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": str(result),
                    })
                else:
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": f"DENIED: {approval.reason}",
                        "is_error": True,
                    })
            else:
                # Fail closed: every tool_use block needs a tool_result, even for an unrecognized decision.
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": "BLOCKED: unrecognized authorization decision",
                    "is_error": True,
                })

        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})
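The listing above calls execute_tool without defining it. A minimal sketch of one way to write it follows, assuming a simple name-to-handler registry. The registry, the handler name, and the decision to raise on unknown tools are all assumptions made for illustration; handlers for execute_command and call_api would be registered the same way and are omitted here.

```python
import asyncio
from pathlib import Path
from typing import Any, Awaitable, Callable

ToolHandler = Callable[[dict[str, Any]], Awaitable[str]]


async def _read_file(arguments: dict[str, Any]) -> str:
    # Run the blocking read in a worker thread so the event loop stays free.
    return await asyncio.to_thread(Path(arguments["path"]).read_text, encoding="utf-8")


# Hypothetical registry: map each tool name from TOOLS to its handler.
TOOL_HANDLERS: dict[str, ToolHandler] = {
    "read_file": _read_file,
}


async def execute_tool(name: str, arguments: dict[str, Any]) -> str:
    """Dispatch an authorized tool call; refuse names with no registered handler."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        raise ValueError(f"No handler registered for tool {name!r}")
    return await handler(arguments)
```

Keeping the dispatcher separate from the authorization check means it only ever sees calls that were already allowed or approved, and a tool name that Claude invents fails loudly instead of silently doing nothing.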

Claude Agent SDK Hooks

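Anthropic's Claude Agent SDK (the framework behind Claude Code) includes a hooks system designed for this use case. Hooks are lifecycle callbacks that fire at specific points in the agent loop, including before and after each tool call. The published SDK names these events PreToolUse and PostToolUse; the listing below uses the simplified names before_tool_call and after_tool_call for the same two points, so check the package name and hook signatures against the SDK version you install. Veto integrates at the pre-tool hook to authorize governed tool invocations before they execute.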

claude_agent_sdk_hooks.ts
import { Agent, AgentHooks } from "@anthropic-ai/agent-sdk";
import { Veto, Decision } from "veto";

const veto = new Veto({ apiKey: process.env.VETO_API_KEY!, project: "claude-code-agent" });

const authorizationHooks: AgentHooks = {
  async before_tool_call({ toolName, toolInput, context }) {
    const decision = await veto.protect({
      tool: toolName,
      arguments: toolInput,
      context: {
        userId: context.userId,
        sessionId: context.sessionId,
        agentId: context.agentId,
      },
    });

    if (decision.action === Decision.DENY) {
      return {
        abort: true,
        result: `BLOCKED: ${decision.reason}`,
      };
    }

    if (decision.action === Decision.APPROVAL_REQUIRED) {
      const approval = await veto.waitForApproval({
        decisionId: decision.id,
        timeout: decision.approvalTimeout,
      });
      if (!approval.granted) {
        return {
          abort: true,
          result: `DENIED by ${approval.reviewer}: ${approval.reason}`,
        };
      }
    }

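    // Fail closed: only an explicit ALLOW, or an approval granted above, may proceed.
    const permitted =
      decision.action === Decision.ALLOW ||
      decision.action === Decision.APPROVAL_REQUIRED;
    return permitted
      ? { abort: false }
      : { abort: true, result: "BLOCKED: unrecognized authorization decision" };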
  },

  async after_tool_call({ toolName, toolInput, result, context }) {
    await veto.logExecution({
      tool: toolName,
      arguments: toolInput,
      result: typeof result === "string" ? result : JSON.stringify(result),
      context: { sessionId: context.sessionId },
    });
  },
};

const agent = new Agent({
  model: process.env.ANTHROPIC_MODEL!,
  tools: [readFile, executeCommand, callApi],
  hooks: authorizationHooks,
});
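For readers on the Python side, the same pre-tool checkpoint can be sketched as a plain async callback. The callback signature and the hookSpecificOutput fields below reflect the SDK's documented PreToolUse hook output as best understood here, and should be verified against the installed SDK version. The authorize function and its single rule are hypothetical stand-ins for a policy call such as veto.protect().

```python
import asyncio
from typing import Any, Optional


def authorize(tool_name: str, tool_input: dict[str, Any]) -> tuple[bool, str]:
    """Hypothetical stand-in for a policy call such as veto.protect()."""
    if tool_name == "Bash" and "rm -rf" in str(tool_input.get("command", "")):
        return False, "Blocked command pattern"
    return True, ""


async def pre_tool_use_hook(
    input_data: dict[str, Any],
    tool_use_id: Optional[str],
    context: Any,
) -> dict[str, Any]:
    """Deny the pending tool call when the policy refuses it; otherwise do nothing."""
    allowed, reason = authorize(input_data["tool_name"], input_data["tool_input"])
    if allowed:
        return {}  # empty output: let the call proceed
    return {
        "hookSpecificOutput": {
            "hookEventName": "PreToolUse",
            "permissionDecision": "deny",
            "permissionDecisionReason": reason,
        }
    }
```

Registering the callback goes through the SDK's hooks option for the PreToolUse event. That wiring is left out because it depends on the SDK version.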

YAML Policies for Claude Agents

Claude agents have characteristic tool patterns: file system access, shell execution, API calls, and code generation. Policies should reflect the specific risks of each:

policies/claude-agent.yaml
name: claude-agent
description: "Runtime authorization for Claude-powered agents"

rules:
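  # File access: deny sensitive files first, then allow the project directory, then deny system paths.
  # Assumes conditions are evaluated top-down with the first match winning, and that paths are
  # normalized before evaluation (otherwise /workspace/../etc/passwd would match the allow rule).
  # Patterns are single-quoted because YAML rejects \. and \s inside double quotes.
  - tool: read_file
    conditions:
      - match:
          arguments.path: '\.(env|pem|key|secret|credentials)$'
        action: deny
        reason: "Sensitive file access denied"
      - match:
          arguments.path: '^/workspace/'
        action: allow
      - match:
          arguments.path: '^(/etc/|/var/|/usr/|/sys/|/proc/)'
        action: deny
        reason: "System path access denied"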

  # Shell commands: deny destructive patterns first, then allow read-only commands, gate the rest.
  # Patterns are single-quoted because YAML rejects \s and \| inside double quotes.
  - tool: execute_command
    conditions:
      - match:
          arguments.command: '(rm\s+-rf|dd\s|mkfs|chmod\s+777|curl.*\|.*sh)'
        action: deny
        reason: "Blocked command pattern"
      # No shell metacharacters or newlines, so "echo ok; <anything>" cannot ride on the allow rule
      - match:
          arguments.command: '^(ls|cat|grep|find|wc|head|tail|echo)\s[^;&|`$<>\n]*$'
        action: allow
      - match:
          arguments.command: '^(git|npm|pip|cargo)\s'
        action: allow
        logging:
          level: full
      - match:
          arguments.command: '.*'
        action: require_approval
        approval:
          channel: workspace
          timeout: 120s

  # API calls: scope by domain and method
  - tool: call_api
    conditions:
      # ([/?#]|$) ends the host, so internal.service.attacker.example does not match
      - match:
          arguments.method: "GET"
          arguments.url: '^https://(api\.approved\.example\.com|internal\.service)([/?#]|$)'
        action: allow
      - match:
          arguments.method: "(POST|PUT|DELETE)"
        action: require_approval
        approval:
          channel: approval_channel
          timeout: 300s
      - match:
          arguments.url: ".*"
        action: deny
        reason: "External API access not permitted"

default_action: deny
logging:
  level: full
  retention: 1year
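Rule order matters in a policy like this. The sketch below shows why, assuming top-down, first-match-wins evaluation with regex search semantics. That evaluation model is an assumption about the policy engine, not documented Veto behavior. The read_file rules are translated by hand into Python, with the sensitive-file deny placed ahead of the /workspace/ allow; with the allow first, /workspace/.env would be permitted.

```python
import re
from typing import Any

# Hand-translated read_file rules, in evaluation order.
POLICY: dict[str, list[dict[str, Any]]] = {
    "read_file": [
        {"match": {"path": r"\.(env|pem|key|secret|credentials)$"},
         "action": "deny", "reason": "Sensitive file access denied"},
        {"match": {"path": r"^/workspace/"}, "action": "allow"},
        {"match": {"path": r"^(/etc/|/var/|/usr/|/sys/|/proc/)"},
         "action": "deny", "reason": "System path access denied"},
    ],
}
DEFAULT_ACTION = "deny"


def evaluate(tool: str, arguments: dict[str, Any]) -> tuple[str, str]:
    """Return (action, reason) from the first condition whose patterns all match."""
    for condition in POLICY.get(tool, []):
        patterns = condition["match"]
        if all(re.search(p, str(arguments.get(k, ""))) for k, p in patterns.items()):
            return condition["action"], condition.get("reason", "")
    # Nothing matched, or the tool has no rules: fall back to the default.
    return DEFAULT_ACTION, "No rule matched"
```

Calling evaluate("read_file", {"path": "/workspace/.env"}) returns a deny under this ordering, while a tool with no rules at all falls through to default_action.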

TypeScript Implementation: Full Agent

claude_agent_full.ts
import Anthropic from "@anthropic-ai/sdk";
import { Veto, Decision } from "veto";

const anthropic = new Anthropic();
const veto = new Veto({ apiKey: process.env.VETO_API_KEY!, project: "claude-ts-agent" });

const tools: Anthropic.Tool[] = [
  {
    name: "read_file",
    description: "Read a file from the filesystem",
    input_schema: {
      type: "object" as const,
      properties: { path: { type: "string" } },
      required: ["path"],
    },
  },
  {
    name: "execute_command",
    description: "Run a shell command",
    input_schema: {
      type: "object" as const,
      properties: { command: { type: "string" } },
      required: ["command"],
    },
  },
];

async function runAgent(userMessage: string, userId: string) {
  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: userMessage },
  ];

  while (true) {
    const response = await anthropic.messages.create({
      model: process.env.ANTHROPIC_MODEL!,
      max_tokens: 4096,
      tools,
      messages,
    });

    if (response.stop_reason !== "tool_use") return response;

    const toolBlocks = response.content.filter(
      (b): b is Anthropic.ToolUseBlock => b.type === "tool_use"
    );

    const toolResults: Anthropic.ToolResultBlockParam[] = [];

    for (const block of toolBlocks) {
      const decision = await veto.protect({
        tool: block.name,
        arguments: block.input as Record<string, unknown>,
        context: { userId, sessionId: response.id },
      });

      if (decision.action === Decision.ALLOW) {
        const result = await executeTool(block.name, block.input);
        toolResults.push({
          type: "tool_result",
          tool_use_id: block.id,
          content: String(result),
        });
      } else if (decision.action === Decision.DENY) {
        toolResults.push({
          type: "tool_result",
          tool_use_id: block.id,
          content: `BLOCKED: ${decision.reason}`,
          is_error: true,
        });
      } else {
        const approval = await veto.waitForApproval({
          decisionId: decision.id,
          timeout: decision.approvalTimeout,
        });
        toolResults.push({
          type: "tool_result",
          tool_use_id: block.id,
          content: approval.granted
            ? String(await executeTool(block.name, block.input))
            : `DENIED: ${approval.reason}`,
          is_error: !approval.granted,
        });
      }
    }

    messages.push({ role: "assistant", content: response.content });
    messages.push({ role: "user", content: toolResults });
  }
}

Decision Record: What Gets Logged

Every protect() call against a Claude agent tool_use block produces a decision record. Here is what a single denied file access looks like in the log:

decision_record_claude.json
{
  "record_id": "aud_cl_9x8w7v6u5t4s",
  "timestamp": "2026-04-04T14:22:08.331Z",
  "event_type": "tool_call_decision",

  "identity": {
    "agent_id": "claude-agent-v2",
    "model": "ANTHROPIC_MODEL",
    "session_id": "msg_01XYZ000",
    "triggered_by": {
      "user_id": "user_892",
      "email": "dev@approved.example"
    }
  },

  "tool_call": {
    "tool": "read_file",
    "arguments": {
      "path": "/etc/shadow"
    }
  },

  "policy_evaluation": {
    "policy_name": "claude-agent",
    "rule_matched": "rule_1_system_path_deny",
    "conditions_evaluated": [
      {"condition": "path starts with /workspace/", "result": false},
      {"condition": "path starts with /etc/", "result": true}
    ],
    "decision": "deny",
    "reason": "System path access denied"
  },

  "compliance_metadata": {
    "soc2_controls": ["CC6.1", "CC6.3", "CC7.2"],
    "retention_policy": "1year"
  }
}

SOC 2 Mapping for Claude Agent Control Evidence

Claude agent decision records can be mapped to SOC 2 Trust Services Criteria. The two most relevant controls:

  • CC6.1 (Logical Access Controls): Every tool_use decision records the user identity, agent identity, model version, session ID, and the policy that governed the decision. Your auditor can trace a governed agent action back to the human who initiated the session and the policy that authorized (or denied) it.
  • CC6.3 (Access Authorization): YAML policies define per-tool, per-context authorization rules. Denial records show enforcement. Policy version history shows who changed what and when. The combination answers the auditor's question: "What was the agent authorized to do, and was that authorization appropriate?"
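The trace an auditor asks for can be sketched as a filter over exported decision records. The field names follow the example record in the previous section. The trace_user_decisions function and the idea of holding records in an in-memory list are assumptions for illustration, not part of any Veto API.

```python
from typing import Any, Iterable, Optional


def trace_user_decisions(
    records: Iterable[dict[str, Any]],
    user_id: str,
    decision: Optional[str] = None,
) -> list[dict[str, Any]]:
    """Return a compact trail of decisions triggered by one user."""
    trail = []
    for record in records:
        if record["identity"]["triggered_by"]["user_id"] != user_id:
            continue
        evaluation = record["policy_evaluation"]
        if decision is not None and evaluation["decision"] != decision:
            continue
        trail.append({
            "timestamp": record["timestamp"],
            "tool": record["tool_call"]["tool"],
            "decision": evaluation["decision"],
            "policy": evaluation["policy_name"],
            "rule": evaluation["rule_matched"],
        })
    return trail


# One record, trimmed from the example decision record.
records = [
    {
        "timestamp": "2026-04-04T14:22:08.331Z",
        "identity": {"triggered_by": {"user_id": "user_892"}},
        "tool_call": {"tool": "read_file", "arguments": {"path": "/etc/shadow"}},
        "policy_evaluation": {
            "policy_name": "claude-agent",
            "rule_matched": "rule_1_system_path_deny",
            "decision": "deny",
        },
    },
]
denials = trace_user_decisions(records, "user_892", decision="deny")
```

Each entry in denials links a user, a tool, and the rule that fired, which is the chain CC6.1 and CC6.3 evidence needs to show.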

For GDPR, every audit record includes the data subject context (when applicable) and the decision explanation. That evidence can support the transparency and access obligations in Articles 13-15, which cover information about automated decision-making, and, when approvals are configured, the human-intervention safeguard in Article 22.

First governed call

Adding authorization to a Claude agent is one protect() call per tool_use block. Adding hooks to a Claude Agent SDK agent is one configuration object. The policies live in YAML, each governed call emits a decision record, and the evidence is ready to inspect.

Sign up and secure your Claude agent. The Claude integration docs cover the API surface, and our SOC 2 evidence guide maps every audit control to Veto features.
