Multi-Tenant AI Agent Architecture
Three isolation models for multi-tenant agents: per-tenant policy, vector boundaries, and action evidence.
Control points
- Tenant isolation must be enforced on the tool path, not only in prompts or vector database filters.
- Each governed tool call should carry tenant context so policy can reject cross-tenant data access, writes, exports, and integrations.
- Per-tenant policies and audit records give enterprise buyers evidence for isolation, incident review, and governance.
Many enterprise applications are adding task-specific AI agents to multi-tenant SaaS. Unlike standard SaaS where a database row-level policy may be enough, AI agents introduce a fundamentally new isolation challenge: agents work with vector databases, long-term memory, tool access, and external APIs. A wrong permission or a prompt injection can cross tenant boundaries in ways no traditional access control model anticipated.
Why Traditional Tenant Isolation Breaks
In a standard SaaS app, tenant isolation is straightforward: every database query includes a WHERE tenant_id = ? clause, API requests are scoped to an org, and tenant isolation is treated as an application invariant. AI agents strain this model in five ways:
- Agents do not log in. A traditional user has a session with a tenant context. An agent is invoked on behalf of a user, but it does not authenticate as a tenant. Each governed tool call needs explicit tenant scoping.
- Vector search is approximate. Semantic search returns nearest neighbors, not exact matches. If you mix tenant embeddings in one index without strict metadata filtering, a query from Tenant A can return Tenant B's documents.
- Tool access is lateral. An agent with access to a file system, database, or API can reach across tenant boundaries unless each governed tool enforces tenant scoping independently.
- Memory persists across sessions. Agent memory systems (conversation history, learned preferences, cached tool results) can leak information between tenants if not isolated.
- Prompt injection crosses boundaries. A malicious user in Tenant A can craft input that causes the agent to access Tenant B's data through tool calls. Prompt-level isolation does not prevent this.
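The lateral tool access and prompt injection points can be made concrete with a file tool. The sketch below is a minimal illustration and not part of any Veto API: `TENANT_ROOT`, `CrossTenantAccess`, and `resolve_tenant_path` are hypothetical names, and the `/data/<tenant_id>/` layout is an assumption borrowed from the policy examples later in this article. It normalizes the requested path before comparing it to the tenant prefix, so a `..` segment supplied by the model cannot step out of the tenant's directory.

```python
import posixpath

TENANT_ROOT = "/data"  # assumed layout: /data/<tenant_id>/...


class CrossTenantAccess(Exception):
    """Raised when a tool argument resolves outside the caller's tenant prefix."""


def resolve_tenant_path(tenant_id: str, requested: str) -> str:
    """Return a normalized path, or raise if it leaves the tenant's prefix.

    tenant_id must come from the trusted request context, never from model output.
    """
    prefix = posixpath.join(TENANT_ROOT, tenant_id)
    # Relative paths are taken relative to the tenant prefix; an absolute
    # path replaces it. normpath then collapses "." and ".." segments.
    normalized = posixpath.normpath(posixpath.join(prefix, requested))
    if normalized != prefix and not normalized.startswith(prefix + "/"):
        raise CrossTenantAccess(f"{requested!r} is outside {prefix}/")
    return normalized


if __name__ == "__main__":
    print(resolve_tenant_path("tenant_a", "reports/q1.csv"))
    try:
        resolve_tenant_path("tenant_a", "/data/tenant_a/../tenant_b/secret.txt")
    except CrossTenantAccess as exc:
        print("blocked:", exc)
```

This check is purely lexical. It does not follow symlinks, so a real file tool would also need the storage layer to enforce the same boundary.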
Architecture: Three Isolation Models
There are three patterns for structuring multi-tenant AI agents, each with different cost-isolation tradeoffs:
Model 1: Fully Siloed (highest cost, strongest isolation)
┌─────────────────────────────────────────────────────────────┐
│ Tenant A                     │ Tenant B                     │
│ ┌──────────────────────────┐ │ ┌──────────────────────────┐ │
│ │ Agent Instance           │ │ │ Agent Instance           │ │
│ │ Vector DB (dedicated)    │ │ │ Vector DB (dedicated)    │ │
│ │ File Storage (dedicated) │ │ │ File Storage (dedicated) │ │
│ │ Memory Store (dedicated) │ │ │ Memory Store (dedicated) │ │
│ │ Tool Credentials (own)   │ │ │ Tool Credentials (own)   │ │
│ └──────────────────────────┘ │ └──────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
Cost: highest per tenant    Blast radius: single tenant

Model 2: Shared Infra, Logical Isolation (balanced)
┌─────────────────────────────────────────────────────────┐
│ Shared Infrastructure                                   │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Agent Runtime (shared, context-injected per tenant) │ │
│ └──────────────────────┬──────────────────────────────┘ │
│                        │                                │
│ ┌──────────────┐ ┌─────┴──────────┐ ┌──────────────┐    │
│ │ Vector DB    │ │ Veto Policy    │ │ File Storage │    │
│ │ (namespaced) │ │ Engine         │ │ (prefixed)   │    │
│ │ tenant_a/*   │ │ per-tenant     │ │ /tenant_a/*  │    │
│ │ tenant_b/*   │ │ policies       │ │ /tenant_b/*  │    │
│ └──────────────┘ └────────────────┘ └──────────────┘    │
└─────────────────────────────────────────────────────────┘
Cost: lower per tenant    Blast radius: controlled by policy

Model 3: Fully Shared (lowest cost, weakest isolation)
┌─────────────────────────────────────────────────────────┐
│ Single shared agent, single DB, tenant_id column only   │
│ ⚠ NOT RECOMMENDED for AI agents: too many leak vectors  │
└─────────────────────────────────────────────────────────┘
Model 2 is where most production systems land. Shared infrastructure keeps costs manageable. Logical isolation through a policy engine blocks configured cross-tenant access without the operational burden of managing hundreds of isolated deployments.
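One way to picture "context-injected per tenant" in Model 2 is a small immutable object, built once per request from the authenticated session, from which every namespaced resource name is derived. This is a sketch under assumptions: `TenantContext` and its properties are hypothetical names, the allowed tenant ID format is invented for illustration, and the `tenant_a/` and `/tenant_a/` shapes simply mirror the diagram above.

```python
import re
from dataclasses import dataclass

# Assumed tenant ID format: lowercase letters, digits, underscores.
_TENANT_ID = re.compile(r"[a-z0-9_]{1,64}")


@dataclass(frozen=True)
class TenantContext:
    """Per-request tenant identity; shared code derives resource names from it."""

    tenant_id: str

    def __post_init__(self) -> None:
        # Reject IDs that could alter a namespace or path (slashes, dots, etc.).
        if not _TENANT_ID.fullmatch(self.tenant_id):
            raise ValueError(f"invalid tenant id: {self.tenant_id!r}")

    @property
    def vector_namespace(self) -> str:
        return f"{self.tenant_id}/"

    @property
    def file_prefix(self) -> str:
        return f"/{self.tenant_id}/"


if __name__ == "__main__":
    ctx = TenantContext("tenant_a")
    print(ctx.vector_namespace, ctx.file_prefix)
```

Because the object is frozen and validated at construction, shared runtime code can pass it around without any branch that is specific to one tenant.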
Per-Tenant Policies with Veto
Different tenants have different security requirements. An enterprise customer with a SOC 2 evidence requirement needs approval workflows and reviewable decision records. A developer tenant running locally may only need basic rate limiting. Veto lets you define per-tenant policies that the same agent runtime evaluates at execution time:
# Regulated tenant: Enterprise plan, SOC 2 evidence workflow
name: regulated-tenant-01
tenant_id: tenant_regulated_01
plan: enterprise
rules:
  - tool: query_database
    conditions:
      # Enforce tenant scoping on every query.
      # Backslashes are doubled because "\s" is not a valid escape
      # inside a YAML double-quoted string.
      - match:
          arguments.query: "tenant_id\\s*=\\s*'tenant_regulated_01'"
        action: allow
      - match:
          arguments.query: ".*"
        action: deny
        reason: "Query must include tenant_id = 'tenant_regulated_01'"
  - tool: send_email
    constraints:
      rate_limit: 50/hour
    conditions:
      # The dot is escaped so it matches only a literal "."
      - match:
          arguments.to: "@approved\\.example$"
        action: allow
      - match:
          arguments.to: ".*"
        action: require_approval
        approval:
          channel: approval_channel
          webhook: "https://hooks.slack.com/services/<workspace>/<channel>/<secret>"
          timeout: 600s
  - tool: access_file
    conditions:
      - match:
          arguments.path: "^/data/tenant_regulated_01/"
        action: allow
      - match:
          arguments.path: "^/data/"
        action: deny
        reason: "Cross-tenant file access denied"
default_action: deny
logging:
  level: full
  retention: 3years

# Local tenant: Open Source, basic controls
name: local-tenant-01
tenant_id: tenant_local_01
plan: open_source
rules:
  - tool: query_database
    conditions:
      # Backslashes are doubled for YAML double-quoted strings
      - match:
          arguments.query: "tenant_id\\s*=\\s*'tenant_local_01'"
        action: allow
      - match:
          arguments.query: ".*"
        action: deny
  - tool: send_email
    constraints:
      rate_limit: 10/hour  # lower limit for Open Source
    action: allow
  - tool: access_file
    conditions:
      - match:
          arguments.path: "^/data/tenant_local_01/"
        action: allow
      - match:
          arguments.path: ".*"
        action: deny
default_action: deny
logging:
  level: decisions_only  # reduced logging for Open Source
  retention: 90days

The Runtime: Tenant Context Injection
The agent runtime is shared. For governed requests, Veto receives the tenant context and evaluates the correct policy. There is no conditional logic in your agent code. The same protect() call handles tenant-specific policy lookup:
import os

import anthropic
from veto import Veto, Decision

# SHARED_TOOLS, get_tenant_plan(), and execute_tool() are defined elsewhere
# in the application and are not shown here.
# The async client keeps the model call from blocking the event loop.
client = anthropic.AsyncAnthropic()
veto = Veto(api_key=os.environ["VETO_API_KEY"], project="multi-tenant-agent")


async def handle_agent_request(user_message: str, tenant_id: str, user_id: str):
    """Shared agent code. Veto handles policy routing."""
    messages = [{"role": "user", "content": user_message}]

    while True:
        response = await client.messages.create(
            model=os.environ["ANTHROPIC_MODEL"],
            max_tokens=4096,
            tools=SHARED_TOOLS,
            messages=messages,
        )

        # No tool calls requested: the model has produced its final answer
        if response.stop_reason != "tool_use":
            return response

        tool_blocks = [b for b in response.content if b.type == "tool_use"]
        tool_results = []

        for block in tool_blocks:
            # Veto looks up the tenant's policy and evaluates against it
            decision = veto.protect(
                tool=block.name,
                arguments=block.input,
                context={
                    "tenant_id": tenant_id,  # this determines which policy applies
                    "user_id": user_id,
                    "plan": get_tenant_plan(tenant_id),
                },
            )

            if decision.action == Decision.DENY:
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": f"BLOCKED: {decision.reason}",
                    "is_error": True,
                })
            elif decision.action == Decision.APPROVAL_REQUIRED:
                approval = veto.wait_for_approval(
                    decision_id=decision.id, timeout=600
                )
                if approval.granted:
                    result = await execute_tool(block.name, block.input, tenant_id)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": str(result),
                    })
                else:
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": f"DENIED: {approval.reason}",
                        "is_error": True,
                    })
            else:
                # Allowed by policy: run the tool with the tenant scope attached
                result = await execute_tool(block.name, block.input, tenant_id)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(result),
                })

        # Return the tool results to the model and continue the loop
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})

Vector Database Isolation
Vector databases are the highest-risk point of cross-tenant leakage. Semantic search is approximate by design. Two approaches reduce leakage risk:
- Namespaces: Create a unique namespace per tenant. Queries are scoped to the namespace, which gives the smallest cross-tenant result surface. Cost is higher (a separate namespace or index per tenant), but isolation is strongest.
- Metadata filtering: Stamp every vector with a tenant_id metadata field and enforce a mandatory filter on each query. Lower cost but requires discipline: each write must include the metadata, each read must include the filter. A single missing filter leaks data.
Veto enforces the second approach at the policy level. If an agent's vector_search tool call does not include the correct tenant filter in its arguments, the call is denied before it reaches the database.
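To see why a missing filter matters, the toy example below runs a nearest-neighbor search over a mixed-tenant corpus and pairs it with a policy-style argument check. It is a sketch under stated assumptions, not Veto's implementation: the embeddings are made up, `search` and `is_tenant_scoped` are hypothetical helpers, and the `{"filter": {"tenant_id": ...}}` argument shape is an assumed convention for a `vector_search` tool.

```python
import math
from typing import Any, List, Mapping, Optional, Sequence, Tuple

# Toy corpus: (tenant_id, doc_id, embedding). Values are invented for illustration.
DOCS: List[Tuple[str, str, Tuple[float, float]]] = [
    ("tenant_a", "a-pricing", (0.9, 0.1)),
    ("tenant_b", "b-pricing", (0.95, 0.05)),
    ("tenant_a", "a-roadmap", (0.1, 0.9)),
]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def search(query: Sequence[float], k: int, tenant_filter: Optional[str] = None) -> List[str]:
    """Return the k nearest doc IDs, optionally restricted to one tenant."""
    candidates = [d for d in DOCS if tenant_filter is None or d[0] == tenant_filter]
    ranked = sorted(candidates, key=lambda d: cosine(query, d[2]), reverse=True)
    return [doc_id for _, doc_id, _ in ranked[:k]]


def is_tenant_scoped(arguments: Mapping[str, Any], tenant_id: str) -> bool:
    """Policy-style check: the call must carry a filter pinned to the caller's tenant."""
    flt = arguments.get("filter")
    return isinstance(flt, Mapping) and flt.get("tenant_id") == tenant_id


if __name__ == "__main__":
    query = (1.0, 0.0)
    print(search(query, k=1))                            # ['b-pricing']
    print(search(query, k=1, tenant_filter="tenant_a"))  # ['a-pricing']
    print(is_tenant_scoped({"query": "pricing"}, "tenant_a"))  # False
```

Without the filter, Tenant A's query returns Tenant B's document because it happens to be the closest vector. Rejecting the call when the filter is absent or names another tenant stops that before the index is queried.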
Rate Limiting Per Tenant
Global rate limits are too blunt for multi-tenant systems. If your limit is 1,000 tool calls per hour across all tenants, one noisy tenant can starve the others. Veto enforces rate limits per tenant, per tool, tracked by tenant context:
# Rate limits scale with plan tier
rate_limits:
  team_cloud:
    query_database: 100/hour
    send_email: 10/hour
    vector_search: 200/hour
    total_tool_calls: 500/hour
  vendor_growth:
    query_database: 1000/hour
    send_email: 100/hour
    vector_search: 2000/hour
    total_tool_calls: 5000/hour
  enterprise:
    query_database: 10000/hour
    send_email: 1000/hour
    vector_search: 20000/hour
    total_tool_calls: 50000/hour

Decision Records Per Tenant
Enterprise tenants expect dedicated decision records. When a regulated customer's compliance team asks "show me each governed action the agent took on our data during the review window," you need to produce that report without reconstructing it from scattered service traces. Veto's decision records are tenant-scoped. Each decision record entry includes the tenant context, and decision records can be exported per tenant for evidence review.
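As a rough illustration of what a tenant-scoped record and export might look like, the sketch below defines a record type and a per-tenant export for a review window. The field list follows the evidence fields named in the FAQ at the end of this article. Everything else is an assumption: `DecisionRecord` and `export_for_tenant` are hypothetical names, and the JSON Lines output is one plausible format, not Veto's actual schema.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Any, Dict, Iterable, Optional


@dataclass(frozen=True)
class DecisionRecord:
    tenant_id: str
    actor: str
    tool: str
    arguments: Dict[str, Any]
    policy_version: str
    decision: str   # e.g. "allow", "deny", "require_approval"
    timestamp: str  # ISO 8601 with UTC offset
    approver: Optional[str] = None


def export_for_tenant(
    records: Iterable[DecisionRecord],
    tenant_id: str,
    start: datetime,
    end: datetime,
) -> str:
    """Return JSON Lines for one tenant's records with start <= timestamp < end."""
    lines = []
    for record in records:
        ts = datetime.fromisoformat(record.timestamp)
        if record.tenant_id == tenant_id and start <= ts < end:
            lines.append(json.dumps(asdict(record), sort_keys=True))
    return "\n".join(lines)


if __name__ == "__main__":
    records = [
        DecisionRecord("tenant_a", "user_1", "query_database", {"query": "..."},
                       "v3", "allow", "2025-01-10T09:00:00+00:00"),
        DecisionRecord("tenant_b", "user_9", "send_email", {"to": "x@approved.example"},
                       "v1", "allow", "2025-01-10T09:05:00+00:00"),
    ]
    window = (datetime(2025, 1, 1, tzinfo=timezone.utc),
              datetime(2025, 2, 1, tzinfo=timezone.utc))
    print(export_for_tenant(records, "tenant_a", *window))
```

Filtering on the stored tenant ID at export time means the report never needs to be pieced together from other tenants' traces.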
Defense in Depth
No single isolation mechanism is sufficient. A production multi-tenant agent system layers defenses:
- Network layer: Tenant-scoped credentials for external APIs. An agent acting on behalf of Tenant A cannot use Tenant B's API keys.
- Storage layer: Namespaced or prefixed access to files, vectors, and databases. The storage system enforces boundaries independently.
- Policy layer (Veto): Runtime authorization on each governed tool call. Even if the storage layer has a misconfiguration, the policy engine blocks cross-tenant access.
- Monitoring layer: Anomaly detection on cross-tenant access patterns. If Tenant A's agent suddenly starts querying paths outside its prefix, alert immediately.
If network filtering fails, the storage layer still protects data. If a container escape happens, the host has no cross-tenant credentials. Each layer reduces the blast radius.
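The monitoring layer can start very simply. The sketch below counts file accesses that fall outside a tenant's own prefix and signals an alert once a threshold is reached. It is an illustration only: `PrefixMonitor` is a hypothetical class, the `/data/<tenant_id>/` layout is assumed from the policy examples, and paths are expected to be normalized before they are observed.

```python
from collections import Counter


class PrefixMonitor:
    """Counts out-of-prefix path accesses per tenant and flags when to alert."""

    def __init__(self, threshold: int = 1) -> None:
        # threshold=1 corresponds to "alert immediately" on the first violation.
        self.threshold = threshold
        self.violations: Counter = Counter()

    def observe(self, tenant_id: str, path: str) -> bool:
        """Record one access; return True if an alert should fire."""
        if path.startswith(f"/data/{tenant_id}/"):
            return False
        self.violations[tenant_id] += 1
        return self.violations[tenant_id] >= self.threshold


if __name__ == "__main__":
    monitor = PrefixMonitor()
    print(monitor.observe("tenant_a", "/data/tenant_a/report.csv"))  # False
    print(monitor.observe("tenant_a", "/data/tenant_b/report.csv"))  # True
```

The monitor only observes and reports. Blocking remains the job of the policy and storage layers, so an alert here means an earlier layer was probed or misconfigured.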
First governed call
Adding multi-tenant support to an existing agent is one configuration change in Veto: include tenant_id in your protect() context and define tenant-specific policies. Your agent code stays identical across tenants.
Sign up to add tenant isolation to your agent, or read about decision records for per-tenant compliance logging.
FAQ
How should multi-tenant AI agents enforce tenant isolation?
Bind tenant context to each governed tool call and evaluate it before execution. Policies should deny cross-tenant reads, writes, exports, and integration calls even if the model attempts them with valid credentials or retrieved context.
Why are vector database filters not enough for tenant isolation?
Vector filters reduce retrieval risk, but agents can still call tools that read, write, export, or mutate tenant data. Runtime authorization checks the action itself and can block cross-tenant tool calls after retrieval and before execution.
What control evidence do enterprise buyers expect for multi-tenant agents?
They expect records that capture tenant ID, actor, tool name, arguments, policy version, decision, timestamp, and the approver when approval is required. Where configured, these records should be verification-ready and support SOC 2, GDPR, and customer security reviews.