Haystack runtime authorization
Wrap Haystack Tool objects, Agent components, and pipeline tool calls with Veto. Each governed invocation is evaluated before dispatch and resolves to allow, review, or deny, and every decision produces an exportable decision record.
Why Haystack needs guardrails
Haystack pipelines combine retrieval with generation, and the Agent component lets the model issue tool calls between turns. The framework runs the underlying Python function whenever the model selects the tool. The retrieved context can carry adversarial instructions: a document in your knowledge base can tell the agent to delete the production namespace, and Haystack will dispatch the call.
RAG amplifies this risk. Every retrieved chunk feeds into the model. A poisoned document becomes part of the prompt. Without guardrails at the tool boundary, an agent that reads and writes the same knowledge base can be steered into corrupting it. See why agent guardrails are their own layer.
Documents in your store can carry prompt instructions. The agent reads them as context and acts on them as commands.
A single Haystack Agent often wraps the document store AND external webhooks. One injection traverses both.
Haystack's tracing captures component runs. It does not produce per-tool-call authorization evidence.
Before and after Veto
The example below shows a standard Haystack Agent with three Tool objects before Veto is added. Adding Veto places a guard call inside each tool function: same agent, same tools, with each governed call evaluated against policy first.
import os

from haystack.dataclasses import ChatMessage
from haystack.tools import Tool
from haystack.components.agents import Agent
from haystack.components.generators.chat import OpenAIChatGenerator

def index_document(doc_id: str, content: str, namespace: str) -> str:
    """Index a new document into the knowledge base."""
    return document_store.write(doc_id, content, namespace)

def delete_namespace(namespace: str) -> str:
    """Wipe an entire namespace of documents."""
    return document_store.delete_namespace(namespace)

def push_to_webhook(url: str, payload: dict) -> str:
    """POST a JSON payload to a customer webhook."""
    return http.post(url, json=payload).text

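# Haystack expects `parameters` to be a JSON Schema object, not Python types.
tools = [
    Tool(
        name="index_document",
        description="Index a new document",
        parameters={
            "type": "object",
            "properties": {
                "doc_id": {"type": "string"},
                "content": {"type": "string"},
                "namespace": {"type": "string"},
            },
            "required": ["doc_id", "content", "namespace"],
        },
        function=index_document,
    ),
    Tool(
        name="delete_namespace",
        description="Delete all documents in a namespace",
        parameters={
            "type": "object",
            "properties": {"namespace": {"type": "string"}},
            "required": ["namespace"],
        },
        function=delete_namespace,
    ),
    Tool(
        name="push_to_webhook",
        description="Send a webhook payload",
        parameters={
            "type": "object",
            "properties": {"url": {"type": "string"}, "payload": {"type": "object"}},
            "required": ["url", "payload"],
        },
        function=push_to_webhook,
    ),
]
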
agent = Agent(
    chat_generator=OpenAIChatGenerator(model=os.environ["OPENAI_MODEL"]),
    tools=tools,
    system_prompt="You manage documents and webhooks for customer integrations.",
)

# A prompt injection inside a retrieved document can steer the agent
# into calling delete_namespace(namespace="production"). Haystack runs it.
result = agent.run(messages=[ChatMessage.from_user(user_message)])

Tool calls inside RAG pipelines
Haystack pipelines often pass tools into the generator so the model can read retrieved context AND execute side effects. Veto guards the tool body. The pipeline topology, retrievers, and prompt builders stay outside the policy surface.
import os

from haystack import Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.components.embedders import OpenAITextEmbedder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.retrievers import InMemoryEmbeddingRetriever
from haystack.components.tools import ToolInvoker
from haystack.tools import Tool
from veto_sdk import Veto

veto = Veto(api_key=os.environ["VETO_API_KEY"])

def execute_sql(query: str) -> str:
    """Run a SQL query and return results."""
    # Ask Veto for a decision before touching the database.
    decision = veto.guard(
        tool="execute_sql",
        arguments={"query": query},
        context={"role": "rag-pipeline"},
    )
    # Anything other than allow (review or deny) stops before the query runs.
    if decision.decision != "allow":
        return f"Blocked: {decision.reason}"
    return db.execute(query)

# Haystack expects `parameters` to be a JSON Schema object.
sql_tool = Tool(
    name="execute_sql",
    description="Run SQL",
    parameters={
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
    function=execute_sql,
)

# RAG pipeline that injects retrieved context AND can run SQL on demand.
# Veto can require retrieval-driven SQL to stay inside SELECT.
# Tool calling needs the chat components: OpenAIChatGenerator accepts tools and
# ToolInvoker runs the calls it returns. RAG_TEMPLATE is a list of ChatMessage.
rag = Pipeline()
rag.add_component("embedder", OpenAITextEmbedder())
rag.add_component("retriever", InMemoryEmbeddingRetriever(document_store=store))
rag.add_component("prompt", ChatPromptBuilder(template=RAG_TEMPLATE))
rag.add_component("llm", OpenAIChatGenerator(model=os.environ["OPENAI_MODEL"], tools=[sql_tool]))
rag.add_component("tool_invoker", ToolInvoker(tools=[sql_tool]))
rag.connect("embedder.embedding", "retriever.query_embedding")
rag.connect("retriever.documents", "prompt.documents")
rag.connect("prompt.prompt", "llm.messages")
rag.connect("llm.replies", "tool_invoker.messages")

Policy configuration
Declarative YAML, version controlled. Apply the same policies across every Haystack Agent in your stack.
rules:
  - name: block_production_namespace_deletes
    description: Never delete the production namespace from an agent
    tool: delete_namespace
    when: args.namespace in ["production", "default", "main"]
    action: deny
    message: "Critical namespaces cannot be deleted by agents"

  - name: approve_other_namespace_deletes
    description: Other namespace deletes need a human
    tool: delete_namespace
    action: require_approval
    approvers: [data-platform]
    timeout: 2h

  - name: webhook_allowlist
    description: Only POST to allowlisted webhook domains
    tool: push_to_webhook
    when: "!args.url.match(/^https:\\/\\/(api\\.examplepany|hooks\\.partner)\\.example\\//)"
    action: deny
    message: "Webhook target not in allowlist"

  - name: prevent_secret_leak_in_indexing
    description: Refuse to index documents containing API keys
    tool: index_document
    when: "args.content =~ /(sk-[A-Za-z0-9]{32,}|AKIA[0-9A-Z]{16})/"
    action: deny
    message: "Document appears to contain a credential: refused"

  - name: read_only_sql
    description: RAG SQL tool may only SELECT
    tool: execute_sql
    when: "!args.query.upper().trimStart().startsWith('SELECT')"
    action: deny
    message: "Only SELECT statements are permitted from the RAG pipeline"

How Veto fits
Install the SDK
pip install veto-sdk haystack-ai
Define policies
Create veto/policies.yaml. Match on the same name you use in the Tool constructor.
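Since rules are matched by tool name, a rule whose tool field is misspelled would presumably never apply. The sketch below is one way to catch that before deploying. It assumes the rules have already been loaded into plain dicts shaped like the YAML above; unmatched_rule_tools and the sample data are hypothetical and not part of the Veto SDK.

```python
def unmatched_rule_tools(rules: list[dict], tool_names: set[str]) -> list[str]:
    """Return the names of rules whose `tool` matches no registered tool."""
    return [rule["name"] for rule in rules if rule.get("tool") not in tool_names]


# Illustrative data: two rules from the policy above, plus one with a typo.
rules = [
    {"name": "webhook_allowlist", "tool": "push_to_webhook"},
    {"name": "read_only_sql", "tool": "execute_sql"},
    {"name": "block_deletes", "tool": "delete_namespaces"},  # typo: extra "s"
]
# The names passed to the Tool constructors.
registered = {"index_document", "delete_namespace", "push_to_webhook", "execute_sql"}

print(unmatched_rule_tools(rules, registered))
```

Running a check like this in CI keeps the policy file and the Tool definitions from drifting apart.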
Guard each Tool function
Call veto.guard() at the top of each function you pass to a Haystack Tool, before any side effect runs. Pipelines and Agent components stay outside the policy surface.
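Repeating the guard call in every tool body gets noisy, so one option is to factor it into a decorator. The following is a minimal sketch, not the SDK's own helper: StubVeto and Decision are hypothetical stand-ins that mimic the guard(tool=..., arguments=..., context=...) call and the decision and reason fields used in the examples above, so the snippet runs without network access. In real code you would pass the actual Veto client instead.

```python
from dataclasses import dataclass
from functools import wraps
from typing import Any, Callable


@dataclass
class Decision:
    """Stand-in for the object returned by veto.guard() in the examples above."""
    decision: str  # "allow", "review", or "deny"
    reason: str = ""


class StubVeto:
    """Hypothetical local stand-in for the Veto client; denies one namespace."""

    def guard(self, tool: str, arguments: dict, context: dict) -> Decision:
        if tool == "delete_namespace" and arguments.get("namespace") == "production":
            return Decision("deny", "Critical namespaces cannot be deleted by agents")
        return Decision("allow")


def guarded(client: Any, tool_name: str, role: str) -> Callable:
    """Decorator: ask the client for a decision before running the tool body."""

    def decorator(func: Callable[..., str]) -> Callable[..., str]:
        @wraps(func)
        def wrapper(**kwargs: Any) -> str:
            decision = client.guard(
                tool=tool_name, arguments=kwargs, context={"role": role}
            )
            if decision.decision != "allow":
                # Review and deny both stop before the side effect.
                return f"Blocked: {decision.reason}"
            return func(**kwargs)

        return wrapper

    return decorator


veto = StubVeto()


@guarded(veto, "delete_namespace", role="docs-agent")
def delete_namespace(namespace: str) -> str:
    """Wipe an entire namespace of documents."""
    return f"deleted {namespace}"  # placeholder for the real store call


print(delete_namespace(namespace="production"))
print(delete_namespace(namespace="staging"))
```

The wrapper accepts keyword arguments only, which matches how Haystack invokes a Tool function with the arguments parsed from the model's tool call.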
Use cases
Customer support RAG
A Haystack support agent that searches docs and creates tickets. Restrict ticket creation to authenticated users, block bulk ticket spam, require approval for refund tickets.
Knowledge base writes
Agents that index new documents into the knowledge base. Block content containing API keys or credentials, restrict namespaces, block production-data deletion on the governed path.
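To see what the credential rule matches, the pattern from prevent_secret_leak_in_indexing can be exercised locally. This is only an illustration of the regular expression using Python's re module; how Veto itself evaluates the when expression is not shown here, and contains_credential is a hypothetical helper.

```python
import re

# Same pattern as the prevent_secret_leak_in_indexing rule: strings starting
# with "sk-" followed by 32+ alphanumerics, or "AKIA" followed by 16
# uppercase letters or digits.
CREDENTIAL_PATTERN = re.compile(r"(sk-[A-Za-z0-9]{32,}|AKIA[0-9A-Z]{16})")


def contains_credential(content: str) -> bool:
    """Return True if the content contains a substring matching the pattern."""
    return CREDENTIAL_PATTERN.search(content) is not None


# Synthetic values built at runtime so no real-looking key appears in source.
fake_sk_key = "sk-" + "a" * 32
fake_aws_key = "AKIA" + "A" * 16

print(contains_credential(f"config: api_key={fake_sk_key}"))
print(contains_credential(f"aws_access_key_id = {fake_aws_key}"))
print(contains_credential("Release notes for version 2.4"))
```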
Webhook fan-out
Pipelines that POST results to customer webhooks. Allowlist target domains, validate payload shape, cap notification rate per customer.
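Domain allowlisting is easy to get subtly wrong with substring checks, so it can help to compare the parsed hostname exactly. A minimal sketch using the standard library follows; ALLOWED_HOSTS and is_allowed_webhook are hypothetical names, and the host shown is a placeholder.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; replace with your own partner hosts.
ALLOWED_HOSTS = {"hooks.partner.example"}


def is_allowed_webhook(url: str) -> bool:
    """Accept only HTTPS URLs whose exact hostname is allowlisted."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed URLs (for example, broken IPv6 brackets) are rejected.
        return False
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS


print(is_allowed_webhook("https://hooks.partner.example/v1/events"))
print(is_allowed_webhook("https://hooks.partner.example.evil.test/v1/events"))
```

Comparing the parsed hostname rejects lookalike targets such as an allowlisted name used as a subdomain prefix or placed in the userinfo part of the URL.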
RAG with SQL
Pipelines that combine document retrieval with on-demand SQL queries. Require SELECT-only behavior, apply row limits, and block PII tables, even when the model is asked to do otherwise.
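A prefix check alone would let a query like "SELECT 1; DROP TABLE users" through, so a SELECT-only check usually also needs to reject stacked statements. The sketch below is a deliberately conservative local illustration, not Veto's policy engine: is_read_only is a hypothetical helper, and it will also reject harmless SELECTs that mention a write keyword inside a string literal.

```python
import re

# Keywords that modify data or schema, matched as whole words.
_WRITE_KEYWORDS = re.compile(
    r"\b(insert|update|delete|drop|alter|truncate|create|grant|revoke)\b",
    re.IGNORECASE,
)


def is_read_only(query: str) -> bool:
    """Return True only for a single SELECT statement with no write keywords."""
    statement = query.strip().rstrip(";").strip()
    if ";" in statement:
        # More than one statement in the same string.
        return False
    if not re.match(r"select\b", statement, re.IGNORECASE):
        return False
    return _WRITE_KEYWORDS.search(statement) is None


print(is_read_only("SELECT id, title FROM docs LIMIT 10;"))
print(is_read_only("SELECT 1; DROP TABLE users"))
```

A text-level check like this is best treated as one layer; connecting the pipeline with a database role that only has read privileges covers cases that string inspection misses.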
Frequently asked questions
Does Veto work with Haystack 2.x pipelines and Agents?
Can Veto enforce policies on retrieved context?
What about Haystack tracing and observability?
Does this work with deepset Cloud deployments?