Together AI runtime authorization

Name: Veto
Author: Veto

Wrap Together AI tool calls with Veto. For the models you host on Together, each governed tool selection is evaluated before dispatch: allow, review, or deny, with an exportable decision record per governed decision.

Why Together AI needs guardrails

Together AI serves many models with tool-calling support, open-weight and otherwise, across sizes and providers. The platform's value is choice. The risk is that every model has its own jailbreak history, its own training quirks, and its own tool-calling behavior. Without guardrails at the dispatch boundary, switching models also changes the behavior you are relying on.

Open-weight models are also the ones teams fine-tune most often. Custom adapters, LoRAs, and dedicated endpoints can shift behavior in ways the original safety training does not cover. Enforcement at the tool boundary stays constant: it is independent of which fine-tune produced the tool call.

Model heterogeneity

Each model family has a different jailbreak surface. Per-model safety analysis does not scale; enforcement at the tool boundary does.

Fine-tune drift

Dedicated endpoints and LoRAs change behavior. The guardrail boundary enforces invariants regardless of training drift.

A/B model swaps

Teams hot-swap models for cost or speed. Without policy-level enforcement, swapping the model also swaps the safety profile.

Before and after Veto

The listing below shows standard Together AI function calling: the model returns tool_calls and your code dispatches them. With Veto, a policy check sits between the selection and the execution: same model, same tools, each governed call evaluated against policy first.

import os
import json
from together import Together

client = Together(api_key=os.environ["TOGETHER_API_KEY"])

tools = [
    {
        "type": "function",
        "function": {
            "name": "run_data_export",
            "description": "Export a customer dataset to S3",
            "parameters": {
                "type": "object",
                "properties": {
                    "dataset": {"type": "string"},
                    "bucket": {"type": "string"},
                    "include_pii": {"type": "boolean"},
                },
                "required": ["dataset", "bucket"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "deploy_workflow",
            "description": "Deploy a workflow definition to production",
            "parameters": {
                "type": "object",
                "properties": {
                    "workflow_name": {"type": "string"},
                    "definition": {"type": "string"},
                },
                "required": ["workflow_name", "definition"],
            },
        },
    },
]

# user_message is the end user's prompt, supplied by your application.
response = client.chat.completions.create(
    model=os.environ["TOGETHER_MODEL"],
    messages=[{"role": "user", "content": user_message}],
    tools=tools,
    tool_choice="auto",
)

# Together hosts multiple model families that can return tool_calls.
# Your code runs them: no guardrail step in between.
# execute_tool is your own dispatcher that maps a tool name to its handler.
for tool_call in response.choices[0].message.tool_calls or []:
    args = json.loads(tool_call.function.arguments)
    execute_tool(tool_call.function.name, args)
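
For comparison, the same loop with a policy check in front of dispatch might look like the sketch below. To keep it runnable without either SDK, the guard is passed in as a plain callable and the tool calls are stand-in objects. The names dispatch_guarded, fake_guard, make_call, and handlers are illustrative, not part of the Together or Veto APIs; in a real integration the callable would wrap veto.guard().

```python
import json
from types import SimpleNamespace


def dispatch_guarded(tool_calls, guard, handlers):
    """Run each tool call only if guard(name, args) returns "allow"."""
    results = []
    for tc in tool_calls or []:
        name = tc.function.name
        args = json.loads(tc.function.arguments)
        verdict = guard(name, args)  # assumed to return "allow", "review", or "deny"
        if verdict != "allow":
            results.append({"tool_call_id": tc.id, "blocked": verdict})
            continue
        results.append({"tool_call_id": tc.id, "output": handlers[name](**args)})
    return results


def fake_guard(name, args):
    # Hypothetical demo policy: exports may only target internal-* buckets.
    if name == "run_data_export" and not args["bucket"].startswith("internal-"):
        return "deny"
    return "allow"


def make_call(call_id, name, args):
    # Stand-in with the same attribute shape as a chat-completions tool call.
    function = SimpleNamespace(name=name, arguments=json.dumps(args))
    return SimpleNamespace(id=call_id, function=function)


handlers = {
    "run_data_export": lambda dataset, bucket, include_pii=False: f"exported {dataset} to {bucket}",
}
calls = [
    make_call("call_1", "run_data_export", {"dataset": "orders", "bucket": "internal-reports"}),
    make_call("call_2", "run_data_export", {"dataset": "orders", "bucket": "public-share"}),
]
results = dispatch_guarded(calls, fake_guard, handlers)
```

The first call runs and the second is recorded as blocked without its handler ever being invoked, which is the property the dispatch boundary is meant to guarantee.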

Multi-model dispatch with shared policy

Together AI's strength is letting you mix models for different paths. Veto enforces the same policy across each model path. Pass the model id in the context so each call is tagged with the model that produced it, and condition policies on the model id when needed.

together_multi_model_with_veto.py

import os
import json
from together import Together
from veto_sdk import Veto

client = Together(api_key=os.environ["TOGETHER_API_KEY"])
veto = Veto(api_key=os.environ["VETO_API_KEY"])

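# Together hosts many open models. Enforcement is model-agnostic: the same policies cover each configured model.
# Model ids are read from the environment, as in the single-model example.
MODELS = [
    os.environ["TOGETHER_MODEL"],
    os.environ["FALLBACK_MODEL"],
    os.environ["SPECIALIZED_MODEL"],
]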

def run_turn(model: str, messages: list, tools: list, ctx: dict):
    response = client.chat.completions.create(
        model=model, messages=messages, tools=tools, tool_choice="auto",
    )
    results = []
    for tc in response.choices[0].message.tool_calls or []:
        args = json.loads(tc.function.arguments)
        decision = veto.guard(
            tool=tc.function.name,
            arguments=args,
            context={**ctx, "model": model},
        )
        if decision.decision != "allow":
            results.append({"tool_call_id": tc.id, "blocked": decision.reason})
            continue
        results.append({
            "tool_call_id": tc.id,
            "output": execute_tool(tc.function.name, args),
        })
    return response, results

# Same policy enforcement across the model paths the team experiments with.
# messages, tools, and user_id are supplied by your application.
for model in MODELS:
    response, results = run_turn(model, messages, tools, {"user_id": user_id})
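
The results list can be fed back to the model so it sees which calls ran and which were blocked. The sketch below assumes the OpenAI-compatible message format (role "tool" with a tool_call_id and string content). The helper to_tool_messages and the payload wording are hypothetical, not part of either SDK.

```python
import json


def to_tool_messages(results):
    """Convert run_turn-style results into tool-role messages for the next turn."""
    messages = []
    for item in results:
        if "blocked" in item:
            payload = {"status": "blocked", "reason": item["blocked"]}
        else:
            payload = {"status": "ok", "output": item["output"]}
        messages.append({
            "role": "tool",
            "tool_call_id": item["tool_call_id"],
            "content": json.dumps(payload),
        })
    return messages


results = [
    {"tool_call_id": "call_1", "output": "exported orders"},
    {"tool_call_id": "call_2", "blocked": "Exports restricted to internal-* S3 buckets"},
]
tool_messages = to_tool_messages(results)
```

Append the assistant message that carried the tool_calls to the conversation first, then these messages, before requesting the next completion.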

Policy configuration

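Declare guardrail rules in YAML. Rules can apply per-model overrides through context fields, for example restricting experimental models to development while giving production models broader coverage. If a rule reads a context field such as environment or model_family, include that field in the context you pass to veto.guard().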

veto/policies.yaml

rules:
  - name: block_pii_exports
    description: PII exports require compliance review
    tool: run_data_export
    when: args.include_pii == true
    action: require_approval
    approvers: [compliance, privacy-office]
    timeout: 24h

  - name: bucket_allowlist
    description: Data exports may only target internal buckets
    tool: run_data_export
    when: "!args.bucket.startsWith('internal-')"
    action: deny
    message: "Exports restricted to internal-* S3 buckets"

  - name: workflow_deploy_approval
    description: Workflow deploys to prod need engineering sign-off
    tool: deploy_workflow
    when: context.environment == "production"
    action: require_approval
    approvers: [engineering-leads]
    timeout: 2h

  - name: workflow_definition_size
    description: Reject definitions over 200,000 characters as a guard against injected payloads
    tool: deploy_workflow
    when: args.definition.length > 200000
    action: deny
    message: "Workflow definition exceeds size limit"

  - name: model_specific_caps
    description: Restrict experimental models to dev
    tool: run_data_export
    when: 'context.model_family == "experimental" && context.environment != "development"'
    action: deny
    message: "Experimental models restricted to development environment"
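
To sanity-check expectations before wiring up the service, the conditions above can be mirrored as plain Python predicates. This is only a sketch for writing test cases: it is not Veto's evaluation engine, it makes no claim about rule precedence, and matching_rules is a hypothetical helper that simply returns every rule whose condition holds.

```python
# Each entry: (rule name, tool, action, condition over args and context).
RULES = [
    ("block_pii_exports", "run_data_export", "require_approval",
     lambda args, ctx: args.get("include_pii") is True),
    ("bucket_allowlist", "run_data_export", "deny",
     lambda args, ctx: not args.get("bucket", "").startswith("internal-")),
    ("workflow_deploy_approval", "deploy_workflow", "require_approval",
     lambda args, ctx: ctx.get("environment") == "production"),
    ("workflow_definition_size", "deploy_workflow", "deny",
     lambda args, ctx: len(args.get("definition", "")) > 200000),
    ("model_specific_caps", "run_data_export", "deny",
     lambda args, ctx: ctx.get("model_family") == "experimental"
     and ctx.get("environment") != "development"),
]


def matching_rules(tool, args, ctx):
    """Return (rule_name, action) for every rule whose tool and condition match."""
    return [
        (name, action)
        for name, rule_tool, action, condition in RULES
        if rule_tool == tool and condition(args, ctx)
    ]


hits = matching_rules(
    "run_data_export",
    {"dataset": "orders", "bucket": "public-share", "include_pii": True},
    {"environment": "production", "model_family": "stable"},
)
```

Here a PII export to a non-internal bucket matches both block_pii_exports and bucket_allowlist; how the real service resolves overlapping matches is something to confirm against its documentation.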

How Veto fits

Install the SDK

pip install veto-sdk together

Define policies

Create veto/policies.yaml. Match on tool name. Use context fields to condition on model id, environment, or user role.

Guard the dispatch loop

Call veto.guard() for each governed tool_call returned by Together's API before invoking the handler.
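
One design question at this step is what happens when the arguments are not valid JSON or the policy check itself fails. A conservative option is to fail closed and treat both cases as a denial. In the sketch below, safe_guard and its return shape are assumptions for illustration, and guard stands for any callable that wraps veto.guard().

```python
import json


def safe_guard(guard, name, raw_arguments):
    """Return (decision, args, reason), denying on malformed input or guard failure."""
    try:
        args = json.loads(raw_arguments)
    except (TypeError, ValueError):
        return "deny", None, "arguments are not valid JSON"
    if not isinstance(args, dict):
        return "deny", None, "arguments must be a JSON object"
    try:
        decision = guard(name, args)
    except Exception as exc:  # fail closed: a failed policy check blocks the call
        return "deny", args, f"policy check failed: {exc}"
    return decision, args, None


def always_allow(name, args):
    # Stand-in guard for the demo.
    return "allow"


def broken_guard(name, args):
    # Stand-in guard that simulates an unreachable policy service.
    raise TimeoutError("policy service unreachable")


ok = safe_guard(always_allow, "deploy_workflow", '{"workflow_name": "nightly", "definition": "steps: []"}')
malformed = safe_guard(always_allow, "deploy_workflow", '{"workflow_name": ')
failed = safe_guard(broken_guard, "deploy_workflow", '{"workflow_name": "nightly", "definition": "x"}')
```

Failing open instead would let a tool run whenever the check is unavailable, which defeats the purpose of placing enforcement at the dispatch boundary.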

Use cases

Data export agents

Agents that export customer datasets to S3. Block PII exports without approval, restrict exports to internal buckets, and record each governed export with the originating model and prompt.

Workflow deployment

Agents that deploy workflow definitions to production. Require engineering approval for prod deploys, cap definition size, block deploys outside change windows.

Model A/B testing

Run the same task across configured model families with one policy file. Compare outputs without rebuilding the guardrail surface.
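
One way to compare model paths is to tally decisions per model from whatever records you keep of each guarded call. The record fields used below (model, decision) are assumptions about your own logging, not the format of Veto's decision record, and decisions_by_model is a hypothetical helper.

```python
from collections import Counter, defaultdict


def decisions_by_model(records):
    """Count guard decisions per model id."""
    tally = defaultdict(Counter)
    for record in records:
        tally[record["model"]][record["decision"]] += 1
    return {model: dict(counts) for model, counts in tally.items()}


records = [
    {"model": "model-a", "decision": "allow"},
    {"model": "model-a", "decision": "deny"},
    {"model": "model-b", "decision": "allow"},
    {"model": "model-b", "decision": "allow"},
]
summary = decisions_by_model(records)
```

A model that triggers noticeably more denials on the same task is a signal worth investigating before promoting it to production.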

Dedicated endpoint guardrails

Fine-tuned dedicated endpoints can drift from base safety training. Veto's policy layer stays constant: fine-tuning the model does not change the policy check at the tool boundary.

Frequently asked questions

Which Together AI models work with Veto?

Use Together models or endpoints that emit tool or function calls through the chat API. Keep the model list in your own integration tests because Together's catalog changes; Veto guards the dispatch path, not the model.

Does Veto work with Together's dedicated endpoints?

Yes. Dedicated endpoints serve the same chat completions API. The tool_call shape is the same, so the integration point is identical. Veto's policy enforcement is unaffected by fine-tuning, LoRA adapters, or custom dedicated infrastructure.

Can I gate experimental models?

Yes. Include the model id in the context passed to veto.guard(). Policies can condition on the model: for example, restrict experimental models to development while allowing approved model families in production.
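
A minimal sketch of building that context follows. The model_family and environment keys match the fields the example policy reads. The EXPERIMENTAL_MODELS set, the model id in it, and the APP_ENV variable are hypothetical placeholders to replace with your own.

```python
import os

# Hypothetical id: list the models your team treats as experimental.
EXPERIMENTAL_MODELS = {"example-org/experimental-model"}


def build_context(model, user_id, environment=None):
    """Assemble the context dict passed alongside each guarded tool call."""
    family = "experimental" if model in EXPERIMENTAL_MODELS else "approved"
    return {
        "user_id": user_id,
        "model": model,
        "model_family": family,
        # APP_ENV is an assumed variable name; fall back to production, the stricter setting.
        "environment": environment or os.environ.get("APP_ENV", "production"),
    }


ctx = build_context("example-org/experimental-model", "user-123", environment="development")
```

Deriving the family in one place keeps the policy file free of individual model ids, so adding a new experimental model is a one-line change in the integration.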

Does this work with Together's batch inference?

Yes. Batch inference returns the same tool_call structure per request. Apply veto.guard() in your post-processing loop, so the policy decision runs before any tool is dispatched.
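
For batch jobs, the post-processing loop could look like the sketch below. The record layout (one JSON object per line, with custom_id and a chat completion under response.body) is an assumption modeled on OpenAI-style batch output; check it against the output file your Together batch job actually produces. guard_batch_output is a hypothetical helper, and guard stands for a callable wrapping veto.guard().

```python
import json


def guard_batch_output(lines, guard):
    """Yield (custom_id, tool_name, args, decision) for each tool call in a batch result."""
    for line in lines:
        if not line.strip():
            continue  # skip blank lines in the output file
        record = json.loads(line)
        # Assumed layout: record["response"]["body"] holds a chat completion object.
        body = record.get("response", {}).get("body", {})
        for choice in body.get("choices", []):
            for tc in choice.get("message", {}).get("tool_calls") or []:
                name = tc["function"]["name"]
                args = json.loads(tc["function"]["arguments"])
                yield record.get("custom_id"), name, args, guard(name, args)


sample_line = json.dumps({
    "custom_id": "req-1",
    "response": {"body": {"choices": [{"message": {"tool_calls": [{
        "id": "call_1",
        "function": {
            "name": "run_data_export",
            "arguments": json.dumps({"dataset": "orders", "bucket": "internal-reports"}),
        },
    }]}}]}},
})
decisions = list(guard_batch_output([sample_line, ""], lambda name, args: "allow"))
```

Only calls whose decision is "allow" should be handed to the dispatcher; the rest can be written to a review queue keyed by custom_id.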

Related integrations

OpenAI

GPT function call guardrails

Bedrock

AWS Bedrock agent guardrails

Vercel AI SDK

Guardrails for streaming AI SDK tool calls

Wrap one Together AI tool path and inspect the decision record.