Skip to content

Guardrails & safety

Safety is built on hook points: tool output is treated as untrusted data, and blocking guardrails stop the run cleanly (a guardrail stop reason, not a crash).

PII redaction

Redacts email / SSN / credit-card / phone / IPv4 from tool output before it reaches the model or the trace, and from the final answer:

from spine_middleware import PIIRedaction

agent = Agent("openai:gpt-4o-mini", tools=tools, middleware=[PIIRedaction()])

Prompt-injection screening

Treats tool output as untrusted — annotate with a caution banner (default) or block on injection patterns:

from spine_middleware import PromptInjectionScreen

# annotate (default) — wrap suspicious output so the model treats it as data
PromptInjectionScreen()
# or hard-block
PromptInjectionScreen(action="block")

Content policy

Block the user input and/or the final answer on banned patterns or a custom predicate:

from spine_middleware import ContentPolicy

ContentPolicy(banned=["password", "ssn"])
ContentPolicy(validate=lambda text: len(text) < 2000)

Reliability primitives

Middleware Protects against
CircuitBreaker repeated provider failures — fail fast for a cooldown
RateLimit exceeding a call budget (token bucket)
Idempotency duplicate side effects on retries
Sandbox runaway CPU/memory in a sync tool (subprocess + rlimits)

Per-tenant budgets

Enforce a cumulative cost/token ceiling per tenant across many runs:

from spine_middleware import TenantBudget

budget = TenantBudget(max_cost_usd=10.0)
agent = Agent("openai:gpt-4o-mini", middleware=[budget], tenant_id="acme")

Sandbox scope

Sandbox is a resource sandbox (stops runaway CPU/memory/hangs), not a security jail. For genuinely untrusted code use a container or VM boundary.