Middleware catalog
All middlewares live in spine-middleware and are built purely on the kernel
hook points. The first group is registered by name
for spine.toml chains.
Execution & reliability
| Middleware |
Hook(s) |
What it does |
Retry(max_attempts, base, factor, jitter) |
on_error |
Exponential backoff + full jitter on provider errors |
ModelFallback(*providers) |
on_error |
Switch to the next provider when one fails |
LoopGuard(window, max_repeats) |
after_model |
Stop when the same tool action repeats (StopReason.LOOP) |
CircuitBreaker(threshold, cooldown_s) |
before/after_model, on_error |
Open after N failures; fail fast for a cooldown |
RateLimit(max_calls, per_s) |
before_model |
Per-process token-bucket on model calls |
Cost, caching & shaping
| Middleware |
Hook(s) |
What it does |
CostTracking(input_per_mtok, output_per_mtok) |
after_model |
Fill cost_usd from a price table so cost guards bite |
Cache(ttl_s, max_size) |
before/after_model |
Serve an identical request from a content-hashed cache (free on hit) |
Compaction(max_messages, keep_last) |
before_model |
Trim long histories non-destructively |
StructuredOutput(schema, max_repairs) |
before/after_model |
Validate the final answer vs a Pydantic schema, repairing on failure |
| Middleware |
Hook(s) |
What it does |
ToolTimeout(timeout_s, tools=...) |
before_tool |
Per-tool wall-clock timeout (kernel cancels) |
ToolOutputTruncation(max_chars) |
after_tool |
Cap huge tool outputs before re-feeding context |
Idempotency(tools=..., store=...) |
before/after_tool |
Run a side-effecting tool once per (tool, args) |
Sandbox(tools=..., timeout_s, max_cpu_s, max_memory_mb) |
before_tool |
Run a sync tool in a resource-limited subprocess (POSIX) |
Safety & multi-tenancy
| Middleware |
Hook(s) |
What it does |
PIIRedaction(entities=...) |
after_tool, after_model |
Redact PII from tool output, traces, and final answer |
PromptInjectionScreen(action="annotate"\|"block") |
after_tool |
Treat tool output as untrusted data |
ContentPolicy(banned=..., validate=...) |
before/after_model |
Block input/output (StopReason.GUARDRAIL) |
TenantBudget(max_cost_usd, max_tokens) |
before/after_model |
Cumulative per-tenant ceiling across runs |
Memory & replay
| Middleware |
Hook(s) |
What it does |
MemoryRecall(memory, k, scope_session) |
before_model, on_run_end |
Inject recalled memories; persist the exchange |
Recorder() / Replayer(recording) |
record / replay |
Deterministic replay of model + tool outputs |
Observability
| Middleware |
Hook(s) |
What it does |
ConsoleLogger() |
all hooks |
Opt-in pretty terminal log of each step/tool/result (Rich if installed) |
OTelMiddleware(tracer=...) (from spine-otel) |
run/model/tool spans |
One OpenTelemetry span tree per run |
See Guardrails & safety and Middleware
concepts for usage and ordering.