Middleware catalog¶

All middlewares live in spine-middleware and are built purely on the kernel hook points. The first group is registered by name for spine.toml chains.

Execution & reliability¶

Middleware	Hook(s)	What it does
`Retry(max_attempts, base, factor, jitter)`	`on_error`	Exponential backoff + full jitter on provider errors
`ModelFallback(*providers)`	`on_error`	Switch to the next provider when one fails
`LoopGuard(window, max_repeats)`	`after_model`	Stop when the same tool action repeats (`StopReason.LOOP`)
`CircuitBreaker(threshold, cooldown_s)`	`before/after_model`, `on_error`	Open after N failures; fail fast for a cooldown
`RateLimit(max_calls, per_s)`	`before_model`	Per-process token-bucket on model calls

Cost, caching & shaping¶

Middleware	Hook(s)	What it does
`CostTracking(input_per_mtok, output_per_mtok)`	`after_model`	Fill `cost_usd` from a price table so cost guards bite
`Cache(ttl_s, max_size)`	`before/after_model`	Serve an identical request from a content-hashed cache (free on hit)
`Compaction(max_messages, keep_last)`	`before_model`	Trim long histories non-destructively
`StructuredOutput(schema, max_repairs)`	`before/after_model`	Validate the final answer vs a Pydantic schema, repairing on failure

Tools¶

Middleware	Hook(s)	What it does
`ToolTimeout(timeout_s, tools=...)`	`before_tool`	Per-tool wall-clock timeout (kernel cancels)
`ToolOutputTruncation(max_chars)`	`after_tool`	Cap huge tool outputs before re-feeding context
`Idempotency(tools=..., store=...)`	`before/after_tool`	Run a side-effecting tool once per `(tool, args)`
`Sandbox(tools=..., timeout_s, max_cpu_s, max_memory_mb)`	`before_tool`	Run a sync tool in a resource-limited subprocess (POSIX)

Safety & multi-tenancy¶

Middleware	Hook(s)	What it does
`PIIRedaction(entities=...)`	`after_tool`, `after_model`	Redact PII from tool output, traces, and final answer
`PromptInjectionScreen(action="annotate"\\|"block")`	`after_tool`	Treat tool output as untrusted data
`ContentPolicy(banned=..., validate=...)`	`before/after_model`	Block input/output (`StopReason.GUARDRAIL`)
`TenantBudget(max_cost_usd, max_tokens)`	`before/after_model`	Cumulative per-tenant ceiling across runs

Memory & replay¶

Middleware	Hook(s)	What it does
`MemoryRecall(memory, k, scope_session)`	`before_model`, `on_run_end`	Inject recalled memories; persist the exchange
`Recorder()` / `Replayer(recording)`	record / replay	Deterministic replay of model + tool outputs

Observability¶

Middleware	Hook(s)	What it does
`ConsoleLogger()`	all hooks	Opt-in pretty terminal log of each step/tool/result (Rich if installed)
`OTelMiddleware(tracer=...)` (from `spine-otel`)	run/model/tool spans	One OpenTelemetry span tree per run

See Guardrails & safety and Middleware concepts for usage and ordering.