Every data point on this page came from live API calls made to production. Nothing is simulated and nothing is extrapolated. The API is open, so you can run the same tests yourself.
Every attack type was tested live against /v1/validate. Green means the gateway caught the attack before it reached downstream agents.
Wall-clock time measured server-side across 30 sequential /v1/validate calls. The figures are taken directly from the latency_ms field of each API response.
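To reproduce this kind of summary from your own run, a minimal sketch follows. It assumes you have already collected the latency_ms value from each response into a list; the function name, the choice of statistics, and the sample values are illustrative and not part of the API.

```python
import statistics

def summarize_latency(latencies_ms):
    """Summarize a list of latency_ms values collected from API responses."""
    if len(latencies_ms) < 2:
        raise ValueError("need at least two measurements")
    ordered = sorted(latencies_ms)
    # quantiles(n=20) returns 19 cut points at 5% steps; the last one is the 95th percentile.
    p95 = statistics.quantiles(ordered, n=20, method="inclusive")[18]
    return {
        "count": len(ordered),
        "min": ordered[0],
        "median": statistics.median(ordered),
        "p95": p95,
        "max": ordered[-1],
    }

# Hypothetical measurements standing in for 30 real latency_ms readings.
sample = list(range(1, 31))
summary = summarize_latency(sample)
```

Reporting the median and 95th percentile alongside min and max makes a single slow call visible without letting it dominate the headline number.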
/v1/compress was tested on conversations of 4 to 32 messages. Before and after token counts come from live API responses. Cost figures assume GPT-4o pricing of $2.50 per 1M input tokens.
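The cost arithmetic can be checked by hand. The sketch below uses the $2.50 per 1M input tokens rate quoted above; the function names and the before/after token counts are hypothetical and chosen only to illustrate the calculation.

```python
GPT4O_INPUT_USD_PER_MILLION = 2.50  # rate quoted in the text

def input_cost_usd(tokens):
    """Cost in USD of sending `tokens` input tokens at the quoted rate."""
    return tokens * GPT4O_INPUT_USD_PER_MILLION / 1_000_000

def compression_savings(tokens_before, tokens_after):
    """Tokens, percentage, and dollars saved by one compression call."""
    saved = tokens_before - tokens_after
    percent = 100.0 * saved / tokens_before if tokens_before else 0.0
    return {
        "tokens_saved": saved,
        "percent_saved": percent,
        "usd_saved": input_cost_usd(saved),
    }

# Hypothetical counts, not measured results.
example = compression_savings(10_000, 4_000)
```

Per-call savings are small in dollar terms; the figure matters once it is multiplied by request volume.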
/v1/swarm/check evaluates agent-chain confidence using a geometric mean. One bad output kills the chain, which saves every downstream LLM call. Clean chains pass through untouched.
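A geometric mean is a natural fit here because one very low score drags the whole result down far more than it would an arithmetic mean. The sketch below illustrates that property only. It assumes per-step confidence scores between 0 and 1; how the gateway actually scores steps and where it sets its cutoff are not stated on this page, so the function and the sample scores are hypothetical.

```python
import math

def chain_confidence(scores):
    """Geometric mean of per-step confidence scores, each assumed to be in [0, 1]."""
    if not scores:
        raise ValueError("need at least one score")
    if any(s <= 0 for s in scores):
        # A zero score zeroes the product, so the whole chain scores zero.
        return 0.0
    return math.exp(sum(math.log(s) for s in scores) / len(scores))

clean_chain = chain_confidence([0.9, 0.9, 0.9])
one_bad_step = chain_confidence([0.9, 0.01, 0.9])
arithmetic_for_comparison = sum([0.9, 0.01, 0.9]) / 3
```

With the sample scores, the arithmetic mean of the chain with one bad step stays above 0.6, while the geometric mean falls to roughly 0.2, which is why a single bad output is enough to fail the chain.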
/v1/compress summarizes accumulated context between swarm steps. The longer the chain, the more it saves. It runs on CPU only and makes no LLM calls.
/v1/context/check returns real-time action recommendations. Tested at 9 fill levels from 10% to 110%. Actions escalate based on context percentage and unsaved work status.
The integrity guard inspects messages for fabrication patterns: fake URLs, reasoning leakage, and citation-style confidence inflation. Tested live with 8 messages covering different fabrication types.
Wall-clock time for parallel batches of 1 to 50 requests fired concurrently from a single client. The results show the gateway handling concurrency without degradation.
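A batch test of this shape can be sketched with a thread pool. The harness below takes any zero-argument callable, so the real HTTP request is left to the reader; `fake_call` is a stand-in that sleeps instead of hitting the network, and the batch sizes are examples within the 1 to 50 range described above.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def time_batch(call, batch_size):
    """Fire `batch_size` calls concurrently and return (elapsed_ms, results)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=batch_size) as pool:
        # One worker per request so the whole batch starts at roughly the same time.
        results = list(pool.map(lambda _: call(), range(batch_size)))
    elapsed_ms = (time.perf_counter() - start) * 1000
    return elapsed_ms, results

def fake_call():
    """Stand-in for a real request; replace with your own HTTP call."""
    time.sleep(0.01)
    return 200

timings = {}
for size in (1, 10, 50):
    elapsed_ms, results = time_batch(fake_call, size)
    timings[size] = (elapsed_ms, len(results))
```

If wall-clock time for a batch of 50 stays close to the time for a batch of 1, the server is handling the requests in parallel rather than queueing them.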