API Documentation
Complete reference for the CR Gateway API. Validate LLM responses, kill bad agent chains, compress context, and relay messages between agents — all with standard HTTP.
Quick Start
Get from zero to validated in under a minute.
Sign Up
Create an account on the home page or call POST /v1/onboard programmatically. You will receive an API key instantly.
Save Your Key
Store the returned api_key securely. It starts with bc_live_ and cannot be retrieved after creation.
Make Your First Call
Send an LLM response to /v1/validate to check safety, confidence, and hallucination.
First Request
curl -X POST https://cr-gateway-worker.jnowlan21.workers.dev/v1/validate \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY" \
  -d '{
    "message": {
      "type": "analysis",
      "content": "Revenue increased 15% in Q4 driven by enterprise expansion.",
      "confidence": 0.88
    }
  }'
Response
{
"valid": true,
"checks": {
"safety": { "passed": true },
"confidence": { "passed": true, "raw": 0.88, "threshold": 0.65 },
"hallucination": { "passed": true, "flags": [] },
"danger_terms": { "passed": true }
},
"latency_ms": 1,
"request_id": "req_a1b2c3d4e5"
}
Authentication
All endpoints except /health and /v1/onboard require authentication via the X-API-Key header.
X-API-Key: bc_live_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
API keys are provisioned through the signup form or the POST /v1/onboard endpoint. Keys begin with bc_live_ and are 40+ characters long.
Requests with a missing or invalid key receive 401 Unauthorized. Requests whose key is valid but lacks permission for the endpoint receive 403 Forbidden.
POST /v1/validate
Validate an LLM response against safety, confidence, hallucination, and danger term checks. Pure CPU: sub-millisecond execution, zero network calls.
Request Body
{
"message": {
"type": "string", // message type (e.g. "analysis", "deal_response")
"content": "string", // the LLM output to validate
"confidence": 0.85, // 0-1 or null (LLM self-assessment)
"metadata": {} // optional key-value pairs
},
"options": { // optional
"checks": ["safety", "confidence", "hallucination", "danger_terms"],
"session_id": "string" // optional session context
}
}
Response 200 / 422
{
"valid": true,
"checks": {
"safety": { "passed": true },
"confidence": { "passed": true, "raw": 0.85, "threshold": 0.65 },
"hallucination": { "passed": true, "flags": [] },
"danger_terms": { "passed": true }
},
"latency_ms": 1,
"request_id": "req_a1b2c3d4e5"
}
curl
curl -X POST https://cr-gateway-worker.jnowlan21.workers.dev/v1/validate \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY" \
  -d '{
    "message": {
      "type": "analysis",
      "content": "Based on market data, growth is projected at 12%.",
      "confidence": 0.85
    }
  }'
Pass null for confidence when the LLM does not provide a self-assessment. Null confidence always passes the threshold check. If a calibration oracle model is trained for your account, the response includes a calibrated confidence alongside the raw value.
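The threshold rule described above can be sketched in a few lines. This is an illustration of the documented behavior, not the gateway's source: the function name checkConfidence is hypothetical, and treating a score exactly equal to the threshold as a pass is an assumption.

```typescript
// Shape of the "confidence" entry in the /v1/validate response.
interface ConfidenceCheck {
  passed: boolean;
  raw: number | null;
  threshold: number;
}

// Hypothetical helper mirroring the documented rule.
// 0.65 is the default tenant threshold shown in the examples.
function checkConfidence(raw: number | null, threshold: number = 0.65): ConfidenceCheck {
  // Null means the LLM gave no self-assessment; the docs say this always passes.
  if (raw === null) {
    return { passed: true, raw, threshold };
  }
  // Assumption: a score equal to the threshold counts as a pass.
  return { passed: raw >= threshold, raw, threshold };
}
```

Note that /v1/swarm/check treats null differently: there a missing confidence is always a weak link.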
POST /v1/store
Validate a message and store it in VERNOT-B binary format. Messages are compressed automatically on storage and can be retrieved via GET /v1/messages/:id.
Request Body
{
"session_id": "string", // required — groups messages together
"message": {
"type": "string",
"content": "string",
"confidence": 0.9,
"metadata": {}
}
}
Response 201
{
"stored": true,
"message_id": "resp_f7e8d9c0b1",
"session_id": "session_123",
"wire_format": "VERNOT-B",
"compression": {
"original_bytes": 284,
"stored_bytes": 190,
"savings_pct": 33
},
"validation": { /* same shape as /v1/validate response */ }
}
curl
curl -X POST https://cr-gateway-worker.jnowlan21.workers.dev/v1/store \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY" \
  -d '{
    "session_id": "session_123",
    "message": {
      "type": "analysis",
      "content": "Quarterly revenue hit $4.2M.",
      "confidence": 0.92
    }
  }'
If validation fails, the message is not stored and the response returns "stored": false with a 422 status and the validation details.
GET /v1/messages/:id
Retrieve a previously stored message by its ID. Messages are scoped to your company, so you can only retrieve your own. The ID format is resp_ followed by hex characters.
Response 200
{
"message": {
"id": "resp_f7e8d9c0b1",
"type": "analysis",
"content": "Quarterly revenue hit $4.2M.",
"confidence": 0.92,
"metadata": {},
"timestamp": "2026-03-16T12:00:00.000Z",
"signature": "ed25519_hex_string"
},
"storage_metadata": {
"type": "analysis",
"confidence": 0.92,
"timestamp": "2026-03-16T12:00:00.000Z",
"session_id": "session_123",
"wire_format": "B",
"original_size": 284,
"wire_size": 190
}
}
curl
curl https://cr-gateway-worker.jnowlan21.workers.dev/v1/messages/resp_f7e8d9c0b1 \
  -H "X-API-Key: YOUR_API_KEY"
POST /v1/swarm/check
Fail-fast chain check for agent swarms. Given a chain of agents with confidence scores, the endpoint determines whether the chain should proceed or be killed early. It applies the probability chain rule: chain confidence = product of all individual confidences.
Request Body
{
"session_id": "string",
"agent_chain": [ // also accepts "agents" as field name
{ "agent_id": "planner", "confidence": 0.92 },
{ "agent_id": "researcher", "confidence": 0.87 },
{ "agent_id": "writer", "confidence": 0.45 }
],
"threshold": 0.65 // optional — defaults to your tenant threshold
}
Response 200
{
"proceed": false,
"reason": "Weak links detected: [writer]. Kill chain to save downstream LLM calls.",
"chain_confidence": 0.36,
"weak_links": [
{ "agent_id": "writer", "confidence": 0.45, "position": 2 }
],
"request_id": "req_x1y2z3",
"latency_ms": 0
}
curl
curl -X POST https://cr-gateway-worker.jnowlan21.workers.dev/v1/swarm/check \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY" \
  -d '{
    "session_id": "swarm_001",
    "agent_chain": [
      { "agent_id": "planner", "confidence": 0.92 },
      { "agent_id": "researcher", "confidence": 0.87 },
      { "agent_id": "writer", "confidence": 0.45 }
    ]
  }'
A null confidence is always treated as a weak link. This prevents agents that refuse to self-assess from silently poisoning a chain.
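To make the arithmetic concrete, here is a minimal local sketch of the chain check. It is not the gateway's implementation. Three choices are assumptions: a null confidence is flagged as a weak link but left out of the product, the chain proceeds only when there are no weak links and the rounded product meets the threshold, and the product is rounded to two decimals to match the example response.

```typescript
interface AgentLink {
  agent_id: string;
  confidence: number | null;
}

interface WeakLink extends AgentLink {
  position: number; // zero-based index in the chain, as in the example response
}

interface SwarmResult {
  proceed: boolean;
  chain_confidence: number;
  weak_links: WeakLink[];
}

// Hypothetical local re-implementation of the documented rule.
function swarmCheck(chain: AgentLink[], threshold: number = 0.65): SwarmResult {
  let product = 1;
  const weakLinks: WeakLink[] = [];
  chain.forEach((link, position) => {
    if (link.confidence === null) {
      // Documented: null confidence is always a weak link.
      // Assumption: it does not contribute to the product.
      weakLinks.push({ agent_id: link.agent_id, confidence: null, position });
      return;
    }
    product *= link.confidence;
    if (link.confidence < threshold) {
      weakLinks.push({ agent_id: link.agent_id, confidence: link.confidence, position });
    }
  });
  // Assumption: two-decimal rounding, matching "chain_confidence": 0.36.
  const chainConfidence = Math.round(product * 100) / 100;
  return {
    // Assumption: proceed needs no weak links and a product at or above the threshold.
    proceed: weakLinks.length === 0 && chainConfidence >= threshold,
    chain_confidence: chainConfidence,
    weak_links: weakLinks,
  };
}
```

For the planner/researcher/writer chain above, 0.92 × 0.87 × 0.45 ≈ 0.36, and writer is the only link below 0.65.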
POST /v1/relay
Validated agent-to-agent message delivery. The gateway validates the message, stores it internally in VERNOT-B, then delivers clean JSON to the recipient webhook. The recipient never sees VERNOT; it receives standard JSON with a gateway_validated: true flag.
Request Body
{
"session_id": "string", // required
"deliver_to": "https://agent-b.example.com/inbox", // HTTPS only
"message": {
"type": "string",
"content": "string",
"confidence": 0.9,
"metadata": {}
}
}
Response 200 / 502
{
"relayed": true,
"message_id": "resp_a1b2c3",
"validation": { /* /v1/validate response */ },
"delivery": {
"status": 200,
"latency_ms": 142
},
"stored_id": "resp_a1b2c3",
"latency_ms": 156
}
curl
curl -X POST https://cr-gateway-worker.jnowlan21.workers.dev/v1/relay \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY" \
  -d '{
    "session_id": "relay_001",
    "deliver_to": "https://agent-b.example.com/inbox",
    "message": {
      "type": "handoff",
      "content": "Analysis complete. Passing to reviewer.",
      "confidence": 0.91
    }
  }'
The deliver_to URL must use HTTPS. Private and loopback IPs (127.x, 10.x, 192.168.x, etc.) are blocked to prevent SSRF. Delivery has an 8-second timeout. The webhook receives an X-Gateway-Signature HMAC header for authenticity verification.
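A client can pre-check a deliver_to value before calling the endpoint. The sketch below is an assumption about what such a check might look like, not the gateway's own filter: the function name is hypothetical, the 172.16-31.x, 169.254.x, and localhost rules are guesses at what "etc." covers, and URLs with userinfo or IPv6 literals are rejected outright for simplicity. A real SSRF defense would also have to validate the addresses a hostname resolves to.

```typescript
// Hypothetical client-side pre-check for the deliver_to field.
function isAllowedDeliveryUrl(url: string): boolean {
  // Require the https scheme and capture the authority (host[:port]).
  const match = /^https:\/\/([^/?#]+)/i.exec(url);
  if (match === null) return false;
  const authority = match[1];
  // Simplification: reject userinfo ("user@host") and IPv6 literals ("[::1]").
  if (authority.indexOf("@") !== -1 || authority.charAt(0) === "[") return false;
  const host = authority.split(":")[0].toLowerCase();
  if (host === "localhost") return false; // assumption: covered by "etc."

  const ipv4 = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (ipv4 === null) return true; // a hostname; DNS resolution is not checked here

  const a = Number(ipv4[1]);
  const b = Number(ipv4[2]);
  if (a === 127 || a === 10) return false;           // documented: 127.x, 10.x
  if (a === 192 && b === 168) return false;          // documented: 192.168.x
  if (a === 172 && b >= 16 && b <= 31) return false; // assumption: RFC 1918 range
  if (a === 169 && b === 254) return false;          // assumption: link-local
  return true;
}
```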
POST /v1/compress
Compress a conversation history to reduce tokens for your next LLM call. Pure CPU: no LLM calls, no external network. Supports four output formats and three compression strategies.
Request Body
{
"messages": [
{ "role": "user", "content": "What is the price?" },
{ "role": "assistant", "content": "The price is $50/unit." },
// ... more messages
],
"options": {
"target_reduction": 0.5, // 0-1, default 0.5 (50% reduction target)
"preserve_recent": 3, // keep last N messages verbatim (default 3)
"strategy": "summarize", // "summarize" | "deduplicate" | "trim"
"format": "compressed" // "compressed" | "briefing" | "structured" | "both"
}
}
Output Formats
| Format | Returns | Best For |
|---|---|---|
| compressed | Array of compressed messages (ready to feed to LLM) | Drop-in context replacement |
| briefing | Natural language summary string (context field) | System prompt injection |
| structured | JSON object with state, parties, signals, key numbers, action items | Programmatic analysis |
| both | Both context (briefing) and structured fields | Full state extraction |
Response — format: "compressed" (default)
{
"compressed": [
{ "role": "system", "content": "[Context summary of 8 earlier messages]: ..." },
{ "role": "user", "content": "latest message" }
],
"metrics": {
"original_tokens": 1240,
"compressed_tokens": 580,
"reduction_pct": 53.2,
"messages_original": 12,
"messages_compressed": 4,
"strategy_used": "summarize"
},
"request_id": "req_x1y2z3",
"latency_ms": 2
}
Response — format: "both"
{
"context": "Conversation state: negotiating (10 messages, 5 rounds). Parties: ...",
"structured": {
"state": "negotiating", // "negotiating" | "agreed" | "rejected" | "stale" | "unknown"
"round": 5,
"parties": {
"user": { "last_position": { "rates": ["$50/unit"] }, "message_count": 5 }
},
"agreed_terms": {},
"open_terms": ["delivery date still pending"],
"confidence_trend": [0.85, 0.78, 0.82],
"signals": ["concession", "counter_offer"],
"key_numbers": {
"rates": ["$50/unit"], "amounts": ["$5,000"],
"dates": ["March 20"], "percentages": ["10%"]
},
"topics": ["..."],
"findings": ["..."],
"action_items": ["Send revised proposal by Friday"],
"last_message_role": "assistant",
"message_count": 10
},
"original_tokens": 2400,
"compressed_tokens": 320,
"savings_percent": 86.7,
"request_id": "req_a1b2c3",
"latency_ms": 3
}
curl
curl -X POST https://cr-gateway-worker.jnowlan21.workers.dev/v1/compress \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY" \
  -d '{
    "messages": [
      { "role": "user", "content": "What is your best price for 100 units?" },
      { "role": "assistant", "content": "We can offer $50/unit for 100 units." },
      { "role": "user", "content": "That is too high. Can you do $40?" },
      { "role": "assistant", "content": "How about $45/unit?" }
    ],
    "options": { "format": "both" }
  }'
summarize groups older messages into role-tagged summaries with key entity extraction. deduplicate merges messages with >55% word overlap. trim drops oldest messages to fit a token budget. All strategies preserve the most recent N messages verbatim.
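The docs do not define how word overlap is measured for deduplicate, so the following is only one plausible reading: overlap is the number of shared unique words divided by the unique word count of the shorter message. The helper names and the tokenization (lowercase, split on non-word characters) are assumptions.

```typescript
// Lowercase, split on non-word characters, and keep each word once.
function uniqueWords(text: string): string[] {
  const words = text.toLowerCase().split(/\W+/).filter((w) => w.length > 0);
  return words.filter((w, i) => words.indexOf(w) === i);
}

// Assumed metric: shared unique words / unique words in the shorter message.
function wordOverlap(a: string, b: string): number {
  const wordsA = uniqueWords(a);
  const wordsB = uniqueWords(b);
  const smaller = Math.min(wordsA.length, wordsB.length);
  if (smaller === 0) return 0;
  const shared = wordsA.filter((w) => wordsB.indexOf(w) !== -1).length;
  return shared / smaller;
}

// Documented threshold: messages with >55% word overlap are merged.
function isNearDuplicate(a: string, b: string, threshold: number = 0.55): boolean {
  return wordOverlap(a, b) > threshold;
}
```

Under this reading, "The price is $50/unit." and "The price is $50 per unit." would be merged, while two messages sharing exactly half their words would not.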
POST /v1/context/check
Context window management for agents. The endpoint monitors context usage and recommends when to flush, wrap up, or emergency save, so agents do not run out of context and lose unsaved work.
Request Body
{
"context_used": 75000, // tokens consumed so far
"context_limit": 100000, // max tokens for this model
"tasks_remaining": 3, // how many tasks left to do
"has_unsaved_work": true, // is there work that hasn't been persisted?
"session_id": "string" // optional
}
Response 200
{
"action": "flush_now", // "continue" | "flush_now" | "wrap_up" | "emergency_save"
"reason": "Context 75.0% full with unsaved work.",
"context_pct": 75.0,
"tokens_remaining": 25000,
"budget_per_task": 7333,
"recommendation": "Save your work now, then continue. You have ~25000 tokens left and 3 tasks remaining (~7333 tokens/task).",
"request_id": "req_d4e5f6"
}
curl
curl -X POST https://cr-gateway-worker.jnowlan21.workers.dev/v1/context/check \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY" \
  -d '{
    "context_used": 75000,
    "context_limit": 100000,
    "tasks_remaining": 3,
    "has_unsaved_work": true
  }'
continue = under 75% used. flush_now = 75-90% or insufficient budget per task. wrap_up = 90%+ with no unsaved work. emergency_save = 90%+ with unsaved work. A reserve of 3,000 tokens is held for the final report.
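The decision rules above can be sketched as a pure function, which also shows where the example's budget_per_task of 7333 comes from: (25000 − 3000) / 3, rounded down. This is a reading of the documented thresholds, not the gateway's code. The docs do not say what counts as an "insufficient budget per task", so MIN_BUDGET_PER_TASK is a hypothetical value, and rounding down is an assumption.

```typescript
type ContextAction = "continue" | "flush_now" | "wrap_up" | "emergency_save";

interface ContextInput {
  context_used: number;
  context_limit: number;
  tasks_remaining: number;
  has_unsaved_work: boolean;
}

interface ContextAdvice {
  action: ContextAction;
  context_pct: number;
  tokens_remaining: number;
  budget_per_task: number;
}

const FINAL_REPORT_RESERVE = 3000; // documented reserve for the final report
const MIN_BUDGET_PER_TASK = 2000;  // hypothetical: the real cutoff is not documented

function contextCheck(input: ContextInput): ContextAdvice {
  // Percentage used, rounded to one decimal place.
  const pct = Math.round((input.context_used / input.context_limit) * 1000) / 10;
  const remaining = input.context_limit - input.context_used;
  const usable = Math.max(0, remaining - FINAL_REPORT_RESERVE);
  // Assumption: the per-task budget is rounded down.
  const budget = input.tasks_remaining > 0 ? Math.floor(usable / input.tasks_remaining) : usable;

  let action: ContextAction;
  if (pct >= 90) {
    action = input.has_unsaved_work ? "emergency_save" : "wrap_up";
  } else if (pct >= 75 || budget < MIN_BUDGET_PER_TASK) {
    action = "flush_now";
  } else {
    action = "continue";
  }
  return { action, context_pct: pct, tokens_remaining: remaining, budget_per_task: budget };
}
```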
Orchestrator NEW
Define multi-agent workflows as directed acyclic graphs (DAGs). The gateway fires webhook steps, auto-validates outputs at each node, tracks chain confidence across the entire workflow, and kills weak chains early to save LLM costs.
Create a workflow definition. Each step has a webhook URL, dependencies (other step IDs), and a failure policy. Steps with no dependencies are root steps and fire immediately when a run starts.
Request Body
{
  "name": "Data Pipeline",
  "description": "Extract, transform, load",
  "steps": [
    {
      "id": "extract",
      "name": "Extract Data",
      "webhook_url": "https://your-api.com/extract",
      "depends_on": [],
      "validate_output": true,
      "on_failure": "retry",
      "retry_count": 2
    },
    {
      "id": "transform",
      "name": "Transform Data",
      "webhook_url": "https://your-api.com/transform",
      "depends_on": ["extract"],
      "on_failure": "stop"
    },
    {
      "id": "load",
      "webhook_url": "https://your-api.com/load",
      "depends_on": ["transform"],
      "on_failure": "skip"
    }
  ]
}
Step Options
| Option | Description |
|---|---|
| id | Step ID (auto-generated if omitted) |
| webhook_url | HTTPS URL to POST when step is ready (required) |
| depends_on | Array of step IDs that must complete first |
| validate_output | Auto-validate output (default: true) |
| confidence_threshold | Per-step override (0-1) |
| on_failure | stop, skip, or retry |
| retry_count | Max retries (0-5, default 0) |
| timeout_ms | Step timeout (1-300s, default 30s) |
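Because a workflow must be a DAG, it can be useful to check a definition locally before submitting it. The sketch below uses Kahn's algorithm to produce a valid execution order, returning null when a dependency is unknown or the graph contains a cycle. The function and type names are hypothetical, and the docs do not say how the gateway itself validates definitions.

```typescript
interface StepDef {
  id: string;
  depends_on?: string[];
}

// Kahn's algorithm: repeatedly emit steps whose dependencies are all satisfied.
// Returns an execution order, or null for an unknown dependency or a cycle.
function topologicalOrder(steps: StepDef[]): string[] | null {
  const ids = steps.map((s) => s.id);
  const remainingDeps: Record<string, number> = {};
  const dependents: Record<string, string[]> = {};
  for (const id of ids) {
    remainingDeps[id] = 0;
    dependents[id] = [];
  }
  for (const step of steps) {
    for (const dep of step.depends_on ?? []) {
      if (ids.indexOf(dep) === -1) return null; // depends on a step that does not exist
      remainingDeps[step.id] += 1;
      dependents[dep].push(step.id);
    }
  }
  // Root steps have no dependencies and can run first.
  const queue = ids.filter((id) => remainingDeps[id] === 0);
  const order: string[] = [];
  while (queue.length > 0) {
    const id = queue.shift() as string;
    order.push(id);
    for (const next of dependents[id]) {
      remainingDeps[next] -= 1;
      if (remainingDeps[next] === 0) queue.push(next);
    }
  }
  // If some steps were never emitted, they sit on a cycle.
  return order.length === ids.length ? order : null;
}
```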
Start a workflow run. Root steps (no dependencies) fire immediately via webhook. The gateway POSTs a JSON payload to each step's webhook URL including the run ID, step ID, input data, and a callback_url for the agent to call when done.
Webhook Payload (sent to your agent)
{
"run_id": "run_abc123",
"workflow_id": "wf_def456",
"step_id": "extract",
"step_name": "Extract Data",
"input": { /* your run input */ },
"dependency_outputs": { /* outputs from completed upstream steps */ },
"chain_confidence": 0.92,
"callback_url": "https://cr-gateway.../runs/run_abc123/steps/extract/complete"
}
Report step completion. When your agent finishes its task, POST the result here. The gateway will:
- Validate the output (hallucination, safety, confidence, danger terms)
- Update chain confidence (product of all step confidences)
- If chain confidence drops below threshold — kill remaining steps (swarm fail-fast)
- Fire downstream steps whose dependencies are now met
- Pass dependency outputs to downstream steps so they have context
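The scheduling part of the steps listed above can be sketched as two small functions: one that updates chain confidence and decides whether to kill the run, and one that finds the steps whose dependencies are now met. The names are hypothetical, and the sketch ignores retries and skipped steps, whose effect on downstream dependencies is not documented.

```typescript
interface WorkflowStep {
  id: string;
  depends_on: string[];
}

// Chain confidence is the running product of step confidences.
// The run is killed when the product drops below the threshold.
function updateChainConfidence(
  current: number,
  stepConfidence: number,
  threshold: number
): { chain_confidence: number; kill: boolean } {
  const next = current * stepConfidence;
  return { chain_confidence: next, kill: next < threshold };
}

// Steps that have not started yet and whose dependencies have all completed.
// With nothing completed, this returns the root steps.
function readySteps(steps: WorkflowStep[], completed: string[], started: string[]): string[] {
  return steps
    .filter((s) => completed.indexOf(s.id) === -1 && started.indexOf(s.id) === -1)
    .filter((s) => s.depends_on.every((d) => completed.indexOf(d) !== -1))
    .map((s) => s.id);
}
```

For the Data Pipeline example, the root step extract is ready at the start, and transform becomes ready once extract completes.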
Request Body
{
"output": "Extracted 500 records from source",
"confidence": 0.92,
"type": "extraction_result"
}
Send "status": "failure" with an "error" field to report failures. The gateway will apply the step's on_failure policy (stop, skip, or retry).
Get full run status including per-step state, chain confidence, progress counts, and duration.
Cancel a running workflow. All pending and running steps are immediately cancelled.
List all workflow definitions. Returns name, step count, and creation date for each workflow. Max workflows per tier: Free=3, Pro=25, Scale=100, Enterprise=1000.
POST /v1/onboard
Self-service tenant signup. Creates a company, generates an API key, and activates the free tier instantly. No authentication is required, since this is how you get a key.
Request Body
{
"company_name": "Acme AI Corp", // min 2 characters
"email": "dev@acme.ai" // valid email, must be unique
}
Response 201
{
"company_id": "uuid",
"company_name": "Acme AI Corp",
"api_key": "bc_live_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
"tier": "free",
"message": "Welcome to CR Gateway! Save your API key - it won't be shown again.",
"quickstart": {
"validate": "curl -X POST ..."
}
}
curl
curl -X POST https://cr-gateway-worker.jnowlan21.workers.dev/v1/onboard \
  -H "Content-Type: application/json" \
  -d '{
    "company_name": "Acme AI Corp",
    "email": "dev@acme.ai"
  }'
Signing up with an email that is already registered returns 409 Conflict.
POST /v1/feedback
Report whether a previously validated message was correct or incorrect. This data trains your per-tenant confidence calibration oracle. After 50 data points, the oracle auto-builds an isotonic regression model that calibrates future /v1/validate confidence scores.
Request Body
{
"request_id": "req_a1b2c3d4e5", // from the /v1/validate response
"outcome": 1, // 1 = correct, 0 = incorrect
"confidence": 0.85, // optional if stored from original request
"vertical": "freight", // optional — enables per-vertical calibration
"message_type": "analysis" // optional — enables per-type calibration
}
Response 200
{
"recorded": true,
"total_feedback": 52,
"model_status": "ready", // "insufficient_data" | "building" | "ready"
"model_ece": 0.032, // expected calibration error (lower = better)
"request_id": "req_x1y2z3",
"latency_ms": 45
}
curl
curl -X POST https://cr-gateway-worker.jnowlan21.workers.dev/v1/feedback \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY" \
  -d '{
    "request_id": "req_a1b2c3d4e5",
    "outcome": 1,
    "vertical": "freight"
  }'
The confidence from the original /v1/validate call is looked up automatically, so you only need to pass request_id and outcome.
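For readers unfamiliar with isotonic regression, here is a minimal sketch of the standard pool-adjacent-violators algorithm applied to (confidence, outcome) feedback. It illustrates the general technique only. The gateway's actual model, its handling of tied confidences, and its interpolation between points are not documented, and the 50-point minimum is omitted. All names are hypothetical.

```typescript
interface FeedbackPoint {
  confidence: number;
  outcome: 0 | 1; // 1 = correct, 0 = incorrect
}

// A pooled block of adjacent points that share one calibrated value (sum / count).
interface Block {
  minConf: number;
  maxConf: number;
  sum: number;
  count: number;
}

// Pool-adjacent-violators: sort by confidence, then merge neighbouring blocks
// until the block means are non-decreasing.
function fitIsotonic(points: FeedbackPoint[]): Block[] {
  const sorted = points.slice().sort((a, b) => a.confidence - b.confidence);
  const blocks: Block[] = [];
  for (const p of sorted) {
    blocks.push({ minConf: p.confidence, maxConf: p.confidence, sum: p.outcome, count: 1 });
    while (blocks.length >= 2) {
      const last = blocks[blocks.length - 1];
      const prev = blocks[blocks.length - 2];
      if (prev.sum / prev.count <= last.sum / last.count) break; // already monotone
      blocks.splice(blocks.length - 2, 2, {
        minConf: prev.minConf,
        maxConf: last.maxConf,
        sum: prev.sum + last.sum,
        count: prev.count + last.count,
      });
    }
  }
  return blocks;
}

// Step-function lookup: use the last block that starts at or below the raw score.
// Falls back to the raw score when no model has been fitted.
function calibrate(blocks: Block[], raw: number): number {
  if (blocks.length === 0) return raw;
  let value = blocks[0].sum / blocks[0].count;
  for (const b of blocks) {
    if (b.minConf <= raw) value = b.sum / b.count;
  }
  return value;
}
```

With feedback such as (0.6, correct), (0.7, incorrect), (0.8, correct), (0.9, correct), the first two points pool into one block with value 0.5, so a raw confidence of 0.65 would calibrate to 0.5.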
GET /v1/oracle/status
Check the status of your confidence calibration oracle. Returns model metadata if trained, or guidance on how to start training.
Response — No model yet
{
"status": "no_model",
"message": "Submit feedback via POST /v1/feedback to train your confidence oracle. Minimum 50 data points required.",
"feedback_count": 0
}
Response — Model ready
{
"status": "ready",
"trained_at": "2026-03-16T12:00:00.000Z",
"sample_count": 250,
"ece": 0.028,
"calibration_points": 18,
"verticals": ["freight", "insurance"]
}
curl
curl https://cr-gateway-worker.jnowlan21.workers.dev/v1/oracle/status \
  -H "X-API-Key: YOUR_API_KEY"
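The ece value reported above is the expected calibration error. One common way to compute it is sketched below: group predictions into equal-width confidence bins, then take the weighted average of the gap between each bin's mean confidence and its observed accuracy. The ten-bin default and the function name are assumptions, since the gateway's binning scheme is not documented.

```typescript
// Expected calibration error with equal-width bins over [0, 1].
function expectedCalibrationError(
  points: { confidence: number; outcome: number }[],
  bins: number = 10
): number {
  if (points.length === 0) return 0;
  const sumConf: number[] = [];
  const sumOutcome: number[] = [];
  const count: number[] = [];
  for (let i = 0; i < bins; i++) {
    sumConf.push(0);
    sumOutcome.push(0);
    count.push(0);
  }
  for (const p of points) {
    // A confidence of exactly 1.0 goes into the last bin.
    const idx = Math.min(bins - 1, Math.floor(p.confidence * bins));
    sumConf[idx] += p.confidence;
    sumOutcome[idx] += p.outcome;
    count[idx] += 1;
  }
  let ece = 0;
  for (let i = 0; i < bins; i++) {
    if (count[i] === 0) continue;
    const accuracy = sumOutcome[i] / count[i];
    const meanConf = sumConf[i] / count[i];
    // Weight each bin's gap by its share of all points.
    ece += (count[i] / points.length) * Math.abs(accuracy - meanConf);
  }
  return ece;
}
```

A perfectly calibrated set (for example, five predictions at 0.8 of which four are correct) scores close to 0, which is why lower values are better.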
GET /v1/config
Retrieve your current tenant configuration, including confidence threshold, danger terms, guard checks, and tier.
Response 200
{
"company_id": "uuid",
"company_name": "Acme AI Corp",
"config": {
"confidence_threshold": 0.65,
"danger_terms": [],
"guard_checks": ["reasoning_leakage", "confidence_calibration"],
"allowed_types": null, // null = accept any type
"kill_switch": false,
"safe_mode": false,
"tier": "free"
}
}
curl
curl https://cr-gateway-worker.jnowlan21.workers.dev/v1/config \
  -H "X-API-Key: YOUR_API_KEY"
PUT /v1/config
Update your tenant configuration. Only the fields you include will be changed. You can adjust confidence threshold, danger terms, guard checks, and allowed message types.
Request Body
{
"confidence_threshold": 0.70, // 0-1
"danger_terms": ["guaranteed", "risk-free"], // your custom blocked terms
"guard_checks": [ // which hallucination checks to run
"reasoning_leakage",
"confidence_calibration",
"rate_consistency",
"fabrication_markers"
],
"allowed_types": ["analysis", "summary"] // restrict to specific types, or null for any
}
Response 200
{
"updated": true,
"config": { /* full updated config object */ }
}
curl
curl -X PUT https://cr-gateway-worker.jnowlan21.workers.dev/v1/config \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY" \
  -d '{
    "confidence_threshold": 0.70,
    "danger_terms": ["guaranteed", "risk-free"]
  }'
Available guard checks: reasoning_leakage (detects exposed chain-of-thought), confidence_calibration (flags miscalibrated confidence), rate_consistency (checks numeric consistency), fabrication_markers (catches fabricated data patterns).
GET /v1/usage
Get daily usage statistics for your account. Returns request counts and storage bytes per day.
Query Parameters
| Param | Type | Default | Description |
|---|---|---|---|
| days | integer | 30 | Number of days to look back (max 90) |
Response 200
{
"company_id": "uuid",
"tier": "free",
"usage": [
{ "date": "2026-03-16", "requests": 142, "bytes_stored": 28400 },
{ "date": "2026-03-15", "requests": 98, "bytes_stored": 19600 }
]
}
curl
curl "https://cr-gateway-worker.jnowlan21.workers.dev/v1/usage?days=7" \
  -H "X-API-Key: YOUR_API_KEY"
GET /health
Health check endpoint. Returns the service status and current timestamp. No authentication required.
Response 200
{
"status": "ok",
"service": "cr-gateway",
"timestamp": "2026-03-16T12:00:00.000Z"
}
curl
curl https://cr-gateway-worker.jnowlan21.workers.dev/health
SDK
The official TypeScript/JavaScript SDK wraps every endpoint with full type safety.
Install
npm install @cipherandrow/gateway
Usage
import { CRGateway } from '@cipherandrow/gateway';

// Initialize with your API key
const gw = new CRGateway('bc_live_your_api_key');

// Validate an LLM response
const result = await gw.validate({
  type: 'analysis',
  content: 'Revenue grew 15% in Q4.',
  confidence: 0.88,
});
console.log(result.valid); // true

// Store a message
const stored = await gw.store(
  { type: 'analysis', content: 'Revenue grew 15%.', confidence: 0.88 },
  'session_123'
);
console.log(stored.message_id); // "resp_..."

// Swarm fail-fast check
const swarm = await gw.swarmCheck([
  { agent_id: 'planner', confidence: 0.92 },
  { agent_id: 'researcher', confidence: 0.87 },
  { agent_id: 'writer', confidence: 0.91 },
]);
if (!swarm.proceed) {
  console.log('Killing chain:', swarm.reason);
}

// Compress context
// (messages is your conversation history: an array of { role, content } objects)
const compressed = await gw.compress(messages, {
  strategy: 'summarize',
  format: 'both',
});
console.log(compressed.savings_percent + '% saved');

// Context window management
const ctx = await gw.contextCheck({
  context_used: 75000,
  context_limit: 100000,
  tasks_remaining: 3,
  has_unsaved_work: true,
});
if (ctx.action === 'emergency_save') {
  // Save everything immediately
}

// Update configuration
await gw.updateConfig({
  confidence_threshold: 0.70,
  danger_terms: ['guaranteed', 'risk-free'],
});

// Check usage
const usage = await gw.getUsage(7);
console.log(usage.usage);
Error Handling
import { CRGateway, GatewayError } from '@cipherandrow/gateway';

try {
  await gw.validate(message);
} catch (e) {
  if (e instanceof GatewayError) {
    console.log(e.status);    // HTTP status code
    console.log(e.message);   // Error message
    console.log(e.requestId); // Request ID for support
  }
}
Error Handling
All errors return JSON with a consistent shape:
{
"error": "Human-readable error message"
}
HTTP Status Codes
| Code | Meaning | When |
|---|---|---|
| 200 | OK | Successful request |
| 201 | Created | Successful onboard or store |
| 400 | Bad Request | Missing or invalid fields in request body |
| 401 | Unauthorized | Missing or invalid API key |
| 403 | Forbidden | API key valid but lacks permission |
| 404 | Not Found | Unknown endpoint or message ID |
| 409 | Conflict | Duplicate email on onboard |
| 415 | Unsupported Media Type | Content-Type is not application/json |
| 422 | Unprocessable Entity | Message failed validation checks |
| 429 | Too Many Requests | Rate limit exceeded (includes Retry-After header) |
| 500 | Internal Error | Unexpected server error |
| 502 | Bad Gateway | Relay delivery failed (webhook returned error) |
Responses also carry a request_id field. Include it when contacting support for faster debugging.
Rate Limits
Rate limits are enforced per API key based on your tier. Exceeding the limit returns 429 Too Many Requests.
| Tier | Requests / Day | Storage | Retention | Price |
|---|---|---|---|---|
| Free | 1,000 | 10 MB | 30 days | $0 |
| Pro | 16,667 (~500K/mo) | 5 GB | 90 days | $49/mo |
| Scale | 166,667 (~5M/mo) | 50 GB | 365 days | $199/mo |
| Enterprise | Unlimited | Unlimited | 365 days | Contact us |
All tiers include access to every endpoint. The free tier defaults to 2 guard checks (reasoning_leakage and confidence_calibration). Pro and above unlock all 4 guard checks.
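When a request is rejected with 429, a client can use the Retry-After header to decide how long to wait. The sketch below is one way to do that and makes several assumptions: the docs do not say whether the gateway sends Retry-After as a number of seconds or as an HTTP date, so both standard forms are handled, and the exponential fallback (1s, 2s, 4s, capped at 60s) is a hypothetical policy for when the header is missing or unparseable.

```typescript
// Returns how many milliseconds to wait before retrying a 429 response.
// retryAfter: raw Retry-After header value, or null if absent.
// attempt: zero-based retry attempt, used only for the fallback backoff.
// nowMs: current time in epoch milliseconds (passed in to keep the function pure).
function retryDelayMs(retryAfter: string | null, attempt: number, nowMs: number): number {
  if (retryAfter !== null) {
    const trimmed = retryAfter.trim();
    // Form 1: delay in whole seconds, e.g. "30".
    if (/^\d+$/.test(trimmed)) {
      return Number(trimmed) * 1000;
    }
    // Form 2: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    const dateMs = Date.parse(trimmed);
    if (!isNaN(dateMs)) {
      return Math.max(0, dateMs - nowMs);
    }
  }
  // Assumed fallback: exponential backoff capped at 60 seconds.
  return Math.min(60000, 1000 * Math.pow(2, attempt));
}
```

A caller would typically pass Date.now() as nowMs and sleep for the returned duration before retrying.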
Base URL: https://cr-gateway-worker.jnowlan21.workers.dev