How Gatekeeper Routes Your AI Requests
Every request travels through a deterministic, cryptographically verified pipeline. Token validation is offline. Rate limiting uses CRDTs. Every request is logged to JetStream.
Request Flow
Tokens are validated offline: no network call, no OCSP, no PKI server. The token IS the credential.
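To make the idea of offline validation concrete, here is a minimal sketch of a node checking a signed token using only a locally held public key. Everything about the token layout is an assumption made for illustration: the `payload.signature` format, the Ed25519 signature scheme, the `Claims` fields, and the helper names `NewKeyPair`, `Sign`, and `VerifyOffline` are hypothetical and not taken from Gatekeeper itself.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
	"time"
)

// Claims is a hypothetical payload; Gatekeeper's real token fields are not documented here.
type Claims struct {
	KeyID     string `json:"kid"`
	ExpiresAt int64  `json:"exp"` // Unix seconds
}

// NewKeyPair generates an Ed25519 key pair from the system's secure random source.
func NewKeyPair() (ed25519.PublicKey, ed25519.PrivateKey, error) {
	return ed25519.GenerateKey(rand.Reader)
}

// Sign builds "<payload>.<signature>", both base64url-encoded (assumed format).
func Sign(priv ed25519.PrivateKey, c Claims) (string, error) {
	payload, err := json.Marshal(c)
	if err != nil {
		return "", err
	}
	sig := ed25519.Sign(priv, payload)
	enc := base64.RawURLEncoding
	return enc.EncodeToString(payload) + "." + enc.EncodeToString(sig), nil
}

// VerifyOffline checks the signature and expiry with no network access:
// the only inputs are the public key, the token, and the current time.
func VerifyOffline(pub ed25519.PublicKey, token string, nowUnix int64) (Claims, error) {
	var c Claims
	parts := strings.Split(token, ".")
	if len(parts) != 2 {
		return c, errors.New("malformed token")
	}
	enc := base64.RawURLEncoding
	payload, err := enc.DecodeString(parts[0])
	if err != nil {
		return c, errors.New("bad payload encoding")
	}
	sig, err := enc.DecodeString(parts[1])
	if err != nil {
		return c, errors.New("bad signature encoding")
	}
	if !ed25519.Verify(pub, payload, sig) {
		return c, errors.New("invalid signature")
	}
	if err := json.Unmarshal(payload, &c); err != nil {
		return c, errors.New("bad payload")
	}
	if nowUnix >= c.ExpiresAt {
		return c, errors.New("token expired")
	}
	return c, nil
}

func main() {
	pub, priv, err := NewKeyPair()
	if err != nil {
		panic(err)
	}
	now := time.Now()
	token, err := Sign(priv, Claims{KeyID: "demo-key", ExpiresAt: now.Add(time.Hour).Unix()})
	if err != nil {
		panic(err)
	}
	claims, err := VerifyOffline(pub, token, now.Unix())
	fmt.Println(claims.KeyID, err)
}
```

The signature is verified before the payload is parsed, so untrusted JSON is never decoded unless it was produced by the holder of the private key.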
Distributed nodes agree on budget without a coordinator. No Redis, no central DB, no SPOF.
Audit, load shedding, health tracking, and routing state are validated in the target environment.
3 Routing Strategies
Select per request via the X-Routing-Mode header, or set a default per API key.
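As a rough illustration of that precedence, the sketch below resolves the routing mode from the `X-Routing-Mode` header and falls back to the API key's default. The mode names (`cost`, `latency`, `quality`) and the `ResolveMode` helper are placeholders: this page names the header but not the values it accepts.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// knownModes holds placeholder mode names; the real accepted values are an assumption.
var knownModes = map[string]bool{"cost": true, "latency": true, "quality": true}

// ResolveMode prefers a recognized X-Routing-Mode value and otherwise
// falls back to the default configured for the API key.
func ResolveMode(headerValue, keyDefault string) string {
	m := strings.ToLower(strings.TrimSpace(headerValue))
	if knownModes[m] {
		return m
	}
	return keyDefault
}

func main() {
	h := http.Header{}
	h.Set("X-Routing-Mode", "Latency")

	// Header present and recognized: it overrides the key default.
	fmt.Println(ResolveMode(h.Get("X-Routing-Mode"), "cost"))

	// Header absent: the key default applies.
	fmt.Println(ResolveMode(http.Header{}.Get("X-Routing-Mode"), "cost"))
}
```

Falling back silently on an unrecognized value is one possible design. A stricter gateway might instead reject the request so that a typo in the header does not go unnoticed.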
Routes to the cheapest model that meets the quality threshold. Simple queries go to Llama on Groq at $0.00008/1K tokens. Complex queries escalate to GPT-4o or Claude only when the task complexity score requires it.
- Cost reduction validated during assisted onboarding
- Task complexity scored before routing
- Quality threshold configurable per API key
- Cost tracked via CRDT GCounter across nodes
→ llama-3.1-8b on Groq ($0.00008/1K)
→ claude-3-5-sonnet ($0.003/1K)
→ deepseek-r1 ($0.0005/1K)
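The cost-first rule described above can be sketched as a small selection function: filter out models below the required quality, then take the cheapest of what remains. The prices are the ones quoted on this page. The quality scores, the `Model` type, and the idea that the task complexity score maps directly onto a required quality value are assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// Model pairs a price with a hypothetical capability score.
type Model struct {
	Name      string
	CostPer1K float64 // USD per 1K tokens, as quoted on this page
	Quality   float64 // assumed 0..1 score, invented for illustration
}

// Catalog returns the three example models with made-up quality scores.
func Catalog() []Model {
	return []Model{
		{Name: "claude-3-5-sonnet", CostPer1K: 0.003, Quality: 0.9},
		{Name: "llama-3.1-8b", CostPer1K: 0.00008, Quality: 0.5},
		{Name: "deepseek-r1", CostPer1K: 0.0005, Quality: 0.75},
	}
}

// PickCheapest returns the lowest-cost model whose quality meets the requirement.
func PickCheapest(models []Model, required float64) (Model, error) {
	var best Model
	found := false
	for _, m := range models {
		if m.Quality < required {
			continue // below the threshold, never eligible
		}
		if !found || m.CostPer1K < best.CostPer1K {
			best, found = m, true
		}
	}
	if !found {
		return Model{}, errors.New("no model meets the quality threshold")
	}
	return best, nil
}

func main() {
	for _, required := range []float64{0.3, 0.7, 0.85} {
		m, err := PickCheapest(Catalog(), required)
		if err != nil {
			fmt.Println(required, "->", err)
			continue
		}
		fmt.Printf("required %.2f -> %s ($%.5f/1K)\n", required, m.Name, m.CostPer1K)
	}
}
```

With these assumed scores, a low requirement resolves to the Llama model and only the highest requirement reaches Claude, which mirrors the escalation behavior described above.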
JetStream Stream Targets
Stream provisioning, load shedding, audit, and health tracking are validated during assisted onboarding.
BH_AUDIT: All requests; append-only, tamper-evident
BH_EVENTS: Provider failover + routing decisions
BH_HEALTH: Provider p99 latency time-series
BH_ALERTS: Budget exhaustion + SLA breach alerts
BH_SESSIONS: API key session state
BH_CONFIG: Routing rules + capability matrix (KV)
Failover Validation
Provider error classes, retry order, timing, health tracking, and client response behavior are tested in the target environment before hard failover claims are published.
CRDT Token Budget
Distributed rate limiting targets a GCounter CRDT from internal/crdt/. Usage gossip, convergence, and budget behavior are validated during assisted onboarding.
GCounter.Value(key) <= budget // always consistent
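For readers unfamiliar with the data structure, here is a minimal sketch of a keyed grow-only counter. It is not the code in internal/crdt/: the type layout, the `Add` and `Merge` signatures, and the node and key names are assumptions, and the sketch is not safe for concurrent use.

```go
package main

import "fmt"

// GCounter keeps one monotonically increasing slot per node, per API key.
type GCounter struct {
	counts map[string]map[string]uint64 // API key -> node ID -> tokens used
}

func NewGCounter() *GCounter {
	return &GCounter{counts: map[string]map[string]uint64{}}
}

// Add records n tokens consumed for key, attributed to the local node.
func (g *GCounter) Add(key, node string, n uint64) {
	if g.counts[key] == nil {
		g.counts[key] = map[string]uint64{}
	}
	g.counts[key][node] += n
}

// Value sums every node's slot for the key; unknown keys report zero.
func (g *GCounter) Value(key string) uint64 {
	var total uint64
	for _, n := range g.counts[key] {
		total += n
	}
	return total
}

// Merge takes the per-node maximum. Max is commutative, associative, and
// idempotent, so replicas converge regardless of gossip order or repeats.
func (g *GCounter) Merge(other *GCounter) {
	for key, nodes := range other.counts {
		if g.counts[key] == nil {
			g.counts[key] = map[string]uint64{}
		}
		for node, n := range nodes {
			if n > g.counts[key][node] {
				g.counts[key][node] = n
			}
		}
	}
}

func main() {
	a, b := NewGCounter(), NewGCounter()
	a.Add("key-1", "node-a", 300)
	b.Add("key-1", "node-b", 200)

	// Exchange state in both directions, as a gossip round would.
	a.Merge(b)
	b.Merge(a)

	const budget = 1000
	fmt.Println(a.Value("key-1"), b.Value("key-1"), a.Value("key-1") <= budget)
}
```

Because each node only ever increments its own slot, merging never loses usage. Until a merge arrives, a node's local `Value` is a lower bound on the global total, so convergence speed depends on how often usage is gossiped.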
Zero Single Point of Failure
Gatekeeper runs as multiple stateless nodes. JetStream provides durable message delivery and audit. BH_HEALTH KV tracks provider state. Any node can handle any request: no sticky sessions, no shared mutable state outside CRDTs.