Quick Start
Get adaptive gating running in your application in three steps.
1. Create an API key pair
Sign up and create a key pair in the dashboard. You'll get a publishKey (identifies your org) and a secretKey (signs telemetry). The secret is shown once.
2. Install and configure the SDK
npm install @waitstate/sdk

import { WaitState } from '@waitstate/sdk';

const ws = new WaitState({
  publishKey: process.env.WAITSTATE_PUBLISH_KEY,
  secretKey: process.env.WAITSTATE_SECRET_KEY,
});

3. Gate traffic
Call gate(tag, weight) before processing a request. It's synchronous, in-memory, and never makes a network call.
const decision = ws.gate('free', 1);
if (decision.allowed) {
  // proceed with request
} else {
  // respond with 429 or queue
  res.status(429).json({ error: 'rate_limited' });
}

gate() is always fast, always synchronous, always fail-open.
Tags & Weights
Tags categorize traffic. Weights determine priority. When the reflex engine fires, low-weight traffic is shed first.
- Pass a tag and weight to gate() on each call.
- Create tags in the dashboard with default weights. The SDK doesn't need to know about tags ahead of time.
- Reflex rules can target specific tags (e.g. block "free" when latency > 500ms) or apply to all traffic.
How weight-based gating works
The control plane sets a globalMaxWeight and per-tag tagMaxWeights in the policy. When you call gate(tag, weight), the SDK compares your weight against these limits. If your weight exceeds the max, you're denied.
Example: during a latency spike, the policy might set globalMaxWeight: 5. Traffic with weight 1 ("free") is within that limit, so it is denied only if the policy also restricts its tag (for example tagMaxWeights: {"free": 0}). Traffic with weight 10 ("enterprise") is denied because 10 > 5, unless the policy sets tagMaxWeights to {"enterprise": Infinity} to exempt that tag.
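The comparison described above can be sketched as a small pure function. This is an illustration, not the SDK's implementation: the names PolicySketch and gateSketch are hypothetical, and it assumes that a per-tag limit, when present, replaces the global limit (which is what the exemption example implies).

```typescript
// Policy fields named in the text; the type and function names are hypothetical.
interface PolicySketch {
  globalMaxWeight: number;
  tagMaxWeights: Record<string, number>;
}

interface GateResultSketch {
  allowed: boolean;
  reason: "allowed" | "tag_blocked" | "over_weight";
}

// Assumption: a per-tag limit, when present, replaces the global limit.
function gateSketch(policy: PolicySketch, tag: string, weight: number): GateResultSketch {
  const tagMax: number | undefined = policy.tagMaxWeights[tag];
  // A tag limit of 0 means the tag is fully blocked.
  if (tagMax === 0) return { allowed: false, reason: "tag_blocked" };
  const limit = tagMax === undefined ? policy.globalMaxWeight : tagMax;
  // A call is denied when its weight exceeds the applicable limit.
  return weight > limit
    ? { allowed: false, reason: "over_weight" }
    : { allowed: true, reason: "allowed" };
}

// Policy during a latency spike: free is blocked, enterprise is exempt.
const spikePolicy: PolicySketch = {
  globalMaxWeight: 5,
  tagMaxWeights: { free: 0, enterprise: Infinity },
};
```

With spikePolicy, a call tagged "pro" (no per-tag entry) falls back to the global limit of 5, so weight 3 passes and weight 7 is denied.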
Reflex Rules
Reflex rules define automated responses to backend health metrics. They run in the control plane's PulseAggregator and affect the policy that gets pushed to your SDK instances.
Rule schema
{
  "tagName": "free",
  "metric": "latency",
  "operator": "gt",
  "threshold": 500,
  "action": "block",
  "actionValue": null,
  "enabled": true,
  "priority": 1
}

| Field | Type | Description |
|---|---|---|
| tagName | string or null | Target tag. null means all traffic. |
| metric | string | Metric name: latency, errors, p50_latency, p95_latency, p99_latency, or any custom metric name. |
| operator | string | gt, gte, lt, lte. |
| threshold | number | Value to compare against. |
| action | string | throttle or block. |
| actionValue | number or null | Multiplier for throttle (e.g. 0.5 = halve allowed weight). Null for block. |
| enabled | boolean | Whether the rule is active. Defaults to true. |
| priority | number | Evaluation order. Lower = higher priority. |
Throttle vs Block
| Action | Effect on policy | Use when |
|---|---|---|
| block | Sets the tag's max weight to 0. All traffic for that tag is denied. | Origin is in danger. Shed this traffic class entirely. |
| throttle | Multiplies the tag's max weight by actionValue. E.g. 0.5 halves it. | Origin is stressed but not critical. Reduce load gradually. |
Throttle example
{
  "tagName": "pro",
  "metric": "latency",
  "operator": "gt",
  "threshold": 500,
  "action": "throttle",
  "actionValue": 0.5,
  "enabled": true,
  "priority": 2
}

When latency exceeds 500ms, the reflex engine multiplies pro's max weight by 0.5. If the default max weight for pro is 10, it drops to 5. A gate('pro', 5) call still passes, but gate('pro', 7) would be denied.
Scenario: layered rules
Three rules, evaluated by priority:
1. Block free tier when latency is critical
{ "tagName": "free", "metric": "latency", "operator": "gt", "threshold": 1000, "action": "block", "priority": 1 }

2. Throttle free tier when latency is elevated

{ "tagName": "free", "metric": "latency", "operator": "gt", "threshold": 500, "action": "throttle", "actionValue": 0.5, "priority": 2 }

3. Throttle pro tier when errors spike

{ "tagName": "pro", "metric": "errors", "operator": "gt", "threshold": 50, "action": "throttle", "actionValue": 0.7, "priority": 3 }

Progressive load shedding
As conditions worsen, more rules fire and lower tiers are shed first:
| Conditions | Rules fired | free | pro | enterprise |
|---|---|---|---|---|
| latency: 80ms, errors: 2 | none | allowed | allowed | allowed |
| latency: 600ms | #2 | throttled | allowed | allowed |
| latency: 1200ms | #1 | blocked | allowed | allowed |
| latency: 1200ms, errors: 60 | #1, #3 | blocked | throttled | allowed |
Manage rules in the dashboard or via the management API.
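To make the layered scenario concrete, here is a minimal sketch of how rules like these could be turned into per-tag max weights. It is not the PulseAggregator's actual code. It assumes rules are visited in ascending priority and that only the first matching rule per tag takes effect (which reproduces the table above), that a missing metric evaluates as zero, and that each tier has a default max weight of 10; evaluateRules, layeredRules, and defaultMaxWeights are hypothetical names.

```typescript
type Operator = "gt" | "gte" | "lt" | "lte";

interface ReflexRule {
  tagName: string | null; // null targets every known tag
  metric: string;
  operator: Operator;
  threshold: number;
  action: "throttle" | "block";
  actionValue?: number | null;
  enabled?: boolean; // treated as true when omitted
  priority: number; // lower runs first
}

function matches(value: number, op: Operator, threshold: number): boolean {
  switch (op) {
    case "gt": return value > threshold;
    case "gte": return value >= threshold;
    case "lt": return value < threshold;
    case "lte": return value <= threshold;
  }
}

// Assumption: first matching rule per tag wins, in ascending priority order.
function evaluateRules(
  rules: ReflexRule[],
  metrics: Record<string, number>,
  defaults: Record<string, number>,
): Record<string, number> {
  const result: Record<string, number> = { ...defaults };
  const decided = new Set<string>();
  const ordered = rules
    .filter((rule) => rule.enabled !== false)
    .sort((a, b) => a.priority - b.priority);
  for (const rule of ordered) {
    const observed = metrics[rule.metric] ?? 0; // missing metric evaluates as zero
    if (!matches(observed, rule.operator, rule.threshold)) continue;
    const targets = rule.tagName === null ? Object.keys(defaults) : [rule.tagName];
    for (const tag of targets) {
      const base = defaults[tag];
      if (base === undefined || decided.has(tag)) continue;
      decided.add(tag);
      // block sets the max weight to 0; throttle multiplies it by actionValue.
      result[tag] = rule.action === "block" ? 0 : base * (rule.actionValue ?? 1);
    }
  }
  return result;
}

// The three rules from the scenario, with hypothetical default max weights.
const layeredRules: ReflexRule[] = [
  { tagName: "free", metric: "latency", operator: "gt", threshold: 1000, action: "block", priority: 1 },
  { tagName: "free", metric: "latency", operator: "gt", threshold: 500, action: "throttle", actionValue: 0.5, priority: 2 },
  { tagName: "pro", metric: "errors", operator: "gt", threshold: 50, action: "throttle", actionValue: 0.7, priority: 3 },
];
const defaultMaxWeights: Record<string, number> = { free: 10, pro: 10, enterprise: 10 };
```

Calling evaluateRules(layeredRules, { latency: 600 }, defaultMaxWeights) halves the free limit and leaves pro and enterprise untouched, matching the second row of the table.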
Pulse & Policy Flow
The SDK communicates with the control plane through two channels. You don't interact with either directly.
Pulses (SDK → Control Plane)
The SDK periodically sends a pulse containing gate() call counts, per-tag metrics, and process-level latency and error counts. Pulses are HMAC-signed with your secret key.
Pulse headers
POST /v1/pulse
x-waitstate-id: ws_pub_xxx # your publishKey
x-waitstate-signature: <hmac-hex> # HMAC-SHA256(body + '.' + timestamp, secretKey)
x-waitstate-timestamp: 1740000060000
content-type: application/json

Pulse payload
{
  "instanceId": "web-01",
  "siteId": "site-prod",
  "usageDelta": 967,
  "bouncedUnits": 47,
  "metrics": { "latency": 142, "errors": 0 },
  "tagMetrics": [
    { "tag": "free", "latency": 0, "count": 847 },
    {
      "tag": "pro",
      "latency": 0,
      "count": 120,
      "customMetrics": {
        "queue_depth": { "min": 3, "max": 87, "sum": 450, "count": 10 }
      }
    }
  ],
  "ts": 1740000060000
}

| Field | Description |
|---|---|
| instanceId | Unique identifier for this SDK instance. Defaults to randomUUID() if not set. |
| siteId | Optional. Shards this instance into a separate aggregator for independent health tracking. |
| usageDelta | Number of gate() calls since the last pulse. Used for billing and plan cap enforcement. |
| bouncedUnits | Number of gate() calls that were denied since the last pulse. |
| metrics | Process-level metrics: latency (average ms), errors (count). The reflex engine evaluates rules against these. |
| tagMetrics | Array of per-tag counters. Each entry has tag (name), latency (avg ms), count (calls), and optional customMetrics (min/max/sum/count aggregates from report()). Drives per-tag reflex rules. |
| ts | Timestamp in epoch milliseconds. Must match the x-waitstate-timestamp header used in HMAC signing. |
Policy (Control Plane → SDK)
The pulse response contains the current policy. The SDK also polls GET /v1/policy/{orgId} with a JWT for edge-cached reads. Both sources update the same in-memory policy cache.
Policy shape
{
  "globalMaxWeight": 5,
  "tagMaxWeights": { "free": 0, "pro": 3 },
  "pulseInterval": 2000,
  "leaseDurationSeconds": 120,
  "status": "ok" // "ok" | "cap_exceeded"
}

| Field | Description |
|---|---|
| globalMaxWeight | Maximum weight allowed globally. Traffic with weight above this is denied. |
| tagMaxWeights | Per-tag weight overrides. A tag set to 0 is fully blocked. |
| pulseInterval | Server-controlled pulse interval in ms. SDK adjusts automatically. |
| killSignal | Emergency kill. SDK stops pulsing, resets to fail-open, sleeps 24h. |
| status | ok, over_limit, or cap_exceeded. When cap is exceeded, SDK fails open and backs off to 5-minute pulses. |
| leaseDurationSeconds | How long the SDK trusts the current policy without hearing from the control plane. If the SDK doesn't receive a successful pulse or policy response within this window, it enters safe mode (behavior is set by safeModeStrategy; the fixed_rps strategy defaults to 50 req/sec). Varies by plan: Hobby 300s, Pro 120s, Enterprise 60s. |
gate() never makes a network call. It reads from the in-memory policy cache. Pulses and policy fetches happen in the background on a timer. This is why gate() is synchronous and sub-millisecond.
Authentication
WaitState uses a two-key model:
| Key | Purpose | Visibility |
|---|---|---|
| publishKey | Identifies your organization. Sent as x-waitstate-id on pulses. | Semi-public (in your app code). |
| secretKey | Signs pulses (HMAC-SHA256). Exchanged for a JWT. Never sent as plaintext on the wire. | Secret. Environment variable only. |
Pulse auth (HMAC-SHA256)
Each pulse is signed: HMAC-SHA256(body + '.' + timestamp, secretKey). The signature is sent as x-waitstate-signature. The control plane verifies it before accepting the pulse.
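For readers who want to see the signing step spelled out, the sketch below uses Node's built-in crypto module. It follows the formula given with the pulse headers (body, a literal '.', then the timestamp) and assumes the signature is hex-encoded, as the header comment suggests. The function names are hypothetical and are not the SDK's exported signPulse.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Signs the exact string `${body}.${timestamp}` with the secret key.
function signPulseSketch(body: string, timestamp: number, secretKey: string): string {
  return createHmac("sha256", secretKey).update(`${body}.${timestamp}`).digest("hex");
}

// Recomputes the signature and compares it in constant time.
function verifyPulseSketch(
  body: string,
  timestamp: number,
  secretKey: string,
  signatureHex: string,
): boolean {
  const expected = Buffer.from(signPulseSketch(body, timestamp, secretKey), "hex");
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}

// Placeholder values taken from the examples in this page.
const sampleBody = JSON.stringify({ instanceId: "web-01", ts: 1740000060000 });
const sampleSignature = signPulseSketch(sampleBody, 1740000060000, "ws_sec_xxx");
```

Because the timestamp is part of the signed string, a signature computed for one timestamp does not verify against another, which is why ts in the payload must match the x-waitstate-timestamp header.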
Policy auth (JWT)
On init, the SDK exchanges your keys for a JWT via POST /v1/auth/token. The JWT is used for GET /v1/policy/{orgId} reads and auto-refreshes 2 minutes before expiry.
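The refresh timing can be expressed as a one-line calculation. This is a sketch only: it assumes expiresAt is an epoch-millisecond timestamp (the unit is not stated here), and refreshDelayMs is a hypothetical helper name.

```typescript
// Refresh lead time from the text: 2 minutes before expiry.
const REFRESH_LEAD_MS = 2 * 60 * 1000;

// Returns how long to wait before refreshing; 0 means refresh immediately.
function refreshDelayMs(expiresAtMs: number, nowMs: number): number {
  return Math.max(0, expiresAtMs - nowMs - REFRESH_LEAD_MS);
}
```

A token that expires in 10 minutes would be refreshed after 8 minutes; a token with less than 2 minutes left would be refreshed right away.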
Management auth
Dashboard API calls use session-based auth (cookie). Programmatic management uses a bearer token.
Site Sharding
By default, the control plane routes all pulses from the same organization to a single aggregator. If you run multiple sites (e.g. staging, production, or separate apps) within one org, you can isolate them with siteId.
Pass siteId when creating the SDK instance. Each unique siteId gets its own aggregator with independent health tracking, reflex evaluation, and usage accounting. Sites are created automatically on the first pulse. No manual setup required. You can view and manage your sites in the dashboard.
- If omitted, siteId defaults to your orgId (one aggregator per org).
- Billing is still per-org. Monthly usage is tracked by orgId regardless of how many sites you have.
- Each site gets its own policy. A latency spike on staging won't throttle production.
- The number of sites is capped per plan: Hobby allows 1, Pro allows 10, Enterprise is unlimited.
Fail-Open & Safe Mode
WaitState is designed to never block traffic due to its own failures. But unlimited traffic during a prolonged outage is also dangerous. The lease mechanism provides a safety net.
Fail-open guarantees
- Control plane unreachable (short-term): SDK uses the last cached policy. Gate decisions continue normally.
- Never synced: If no policy was ever fetched (bootstrap), the default policy allows everything (globalMaxWeight: Infinity).
- Pulse fails: SDK retries on next interval. Telemetry counters are preserved and re-sent on the next successful pulse.
- Auth fails: SDK starts pulsing anyway with the default interval. Init errors go to onError.
- No reflex rules: All traffic is allowed.
- Unknown tag: gate() returns allow if the weight is under the global max.
- Kill signal: SDK stops pulsing, resets to fail-open default, and resumes after 24 hours.
- Monthly cap exceeded: Control plane returns status: cap_exceeded. SDK fails open (globalMaxWeight: Infinity) and backs off to 5-minute pulse intervals. When the new month starts, normal policy resumes automatically.
Lease & safe mode
Each policy includes a leaseDurationSeconds field. This is the maximum time the SDK trusts its cached policy without hearing from the control plane. If no successful pulse or policy response arrives within the lease window, the SDK enters safe mode.
| Plan | Lease Duration |
|---|---|
| Hobby | 5 minutes (300s) |
| Pro | 2 minutes (120s) |
| Enterprise | 1 minute (60s) |
Safe mode behavior
The safeModeStrategy constructor option controls what happens when the lease expires:
| Strategy | Behavior |
|---|---|
| open (default) | Fail fully open: all gate calls are allowed regardless of weight or tag. |
| fixed_rps | Throttle to safeModeMaxRps requests per second per instance (default: 50). Excess calls are denied. |
| last_policy | Keep enforcing the last cached policy as-is. Gate checks continue using stale thresholds until sync resumes. |
In all strategies:
- gate() returns reason: lease_expired for all calls (both allowed and denied).
- A [WAITSTATE-FATAL] log message fires once when safe mode activates.
- When the SDK successfully syncs again, normal policy-based gating resumes.
Why not just fail-open forever?
Pure fail-open means a prolonged control plane outage removes WaitState's intelligence layer entirely. Safe mode provides a hard floor: your API keeps serving traffic, but at a rate that won't overwhelm your backend. Your existing rate limiters and circuit breakers still apply either way.
Bootstrap grace period
The lease clock only starts after the first successful sync. During bootstrap (before the SDK has ever reached the control plane), gate() is fully fail-open. This prevents false safe-mode activations during cold starts or deployment rollouts.
Choose the behavior with safeModeStrategy in the constructor: open to fail fully open, fixed_rps to throttle, or last_policy to keep enforcing stale rules.
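The lease check and the three strategies can be sketched in a few lines. This is an illustration under stated assumptions, not the SDK's internals: it models "never synced" as a null timestamp, and it implements fixed_rps as a simple one-second fixed window, which may differ from the limiter the SDK actually uses. All names here are hypothetical.

```typescript
type SafeModeStrategy = "open" | "fixed_rps" | "last_policy";

// The lease clock starts only after the first successful sync (null = bootstrap).
function isLeaseExpired(
  lastSyncMs: number | null,
  leaseDurationSeconds: number,
  nowMs: number,
): boolean {
  if (lastSyncMs === null) return false;
  return nowMs - lastSyncMs > leaseDurationSeconds * 1000;
}

// Assumption: a fixed one-second window counter stands in for the real limiter.
class FixedRpsWindow {
  maxRps: number;
  windowStartMs = 0;
  used = 0;

  constructor(maxRps: number) {
    this.maxRps = maxRps;
  }

  tryAcquire(nowMs: number): boolean {
    if (nowMs - this.windowStartMs >= 1000) {
      // A new one-second window begins; reset the counter.
      this.windowStartMs = nowMs;
      this.used = 0;
    }
    if (this.used >= this.maxRps) return false;
    this.used += 1;
    return true;
  }
}

// Decides a gate call once the lease has expired.
function safeModeAllows(
  strategy: SafeModeStrategy,
  limiter: FixedRpsWindow,
  lastPolicyAllows: boolean,
  nowMs: number,
): boolean {
  switch (strategy) {
    case "open": return true;
    case "fixed_rps": return limiter.tryAcquire(nowMs);
    case "last_policy": return lastPolicyAllows;
  }
}
```

In this sketch the caller would still attach reason: lease_expired to every result while the lease is expired, whether the call was allowed or denied.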
Telemetry & Metrics
Every SDK and the agent expose the same telemetry methods. These feed into the pulse payload so the reflex engine can react to backend health. Without telemetry, reflex rules on latency and errors evaluate against zero.
Built-in metrics
All SDKs provide methods for reporting latency and errors. Method names follow each language's conventions (reportLatency in JS, report_latency in Rust/Python) but the semantics and aggregation behavior are identical.
| Method | Description |
|---|---|
| reportLatency(ms, tag?) | Record a latency observation in milliseconds. Averaged per-tag at pulse time. |
| reportError(tag?) | Increment the error counter for a tag. Summed at pulse time. |
| startTimer(tag?) | Returns a stop function. Calling stop reports elapsed ms as latency via reportLatency. |
// Timer pattern (recommended)
const stop = ws.startTimer('search');
const result = await doSearch(query);
stop(); // reports elapsed ms as latency

// Direct reporting
ws.reportLatency(42.5, 'search'); // ms
ws.reportError('search');

Custom metrics: report(metric, value, tag?)
Report a custom metric value. Each metric is aggregated as min/max/sum/count per pulse window, giving you average, minimum, and maximum values in the control plane. Create reflex rules on any custom metric name to trigger automated responses.
// Report a custom metric (min/max/sum/count aggregated per pulse)
ws.report('queue_depth', 87, 'search');
ws.report('cache_hit_rate', 0.92, 'search');

// Without a tag, defaults to '__default__'
ws.report('active_connections', 142);

| Param | Type | Default | Description |
|---|---|---|---|
| metric | string | – | Metric name. Lowercase letters, digits, and underscores. Must start with a letter. Max 63 characters. Cannot be a reserved name (latency, errors, p50_latency, p95_latency, p99_latency). |
| value | number | – | The metric value to record. |
| tag | string | __default__ | Tag to associate the metric with. |
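The naming rules and the min/max/sum/count aggregation can be illustrated with a short sketch. It is not the SDK's code: the regular expression is one reading of the rules in the table above, and isValidMetricName, recordValue, and averageOf are hypothetical names.

```typescript
// Reserved names listed in the parameter table.
const RESERVED_METRICS = new Set([
  "latency", "errors", "p50_latency", "p95_latency", "p99_latency",
]);

// Lowercase letters, digits, underscores; starts with a letter; at most 63 characters.
function isValidMetricName(name: string): boolean {
  return /^[a-z][a-z0-9_]{0,62}$/.test(name) && !RESERVED_METRICS.has(name);
}

interface MetricAggregate {
  min: number;
  max: number;
  sum: number;
  count: number;
}

// Folds one observation into the aggregate for the current pulse window.
function recordValue(agg: MetricAggregate | undefined, value: number): MetricAggregate {
  if (agg === undefined) return { min: value, max: value, sum: value, count: 1 };
  return {
    min: Math.min(agg.min, value),
    max: Math.max(agg.max, value),
    sum: agg.sum + value,
    count: agg.count + 1,
  };
}

// The average is sum / count.
function averageOf(agg: MetricAggregate): number {
  return agg.sum / agg.count;
}

// Three hypothetical queue_depth observations within one pulse window.
let queueDepth: MetricAggregate | undefined;
for (const value of [3, 87, 45]) queueDepth = recordValue(queueDepth, value);
```

Only four numbers per metric travel in the pulse, regardless of how many times report() was called in the window.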
Custom metric reflex rule
Custom metrics work with the same reflex rule engine as latency and errors. The reflex engine evaluates the metric's average (sum/count) against your threshold.
// Block search traffic when queue depth exceeds 100
{
  "tagName": "search",
  "metric": "queue_depth",
  "operator": "gt",
  "threshold": 100,
  "action": "block",
  "actionValue": null,
  "enabled": true,
  "priority": 1
}

Plan limits
| Plan | Custom metrics per tag |
|---|---|
| Hobby | – |
| Pro | 5 |
| Enterprise | Unlimited |
Metrics reported via report() always flow through the pipeline. The plan limit controls how many distinct custom metric names can be used in reflex rules. Excess metrics in the data pipeline are silently trimmed (alphabetical order, kept first N).
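The trimming rule (alphabetical order, keep the first N) amounts to a sort and a slice. The sketch below is illustrative only; trimCustomMetrics is a hypothetical name, and it assumes duplicates are collapsed before the limit is applied.

```typescript
// Keeps the first `limit` distinct metric names in alphabetical order.
function trimCustomMetrics(names: string[], limit: number): string[] {
  // Valid metric names are lowercase ASCII, so the default sort is alphabetical.
  return [...new Set(names)].sort().slice(0, limit);
}

// Six hypothetical metric names against the Pro limit of 5.
const keptOnPro = trimCustomMetrics(
  ["queue_depth", "cache_hit_rate", "active_connections", "zeta", "beta", "alpha"],
  5,
);
```

Because the cut is alphabetical, a metric you care about can be dropped simply because of its name, so it is worth staying within the plan limit rather than relying on trimming.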
Method names follow each language's conventions (reportLatency in JS, report_latency in Rust/Python), but the semantics, aggregation behavior, and naming rules are identical across every SDK and the agent.
Use Cases
WaitState works anywhere you need to shed traffic intelligently under pressure: GraphQL APIs, AI gateways, e-commerce, multi-tenant SaaS, fintech, gaming, IoT, and more. Each use case includes integration code, reflex rules, and a walkthrough of what happens under load.
See all use cases with integration examples →
JavaScript / TypeScript SDK
The official JS/TS SDK. Zero dependencies. All SDKs share the same gate(tag, weight) interface and pulse/policy protocol.
Install
npm install @waitstate/sdk

Constructor

const ws = new WaitState({
  publishKey: process.env.WAITSTATE_PUBLISH_KEY,
  secretKey: process.env.WAITSTATE_SECRET_KEY,
  siteId: 'site-prod', // optional, shard key (defaults to orgId)
  instanceId: 'web-01', // optional, defaults to randomUUID()
  pulseInterval: 5000, // optional, ms between pulses (server can override)
  safeModeStrategy: 'open', // optional, 'open' | 'fixed_rps' | 'last_policy'
  safeModeMaxRps: 50, // optional, max req/sec in safe mode (default: 50)
  baseUrl: 'https://api.waitstate.io', // optional
  onError: (err) => console.error('[waitstate]', err),
});

| Option | Type | Default | Description |
|---|---|---|---|
| publishKey | string | – | Required. Your publish key from the dashboard. |
| secretKey | string | – | Required. Your secret key. Store in env vars. |
| siteId | string | orgId | Shard key for aggregator routing. Use to isolate sites within the same org. |
| instanceId | string | randomUUID() | Identifies this SDK instance in telemetry. |
| pulseInterval | number | 5000 | Initial pulse interval in ms. Server can override via policy. |
| baseUrl | string | https://api.waitstate.io | Control plane URL. |
| safeModeStrategy | string | open | Behavior when the lease expires. open = fail fully open, fixed_rps = throttle to safeModeMaxRps, last_policy = keep enforcing the last cached policy. |
| safeModeMaxRps | number | 50 | Max requests per second in safe mode. Only used when safeModeStrategy is fixed_rps. |
| onError | function | – | Error callback. Init failures, pulse failures, auth failures. |
gate(tag?, weight?)
Synchronous. Returns a GateResult. Never throws.
// Tag-based gating with weight
const d1 = ws.gate('enterprise', 10); // high-priority traffic
const d2 = ws.gate('free', 1); // low-priority traffic
const d3 = ws.gate(); // no tag, weight defaults to 1

// Check the reason for denial
const decision = d2;
if (!decision.allowed) {
  switch (decision.reason) {
    case 'tag_blocked': // this tag is explicitly blocked by policy
    case 'over_weight': // weight exceeds global max
    case 'global_block': // all traffic blocked
    case 'kill_signal': // emergency kill from control plane
    case 'lease_expired': // control plane lease expired, safe mode (safeModeMaxRps req/s)
      break;
  }
}

| Param | Type | Default | Description |
|---|---|---|---|
| tag | string | __default__ | Traffic tag (e.g. "free", "pro", "enterprise"). |
| weight | number | 1 | Request weight. Higher = higher priority. |
GateResult
// Allowed
{ "allowed": true, "reason": "allowed" }

// Denied
{ "allowed": false, "reason": "tag_blocked" }
{ "allowed": false, "reason": "over_weight" }
{ "allowed": false, "reason": "global_block" }
{ "allowed": false, "reason": "kill_signal" }

// Safe mode (lease expired, behavior depends on safeModeStrategy)
{ "allowed": true, "reason": "lease_expired" }
{ "allowed": false, "reason": "lease_expired" }

| Field | Type | Description |
|---|---|---|
| allowed | boolean | Whether the request should proceed. |
| reason | string | allowed, over_weight, tag_blocked, global_block, kill_signal, or lease_expired. |
Telemetry & custom metrics
The JS SDK exposes reportLatency(), reportError(), startTimer(), and report() for custom metrics. These are platform-wide features available in every SDK and the agent. See Telemetry & Metrics for full documentation, naming rules, and plan limits.
shutdown()
Async. Stops the pulse timer, clears the auth refresh timer, and sends one final pulse to flush pending telemetry.
// Flush pending pulses on graceful shutdown
process.on('SIGTERM', async () => {
  await ws.shutdown();
  process.exit(0);
});

Advanced exports
For low-level use, the SDK also exports individual components:
- gate(policy, tag?, weight?) - standalone pure function, takes a BouncerPolicy directly
- PolicyCache - in-memory policy store (get(), update(), reset())
- TelemetryCollector - accumulates gate metrics between pulses
- signPulse(body, timestamp, secretKey) - returns HMAC-SHA256 hex digest
- createSignedHeaders(body, publishKey, secretKey) - returns signed header object
- AuthManager - manages JWT token lifecycle
- WaitStateError - error class with status, code, message
Example: Express middleware
import express from 'express';
import { WaitState } from '@waitstate/sdk';

const app = express();
const ws = new WaitState({
  publishKey: process.env.WAITSTATE_PUBLISH_KEY,
  secretKey: process.env.WAITSTATE_SECRET_KEY,
});

app.use((req, res, next) => {
  const tag = req.user?.plan ?? 'free';
  const weight = tag === 'enterprise' ? 10 : tag === 'pro' ? 5 : 1;
  const decision = ws.gate(tag, weight);

  if (!decision.allowed) {
    return res.status(429).json({ error: 'rate_limited' });
  }

  // Track latency and errors for the reflex engine
  const stop = ws.startTimer(tag);
  res.on('finish', () => {
    stop();
    if (res.statusCode >= 500) ws.reportError(tag);
  });
  next();
});

Framework middleware
Drop-in middleware for popular frameworks. Each is a separate subpath export with zero runtime dependencies on the framework — types are structural.
Express

import express from 'express';
import { WaitState } from '@waitstate/sdk';
import { createMiddleware } from '@waitstate/sdk/express';

const app = express();
const ws = new WaitState({
  publishKey: process.env.WAITSTATE_PUBLISH_KEY,
  secretKey: process.env.WAITSTATE_SECRET_KEY,
});

app.use(createMiddleware({ waitstate: ws }));

Fastify

import Fastify from 'fastify';
import { WaitState } from '@waitstate/sdk';
import { createPlugin } from '@waitstate/sdk/fastify';

const app = Fastify();
const ws = new WaitState({
  publishKey: process.env.WAITSTATE_PUBLISH_KEY,
  secretKey: process.env.WAITSTATE_SECRET_KEY,
});

createPlugin({ waitstate: ws })(app);

Hono

import { Hono } from 'hono';
import { WaitState } from '@waitstate/sdk';
import { createMiddleware } from '@waitstate/sdk/hono';

const app = new Hono();
const ws = new WaitState({
  publishKey: process.env.WAITSTATE_PUBLISH_KEY,
  secretKey: process.env.WAITSTATE_SECRET_KEY,
});

app.use('*', createMiddleware({ waitstate: ws }));

Next.js

import { WaitState } from '@waitstate/sdk';
import { withGate } from '@waitstate/sdk/nextjs';

const ws = new WaitState({
  publishKey: process.env.WAITSTATE_PUBLISH_KEY,
  secretKey: process.env.WAITSTATE_SECRET_KEY,
});

export const GET = withGate({ waitstate: ws }, async (req) => {
  return Response.json({ ok: true });
});

Middleware options
All middleware accept the same options:
createMiddleware({
  waitstate: ws,
  tagFrom: 'x-waitstate-tag', // header name, or (req) => string
  weightFrom: 'x-waitstate-weight', // header name, or (req) => number
  retryAfter: 60, // Retry-After header value (seconds)
  onDenied: (req, res, result) => { // custom deny handler
    res.statusCode = 429;
    res.end('Too many requests');
  },
});

| Option | Type | Default | Description |
|---|---|---|---|
| waitstate | WaitState | – | Required. Your WaitState instance. |
| tagFrom | string or function | x-waitstate-tag | Header name or function returning the tag. |
| weightFrom | string or function | – | Header name or function returning the weight. |
| retryAfter | number | 60 | Retry-After header value in seconds on 429. |
| onDenied | function | – | Custom deny handler. Overrides the default 429 response. |
Each middleware automatically reports latency (on response) and errors (on 5xx) to the reflex engine.
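Since tagFrom accepts either a header name or a function, a resolver along the following lines is one way such an option could be interpreted. This is a sketch, not the middleware's source: RequestLike and resolveTag are hypothetical, and falling back to __default__ when the header is missing is an assumption based on gate()'s default tag.

```typescript
// Minimal structural request type; real frameworks expose richer objects.
interface RequestLike {
  headers: Record<string, string | string[] | undefined>;
}

type TagSource = string | ((req: RequestLike) => string);

// Resolves the tag from a header name or a caller-supplied function.
function resolveTag(req: RequestLike, tagFrom: TagSource = "x-waitstate-tag"): string {
  if (typeof tagFrom === "function") return tagFrom(req);
  // Node lowercases incoming header names, so look up the lowercase form.
  const raw = req.headers[tagFrom.toLowerCase()];
  const value = Array.isArray(raw) ? raw[0] : raw;
  // Assumption: a missing or empty header falls back to the default tag.
  return value !== undefined && value.length > 0 ? value : "__default__";
}
```

For example, resolveTag({ headers: { 'x-waitstate-tag': 'pro' } }) yields 'pro', while a request without that header yields '__default__'.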
Gotchas
Node.js only. The JS/TS SDK uses node:crypto for HMAC signing. It does not run in browsers or edge runtimes. For Cloudflare Workers, Fastly Compute, and Vercel Edge, use the Edge SDK (@waitstate/edge), which is compiled from the Rust core to WASM.
Constructor is async internally. The new WaitState() call returns immediately, but fires off auth + first pulse in the background. If these fail, errors go to onError and the SDK still starts pulsing (fail-open). There is no await on the constructor.
Pulse interval is server-controlled. You set an initial pulseInterval, but the control plane can override it via the pulse response. The SDK will automatically adjust.
One instance per process. Each WaitState instance creates its own pulse timer and auth lifecycle. In most apps, create one instance at startup and share it.
Always call shutdown(). Without it, the final pulse won't flush and you'll lose telemetry from the last interval. Wire it to SIGTERM / SIGINT.
Rust SDK
The Rust SDK (waitstate-rs) provides the same gate(tag, weight) interface with lock-free, zero-allocation gate checks. Built for high-throughput services where microsecond-level overhead matters.
Install
cargo add waitstate

Methods

| Method | Description |
|---|---|
| gate(tag, weight) | Lock-free gate check. Single-digit microseconds. Same semantics as the JS SDK. |
| report_latency(ms, tag) | Record a latency observation. Truncated to integer ms, averaged per-tag at pulse time. |
| report_error(tag) | Increment the error counter for a tag. Summed at pulse time. |
| report(metric, value, tag) | Report a custom metric. Aggregated as min/max/sum/count per pulse. Same naming rules as JS SDK. |
| set_policy(policy) | Override the current policy (for testing). |
| shutdown() | Abort background tasks. |
Internals
- Policy stored in ArcSwap for lock-free reads on the gate path.
- Per-tag counters use DashMap with AtomicU64 for contention-free telemetry accumulation.
- Background tasks (pulse every 20s, sync every 30s) run on Tokio and never block the gate call.
- HMAC-SHA256 signing via hmac/sha2 crates. JWT auth for policy reads.
- Same fail-open and safe-mode guarantees as the JS SDK. If the lease expires, gate() throttles to 50 req/sec.
Agent (Kubernetes DaemonSet)
The WaitState Agent is a standalone Rust binary that runs as a Kubernetes DaemonSet sidecar. Instead of embedding the SDK in every service, you deploy the agent once per node and call it over localhost.
Routes
| Method | Path | Description |
|---|---|---|
| POST | /gate | Gate check. Pass tag, weight, and optionally latency_ms and error in the JSON body. Returns allow/deny. |
| POST | /coprocess | Apollo Router coprocessor. Optionally include latencyMs and error in the body to report metrics. The agent bundles them into the next pulse. |
| POST | /report-latency | Report a latency observation. Pass ms (number) and optional tag. Returns 204. |
| POST | /report-error | Report an error. Pass optional tag. Returns 204. |
| POST | /report-metric | Report a custom metric. Pass metric (name), value (number), and optional tag. Returns 204. |
| GET | /health | Health check for k8s liveness/readiness probes. |
How it works
The agent wraps the Rust SDK internally. It collects metrics from all pods on the node via /coprocess, aggregates them, and sends a single pulse to the control plane on behalf of every service on that node. This means:
- Your services don't need the WaitState SDK as a dependency. Just POST to localhost:9000/gate.
- Metrics from all pods on the node are bundled into one pulse, giving the reflex engine a node-level view of health.
- Gate checks are still sub-millisecond (localhost HTTP call + in-memory policy lookup in the agent).
- Works with any language. If your service can make an HTTP call, it can use WaitState.
When to use the agent vs. the SDK
| Use the SDK when | Use the agent when |
|---|---|
| A native SDK exists for your language | Your services are polyglot or no native SDK exists yet |
| You want zero network hops on the gate path | You want a single deployment for the whole node |
| You need per-process latency/error metrics | You want node-level metric aggregation |
| You're running outside Kubernetes | You're already running DaemonSets (Datadog, Fluentd, etc.) |
Edge SDK (WASM)
The Edge SDK is the Rust core compiled to WebAssembly. It provides the same gate(tag, weight) interface for edge runtimes where Node.js APIs aren't available.
Package: @waitstate/edge
Runtime: Cloudflare Workers, Fastly Compute, Vercel Edge
Engine: waitstate-wasm (compiled from the Rust core via wasm-pack)
Install
npm install @waitstate/edge

Exported functions

| Function | Description |
|---|---|
| gate(policy, tag?, weight?) | Pure gate check against a policy object. Returns GateResult. |
| sign_pulse(body, timestamp, secretKey) | HMAC-SHA256 signing for pulse payloads. Returns hex digest. |
The Edge SDK is lower-level than the Node.js SDK. You manage the pulse/policy lifecycle yourself (typically via a Durable Object or KV cache). The gate() function is a pure, synchronous check against a policy you provide.
Python SDK
Native extension compiled from the Rust core via UniFFI. The same lock-free gate engine, exposed as a Python package with framework middleware for FastAPI, Flask, and Django.
Install
pip install waitstate

Basic usage

import os
from waitstate import WaitstateClient, WaitstateConfig

client = WaitstateClient(WaitstateConfig(
    publish_key=os.environ["WAITSTATE_PUBLISH_KEY"],
    secret_key=os.environ["WAITSTATE_SECRET_KEY"],
))

result = client.gate("search", 1.0)
if result.allowed:
    # proceed with request
    pass

Framework middleware
FastAPI

import os
from fastapi import FastAPI, Depends, Request
from waitstate import WaitstateClient, WaitstateConfig
from waitstate.middleware.fastapi import create_dependency, create_response_hook

app = FastAPI()
client = WaitstateClient(WaitstateConfig(
    publish_key=os.environ["WAITSTATE_PUBLISH_KEY"],
    secret_key=os.environ["WAITSTATE_SECRET_KEY"],
))

gate = create_dependency(client)
app.middleware("http")(create_response_hook(client))

@app.get("/search")
async def search(request: Request, _=Depends(gate)):
    return {"results": []}

Flask

from flask import Flask
from waitstate import WaitstateClient, WaitstateConfig
from waitstate.middleware.flask import init_app

app = Flask(__name__)
client = WaitstateClient(WaitstateConfig(
    publish_key="ws_pub_xxx",
    secret_key="ws_sec_xxx",
))

init_app(app, client)

@app.route("/search")
def search():
    return {"results": []}

Django

from waitstate import WaitstateClient, WaitstateConfig

WAITSTATE_CLIENT = WaitstateClient(WaitstateConfig(
    publish_key="ws_pub_xxx",
    secret_key="ws_sec_xxx",
))

MIDDLEWARE = [
    "waitstate.middleware.django.WaitStateMiddleware",
    # ...
]

Agent mode
If you're using the agent sidecar, the Python SDK can talk to it over localhost instead of running the full Rust client in-process:
from waitstate import WaitstateClient, WaitstateConfig

# Agent mode: no keys needed, talks to local sidecar
client = WaitstateClient(WaitstateConfig(
    publish_key="",
    secret_key="",
    agent_url="http://localhost:9000",
))

Custom metrics
The Python SDK supports custom metrics via report(metric, value, tag). Same naming rules and plan limits as the JS and Rust SDKs.
Go SDK
Java SDK
C# / .NET SDK
Ruby SDK
Elixir SDK
PHP SDK
API Reference
The control plane exposes these endpoints. The SDK handles pulse and policy automatically. Management endpoints are for dashboard or CI use.
SDK endpoints (handled by SDK)
| Method | Path | Auth | Description |
|---|---|---|---|
| POST | /v1/auth/token | publishKey + secretKey | Exchange keys for a JWT. Returns token, expiresAt, orgId. |
| POST | /v1/pulse | HMAC-SHA256 | Send telemetry pulse. Response contains updated policy. |
| GET | /v1/policy/:orgId | Bearer JWT | Fetch current policy (edge-cached). |
Management endpoints
| Method | Path | Description |
|---|---|---|
| GET | /v1/organizations | List organizations. |
| POST | /v1/organizations | Create organization. |
| GET | /v1/api-keys | List API key pairs. |
| POST | /v1/api-keys | Create key pair. |
| DELETE | /v1/api-keys/:id | Revoke key pair. |
| POST | /v1/api-keys/:id/rotate | Rotate key pair. |
| GET | /v1/tags | List tags. |
| POST | /v1/tags | Create tag. |
| DELETE | /v1/tags/:id | Delete tag. |
| GET | /v1/sites | List sites. |
| DELETE | /v1/sites/:id | Delete site. |
| GET | /v1/reflex-rules | List reflex rules. |
| POST | /v1/reflex-rules | Create reflex rule. |
| PATCH | /v1/reflex-rules/:id | Update reflex rule. |
| DELETE | /v1/reflex-rules/:id | Delete reflex rule. |
| GET | /v1/usage | Get usage ledger. |
| GET | /v1/telemetry | Get telemetry rollups. |
Example: Create API key pair
curl -X POST https://api.waitstate.io/v1/api-keys \
  -H "Authorization: Bearer <management-token>" \
  -H "Content-Type: application/json" \
  -d '{ "environment": "live" }'

{
  "id": "key_xxx",
  "publishKey": "ws_pub_xxx",
  "secretKey": "ws_sec_xxx",
  "environment": "live"
}

Glossary
Canonical definitions for terms used across WaitState SDKs, the control plane, and this documentation.
| Term | Definition |
|---|---|
| Bounced units | Gate calls that were denied (not allowed through the gate). Tracked separately from allowed units in telemetry rollups. |
| Cap exceeded | State reached when an organization's gate call count hits the plan limit for the current billing period. Behavior depends on plan configuration—calls may be denied or overage-billed. |
| Control plane | The centralized API server (api.waitstate.io) that stores configuration, evaluates reflex rules, and serves policies to SDKs. Gate decisions themselves are made locally by the SDK against its cached policy. All management and runtime API traffic flows through the control plane. |
| EdgeGate | The WASM-based SDK variant designed for edge runtimes (Cloudflare Workers, Deno Deploy, etc.). Evaluates policies locally with minimal latency. |
| Fail-open | Default SDK behavior when the control plane is unreachable or the policy cache has expired: the gate allows all traffic through rather than blocking. Ensures your application keeps running even if WaitState is down. |
| Gate | A checkpoint in your application code where the SDK evaluates whether a request should proceed. Created by calling gate() in the SDK. |
| Gate call | A single invocation of the gate. Each call consumes one unit against your plan's monthly allowance. This is the primary billing metric. |
| Kill signal | An emergency control that forces all gates to deny traffic, regardless of normal policy rules. Issued from the dashboard or management API. |
| Lease duration | How long the SDK caches its local copy of the policy before requiring a fresh fetch from the control plane. If the SDK cannot reach the control plane before the lease expires, it enters safe mode. |
| Organization | The top-level account entity. Owns sites, API keys, and billing. Each user belongs to one organization. |
| Policy | The full set of rules and configuration the SDK needs to evaluate gates locally. Fetched from the control plane on each pulse and cached for the lease duration. |
| Policy cache | The SDK's local, in-memory copy of the policy. Refreshed every pulse interval. When the cache expires (lease duration exceeded) without a successful refresh, the SDK enters safe mode. |
| Pulse | The periodic heartbeat where the SDK phones home to the control plane to fetch the latest policy and report telemetry. Frequency is set by the pulse interval. |
| Pulse interval | Time between pulses. Varies by plan: 30s (Hobby), 2s (Pro), sub-second (Enterprise). |
| Reflex rule | A conditional rule evaluated server-side in the control plane. Compares a health metric against a threshold for a specific tag (or all traffic) and applies an action (throttle or block) to the policy. Reflex rules take precedence over default policy. |
| Safe mode | The SDK state entered when the policy cache has expired and the control plane is unreachable. Behavior depends on safeModeStrategy: open (default) fails fully open, fixed_rps throttles to safeModeMaxRps (default: 50 req/sec), last_policy keeps enforcing the last cached policy. |
| Site | An isolated environment within an organization (e.g., production, staging). Each site has its own API key pair, tags, rules, and usage counters. Plan limits restrict the number of sites. |
| Tag | A string label attached to a gate call to categorize it (e.g., "checkout", "search"). Tags are used to target reflex rules and filter telemetry. Plan limits restrict the number of unique tags. |
| Telemetry rollup | Aggregated gate call statistics (allowed, denied, total) grouped by tag and time bucket. Reported by the SDK on each pulse and stored by the control plane. |
| Token (API key pair) | A client ID and secret used to authenticate SDK and management API requests. Scoped to a single site. Created in the dashboard or via the management API. |
| Weight | A numeric value passed to gate(tag, weight) that represents the cost or priority of a request. The SDK compares it against the policy's globalMaxWeight and per-tag tagMaxWeights thresholds—if the weight exceeds either limit, the call is denied. Higher weight = harder to pass through during throttling. Default: 1. |