Quick Start

Get adaptive gating running in your application in three steps.

1. Create an API key pair

Sign up and create a key pair in the dashboard. You'll get a publishKey (identifies your org) and a secretKey (signs telemetry). The secret is shown once.

2. Install and configure the SDK

npm install @waitstate/sdk

import { WaitState } from '@waitstate/sdk';

const ws = new WaitState({
  publishKey: process.env.WAITSTATE_PUBLISH_KEY,
  secretKey: process.env.WAITSTATE_SECRET_KEY,
});

3. Gate traffic

Call gate(tag, weight) before processing a request. It's synchronous, in-memory, and never makes a network call.

const decision = ws.gate('free', 1);
if (decision.allowed) {
  // proceed with request
} else {
  // respond with 429 or queue
  res.status(429).json({ error: 'rate_limited' });
}

That's it. The SDK handles pulse telemetry and policy fetching in the background. gate() is always fast, always synchronous, always fail-open.

Tags & Weights

Tags categorize traffic. Weights determine priority. When the reflex engine fires, low-weight traffic is shed first.

  • Pass a tag and weight to gate() on each call.
  • Create tags in the dashboard with default weights. The SDK doesn't need to know about tags ahead of time.
  • Reflex rules can target specific tags (e.g. block "free" when latency > 500ms) or apply to all traffic.

How weight-based gating works

The control plane sets a globalMaxWeight and per-tag tagMaxWeights in the policy. When you call gate(tag, weight), the SDK compares your weight against these limits. If your weight exceeds the max, you're denied.

Example: during a latency spike, the policy might set globalMaxWeight: 5. Traffic with weight 1 ("free") still passes because 1 ≤ 5. Traffic with weight 10 ("enterprise") is denied because 10 > 5, unless the policy also sets tagMaxWeights to {"enterprise": Infinity} to exempt that tag.
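The comparison rule stated above (a weight that exceeds the applicable max is denied) can be sketched as follows. This is a minimal illustration, not the SDK's implementation: checkWeight is a hypothetical helper, and it assumes that a per-tag entry in tagMaxWeights, when present, takes precedence over globalMaxWeight.

```javascript
// Hypothetical helper, not part of @waitstate/sdk.
// Assumption: a per-tag limit, when present, overrides the global limit.
function checkWeight(policy, tag, weight) {
  const tagLimits = policy.tagMaxWeights ?? {};
  const limit = Object.hasOwn(tagLimits, tag)
    ? tagLimits[tag]
    : policy.globalMaxWeight;
  // Denied only when the weight exceeds the limit.
  return weight <= limit;
}

const policy = {
  globalMaxWeight: 5,
  tagMaxWeights: { enterprise: Infinity },
};

const results = {
  free: checkWeight(policy, 'free', 1), // 1 <= 5
  pro: checkWeight(policy, 'pro', 7), // 7 > 5
  enterprise: checkWeight(policy, 'enterprise', 10), // exempt via Infinity
};
console.log(results);
```

Under these assumptions an unknown tag simply falls back to the global max.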

Reflex Rules

Reflex rules define automated responses to backend health metrics. They run in the control plane's PulseAggregator and affect the policy that gets pushed to your SDK instances.

Rule schema

{
"tagName": "free",
"metric": "latency",
"operator": "gt",
"threshold": 500,
"action": "block",
"actionValue": null,
"enabled": true,
"priority": 1
}

| Field | Type | Description |
| --- | --- | --- |
| tagName | string or null | Target tag. null means all traffic. |
| metric | string | Metric name: latency, errors, p50_latency, p95_latency, p99_latency, or any custom metric name. |
| operator | string | gt, gte, lt, lte. |
| threshold | number | Value to compare against. |
| action | string | throttle or block. |
| actionValue | number or null | Multiplier for throttle (e.g. 0.5 = halve allowed weight). Null for block. |
| enabled | boolean | Whether the rule is active. Defaults to true. |
| priority | number | Evaluation order. Lower = higher priority. |

Throttle vs Block

| Action | Effect on policy | Use when |
| --- | --- | --- |
| block | Sets the tag's max weight to 0. All traffic for that tag is denied. | Origin is in danger. Shed this traffic class entirely. |
| throttle | Multiplies the tag's max weight by actionValue. E.g. 0.5 halves it. | Origin is stressed but not critical. Reduce load gradually. |

Throttle example

{
"tagName": "pro",
"metric": "latency",
"operator": "gt",
"threshold": 500,
"action": "throttle",
"actionValue": 0.5,
"enabled": true,
"priority": 2
}

When latency exceeds 500ms, the reflex engine multiplies pro's max weight by 0.5. If the default max weight for pro is 10, it drops to 5. A gate('pro', 5) call still passes, but gate('pro', 7) would be denied.
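The arithmetic in this example can be checked with a small sketch. applyAction is a hypothetical helper written only to mirror the two actions described in the table (block sets the max to 0, throttle multiplies it by actionValue); it is not an SDK or control-plane function.

```javascript
// Hypothetical helper mirroring the documented actions; not an SDK export.
function applyAction(currentMax, rule) {
  if (rule.action === 'block') return 0;
  if (rule.action === 'throttle') return currentMax * rule.actionValue;
  return currentMax;
}

// pro starts with a max weight of 10 and is throttled by 0.5.
const proMax = applyAction(10, { action: 'throttle', actionValue: 0.5 });
const allows = (weight) => weight <= proMax;

console.log(proMax, allows(5), allows(7)); // 5 true false
```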

Scenario: layered rules

Three rules, evaluated by priority:

1. Block free tier when latency is critical

{
"tagName": "free",
"metric": "latency",
"operator": "gt",
"threshold": 1000,
"action": "block",
"priority": 1
}

2. Throttle free tier when latency is elevated

{
"tagName": "free",
"metric": "latency",
"operator": "gt",
"threshold": 500,
"action": "throttle",
"actionValue": 0.5,
"priority": 2
}

3. Throttle pro tier when errors spike

{
"tagName": "pro",
"metric": "errors",
"operator": "gt",
"threshold": 50,
"action": "throttle",
"actionValue": 0.7,
"priority": 3
}

Progressive load shedding

As conditions worsen, more rules fire and lower tiers are shed first:

| Conditions | Rules fired | free | pro | enterprise |
| --- | --- | --- | --- | --- |
| latency: 80ms, errors: 2 | none | allowed | allowed | allowed |
| latency: 600ms | #2 | throttled | allowed | allowed |
| latency: 1200ms | #1 | blocked | allowed | allowed |
| latency: 1200ms, errors: 60 | #1, #3 | blocked | throttled | allowed |

The key idea: throttle gives you a gradient between "fully allowed" and "fully blocked." Use low thresholds with throttle as an early warning, and high thresholds with block as a circuit breaker. Stack multiple rules to create progressive load shedding.
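The table above can be reproduced with a short sketch. This is not the PulseAggregator's code: evaluate is a hypothetical function, and it assumes that for each tag only the highest-priority matching rule takes effect (which is what the 1200ms row implies, since rule #2 also matches there but only #1 is listed as fired). Missing metrics are treated as zero.

```javascript
const OPERATORS = {
  gt: (a, b) => a > b,
  gte: (a, b) => a >= b,
  lt: (a, b) => a < b,
  lte: (a, b) => a <= b,
};

// Hypothetical evaluator. Assumption: first matching rule per tag wins.
function evaluate(rules, metrics, tags) {
  const state = Object.fromEntries(tags.map((tag) => [tag, 'allowed']));
  const decided = new Set();
  const sorted = [...rules].sort((a, b) => a.priority - b.priority);
  for (const rule of sorted) {
    if (rule.enabled === false) continue;
    const value = metrics[rule.metric] ?? 0;
    if (!OPERATORS[rule.operator](value, rule.threshold)) continue;
    // A null tagName targets all traffic.
    const targets = rule.tagName == null ? tags : [rule.tagName];
    for (const tag of targets) {
      if (decided.has(tag)) continue;
      decided.add(tag);
      state[tag] = rule.action === 'block' ? 'blocked' : 'throttled';
    }
  }
  return state;
}

const rules = [
  { tagName: 'free', metric: 'latency', operator: 'gt', threshold: 1000, action: 'block', priority: 1 },
  { tagName: 'free', metric: 'latency', operator: 'gt', threshold: 500, action: 'throttle', actionValue: 0.5, priority: 2 },
  { tagName: 'pro', metric: 'errors', operator: 'gt', threshold: 50, action: 'throttle', actionValue: 0.7, priority: 3 },
];
const tags = ['free', 'pro', 'enterprise'];

console.log(evaluate(rules, { latency: 600 }, tags));
console.log(evaluate(rules, { latency: 1200, errors: 60 }, tags));
```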

Manage rules in the dashboard or via the management API.

Pulse & Policy Flow

The SDK communicates with the control plane through two channels. You don't interact with either directly.

Pulses (SDK → Control Plane)

The SDK periodically sends a pulse containing gate() call counts, per-tag metrics, and process-level latency and error counts. Pulses are HMAC-signed with your secret key.

Pulse headers

POST /v1/pulse
x-waitstate-id: ws_pub_xxx # your publishKey
x-waitstate-signature: <hmac-hex> # HMAC-SHA256(body + '.' + timestamp, secretKey)
x-waitstate-timestamp: 1740000060000
content-type: application/json

Pulse payload

{
  "instanceId": "web-01",
  "siteId": "site-prod",
  "usageDelta": 967,
  "bouncedUnits": 47,
  "metrics": { "latency": 142, "errors": 0 },
  "tagMetrics": [
    { "tag": "free", "latency": 0, "count": 847 },
    {
      "tag": "pro",
      "latency": 0,
      "count": 120,
      "customMetrics": {
        "queue_depth": { "min": 3, "max": 87, "sum": 450, "count": 10 }
      }
    }
  ],
  "ts": 1740000060000
}

| Field | Description |
| --- | --- |
| instanceId | Unique identifier for this SDK instance. Defaults to randomUUID() if not set. |
| siteId | Optional. Shards this instance into a separate aggregator for independent health tracking. |
| usageDelta | Number of gate() calls since the last pulse. Used for billing and plan cap enforcement. |
| bouncedUnits | Number of gate() calls that were denied since the last pulse. |
| metrics | Process-level metrics: latency (average ms), errors (count). The reflex engine evaluates rules against these. |
| tagMetrics | Array of per-tag counters. Each entry has tag (name), latency (avg ms), count (calls), and optional customMetrics (min/max/sum/count aggregates from report()). Drives per-tag reflex rules. |
| ts | Timestamp in epoch milliseconds. Must match the x-waitstate-timestamp header used in HMAC signing. |

Policy (Control Plane → SDK)

The pulse response contains the current policy. The SDK also polls GET /v1/policy/{orgId} with a JWT for edge-cached reads. Both sources update the same in-memory policy cache.

Policy shape

{
"globalMaxWeight": 5,
"tagMaxWeights": { "free": 0, "pro": 3 },
"pulseInterval": 2000,
"leaseDurationSeconds": 120,
"status": "ok" // "ok" | "cap_exceeded"
}

| Field | Description |
| --- | --- |
| globalMaxWeight | Maximum weight allowed globally. Traffic with weight above this is denied. |
| tagMaxWeights | Per-tag weight overrides. A tag set to 0 is fully blocked. |
| pulseInterval | Server-controlled pulse interval in ms. SDK adjusts automatically. |
| killSignal | Emergency kill. SDK stops pulsing, resets to fail-open, sleeps 24h. |
| status | ok, over_limit, or cap_exceeded. When cap is exceeded, SDK fails open and backs off to 5-minute pulses. |
| leaseDurationSeconds | How long the SDK trusts the current policy without hearing from the control plane. If the SDK doesn't receive a successful pulse or policy response within this window, it enters safe mode (behavior is set by safeModeStrategy; the fixed_rps strategy defaults to 50 req/sec). Varies by plan: Hobby 300s, Pro 120s, Enterprise 60s. |

Key insight: gate() never makes a network call. It reads from the in-memory policy cache. Pulses and policy fetches happen in the background on a timer. This is why gate() is synchronous and sub-millisecond.

Authentication

WaitState uses a two-key model:

| Key | Purpose | Visibility |
| --- | --- | --- |
| publishKey | Identifies your organization. Sent as x-waitstate-id on pulses. | Semi-public (in your app code). |
| secretKey | Signs pulses (HMAC-SHA256). Exchanged for a JWT. Never sent as plaintext on the wire. | Secret. Environment variable only. |

Pulse auth (HMAC-SHA256)

Each pulse is signed: HMAC-SHA256(body + '.' + timestamp, secretKey), the same layout shown in the pulse headers. The signature is sent as x-waitstate-signature. The control plane verifies it before accepting the pulse.
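A minimal sketch of signing and verifying with Node's built-in crypto module follows. The function names are hypothetical (the SDK ships its own signPulse), and the message layout body + '.' + timestamp is taken from the comment in the pulse headers example, so treat it as an assumption rather than a specification.

```javascript
const { createHmac, timingSafeEqual } = require('node:crypto');

// Assumed message layout: body + '.' + timestamp (per the pulse header comment).
function signPulseSketch(body, timestamp, secretKey) {
  return createHmac('sha256', secretKey)
    .update(`${body}.${timestamp}`)
    .digest('hex');
}

// Server-side check: recompute the digest and compare in constant time.
function verifyPulseSketch(body, timestamp, secretKey, signatureHex) {
  const expected = Buffer.from(signPulseSketch(body, timestamp, secretKey), 'hex');
  const received = Buffer.from(signatureHex, 'hex');
  return received.length === expected.length && timingSafeEqual(received, expected);
}

const ts = 1740000060000;
const body = JSON.stringify({ instanceId: 'web-01', usageDelta: 967, ts });
const signature = signPulseSketch(body, ts, 'ws_sec_example'); // placeholder secret

console.log(verifyPulseSketch(body, ts, 'ws_sec_example', signature)); // true
```

Binding the timestamp into the signed message is what lets a verifier reject a replayed body with a different timestamp.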

Policy auth (JWT)

On init, the SDK exchanges your keys for a JWT via POST /v1/auth/token. The JWT is used for GET /v1/policy/{orgId} reads and auto-refreshes 2 minutes before expiry.
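The refresh timing can be sketched as a simple delay calculation. refreshDelayMs is a hypothetical helper, and it assumes expiresAt is an epoch-millisecond timestamp; the document does not state the format returned by the token endpoint.

```javascript
const REFRESH_LEAD_MS = 2 * 60 * 1000; // refresh 2 minutes before expiry

// Assumption: expiresAtMs is epoch milliseconds.
function refreshDelayMs(expiresAtMs, nowMs = Date.now()) {
  // Never return a negative delay: an already-stale token refreshes immediately.
  return Math.max(0, expiresAtMs - nowMs - REFRESH_LEAD_MS);
}

// A token that expires in 10 minutes is refreshed after 8 minutes.
console.log(refreshDelayMs(1_000_000 + 600_000, 1_000_000)); // 480000
```

The result would typically be passed to setTimeout, with the timer cleared on shutdown.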

Management auth

Dashboard API calls use session-based auth (cookie). Programmatic management uses a bearer token.

Site Sharding

By default, the control plane routes all pulses from the same organization to a single aggregator. If you run multiple sites (e.g. staging, production, or separate apps) within one org, you can isolate them with siteId.

Pass siteId when creating the SDK instance. Each unique siteId gets its own aggregator with independent health tracking, reflex evaluation, and usage accounting. Sites are created automatically on the first pulse. No manual setup required. You can view and manage your sites in the dashboard.

  • If omitted, siteId defaults to your orgId (one aggregator per org).
  • Billing is still per-org. Monthly usage is tracked by orgId regardless of how many sites you have.
  • Each site gets its own policy. A latency spike on staging won't throttle production.
  • The number of sites is capped per plan: Hobby allows 1, Pro allows 10, Enterprise is unlimited.

Fail-Open & Safe Mode

WaitState is designed to never block traffic due to its own failures. But unlimited traffic during a prolonged outage is also dangerous. The lease mechanism provides a safety net.

Fail-open guarantees

  • Control plane unreachable (short-term): SDK uses the last cached policy. Gate decisions continue normally.
  • Never synced: If no policy was ever fetched (bootstrap), the default policy allows everything (globalMaxWeight: Infinity).
  • Pulse fails: SDK retries on next interval. Telemetry counters are preserved and re-sent on the next successful pulse.
  • Auth fails: SDK starts pulsing anyway with the default interval. Init errors go to onError.
  • No reflex rules: All traffic is allowed.
  • Unknown tag: gate() returns allow if the weight is under the global max.
  • Kill signal: SDK stops pulsing, resets to fail-open default, and resumes after 24 hours.
  • Monthly cap exceeded: Control plane returns status: cap_exceeded. SDK fails open (globalMaxWeight: Infinity) and backs off to 5-minute pulse intervals. When the new month starts, normal policy resumes automatically.

Lease & safe mode

Each policy includes a leaseDurationSeconds field. This is the maximum time the SDK trusts its cached policy without hearing from the control plane. If no successful pulse or policy response arrives within the lease window, the SDK enters safe mode.

| Plan | Lease Duration |
| --- | --- |
| Hobby | 5 minutes (300s) |
| Pro | 2 minutes (120s) |
| Enterprise | 1 minute (60s) |

Safe mode behavior

The safeModeStrategy constructor option controls what happens when the lease expires:

| Strategy | Behavior |
| --- | --- |
| open (default) | Fail fully open: all gate calls are allowed regardless of weight or tag. |
| fixed_rps | Throttle to safeModeMaxRps requests per second per instance (default: 50). Excess calls are denied. |
| last_policy | Keep enforcing the last cached policy as-is. Gate checks continue using stale thresholds until sync resumes. |

In all strategies:

  • gate() returns reason: lease_expired for all calls (both allowed and denied).
  • A [WAITSTATE-FATAL] log message fires once when safe mode activates.
  • When the SDK successfully syncs again, normal policy-based gating resumes.
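The lease check and the three strategies can be sketched together. Everything here is illustrative: createSafeModeGate is a hypothetical function, the fixed one-second window counter is an assumed limiter design, and the caller supplies the clock and the normal policy check.

```javascript
// Hypothetical sketch of lease handling; not the SDK's actual code.
function createSafeModeGate({ strategy = 'open', maxRps = 50, evaluatePolicy }) {
  let windowStart = 0;
  let windowCount = 0;

  // Assumed limiter: a fixed one-second window with a call counter.
  function underRpsLimit(nowMs) {
    if (nowMs - windowStart >= 1000) {
      windowStart = nowMs;
      windowCount = 0;
    }
    windowCount += 1;
    return windowCount <= maxRps;
  }

  return function gate({ lastSyncMs, leaseDurationSeconds, nowMs, tag, weight }) {
    // Never synced: fully fail-open, the lease clock has not started.
    if (lastSyncMs === null) return { allowed: true, reason: 'allowed' };

    const leaseExpired = nowMs - lastSyncMs > leaseDurationSeconds * 1000;
    if (!leaseExpired) return evaluatePolicy(tag, weight);

    // Safe mode: every result carries reason 'lease_expired'.
    if (strategy === 'fixed_rps') {
      return { allowed: underRpsLimit(nowMs), reason: 'lease_expired' };
    }
    if (strategy === 'last_policy') {
      return { allowed: evaluatePolicy(tag, weight).allowed, reason: 'lease_expired' };
    }
    return { allowed: true, reason: 'lease_expired' }; // 'open'
  };
}

const gate = createSafeModeGate({
  strategy: 'fixed_rps',
  maxRps: 2,
  evaluatePolicy: () => ({ allowed: true, reason: 'allowed' }),
});
const call = (nowMs) =>
  gate({ lastSyncMs: 0, leaseDurationSeconds: 120, nowMs, tag: 'free', weight: 1 });

console.log(call(60_000)); // inside the lease: normal policy result
console.log(call(121_000), call(121_000), call(121_000)); // third call exceeds 2 rps
```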

Why not just fail-open forever?

Pure fail-open means a prolonged control plane outage removes WaitState's intelligence layer entirely. Safe mode with the fixed_rps strategy bounds that risk: your API keeps serving traffic, but at a rate that won't overwhelm your backend. Your existing rate limiters and circuit breakers still apply either way.

Bootstrap grace period

The lease clock only starts after the first successful sync. During bootstrap (before the SDK has ever reached the control plane), gate() is fully fail-open. This prevents false safe-mode activations during cold starts or deployment rollouts.

Design principle: WaitState fails open during transient issues and transitions to safe mode during prolonged outages. You control the strategy via safeModeStrategy in the constructor—open to fail fully open, fixed_rps to throttle, or last_policy to keep enforcing stale rules.

Telemetry & Metrics

Every SDK and the agent expose the same telemetry methods. These feed into the pulse payload so the reflex engine can react to backend health. Without telemetry, reflex rules on latency and errors evaluate against zero.

Built-in metrics

All SDKs provide methods for reporting latency and errors. Method names follow each language's conventions (reportLatency in JS, report_latency in Rust/Python) but the semantics and aggregation behavior are identical.

| Method | Description |
| --- | --- |
| reportLatency(ms, tag?) | Record a latency observation in milliseconds. Averaged per-tag at pulse time. |
| reportError(tag?) | Increment the error counter for a tag. Summed at pulse time. |
| startTimer(tag?) | Returns a stop function. Calling stop reports elapsed ms as latency via reportLatency. |

// Timer pattern (recommended)
const stop = ws.startTimer('search');
const result = await doSearch(query);
stop(); // reports elapsed ms as latency

// Direct reporting
ws.reportLatency(42.5, 'search'); // ms
ws.reportError('search');

Custom metrics: report(metric, value, tag?)

Report a custom metric value. Each metric is aggregated as min/max/sum/count per pulse window, giving you average, minimum, and maximum values in the control plane. Create reflex rules on any custom metric name to trigger automated responses.

// Report a custom metric (min/max/sum/count aggregated per pulse)
ws.report('queue_depth', 87, 'search');
ws.report('cache_hit_rate', 0.92, 'search');
// Without a tag, defaults to '__default__'
ws.report('active_connections', 142);

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| metric | string | | Metric name. Lowercase letters, digits, and underscores. Must start with a letter. Max 63 characters. Cannot be a reserved name (latency, errors, p50_latency, p95_latency, p99_latency). |
| value | number | | The metric value to record. |
| tag | string | __default__ | Tag to associate the metric with. |
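
The naming rules for the metric parameter translate directly into a check you could run before calling report(). isValidMetricName is a hypothetical helper, not an SDK export; it only encodes the rules listed above.

```javascript
const RESERVED_METRICS = new Set([
  'latency',
  'errors',
  'p50_latency',
  'p95_latency',
  'p99_latency',
]);

// One leading lowercase letter plus up to 62 more characters = max 63.
const METRIC_NAME_PATTERN = /^[a-z][a-z0-9_]{0,62}$/;

function isValidMetricName(name) {
  return (
    typeof name === 'string' &&
    METRIC_NAME_PATTERN.test(name) &&
    !RESERVED_METRICS.has(name)
  );
}

console.log(isValidMetricName('queue_depth')); // true
console.log(isValidMetricName('latency')); // false (reserved)
console.log(isValidMetricName('9lives')); // false (must start with a letter)
```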

Custom metric reflex rule

Custom metrics work with the same reflex rule engine as latency and errors. The reflex engine evaluates the metric's average (sum/count) against your threshold.

// Block search traffic when queue depth exceeds 100
{
"tagName": "search",
"metric": "queue_depth",
"operator": "gt",
"threshold": 100,
"action": "block",
"actionValue": null,
"enabled": true,
"priority": 1
}
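
To make the "average, not peak" behavior concrete, here is a small sketch of min/max/sum/count aggregation over one pulse window. The helper names are hypothetical, and the sample values are chosen to reproduce the queue_depth aggregate from the pulse payload example (min 3, max 87, sum 450, count 10).

```javascript
function createAggregate() {
  return { min: Infinity, max: -Infinity, sum: 0, count: 0 };
}

function record(agg, value) {
  agg.min = Math.min(agg.min, value);
  agg.max = Math.max(agg.max, value);
  agg.sum += value;
  agg.count += 1;
}

// The value a rule threshold is compared against: sum / count.
function average(agg) {
  return agg.count === 0 ? 0 : agg.sum / agg.count;
}

const queueDepth = createAggregate();
for (const value of [3, 87, 40, 50, 45, 45, 45, 45, 45, 45]) {
  record(queueDepth, value);
}

const avg = average(queueDepth);
const ruleFires = avg > 100; // "gt" with threshold 100
console.log(queueDepth, avg, ruleFires);
```

Because the comparison uses the average, a single spike inside the window does not by itself trip a rule.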

Plan limits

| Plan | Custom metrics per tag |
| --- | --- |
| Hobby | |
| Pro | 5 |
| Enterprise | Unlimited |

Metrics reported via report() always flow through the pipeline. The plan limit controls how many distinct custom metric names can be used in reflex rules. Excess metrics in the data pipeline are silently trimmed (alphabetical order, kept first N).

All SDKs, same interface. The method names follow each language's conventions (reportLatency in JS, report_latency in Rust/Python) but the semantics, aggregation behavior, and naming rules are identical across every SDK and the agent.

Use Cases

WaitState works anywhere you need to shed traffic intelligently under pressure: GraphQL APIs, AI gateways, e-commerce, multi-tenant SaaS, fintech, gaming, IoT, and more. Each use case includes integration code, reflex rules, and a walkthrough of what happens under load.

See all use cases with integration examples →

JavaScript / TypeScript SDK

The official JS/TS SDK. Zero dependencies. All SDKs share the same gate(tag, weight) interface and pulse/policy protocol.

Package: @waitstate/sdk
Runtime: Node.js ≥ 20.3.0
Module: ESM and CJS (dual exports)

Install

npm install @waitstate/sdk

Constructor

const ws = new WaitState({
  publishKey: process.env.WAITSTATE_PUBLISH_KEY,
  secretKey: process.env.WAITSTATE_SECRET_KEY,
  siteId: 'site-prod', // optional, shard key (defaults to orgId)
  instanceId: 'web-01', // optional, defaults to randomUUID()
  pulseInterval: 5000, // optional, ms between pulses (server can override)
  safeModeStrategy: 'open', // optional, 'open' | 'fixed_rps' | 'last_policy'
  safeModeMaxRps: 50, // optional, max req/sec in safe mode (default: 50)
  baseUrl: 'https://api.waitstate.io', // optional
  onError: (err) => console.error('[waitstate]', err),
});

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| publishKey | string | | Required. Your publish key from the dashboard. |
| secretKey | string | | Required. Your secret key. Store in env vars. |
| siteId | string | orgId | Shard key for aggregator routing. Use to isolate sites within the same org. |
| instanceId | string | randomUUID() | Identifies this SDK instance in telemetry. |
| pulseInterval | number | 5000 | Initial pulse interval in ms. Server can override via policy. |
| baseUrl | string | https://api.waitstate.io | Control plane URL. |
| safeModeStrategy | string | open | Behavior when the lease expires. open = fail fully open, fixed_rps = throttle to safeModeMaxRps, last_policy = keep enforcing the last cached policy. |
| safeModeMaxRps | number | 50 | Max requests per second in safe mode. Only used when safeModeStrategy is fixed_rps. |
| onError | function | | Error callback. Init failures, pulse failures, auth failures. |

gate(tag?, weight?)

Synchronous. Returns a GateResult. Never throws.

// Tag-based gating with weight
const d1 = ws.gate('enterprise', 10); // high-priority traffic
const d2 = ws.gate('free', 1); // low-priority traffic
const d3 = ws.gate(); // no tag, weight defaults to 1

// Check the reason for denial
if (!d2.allowed) {
  switch (d2.reason) {
    case 'tag_blocked': // this tag is explicitly blocked by policy
    case 'over_weight': // weight exceeds global max
    case 'global_block': // all traffic blocked
    case 'kill_signal': // emergency kill from control plane
    case 'lease_expired': // control plane lease expired, safe mode (behavior depends on safeModeStrategy)
      break;
  }
}

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| tag | string | __default__ | Traffic tag (e.g. "free", "pro", "enterprise"). |
| weight | number | 1 | Request weight. Higher = higher priority. |

GateResult

// Allowed
{ "allowed": true, "reason": "allowed" }
// Denied
{ "allowed": false, "reason": "tag_blocked" }
{ "allowed": false, "reason": "over_weight" }
{ "allowed": false, "reason": "global_block" }
{ "allowed": false, "reason": "kill_signal" }
// Safe mode (lease expired, behavior depends on safeModeStrategy)
{ "allowed": true, "reason": "lease_expired" }
{ "allowed": false, "reason": "lease_expired" }

| Field | Type | Description |
| --- | --- | --- |
| allowed | boolean | Whether the request should proceed. |
| reason | string | allowed, over_weight, tag_blocked, global_block, kill_signal, or lease_expired. |

Telemetry & custom metrics

The JS SDK exposes reportLatency(), reportError(), startTimer(), and report() for custom metrics. These are platform-wide features available in every SDK and the agent. See Telemetry & Metrics for full documentation, naming rules, and plan limits.

shutdown()

Async. Stops the pulse timer, clears the auth refresh timer, and sends one final pulse to flush pending telemetry.

// Flush pending pulses on graceful shutdown
process.on('SIGTERM', async () => {
  await ws.shutdown();
  process.exit(0);
});

Advanced exports

For low-level use, the SDK also exports individual components:

  • gate(policy, tag?, weight?): standalone pure function, takes a BouncerPolicy directly
  • PolicyCache: in-memory policy store (get(), update(), reset())
  • TelemetryCollector: accumulates gate metrics between pulses
  • signPulse(body, timestamp, secretKey): returns HMAC-SHA256 hex digest
  • createSignedHeaders(body, publishKey, secretKey): returns signed header object
  • AuthManager: manages JWT token lifecycle
  • WaitStateError: error class with status, code, message

Example: Express middleware

import express from 'express';
import { WaitState } from '@waitstate/sdk';

const app = express();
const ws = new WaitState({
  publishKey: process.env.WAITSTATE_PUBLISH_KEY,
  secretKey: process.env.WAITSTATE_SECRET_KEY,
});

app.use((req, res, next) => {
  const tag = req.user?.plan ?? 'free';
  const weight = tag === 'enterprise' ? 10 : tag === 'pro' ? 5 : 1;
  const decision = ws.gate(tag, weight);
  if (!decision.allowed) {
    return res.status(429).json({ error: 'rate_limited' });
  }

  // Track latency and errors for the reflex engine
  const stop = ws.startTimer(tag);
  res.on('finish', () => {
    stop();
    if (res.statusCode >= 500) ws.reportError(tag);
  });
  next();
});

Framework middleware

Drop-in middleware for popular frameworks. Each is a separate subpath export with zero runtime dependencies on the framework — types are structural.

Express

import express from 'express';
import { WaitState } from '@waitstate/sdk';
import { createMiddleware } from '@waitstate/sdk/express';

const app = express();
const ws = new WaitState({
  publishKey: process.env.WAITSTATE_PUBLISH_KEY,
  secretKey: process.env.WAITSTATE_SECRET_KEY,
});
app.use(createMiddleware({ waitstate: ws }));

Fastify

import Fastify from 'fastify';
import { WaitState } from '@waitstate/sdk';
import { createPlugin } from '@waitstate/sdk/fastify';

const app = Fastify();
const ws = new WaitState({
  publishKey: process.env.WAITSTATE_PUBLISH_KEY,
  secretKey: process.env.WAITSTATE_SECRET_KEY,
});
createPlugin({ waitstate: ws })(app);

Hono

import { Hono } from 'hono';
import { WaitState } from '@waitstate/sdk';
import { createMiddleware } from '@waitstate/sdk/hono';

const app = new Hono();
const ws = new WaitState({
  publishKey: process.env.WAITSTATE_PUBLISH_KEY,
  secretKey: process.env.WAITSTATE_SECRET_KEY,
});
app.use('*', createMiddleware({ waitstate: ws }));

Next.js

import { WaitState } from '@waitstate/sdk';
import { withGate } from '@waitstate/sdk/nextjs';

const ws = new WaitState({
  publishKey: process.env.WAITSTATE_PUBLISH_KEY,
  secretKey: process.env.WAITSTATE_SECRET_KEY,
});
export const GET = withGate({ waitstate: ws }, async (req) => {
  return Response.json({ ok: true });
});

Middleware options

All middleware accept the same options:

createMiddleware({
  waitstate: ws,
  tagFrom: 'x-waitstate-tag', // header name, or (req) => string
  weightFrom: 'x-waitstate-weight', // header name, or (req) => number
  retryAfter: 60, // Retry-After header value (seconds)
  onDenied: (req, res, result) => { // custom deny handler
    res.statusCode = 429;
    res.end('Too many requests');
  },
});

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| waitstate | WaitState | | Required. Your WaitState instance. |
| tagFrom | string or function | x-waitstate-tag | Header name or function returning the tag. |
| weightFrom | string or function | | Header name or function returning the weight. |
| retryAfter | number | 60 | Retry-After header value in seconds on 429. |
| onDenied | function | | Custom deny handler. Overrides the default 429 response. |
Each middleware automatically reports latency (on response) and errors (on 5xx) to the reflex engine.
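
How a "header name or function" option might be resolved can be sketched as below. resolveTag and resolveWeight are hypothetical helpers, not the middleware's internals. The sketch assumes a Node-style req.headers object with lowercased keys, and falls back to the gate() defaults (__default__ and 1) when a header is missing or not numeric.

```javascript
// Hypothetical resolvers; assume Node-style req.headers (lowercased keys).
function resolveTag(req, tagFrom = 'x-waitstate-tag') {
  if (typeof tagFrom === 'function') return tagFrom(req);
  return req.headers[tagFrom.toLowerCase()] ?? '__default__';
}

function resolveWeight(req, weightFrom) {
  if (typeof weightFrom === 'function') return weightFrom(req);
  const raw = weightFrom ? req.headers[weightFrom.toLowerCase()] : undefined;
  const parsed = Number(raw);
  // Header values arrive as strings; ignore anything that is not a finite number.
  return raw !== undefined && Number.isFinite(parsed) ? parsed : 1;
}

const req = { headers: { 'x-waitstate-tag': 'pro', 'x-waitstate-weight': '5' } };
console.log(resolveTag(req), resolveWeight(req, 'x-waitstate-weight')); // pro 5
console.log(resolveTag(req, (r) => r.headers['x-plan'] ?? 'free')); // free
```

If callers can set these headers themselves, a function that derives the tag from authenticated state is the safer choice than trusting the header.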

Gotchas

Node.js only. The JS/TS SDK uses node:crypto for HMAC signing. It does not run in browsers or edge runtimes. For Cloudflare Workers, Fastly Compute, and Vercel Edge, use the Edge SDK (@waitstate/edge), which is compiled from the Rust core to WASM.

Constructor is async internally. The new WaitState() call returns immediately, but fires off auth + first pulse in the background. If these fail, errors go to onError and the SDK still starts pulsing (fail-open). There is no await on the constructor.

Pulse interval is server-controlled. You set an initial pulseInterval, but the control plane can override it via the pulse response. The SDK will automatically adjust.

One instance per process. Each WaitState instance creates its own pulse timer and auth lifecycle. In most apps, create one instance at startup and share it.

Always call shutdown(). Without it, the final pulse won't flush and you'll lose telemetry from the last interval. Wire it to SIGTERM / SIGINT.

Rust SDK

The Rust SDK (waitstate-rs) provides the same gate(tag, weight) interface with lock-free, zero-allocation gate checks. Built for high-throughput services where microsecond-level overhead matters.

Crate: waitstate-rs
Rust: 1.93+, edition 2024
TLS: rustls (no OpenSSL dependency)

Install

cargo add waitstate-rs

Methods

| Method | Description |
| --- | --- |
| gate(tag, weight) | Lock-free gate check. Single-digit microseconds. Same semantics as the JS SDK. |
| report_latency(ms, tag) | Record a latency observation. Truncated to integer ms, averaged per-tag at pulse time. |
| report_error(tag) | Increment the error counter for a tag. Summed at pulse time. |
| report(metric, value, tag) | Report a custom metric. Aggregated as min/max/sum/count per pulse. Same naming rules as JS SDK. |
| set_policy(policy) | Override the current policy (for testing). |
| shutdown() | Abort background tasks. |

Internals

  • Policy stored in ArcSwap for lock-free reads on the gate path.
  • Per-tag counters use DashMap with AtomicU64 for contention-free telemetry accumulation.
  • Background tasks (pulse every 20s, sync every 30s) run on Tokio and never block the gate call.
  • HMAC-SHA256 signing via hmac/sha2 crates. JWT auth for policy reads.
  • Same fail-open and safe-mode guarantees as the JS SDK. If the lease expires, gate() follows the configured safe-mode strategy (fixed_rps throttles to 50 req/sec by default).

Agent (Kubernetes DaemonSet)

The WaitState Agent is a standalone Rust binary that runs as a Kubernetes DaemonSet sidecar. Instead of embedding the SDK in every service, you deploy the agent once per node and call it over localhost.

Binary: waitstate-agent
Port: 9000
Image: ghcr.io/waitstate-io/engine/waitstate-agent

Routes

| Method | Path | Description |
| --- | --- | --- |
| POST | /gate | Gate check. Pass tag, weight, and optionally latency_ms and error in the JSON body. Returns allow/deny. |
| POST | /coprocess | Apollo Router coprocessor. Optionally include latencyMs and error in the body to report metrics. The agent bundles them into the next pulse. |
| POST | /report-latency | Report a latency observation. Pass ms (number) and optional tag. Returns 204. |
| POST | /report-error | Report an error. Pass optional tag. Returns 204. |
| POST | /report-metric | Report a custom metric. Pass metric (name), value (number), and optional tag. Returns 204. |
| GET | /health | Health check for k8s liveness/readiness probes. |

How it works

The agent wraps the Rust SDK internally. It collects metrics from all pods on the node via /coprocess, aggregates them, and sends a single pulse to the control plane on behalf of every service on that node. This means:

  • Your services don't need the WaitState SDK as a dependency. Just POST to localhost:9000/gate.
  • Metrics from all pods on the node are bundled into one pulse, giving the reflex engine a node-level view of health.
  • Gate checks are still sub-millisecond (localhost HTTP call + in-memory policy lookup in the agent).
  • Works with any language. If your service can make an HTTP call, it can use WaitState.
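
As an illustration of calling the agent from a service, here is a minimal sketch that uses the global fetch available in recent Node.js versions. gateViaAgent is a hypothetical wrapper. The request fields tag and weight come from the routes table, but the response shape (a 200 with JSON containing allowed: boolean) is an assumption, since the route is only documented as returning allow/deny. The wrapper fails open if the agent is unreachable or returns an error.

```javascript
// Hypothetical client for the agent's POST /gate route.
// Assumption: the agent answers 200 with JSON like { "allowed": true }.
async function gateViaAgent(tag, weight, options = {}) {
  const { agentUrl = 'http://localhost:9000', fetchImpl = fetch } = options;
  try {
    const res = await fetchImpl(`${agentUrl}/gate`, {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify({ tag, weight }),
    });
    if (!res.ok) return true; // fail open on agent errors
    const data = await res.json();
    return data.allowed !== false;
  } catch {
    return true; // fail open if the agent is unreachable
  }
}

// Usage inside a request handler (sketch):
//   if (!(await gateViaAgent('free', 1))) { /* respond with 429 */ }
```

Accepting fetchImpl as an option keeps the wrapper easy to test without a running agent.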

When to use the agent vs. the SDK

| Use the SDK when | Use the agent when |
| --- | --- |
| A native SDK exists for your language | Your services are polyglot or no native SDK exists yet |
| You want zero network hops on the gate path | You want a single deployment for the whole node |
| You need per-process latency/error metrics | You want node-level metric aggregation |
| You're running outside Kubernetes | You're already running DaemonSets (Datadog, Fluentd, etc.) |

Edge SDK (WASM)

The Edge SDK is the Rust core compiled to WebAssembly. It provides the same gate(tag, weight) interface for edge runtimes where Node.js APIs aren't available.

Package: @waitstate/edge
Runtime: Cloudflare Workers, Fastly Compute, Vercel Edge
Engine: waitstate-wasm (compiled from the Rust core via wasm-pack)

Install

npm install @waitstate/edge

Exported functions

| Function | Description |
| --- | --- |
| gate(policy, tag?, weight?) | Pure gate check against a policy object. Returns GateResult. |
| sign_pulse(body, timestamp, secretKey) | HMAC-SHA256 signing for pulse payloads. Returns hex digest. |

The Edge SDK is lower-level than the Node.js SDK. You manage the pulse/policy lifecycle yourself (typically via a Durable Object or KV cache). The gate() function is a pure, synchronous check against a policy you provide.

Python SDK

Native extension compiled from the Rust core via UniFFI. The same lock-free gate engine, exposed as a Python package with framework middleware for FastAPI, Flask, and Django.

Package: waitstate
Python: ≥ 3.8
Modes: Embedded (full client) or Agent (HTTP to local sidecar)

Install

pip install waitstate

Basic usage

import os
from waitstate import WaitstateClient, WaitstateConfig

client = WaitstateClient(WaitstateConfig(
    publish_key=os.environ["WAITSTATE_PUBLISH_KEY"],
    secret_key=os.environ["WAITSTATE_SECRET_KEY"],
))

result = client.gate("search", 1.0)
if result.allowed:
    # proceed with request
    pass

Framework middleware

FastAPI

import os
from fastapi import FastAPI, Depends, Request
from waitstate import WaitstateClient, WaitstateConfig
from waitstate.middleware.fastapi import create_dependency, create_response_hook

app = FastAPI()
client = WaitstateClient(WaitstateConfig(
    publish_key=os.environ["WAITSTATE_PUBLISH_KEY"],
    secret_key=os.environ["WAITSTATE_SECRET_KEY"],
))
gate = create_dependency(client)
app.middleware("http")(create_response_hook(client))

@app.get("/search")
async def search(request: Request, _=Depends(gate)):
    return {"results": []}

Flask

from flask import Flask
from waitstate import WaitstateClient, WaitstateConfig
from waitstate.middleware.flask import init_app

app = Flask(__name__)
client = WaitstateClient(WaitstateConfig(
    publish_key="ws_pub_xxx",
    secret_key="ws_sec_xxx",
))
init_app(app, client)

@app.route("/search")
def search():
    return {"results": []}

Django

# settings.py
from waitstate import WaitstateClient, WaitstateConfig

WAITSTATE_CLIENT = WaitstateClient(WaitstateConfig(
    publish_key="ws_pub_xxx",
    secret_key="ws_sec_xxx",
))
MIDDLEWARE = [
    "waitstate.middleware.django.WaitStateMiddleware",
    # ...
]

Agent mode

If you're using the agent sidecar, the Python SDK can talk to it over localhost instead of running the full Rust client in-process:

from waitstate import WaitstateClient, WaitstateConfig

# Agent mode: no keys needed, talks to local sidecar
client = WaitstateClient(WaitstateConfig(
    publish_key="",
    secret_key="",
    agent_url="http://localhost:9000",
))

Custom metrics

The Python SDK supports custom metrics via report(metric, value, tag). Same naming rules and plan limits as the JS and Rust SDKs.

Go SDK

Coming soon. Static binary compiled from the Rust core via CGo. In the meantime, use the agent to gate Go services over localhost.

Java SDK

Coming soon. JNI bridge to the Rust core. Spring Boot starter included. In the meantime, use the agent to gate Java services over localhost.

C# / .NET SDK

Coming soon. Native bindings via P/Invoke from the Rust core. ASP.NET Core middleware included. In the meantime, use the agent to gate .NET services over localhost.

Ruby SDK

Coming soon. Native gem compiled from the Rust core via Magnus. Rails, Sinatra, and Hanami middleware included. In the meantime, use the agent to gate Ruby services over localhost.

Elixir SDK

Coming soon. Rust NIF via Rustler. Plug middleware included. In the meantime, use the agent to gate Elixir services over localhost.

PHP SDK

Coming soon. Native extension compiled from the Rust core via ext-php-rs. Laravel, Symfony, and Slim middleware included. In the meantime, use the agent to gate PHP services over localhost.

API Reference

The control plane exposes these endpoints. The SDK handles pulse and policy automatically. Management endpoints are for dashboard or CI use.

SDK endpoints (handled by SDK)

| Method | Path | Auth | Description |
| --- | --- | --- | --- |
| POST | /v1/auth/token | publishKey + secretKey | Exchange keys for a JWT. Returns token, expiresAt, orgId. |
| POST | /v1/pulse | HMAC-SHA256 | Send telemetry pulse. Response contains updated policy. |
| GET | /v1/policy/:orgId | Bearer JWT | Fetch current policy (edge-cached). |

Management endpoints

| Method | Path | Description |
| --- | --- | --- |
| GET | /v1/organizations | List organizations. |
| POST | /v1/organizations | Create organization. |
| GET | /v1/api-keys | List API key pairs. |
| POST | /v1/api-keys | Create key pair. |
| DELETE | /v1/api-keys/:id | Revoke key pair. |
| POST | /v1/api-keys/:id/rotate | Rotate key pair. |
| GET | /v1/tags | List tags. |
| POST | /v1/tags | Create tag. |
| DELETE | /v1/tags/:id | Delete tag. |
| GET | /v1/sites | List sites. |
| DELETE | /v1/sites/:id | Delete site. |
| GET | /v1/reflex-rules | List reflex rules. |
| POST | /v1/reflex-rules | Create reflex rule. |
| PATCH | /v1/reflex-rules/:id | Update reflex rule. |
| DELETE | /v1/reflex-rules/:id | Delete reflex rule. |
| GET | /v1/usage | Get usage ledger. |
| GET | /v1/telemetry | Get telemetry rollups. |

Example: Create API key pair

Terminal window
curl -X POST https://api.waitstate.io/v1/api-keys \
  -H "Authorization: Bearer <management-token>" \
  -H "Content-Type: application/json" \
  -d '{ "environment": "live" }'

Response:

{
  "id": "key_xxx",
  "publishKey": "ws_pub_xxx",
  "secretKey": "ws_sec_xxx",
  "environment": "live"
}
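
The same request can be prepared from TypeScript. This is a minimal sketch: buildCreateKeyRequest, parseApiKeyPair, and the PreparedRequest type are hypothetical helpers, not part of the SDK. The URL, headers, body, and response fields are copied from the curl example above.

```typescript
// Response fields as shown in the curl example.
interface ApiKeyPair {
  id: string;
  publishKey: string;
  secretKey: string;
  environment: string;
}

interface PreparedRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: builds the URL and fetch options without sending anything.
function buildCreateKeyRequest(managementToken: string, environment: string): PreparedRequest {
  return {
    url: 'https://api.waitstate.io/v1/api-keys',
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${managementToken}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ environment }),
    },
  };
}

// Hypothetical helper: checks an unknown JSON payload before treating it as a key pair.
function parseApiKeyPair(payload: unknown): ApiKeyPair {
  const p = payload as Partial<ApiKeyPair> | null;
  if (
    !p ||
    typeof p.id !== 'string' ||
    typeof p.publishKey !== 'string' ||
    typeof p.secretKey !== 'string' ||
    typeof p.environment !== 'string'
  ) {
    throw new Error('unexpected response shape');
  }
  return { id: p.id, publishKey: p.publishKey, secretKey: p.secretKey, environment: p.environment };
}
```

To send it, pass req.url and req.init to fetch and hand the parsed JSON to parseApiKeyPair. Store the secretKey immediately, since the Quick Start notes that the secret is shown once.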

Glossary

Canonical definitions for terms used across WaitState SDKs, the control plane, and this documentation.

Term | Definition
Bounced units | Gate calls that were denied (not allowed through the gate). Tracked separately from allowed units in telemetry rollups.
Cap exceeded | State reached when an organization's gate call count hits the plan limit for the current billing period. Behavior depends on plan configuration: calls may be denied or billed as overage.
Control plane | The centralized API server (api.waitstate.io) that stores configuration, evaluates reflex rules, and serves policies to SDKs. Gate calls themselves are evaluated locally by the SDK. All management and runtime API traffic (pulses and policy fetches) flows through the control plane.
EdgeGate | The WASM-based SDK variant designed for edge runtimes (Cloudflare Workers, Deno Deploy, etc.). Evaluates policies locally with minimal latency.
Fail-open | Default SDK behavior when the control plane is unreachable or the policy cache has expired: the gate allows all traffic through rather than blocking. Ensures your application keeps running even if WaitState is down.
Gate | A checkpoint in your application code where the SDK evaluates whether a request should proceed. Created by calling gate() in the SDK.
Gate call | A single invocation of the gate. Each call consumes one unit against your plan's monthly allowance. This is the primary billing metric.
Kill signal | An emergency control that forces all gates to deny traffic, regardless of normal policy rules. Issued from the dashboard or management API.
Lease duration | How long the SDK caches its local copy of the policy before requiring a fresh fetch from the control plane. If the SDK cannot reach the control plane before the lease expires, it enters safe mode.
Organization | The top-level account entity. Owns sites, API keys, and billing. Each user belongs to one organization.
Policy | The full set of rules and configuration the SDK needs to evaluate gates locally. Fetched from the control plane on each pulse and cached for the lease duration.
Policy cache | The SDK's local, in-memory copy of the policy. Refreshed every pulse interval. When the cache expires (lease duration exceeded) without a successful refresh, the SDK enters safe mode.
Pulse | The periodic heartbeat in which the SDK contacts the control plane to report telemetry and fetch the latest policy. Frequency is set by the pulse interval.
Pulse interval | Time between pulses. Varies by plan: 30s (Hobby), 2s (Pro), sub-second (Enterprise).
Reflex rule | A conditional rule evaluated server-side by the arbiter. Matches gate calls by tag and applies an action (allow, deny, or weight override). Reflex rules take precedence over default policy.
Safe mode | The SDK state entered when the policy cache has expired and the control plane is unreachable. Behavior depends on safeModeStrategy: open (default) fails fully open; fixed_rps throttles to safeModeMaxRps (default: 50 req/sec); last_policy keeps enforcing the last cached policy.
Site | An isolated environment within an organization (e.g., production, staging). Each site has its own API key pair, tags, rules, and usage counters. Plan limits restrict the number of sites.
Tag | A string label attached to a gate call to categorize it (e.g., "checkout", "search"). Tags are used to target reflex rules and filter telemetry. Plan limits restrict the number of unique tags.
Telemetry rollup | Aggregated gate call statistics (allowed, denied, total) grouped by tag and time bucket. Reported by the SDK on each pulse and stored by the control plane.
Token (API key pair) | A client ID (publishKey) and secret (secretKey) used to authenticate SDK and management API requests. Scoped to a single site. Created in the dashboard or via the management API.
Weight | A numeric value passed to gate(tag, weight) that represents the cost or priority of a request. The SDK compares it against the policy's globalMaxWeight and per-tag tagMaxWeights thresholds. If the weight exceeds either limit, the call is denied, so a higher weight is harder to pass through during throttling. Default: 1.
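
The Weight, Lease duration, and Safe mode entries can be read together as one local decision. The sketch below is an illustration under stated assumptions, not the SDK's implementation: SketchGate, withinLimits, fetchedAtMs, and leaseDurationMs are hypothetical names, an expired lease is treated as "refresh failed", and fixed_rps is modeled as a simple one-second window counter. Only globalMaxWeight, tagMaxWeights, safeModeStrategy, and safeModeMaxRps come from the documentation.

```typescript
type SafeModeStrategy = 'open' | 'fixed_rps' | 'last_policy';

interface Policy {
  globalMaxWeight: number;
  tagMaxWeights: { [tag: string]: number | undefined };
  fetchedAtMs: number;     // assumed: time of the last successful refresh
  leaseDurationMs: number; // assumed: lease length in milliseconds
}

interface SafeModeConfig {
  safeModeStrategy: SafeModeStrategy;
  safeModeMaxRps: number;
}

// Weight check as the glossary words it: deny if the weight exceeds either limit.
// Tags without a per-tag entry are only subject to the global limit.
function withinLimits(policy: Policy, tag: string, weight: number): boolean {
  const tagMax = policy.tagMaxWeights[tag] ?? Infinity;
  return weight <= policy.globalMaxWeight && weight <= tagMax;
}

class SketchGate {
  private readonly policy: Policy;
  private readonly config: SafeModeConfig;
  private windowStartMs = 0;
  private windowCount = 0;

  constructor(policy: Policy, config: SafeModeConfig) {
    this.policy = policy;
    this.config = config;
  }

  // nowMs is injected so the sketch stays deterministic and testable.
  gate(tag: string, weight: number, nowMs: number): boolean {
    const leaseExpired = nowMs - this.policy.fetchedAtMs > this.policy.leaseDurationMs;
    if (!leaseExpired) return withinLimits(this.policy, tag, weight);

    // Safe mode: the cached policy is stale and no refresh has arrived.
    switch (this.config.safeModeStrategy) {
      case 'last_policy':
        return withinLimits(this.policy, tag, weight);
      case 'fixed_rps': {
        // Start a new one-second window when the current one has elapsed.
        if (nowMs - this.windowStartMs >= 1000) {
          this.windowStartMs = nowMs;
          this.windowCount = 0;
        }
        this.windowCount += 1;
        return this.windowCount <= this.config.safeModeMaxRps;
      }
      default:
        return true; // 'open': fail fully open
    }
  }
}
```

Every branch is pure in-memory arithmetic, which matches the description of gate() as synchronous and free of network calls.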