Documentation · WaitState

Quick Start

Get adaptive gating running in your application in three steps.

1. Create an API key pair

Sign up and create a key pair in the dashboard. You'll get a publishKey (identifies your org) and a secretKey (signs telemetry). The secret is shown once.

2. Install and configure the SDK

npm install @waitstate/sdk

import { WaitState } from '@waitstate/sdk';

const ws = new WaitState({
  publishKey: process.env.WAITSTATE_PUBLISH_KEY,
  secretKey: process.env.WAITSTATE_SECRET_KEY,
});

3. Gate traffic

Call gate(tag, weight) before processing a request. It's synchronous, in-memory, and never makes a network call.

const decision = ws.gate('free', 1);

if (decision.allowed) {
  // proceed with request
} else {
  // respond with 429 or queue
  res.status(429).json({ error: 'rate_limited' });
}

That's it. The SDK handles pulse telemetry and policy fetching in the background. gate() is always fast, always synchronous, always fail-open.

Tags & Weights

Tags categorize traffic. Weights determine priority. When the reflex engine fires, low-weight traffic is shed first.

Pass a tag and weight to gate() on each call.
Create tags in the dashboard with default weights. The SDK doesn't need to know about tags ahead of time.
Reflex rules can target specific tags (e.g. block "free" when latency > 500ms) or apply to all traffic.

How weight-based gating works

The control plane sets a globalMaxWeight and per-tag tagMaxWeights in the policy. When you call gate(tag, weight), the SDK compares your weight against these limits. If your weight exceeds the max, you're denied.

Example: during a latency spike, the policy might set globalMaxWeight: 5. Traffic with weight 1 ("free") still passes because 1 ≤ 5, while traffic with weight 10 ("enterprise") is denied because 10 > 5. To exempt specific tags, the policy can set tagMaxWeights, for example {"enterprise": Infinity}. To shed a lower tier instead, a reflex rule can reduce that tag's max weight (a block sets it to 0).
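
The comparison rule stated above (a call is denied when its weight exceeds the applicable max) can be sketched as a pure function. This is an illustration, not the SDK's source: the name gateSketch, the precedence of a per-tag limit over the global one, and the mapping from limits to reason strings are all assumptions.

```javascript
// Sketch only: mirrors the documented comparison, not the SDK implementation.
// Assumption: a per-tag limit, when present, takes precedence over the global one.
function gateSketch(policy, tag = '__default__', weight = 1) {
  const tagMax = policy.tagMaxWeights?.[tag];
  const max = tagMax ?? policy.globalMaxWeight ?? Infinity;
  if (weight > max) {
    // Assumed reason mapping: a tag limit of 0 is a tag block, anything else is over weight
    return { allowed: false, reason: tagMax === 0 ? 'tag_blocked' : 'over_weight' };
  }
  return { allowed: true, reason: 'allowed' };
}

const policy = { globalMaxWeight: 5, tagMaxWeights: { enterprise: Infinity } };
console.log(gateSketch(policy, 'free', 1));        // weight 1 does not exceed 5
console.log(gateSketch(policy, 'pro', 7));         // 7 exceeds 5
console.log(gateSketch(policy, 'enterprise', 10)); // exempted by tagMaxWeights
```

Keeping the check a pure function of (policy, tag, weight) is what lets it stay synchronous and free of network calls.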

Reflex Rules

Reflex rules define automated responses to backend health metrics. They run in the control plane's PulseAggregator and affect the policy that gets pushed to your SDK instances.

Rule schema

{
  "tagName": "free",
  "metric": "latency",
  "operator": "gt",
  "threshold": 500,
  "action": "block",
  "actionValue": null,
  "enabled": true,
  "priority": 1
}

Field	Type	Description
`tagName`	string | null	Target tag. `null` means all traffic.
`metric`	string	Metric name: `latency`, `errors`, `p50_latency`, `p95_latency`, `p99_latency`, or any custom metric name.
`operator`	string	`gt`, `gte`, `lt`, `lte`.
`threshold`	number	Value to compare against.
`action`	string	`throttle` or `block`.
`actionValue`	number | null	Multiplier for throttle (e.g. 0.5 = halve allowed weight). Null for block.
`enabled`	boolean	Whether the rule is active. Defaults to `true`.
`priority`	number	Evaluation order. Lower = higher priority.

Throttle vs Block

Action	Effect on policy	Use when
`block`	Sets the tag's max weight to `0`. All traffic for that tag is denied.	Origin is in danger. Shed this traffic class entirely.
`throttle`	Multiplies the tag's max weight by `actionValue`. E.g. `0.5` halves it.	Origin is stressed but not critical. Reduce load gradually.

Throttle example

{
  "tagName": "pro",
  "metric": "latency",
  "operator": "gt",
  "threshold": 500,
  "action": "throttle",
  "actionValue": 0.5,
  "enabled": true,
  "priority": 2
}

When latency exceeds 500ms, the reflex engine multiplies pro's max weight by 0.5. If the default max weight for pro is 10, it drops to 5. A gate('pro', 5) call still passes, but gate('pro', 7) would be denied.
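
To make the arithmetic concrete, here is a minimal sketch of how one rule could adjust a tag's max weight. The helper name applyRuleSketch and the operator table are illustrative assumptions; the actual evaluation happens in the control plane and may differ.

```javascript
// Sketch of how a single reflex rule could adjust a tag's max weight.
const operators = {
  gt: (a, b) => a > b,
  gte: (a, b) => a >= b,
  lt: (a, b) => a < b,
  lte: (a, b) => a <= b,
};

function applyRuleSketch(currentMax, rule, metrics) {
  if (rule.enabled === false) return currentMax;
  const observed = metrics[rule.metric] ?? 0; // missing telemetry evaluates as zero
  if (!operators[rule.operator](observed, rule.threshold)) return currentMax;
  // block sets the max to 0; throttle multiplies it by actionValue
  return rule.action === 'block' ? 0 : currentMax * rule.actionValue;
}

const proRule = {
  tagName: 'pro', metric: 'latency', operator: 'gt',
  threshold: 500, action: 'throttle', actionValue: 0.5,
};
const proMax = applyRuleSketch(10, proRule, { latency: 620 }); // 10 * 0.5
console.log(proMax, 5 <= proMax, 7 <= proMax); // prints: 5 true false
```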

Scenario: layered rules

Three rules, evaluated by priority:

1. Block free tier when latency is critical

{
  "tagName": "free",
  "metric": "latency",
  "operator": "gt",
  "threshold": 1000,
  "action": "block",
  "priority": 1
}

2. Throttle free tier when latency is elevated

{
  "tagName": "free",
  "metric": "latency",
  "operator": "gt",
  "threshold": 500,
  "action": "throttle",
  "actionValue": 0.5,
  "priority": 2
}

3. Throttle pro tier when errors spike

{
  "tagName": "pro",
  "metric": "errors",
  "operator": "gt",
  "threshold": 50,
  "action": "throttle",
  "actionValue": 0.7,
  "priority": 3
}

Progressive load shedding

As conditions worsen, more rules fire and lower tiers are shed first:

Conditions	Rules fired	free	pro	enterprise
latency: 80ms, errors: 2	none	allowed	allowed	allowed
latency: 600ms	#2	throttled	allowed	allowed
latency: 1200ms	#1	blocked	allowed	allowed
latency: 1200ms, errors: 60	#1, #3	blocked	throttled	allowed

The key idea: throttle gives you a gradient between "fully allowed" and "fully blocked." Use low thresholds with throttle as an early warning, and high thresholds with block as a circuit breaker. Stack multiple rules to create progressive load shedding.
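
The table above can be reproduced with a short sketch. It assumes that, for each tag, the first matching rule in priority order wins (which is what the "Rules fired" column suggests); tierStates is a hypothetical helper and only handles the gt operator used by these three rules.

```javascript
const rules = [
  { tagName: 'free', metric: 'latency', operator: 'gt', threshold: 1000, action: 'block', priority: 1 },
  { tagName: 'free', metric: 'latency', operator: 'gt', threshold: 500, action: 'throttle', actionValue: 0.5, priority: 2 },
  { tagName: 'pro', metric: 'errors', operator: 'gt', threshold: 50, action: 'throttle', actionValue: 0.7, priority: 3 },
];

// Assumption: for each tag, the first matching rule in priority order wins.
function tierStates(metrics, tags = ['free', 'pro', 'enterprise']) {
  const sorted = [...rules].sort((a, b) => a.priority - b.priority);
  const states = {};
  for (const tag of tags) {
    const hit = sorted.find(
      (r) => (r.tagName === null || r.tagName === tag) && (metrics[r.metric] ?? 0) > r.threshold
    );
    states[tag] = !hit ? 'allowed' : hit.action === 'block' ? 'blocked' : 'throttled';
  }
  return states;
}

console.log(tierStates({ latency: 600 }));
console.log(tierStates({ latency: 1200, errors: 60 }));
```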

Manage rules in the dashboard or via the management API.

Pulse & Policy Flow

The SDK communicates with the control plane through two channels. You don't interact with either directly.

Pulses (SDK → Control Plane)

The SDK periodically sends a pulse containing gate() call counts, per-tag metrics, and process-level latency and error counts. Pulses are HMAC-signed with your secret key.

Pulse headers

POST /v1/pulse
x-waitstate-id: ws_pub_xxx          # your publishKey
x-waitstate-signature: <hmac-hex>   # HMAC-SHA256(body + '.' + timestamp, secretKey)
x-waitstate-timestamp: 1740000060000
content-type: application/json

Pulse payload

{
  "instanceId": "web-01",
  "siteId": "site-prod",
  "usageDelta": 967,
  "bouncedUnits": 47,
  "metrics": { "latency": 142, "errors": 0 },
  "tagMetrics": [
    { "tag": "free", "latency": 0, "count": 847 },
    { "tag": "pro",  "latency": 0, "count": 120,
      "customMetrics": {
        "queue_depth": { "min": 3, "max": 87, "sum": 450, "count": 10 }
      }
    }
  ],
  "ts": 1740000060000
}

Field	Description
`instanceId`	Unique identifier for this SDK instance. Defaults to `randomUUID()` if not set.
`siteId`	Optional. Shards this instance into a separate aggregator for independent health tracking.
`usageDelta`	Number of `gate()` calls since the last pulse. Used for billing and plan cap enforcement.
`bouncedUnits`	Number of `gate()` calls that were denied since the last pulse.
`metrics`	Process-level metrics: `latency` (average ms), `errors` (count). The reflex engine evaluates rules against these.
`tagMetrics`	Array of per-tag counters. Each entry has `tag` (name), `latency` (avg ms), `count` (calls), and optional `customMetrics` (min/max/sum/count aggregates from `report()`). Drives per-tag reflex rules.
`ts`	Timestamp in epoch milliseconds. Must match the `x-waitstate-timestamp` header used in HMAC signing.

Policy (Control Plane → SDK)

The pulse response contains the current policy. The SDK also polls GET /v1/policy/{orgId} with a JWT for edge-cached reads. Both sources update the same in-memory policy cache.

Policy shape

{
  "globalMaxWeight": 5,
  "tagMaxWeights": { "free": 0, "pro": 3 },
  "pulseInterval": 2000,
  "leaseDurationSeconds": 120,
  "status": "ok"            // "ok" | "cap_exceeded"
}

Field	Description
`globalMaxWeight`	Maximum weight allowed globally. Traffic with weight above this is denied.
`tagMaxWeights`	Per-tag weight overrides. A tag set to `0` is fully blocked.
`pulseInterval`	Server-controlled pulse interval in ms. SDK adjusts automatically.
`killSignal`	Emergency kill. SDK stops pulsing, resets to fail-open, sleeps 24h.
`status`	`ok`, `over_limit`, or `cap_exceeded`. When cap is exceeded, SDK fails open and backs off to 5-minute pulses.
`leaseDurationSeconds`	How long the SDK trusts the current policy without hearing from the control plane. If the SDK doesn't receive a successful pulse or policy response within this window, it enters safe mode, where behavior follows `safeModeStrategy` (for example, 50 req/sec per instance under `fixed_rps`). Varies by plan: Hobby 300s, Pro 120s, Enterprise 60s.

Key insight: gate() never makes a network call. It reads from the in-memory policy cache. Pulses and policy fetches happen in the background on a timer. This is why gate() is synchronous and sub-millisecond.

Authentication

WaitState uses a two-key model:

Key	Purpose	Visibility
`publishKey`	Identifies your organization. Sent as `x-waitstate-id` on pulses.	Semi-public (in your app code).
`secretKey`	Signs pulses (HMAC-SHA256). Exchanged for a JWT. Never sent as plaintext on the wire.	Secret. Environment variable only.

Pulse auth (HMAC-SHA256)

Each pulse is signed: HMAC-SHA256(body + '.' + timestamp, secretKey), where timestamp is the x-waitstate-timestamp value. The signature is sent as x-waitstate-signature. The control plane verifies it before accepting the pulse.
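
For readers who want to see the signing step spelled out, the sketch below follows the message layout given in the pulse headers listing (body + '.' + timestamp). The function names are hypothetical and deliberately differ from the SDK's exported signPulse and createSignedHeaders; treat the exact byte layout as an assumption to verify against the SDK.

```javascript
// Sketch: sign a pulse body as described in the pulse headers listing.
// The dynamic import keeps the snippet loadable as either ESM or CommonJS.
async function signPulseSketch(body, timestamp, secretKey) {
  const { createHmac } = await import('node:crypto');
  return createHmac('sha256', secretKey)
    .update(`${body}.${timestamp}`) // assumed message layout: body + '.' + timestamp
    .digest('hex');
}

async function buildPulseHeadersSketch(payload, publishKey, secretKey) {
  const body = JSON.stringify(payload);
  const timestamp = String(payload.ts); // header timestamp must match the payload ts
  return {
    body,
    headers: {
      'x-waitstate-id': publishKey,
      'x-waitstate-signature': await signPulseSketch(body, timestamp, secretKey),
      'x-waitstate-timestamp': timestamp,
      'content-type': 'application/json',
    },
  };
}
```

The body must be serialized once and the same string used for both signing and sending; re-serializing the payload can change the bytes and invalidate the signature.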

Policy auth (JWT)

On init, the SDK exchanges your keys for a JWT via POST /v1/auth/token. The JWT is used for GET /v1/policy/{orgId} reads and auto-refreshes 2 minutes before expiry.
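
The refresh timing amounts to a small calculation, sketched here. It assumes the token response's expiresAt is an epoch-milliseconds value, which this page does not specify; refreshDelayMs is a hypothetical helper.

```javascript
const REFRESH_LEAD_MS = 2 * 60 * 1000; // refresh 2 minutes before expiry

// Returns how long to wait before refreshing; 0 means refresh immediately.
// Assumption: expiresAtMs is an epoch-milliseconds timestamp.
function refreshDelayMs(expiresAtMs, nowMs = Date.now()) {
  return Math.max(0, expiresAtMs - nowMs - REFRESH_LEAD_MS);
}

console.log(refreshDelayMs(1_000_000, 0)); // 880000
```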

Management auth

Dashboard API calls use session-based auth (cookie). Programmatic management uses a bearer token.

Site Sharding

By default, the control plane routes all pulses from the same organization to a single aggregator. If you run multiple sites (e.g. staging, production, or separate apps) within one org, you can isolate them with siteId.

Pass siteId when creating the SDK instance. Each unique siteId gets its own aggregator with independent health tracking, reflex evaluation, and usage accounting. Sites are created automatically on the first pulse. No manual setup required. You can view and manage your sites in the dashboard.

If omitted, siteId defaults to your orgId (one aggregator per org).
Billing is still per-org. Monthly usage is tracked by orgId regardless of how many sites you have.
Each site gets its own policy. A latency spike on staging won't throttle production.
The number of sites is capped per plan: Hobby allows 1, Pro allows 10, Enterprise is unlimited.
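
The routing and billing rules in this list can be summarized in a few lines. This is a sketch under assumptions: the control plane's real routing key and ledger are not documented here, and both function names are hypothetical.

```javascript
// An omitted siteId falls back to the orgId, giving one aggregator per org.
function aggregatorKey(orgId, siteId) {
  return siteId ?? orgId;
}

// Monthly usage is summed per org, no matter which site reported it.
function monthlyUsageByOrg(pulses) {
  const totals = new Map();
  for (const { orgId, usageDelta } of pulses) {
    totals.set(orgId, (totals.get(orgId) ?? 0) + usageDelta);
  }
  return totals;
}

console.log(aggregatorKey('org_1'), aggregatorKey('org_1', 'site-prod'));
```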

Fail-Open & Safe Mode

WaitState is designed to never block traffic due to its own failures. But unlimited traffic during a prolonged outage is also dangerous. The lease mechanism provides a safety net.

Fail-open guarantees

Control plane unreachable (short-term): SDK uses the last cached policy. Gate decisions continue normally.
Never synced: If no policy was ever fetched (bootstrap), the default policy allows everything (globalMaxWeight: Infinity).
Pulse fails: SDK retries on next interval. Telemetry counters are preserved and re-sent on the next successful pulse.
Auth fails: SDK starts pulsing anyway with the default interval. Init errors go to onError.
No reflex rules: All traffic is allowed.
Unknown tag: gate() allows the call if the weight does not exceed the global max.
Kill signal: SDK stops pulsing, resets to fail-open default, and resumes after 24 hours.
Monthly cap exceeded: Control plane returns status: cap_exceeded. SDK fails open (globalMaxWeight: Infinity) and backs off to 5-minute pulse intervals. When the new month starts, normal policy resumes automatically.

Lease & safe mode

Each policy includes a leaseDurationSeconds field. This is the maximum time the SDK trusts its cached policy without hearing from the control plane. If no successful pulse or policy response arrives within the lease window, the SDK enters safe mode.

Plan	Lease Duration
Hobby	5 minutes (300s)
Pro	2 minutes (120s)
Enterprise	1 minute (60s)

Safe mode behavior

The safeModeStrategy constructor option controls what happens when the lease expires:

Strategy	Behavior
`open` (default)	Fail fully open—all gate calls are allowed regardless of weight or tag.
`fixed_rps`	Throttle to `safeModeMaxRps` requests per second per instance (default: 50). Excess calls are denied.
`last_policy`	Keep enforcing the last cached policy as-is. Gate checks continue using stale thresholds until sync resumes.

In all strategies:

gate() returns reason: lease_expired for all calls (both allowed and denied).
A [WAITSTATE-FATAL] log message fires once when safe mode activates.
When the SDK successfully syncs again, normal policy-based gating resumes.
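
The three strategies can be illustrated with a small decision wrapper. This is a sketch, not the SDK's implementation: createSafeModeGate is a hypothetical name, the lease is modeled as an expiry timestamp (null before the first sync), and fixed_rps is approximated with a simple one-second window counter, which may not match the SDK's actual limiter.

```javascript
// Sketch of lease expiry and the three safeModeStrategy behaviours.
function createSafeModeGate({ strategy = 'open', maxRps = 50, now = Date.now } = {}) {
  let windowStart = 0;
  let usedInWindow = 0;

  return function decide(leaseExpiresAt, normalDecision) {
    // Bootstrap (no sync yet, so no lease) or a live lease: normal gating applies.
    if (leaseExpiresAt === null || now() < leaseExpiresAt) return normalDecision();

    let allowed = true; // 'open': allow everything
    if (strategy === 'last_policy') {
      allowed = normalDecision().allowed; // keep enforcing the stale policy
    } else if (strategy === 'fixed_rps') {
      const t = now();
      if (t - windowStart >= 1000) {
        windowStart = t; // start a new one-second window
        usedInWindow = 0;
      }
      allowed = usedInWindow < maxRps;
      if (allowed) usedInWindow += 1;
    }
    // In safe mode, every result carries reason lease_expired, allowed or not.
    return { allowed, reason: 'lease_expired' };
  };
}
```

Injecting the clock through the now option keeps the sketch deterministic under test.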

Why not just fail-open forever?

Pure fail-open means a prolonged control plane outage removes WaitState's intelligence layer entirely. Safe mode lets you choose what happens instead: with fixed_rps, your API keeps serving traffic but at a capped rate that won't overwhelm your backend; with last_policy, the last known limits stay in force. Your existing rate limiters and circuit breakers still apply either way.

Bootstrap grace period

The lease clock only starts after the first successful sync. During bootstrap (before the SDK has ever reached the control plane), gate() is fully fail-open. This prevents false safe-mode activations during cold starts or deployment rollouts.

Design principle: WaitState fails open during transient issues and transitions to safe mode during prolonged outages. You control the strategy via safeModeStrategy in the constructor—open to fail fully open, fixed_rps to throttle, or last_policy to keep enforcing stale rules.

Telemetry & Metrics

Every SDK and the agent expose the same telemetry methods. These feed into the pulse payload so the reflex engine can react to backend health. Without telemetry, reflex rules on latency and errors evaluate against zero.

Built-in metrics

All SDKs provide methods for reporting latency and errors. Method names follow each language's conventions (reportLatency in JS, report_latency in Rust/Python) but the semantics and aggregation behavior are identical.

Method	Description
`reportLatency(ms, tag?)`	Record a latency observation in milliseconds. Averaged per-tag at pulse time.
`reportError(tag?)`	Increment the error counter for a tag. Summed at pulse time.
`startTimer(tag?)`	Returns a stop function. Calling stop reports elapsed ms as latency via `reportLatency`.

// Timer pattern (recommended)
const stop = ws.startTimer('search');
const result = await doSearch(query);
stop(); // reports elapsed ms as latency

// Direct reporting
ws.reportLatency(42.5, 'search');  // ms
ws.reportError('search');
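
One detail worth noting about the timer pattern: if the awaited call throws, stop() is never reached. A try/finally wrapper avoids that. The sketch below uses a stand-in telemetry object so it runs on its own; createTelemetryStub and timed are hypothetical helpers, and only startTimer, reportLatency, and reportError correspond to documented SDK methods.

```javascript
// Minimal stand-in for the SDK's telemetry methods so the sketch runs on its own.
function createTelemetryStub() {
  const latencies = [];
  const errors = [];
  const reportLatency = (ms, tag = '__default__') => latencies.push({ ms, tag });
  return {
    latencies,
    errors,
    reportLatency,
    reportError: (tag = '__default__') => errors.push(tag),
    startTimer(tag) {
      const started = performance.now();
      return () => reportLatency(performance.now() - started, tag);
    },
  };
}

// Hypothetical helper: time any async operation and count failures.
async function timed(ws, tag, fn) {
  const stop = ws.startTimer(tag);
  try {
    return await fn();
  } catch (err) {
    ws.reportError(tag);
    throw err;
  } finally {
    stop(); // latency is recorded on success and on failure
  }
}
```

With a real instance, the call would look like timed(ws, 'search', () => doSearch(query)).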

Custom metrics: report(metric, value, tag?)

Report a custom metric value. Each metric is aggregated as min/max/sum/count per pulse window, giving you average, minimum, and maximum values in the control plane. Create reflex rules on any custom metric name to trigger automated responses.

// Report a custom metric (min/max/sum/count aggregated per pulse)
ws.report('queue_depth', 87, 'search');
ws.report('cache_hit_rate', 0.92, 'search');

// Without a tag, defaults to '__default__'
ws.report('active_connections', 142);

Param	Type	Default	Description
`metric`	string	–	Metric name. Lowercase letters, digits, and underscores. Must start with a letter. Max 63 characters. Cannot be a reserved name (`latency`, `errors`, `p50_latency`, `p95_latency`, `p99_latency`).
`value`	number	–	The metric value to record.
`tag`	string	`__default__`	Tag to associate the metric with.
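
The naming rules in the table translate directly into a check you can run before calling report(). This is a sketch of those rules only; isValidMetricName is a hypothetical helper, and how the SDK itself reacts to an invalid name is not covered here.

```javascript
const RESERVED_METRICS = new Set(['latency', 'errors', 'p50_latency', 'p95_latency', 'p99_latency']);

// One lowercase letter, then up to 62 lowercase letters, digits, or underscores (63 max).
const METRIC_NAME = /^[a-z][a-z0-9_]{0,62}$/;

function isValidMetricName(name) {
  return typeof name === 'string' && METRIC_NAME.test(name) && !RESERVED_METRICS.has(name);
}

console.log(isValidMetricName('queue_depth')); // true
console.log(isValidMetricName('latency'));     // false (reserved)
console.log(isValidMetricName('9lives'));      // false (must start with a letter)
```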

Custom metric reflex rule

Custom metrics work with the same reflex rule engine as latency and errors. The reflex engine evaluates the metric's average (sum/count) against your threshold.

// Block search traffic when queue depth exceeds 100
{
  "tagName": "search",
  "metric": "queue_depth",
  "operator": "gt",
  "threshold": 100,
  "action": "block",
  "actionValue": null,
  "enabled": true,
  "priority": 1
}
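
Because rules compare against the window average rather than the peak, a few high samples may not trip a threshold. The sketch below shows the min/max/sum/count aggregation and the average comparison for a gt rule; the helper names are illustrative and the sample values are made up.

```javascript
function createAggregate() {
  return { min: Infinity, max: -Infinity, sum: 0, count: 0 };
}

function record(agg, value) {
  agg.min = Math.min(agg.min, value);
  agg.max = Math.max(agg.max, value);
  agg.sum += value;
  agg.count += 1;
  return agg;
}

// A gt rule compares the window average (sum / count) with the threshold.
function averageExceeds(agg, threshold) {
  return agg.count > 0 && agg.sum / agg.count > threshold;
}

const depth = createAggregate();
[3, 87, 60, 150, 150].forEach((v) => record(depth, v));
// Average is 450 / 5 = 90, so a "gt 100" rule does not fire even though max is 150.
console.log(depth, averageExceeds(depth, 100));
```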

Plan limits

Plan	Custom metrics per tag
Hobby	–
Pro	5
Enterprise	Unlimited

Metrics reported via report() always flow through the pipeline. The plan limit controls how many distinct custom metric names per tag can be used in reflex rules. Names beyond the limit are silently trimmed: names are sorted alphabetically and only the first N are kept.
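
A short sketch of the trimming rule makes the consequence visible: which names survive depends on their alphabetical position, not on when they were first reported. It assumes the Hobby entry ("–") means zero custom metrics and that "alphabetical" matches JavaScript's default string sort; both are assumptions, as is the helper name.

```javascript
// Assumption: Hobby's "–" means no custom metrics are available to reflex rules.
const CUSTOM_METRIC_LIMITS = { hobby: 0, pro: 5, enterprise: Infinity };

function trimCustomMetrics(names, plan) {
  const limit = CUSTOM_METRIC_LIMITS[plan] ?? 0;
  // De-duplicate, sort alphabetically, keep the first N.
  return [...new Set(names)].sort().slice(0, limit);
}

console.log(trimCustomMetrics(['queue_depth', 'active_connections', 'cache_hit_rate'], 'pro'));
```

If a metric matters for a reflex rule on a capped plan, choosing a name that sorts early is one way to keep it inside the limit.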

All SDKs, same interface. The method names follow each language's conventions (reportLatency in JS, report_latency in Rust/Python) but the semantics, aggregation behavior, and naming rules are identical across every SDK and the agent.

Use Cases

WaitState works anywhere you need to shed traffic intelligently under pressure: GraphQL APIs, AI gateways, e-commerce, multi-tenant SaaS, fintech, gaming, IoT, and more. Each use case includes integration code, reflex rules, and a walkthrough of what happens under load.


JavaScript / TypeScript SDK

The official JS/TS SDK. Zero dependencies. All SDKs share the same gate(tag, weight) interface and pulse/policy protocol.

Package: @waitstate/sdk
Runtime: Node.js ≥ 20.3.0
Module: ESM and CJS (dual exports)

Install

npm install @waitstate/sdk

Constructor

const ws = new WaitState({
  publishKey: process.env.WAITSTATE_PUBLISH_KEY,
  secretKey: process.env.WAITSTATE_SECRET_KEY,
  siteId: 'site-prod',            // optional, shard key (defaults to orgId)
  instanceId: 'web-01',           // optional, defaults to randomUUID()
  pulseInterval: 5000,            // optional, ms between pulses (server can override)
  safeModeStrategy: 'open',         // optional, 'open' | 'fixed_rps' | 'last_policy'
  safeModeMaxRps: 50,               // optional, max req/sec in safe mode (default: 50)
  baseUrl: 'https://api.waitstate.io',  // optional
  onError: (err) => console.error('[waitstate]', err),
});

Option	Type	Default	Description
`publishKey`	string	–	Required. Your publish key from the dashboard.
`secretKey`	string	–	Required. Your secret key. Store in env vars.
`siteId`	string	orgId	Shard key for aggregator routing. Use to isolate sites within the same org.
`instanceId`	string	`randomUUID()`	Identifies this SDK instance in telemetry.
`pulseInterval`	number	`5000`	Initial pulse interval in ms. Server can override via policy.
`baseUrl`	string	`https://api.waitstate.io`	Control plane URL.
`safeModeStrategy`	string	`open`	Behavior when the lease expires. `open` = fail fully open, `fixed_rps` = throttle to `safeModeMaxRps`, `last_policy` = keep enforcing the last cached policy.
`safeModeMaxRps`	number	`50`	Max requests per second in safe mode. Only used when `safeModeStrategy` is `fixed_rps`.
`onError`	function	–	Error callback. Init failures, pulse failures, auth failures.

gate(tag?, weight?)

Synchronous. Returns a GateResult. Never throws.

// Tag-based gating with weight
const d1 = ws.gate('enterprise', 10);  // high-priority traffic
const d2 = ws.gate('free', 1);         // low-priority traffic
const d3 = ws.gate();                  // no tag, weight defaults to 1

// Check the reason for denial
if (!d2.allowed) {
  switch (d2.reason) {
    case 'tag_blocked':   // this tag is explicitly blocked by policy
    case 'over_weight':   // weight exceeds global max
    case 'global_block':  // all traffic blocked
    case 'kill_signal':   // emergency kill from control plane
    case 'lease_expired': // lease expired and safe mode denied the call (e.g. fixed_rps over safeModeMaxRps)
      break;
  }
}

Param	Type	Default	Description
`tag`	string	`__default__`	Traffic tag (e.g. "free", "pro", "enterprise").
`weight`	number	`1`	Request weight. Higher = higher priority.

GateResult

// Allowed
{ "allowed": true, "reason": "allowed" }

// Denied
{ "allowed": false, "reason": "tag_blocked" }
{ "allowed": false, "reason": "over_weight" }
{ "allowed": false, "reason": "global_block" }
{ "allowed": false, "reason": "kill_signal" }

// Safe mode (lease expired, behavior depends on safeModeStrategy)
{ "allowed": true,  "reason": "lease_expired" }
{ "allowed": false, "reason": "lease_expired" }

Field	Type	Description
`allowed`	boolean	Whether the request should proceed.
`reason`	string	`allowed`, `over_weight`, `tag_blocked`, `global_block`, `kill_signal`, or `lease_expired`.

Telemetry & custom metrics

The JS SDK exposes reportLatency(), reportError(), startTimer(), and report() for custom metrics. These are platform-wide features available in every SDK and the agent. See Telemetry & Metrics for full documentation, naming rules, and plan limits.

shutdown()

Async. Stops the pulse timer, clears the auth refresh timer, and sends one final pulse to flush pending telemetry.

// Flush pending pulses on graceful shutdown
process.on('SIGTERM', async () => {
  await ws.shutdown();
  process.exit(0);
});

Advanced exports

For low-level use, the SDK also exports individual components:

gate(policy, tag?, weight?): standalone pure function, takes a BouncerPolicy directly
PolicyCache: in-memory policy store (get(), update(), reset())
TelemetryCollector: accumulates gate metrics between pulses
signPulse(body, timestamp, secretKey): returns HMAC-SHA256 hex digest
createSignedHeaders(body, publishKey, secretKey): returns signed header object
AuthManager: manages JWT token lifecycle
WaitStateError: error class with status, code, message

Example: Express middleware

import express from 'express';
import { WaitState } from '@waitstate/sdk';

const app = express();
const ws = new WaitState({
  publishKey: process.env.WAITSTATE_PUBLISH_KEY,
  secretKey: process.env.WAITSTATE_SECRET_KEY,
});

app.use((req, res, next) => {
  const tag = req.user?.plan ?? 'free';
  const weight = tag === 'enterprise' ? 10 : tag === 'pro' ? 5 : 1;
  const decision = ws.gate(tag, weight);

  if (!decision.allowed) {
    return res.status(429).json({ error: 'rate_limited' });
  }

  // Track latency and errors for the reflex engine
  const stop = ws.startTimer(tag);
  res.on('finish', () => {
    stop();
    if (res.statusCode >= 500) ws.reportError(tag);
  });
  next();
});

Framework middleware

Drop-in middleware for popular frameworks. Each is a separate subpath export with zero runtime dependencies on the framework — types are structural.

Express

import express from 'express';
import { WaitState } from '@waitstate/sdk';
import { createMiddleware } from '@waitstate/sdk/express';

const app = express();
const ws = new WaitState({
  publishKey: process.env.WAITSTATE_PUBLISH_KEY,
  secretKey: process.env.WAITSTATE_SECRET_KEY,
});

app.use(createMiddleware({ waitstate: ws }));

Fastify

import Fastify from 'fastify';
import { WaitState } from '@waitstate/sdk';
import { createPlugin } from '@waitstate/sdk/fastify';

const app = Fastify();
const ws = new WaitState({
  publishKey: process.env.WAITSTATE_PUBLISH_KEY,
  secretKey: process.env.WAITSTATE_SECRET_KEY,
});

createPlugin({ waitstate: ws })(app);

Hono

import { Hono } from 'hono';
import { WaitState } from '@waitstate/sdk';
import { createMiddleware } from '@waitstate/sdk/hono';

const app = new Hono();
const ws = new WaitState({
  publishKey: process.env.WAITSTATE_PUBLISH_KEY,
  secretKey: process.env.WAITSTATE_SECRET_KEY,
});

app.use('*', createMiddleware({ waitstate: ws }));

Next.js

import { WaitState } from '@waitstate/sdk';
import { withGate } from '@waitstate/sdk/nextjs';

const ws = new WaitState({
  publishKey: process.env.WAITSTATE_PUBLISH_KEY,
  secretKey: process.env.WAITSTATE_SECRET_KEY,
});

export const GET = withGate({ waitstate: ws }, async (req) => {
  return Response.json({ ok: true });
});

Middleware options

All middleware accept the same options:

createMiddleware({
  waitstate: ws,
  tagFrom: 'x-waitstate-tag',         // header name, or (req) => string
  weightFrom: 'x-waitstate-weight',   // header name, or (req) => number
  retryAfter: 60,                     // Retry-After header value (seconds)
  onDenied: (req, res, result) => {   // custom deny handler
    res.statusCode = 429;
    res.end('Too many requests');
  },
});

Option	Type	Default	Description
`waitstate`	WaitState	–	Required. Your WaitState instance.
`tagFrom`	string | function	`x-waitstate-tag`	Header name or function returning the tag.
`weightFrom`	string | function	–	Header name or function returning the weight.
`retryAfter`	number	`60`	Retry-After header value in seconds on 429.
`onDenied`	function	–	Custom deny handler. Overrides the default 429 response.

Each middleware automatically reports latency (on response) and errors (on 5xx) to the reflex engine.
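
Since tagFrom and weightFrom each accept either a header name or a function, it can help to see how such an option might be resolved. The sketch below is an assumption about the behavior, not the middleware's source: resolveGateArgs is a hypothetical helper, and the fallbacks ('__default__' and 1) are borrowed from gate()'s documented defaults.

```javascript
// Sketch: resolve an option that is either a header name or a function of the request.
function resolveOption(option, req, fallback) {
  if (typeof option === 'function') return option(req);
  // Node lowercases incoming header names, so look the name up in lowercase.
  if (typeof option === 'string') return req.headers[option.toLowerCase()] ?? fallback;
  return fallback;
}

function resolveGateArgs(req, { tagFrom = 'x-waitstate-tag', weightFrom } = {}) {
  const tag = resolveOption(tagFrom, req, '__default__');
  const weight = Number(resolveOption(weightFrom, req, 1));
  // Header values are strings; fall back to 1 if the value is not numeric.
  return { tag, weight: Number.isFinite(weight) ? weight : 1 };
}

const req = { headers: { 'x-waitstate-tag': 'pro', 'x-waitstate-weight': '5' }, user: { plan: 'enterprise' } };
console.log(resolveGateArgs(req, { weightFrom: 'x-waitstate-weight' }));
console.log(resolveGateArgs(req, { tagFrom: (r) => r.user.plan, weightFrom: () => 10 }));
```

Deriving the tag from authenticated user data (the function form) is generally safer than trusting a client-supplied header, since a caller could otherwise claim a higher tier.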

Gotchas

Node.js only. The JS/TS SDK uses node:crypto for HMAC signing. It does not run in browsers or edge runtimes. For Cloudflare Workers, Fastly Compute, and Vercel Edge, use the Edge SDK (@waitstate/edge), which is compiled from the Rust core to WASM.

Constructor is async internally. The new WaitState() call returns immediately, but fires off auth + first pulse in the background. If these fail, errors go to onError and the SDK still starts pulsing (fail-open). There is no await on the constructor.

Pulse interval is server-controlled. You set an initial pulseInterval, but the control plane can override it via the pulse response. The SDK will automatically adjust.

One instance per process. Each WaitState instance creates its own pulse timer and auth lifecycle. In most apps, create one instance at startup and share it.

Always call shutdown(). Without it, the final pulse won't flush and you'll lose telemetry from the last interval. Wire it to SIGTERM / SIGINT.

Rust SDK

The Rust SDK (waitstate-rs) provides the same gate(tag, weight) interface with lock-free, zero-allocation gate checks. Built for high-throughput services where microsecond-level overhead matters.

Crate: waitstate-rs
Rust: 1.93+, edition 2024
TLS: rustls (no OpenSSL dependency)

Install

cargo add waitstate-rs

Methods

Method	Description
`gate(tag, weight)`	Lock-free gate check. Single-digit microseconds. Same semantics as the JS SDK.
`report_latency(ms, tag)`	Record a latency observation. Truncated to integer ms, averaged per-tag at pulse time.
`report_error(tag)`	Increment the error counter for a tag. Summed at pulse time.
`report(metric, value, tag)`	Report a custom metric. Aggregated as min/max/sum/count per pulse. Same naming rules as JS SDK.
`set_policy(policy)`	Override the current policy (for testing).
`shutdown()`	Abort background tasks.

Internals

Policy stored in ArcSwap for lock-free reads on the gate path.
Per-tag counters use DashMap with AtomicU64 for contention-free telemetry accumulation.
Background tasks (pulse every 20s, sync every 30s) run on Tokio and never block the gate call.
HMAC-SHA256 signing via hmac/sha2 crates. JWT auth for policy reads.
Same fail-open and safe-mode guarantees as the JS SDK. If the lease expires under the fixed_rps strategy, gate() throttles to 50 req/sec by default.

Agent (Kubernetes DaemonSet)

The WaitState Agent is a standalone Rust binary that runs as a Kubernetes DaemonSet, one agent pod per node. Instead of embedding the SDK in every service, you deploy the agent once per node and call it over localhost.

Binary: waitstate-agent
Port: 9000
Image: ghcr.io/waitstate-io/engine/waitstate-agent

Routes

Method	Path	Description
POST	`/gate`	Gate check. Pass `tag`, `weight`, and optionally `latency_ms` and `error` in the JSON body. Returns allow/deny.
POST	`/coprocess`	Apollo Router coprocessor. Optionally include `latencyMs` and `error` in the body to report metrics. The agent bundles them into the next pulse.
POST	`/report-latency`	Report a latency observation. Pass `ms` (number) and optional `tag`. Returns 204.
POST	`/report-error`	Report an error. Pass optional `tag`. Returns 204.
POST	`/report-metric`	Report a custom metric. Pass `metric` (name), `value` (number), and optional `tag`. Returns 204.
GET	`/health`	Health check for k8s liveness/readiness probes.

How it works

The agent wraps the Rust SDK internally. It collects metrics from all pods on the node via /gate, /coprocess, and the /report-* routes, aggregates them, and sends a single pulse to the control plane on behalf of every service on that node. This means:

Your services don't need the WaitState SDK as a dependency. Just POST to localhost:9000/gate.
Metrics from all pods on the node are bundled into one pulse, giving the reflex engine a node-level view of health.
Gate checks are still sub-millisecond (localhost HTTP call + in-memory policy lookup in the agent).
Works with any language. If your service can make an HTTP call, it can use WaitState.
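
As an illustration of calling the agent from a service, here is a minimal client sketch. Several details are assumptions: the response body shape ({ allowed: boolean }), the choice to fail open when the agent is unreachable, and the 'agent_unavailable' reason string, none of which are specified on this page. The fetchImpl option exists only so the function can be exercised without a running agent.

```javascript
// Sketch: gate a request through the local agent.
async function gateViaAgent(tag, weight, { baseUrl = 'http://localhost:9000', fetchImpl = fetch } = {}) {
  try {
    const res = await fetchImpl(`${baseUrl}/gate`, {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify({ tag, weight }),
    });
    // Assumption: treat a non-2xx agent response as "fail open".
    if (!res.ok) return { allowed: true, reason: 'agent_unavailable' };
    return await res.json(); // assumed shape: { allowed: boolean, ... }
  } catch {
    // Assumption: an unreachable agent should not block traffic.
    return { allowed: true, reason: 'agent_unavailable' };
  }
}
```

In a request handler this would be used as const decision = await gateViaAgent('free', 1), followed by the same allowed check shown in the Quick Start.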

When to use the agent vs. the SDK

Use the SDK when	Use the agent when
A native SDK exists for your language	Your services are polyglot or no native SDK exists yet
You want zero network hops on the gate path	You want a single deployment for the whole node
You need per-process latency/error metrics	You want node-level metric aggregation
You're running outside Kubernetes	You're already running DaemonSets (Datadog, Fluentd, etc.)

Edge SDK (WASM)

The Edge SDK is the Rust core compiled to WebAssembly. It provides the same gate(tag, weight) interface for edge runtimes where Node.js APIs aren't available.

Package: @waitstate/edge
Runtime: Cloudflare Workers, Fastly Compute, Vercel Edge
Engine: waitstate-wasm (compiled from the Rust core via wasm-pack)

Install

npm install @waitstate/edge

Exported functions

Function	Description
`gate(policy, tag?, weight?)`	Pure gate check against a policy object. Returns `GateResult`.
`sign_pulse(body, timestamp, secretKey)`	HMAC-SHA256 signing for pulse payloads. Returns hex digest.

The Edge SDK is lower-level than the Node.js SDK. You manage the pulse/policy lifecycle yourself (typically via a Durable Object or KV cache). The gate() function is a pure, synchronous check against a policy you provide.

Python SDK

Native extension compiled from the Rust core via UniFFI. The same lock-free gate engine, exposed as a Python package with framework middleware for FastAPI, Flask, and Django.

Package: waitstate
Python: ≥ 3.8
Modes: Embedded (full client) or Agent (HTTP to local sidecar)

Install

pip install waitstate

Basic usage

import os
from waitstate import WaitstateClient, WaitstateConfig

client = WaitstateClient(WaitstateConfig(
    publish_key=os.environ["WAITSTATE_PUBLISH_KEY"],
    secret_key=os.environ["WAITSTATE_SECRET_KEY"],
))

result = client.gate("search", 1.0)
if result.allowed:
    # proceed with request
    pass

Framework middleware

FastAPI

import os
from fastapi import FastAPI, Depends, Request
from waitstate import WaitstateClient, WaitstateConfig
from waitstate.middleware.fastapi import create_dependency, create_response_hook

app = FastAPI()
client = WaitstateClient(WaitstateConfig(
    publish_key=os.environ["WAITSTATE_PUBLISH_KEY"],
    secret_key=os.environ["WAITSTATE_SECRET_KEY"],
))

gate = create_dependency(client)
app.middleware("http")(create_response_hook(client))

@app.get("/search")
async def search(request: Request, _=Depends(gate)):
    return {"results": []}

Flask

from flask import Flask
from waitstate import WaitstateClient, WaitstateConfig
from waitstate.middleware.flask import init_app

app = Flask(__name__)
client = WaitstateClient(WaitstateConfig(
    publish_key="ws_pub_xxx",
    secret_key="ws_sec_xxx",
))

init_app(app, client)

@app.route("/search")
def search():
    return {"results": []}

Django (settings.py)

from waitstate import WaitstateClient, WaitstateConfig

WAITSTATE_CLIENT = WaitstateClient(WaitstateConfig(
    publish_key="ws_pub_xxx",
    secret_key="ws_sec_xxx",
))

MIDDLEWARE = [
    "waitstate.middleware.django.WaitStateMiddleware",
    # ...
]

Agent mode

If you're using the agent sidecar, the Python SDK can talk to it over localhost instead of running the full Rust client in-process:

from waitstate import WaitstateClient, WaitstateConfig

# Agent mode: no keys needed, talks to local sidecar
client = WaitstateClient(WaitstateConfig(
    publish_key="",
    secret_key="",
    agent_url="http://localhost:9000",
))

Custom metrics

The Python SDK supports custom metrics via report(metric, value, tag). Same naming rules and plan limits as the JS and Rust SDKs.

Go SDK

Coming soon. Static binary compiled from the Rust core via CGo. In the meantime, use the agent to gate Go services over localhost.

Java SDK

Coming soon. JNI bridge to the Rust core. Spring Boot starter included. In the meantime, use the agent to gate Java services over localhost.

C# / .NET SDK

Coming soon. Native bindings via P/Invoke from the Rust core. ASP.NET Core middleware included. In the meantime, use the agent to gate .NET services over localhost.

Ruby SDK

Coming soon. Native gem compiled from the Rust core via Magnus. Rails, Sinatra, and Hanami middleware included. In the meantime, use the agent to gate Ruby services over localhost.

Elixir SDK

Coming soon. Rust NIF via Rustler. Plug middleware included. In the meantime, use the agent to gate Elixir services over localhost.

PHP SDK

Coming soon. Native extension compiled from the Rust core via ext-php-rs. Laravel, Symfony, and Slim middleware included. In the meantime, use the agent to gate PHP services over localhost.

API Reference

The control plane exposes these endpoints. The SDK handles pulse and policy automatically. Management endpoints are for dashboard or CI use.

SDK endpoints (handled by SDK)

Method	Path	Auth	Description
POST	`/v1/auth/token`	publishKey + secretKey	Exchange keys for a JWT. Returns token, expiresAt, orgId.
POST	`/v1/pulse`	HMAC-SHA256	Send telemetry pulse. Response contains updated policy.
GET	`/v1/policy/:orgId`	Bearer JWT	Fetch current policy (edge-cached).
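The SDK signs pulses for you, so you normally never compute this yourself. For readers who want to see what HMAC-SHA256 signing involves, here is a generic sketch. The string-to-sign (a compact JSON body) and the hex encoding are assumptions made for illustration. The actual pulse headers and signing format are defined in the Pulse auth section and may differ.

```python
import hashlib
import hmac
import json
from typing import Tuple


def sign_pulse(secret_key: str, payload: dict) -> Tuple[bytes, str]:
    """Serialize a payload and compute an HMAC-SHA256 signature over it.

    Assumption: the signature covers the exact request body bytes.
    """
    body = json.dumps(payload, separators=(",", ":"), sort_keys=True).encode("utf-8")
    signature = hmac.new(secret_key.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return body, signature


def verify_pulse(secret_key: str, body: bytes, signature: str) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = hmac.new(secret_key.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

The constant-time comparison matters on the verifying side: a plain `==` can leak how many leading characters of a guessed signature were correct.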

Management endpoints

Method	Path	Description
GET	`/v1/organizations`	List organizations.
POST	`/v1/organizations`	Create organization.
GET	`/v1/api-keys`	List API key pairs.
POST	`/v1/api-keys`	Create key pair.
DELETE	`/v1/api-keys/:id`	Revoke key pair.
POST	`/v1/api-keys/:id/rotate`	Rotate key pair.
GET	`/v1/tags`	List tags.
POST	`/v1/tags`	Create tag.
DELETE	`/v1/tags/:id`	Delete tag.
GET	`/v1/sites`	List sites.
DELETE	`/v1/sites/:id`	Delete site.
GET	`/v1/reflex-rules`	List reflex rules.
POST	`/v1/reflex-rules`	Create reflex rule.
PATCH	`/v1/reflex-rules/:id`	Update reflex rule.
DELETE	`/v1/reflex-rules/:id`	Delete reflex rule.
GET	`/v1/usage`	Get usage ledger.
GET	`/v1/telemetry`	Get telemetry rollups.

Example: Create API key pair

curl -X POST https://api.waitstate.io/v1/api-keys \
  -H "Authorization: Bearer <management-token>" \
  -H "Content-Type: application/json" \
  -d '{ "environment": "live" }'

{
  "id": "key_xxx",
  "publishKey": "ws_pub_xxx",
  "secretKey": "ws_sec_xxx",
  "environment": "live"
}
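The same request can be made from a CI script. The sketch below mirrors the curl call above using only the Python standard library. The function names are illustrative, and the management token is assumed to come from your own secret store.

```python
import json
import urllib.request

API_BASE = "https://api.waitstate.io"


def build_create_key_request(management_token: str, environment: str = "live") -> urllib.request.Request:
    """Build (but do not send) the POST /v1/api-keys request."""
    body = json.dumps({"environment": environment}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/v1/api-keys",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {management_token}",
            "Content-Type": "application/json",
        },
    )


def create_key_pair(management_token: str, environment: str = "live") -> dict:
    """Send the request and return the parsed JSON response."""
    req = build_create_key_request(management_token, environment)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)
```

Because the secret is shown once, store the returned secretKey immediately rather than logging the response.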

Glossary

Canonical definitions for terms used across WaitState SDKs, the control plane, and this documentation.

Term	Definition
Bounced units	Gate calls that were denied (not allowed through the gate). Tracked separately from allowed units in telemetry rollups.
Cap exceeded	State reached when an organization's gate call count hits the plan limit for the current billing period. Behavior depends on plan configuration—calls may be denied or overage-billed.
Control plane	The centralized API server (`api.waitstate.io`) that stores configuration, aggregates telemetry, evaluates reflex rules, and serves policies to SDKs. Gate calls themselves are evaluated locally by the SDK. All management, pulse, and policy API traffic flows through the control plane.
EdgeGate	The WASM-based SDK variant designed for edge runtimes (Cloudflare Workers, Deno Deploy, etc.). Evaluates policies locally with minimal latency.
Fail-open	Default SDK behavior when the control plane is unreachable or the policy cache has expired: the gate allows all traffic through rather than blocking. Ensures your application keeps running even if WaitState is down.
Gate	A checkpoint in your application code where the SDK evaluates whether a request should proceed. Created by calling `gate()` in the SDK.
Gate call	A single invocation of the gate. Each call consumes one unit against your plan's monthly allowance. This is the primary billing metric.
Kill signal	An emergency control that forces all gates to deny traffic, regardless of normal policy rules. Issued from the dashboard or management API.
Lease duration	How long the SDK caches its local copy of the policy before requiring a fresh fetch from the control plane. If the SDK cannot reach the control plane before the lease expires, it enters safe mode.
Organization	The top-level account entity. Owns sites, API keys, and billing. Each user belongs to one organization.
Policy	The full set of rules and configuration the SDK needs to evaluate gates locally. Fetched from the control plane on each pulse and cached for the lease duration.
Policy cache	The SDK's local, in-memory copy of the policy. Refreshed every pulse interval. When the cache expires (lease duration exceeded) without a successful refresh, the SDK enters safe mode.
Pulse	The periodic heartbeat where the SDK phones home to the control plane to fetch the latest policy and report telemetry. Frequency is set by the pulse interval.
Pulse interval	Time between pulses. Varies by plan: 30s (Hobby), 2s (Pro), sub-second (Enterprise).
Reflex rule	A conditional rule evaluated server-side by the arbiter. Matches gate calls by tag and applies an action (allow, deny, or weight override). Reflex rules take precedence over default policy.
Safe mode	The SDK state entered when the policy cache has expired and the control plane is unreachable. Behavior depends on `safeModeStrategy`: `open` (default) fails fully open, `fixed_rps` throttles to `safeModeMaxRps` (default: 50 req/sec), `last_policy` keeps enforcing the last cached policy.
Site	An isolated environment within an organization (e.g., production, staging). Each site has its own API key pair, tags, rules, and usage counters. Plan limits restrict the number of sites.
Tag	A string label attached to a gate call to categorize it (e.g., `"checkout"`, `"search"`). Tags are used to target reflex rules and filter telemetry. Plan limits restrict the number of unique tags.
Telemetry rollup	Aggregated gate call statistics (allowed, denied, total) grouped by tag and time bucket. Reported by the SDK on each pulse and stored by the control plane.
Token (API key pair)	A `publishKey` and `secretKey` pair used to authenticate SDK and management API requests. Scoped to a single site. Created in the dashboard or via the management API.
Weight	A numeric value passed to `gate(tag, weight)` that represents the cost or priority of a request. The SDK compares it against the policy's `globalMaxWeight` and per-tag `tagMaxWeights` thresholds—if the weight exceeds either limit, the call is denied. Higher weight = harder to pass through during throttling. Default: 1.
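The three `safeModeStrategy` values in the Safe mode entry can be read as a small dispatch. The sketch below is an illustration of those definitions, not the SDK's implementation. The class and function names are hypothetical, and `fixed_rps` is modeled as a simple per-second counter, which is an assumption about how the throttle is applied.

```python
from dataclasses import dataclass


@dataclass
class SafeModeConfig:
    strategy: str = "open"  # "open", "fixed_rps", or "last_policy"
    max_rps: int = 50       # mirrors safeModeMaxRps (default: 50 req/sec)


def safe_mode_allows(cfg: SafeModeConfig, requests_this_second: int, last_policy_allowed: bool) -> bool:
    """Decide a gate call while in safe mode.

    requests_this_second: calls already allowed in the current one-second window.
    last_policy_allowed: what the last cached policy would have decided.
    """
    if cfg.strategy == "open":
        return True  # fail fully open
    if cfg.strategy == "fixed_rps":
        return requests_this_second < cfg.max_rps
    if cfg.strategy == "last_policy":
        return last_policy_allowed
    raise ValueError(f"unknown safeModeStrategy: {cfg.strategy!r}")
```

For example, with the default configuration every call is allowed, while `SafeModeConfig(strategy="fixed_rps")` allows the first 50 calls in each second and denies the rest.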

