Enforcement at microsecond speed

Fairvisor is built for one thing: fast decisions under real production load, with sub-millisecond targets in documented deployment patterns.

What Fairvisor Does for SREs

Fast by Design

Fairvisor evaluates allow/throttle/reject in-process and in-memory. Optimized for hot-path performance first. → Performance tuning

Microsecond-Class Decision Time

Docs and benchmarks show a microsecond-class decision path with sub-millisecond targets in typical deployments. Validate latency in your own traffic profile and gateway topology.

Predictable Under Load

Counters and policy checks stay local to the edge process. Stable latency characteristics even during bursts. No remote calls on the hot path.
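The "counters stay local" idea can be sketched as an in-process token bucket. This is an illustrative sketch, not Fairvisor's actual implementation: the class name, rate/burst parameters, and refill logic are all assumptions. The point it shows is that a decision touches only process memory, never the network.

```python
import time

class LocalTokenBucket:
    """Hypothetical sketch of an in-process counter: all state lives in
    process memory, so a decision never leaves the hot path."""

    def __init__(self, rate_per_sec: float, burst: float):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = burst
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill based on elapsed time, capped at burst capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

bucket = LocalTokenBucket(rate_per_sec=100, burst=5)
print(all(bucket.allow() for _ in range(5)))  # True: the burst fits
```

Because nothing here blocks on a remote call, latency stays flat under burst: the worst case is a few arithmetic operations.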

Policy Propagation Without Hot-Path Penalty

Policies sync asynchronously. Data-plane requests are never blocked waiting for control-plane response.

Fail-Open by Default

If policy data is temporarily unavailable or stale, traffic is allowed by default with explicit telemetry. Enforcement never becomes a hard outage trigger.
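A minimal sketch of the fail-open decision, assuming a staleness threshold and a `fetched_at` timestamp on the policy bundle (both invented here for illustration): when the bundle is missing or too old, the request is allowed and the decision is tagged so telemetry can surface it.

```python
import time

POLICY_MAX_AGE_S = 30  # assumed staleness threshold, not a documented default

def decide(policy_bundle, now: float) -> dict:
    """Fail-open sketch: if the policy bundle is missing or stale, allow
    the request and tag the decision for telemetry."""
    if policy_bundle is None or now - policy_bundle["fetched_at"] > POLICY_MAX_AGE_S:
        return {"action": "allow", "reason": "fail_open_stale_policy"}
    # Normal path: evaluate the bundle's rules (elided here).
    return {"action": "allow", "reason": "policy_evaluated"}

print(decide(None, time.time())["reason"])  # fail_open_stale_policy
```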

Graceful Degradation

No cliff, no thundering herd. Controlled backpressure at 80% (warning header), 95% (throttle with 200–500ms delay), 100% (reject with Retry-After + jitter).
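The staged thresholds above can be sketched as a single decision function. The 80%/95%/100% stages, the 200–500ms throttle delay, and the jittered Retry-After come from this page; the base retry value and the response-dict shape are assumptions for illustration.

```python
import random

def staged_response(used: float, limit: float) -> dict:
    """Sketch of the documented 80% / 95% / 100% stages."""
    ratio = used / limit
    if ratio >= 1.0:
        # Reject, with jitter on Retry-After so clients don't retry in lockstep.
        # The 1.0s base is an assumed value, not a documented default.
        return {"action": "reject", "retry_after_s": 1.0 + random.uniform(0.0, 0.5)}
    if ratio >= 0.95:
        # Throttle: hold the request briefly before forwarding.
        return {"action": "throttle", "delay_ms": random.uniform(200, 500)}
    if ratio >= 0.80:
        # Still allowed, but the client gets a warning header.
        return {"action": "allow", "warning_header": True}
    return {"action": "allow"}
```

Each stage gives clients progressively stronger signal before any request is dropped, which is what prevents the cliff.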

Decision Tracing from 429 to Root Cause

Reject responses include reason/retry metadata. For policy/rule attribution, use debug session headers (X-Fairvisor-Debug-*). → Decision tracing

Prometheus Metrics Out of the Box

fairvisor_decisions_total, fairvisor_decision_duration_seconds, fairvisor_config_version and related metrics are exposed via /metrics. Prometheus scrape/forwarding setup remains part of your infra config. → Metrics reference

Incident Runbook

What the first 10 minutes of a rate limiting incident look like with Fairvisor:

T+0 — Reject spike alert fires. fairvisor_decisions_total{action="reject"} crosses threshold.

T+1 — Check which route and limit key are triggering rejects. fairvisor_decisions_total grouped by route and limit_key shows the source immediately.
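In practice this grouping is a PromQL query (something like sum by (route, limit_key) over the reject counter). As a self-contained sketch, the same grouping over the Prometheus text exposition looks like this; the sample values and label values are invented for illustration.

```python
import re
from collections import Counter

SAMPLE = '''\
fairvisor_decisions_total{action="reject",route="/v1/search",limit_key="tenant:acme"} 4821
fairvisor_decisions_total{action="reject",route="/v1/search",limit_key="tenant:beta"} 12
fairvisor_decisions_total{action="allow",route="/v1/search",limit_key="tenant:acme"} 90210
'''

def rejects_by_key(exposition: str) -> Counter:
    """Group reject counts by (route, limit_key) from Prometheus text format."""
    out = Counter()
    pattern = re.compile(r'fairvisor_decisions_total\{([^}]*)\}\s+(\d+)')
    for labels, value in pattern.findall(exposition):
        attrs = dict(kv.split('=', 1) for kv in labels.split(','))
        attrs = {k: v.strip('"') for k, v in attrs.items()}
        if attrs.get("action") == "reject":
            out[(attrs["route"], attrs["limit_key"])] += int(value)
    return out

print(rejects_by_key(SAMPLE).most_common(1))
# [(('/v1/search', 'tenant:acme'), 4821)]
```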

T+2 — Pull decision trace for a sample 429. Use X-Fairvisor-Reason/Retry-After, then enable debug session headers for policy/rule attribution. → Debug session docs

T+5 — If abuse confirmed: activate kill-switch for the offending tenant. Propagation is designed to be fast and should be validated against your deployment.

T+10 — Incident contained. Audit log captures operator identity, action, and scope. → Kill-switch runbook

Typical investigation time without decision tracing: 20–40 minutes. With it: under 5 minutes.

Who This Is For

  • SREs and on-call engineers who own API reliability
  • Platform engineers setting SLOs for shared rate limiting infrastructure
  • DevOps teams deploying enforcement as a shared service
  • Teams where enforcement latency affects production p99

FAQ

How much latency does Fairvisor add?

Fairvisor runs in-process and in-memory, with sub-millisecond targets in documented deployment patterns. Actual p95/p99 depends on gateway wiring, workload shape, and environment.

What happens if the policy control plane goes down?

Fail-open by default. If policy data is unavailable or stale, traffic is allowed through with explicit telemetry logged. Enforcement never becomes a hard outage trigger. You can configure fail-closed per route if your use case requires it.

How quickly do policy changes propagate to the edge?

Policy sync is asynchronous and designed for seconds-scale propagation in normal conditions. Validate propagation and alert thresholds in your own environment. → Performance tuning

What Prometheus metrics are available out of the box?

fairvisor_decisions_total (labeled by action, route, limit_key), fairvisor_decision_duration_seconds, fairvisor_config_version, fairvisor_loops_detected_total, fairvisor_circuit_breaker_trips_total and other counters/histograms via /metrics. Prometheus scrape wiring is configured in your stack. → Metrics reference

How does graceful degradation work?

No cliff, no thundering herd. At 80% of limit: warning header. At 95%: throttle with 200–500ms delay. At 100%: reject with Retry-After plus jitter. The jitter prevents synchronized retry storms when all rejected clients see the same Retry-After value.
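The jitter point is worth seeing numerically. Assuming a uniform spread of up to 0.5s over an assumed 1s base (both illustrative values, not documented defaults), rejected clients come back spread across a window instead of at one instant:

```python
import random

def jittered_retry_after(base_s: float, spread_s: float = 0.5) -> float:
    """Illustrative jitter: spread retries across [base, base + spread) so
    rejected clients don't all return simultaneously."""
    return base_s + random.uniform(0.0, spread_s)

values = [jittered_retry_after(1.0) for _ in range(1000)]
# Without jitter, 1000 rejected clients retry at the same instant;
# with it, their retries spread over roughly half a second.
print(max(values) - min(values) > 0.3)  # True
```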

How do I trace why a specific request was rejected?

Start with reject headers (X-Fairvisor-Reason, Retry-After, RateLimit*). For policy/rule attribution, enable debug session headers (X-Fairvisor-Debug-*). → Decision tracing
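A small triage helper shows how those headers compose into a one-line summary. The header names follow this page; the value formats and the summary shape are assumptions.

```python
def summarize_reject(headers: dict) -> str:
    """Turn reject-response headers into a one-line triage summary."""
    reason = headers.get("X-Fairvisor-Reason", "unknown")
    retry = headers.get("Retry-After", "n/a")
    debug = {k: v for k, v in headers.items() if k.startswith("X-Fairvisor-Debug-")}
    parts = [f"reason={reason}", f"retry_after={retry}"]
    parts += [f"{k}={v}" for k, v in sorted(debug.items())]
    return " ".join(parts)

print(summarize_reject({
    "X-Fairvisor-Reason": "rate_limit_exceeded",
    "Retry-After": "2",
    "X-Fairvisor-Debug-Policy": "tenant-default",
}))
# reason=rate_limit_exceeded retry_after=2 X-Fairvisor-Debug-Policy=tenant-default
```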

What is the kill-switch and when should I use it?

The kill-switch blocks traffic for a specific scope (tenant, route, or descriptor value) and is intended for rapid incident containment. Use it when abuse is confirmed and verify propagation in your deployment runbook. → Kill-switch runbook
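Scope matching can be sketched as follows: a switch blocks a request when every field it names matches. The field names and matching semantics here are assumptions for illustration, not Fairvisor's actual kill-switch model.

```python
def kill_switch_blocks(switches: list[dict], request: dict) -> bool:
    """A switch blocks a request when all of its named fields match."""
    for scope in switches:
        if all(request.get(field) == value for field, value in scope.items()):
            return True
    return False

switches = [{"tenant": "acme"}]  # block one tenant across all routes
request = {"tenant": "acme", "route": "/v1/search"}
print(kill_switch_blocks(switches, request))  # True
```

Narrower scopes (tenant plus route, or a single descriptor value) simply add fields to the switch, so containment can be as surgical as the incident requires.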

Can we scope limits per tenant without creating noisy-neighbor regressions?

Yes. Limits are keyed by tenant/user/route dimensions, so one tenant’s spike does not consume another tenant’s quota. This keeps enforcement isolation aligned with your SLO boundaries.
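The isolation follows from the key structure. A minimal sketch, assuming a composite tenant:route key (the key format is an assumption): each tenant/route pair owns its own counter, so a noisy tenant's burst never touches a quiet tenant's quota.

```python
from collections import defaultdict

def limit_key(tenant: str, route: str) -> str:
    # Composite key: each tenant/route pair gets its own counter.
    return f"{tenant}:{route}"

counters = defaultdict(int)
for _ in range(1000):
    counters[limit_key("acme", "/v1/search")] += 1  # noisy tenant
counters[limit_key("beta", "/v1/search")] += 1      # quiet tenant, unaffected

print(counters["beta:/v1/search"])  # 1
```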

Why teams choose Fairvisor

Microsecond-class decisions that don't eat your latency budget

In-process, in-memory evaluation. Policy enforcement is designed to add microseconds, not milliseconds, so it is never your bottleneck.

Controlled backpressure, not a cliff

Staged degradation at 80%, 95%, 100% prevents thundering herd on limit breach. Jitter on Retry-After prevents synchronized retries.

Trace from 429 to root cause with a deterministic workflow

Reason/retry headers plus debug session attribution (X-Fairvisor-Debug-*) give operators a path from reject to policy/rule without blind log hunting.

Targets

Metric                     Target
------                     ------
Decision latency p50       Microsecond-class target
Decision latency p99       Sub-millisecond target (deployment-dependent)
Decision latency p99.9     Low-millisecond target (deployment-dependent)
Bundle propagation         Seconds-scale target (deployment-dependent)
Kill-switch effect         Rapid containment target (deployment-dependent)

Keep your latency budget for your product, not your rate limiter

Deploy Fairvisor edge