Overview

Rate Limiting (Deep Workflow)

Rate limits balance fairness, availability, and abuse prevention. Design explicitly: who is throttled, what resource is limited, and how clients should back off.

When to Offer This Workflow

Trigger conditions:

Protecting public APIs, auth endpoints, or expensive operations
Multi-tenant “noisy neighbor” isolation
Retry storms after incidents that cause cascading 429/502 responses

Initial offer:

Use six stages: (1) threat & fairness model, (2) dimensions & keys, (3) algorithms & config, (4) distributed enforcement, (5) client protocol & UX, (6) observability & tuning. Confirm the enforcement layer (API gateway vs app middleware vs edge).

Stage 1: Threat & Fairness Model

Goal: Distinguish legitimate bursts (batch jobs, mobile retries) from abuse; align limits with product tiers and SLAs.

Exit condition: Written policy: free vs paid limits, partner caps, burst allowances.

Stage 2: Dimensions & Keys

Goal: Choose stable limit keys, in order of preference: authenticated user ID, then API key, then IP address (with shared-NAT caveats, since many clients can sit behind one address).

Practices

Per-tenant and global limits; separate expensive routes (exports, search)
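The key preference order and the separate handling of expensive routes could be sketched as follows. This is a minimal illustration, not a prescribed design: RequestInfo, EXPENSIVE_ROUTES, and the key string format are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RequestInfo:
    """Hypothetical request summary; the field names are illustrative."""
    route: str
    client_ip: str
    user_id: Optional[str] = None
    api_key_id: Optional[str] = None


# Hypothetical set of routes that get their own, tighter budget.
EXPENSIVE_ROUTES = {"/exports", "/search"}


def limit_key(req: RequestInfo) -> str:
    """Pick the most stable identity available: user id > API key > IP."""
    if req.user_id is not None:
        actor = f"user:{req.user_id}"
    elif req.api_key_id is not None:
        actor = f"key:{req.api_key_id}"
    else:
        # Coarse fallback: many clients can share one NAT address.
        actor = f"ip:{req.client_ip}"
    # Expensive routes are keyed separately so they cannot drain the
    # budget shared by ordinary requests, and vice versa.
    scope = req.route if req.route in EXPENSIVE_ROUTES else "default"
    return f"{actor}:{scope}"
```

A global limit would typically use a second, fixed key (for example one per route) checked alongside the per-actor key.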

Stage 3: Algorithms & Config

Goal: Match the algorithm to the traffic shape: token bucket to permit bounded bursts, leaky bucket to smooth output to a steady rate, sliding window for strict per-minute caps. Consider concurrency limits separately from request rate.
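To make the burst-versus-sustained distinction concrete, here is a minimal single-process token bucket sketch. The class name, parameter names, and the injectable clock are assumptions made for illustration; the sketch is not thread-safe and keeps state in memory only.

```python
import time
from typing import Callable


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self._clock = clock          # injectable so tests can control time
        self._tokens = capacity      # start full: an idle client may burst
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if enough are available."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the elapsed time, never exceeding the burst capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

Here `capacity` sets the burst allowance and `rate` the sustained throughput, so the two can be tuned independently per tier. The `cost` argument is one way to charge expensive routes more than cheap ones.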

Stage 4: Distributed Enforcement

Goal: Central store (Redis, etc.) with atomic increments; handle multi-region (sticky routing vs shared counters); mind clock skew.
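A shared counter with atomic increments could look roughly like the fixed-window sketch below. The CounterStore interface is modeled on the Redis INCR and EXPIRE commands, but it is a hypothetical interface defined here, and InMemoryStore is only a single-process stand-in so the example runs without a server.

```python
import time
from typing import Dict, Optional, Protocol


class CounterStore(Protocol):
    """Hypothetical store interface, modeled on Redis INCR and EXPIRE."""
    def incr(self, name: str) -> int: ...
    def expire(self, name: str, seconds: int) -> object: ...


class InMemoryStore:
    """Single-process stand-in for a shared store; demonstration only."""

    def __init__(self) -> None:
        self._counts: Dict[str, int] = {}

    def incr(self, name: str) -> int:
        self._counts[name] = self._counts.get(name, 0) + 1
        return self._counts[name]

    def expire(self, name: str, seconds: int) -> bool:
        return True  # no-op here; a real store would schedule deletion


def allow_request(store: CounterStore, key: str, limit: int,
                  window_s: int, now: Optional[float] = None) -> bool:
    """Fixed-window check: one atomic increment per request."""
    if now is None:
        now = time.time()
    # The window index is part of the counter name, so a new window
    # starts from zero without any reset step.
    window = int(now // window_s)
    name = f"rl:{key}:{window}"
    count = store.incr(name)
    if count == 1:
        # Expiry is only cleanup of old windows; correctness does not
        # depend on it, so a lost EXPIRE leaks a key but not a limit.
        store.expire(name, window_s * 2)
    return count <= limit
```

Note that the window index is computed from the application server's clock, so skew between servers can place the same instant in different windows. Deriving the time from the store itself is one way to avoid that, at the cost of an extra round trip or a server-side script.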

Stage 5: Client Protocol & UX

Goal: Consistent 429 responses with Retry-After; document exponential backoff + jitter; optional X-RateLimit-* headers for transparency.
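Both sides of this protocol can be sketched briefly. The function names and default values below are assumptions for illustration, and the specific X-RateLimit-Limit and X-RateLimit-Remaining header names are a common convention rather than something this workflow mandates. The client side uses exponential backoff with full jitter and only understands the delta-seconds form of Retry-After, not the HTTP-date form.

```python
import random
from typing import Dict, Mapping, Optional, Tuple


def throttled_response(retry_after_s: int, limit: int) -> Tuple[int, Dict[str, str]]:
    """Server side: status code and headers for a throttled request."""
    return 429, {
        "Retry-After": str(retry_after_s),
        # Conventional, non-standard transparency headers (assumed names).
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": "0",
    }


def backoff_delay(attempt: int, headers: Mapping[str, str],
                  base_s: float = 0.5, cap_s: float = 30.0,
                  rng: Optional[random.Random] = None) -> float:
    """Client side: seconds to wait before retry number `attempt` (0-based)."""
    pick = rng.uniform if rng is not None else random.uniform
    # Exponential growth, capped so long outages do not produce huge waits.
    ceiling = min(cap_s, base_s * (2 ** attempt))
    # Full jitter: a uniform draw spreads clients out and avoids
    # synchronized retry waves.
    delay = pick(0.0, ceiling)
    retry_after = headers.get("Retry-After", "")
    if retry_after.isdigit():
        # Never retry sooner than the server asked.
        delay = max(delay, float(retry_after))
    return delay
```

Treating Retry-After as a lower bound keeps clients from retrying early while still letting jitter spread them out once the wait has passed.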

Stage 6: Observability & Tuning

Goal: Metrics on throttles by route and actor class; alerts on abnormal deny spikes (attack vs misconfigured client).
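As a rough sketch of the metrics side, throttle decisions can be counted per (route, actor class) pair and compared against a baseline. ThrottleMetrics, deny_spike, and the threshold defaults are hypothetical and chosen only for illustration. A production setup would more likely export counters to an existing metrics system and alert there.

```python
from collections import Counter


class ThrottleMetrics:
    """In-memory allow/deny counts keyed by (route, actor_class)."""

    def __init__(self) -> None:
        self.allowed: Counter = Counter()
        self.denied: Counter = Counter()

    def record(self, route: str, actor_class: str, allowed: bool) -> None:
        bucket = self.allowed if allowed else self.denied
        bucket[(route, actor_class)] += 1

    def deny_ratio(self, route: str, actor_class: str) -> float:
        """Fraction of requests denied; 0.0 when nothing was recorded."""
        key = (route, actor_class)
        total = self.allowed[key] + self.denied[key]
        return self.denied[key] / total if total else 0.0


def deny_spike(current_ratio: float, baseline_ratio: float,
               factor: float = 3.0, floor: float = 0.05) -> bool:
    """Flag a deny ratio well above baseline (illustrative thresholds)."""
    # The floor avoids alerting on tiny ratios that merely tripled.
    return current_ratio >= floor and current_ratio > baseline_ratio * factor
```

Breaking the counts down by actor class helps with the attack-versus-misconfigured-client question: a spike concentrated in one paid tenant suggests a client bug, while a spike across anonymous IPs suggests abuse.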

Final Review Checklist

[ ] Policy matches tiers and fairness goals
[ ] Limit keys stable and hard to spoof
[ ] Algorithm matches burst vs sustained semantics
[ ] Distributed correctness considered
[ ] Client-facing 429 behavior documented
[ ] Metrics and tuning loop defined

Tips for Effective Guidance

Coordinate with authentication—anonymous IP limits are coarse.
Don’t throttle health checks in ways that break monitors.
GraphQL: consider query cost / depth limits, not only HTTP count.
WebSockets: separate connection caps from message rate limits.

Handling Deviations

Edge/CDN: limits may differ from origin—document both layers.

Version History

1 version in total

v1.0.0 (current)

2026-03-31 08:33
