Overview
API rate limiting fails when a single uniform limit is applied to every consumer regardless of tier, use case, or endpoint. A background batch job may legitimately make 10,000 requests in a minute, while a single-user frontend makes 50. Any one limit applied to both is either low enough to throttle the legitimate batch job or high enough that it offers little protection against an abusive single user. Rate limits must therefore be scoped to the consumer's tier and to the resource cost of the endpoint.
The API Rate Limiting Framework selects the right algorithm, enforces limits per consumer tier, and communicates limit state to consumers so they can implement backoff correctly.
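As a rough illustration of these three responsibilities, the sketch below uses a token bucket per consumer, sized by tier and charged by endpoint cost, and reports limit state in response headers. It is one possible shape under stated assumptions, not the framework's actual implementation: the tier names, the per-minute figures (borrowed from the example above), the endpoint costs, the class names, and the `X-RateLimit-*` header names are all hypothetical. Only the 429 status code and the `Retry-After` header are standard HTTP.

```python
import math
import time

# Hypothetical tiers: tokens per minute, also used as the bucket capacity.
TIER_TOKENS_PER_MINUTE = {"batch": 10_000, "frontend": 50}

# Hypothetical endpoint costs in tokens; unlisted endpoints cost 1.
ENDPOINT_COST = {"/search": 5, "/status": 1}


class TokenBucket:
    """Token bucket that starts full and refills continuously."""

    def __init__(self, tokens_per_minute, now):
        self.capacity = tokens_per_minute
        self.tokens_per_minute = tokens_per_minute
        self.tokens = float(tokens_per_minute)
        self.updated = now

    def try_consume(self, cost, now):
        """Return (allowed, seconds until `cost` tokens are available)."""
        elapsed = now - self.updated
        refill = elapsed * self.tokens_per_minute / 60
        self.tokens = min(self.capacity, self.tokens + refill)
        self.updated = now
        if cost <= self.tokens:
            self.tokens -= cost
            return True, 0.0
        deficit = cost - self.tokens
        return False, deficit * 60 / self.tokens_per_minute


class TieredRateLimiter:
    """Keeps one bucket per consumer, sized by that consumer's tier."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so tests can control time
        self._buckets = {}

    def check(self, consumer_id, tier, endpoint):
        """Return (HTTP status, headers describing the limit state)."""
        now = self._clock()
        bucket = self._buckets.get(consumer_id)
        if bucket is None:
            bucket = TokenBucket(TIER_TOKENS_PER_MINUTE[tier], now)
            self._buckets[consumer_id] = bucket

        cost = ENDPOINT_COST.get(endpoint, 1)
        allowed, retry_after = bucket.try_consume(cost, now)

        # Header names other than Retry-After are a common convention,
        # assumed here for illustration.
        headers = {
            "X-RateLimit-Limit": str(bucket.capacity),
            "X-RateLimit-Remaining": str(int(bucket.tokens)),
        }
        if not allowed:
            headers["Retry-After"] = str(math.ceil(retry_after))
        return (200 if allowed else 429), headers
```

With these assumed numbers, a `frontend` consumer can make ten `/search` calls (5 tokens each) before receiving a 429, while a `batch` consumer calling the same endpoint draws from a much larger bucket. A client that honors `Retry-After` waits the indicated number of seconds before retrying instead of guessing at a backoff interval.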