Jun 5, 2026·4 min read

What Is Rate Limiting?

Rate limiting is a control that caps how many requests a client can make to a server within a defined time window. When a client exceeds the limit, the server returns an HTTP 429 (Too Many Requests) response instead of processing the request.

Why it exists

Without rate limiting, a single client (or bot) can send unlimited requests. That opens the door to several problems: credential stuffing against login forms, rapid scraping of entire datasets, and traffic volumes large enough to overwhelm infrastructure.

How it works

The server (or a proxy in front of it) tracks requests by a key: usually IP address, API token, or user ID. A counter increments on each request and resets after the window expires. If the counter exceeds the limit before the reset, requests are rejected until the window clears.
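The counter-and-reset behavior described above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a production implementation: the names `FixedWindowLimiter`, `allow`, and `status_for` are hypothetical, and the sketch assumes a single process with no shared store between servers.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Allow at most `limit` requests per `window_seconds` for each key."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        # key -> (window start time, requests counted in that window)
        self._counters: Dict[str, Tuple[float, int]] = {}

    def allow(self, key: str) -> bool:
        now = self.clock()
        start, count = self._counters.get(key, (now, 0))
        if now - start >= self.window_seconds:
            # The window has expired: reset the counter.
            start, count = now, 0
        if count >= self.limit:
            # Limit already reached in this window: reject.
            self._counters[key] = (start, count)
            return False
        self._counters[key] = (start, count + 1)
        return True


def status_for(limiter: FixedWindowLimiter, key: str) -> int:
    """Map the limiter's decision to an HTTP status code."""
    return 200 if limiter.allow(key) else 429
```

In this sketch each key's window starts at that key's first request rather than on a fixed clock boundary. The key passed to `allow` can be whatever the server tracks by: an IP address, API token, or user ID.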

What it protects against

Credential stuffing: limiting login attempts makes bulk password testing uneconomical.
Scraping: caps on read endpoints raise the time cost of extracting large datasets.
API abuse: per-user limits prevent any single user from monopolizing shared compute capacity.
DDoS contributions: per-source limits reduce the maximum request rate any single source can achieve.
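Because these threats target different endpoints, limits are often set per route rather than globally. The sketch below shows one way that could look. Every route, number, and name here (`Policy`, `POLICIES`, `policy_for`, `counter_key`) is a hypothetical chosen for illustration, not a recommended value.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Policy:
    key_by: str          # request attribute that identifies the client
    limit: int           # maximum requests per window
    window_seconds: int  # window length


# Hypothetical routes and limits: strict on login, looser on reads.
POLICIES: Dict[str, Policy] = {
    "/login": Policy(key_by="ip", limit=5, window_seconds=60),
    "/api/search": Policy(key_by="api_token", limit=60, window_seconds=60),
}
DEFAULT_POLICY = Policy(key_by="ip", limit=300, window_seconds=60)


def policy_for(path: str) -> Policy:
    """Return the policy for a route, falling back to the default."""
    return POLICIES.get(path, DEFAULT_POLICY)


def counter_key(path: str, request: Dict[str, str]) -> str:
    """Build the key under which this request's counter is stored."""
    policy = policy_for(path)
    identity = request.get(policy.key_by, "anonymous")
    return f"{path}:{identity}"
```

Including the route in the counter key keeps a burst against one endpoint from consuming the same client's allowance on another.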