429 Too Many Requests: What It Means and How To Fix It
An HTTP 429 error means the server understood your request but rate-limited it because the client sent too many requests too quickly. The tricky part is identifying what hit the limit: an IP, session, account, API key, or concurrent worker pool. Below, you'll learn what 429 Too Many Requests means, what triggers it, and how to fix it without making the block worse.
TL;DR
- 429 Too Many Requests means the server understood your request but rate-limited the client, IP, session, account, API key, or another request identity.
- Check Retry-After first. If the header is missing, use exponential backoff with jitter instead of retrying immediately.
- Reduce concurrency before changing infrastructure. Burst limits can fail even when total request volume looks reasonable.
- Rotate IPs only when the limit is tied to IP address. Session, cookie, and account limits need different fixes.
- Validate response bodies. Some targets hide rate limits behind empty 200 OK pages, soft blocks, or CAPTCHA redirects.
- For high-volume scraping, size your proxy pool against the target's per-IP threshold and leave room for uneven distribution.
What is the HTTP 429 error?
429 Too Many Requests is the HTTP status code a server returns when a client exceeds a permitted request rate.
The limit can apply to different request identities – an IP address, session, account, API key, cookie, or client fingerprint.
That makes 429 different from nearby status codes:
| Status code | Meaning |
| --- | --- |
| 429 Too Many Requests | You exceeded a request rate or quota. The client can usually slow down and retry later. |
| 503 Service Unavailable | The server is unavailable or overloaded. The problem is usually server-side, although the server may still ask you to retry later. |
| 403 Forbidden | The server understood the request but refused access. There may be no rate-limit signal at all. |
In scraping, those boundaries can become a little fuzzy. Some protected targets return 403 or 503 instead of 429 when rate limits are hit because a clear 429 can reveal the threshold. If every excessive request gets the same honest status code, the limit becomes easier to map.
When a server sends 429, it may include a Retry-After header – either a number of seconds or an absolute HTTP date. For example, Retry-After: 120 means wait 120 seconds before sending the next request.
If Retry-After is missing, the server hasn't told you the wait window. Instead, your client has to apply its own retry logic.
Common triggers and scenarios
The obvious cause of a 429 is too many requests, but how and where those requests get counted matters just as much.
Loop bugs and unthrottled scripts
A price monitor might read a spreadsheet, send 1 request per row, and run without any delay between requests. If the sheet has 2,000 products, the script can burn through a per-minute limit in the first burst.
An accidental retry loop is even worse because each failed request creates more pressure.
Shared IP exhaustion
Shared IPs can fail even when your own request volume looks safe. If you route traffic through a public cloud VM, a heavily used VPN exit node, or a crowded datacenter proxy, the target may count all traffic from that IP together.
You can receive a 429 after only a few requests because another user already consumed the quota. Your code looks okay, but the IP starts the run with no capacity left.
Session-level limits
Some targets count requests by authenticated account, cookie, cart session, API token, or another session-level identifier. If the counter is attached to the session, rotating IPs won't reset it. The same cookie jar carries the limit across multiple network identities.
Burst limits vs. sustained limits
A target might allow 500 requests per hour but only 10 requests per second. Your hourly volume can look fine while 20 concurrent workers still trip the burst window. That's why rate limiting belongs in your concurrency model, not only in a single delay statement.
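To make the burst-versus-sustained gap concrete, here is a minimal sketch that measures the busiest window in a list of request timestamps. The function name `peak_requests_in_window` and the schedule (20 workers firing together every 3 minutes) are assumptions for illustration; the 500-per-hour and 10-per-second limits are the example figures from above.

```python
def peak_requests_in_window(timestamps, window_seconds):
    """Return the largest number of requests inside any window of the given length."""
    times = sorted(timestamps)
    peak = 0
    start = 0
    for end, t in enumerate(times):
        # Drop requests that are a full window (or more) older than the current one.
        while t - times[start] >= window_seconds:
            start += 1
        peak = max(peak, end - start + 1)
    return peak


# Hypothetical schedule: 20 workers all fire at the same instant, every 180 seconds.
timestamps = [cycle * 180.0 for cycle in range(20) for _worker in range(20)]

hourly_peak = peak_requests_in_window(timestamps, 3600)  # 400, under a 500/hour limit
burst_peak = peak_requests_in_window(timestamps, 1)      # 20, over a 10/second limit
```

The hourly count stays under the sustained limit while every cycle still exceeds the burst limit, which is why spacing requests out matters as much as total volume.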
Rate limiting can also sit inside a wider anti-scraping stack. A target may start with 429, then escalate to CAPTCHAs, temporary IP bans, or stricter fingerprint checks if the same client keeps retrying aggressively.
Implications and impact
An unhandled 429 can easily become a data quality problem. If your scraper stops halfway through a run, downstream systems may treat missing rows as valid zeros, stale carry-forward values, or normal gaps. That's worse than an obvious failure because the dataset can look complete while being completely wrong.
It can also make future requests harder. Retrying immediately after a rate-limit response tells the target that your client won't slow down, so a short-lived limit can turn into CAPTCHAs, soft blocks, stricter fingerprint checks, or temporary IP bans.
Status codes alone aren't enough. A target may return 200 OK with an empty result set, a soft-block page, or a CAPTCHA document instead of an explicit 429. If your parser only checks response.status_code, it can save blocked pages as valid results.
Validate the response body against the expected shape. A product page should contain product fields. A search response should contain result containers. An API payload should match the expected schema. When those checks fail together with unusual timing, repeated empty responses, or sudden content changes, you may be looking at rate limiting even without a visible 429.
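A minimal sketch of that kind of body check, under stated assumptions: the function name `classify_response`, the required markers, and the block phrases ("captcha", "unusual traffic") are all hypothetical and would need to be replaced with strings that actually appear on your target.

```python
def classify_response(status_code, body, required_markers,
                      block_markers=("captcha", "unusual traffic")):
    """Label a response by status and body shape instead of trusting the status alone."""
    if status_code == 429:
        return "rate_limited"
    text = (body or "").lower()
    # A 200 that contains block phrases is treated as a soft block.
    if any(marker in text for marker in block_markers):
        return "soft_block"
    # A 200 that lacks the fields we expect is not a valid result either.
    if not all(marker.lower() in text for marker in required_markers):
        return "unexpected_shape"
    return "ok"
```

Only responses labeled `ok` should reach storage. Repeated `soft_block` or `unexpected_shape` results are worth treating as a rate-limit signal and slowing down for.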
For protected targets, this escalation usually comes from anti-bot systems, not from the status code itself. The 429 is the warning. Your retry behavior decides whether it stays a warning.
How servers implement rate limits
A rate limit is a server-side rule that controls how many requests a client can make within a time period. Understanding the algorithm helps you predict whether a retry will work.
Fixed window
A fixed window limit counts requests inside a defined time window. For example, a server may allow 60 requests from 09:00:00 to 09:00:59, then reset the counter at 09:01:00.
If you receive 429 at 09:00:40, waiting until the next boundary may be enough. The trade-off is that fixed windows can create edge effects: a client can send a full quota near the end of one window and another full quota at the start of the next, briefly reaching close to double the intended rate.
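The following is a simplified sketch of a fixed window counter, not any specific server's implementation. The class name `FixedWindowLimiter` is hypothetical, and time is passed in as plain seconds so the behavior is easy to follow.

```python
class FixedWindowLimiter:
    """Allow up to `limit` requests per aligned window of `window_seconds`."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.current_window = None
        self.count = 0

    def allow(self, now):
        # Every timestamp maps to a window number; a new number resets the counter.
        window_id = int(now // self.window)
        if window_id != self.current_window:
            self.current_window = window_id
            self.count = 0
        if self.count >= self.limit:
            return False  # the server would answer 429 here
        self.count += 1
        return True


limiter = FixedWindowLimiter(limit=60, window_seconds=60)
late_burst = [limiter.allow(59.0) for _ in range(60)]   # end of the first window
early_burst = [limiter.allow(60.0) for _ in range(60)]  # start of the next window
```

Both bursts are fully allowed, so 120 requests pass within about one second even though the rule says 60 per minute. That is the edge effect in practice.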
Sliding window
A sliding window limit uses a rolling lookback period instead. If the rule is 60 requests per 60 seconds, the server counts requests made during the last 60 seconds from the current moment.
There is no clean reset boundary. Capacity returns only as older requests fall out of the lookback window.
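A simplified sketch of a sliding window log shows why capacity returns gradually. The class name `SlidingWindowLimiter` is hypothetical, and real servers often use cheaper approximations than storing every timestamp.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow up to `limit` requests within the last `window_seconds`."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.timestamps = deque()

    def allow(self, now):
        # Forget requests that have fallen out of the lookback period.
        cutoff = now - self.window
        while self.timestamps and self.timestamps[0] <= cutoff:
            self.timestamps.popleft()
        if len(self.timestamps) >= self.limit:
            return False  # the server would answer 429 here
        self.timestamps.append(now)
        return True
```

With a limit of 3 per 60 seconds and requests at 0, 10, and 20 seconds, the next request is rejected until second 60, when the request from second 0 ages out. Only one slot opens at that point, not the whole quota.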
Token bucket
A token bucket limit uses tokens that refill at a steady rate. Each request consumes a token. When the bucket is empty, the server returns 429.
Short waits may be enough because you only need enough tokens for the next request, not a fully refilled quota.
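Here is a simplified token bucket sketch under the same assumptions as the description above: one token per request and a steady refill. The class name `TokenBucket` and its parameters are hypothetical.

```python
class TokenBucket:
    """Hold up to `capacity` tokens, refilled at `refill_per_second`."""

    def __init__(self, capacity, refill_per_second, now=0.0):
        self.capacity = capacity
        self.rate = refill_per_second
        self.tokens = float(capacity)
        self.updated = now

    def allow(self, now):
        # Add the tokens earned since the last call, capped at the bucket size.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # the server would answer 429 here
```

A bucket with capacity 5 and a refill of 1 token per second accepts a burst of 5, rejects the sixth request, and accepts one more after a single second of waiting.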
How to identify which algorithm applies
You can often infer the model from response behavior:
- If Retry-After is constant and lines up with a clock boundary, it may be a fixed window.
- If it changes depending on when you retry, it's more likely a sliding window or token bucket.
API documentation sometimes names the algorithm directly, which is always better than reverse-engineering it from failures.
How to fix and prevent 429 errors
Treat 429 resolution as an escalation checklist. Start with the signal the server gives you, then change your client behavior, then change proxy infrastructure only when the symptom points to an IP-level limit.
Fix 1: Respect Retry-After
Start with the response headers. If Retry-After is present, honor it. For a seconds-based value, wait for that duration. For an HTTP-date value, convert it to a timestamp and wait until that time.
This is the safest first move because the server has already told you when to retry.
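A minimal sketch of handling both Retry-After forms with the Python standard library. The helper name `retry_after_seconds` is hypothetical, and returning `None` for a missing or unparseable header is a design assumption so the caller can fall back to its own retry logic.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(header_value, now=None):
    """Return the number of seconds to wait, or None if the header is unusable."""
    if not header_value:
        return None
    value = header_value.strip()
    # Form 1: a whole number of seconds, e.g. "120".
    try:
        return max(0.0, float(int(value)))
    except ValueError:
        pass
    # Form 2: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    try:
        retry_at = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if retry_at.tzinfo is None:
        retry_at = retry_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (retry_at - now).total_seconds())
```

With Requests, the header value would typically come from `response.headers.get("Retry-After")`.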
Fix 2: Use exponential backoff with jitter
When Retry-After is missing, don't retry on a fixed 1-second loop. Use retry logic with exponential backoff, then add jitter so multiple workers don't retry at the same instant.
If you're using Python Requests, deeper retry patterns are covered in our blog post on failed retries.
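The sketch below shows one common variant, often called full jitter, where each wait is a random value between zero and an exponentially growing ceiling. The function names, the 1-second base, the 60-second cap, and the `send` callable (anything that returns an object with a `status_code` attribute) are assumptions for illustration.

```python
import random
import time


def backoff_delay(attempt, base=1.0, cap=60.0):
    """Full jitter: a random delay between 0 and min(cap, base * 2**attempt) seconds."""
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(0, ceiling)


def fetch_with_backoff(send, max_attempts=5, base=1.0, cap=60.0, sleep=time.sleep):
    """Call send() until it returns a non-429 response or attempts run out."""
    response = None
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        # Wait before the next attempt, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt, base, cap))
    return response
```

With Requests, `send` could be something like `lambda: requests.get(url, timeout=30)`. The random component keeps workers that were limited at the same moment from all retrying at the same moment.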
Fix 3: Reduce concurrency and add delays
Check whether your workers share the same rate limit. Ten workers, each with a 1-second delay, can still produce 10 requests per second from the same IP, account, or session.
Cap concurrency per identity, not only globally. If the limit is per IP, limit concurrent requests per IP. If it's per account or cookie, limit concurrent requests per session.
For many public APIs and protected targets, a short delay between requests per IP is a safer starting point than sending requests as fast as the network allows. Treat 1–3 seconds as an initial test range, then tune based on response headers, error rate, and successful response validation.
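One way to cap concurrency per identity is a separate semaphore for each IP, account, or session key. This is a sketch for threaded scrapers; the class name `PerIdentityLimiter` and the identity strings are hypothetical.

```python
import threading


class PerIdentityLimiter:
    """Hand out one bounded semaphore per identity (IP, account, session, ...)."""

    def __init__(self, max_concurrent):
        self._max = max_concurrent
        self._lock = threading.Lock()
        self._semaphores = {}

    def slot(self, identity):
        # Create the semaphore on first use; the lock keeps creation thread-safe.
        with self._lock:
            semaphore = self._semaphores.get(identity)
            if semaphore is None:
                semaphore = threading.BoundedSemaphore(self._max)
                self._semaphores[identity] = semaphore
        return semaphore
```

A worker would wrap each request in `with limiter.slot(identity):` so that no more than `max_concurrent` requests run at once for that identity, while other identities are unaffected. A per-identity delay can be added inside the same block.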
Fix 4: Rotate IPs via residential proxies
If the limit is tied to the IP address, distribute requests across more clean IPs. Rotating proxies help because each request, or each short session, can use a different exit IP.
For targets that rate-limit by IP reputation, residential proxies can help because they route traffic through real ISP networks rather than obvious datacenter ranges. Decodo residential proxies list a 115M+ real-user, ethically-sourced residential IP pool.
In Python Requests, the structure looks like this:
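A minimal sketch of that structure, assuming a rotating proxy gateway reachable at a single host and port. The host `proxy.example.com`, the port `8000`, the credentials, and the helper names `build_proxies` and `fetch_via_proxy` are placeholders for illustration, not any provider's actual values.

```python
from urllib.parse import quote

import requests


def build_proxies(username, password, host, port):
    """Build the proxies mapping that Requests expects for HTTP and HTTPS traffic."""
    # Percent-encode credentials so characters like "@" or ":" don't break the URL.
    user = quote(username, safe="")
    secret = quote(password, safe="")
    proxy_url = f"http://{user}:{secret}@{host}:{port}"
    return {"http": proxy_url, "https": proxy_url}


def fetch_via_proxy(url, proxies, timeout=30):
    """Send one GET request through the configured proxy."""
    return requests.get(url, proxies=proxies, timeout=timeout)


# Placeholder values; replace them with the ones from your proxy provider.
proxies = build_proxies("USERNAME", "PASSWORD", "proxy.example.com", 8000)
```

Whether each request gets a new exit IP or keeps one for a short session depends on how the gateway is configured, so check your provider's documentation for the rotation settings.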
Fix 5: Align proxy pool size to request volume
IP rotation only works if the pool is large enough for your target rate. If a target allows 10 requests per minute per IP and you need 1,000 requests per minute, you need at least 100 active IPs just to meet the mathematical threshold.
In practice, you need headroom because requests won't distribute perfectly. Some IPs may cool down, fail validation, or hit stricter local thresholds.
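The sizing arithmetic can be sketched as a small helper. The function name `required_pool_size` and the 50% headroom figure are assumptions for illustration; the right margin depends on how unevenly your traffic spreads across IPs.

```python
import math


def required_pool_size(target_rpm, per_ip_rpm, headroom=0.0):
    """Minimum number of active IPs for a target rate, plus optional headroom."""
    if per_ip_rpm <= 0:
        raise ValueError("per_ip_rpm must be positive")
    minimum = math.ceil(target_rpm / per_ip_rpm)
    return math.ceil(minimum * (1 + headroom))


minimum_ips = required_pool_size(1000, 10)                  # 100
with_headroom = required_pool_size(1000, 10, headroom=0.5)  # 150
```

The first figure matches the 100-IP threshold from the example above. The second adds room for IPs that are cooling down or failing validation.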
At scale, rotating proxies remove the manual work of maintaining proxy lists, health checks, and replacement logic.
Fix 6: Escalate when rate limits combine with bot defenses
If you're seeing 429 together with CAPTCHAs, soft blocks, fingerprint failures, or JavaScript challenges, IP rotation alone may not be enough. That's when you need a managed unblocking layer rather than more raw IPs. Decodo Site Unblocker combines proxy rotation with CAPTCHA handling, browser fingerprinting, JavaScript rendering, and proxy pool management.