403 Forbidden Error and How to Avoid It
A 403 Forbidden error is a rejection signal you'll often hit when sending automated scraping requests. It means the server understood your request and decided not to serve it. In scraping workflows, that often happens because your request looks like a bot.
The fix depends on the signal that triggered the block. A missing User-Agent needs a different fix than a blocklisted proxy IP. A TLS fingerprint mismatch needs a different fix than a broken cookie session.
TL;DR
- In web scraping, a 403 usually means your request looked automated, not that the page disappeared.
- Start with diagnosis: headers, cookies, timing, TLS fingerprint, IP reputation, and location can all trigger 403.
- Don't treat proxies as a universal fix. They help when the IP is the problem and hurt when they break session continuity.
- Fix the cheapest layer first: headers, cookies, and request pacing. Escalate to better proxies or browser-like clients only when the symptom points there.
What is a 403 Forbidden error?
403 Forbidden means the server understood the request but refused access to the resource. That makes it different from similar HTTP status codes:
| Status code | Meaning |
| --- | --- |
| 401 Unauthorized | Authentication is required or failed. |
| 403 Forbidden | The request was understood, but access was refused. |
| 404 Not Found | The resource doesn't exist at that URL, or the server hides whether it exists. |
| 429 Too Many Requests | The client sent too many requests in a given period. |
Some protected sites return 403 instead of 429 when they detect scraping. That hides the rate-limit threshold. If every excessive request returned 429, the limit would be easier to reverse-engineer. 403 is less helpful on purpose.
In practice, 403 has 2 main causes:
- Access restriction. The page may require a logged-in session, a specific account role, or traffic from an allowed country.
- Bot detection. Public pages can reject automated traffic based on IP reputation, HTTP headers, TLS fingerprint, cookie state, or behavior patterns.
Diagnosing your 403: three symptom patterns
Don't try fixing a 403 by randomly stacking headers, delays, and proxies. Instead, start by looking at when the failure happens. The pattern usually tells you which layer to inspect first.
| Symptom | Likely cause | First fix to try |
| --- | --- | --- |
| 403 on the first request | Missing headers, a bot-like User-Agent, or poor IP reputation | Send browser-like headers and test with a cleaner IP |
| 403 after several successful responses | Rate limiting, broken session continuity, or missing cookies | Slow down, persist cookies, and reuse the same session |
| Page loads in a browser but returns 403 in a script | Browser fingerprint mismatch, missing JavaScript execution, or incomplete headers | Try curl_cffi impersonation or a headless browser |
In a first-request failure, the server hasn't seen your behavior yet, so it's judging the request profile: IP address, User-Agent, and headers.
Mid-session failures are more behavioral. If 5 pages work and the 6th fails, check timing, cookies, and whether your scraper changed IPs halfway through the same session.
Browser-only failures usually mean your script says "Chrome" but doesn't behave like Chrome. A real browser sends a broader request profile than a basic HTTP client. That includes headers such as Sec-Fetch-Site, Sec-Fetch-Mode, and Accept-Language. It also includes a TLS handshake profile that some bot systems compare against the browser you claim to be.
Before changing code, inspect what your client actually sends. An echo endpoint such as httpbin.org/headers shows your request headers:
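A minimal sketch of that check, assuming a requests-style client (anything whose get call returns an object with a .json() method). The helper names and the BROWSER_HEADERS list are illustrative and not part of any library:

```python
ECHO_URL = "https://httpbin.org/headers"  # echo endpoint named in the text

# Header names a desktop browser typically sends (illustrative, not exhaustive).
BROWSER_HEADERS = [
    "User-Agent",
    "Accept",
    "Accept-Language",
    "Accept-Encoding",
    "Sec-Fetch-Site",
    "Sec-Fetch-Mode",
]


def echoed_headers(get, url=ECHO_URL, timeout=10):
    """Call the echo endpoint with `get` and return the headers it reports.

    `get` is any requests-style callable, such as requests.get or the
    get method of a configured session.
    """
    return get(url, timeout=timeout).json()["headers"]


def missing_headers(sent, expected=BROWSER_HEADERS):
    """List the expected header names that are absent from `sent`."""
    sent_names = {name.lower() for name in sent}
    return [name for name in expected if name.lower() not in sent_names]
```

With Requests installed, missing_headers(echoed_headers(requests.get)) reports what the library leaves out by default. Passing the get method of your own session shows what your scraper really sends.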
Client choice matters too. Python clients such as Requests, httpx, and aiohttp, as well as browser-like clients such as curl_cffi, send different default headers and have different TLS behavior.
Why proxies sometimes cause 403 errors instead of fixing them
A proxy doesn't make a bad request look human. It changes the network path, IP identity, and sometimes location. That helps only when the 403 comes from IP reputation or geo access.
Datacenter IP reputation
Many datacenter ranges are easy to identify because they belong to hosting providers. If a site rejects known cloud or hosting CIDR blocks, better headers won't save the request.
If IP reputation is the issue, residential proxies or ISP proxies are usually a better fit than datacenter proxies. Residential IPs are assigned by ISPs, so they're less likely to fail checks that reject obvious datacenter networks. ISP proxies can help when the target also expects a stable IP across the same session.
Proxy headers
A clean request reaches the target with only the headers your client meant to send. A misconfigured proxy path can add or forward headers that identify the request as proxied.
X-Forwarded-For, Via, and Proxy-Authorization shouldn't reach the target server in a normal scraping request. If they show up at the target, the proxy path is exposing more than it should.
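One way to check for such leaks is to send a request through the proxy to an echo endpoint and scan what comes back. The sketch below covers only the scanning step. The two header sets are made-up examples (documentation IP range, example hostnames), not captured traffic, and leaked_proxy_headers is a hypothetical helper:

```python
# Headers the text above says should not reach the target server.
PROXY_REVEALING_HEADERS = ("X-Forwarded-For", "Via", "Proxy-Authorization")


def leaked_proxy_headers(received):
    """Return the proxy-revealing headers found in `received`.

    `received` maps header names to values as the target saw them, for
    example the "headers" object returned by an echo endpoint.
    """
    by_lower_name = {name.lower(): value for name, value in received.items()}
    return {
        name: by_lower_name[name.lower()]
        for name in PROXY_REVEALING_HEADERS
        if name.lower() in by_lower_name
    }


# Hypothetical example: a clean request as the target might see it.
clean_request = {
    "Host": "example.com",
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
    ),
    "Accept-Language": "en-US,en;q=0.9",
}

# Hypothetical example: the same request after a leaky proxy path.
leaky_request = {
    **clean_request,
    "X-Forwarded-For": "203.0.113.7",
    "Via": "1.1 proxy.example",
}
```

An empty result from leaked_proxy_headers means none of the three headers arrived. Anything else is worth raising with the proxy provider or fixing in the proxy configuration.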
Rotating proxies
Rotating proxies help distribute requests, but rotating every request can fragment sessions. The site sees one cookie jar jumping across many IPs, while real browsing sessions usually keep the same network identity for at least a short period.
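One common compromise is to rotate between sessions rather than between requests. A minimal sketch, assuming you manage your own list of proxy URLs. The pool entries and session IDs are hypothetical, and many providers offer their own sticky-session options that may fit better:

```python
import hashlib


def sticky_proxy(session_id, proxy_pool):
    """Pick one proxy per logical session and keep returning it.

    Hashing the session ID keeps the choice stable across calls, so one
    cookie jar stays on one IP while different sessions spread over the pool.
    """
    if not proxy_pool:
        raise ValueError("proxy_pool must not be empty")
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(proxy_pool)
    return proxy_pool[index]


# Hypothetical pool; replace with your provider's endpoints.
POOL = [
    "http://proxy-a.example:8000",
    "http://proxy-b.example:8000",
    "http://proxy-c.example:8000",
]
```

The mapping changes if the pool changes size, so this suits short-lived sessions. For long sessions, storing the chosen proxy alongside the cookie jar is more robust.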
Geo-mismatch
If the site only serves a page in certain countries, a proxy in the wrong location can trigger 403.
How to fix 403 Forbidden errors: an escalation checklist
Treat 403 fixes as an escalation path. Start with the lowest-cost layer: headers, cookies, and request pacing. Move to proxies, TLS impersonation, or browsers only when the symptom points there.
Fix 1 – set a realistic User-Agent and complete headers
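The Python Requests library doesn't send a full browser profile by default: its User-Agent is python-requests/<version>, and it omits headers such as Accept-Language and the Sec-Fetch-* family. Many sites accept a basic HTTP client, but more protected sites often won't: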
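An illustrative baseline, assuming a Chrome-on-Windows profile. The values are examples that go stale over time, and with_browser_headers is a hypothetical helper, not a Requests API:

```python
# Example values only; copy current ones from your own browser's dev tools.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
    ),
    "Accept": (
        "text/html,application/xhtml+xml,application/xml;q=0.9,"
        "image/avif,image/webp,*/*;q=0.8"
    ),
    "Accept-Language": "en-US,en;q=0.9",
    # Chrome also advertises br and zstd; add them only if your client can decode them.
    "Accept-Encoding": "gzip, deflate",
    "Upgrade-Insecure-Requests": "1",
    "Sec-Fetch-Dest": "document",
    "Sec-Fetch-Mode": "navigate",
    "Sec-Fetch-Site": "none",
    "Sec-Fetch-User": "?1",
}


def with_browser_headers(session, headers=BROWSER_HEADERS):
    """Apply the baseline headers to a requests-style session and return it."""
    session.headers.update(headers)
    return session


# Usage with Requests (third-party), hypothetical URL:
#   import requests
#   session = with_browser_headers(requests.Session())
#   response = session.get("https://example.com/", timeout=15)
```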
Use this set of headers as a baseline, then mirror what your own browser sends to the same target.
Fix 2 – add request delays and avoid patterns
Use randomized delays and back off after failures. For many sites, a 2-5 second minimum delay is a reasonable starting point, then adjust based on the target's response behavior:
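A minimal sketch of that pacing, assuming fetch is your own function that takes a URL and returns an HTTP status code. The 5-second backoff base and 120-second cap are arbitrary starting values, not recommendations from any library:

```python
import random
import time


def polite_delay(min_seconds=2.0, max_seconds=5.0):
    """Pick a randomized pause within the 2-5 second starting range."""
    return random.uniform(min_seconds, max_seconds)


def backoff_delay(failures, base=5.0, cap=120.0):
    """Exponential backoff with jitter.

    The window doubles with each consecutive failure up to `cap`, and the
    actual pause is drawn from the upper half of that window.
    """
    window = min(cap, base * (2 ** failures))
    return random.uniform(window / 2, window)


def fetch_with_pacing(fetch, urls, max_retries=3, sleep=time.sleep):
    """Fetch each URL, pausing between requests and backing off on 403/429.

    Returns a dict mapping each URL to the last status code seen for it.
    """
    results = {}
    for url in urls:
        for failures in range(max_retries + 1):
            status = fetch(url)
            results[url] = status
            if status not in (403, 429):
                sleep(polite_delay())
                break
            # Blocked or rate limited: wait longer before the next attempt.
            sleep(backoff_delay(failures))
    return results
```

The sleep parameter exists so the loop can be exercised without real waiting. In production, leave it at the default.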
Fix 3 – use a session object and persist cookies
Bare requests.get() calls don't preserve cookies across requests unless you manage them yourself. A requests.Session() keeps cookies and connection state together:
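A short sketch of the pattern. crawl_in_order is a hypothetical helper that accepts any session object with a requests-style get method, and the URLs in the usage comment are placeholders:

```python
def crawl_in_order(session, urls, timeout=15):
    """Fetch `urls` through one session and stop at the first 403.

    Reusing `session` means cookies set by an earlier response are sent
    with later requests. Returns (url, status_code) pairs for every
    request made, so you can see how far the session got.
    """
    visited = []
    for url in urls:
        response = session.get(url, timeout=timeout)
        visited.append((url, response.status_code))
        if response.status_code == 403:
            break
    return visited


# Usage with Requests (third-party), hypothetical URLs:
#   import requests
#   with requests.Session() as session:
#       crawl_in_order(session, ["https://example.com/", "https://example.com/page/2"])
```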
This matters when a site issues a session cookie on the first page and expects it on the next one.
Fix 4 – route through residential proxies
If the same request works from your browser but fails from a server or datacenter proxy, IP reputation is a likely cause. In that case, route the request through a residential or ISP proxy instead of a datacenter IP.
Here's the basic shape in Python Requests:
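The sketch below builds the proxies mapping that Requests accepts. The host, port, and credentials are placeholders, and build_proxies is a hypothetical helper rather than part of Requests:

```python
from urllib.parse import quote


def build_proxies(host, port, username=None, password=None):
    """Return a `proxies` mapping that routes HTTP and HTTPS traffic
    through one HTTP proxy."""
    credentials = ""
    if username is not None:
        # Percent-encode credentials so characters like "@" or ":" stay valid in the URL.
        credentials = f"{quote(username, safe='')}:{quote(password or '', safe='')}@"
    proxy_url = f"http://{credentials}{host}:{port}"
    return {"http": proxy_url, "https": proxy_url}


# Usage with Requests (third-party), placeholder endpoint and credentials:
#   import requests
#   proxies = build_proxies("proxy.example.com", 8000, "user", "secret")
#   response = requests.get("https://example.com/", proxies=proxies, timeout=15)
```

Setting the same mapping on a session's proxies attribute keeps the proxy and the cookie jar together, which matters for the session-continuity issues described earlier.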
Fix 5 – match TLS fingerprints with curl_cffi
If a page loads in a normal browser but returns 403 from your script, even with good headers, the problem may be TLS fingerprinting. The target may see a script claiming to be Chrome in the User-Agent, but sending a TLS handshake that Chrome wouldn’t send. curl_cffi can impersonate browser TLS behavior:
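A minimal sketch, assuming curl_cffi is installed and that the "chrome" impersonation target is available in your version. tls_fingerprint_suspected is a hypothetical helper for comparing results, not part of the library:

```python
def fetch_like_chrome(url, timeout=15):
    """Fetch `url` with curl_cffi impersonating Chrome.

    Returns (status_code, body_text).
    """
    # Third-party dependency, imported here so the sketch loads without it:
    #   pip install curl_cffi
    from curl_cffi import requests as curl_requests

    response = curl_requests.get(url, impersonate="chrome", timeout=timeout)
    return response.status_code, response.text


def tls_fingerprint_suspected(plain_status, impersonated_status):
    """True when a plain client is refused but the impersonated one gets through.

    That combination points at TLS or HTTP fingerprinting rather than at
    the IP, since both requests left from the same address.
    """
    return plain_status == 403 and impersonated_status < 400
```

Run both clients from the same machine and IP before drawing conclusions. If both return 403, the cause is more likely IP reputation, cookies, or pacing.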
Fix 6 – escalate to a headless browser
Use a headless browser when the site needs JavaScript execution, browser storage, dynamic tokens, or a more complete browsing context. Playwright is usually the practical next step.
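A minimal sketch using Playwright's synchronous API, assuming Playwright and its Chromium build are installed. fetch_rendered_html is a hypothetical helper name:

```python
def fetch_rendered_html(url, timeout_ms=30000):
    """Load `url` in headless Chromium and return (status, html).

    `status` is None if the navigation produced no response object.
    """
    # Third-party dependency, imported here so the sketch loads without it:
    #   pip install playwright
    #   playwright install chromium
    from playwright.sync_api import sync_playwright

    with sync_playwright() as playwright:
        browser = playwright.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            response = page.goto(url, wait_until="domcontentloaded", timeout=timeout_ms)
            status = response.status if response is not None else None
            # page.content() returns the DOM after scripts have run so far.
            return status, page.content()
        finally:
            browser.close()
```

A browser is much slower and heavier than an HTTP client, so it fits best as the last step of the escalation path rather than the default.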
Choose the right proxy type for 403 errors
The right proxy type depends on what triggered the 403. Don't start with the most expensive setup. Start with the simplest proxy that clears the actual check.
| Proxy type | 403 risk | Best fit | Trade-off |
| --- | --- | --- | --- |
| Datacenter | Highest on protected targets | Fast access to low-protection sites | Easy to block based on IP reputation |
| Residential | Lower | Targets where IP reputation matters | Rotation still requires cookie and session continuity |
| ISP | Lower for session-sensitive targets | Stable sessions on targets that monitor IP continuity | Less rotation flexibility |
| Rotating | Depends on the rotation strategy | Distributing requests across many IPs | Over-rotation can break sessions |
| Managed scraping service | Lowest manual tuning required | Targets that need managed proxy, browser, and anti-bot handling | Less low-level control |
Use proxies for IP reputation, location, and continuity problems. Don't use them to paper over broken headers, missing cookies, predictable timing, or TLS mismatch. A clean IP won't solve broken cookies. Perfect headers won't fix a banned subnet.