How to Bypass Cloudflare: Complete Guide to Anti-Bot Evasion
Cloudflare is a massive global cloud network that sits firmly between your scraper and the data you need, blocking all requests that fail its multi-layered detection system. It powers nearly 21% of all websites globally, meaning that 1-in-5 sites rely on this network. Therefore, knowing how to bypass it is essential for serious scrapers. This practical walkthrough covers detection methods, tools like Puppeteer and Playwright, and both DIY approaches and managed solutions, including proxy strategies and web scraping APIs.
Vilius Sakutis
Last updated: Jun 02, 2026

TL;DR
- Cloudflare assesses your requests across several layers, and you must "pass" them all
- DIY bypassing relies on tools like Puppeteer, Playwright, stealth plugins, Undetected ChromeDriver, and residential proxy rotation to mimic real users
- Residential proxies and session consistency are crucial
- For fewer than 1,000 requests/day, Puppeteer with stealth plugins may suffice, but beyond that scale, you need a managed web scraping API
- Web scraping APIs are often the most practical solution for large-scale scraping because they eliminate maintenance overhead
How Cloudflare detects and blocks automated traffic
Cloudflare assesses your requests across several layers, and understanding them explains why simple HTTP requests fail instantly. Every HTTP request passes through a multi-layered scoring system, so automated traffic is usually identified before the page content is ever served.
Pre-connection level: Network and protocol fingerprinting
Before your scraper has even sent an HTTP request, Cloudflare has already started analyzing network-level signals. First, it looks at the TLS fingerprint, examining the "Client Hello" message during the SSL/TLS handshake (summarized as JA3 or JA4 hashes). Each client type has distinct connection characteristics, so its fingerprint becomes an identifier. This means that even if you rotate IPs, your TLS stack stays the same. The TLS signature of Python Requests differs clearly from a real browser's, so its fingerprint immediately marks the client as a bot. curl-impersonate and Curlium-style HTTP clients exist to spoof browser TLS signatures.
Cloudflare also checks the HTTP/2 fingerprint, looking for differences between standard libraries and real browsers, and inspects IP reputation to verify that the IP isn't associated with data centers or malicious botnets. Because Cloudflare's global network feeds its IP reputation data, an IP flagged on one site affects requests everywhere.
Inconsistency between layers is therefore the biggest red flag: your HTTP headers may say "Chrome" while your TLS fingerprint says Requests. Driving a real browser through Playwright, which uses a genuine browser TLS stack, is a good option in this case.
Execution level: Browser and device fingerprinting
Browser fingerprinting is an advanced security tool that detects and blocks web scrapers. When Cloudflare-protected pages load in a browser, invisible and lightweight JavaScript starts running to collect hardware and software metrics in great detail. Scrapers that don't execute JS fail immediately.
Canvas rendering, WebGL metadata, font enumeration, audio context, and navigator properties create a unique device fingerprint. The mechanism renders a hidden image or a 3D scene to measure how your graphics card, driver, and anti-aliasing process the pixels. It scans the fonts installed on the OS and measures audio output signatures. It also checks the number of logical processor cores and device memory constraints, and evaluates properties like plugins, screen resolution, time zone, and language settings.
Behavioral analysis
Cloudflare also observes how a visitor behaves on the page, namely mouse movements, keystrokes, scroll depth, and touch events. When the combined signals look suspicious, it escalates to one of two active challenges:
- "Checking your browser" interstitial: runs complex client-side JS challenges and browser fingerprinting instead of traditional cryptographic Proof-of-Work (PoW). These challenges work as a puzzle that a real browser solves almost instantly, but that most scrapers fail.
- Turnstile: a CAPTCHA-free alternative introduced in 2022, which verifies actual users via background JavaScript challenges instead of making them solve visual puzzles. There are three Turnstile widget modes: managed (Cloudflare decides, based on risk, whether to show a checkbox), non-interactive (a visible widget that never asks for interaction), and invisible (no visible widget at all).
Methods and tools to bypass Cloudflare protection
Here we'll look at specific tools that help get around Cloudflare: Puppeteer with stealth plugins, Playwright, Undetected ChromeDriver for Python, and proxy rotation.
Puppeteer with stealth plugins
To bypass Cloudflare protection with Puppeteer, you'll need to install puppeteer-extra and puppeteer-extra-plugin-stealth. Puppeteer Extra is the base framework, while puppeteer-extra-plugin-stealth is an individual plugin that relies on puppeteer-extra to patch navigator properties, spoof user agents, and hide the window.navigator.webdriver flag.
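Installation: run npm install puppeteer puppeteer-extra puppeteer-extra-plugin-stealth in your project directory.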
Basic Setup:
Here's a simple example showing how developers reduce automated browser detection on publicly accessible websites using these tools. It launches Chrome in headed mode (headless: false), randomizes viewport size and the user agent, visits a public employment portal protected by Cloudflare, and extracts job titles.
Run the script:
Note: While this plugin is a standard first step, modern Cloudflare security often involves additional layers like proxy rotation and behavioral simulation.
Playwright for cross-browser automation
Cloudflare uses advanced profiling to detect headless or automated environments, which puts scrapers at particular risk here. Playwright supports Chromium, Firefox, and WebKit, but stock engines leak automation signals. Rotating browser engines through the Playwright API can help you get past this detection by presenting a profile that matches the browser signatures Cloudflare expects to see.
Also, use stealth libraries. To hide headless indicators and overwrite automation artifacts like navigator.webdriver, use the playwright-stealth package in Python, which provides fingerprint masking similar to puppeteer-extra-plugin-stealth.
Finally, persistent browser contexts in Playwright save cookies, cache, and local storage to disk, so session state survives across pages and script runs. Launch a persistent context with a directory path for the session data (launchPersistentContext() instead of browser.newContext()), complete any manual login challenge once, and future script runs will skip the login page.
Undetected ChromeDriver (Python)
Undetected ChromeDriver (UC) is a popular Python library that patches Selenium's ChromeDriver to remove automation flags at the binary level, so the browser looks more like one driven by a regular user and is less likely to trigger anti-bot services. The setup is simpler than Puppeteer for a Python-native workflow.
To get started, install the package with pip install undetected-chromedriver.
This basic script will initialize a protected session:
This approach still requires proxy rotation for scale, as IP bans accumulate fast.
Proxy rotation
Residential proxies are a must when bypassing Cloudflare protection. They use IP addresses assigned to real household devices, so your traffic looks like it comes from a real user and earns a high trust score. High reputation and rotating IPs are vital for passing IP reputation checks. Decodo's residential proxies offer 115M+ ethically-sourced IPs in 195+ locations. Users can select IPs from specific countries to bypass geo-fencing or region-specific Cloudflare firewall rules. Datacenter proxies, while cheaper, are not a good idea in this scenario, because they're easily detected, flagged, and blocked.
Proxy rotation allows bots to harvest data from sites with strict anti-bot protections without triggering CAPTCHAs or IP bans. Rotate IPs per request for general scraping to avoid rate limits, or use sticky sessions (keeping the same IP for 10-30 minutes) for multi-page flows like logins. Decodo's rotating proxies give your project fresh IPs for every single request, enabling you to bypass blocks, CAPTCHAs, and geo-limits.
Also, match your proxy's IP location to the target site's primary audience to avoid geographic mismatches. If a site's primary audience is in the United States, your proxy must route through a US residential IP address.
To rotate per request, configure your HTTP client (like Python Requests) with a backconnect proxy string, and to use sticky sessions, append the sticky argument (e.g., sessionId=12345) to your proxy username so that the backconnect server holds the same exit node. Alternatively, you can set the port to 7000 so that the proxy rotates on every request without any need for manual configuration. Decodo provides you with both rotating and sticky session options.
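The difference between the two modes can be sketched as a small helper that builds the proxy string. This is only an illustration under stated assumptions: the gateway host gate.example.com, the USER and PASS credentials, and the "-sessionId-" username suffix are placeholders, and the real syntax for sticky sessions depends on your provider.

```python
from typing import Optional
from urllib.parse import quote


def build_proxy_url(user: str, password: str, host: str, port: int,
                    session_id: Optional[str] = None) -> str:
    """Return a proxy URL; passing session_id requests a sticky session."""
    # Hypothetical convention: the session id is appended to the username.
    # Check your provider's documentation for the actual format.
    if session_id is not None:
        user = f"{user}-sessionId-{session_id}"
    # Percent-encode credentials so characters like '@' or ':' stay valid.
    return f"http://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}"


# Placeholder gateway and credentials, for illustration only.
rotating = build_proxy_url("USER", "PASS", "gate.example.com", 7000)
sticky = build_proxy_url("USER", "PASS", "gate.example.com", 7000, session_id="12345")

# This mapping is the shape Python Requests accepts through its `proxies=` argument.
proxies = {"http": sticky, "https": sticky}
```

Keeping the session id in one place makes it easy to hold the same exit node for a multi-page flow and drop the argument when you want a fresh IP per request.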
Cloudflare handled for you
Decodo's Web Scraping API bypasses Cloudflare's bot detection, JS challenges, and CAPTCHAs automatically. Your code makes one request and gets clean HTML back.
Handling CAPTCHAs, JavaScript challenges, and Turnstile
Bypassing Cloudflare's interactive anti-bot barriers is tricky. You can choose to go with one of the DIY methods or leave it all to a specialized API. Keep in mind that DIY methods are a good choice for small-scale work, but ultimately come with a massive maintenance burden.
Handling CAPTCHAs and challenges manually (aka DIY) requires creating automated requests that can't be distinguished from humans. Methods include:
- Impersonating a real browser's TLS fingerprint
- JavaScript and challenge handling
- Sending requests directly to the server's origin IP instead of the domain name
- Using fortified headless browsers (e.g., Puppeteer with stealth plugins)
JavaScript challenge pages
Cloudflare's "Checking your browser" screen is an automated validation sequence you'll commonly see when automating with headless browsers like Playwright. During the spinner phase, a hidden script checks for hardware acceleration, rendering speeds, and tests if drawing to an HTML5 canvas produces a unique, device-specific fingerprint. It validates native APIs, looking for inconsistencies that typically reveal headless automation frameworks like Puppeteer. Moreover, it validates TLS/SSL handshakes, HTTP/2 signatures, and browser headers.
To handle this hurdle:
- Avoid page.waitForNavigation() alone, as it often times out/fails
- Use page.waitForSelector('.main-content') or page.waitForFunction() to detect when the site content appears
- Use a randomized interval if using a delay (3-5 seconds)
Once you have a cf_clearance cookie, reuse it to maintain session persistence and avoid repeated Cloudflare challenges on subsequent requests. But be careful: Cloudflare ties the cookie to the client that earned it, so any mismatch invalidates the token and returns you to the "Checking your browser" screen. To avoid this, make sure the IP address, User-Agent, and TLS fingerprint stay exactly the same as when the cookie was issued.
Cloudflare Turnstile challenges
Cloudflare Turnstile replaced traditional CAPTCHAs with non-interactive JavaScript challenges and only occasionally prompts users with a simple checkbox to proceed. It analyzes web APIs and browser characteristics and runs lightweight PoW tests in the background. To detect a Cloudflare Turnstile widget, you can evaluate the webpage DOM for the specific cf-turnstile class or look for iframes pointing to Cloudflare's challenge infrastructure.
Here are the specific selectors to target:
1. iframe src attribute.
Scan for iframes with URLs hosted on the Cloudflare challenges domain, for example with the CSS selector iframe[src*="challenges.cloudflare.com"].
2. Turnstile class names (the primary and most important class name is cf-turnstile).
Scan for the primary container class used by Cloudflare for rendering the widget, for example with the CSS selector .cf-turnstile.
You can programmatically detect these elements using standard web scraping or automation tools. Here are some detection implementation examples.
JavaScript/Puppeteer:
Python/Playwright:
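As a dependency-free illustration of the same two checks, the sketch below scans HTML you have already fetched, using only Python's standard library. The class and function names are made up for this example; in a Playwright script the HTML string would typically come from page.content().

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class TurnstileDetector(HTMLParser):
    """Flags markup that contains a Turnstile container or challenge iframe."""

    def __init__(self) -> None:
        super().__init__()
        self.found_container = False
        self.found_iframe = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # Check 2: the primary container class used to render the widget.
        if "cf-turnstile" in (attr.get("class") or "").split():
            self.found_container = True
        # Check 1: an iframe served from Cloudflare's challenges host.
        if tag == "iframe":
            host = urlparse(attr.get("src") or "").hostname or ""
            if host == "challenges.cloudflare.com":
                self.found_iframe = True


def has_turnstile(html: str) -> bool:
    detector = TurnstileDetector()
    detector.feed(html)
    return detector.found_container or detector.found_iframe


sample = '<form><div class="cf-turnstile" data-sitekey="SITEKEY"></div></form>'
print(has_turnstile(sample))                    # True
print(has_turnstile("<p>No widget here</p>"))   # False
```

Matching on the parsed hostname rather than a raw substring avoids false positives from pages that merely mention the challenges domain in text.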
However, DIY solving is impractical. Turnstile requires behavioral signals that scripted interactions simply can't reliably produce.
When DIY fails: CAPTCHA solving services
Services like 2Captcha, CapMonster, Anti-Captcha, and CapSolver follow a specific token-based model for challenges like reCAPTCHA, hCaptcha, and Cloudflare Turnstile. You provide the sitekey and URL, their solvers (human workers or ML algorithms) process the challenge, and they return a long alphanumeric string. This token must then be injected into the target webpage or request to bypass the challenge.
CAPTCHA solving adds 5-30 seconds per request and $2-$3 per 1,000 solves. Therefore, manual or brute-force CAPTCHA solving is a bottleneck and is not viable at scale.
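To see why this becomes a bottleneck, here is a quick back-of-the-envelope calculation using the figures above. It assumes, pessimistically, that every request triggers a solve and that solves run one after another; the function name and the 100,000-request workload are illustrative.

```python
def solver_overhead(request_count: int,
                    seconds_per_solve: tuple = (5, 30),
                    usd_per_1000: tuple = (2.0, 3.0)) -> dict:
    """Estimate the (low, high) cost and sequential time added by CAPTCHA solving."""
    low_s, high_s = seconds_per_solve
    low_usd, high_usd = usd_per_1000
    thousands = request_count / 1000
    return {
        "cost_usd": (thousands * low_usd, thousands * high_usd),
        "sequential_hours": (request_count * low_s / 3600,
                             request_count * high_s / 3600),
    }


estimate = solver_overhead(100_000)
print(estimate["cost_usd"])  # (200.0, 300.0)
print(tuple(round(h, 1) for h in estimate["sequential_hours"]))  # (138.9, 833.3)
```

Running solves in parallel shortens the wall-clock time, but the per-request latency and the per-solve cost stay the same.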
The most effective approach: Managed API
The best approach to scaling automation is to prevent CAPTCHAs from rendering in the first place. We can accomplish this by using web scraping APIs, which significantly reduce both cost and latency. They handle CAPTCHA solving, JavaScript rendering, and fingerprint management in one request. You don't need to manage solver integrations, browser infrastructure, or proxy pools separately.
Decodo's Web Scraping API enables JavaScript rendering and stealth mode with simple parameters, with no headless browser setup required. It comes with built-in, automatic proxy handling, retries, and CAPTCHA bypassing. Use it for sites with strong anti-bot technology and for browser rendering when you need a fast, all-in-one, "skip-the-complexity" solution that manages parsing, browser rendering, anti-bot bypass, and proxy rotation. Decodo's API needs approximately 8 lines of code, compared to 60 for the manual stack. Also, you pay for successful requests only.
Common pitfalls when bypassing Cloudflare and how to avoid them
This section covers the typical mistakes you're likely to make when bypassing Cloudflare while web scraping, along with a fix for each.
Pitfall 1: Using default HTTP client headers
Issue: Default headers don't have the metadata and complexity of real browsers. Standard libraries like Python's requests, Node.js' axios, or basic curl commands send distinct default headers, triggering instant detection.
Fix: Your client must accurately mimic a legitimate browser environment. Replace default strings with a real browser User-Agent, and use tools like curl-impersonate to help match a browser's TLS signature. Add Accept, Accept-Language, Accept-Encoding, and Referer headers to match what a real user would send during a session. Alternatively, drive a real browser with an automation tool such as Playwright.
Go further: Ensure header order matches real browsers. Chromium sends HTTP/2 pseudo-headers in the order :method, :authority, :scheme, :path, followed by regular headers (such as sec-ch-ua, user-agent, accept, accept-encoding, and accept-language) in a fixed sequence, so any deviation can raise a flag.
Pitfall 2: Ignoring TLS fingerprint mismatches
Issue: A handshake that doesn't match your headers flags your requests immediately – for example, your headers say "Chrome 120" but your TLS handshake says "Python 3.11." This leads to blocks or 403 errors.
Fix: Use curl-impersonate and libraries that allow you to spoof the TLS fingerprints of legitimate browsers (curl_cffi or tls-client for Python, and tls-client for Node.js). Use httpx with custom SSL contexts, or delegate to a browser/API that handles TLS correctly.
Go further: Rotate your IP addresses. Instead of sending raw HTTP requests, use browser automation tools with stealth patches.
Pitfall 3: Sending requests too fast
Issue: Cloudflare flags you with a "429 Too Many Requests" error or forces a verification challenge, because it identified your request volume, speed, and/or HTTP fingerprint as bot behavior. 100 requests/second from one IP triggers rate limiting before any fingerprint analysis happens.
Fix: Implement exponential backoff and increase your wait time exponentially after each failure (e.g., 1s, 2s, 4s, 8s, 16s). Add "jitter", meaning include random delays (2-8 seconds between requests). Spread traffic across the IP pool.
Go further: Respect the Retry-After header. If you pass a Cloudflare challenge, capture the cf_clearance cookie to continue the session.
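The backoff, jitter, and Retry-After advice can be combined in one small function. This is a minimal sketch rather than a complete retry loop: the function name, the 60-second cap, and the 1-second jitter window are assumptions you should tune for your target.

```python
import random
from typing import Optional


def next_delay(attempt: int, retry_after: Optional[str] = None,
               base: float = 1.0, cap: float = 60.0,
               max_jitter: float = 1.0,
               rng: Optional[random.Random] = None) -> float:
    """Return how many seconds to wait before retry number `attempt` (0-based)."""
    # A Retry-After header given in whole seconds takes priority.
    # (The header may also carry an HTTP date; this sketch ignores that form.)
    if retry_after is not None and retry_after.strip().isdigit():
        return float(retry_after.strip())
    rng = rng or random.Random()
    # Exponential growth: 1s, 2s, 4s, 8s, 16s, ... capped at `cap`.
    exponential = min(cap, base * (2 ** attempt))
    # Random jitter keeps parallel workers from retrying at the same instant.
    return exponential + rng.uniform(0.0, max_jitter)


# Delays for the first five retries when the server sends no Retry-After.
delays = [next_delay(n) for n in range(5)]
```

In a real scraper you would call time.sleep(next_delay(attempt, response.headers.get("Retry-After"))) after each 429 response and give up after a fixed number of attempts.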
Pitfall 4: Not persisting session cookies
Issue: Cloudflare relies on two cookies to remember a solved challenge: cf_clearance and __cf_bm. If you don't persist these session cookies, you will be treated as a new visitor on every subsequent request, triggering a new challenge.
Fix: Store and reuse cookies across requests. In Playwright or Puppeteer, save them to a local JSON file and reload them for subsequent script runs. In Puppeteer, use page.cookies() to export and page.setCookie() to import; in Playwright, use context.cookies() and context.addCookies().
Go further: Keep TLS and network fingerprints in sync with the session. If the User-Agent, Accept headers, or TLS signature changes between requests, the session cookie will be invalidated.
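The save-and-reload step can be sketched with the standard library alone. The example assumes cookies in the list-of-dicts shape that Playwright's Python API returns from context.cookies(), where expires is a Unix timestamp and -1 marks a session cookie; the helper names and the placeholder cookie values are hypothetical.

```python
import json
import tempfile
import time
from pathlib import Path
from typing import Optional


def save_cookies(cookies: list, path: Path) -> None:
    """Write exported cookies to a local JSON file."""
    path.write_text(json.dumps(cookies, indent=2), encoding="utf-8")


def load_cookies(path: Path, now: Optional[float] = None) -> list:
    """Read cookies back, dropping any that have already expired."""
    if not path.exists():
        return []
    now = time.time() if now is None else now
    cookies = json.loads(path.read_text(encoding="utf-8"))
    # Keep session cookies (expires missing or -1) and unexpired ones.
    return [c for c in cookies
            if c.get("expires", -1) in (-1, None) or c["expires"] > now]


# Demo with placeholder values; real cookies would come from context.cookies().
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "cookies.json"
    save_cookies([
        {"name": "cf_clearance", "value": "PLACEHOLDER", "expires": 2_000.0},
        {"name": "stale", "value": "PLACEHOLDER", "expires": 10.0},
        {"name": "session_only", "value": "PLACEHOLDER", "expires": -1},
    ], store)
    kept = load_cookies(store, now=1_000.0)

print([c["name"] for c in kept])  # ['cf_clearance', 'session_only']
```

Filtering expired entries on load avoids replaying a stale cf_clearance value, which would only trigger a fresh challenge.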
Pitfall 5: Using free or blacklisted proxies
Issue: Free or blacklisted proxies have a poor reputation and a lack of rotation, leading to blocks and persistent CAPTCHAs. Free proxy lists are harvested and overused, so those IPs are already flagged in Cloudflare's database.
Fix: Use residential or ISP proxies from reputable providers with ethically sourced IP pools.
Go further: Use Decodo residential proxies with automatic IP rotation. They offer a pool of 115M+ ethically-sourced residential IPs across 195+ locations worldwide, and the best response time in the market. Decodo residential proxies also integrate with any HTTP client.
To set up Decodo residential proxies with rotating IPs, follow these steps:
- Register or log in to the Decodo dashboard.
- Navigate to residential proxies, then choose a subscription or start a 3-day free trial.
- Go to Proxy setup.
- Select a location or choose Random.
- Set the rotating session type and choose a protocol (HTTP(S) or SOCKS5).
- Choose the authentication type.
- Download the generated endpoint and credentials or copy them into your scraper, browser, or software.
When bypass techniques fail: Troubleshooting and fallbacks
DIY methods are hard work and are bound to hit walls. When they do, the failure usually shows up as one of a few common signals, so it pays to know the specific error codes and how to resolve them.
403 Forbidden
Meaning: IP blocked or fingerprint rejected.
Solution: Try different proxies and check TLS fingerprints. You may need to fix a misconfiguration or solve a bot challenge. If you control the Cloudflare zone (for example, when testing against your own site) and a specific bot or user is being blocked, add their IP to the IP Access Rules in your Cloudflare dashboard and set the action to Allow.
429 Too Many Requests
Meaning: Rate limit triggered.
Solution: Slow down, rotate IPs, and implement backoff. Keep your request rate under the site's rate limit thresholds, and make sure your code respects the Retry-After header. Switch networks to get a new IP address, and use rotating residential proxies combined with HTTP header rotation to mask your browsing fingerprints.
503 Service Unavailable
Meaning: The origin server is temporarily overloaded or down for maintenance, so Cloudflare can't reach it. In a scraping context, a 503 can also mean that your request failed server-level checks.
Solution: Mimic human behavior by setting a realistic User-Agent, introducing random delays, and using specialized libraries. Products like Decodo's Web Scraping API automatically manage proxies, headers, and headless browsing to bypass 503 errors on your behalf.
1010 Access Denied
Meaning: Your headless browser is detected, and your access is blocked based on your browser's signature. Cloudflare Error 1010 often happens when using automation tools (Puppeteer, Selenium, or Playwright) because they lack the fingerprints of human-operated browsers.
Solution: Use stealth plugins to patch the specific fingerprints Cloudflare looks for. Simulate human behavior by including random mouse movements, delays between actions, and scrolling.
Endless "Checking your browser" loop
Meaning: JavaScript challenge never resolves, meaning that the browser fingerprint is failing. Cloudflare has flagged your automated request because it's showing missing/irregular browser fingerprints or is originating from a known datacenter IP address.
Solution: Use Puppeteer with stealth plugins, route your requests through residential proxies, and once a Cloudflare challenge is solved, extract and cache the cf_clearance cookie issued by the server to reuse it.
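The status-based signals above can be folded into a small triage helper that your scraper calls on every failed response. This is a rough sketch: the hint texts are condensed from the list above, and the assumption that error 1010 appears in the response body as "error code: 1010" or "Error 1010" may not hold for every page template.

```python
import re

# Meanings and first steps, condensed from the troubleshooting list.
STATUS_HINTS = {
    403: "IP blocked or fingerprint rejected: try different proxies, check the TLS fingerprint",
    429: "rate limit triggered: slow down, rotate IPs, honor Retry-After",
    503: "service unavailable: add random delays and retry later",
}

# Assumption: Cloudflare 1xxx codes are printed in the body, not the status line.
CF_ERROR_RE = re.compile(r"error(?:\s+code)?:?\s*(1\d{3})", re.IGNORECASE)


def diagnose(status: int, body: str = "") -> str:
    """Map an HTTP status and response body to a likely cause and next step."""
    match = CF_ERROR_RE.search(body)
    if match and match.group(1) == "1010":
        return "1010 access denied: browser signature rejected, review stealth patches"
    return STATUS_HINTS.get(status, f"HTTP {status}: no known Cloudflare block signal")


print(diagnose(429))
print(diagnose(403, "error code: 1010"))
```

An endless "Checking your browser" loop can't be detected from a single response, so track how many times a URL has returned a challenge page and fall back after a fixed limit.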
Fallback solutions
- Switch browser engine. If Chromium is flagged, try Firefox via Playwright. Combine the Firefox engine with dedicated or residential proxy pools. Before starting, make sure the Playwright Firefox browser binary is downloaded (playwright install firefox).
- Upgrade proxy quality. Move from datacenter to residential or ISP proxies. Use proxy rotation so that repeated requests appear to come from different, real-world internet connections.
- Delegate to a managed API. When DIY maintenance exceeds the value of the data, offload to a web scraping API. For developers who focus on data extraction rather than infrastructure, Decodo Web Scraping API is the recommended solution for complex targets.
Using a web scraping API to bypass Cloudflare at scale
Based on what we've seen in this guide, Decodo Web Scraping API is the most practical and comprehensive solution for production workloads, large-scale data extraction, and complex targets. It takes the complexity out of users' hands and handles residential proxy rotation, headless browser rendering, and anti-bot bypass. You send an HTTP request and get clean HTML or JSON back.
This service features advanced anti-bot evasion, a vast proxy network, and AI-optimized data output, supports complex, dynamic content rendering, and provides pre-built templates for major platforms.
Why APIs beat DIY at scale
Building a DIY scraping stack for Cloudflare-protected websites is complicated and difficult to maintain, because developers need to do everything on their own. They have to combine headless browsers like Puppeteer or Playwright with stealth plugins, proxy rotation systems, CAPTCHA solvers, session persistence, and retry logic – all this just to keep scrapers operational. Maintaining this stack against constantly evolving anti-bot systems becomes a full-time job, shifting developers' focus to infrastructure instead of the actual scraping. Any DIY setup requires continuous monitoring, debugging, and patching to stay functional.
A web scraping API is often much simpler than maintaining your own browser automation stack. It reduces this operational burden by handling the infrastructure automatically, as the provider maintains all the systems for you so you can focus on extracting and processing data. It features automatic JavaScript rendering, handles advanced browser impersonation techniques behind the scenes, and provides built-in residential proxy rotation with geotargeting support.
In the example below, the API handles JavaScript rendering, browser fingerprinting, proxy rotation, and anti-bot protection automatically while you scrape a public travel comparison page (US geotargeting and anti-bot bypass mode enabled).
Before running the script, install Requests (a Python library) on your computer by running pip install requests in your system terminal (Command Prompt, PowerShell, or Terminal app).
This downloads the Requests library from Python's package repository (PyPI) and installs it on your computer.
Full Python script:
Run the script from the terminal and press enter.
The output appears directly inside the terminal window.
A successful HTTP 200 response combined with a large HTML response size usually confirms that the request returned the real rendered page instead of a Cloudflare challenge page or CAPTCHA screen.
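That check can be automated with a simple heuristic. The sketch below is an assumption-laden starting point, not a guarantee: the 20,000-byte threshold is arbitrary and should be tuned per target, and the only marker it looks for is the "Just a moment..." title that Cloudflare's interstitial page uses.

```python
# Title text shown by Cloudflare's challenge interstitial.
CHALLENGE_TITLE = "<title>Just a moment...</title>"


def looks_like_real_page(status: int, html: str, min_bytes: int = 20_000) -> bool:
    """Heuristic: HTTP 200, no challenge title, and a reasonably large body."""
    if status != 200:
        return False
    if CHALLENGE_TITLE in html:
        return False
    # Rendered pages are usually much larger than challenge or error stubs.
    return len(html.encode("utf-8")) >= min_bytes


real_page = "<html><body>" + "<p>listing</p>" * 2_000 + "</body></html>"
challenge = "<html><head>" + CHALLENGE_TITLE + "</head></html>"

print(looks_like_real_page(200, real_page))   # True
print(looks_like_real_page(200, challenge))   # False
```

For stricter validation, also check that an element you expect on the target page, such as a results container, is present in the HTML.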
When to choose API vs. DIY
| Key factors | Choose DIY Scraping | Choose a Web Scraping API |
| --- | --- | --- |
| Best for | Learning projects and experimentation | Production systems and business use |
| Request volume | Low-volume scraping | Medium to large workloads |
| Target protection | Basic anti-bot protection | Cloudflare/CAPTCHA-heavy sites |
| Technical complexity | More hands-on setup | Faster deployment, less effort |
| Infrastructure management | You manage proxies, browsers, and retries | Provider handles infrastructure |
| Maintenance effort | Very high | Very low |
| Cost structure | Lower upfront cost, more time investment | Usage-based pricing |
| Reliability | Can become unstable at scale | Built for scalability and reliability |
| Common tools | Playwright, Puppeteer, SeleniumBase | Web Scraping APIs |
| Main advantage | Customization and learning value | Simplicity and operational stability |
Final thoughts
Bypassing the super-sensitive Cloudflare protection requires understanding how modern anti-bot systems analyze traffic across different detection layers: TLS fingerprints, browser behavior, JavaScript execution, and session consistency. All of these must align, or the request gets flagged.
DIY approaches using Puppeteer, Playwright, stealth plugins, and residential proxies work well for learning and smaller scraping workloads, but maintaining them at scale quickly turns time-consuming and operationally expensive. For production pipelines and heavily protected targets, managed solutions, specifically dedicated APIs, are the more practical choice, allowing developers to focus on extracting data rather than maintaining anti-bot infrastructure.
Stop reverse-engineering Cloudflare
Challenge tokens change, fingerprint checks evolve, and your workaround breaks again next week. Decodo stays ahead of Cloudflare, so you don't have to.
About the author

Vilius Sakutis
Head of Partnerships
Vilius leads performance marketing initiatives with expertise rooted in affiliates and SaaS marketing strategies. Armed with a Master's in International Marketing and Management, he combines academic insight with hands-on experience to drive measurable results in digital marketing campaigns.
All information on Decodo Blog is provided on an as is basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Decodo Blog or any third-party websites that may be linked therein.


