How to Scrape Google AI Mode: Methods, Tools, and Best Practices

Google AI Mode launched as a Search Labs experiment in March 2025. It's powered by Gemini 2.5, which synthesizes answers from multiple sources and lets you ask follow-up questions. AI Mode isn't the same as standard Google search results: it's a full-page conversational interface with its own URL parameters and rendering pipeline, so it needs different scraping logic. This guide walks through two approaches: a working Playwright script you can run right away, and the Decodo Web Scraping API for production.

Dominykas Niaura

Last updated: Apr 03, 2026

10 min read

TL;DR

Google AI Mode is a conversational search interface that generates synthesized answers with cited sources instead of traditional ranked results. This guide explains how to scrape AI Mode data using Playwright for manual browser automation or Decodo Web Scraping API for production-scale extraction, while addressing challenges such as JavaScript rendering, streaming responses, DOM changes, and Google’s anti-bot protections.

Why scrape Google AI Mode?

AI Mode data is an entirely different proposition from standard SERP data. Here are the main reasons it's worth collecting and what you can do with it.

SEO and content strategy

Citation tracking. Monitor which domains Google’s AI references for your target keywords, how frequently, and in which order. An AI Mode citation gives you visibility comparable to a top-10 organic result, and few competitors analyze it yet.
Content gap identification. Compare what AI Mode says about a topic against your existing content to find missing angles, data points, or structured formats such as FAQs and comparison tables. When your content has a gap, AI Mode tends to reveal it by citing someone else instead.
Featured content shifts. Track how AI Mode answers evolve over time for the same query. Longitudinal tracking shows which content formats and sources are gaining or losing authority.
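
Citation tracking can be sketched in a few lines. The example below assumes you have already collected the cited URLs for a keyword as a plain list of strings; the function name and the sample list are illustrative only, not part of any library.

```python
from collections import Counter
from urllib.parse import urlsplit


def count_cited_domains(urls):
    """Count how often each domain appears in a list of cited URLs."""
    domains = []
    for url in urls:
        host = urlsplit(url).hostname or ""
        # Treat www.example.com and example.com as the same site
        if host.startswith("www."):
            host = host[4:]
        if host:
            domains.append(host)
    return Counter(domains)


# Illustrative input: URLs cited for one keyword
sample_urls = [
    "https://www.cnet.com/tech/services-and-software/best-proxy-servers/",
    "https://cybernews.com/best-proxy/",
    "https://cybernews.com/best-proxy/",
]
for domain, hits in count_cited_domains(sample_urls).most_common():
    print(domain, hits)
```

Running the same count on a schedule and comparing the results over time is one simple way to see which domains gain or lose citations.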

Competitive intelligence

Brand mentions monitoring. Capture when and how competitors appear in AI responses for your target keywords and in what context. Being cited as a cautionary example is entirely different from being cited as the recommended solution.
Product positioning. For e-commerce and SaaS, understand which attributes AI Mode highlights when recommending products in your category. AI Mode structures product comparisons around attributes, pricing, and ratings.
Market narrative tracking. AI Mode frames how Google understands your industry. Watching that framing shift over quarters is early-warning intelligence for positioning and messaging.

Research and data enrichment

Building datasets for RAG. AI Mode gives you synthesized summaries alongside their cited sources – ideal labeled data for retrieval-augmented generation applications.
Academic and market research. Use AI Mode as a pre-processed synthesis layer instead of manually analyzing hundreds of search results. The citations point you directly to primary sources for verification.
Training data curation. AI Mode responses with citations serve as structured, labeled data for fine-tuning domain-specific question-answering models.

Challenges and anti-scraping measures in Google AI Mode

Before getting into the code, it's important to understand what you're up against. AI Mode is harder to scrape than a standard SERP, in terms of both design and system architecture.

Let's review some of the technical challenges, anti-bot measures, and scaling difficulties.

Technical challenges

JavaScript-heavy rendering. AI Mode content is generated dynamically: a raw HTTP request to the search URL returns empty containers. The AI response isn't in the initial HTML; JavaScript fetches and renders it after page load, so you need a full browser engine.
Streaming and delayed content. AI responses stream in progressively rather than arriving all at once, so you need careful wait strategies instead of simple page-load detection. Google streams HTML fragments via /async/folwr using chunked transfer encoding with Brotli compression.
Nested DOM structures. The AI Mode container (div[data-subtree="aimc"]) holds complex nested elements (citations, follow-up suggestions, product cards) that require careful parsing. Google updates this structure regularly.
Frequent layout changes. Google regularly updates its AI Mode interface and runs 816+ active experiments on it simultaneously, so selectors that work today can break overnight without warning.
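
One way to soften selector breakage is to keep a short list of candidate selectors and use whichever currently matches. The helper below is a minimal sketch: the function name is hypothetical, and it only assumes a Playwright-style page object whose locator(selector).count() returns the number of matching elements.

```python
def first_matching_selector(page, selectors):
    """Return the first selector that matches at least one element, else None."""
    for selector in selectors:
        try:
            if page.locator(selector).count() > 0:
                return selector
        except Exception:
            # A malformed or unsupported selector should not stop the search
            continue
    return None


# Candidates mentioned in this guide; both can change without notice
AI_MODE_CONTAINERS = ['div[data-subtree="aimc"]', "div.mZJni"]
```

When the helper returns None, that is a useful signal that the layout has changed and the candidate list needs updating.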

Anti-bot measures

Bot detection. Google flags automated traffic patterns after a limited number of requests, triggering CAPTCHAs or blocking access entirely. See our guide on anti-scraping techniques to learn how to avoid them.
Browser fingerprinting. Google detects headless browsers through WebDriver flags, missing browser plugins, and other fingerprinting signals. For CAPTCHA strategies, see our blog post on Google CAPTCHAs.
Rate limiting. Aggressive request patterns result in temporary or permanent IP blocks. Proxy rotation and pacing are non-negotiable.
Geographic restrictions. AI Mode is now available in more than 200 countries, but availability still varies by region, and some proxy IPs may route to regions where it isn't active. Always verify that your proxy geography matches your target.
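
Pacing can be as simple as a randomized pause between queries. The generator below is a sketch under assumed values: the 8-second base and the 50% jitter are placeholders to tune, not thresholds published by Google, and the function name is hypothetical.

```python
import random


def paced_delays(n_requests, base_seconds=8.0, jitter=0.5, rng=None):
    """Yield one pause (in seconds) per request; the first request has no pause."""
    rng = rng or random.Random()
    for i in range(n_requests):
        if i == 0:
            yield 0.0
        else:
            # Randomize around the base so requests are not evenly spaced
            low = base_seconds * (1 - jitter)
            high = base_seconds * (1 + jitter)
            yield rng.uniform(low, high)
```

In a query loop, you would call time.sleep() with each yielded value before sending the next request.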

Scaling difficulties

Running full browser instances consumes significant memory and CPU, making bulk queries (hundreds or thousands of keywords) impractical without serious infrastructure.
Maintaining proxy rotation, user agent pools, and fingerprinting evasion adds substantial development and operational overhead that compounds at scale.
Monitoring and adapting to Google’s changes requires ongoing engineering effort. The 816+ simultaneous A/B experiments mean your selectors can break at any time.

Playwright is an excellent option for prototyping and low-volume monitoring. If you're handling more than a few hundred queries per day, Decodo Web Scraping API removes the infrastructure problems so you can focus on the data.

Custom scraping with Playwright

Let's walk through building a working Playwright scraper for Google AI Mode, from environment setup to extracting and saving the response. We'll go over the main functions step by step, and you'll find the full script further below.

For more information on the JavaScript rendering context, see our guide on how to scrape websites with dynamic content.

Prerequisites

Before writing or running the script, let's properly set up the environment.

Python. Make sure you've got Python 3.8+ installed on your system.
Playwright. Install Playwright and the Chromium browser with the following commands in your terminal:

pip install playwright
playwright install chromium

Development environment. Use a code editor or IDE like Visual Studio Code, or any text editor paired with a terminal. Make sure your terminal uses the same Python environment where Playwright is installed.

Proxy setup for scraping

For real-world scraping, using proxies is essential. Residential proxies route your traffic through real user devices, making requests appear more natural and helping avoid blocks, rate limits, and anti-bot systems. They are especially important when working with sites that actively monitor traffic patterns.

Decodo offers high-performance residential proxies with a 99.92% success rate, response times under 0.6 seconds, and geo-targeting across 195+ locations. Here's how to get started:

Create your account. Sign up at the Decodo dashboard.
Select a proxy plan. Choose a subscription that suits your needs or start with a 3-day free trial.
Configure proxy settings. Set up your proxies with rotating sessions for maximum effectiveness.
Select locations. Target specific regions based on your data requirements or keep it set to Random.
Copy your credentials. You'll need your proxy username, password, and server endpoint to integrate into your scraping script.

Imports and configuration

These imports provide the core functionality for browser automation, timing, and data handling:

from playwright.sync_api import sync_playwright
import random
import time
import json
from datetime import datetime

sync_playwright. Controls the browser
random & time. Simulate human-like interaction
json & datetime. Handle structured output and timestamps

Below the imports are the configuration variables:

PROXY_USERNAME = "YOUR_PROXY_USERNAME"  
PROXY_PASSWORD = "YOUR_PROXY_PASSWORD"  

SEARCH_QUERY = "best proxies"  
HEADLESS = True

Update these before running the script:

Proxy credentials. Replace with your own from the dashboard
Search query. The term you want to send to Google AI Mode
Headless mode. Set to False if you want to see the browser for debugging
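
If you'd rather not keep credentials in the script itself, you could read them from environment variables. This is a sketch, not part of the script in this guide: the variable names PROXY_USERNAME and PROXY_PASSWORD and the function name are assumptions, while the default server value is the endpoint the script uses.

```python
import os


def build_proxy_config(server="http://gate.decodo.com:7000"):
    """Build a proxy dict with the server, username, and password keys."""
    username = os.environ.get("PROXY_USERNAME")
    password = os.environ.get("PROXY_PASSWORD")
    if not username or not password:
        raise RuntimeError("Set PROXY_USERNAME and PROXY_PASSWORD first")
    return {"server": server, "username": username, "password": password}
```

The returned dict has the same shape as the proxy argument passed to the browser launch call in the full script.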

Simulating human behavior

The first helper function introduces small, random interactions to make the session look less automated:

def generate_human_behavior(page):

Its full implementation handles:

Random mouse movement
Small scroll actions
Short, randomized delays

These interactions run throughout the session to reduce bot-like patterns.

The next helper function handles Google’s cookie consent popup. Its definition is:

def handle_consent(page):

It loops through several possible button selectors and clicks the first visible one. If no popup appears, execution continues without interruption.

Launching the browser and setting up the session

This line defines the function that holds the main scraping logic:

def scrape_ai_mode(query: str) -> str:

The full implementation:

Launches a Chromium browser
Applies proxy routing
Configures the browsing context
Executes the scraping flow

The browser is launched with:

headless=HEADLESS. Runs in the background unless disabled
slow_mo=50. Adds slight delays between actions
--disable-blink-features=AutomationControlled. Reduces automation signals

The proxy configuration routes all traffic through the residential endpoint:

"server": "http://gate.decodo.com:7000"

The browser context is configured to look realistic:

Standard viewport size
Chrome user agent
US locale for consistent results

An additional script removes the navigator.webdriver flag, which helps reduce bot detection.
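
As a rough sketch of that step: Playwright's Python API evaluates the string passed to add_init_script as script source in every new page, so the patch is written here as a plain statement. The constant and function names are hypothetical.

```python
# A plain statement, so the code runs as soon as it is injected
HIDE_WEBDRIVER_JS = (
    "Object.defineProperty(navigator, 'webdriver', "
    "{ get: () => undefined });"
)


def apply_stealth_patch(context):
    """Register the patch so it runs before any page script in this context."""
    context.add_init_script(HIDE_WEBDRIVER_JS)
```

This hides only one signal. Other fingerprinting checks, such as missing plugins, are unaffected by it.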

Typing the query and entering AI Mode

The script starts by opening Google with fixed language and region parameters to keep results consistent:

page.goto("https://www.google.com/?hl=en&gl=us", ...)

Instead of relying on a single selector, it tries multiple options to locate the search box. This makes the script more resilient to UI changes.

Typing is done character by character with random delays to mimic real user input:

for char in query:
    search_box.type(char)

After submitting the search, the script looks for the AI Mode tab in the upper navigation bar and clicks it. If the tab isn’t available, the script exits cleanly without crashing.

Extracting the AI response

The script waits for the AI response container to appear:

page.wait_for_selector("div.mZJni", timeout=20000)

Once detected, it waits a bit longer to allow the full response to finish generating before extracting the text. This extra delay is important because AI responses stream progressively.
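
A fixed delay is the simplest option. An alternative, sketched below, is to poll the container text until it stops changing. The function name, the polling interval, and the number of checks are assumptions to tune; get_text is any callable that returns the current text.

```python
import time


def wait_until_stable(get_text, interval=1.0, stable_checks=3, timeout=30.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll get_text() until a non-empty value repeats stable_checks times in a row."""
    deadline = clock() + timeout
    last, unchanged = "", 0
    while clock() < deadline:
        current = get_text()
        if current and current == last:
            unchanged += 1
            if unchanged >= stable_checks:
                return current
        else:
            # Text is still growing (or empty), so restart the count
            unchanged = 0
        last = current
        sleep(interval)
    return last  # Timed out: return whatever was captured last
```

With Playwright, get_text could be something like lambda: page.locator("div.mZJni").first.inner_text(), assuming that selector still matches the response container.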

Cleaning and saving the output

The first utility function prepares the raw response. Its function definition is:

def clean_response(text):

The full function:

Replaces tab characters with a readable divider
Removes empty lines

This next function saves the result to a JSON file containing the original query, a timestamp, and the cleaned response:

def save_to_json(query, text, filename=None):

The full Playwright script

Save the full implementation below to a .py file and run it with: python filename.py

This triggers the scraping flow by sending the query, extracting the AI response, printing the result, and saving it to a JSON file:

from playwright.sync_api import sync_playwright
import random
import time
import json
from datetime import datetime

PROXY_USERNAME = "YOUR_PROXY_USERNAME"  # Replace
PROXY_PASSWORD = "YOUR_PROXY_PASSWORD"  # Replace

SEARCH_QUERY = "best proxies"  # Replace
HEADLESS = True


def generate_human_behavior(page):
   try:
       page.mouse.move(random.randint(100, 800), random.randint(100, 600))
       page.evaluate(f'window.scrollBy(0, {random.randint(-300, 300)})')
       time.sleep(random.uniform(0.3, 0.7))
   except Exception:
       # Cosmetic only: a failed mouse move or scroll should not abort the run
       pass


def handle_consent(page):
   selectors = [
       'button#L2AGLb',
       'form[action*="consent"] button',
       'button:has-text("Accept all")',
       'button:has-text("I agree")'
   ]
   for selector in selectors:
       try:
           element = page.locator(selector).first
           if element.is_visible(timeout=2000):
               element.click()
               page.wait_for_timeout(1000)
               return
       except:
           continue


def scrape_ai_mode(query: str) -> str:
   with sync_playwright() as playwright:
       browser = playwright.chromium.launch(
           headless=HEADLESS,
           slow_mo=50,
           args=['--no-sandbox', '--disable-blink-features=AutomationControlled'],
           proxy={
               "server": "http://gate.decodo.com:7000",
               "username": PROXY_USERNAME,
               "password": PROXY_PASSWORD
           }
       )

       context = browser.new_context(
           viewport={'width': 1280, 'height': 800},
           user_agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
           locale='en-US'
       )
       # Pass plain statements: a bare arrow function string would be defined but never called
       context.add_init_script("Object.defineProperty(navigator, 'webdriver', { get: () => undefined });")

       page = context.new_page()
       generate_human_behavior(page)

       page.goto("https://www.google.com/?hl=en&gl=us", wait_until='domcontentloaded', timeout=15000)
       page.wait_for_timeout(2000)
       handle_consent(page)
       generate_human_behavior(page)

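       # Locate the search box, trying several selectors in case the UI changes
       search_box = None
       for selector in ['input[name="q"]', 'textarea[name="q"]', '#APjFqb']:
           candidate = page.locator(selector).first
           try:
               candidate.wait_for(state="visible", timeout=2000)
               search_box = candidate
               break
           except Exception:
               continue

       if search_box is None:
           print("Could not find the search box.")
           browser.close()
           return ""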

       search_box.click()
       page.wait_for_timeout(random.randint(500, 1200))

       for char in query:
           search_box.type(char)
           time.sleep(random.uniform(0.05, 0.15))

       page.wait_for_timeout(random.randint(1000, 2000))
       search_box.press("Enter")
       page.wait_for_timeout(4000)

       # Click AI Mode tab
       try:
           ai_mode_btn = page.locator('span.R1QWuf:has-text("AI Mode")').first
           ai_mode_btn.wait_for(timeout=8000)
           ai_mode_btn.click()
           print("Clicked AI Mode.")
       except Exception as e:
           print(f"Could not click AI Mode: {e}")
           browser.close()
           return ""

       # Wait for AI response to generate
       try:
           page.wait_for_selector("div.mZJni", timeout=20000)
           page.wait_for_timeout(3000)  # Extra buffer for full generation
           response_text = page.locator("div.mZJni").first.inner_text()
       except Exception as e:
           print(f"Could not scrape AI response: {e}")
           response_text = ""

       browser.close()
       return response_text


def clean_response(text):
   # Replace common table/separator characters with a readable divider
   text = text.replace("\t", " | ")    # tab (table column divider)
   # Split into lines, dropping blank-only lines
   return [line for line in text.splitlines() if line.strip()]


def save_to_json(query, text, filename=None):
   if not filename:
       timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
       filename = f"ai_mode_{timestamp}.json"
   data = {
       "query": query,
       "scraped_at": datetime.now().isoformat(),
       "response": clean_response(text)
   }
   with open(filename, "w", encoding="utf-8") as f:
       json.dump(data, f, indent=2, ensure_ascii=False)
   print(f"\nSaved to {filename}")


if __name__ == "__main__":
   print(f"Searching for: {SEARCH_QUERY}\n")
   result = scrape_ai_mode(SEARCH_QUERY)
   if result:
       print("\nAI Mode response:\n")
       print(result)
       save_to_json(SEARCH_QUERY, result)
   else:
       print("No response scraped.")

Practical tips:

Test selectors in browser DevTools first. Google changes its DOM frequently
Use HEADLESS = False during development to see what’s happening
Add retry logic if responses take longer to load
Always close browser instances to avoid memory leaks
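
The retry tip can be sketched as a small wrapper. It relies on the scraper returning an empty string on failure, as scrape_ai_mode does; the wrapper name, attempt count, and delays are assumptions to adjust.

```python
import time


def scrape_with_retries(scrape, query, attempts=3, base_delay=5.0, sleep=time.sleep):
    """Call scrape(query) until it returns non-empty text or attempts run out."""
    for attempt in range(1, attempts + 1):
        result = scrape(query)
        if result:
            return result
        if attempt < attempts:
            # Back off exponentially: base_delay, then 2x, then 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
    return ""
```

In the script's entry point, you would call scrape_with_retries(scrape_ai_mode, SEARCH_QUERY) in place of the direct call.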

Playwright code example output

Below is an example of the response printed in the terminal:

Searching for: best proxies

Clicked AI Mode.

AI Mode response:

In 2026, the best proxy providers are categorized by their scale, speed, and specific use cases like web scraping or account management. Oxylabs and Bright Data are the top choices for enterprise-level operations, while Decodo (formerly Smartproxy) and Webshare are highly recommended for smaller businesses and budget-conscious users. 
CNET
 +1
Top Proxy Providers by Category
The following services are consistently ranked at the top of expert reviews from sources like CNET, PCMag, and Proxyway. 
Oxylabs (Best for Enterprise): Praised for its massive pool of over 175 million IPs and a near-perfect success rate (99.95%). It is ideal for large-scale web scraping and market research due to its advanced geo-targeting (city, ZIP, and ASN level) and dedicated account management.
Decodo (Best Overall Value): Formerly known as Smartproxy, Decodo is widely considered the "sweet spot" for performance and price. It offers over 125 million IPs and is noted for having some of the fastest residential response times (approx. 0.63s) and a user-friendly dashboard.
Bright Data (Best for Large-Scale Data): A massive infrastructure provider with over 150 million IPs. It is highly regarded for its robust "Web Unlocker" tool and precise targeting but has a steeper learning curve and stricter KYC (Know Your Customer) requirements.
Webshare (Best for Budget): The top choice for those seeking the lowest prices, particularly for datacenter proxies, which can start as low as $0.03 per IP. While its IPs may be flagged more easily by strict sites, it is excellent for high-volume, lower-risk tasks.
IPRoyal (Best for Non-Expiring Traffic): Unique for its "non-expiring" data model, allowing users to keep purchased bandwidth until it is used rather than losing it at the end of the month. It is popular for sporadic scraping or social media management. 
Bright Data
 +9
Performance and Pricing Comparison (2026)
According to benchmarks from AIMultiple and CNET, here is how the leading providers compare on key metrics. 
CNET
 +1
Provider        Best For        Residential Pool        Starting Price (Residential)
Oxylabs Enterprises     175M+ IPs       From $2/GB
Decodo  Small/Mid Business      125M+ IPs       From $1.50/GB
Bright Data     Complex Scaling 150M+ IPs       From $2.50/GB
Webshare        Budget/Testing  80M+ IPs        From $1.40/GB
SOAX    Geo-precision   155M+ IPs       From $2/GB
Specialized Proxy Use Cases
Social Media & Multi-Accounting: SX.ORG and NodeMaven are frequently recommended for these tasks due to their focus on clean residential/mobile IPs and long-lived "sticky" sessions that mimic real user behavior.
Speed-Sensitive Tasks: For high-velocity scraping or gaming where latency is critical, Webshare and Decodo consistently outperform others in response time benchmarks.
Hard-to-Scrape Sites: Oxylabs and ZenRows are noted for their ability to bypass advanced anti-bot protections like Cloudflare and Akamai using AI-driven browser fingerprinting. 
Reddit
 +5

Saved to ai_mode_20260402_142016.json
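
In the output above, source markers such as "CNET" followed by "+1" are mixed into the answer text. The heuristic below separates them from the body. It is a sketch that assumes the layout shown here, where a source name is immediately followed by a line containing only "+N"; the function name is hypothetical, and the pattern may not hold for every response.

```python
import re

EXTRA_SOURCES = re.compile(r"^\+(\d+)$")


def split_citation_markers(lines):
    """Separate 'Source' / '+N' marker pairs from the body text lines."""
    body, markers = [], []
    i = 0
    while i < len(lines):
        nxt = lines[i + 1].strip() if i + 1 < len(lines) else ""
        match = EXTRA_SOURCES.match(nxt)
        if match:
            # lines[i] names one source; the next line counts the others
            markers.append({"source": lines[i].strip(), "more": int(match.group(1))})
            i += 2
        else:
            body.append(lines[i])
            i += 1
    return body, markers
```

It takes a list of lines, such as the one clean_response returns, and gives back the body lines plus a list of marker dicts.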

Scraping Google AI Mode with Web Scraping API

Playwright gives you full control and a deeper understanding of how the scraping process works. Decodo’s Web Scraping API, on the other hand, is a more straightforward option for those who want to scale without dealing with proxy management, CAPTCHA solving, fingerprinting, or ongoing DOM maintenance. With minimal code required, it can be a simpler way to get reliable results.

Why use an API instead of DIY scraping

| Factor | DIY Playwright | Decodo Web Scraping API |
| --- | --- | --- |
| Proxy management | Requires external proxy setup | Handled automatically |
| CAPTCHA handling | Requires custom implementation | Built-in handling |
| Browser fingerprinting | Requires manual tuning and maintenance | Managed by the platform |
| Output format | Raw page text or HTML you parse yourself | Structured JSON, ready to use |
| Scalability (1000+ queries) | Requires infrastructure scaling and monitoring | Predictable per-request scaling |
| Geo-targeting support | Depends on proxy provider and configuration | Controlled via a single parameter |
| Time to first result | Setup can take hours | Typically ready within minutes |

Getting started with Decodo’s Web Scraping API

Decodo Web Scraping API returns Google AI Mode data in a structured format. It routes your requests through the same 125M+ proxy IPs as Decodo’s proxy service.

Create your account. Sign up in the Decodo dashboard.
Select a plan. Go to the Web Scraping API pricing section and choose a subscription, or start with a free plan.
Set the target. In the Target template dropdown, select Google AI Mode.
Configure parameters. Enter your search query, choose the output format, location, and adjust any additional settings.
Copy or send a request. Copy the generated request code in cURL, Node.js, or Python to use in your environment, or click Send request to run it directly in the dashboard.

Configuration and request handling

Start by defining the request payload. This includes the target, query, parsing behavior, and optional geo-location settings.

Enabling parsing ("parse": True) returns a structured JSON response instead of raw HTML, which makes the data easier to work with. You can also specify location parameters if you need region-specific results.

Sending requests and retrieving results

The API is accessed via a POST request with a JSON payload. Below is a complete working example:

import requests
import json

url = "https://scraper-api.decodo.com/v2/scrape"

# Request payload: target template, search query, and structured parsing
payload = {
    "target": "google_ai_mode",
    "query": "best residential proxies for web scraping",
    "parse": True
}

headers = {
    "accept": "application/json",
    "content-type": "application/json",
    "authorization": "Basic [YOUR_BASE64_ENCODED_CREDENTIALS]"
}

# The 120-second timeout is an assumed value; adjust it to your needs
response = requests.post(url, json=payload, headers=headers, timeout=120)
response.raise_for_status()  # Stop early on HTTP error responses

data = response.json()
print(json.dumps(data, indent=2))
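
The authorization value uses standard HTTP Basic authentication: the username and password are joined with a colon and Base64-encoded. If the dashboard doesn't give you the encoded string directly, a helper like the one below can build it. The function name is hypothetical.

```python
import base64


def build_basic_auth_header(username, password):
    """Return the value for an HTTP Basic 'authorization' header."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Basic {token}"
```

For example, build_basic_auth_header("user", "pass") returns "Basic dXNlcjpwYXNz".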

Web Scraping API example output

Below is an example of the response returned by the Web Scraping API when querying Google AI Mode. Notice how the data is already structured and grouped into fields like citations and links, making it easier to work with compared to the raw output from the custom Playwright scraper:

"root":{
"results":{
"citations":[
0:{
"text":"In 2026, the best proxy providers are categorized by their suitability for either high-scale enterprise needs or budget-conscious individual projects. Oxylabs and Bright Data are the industry leaders for large-scale data gathering, while Decodo and Webshare are the top-rated choices for smaller businesses and developers seeking value. CNET +3"
"urls":[
0:"https://www.cnet.com/tech/services-and-software/best-proxy-servers/#:~:text=Proxy%20server%20FAQs-,What%20is%20the%20best%20proxy%20server%20in%202026?,spend%20with%20Decodo%20and%20Oxylabs."
1:"https://cybernews.com/best-proxy/"
2:"https://decodo.com/best/mobile-proxies#:~:text=Let's%20explore%20the%20leading%20providers,rate%20and%20extended%20sticky%20sessions."
3:"https://multilogin.com/blog/cheap-residential-proxies/#:~:text=Finding%20cheap%20residential%20proxies%20in%202026%20is,even%20solopreneurs%20and%20small%20teams%20can%20afford."
]
}
1:{
"text":"Expert reviews from sources like CNET and PCMag consistently rank these providers at the top for their network size and reliability. CNET +1"
"urls":[
0:"https://www.cnet.com/tech/services-and-software/best-proxy-servers/#:~:text=Proxy%20server%20FAQs-,What%20is%20the%20best%20proxy%20server%20in%202026?,spend%20with%20Decodo%20and%20Oxylabs."
1:"https://me.pcmag.com/en/proxy/22143/the-best-proxies-for-2024"
]
}
2:{
"text":"Oxylabs : Widely considered the best for large businesses and enterprise clients . It boasts the largest residential pool on the market with over 175 million IPs across 195+ countries. It is praised for its high success rates and advanced features like ASN and ZIP code targeting. Bright Data : Best for enterprise-level compliance and advanced proxy management. It offers a massive network of over 150 million IPs and is known for its strict ethical sourcing and powerful "Web Unlocker" tool that handles CAPTCHAs automatically. Decodo (formerly Smartproxy) : The best choice for small to medium businesses (SMBs) . It is frequently cited as the fastest provider in benchmarks, with average response times under 0.6 seconds. It offers a user-friendly dashboard and competitive pay-as-you-go pricing starting around $3.50/GB. CNET +7"
"urls":[
0:"https://www.cnet.com/tech/services-and-software/best-proxy-servers/#:~:text=Proxy%20server%20FAQs-,What%20is%20the%20best%20proxy%20server%20in%202026?,spend%20with%20Decodo%20and%20Oxylabs."
1:"https://me.pcmag.com/en/proxy/22143/the-best-proxies-for-2024"
2:"https://cybernews.com/best-proxy/"
3:"https://aimultiple.com/proxy-providers"
4:"https://brightdata.com/blog/proxy-101/best-residential-proxy-providers"
5:"https://medium.com/@onlineproxypmm/top-10-proxy-services-for-scraping-in-2026-price-functional-comparison-381471ac6d05#:~:text=In%20a%20market%20saturated%20with,per%2DGB%20in%20the%20industry."
6:"https://ourcodeworld.com/articles/read/2879/top-5-proxy-providers-in-2026#:~:text=1.,response%20times%20to%20your%20requests."
7:"https://www.reddit.com/r/WebDataDiggers/comments/1qir76d/best_residential_proxies_2026_i_spent_500_testing/#:~:text=Avoid%20these.,Open%20comment%20sort%20options"
8:"https://decodo.com/best/mobile-proxies#:~:text=Let's%20explore%20the%20leading%20providers,rate%20and%20extended%20sticky%20sessions."
]
}
3:{
"text":"Different providers excel depending on whether you prioritize cost, speed, or specific IP types."
"urls":[
0:"https://geonix.com/company/articles/general-articles/best-proxy-providers-2026-expert-reviewed-compared/#:~:text=Geonix%20focuses%20on%20clean%20setup,automate%20renewals%20and%20lifecycle%20tasks."
]
}
4:{
"text":"Webshare : The top budget-friendly option . It is ideal for developers and smaller projects, offering a limited free plan with 10 datacenter IPs and highly customizable paid plans starting as low as $1.40/GB. IPRoyal : Best for irregular workloads due to its non-expiring traffic model. Unlike most providers, the data you purchase does not disappear at the end of the month. SOAX : Recommended for precise geo-targeting and clean IP pools. It allows granular filtering down to the city and ISP level at no extra cost, which is often a premium feature elsewhere. NetNut : The best for high-speed stable connections . It uses direct ISP connectivity instead of peer-to-peer networks, which results in lower latency and more reliable sessions for long-running automation. The Jerusalem Post +8"
"urls":[
0:"https://www.jpost.com/consumerism/article-889499#:~:text=Frequent%20IP%20blocks%2C%20slow%20proxy,latency%20and%20minimizes%20random%20disconnects."
1:"https://cybernews.com/best-proxy/"
2:"https://www.reddit.com/r/WebDataDiggers/comments/1rke51u/who_actually_has_the_best_us_residential_proxies/#:~:text=If%20you%20need%20a%20reliable,IPRoyal%20is%20the%20logical%20choice."
3:"https://www.reddit.com/r/PrivatePackets/comments/1qg3bp7/top_5_proxy_providers_tested_in_2026/#:~:text=1.,As%2DYou%2DGo%20option."
4:"https://www.reddit.com/r/WebDataDiggers/comments/1qir76d/best_residential_proxies_2026_i_spent_500_testing/#:~:text=Final%20Verdict,backup%20for%20specific%20geo%2Dlocations."
5:"https://www.proxyrack.com/blog/best-residential-proxy-providers-in-2026/#:~:text=Oxylabs,Best%20for:"
6:"https://www.designrush.com/agency/it-services/trends/best-mobile-proxies#:~:text=Best%20Mobile%20Proxy%20Providers:%20Key,Best%20ISP%20Proxy%20Providers"
7:"https://www.quora.com/What-are-some-good-proxy-sites"
8:"https://medium.com/@onlineproxypmm/top-10-proxy-services-for-scraping-in-2026-price-functional-comparison-381471ac6d05#:~:text=In%20a%20market%20saturated%20with,per%2DGB%20in%20the%20industry."
]
}
5:{
"text":"According to 2026 benchmarks from AIMultiple and Proxyway , here is how the leading residential proxy providers compare. Proxyway +1"
"urls":[
0:"https://proxyway.com/best/residential-proxies#:~:text=The%20Best%20Premium%20Residential%20Proxies,largest%20pool%20and%20top%20performance.&text=for%206%20months%3E-,2.,with%20advanced%20IP%20filteri..."
1:"https://aimultiple.com/proxy-providers"
]
}
6:{
"text":"Provider"
"urls":[
0:"https://www.cnet.com/tech/services-and-software/best-proxy-servers/#:~:text=Proxy%20server%20FAQs-,What%20is%20the%20best%20proxy%20server%20in%202026?,spend%20with%20Decodo%20and%20Oxylabs."
1:"https://me.pcmag.com/en/proxy/22143/the-best-proxies-for-2024"
2:"https://brightdata.com/blog/proxy-101/best-enterprise-proxy-providers"
3:"https://brightdata.com/blog/proxy-101/best-residential-proxy-providers"
4:"https://www.reddit.com/r/WebDataDiggers/comments/1rke51u/who_actually_has_the_best_us_residential_proxies/#:~:text=If%20you%20need%20a%20reliable,IPRoyal%20is%20the%20logical%20choice."
5:"https://cybernews.com/best-proxy/"
6:"https://www.proxyrack.com/blog/best-residential-proxy-providers-in-2026/#:~:text=Oxylabs,Best%20for:"
]
}
7:{
"text":"Note : Prices and pool sizes are based on reported data from early 2026 and may fluctuate based on current promotions or volume discounts. CNET +1"
"urls":[
0:"https://www.cnet.com/tech/services-and-software/best-proxy-servers/#:~:text=Proxy%20server%20FAQs-,What%20is%20the%20best%20proxy%20server%20in%202026?,spend%20with%20Decodo%20and%20Oxylabs."
1:"https://decodo.com/best/mobile-proxies#:~:text=Let's%20explore%20the%20leading%20providers,rate%20and%20extended%20sticky%20sessions."
]
}
]
"links":[
0:{
"publisher":"Cybernews"
"text":"Best Proxy Server Providers Tested in 2026 - Cybernews"
"url":"https://cybernews.com/best-proxy/"
}
1:{
"publisher":"Cybernews"
"text":"Best Proxy Server Providers Tested in 2026 - Cybernews"
"url":"https://cybernews.com/best-proxy/"
}
2:{
"publisher":"CNET"
"text":"Best Proxy Servers for 2026 - CNET"
"url":"https://www.cnet.com/tech/services-and-software/best-proxy-servers/#:~:text=Proxy%20server%20FAQs-,What%20is%20the%20best%20proxy%20server%20in%202026?,spend%20with%20Decodo%20and%20Oxylabs."
}
3:{
"publisher":"CNET"
"text":"Best Proxy Servers for 2026 - CNET"
"url":"https://www.cnet.com/tech/services-and-software/best-proxy-servers/#:~:text=Proxy%20server%20FAQs-,What%20is%20the%20best%20proxy%20server%20in%202026?,spend%20with%20Decodo%20and%20Oxylabs."
}
4:{
"publisher":"PCMag Middle East"
"text":"The Best Proxy Services for 2026 - PCMag Middle East"
"url":"https://me.pcmag.com/en/proxy/22143/the-best-proxies-for-2024"
}
5:{
"publisher":"Proxyrack"
"text":"Best Residential Proxy Providers in 2026 - Proxyrack"
"url":"https://www.proxyrack.com/blog/best-residential-proxy-providers-in-2026/#:~:text=Oxylabs,Best%20for:"
}
6:{
"publisher":"Reddit"
"text":"Best residential proxies 2026: I spent $500 testing the top ..."
"url":"https://www.reddit.com/r/WebDataDiggers/comments/1qir76d/best_residential_proxies_2026_i_spent_500_testing/#:~:text=Avoid%20these.,Open%20comment%20sort%20options"
}
7:{
"publisher":"Bright Data"
"text":"The 10 Best Residential Proxy Providers in 2026 - Bright Data"
"url":"https://brightdata.com/blog/proxy-101/best-residential-proxy-providers"
}
8:{
"publisher":"Reddit"
"text":"Best residential proxies 2026: I spent $500 testing the top providers"
"url":"https://www.reddit.com/r/WebDataDiggers/comments/1qir76d/best_residential_proxies_2026_i_spent_500_testing/#:~:text=Final%20Verdict,backup%20for%20specific%20geo%2Dlocations."
}
9:{
"publisher":"Reddit"
"text":"Who actually has the best US residential proxies? - Reddit"
"url":"https://www.reddit.com/r/WebDataDiggers/comments/1rke51u/who_actually_has_the_best_us_residential_proxies/#:~:text=If%20you%20need%20a%20reliable,IPRoyal%20is%20the%20logical%20choice."
}
10:{
"publisher":"Decodo"
"text":"Top 8 Best Mobile Proxy Service Providers in 2026 🏆 - Decodo"
"url":"https://decodo.com/best/mobile-proxies#:~:text=Let's%20explore%20the%20leading%20providers,rate%20and%20extended%20sticky%20sessions."
}
11:{
"publisher":"Medium"
"text":"Top 10 Proxy Services for Scraping in 2026 - Medium"
"url":"https://medium.com/@onlineproxypmm/top-10-proxy-services-for-scraping-in-2026-price-functional-comparison-381471ac6d05#:~:text=In%20a%20market%20saturated%20with,per%2DGB%20in%20the%20industry."
}
12:{
"publisher":"AIMultiple"
"text":"10 Best Proxy Providers for 2026: Tested & Ranked - AIMultiple"
"url":"https://aimultiple.com/proxy-providers"
}
13:{
"publisher":"The Jerusalem Post"
"text":"Top residential proxy platforms for 2026 with NetNut leading way"
"url":"https://www.jpost.com/consumerism/article-889499#:~:text=Frequent%20IP%20blocks%2C%20slow%20proxy,latency%20and%20minimizes%20random%20disconnects."
}
14:{
"publisher":"Proxyway"
"text":"The Best Residential Proxies of 2026: Tested & Ranked - Proxyway"
"url":"https://proxyway.com/best/residential-proxies#:~:text=The%20Best%20Premium%20Residential%20Proxies,largest%20pool%20and%20top%20performance.&text=for%206%20months%3E-,2.,with%20advanced%20IP%20filteri..."
}
15:{
"publisher":"Geonix"
"text":"Best Proxy Providers 2026: Expert-Reviewed & Compared"
"url":"https://geonix.com/company/articles/general-articles/best-proxy-providers-2026-expert-reviewed-compared/#:~:text=Geonix%20focuses%20on%20clean%20setup,automate%20renewals%20and%20lifecycle%20tasks."
}
16:{
"publisher":"DesignRush"
"text":"10 Best Mobile Proxy Providers for 2026 - DesignRush"
"url":"https://www.designrush.com/agency/it-services/trends/best-mobile-proxies#:~:text=Best%20Mobile%20Proxy%20Providers:%20Key,Best%20ISP%20Proxy%20Providers"
}
17:{
"publisher":"Reddit"
"text":"Top 5 proxy providers tested in 2026 : r/PrivatePackets - Reddit"
"url":"https://www.reddit.com/r/PrivatePackets/comments/1qg3bp7/top_5_proxy_providers_tested_in_2026/#:~:text=1.,As%2DYou%2DGo%20option."
}
18:{
"publisher":"Bright Data"
"text":"Best Enterprise Proxy Services 2026: Comparison & Reviews"
"url":"https://brightdata.com/blog/proxy-101/best-enterprise-proxy-providers"
}
19:{
"publisher":"Quora"
"text":"What are some good proxy sites? - Quora"
"url":"https://www.quora.com/What-are-some-good-proxy-sites"
}
20:{
"publisher":"Our Code World"
"text":"Top 5 Proxy Providers in 2026 | Our Code World"
"url":"https://ourcodeworld.com/articles/read/2879/top-5-proxy-providers-in-2026#:~:text=1.,response%20times%20to%20your%20requests."
}
21:{
"publisher":"Multilogin"
"text":"10 Best Cheap Residential Proxies That Actually Work in 2026"
"url":"https://multilogin.com/blog/cheap-residential-proxies/#:~:text=Finding%20cheap%20residential%20proxies%20in%202026%20is,even%20solopreneurs%20and%20small%20teams%20can%20afford."
}
]
"parse_status_code":12000
"prompt":"best proxies"
"response_text":"In 2026, the best proxy providers are categorized by their suitability for either high-scale enterprise needs or budget-conscious individual projects. Oxylabs and Bright Data are the industry leaders for large-scale data gathering, while Decodo and Webshare are the top-rated choices for smaller businesses and developers seeking value. CNET +3 Top Overall Proxy Providers Expert reviews from sources like CNET and PCMag consistently rank these providers at the top for their network size and reliability. CNET +1 Oxylabs : Widely considered the best for large businesses and enterprise clients . It boasts the largest residential pool on the market with over 175 million IPs across 195+ countries. It is praised for its high success rates and advanced features like ASN and ZIP code targeting. Bright Data : Best for enterprise-level compliance and advanced proxy management. It offers a massive network of over 150 million IPs and is known for its strict ethical sourcing and powerful "Web Unlocker" tool that handles CAPTCHAs automatically. Decodo (formerly Smartproxy) : The best choice for small to medium businesses (SMBs) . It is frequently cited as the fastest provider in benchmarks, with average response times under 0.6 seconds. It offers a user-friendly dashboard and competitive pay-as-you-go pricing starting around $3.50/GB. CNET +7 Best for Specific Use Cases Different providers excel depending on whether you prioritize cost, speed, or specific IP types. Webshare : The top budget-friendly option . It is ideal for developers and smaller projects, offering a limited free plan with 10 datacenter IPs and highly customizable paid plans starting as low as $1.40/GB. IPRoyal : Best for irregular workloads due to its non-expiring traffic model. Unlike most providers, the data you purchase does not disappear at the end of the month. SOAX : Recommended for precise geo-targeting and clean IP pools. 
It allows granular filtering down to the city and ISP level at no extra cost, which is often a premium feature elsewhere. NetNut : The best for high-speed stable connections . It uses direct ISP connectivity instead of peer-to-peer networks, which results in lower latency and more reliable sessions for long-running automation. The Jerusalem Post +8 Performance Comparison Table According to 2026 benchmarks from AIMultiple and Proxyway , here is how the leading residential proxy providers compare. Proxyway +1 Provider Residential IP Pool Best Known For Starting Price Oxylabs 175M+ Enterprise Scale ~$4.00/GB Bright Data 150M+ Compliance & Features ~$8.40/GB (PAYG) Decodo 115M+ Speed & Value ~$3.50/GB Webshare 80M+ Low-cost entry ~$1.40/GB SOAX 155M+ Precise Targeting ~$2.00/GB Note : Prices and pool sizes are based on reported data from early 2026 and may fluctuate based on current promotions or volume discounts. CNET +1 Are you looking for residential proxies for high anonymity, or datacenter proxies for maximum speed and lower cost?"
}
"errors":[]
"status_code":12000
"task_id":"7445746459372199937"
}
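
The links array in the output above can repeat the same source, sometimes with a different #:~:text= fragment appended to the URL. As a minimal sketch, assuming the response has already been parsed into Python and the list under "links" has been pulled out, the snippet below normalizes and deduplicates those entries and counts distinct URLs per publisher. The function names and the sample_links variable are illustrative and not part of the API; the sample data is a small subset of the output above.

```python
from collections import Counter
from urllib.parse import urldefrag


def unique_links(links):
    """Drop repeated sources, ignoring URL fragments such as '#:~:text=...'."""
    seen = set()
    result = []
    for link in links:
        base_url, _fragment = urldefrag(link["url"])
        if base_url in seen:
            continue
        seen.add(base_url)
        result.append({"publisher": link["publisher"], "url": base_url})
    return result


def publisher_counts(links):
    """Count how many distinct URLs each publisher contributes."""
    return Counter(link["publisher"] for link in unique_links(links))


# Subset of the "links" array shown above (variable name is illustrative).
sample_links = [
    {"publisher": "Cybernews", "url": "https://cybernews.com/best-proxy/"},
    {"publisher": "Cybernews", "url": "https://cybernews.com/best-proxy/"},
    {"publisher": "Bright Data",
     "url": "https://brightdata.com/blog/proxy-101/best-residential-proxy-providers"},
    {"publisher": "Bright Data",
     "url": "https://brightdata.com/blog/proxy-101/best-enterprise-proxy-providers"},
    {"publisher": "Reddit",
     "url": "https://www.reddit.com/r/WebDataDiggers/comments/1qir76d/"
            "best_residential_proxies_2026_i_spent_500_testing/"
            "#:~:text=Avoid%20these.,Open%20comment%20sort%20options"},
    {"publisher": "Reddit",
     "url": "https://www.reddit.com/r/WebDataDiggers/comments/1qir76d/"
            "best_residential_proxies_2026_i_spent_500_testing/"
            "#:~:text=Final%20Verdict,backup%20for%20specific%20geo%2Dlocations."},
]

if __name__ == "__main__":
    for publisher, count in publisher_counts(sample_links).most_common():
        print(publisher, count)
```

Deduplicating on the fragment-free URL keeps citation counts from being inflated when the same page is cited for several different passages.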

Advantages of scraping Google AI Mode vs. traditional SERP scraping

Not all Google data is equal. Here's what makes AI Mode data structurally different and even more valuable than standard SERP results.

What makes AI Mode data different

Pre-synthesized intelligence. A standard SERP gives you ten ranked URLs. AI Mode gives you an aggregated synthesis of what those sources say. That is a different signal entirely, not a noisier version of the same one.
Citation graph. Every AI response comes with explicit source citations. The citation list is a structured, machine-readable map of what Google currently considers authoritative for each query.
Intent signals. The follow-up questions AI Mode suggests reveal what users are likely to ask next. That is content planning intelligence that People Also Ask (PAA) boxes only partially surface.
Product and entity structuring. For commercial queries, AI Mode often includes structured product comparisons, pricing, and ratings, which is far richer than what organic snippets provide.

Comparison with other Google scraping targets

Data source | Unique value proposition | Missing aspects
AI Mode | Synthesized answers, citation links, and entity-level context | Raw per-URL ranking data
Organic SERP | Ranked links, snippets, and domain-level visibility | Synthesis, intent understanding, source relationships
AI Overviews | Concise summaries embedded in search results | Depth, multi-turn context, follow-up signals

Combined scraping strategies

AI Mode data is most powerful when it's paired with other Google data sources:

AI Mode + traditional SERP. See which URLs rank organically and which get cited in AI responses. They are not always the same set.
AI Mode + Google Trends. Trends shows what is being searched, while AI Mode shows how Gemini frames the answer. Together, they reveal both demand and narrative. Learn more in our guide to scraping Google Trends.
AI Mode + Google News. Track how breaking news is reflected in AI Mode responses in near real time. Explore how to scrape Google News with Python.
AI Mode + Google Scholar. For research-heavy queries, this pairing helps identify which academic sources Gemini relies on. See our guide on scraping Google Scholar with Python.
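
To make the first pairing concrete, here is a minimal sketch of comparing organic rankings with AI Mode citations at the domain level. It assumes you already have two lists of URLs for the same query, one from a SERP scrape and one from an AI Mode response; the function names and the example.com URL are hypothetical.

```python
from urllib.parse import urlsplit


def domain(url):
    """Return the host of a URL in lowercase, without a leading 'www.'."""
    host = urlsplit(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def compare_visibility(organic_urls, cited_urls):
    """Split domains into those ranked, those cited, and those that are both."""
    organic = {domain(u) for u in organic_urls}
    cited = {domain(u) for u in cited_urls}
    return {
        "both": sorted(organic & cited),
        "organic_only": sorted(organic - cited),
        "cited_only": sorted(cited - organic),
    }


if __name__ == "__main__":
    # Hypothetical organic results for the query.
    organic = [
        "https://www.cnet.com/tech/services-and-software/best-proxy-servers/",
        "https://example.com/proxy-guide",
    ]
    # Citations taken from an AI Mode response for the same query.
    cited = [
        "https://cybernews.com/best-proxy/",
        "https://www.cnet.com/tech/services-and-software/best-proxy-servers/",
    ]
    print(compare_visibility(organic, cited))
```

Domains that land in cited_only are the ones AI Mode treats as authoritative even though they do not rank organically for the query, which is usually the most interesting bucket to review.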

Best practices for reliable Google AI Mode scraping

To keep your scraping workflows stable and maintainable over time, it’s important to follow a few practical guidelines. These help reduce breakage, improve consistency, and make your setup easier to scale as your needs grow.

Respecting boundaries

Implement random delays between requests. Use a range rather than a fixed interval (for example, 3 to 8 seconds) to replicate organic browsing behavior.
Honor robots.txt guidelines.
Understand the implications of the terms of service.
Identify your scraper with a clear user agent when possible.
Review the legal context before scraping at scale. See our overview of key considerations on whether web scraping is legal.
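
The random delay from the first point can be sketched in a few lines of Python. This is an illustration only: the helper name polite_delay is hypothetical, and the 3 to 8 second range simply mirrors the example above and should be tuned to your own workload.

```python
import random
import time


def polite_delay(min_seconds=3.0, max_seconds=8.0, sleep=time.sleep):
    """Sleep for a random duration between min_seconds and max_seconds.

    The sleep function is injectable so the helper can be tested
    without actually waiting. Returns the delay that was used.
    """
    delay = random.uniform(min_seconds, max_seconds)
    sleep(delay)
    return delay


if __name__ == "__main__":
    for query in ["best proxies", "residential proxies"]:
        # Your scraping call for `query` would go here.
        waited = polite_delay()
        print(f"{query}: waited {waited:.1f}s before the next request")
```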

Proxy strategy

Residential over datacenter. Residential IPs have genuine ISP assignments and realistic traffic patterns. Google’s detection systems treat datacenter ranges with far more suspicion.
Geographic selection. AI Mode is now available in 200+ countries, but response quality and content can vary significantly by region. Match your proxy location to your target audience’s geography.
Per-request rotation. Rotate your IP on every request. Sticky sessions, which reuse the same IP across multiple requests, increase the risk of a single IP being fingerprinted or blocked.
Decodo residential proxies. The 115M+ IPs across 195+ locations use the same infrastructure that powers the Decodo SERP API, available directly for custom proxy configurations.

Data quality and validation

Verify that the AI Mode container is non-empty before parsing.
Explicitly handle edge cases where AI Mode returns no response, returns a partial response, or falls back to standard results.
Validate extracted citation URLs.
Timestamp and geo-tag all scraped data for accurate longitudinal analysis.
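
The checks above can be folded into a single step that runs before anything is stored. The sketch below is one possible shape, not a prescribed schema: the function names, the record fields, and the status labels ("ok", "empty", "no_citations") are all assumptions made for illustration.

```python
from datetime import datetime, timezone
from urllib.parse import urlsplit


def is_valid_url(url):
    """Accept only absolute http(s) URLs that include a host."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def build_record(query, response_text, citation_urls, geo):
    """Validate one scraped response and attach a UTC timestamp and geo tag."""
    text = (response_text or "").strip()
    valid_citations = [u for u in citation_urls if is_valid_url(u)]
    if not text:
        status = "empty"          # container missing or streaming never finished
    elif not valid_citations:
        status = "no_citations"   # text present but nothing usable was cited
    else:
        status = "ok"
    return {
        "query": query,
        "geo": geo,
        "scraped_at": datetime.now(timezone.utc).isoformat(),
        "status": status,
        "response_text": text,
        "citations": valid_citations,
    }


if __name__ == "__main__":
    record = build_record(
        "best proxies",
        "In 2026, the best proxy providers are ...",
        ["https://cybernews.com/best-proxy/", "javascript:void(0)"],
        geo="us",
    )
    print(record["status"], record["citations"])
```

Storing the status alongside the data, rather than silently dropping failed responses, keeps gaps visible when you later analyze the same query over time.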

Scaling and monitoring

Log success rates, response times, and error types. A sudden drop in success rate is your first signal that Google has changed something.
Set up alerts for sudden drops in success rate. If your AI Mode container selector returns zero results across multiple consecutive queries, the DOM has likely changed, and your parser needs updating.
Use the Decodo dashboard to monitor scraping job status and quota usage.
Troubleshoot IP bans. See our guide to IP bans for help diagnosing and resolving them.
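
A lightweight way to act on the first two points is to track outcomes in a rolling window. The class below is a minimal sketch under assumed thresholds (a 50-request window, an 80% success floor, and 5 consecutive empty containers); the class name and all default values are hypothetical and should be adjusted to your own traffic.

```python
from collections import deque


class ScrapeMonitor:
    """Track recent scrape outcomes and report simple alert conditions."""

    def __init__(self, window=50, min_success_rate=0.8, max_consecutive_empty=5):
        self.results = deque(maxlen=window)  # rolling window of True/False outcomes
        self.min_success_rate = min_success_rate
        self.max_consecutive_empty = max_consecutive_empty
        self.consecutive_empty = 0

    def record(self, success, empty_container=False):
        """Store one outcome; empty_container marks a selector that matched nothing."""
        self.results.append(bool(success))
        self.consecutive_empty = self.consecutive_empty + 1 if empty_container else 0

    @property
    def success_rate(self):
        if not self.results:
            return 1.0
        return sum(self.results) / len(self.results)

    def alerts(self):
        """Return human-readable alerts for the current state."""
        messages = []
        window_full = len(self.results) == self.results.maxlen
        if window_full and self.success_rate < self.min_success_rate:
            messages.append(f"success rate dropped to {self.success_rate:.0%}")
        if self.consecutive_empty >= self.max_consecutive_empty:
            messages.append("container selector returned nothing repeatedly; DOM may have changed")
        return messages


if __name__ == "__main__":
    monitor = ScrapeMonitor(window=4, min_success_rate=0.75, max_consecutive_empty=3)
    for _ in range(3):
        monitor.record(success=False, empty_container=True)
    monitor.record(success=True)
    print(monitor.alerts())
```

Separating the success-rate check from the consecutive-empty check helps distinguish blocking, which tends to fail intermittently, from a DOM change, which tends to fail every time.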

Final thoughts

Google AI Mode marks a shift from ranked links to synthesized, citation-backed answers, introducing a new layer of data for SEO, analytics, and research workflows.

If you want full control and a deeper understanding of how the data is generated, the Playwright-with-proxies approach is a solid starting point. If you need something that works reliably at scale with minimal setup, the Decodo Web Scraping API offers a simpler path by handling the underlying infrastructure for you.

About the author

Dominykas Niaura

Technical Copywriter

Dominykas brings a unique blend of philosophical insight and technical expertise to his writing. Starting his career as a film critic and music industry copywriter, he's now an expert in making complex proxy and web scraping concepts accessible to everyone.

All information on Decodo Blog is provided on an as is basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Decodo Blog or any third-party websites that may be linked therein.

Frequently asked questions

What is the difference between Google AI Mode and Google AI Overviews?

Google AI Overviews are short, embedded summaries that appear at the top of the standard Google results page for certain queries. Google AI Mode is an entirely separate, fully conversational interface: it delivers deeper, longer responses powered by Gemini 2.5, supports multi-turn follow-up questions, and renders through a different backend pipeline.

Why does my scraper return empty content from AI Mode?

A scraper typically returns empty content in one of three scenarios:

AI response is still streaming. AI Mode content streams via /async/folwr and takes about 6.5 seconds on average to completely render. A 3-second network idle wait is often not enough. You need to increase your buffer to 5+ seconds or intercept the async endpoint directly to know when the response is complete.
The AI Mode container is empty. Check that div[data-subtree="aimc"] exists and contains text, not just that it is present in the DOM. If it's present but empty, AI Mode rendered its shell, but the streaming content didn’t complete.
Your IP or fingerprint has been flagged. Run with headless=False and navigate to the same URL manually in the same browser session. If you see a CAPTCHA or a consent wall, your session is blocked.
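
For the first two scenarios, one alternative to a fixed wait is to poll the container until its text is non-empty and has stopped changing. The helper below is a library-agnostic sketch of that idea, not the approach used in the script earlier in this guide: the function name and its default timings are assumptions, and get_text can be any callable that returns the container's current text.

```python
import time


def wait_for_stable_text(get_text, timeout=15.0, interval=0.5, stable_polls=3,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll get_text() until it returns the same non-empty text
    stable_polls times in a row. Returns the text, or None on timeout."""
    deadline = clock() + timeout
    last_text = ""
    stable = 0
    while clock() < deadline:
        text = (get_text() or "").strip()
        if text and text == last_text:
            stable += 1
            if stable >= stable_polls:
                return text
        else:
            # Text is empty or still changing: restart the stability count.
            stable = 1 if text else 0
            last_text = text
        sleep(interval)
    return None


if __name__ == "__main__":
    # Simulated streaming response: empty shell, partial text, then final text.
    chunks = iter(["", "In 2026,", "In 2026, the best", "In 2026, the best",
                   "In 2026, the best"])
    print(wait_for_stable_text(lambda: next(chunks), sleep=lambda s: None))
```

With Playwright's sync API, get_text could be something like lambda: page.locator('div[data-subtree="aimc"]').inner_text(), assuming page is already on the AI Mode results. Note that the locator call waits on its own if the element is missing, so the timings here are approximate.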

How do I scrape Google AI Mode at scale without getting blocked?

To scrape Google AI Mode at scale without running into blocks, you need to make your requests look as close to real user traffic as possible.

Using residential proxies is a good starting point. They route traffic through real user devices, helping reduce detection. On top of that, rotate sessions regularly, introduce realistic delays, and mimic natural browser behavior to avoid triggering anti-bot systems.

If you want to avoid managing all of that yourself, you can use Decodo Web Scraping API. It handles proxy rotation, CAPTCHA solving, fingerprinting, and response parsing for you, making it easier to run large-scale scraping workflows with fewer moving parts.
