
How To Scrape Websites With Dynamic Content Using Python

You've mastered static HTML scraping, but now you're staring at a site where Requests + Beautiful Soup returns nothing but an empty <div> and a pile of <script> tags. Welcome to JavaScript-rendered content, where the data you actually want only arrives after the initial request. In this guide, we'll tackle dynamic sites using Python and Selenium (plus a Beautiful Soup alternative).

Justinas Tamasevicius

Dec 16, 2025

12 min read

What's dynamic content?

In web scraping terms, "dynamic content" refers to content that's rendered client-side by JavaScript after the initial page load.

Here's what happens when you visit a modern website:

  1. Your browser requests a URL.
  2. The server responds with minimal HTML, often just a few div and script tags.
  3. JavaScript makes asynchronous requests to fetch data from the server.
  4. JavaScript manipulates the DOM to inject and render that data.
  5. You see a fully populated page.

The scraping challenge: Tools like Requests only capture step 2. They can't execute JavaScript, so they miss steps 3-5 entirely.
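
You can see this for yourself in a few lines. Here's a minimal sketch (using the scrapethissite.com demo page we'll scrape later in this guide) showing that a plain HTTP request comes back without the JavaScript-rendered rows:

import requests
from bs4 import BeautifulSoup

# Fetch the demo page with a plain HTTP request - no JavaScript is executed
response = requests.get('https://www.scrapethissite.com/pages/ajax-javascript/')
soup = BeautifulSoup(response.text, 'lxml')

# The film rows are injected by JavaScript later, so the parsed HTML has none of them
print(soup.select('.film'))  # prints [] - only the empty shell came back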

This architecture dominates modern web development. Single-Page Applications (SPAs) built with React, Vue, Angular, or Svelte all work this way. Of course, it's not done to frustrate scrapers or complicate things for no reason – it offers real advantages: smoother interactions without full page reloads, reduced server load, tailored content for users, and better separation between frontend and backend.

Static vs. dynamic content

To make the differences clearer, here are the key features of each type of content:

Static (server-rendered) content

  • Complete HTML is generated on the server before sending the response
  • Inspect Element shows the same content you see in the browser
  • Each request triggers a full page reload
  • Content is embedded directly in the HTML document
  • Straightforward to scrape with basic HTTP libraries

Dynamic (client-rendered) content

  • Server sends a minimal HTML shell with JavaScript bundles
  • Content is fetched and rendered in the browser via API calls
  • DOM is manipulated after page load without full refreshes
  • Requires JavaScript execution, headless browsers, or API reverse-engineering to scrape

Benefits of dynamic websites

Modern frameworks have made client-side rendering the default for several practical reasons:

  • No full page reloads. Navigation between views is instant because the app swaps components instead of requesting new HTML.
  • Better perceived performance. Shows a loading skeleton immediately, then loads in data progressively.
  • Richer interactions. Allows building complex, stateful UIs (real-time updates, drag-and-drop, infinite scroll).
  • Reduced server load. Offloads rendering work to clients, letting servers focus on data delivery.
  • Developer experience. Component-based architecture with hot reloading makes development faster and code easier to maintain.

Common examples of dynamic content

  • Social media feeds. X, Facebook, LinkedIn (infinite scroll, real-time updates).
  • eCommerce product listings. Filters, sorting, lazy-loaded images.
  • Dashboards and analytics. Charts, graphs, live data visualization.
  • Search results. Google, Bing (autocomplete, instant results).
  • Single-page applications. Gmail, Notion, Figma.
  • Content platforms. YouTube, Netflix, Spotify (recommendations, player interfaces).

Challenges of web scraping dynamic content

Scraping dynamically rendered sites presents distinct technical challenges compared to static HTML extraction:

  • Asynchronous content loading. Data appears after the initial DOM loads, often triggered by user actions or timed events.
  • Fragile element selectors. JavaScript frameworks often generate dynamic class names or restructure the DOM between renders.
  • Timing and race conditions. Multiple API calls might populate the page in an unpredictable order.
  • Resource overhead. Running a headless browser consumes significantly more memory and CPU than simple HTTP requests.
  • Steeper learning curve. You need to understand browser automation, async JavaScript execution, the DOM lifecycle, and debugging tools.

Scraping time – what you'll need to get started

Before starting to scrape dynamic content, you'll need a few prerequisites:

  • Python 3.9 or higher
  • Selenium, Beautiful Soup 4, lxml, Requests, and Webdriver-manager libraries. You can install them all with the pip package manager (comes with Python) through the command terminal:
pip install selenium beautifulsoup4 lxml requests webdriver-manager

But why Requests and Beautiful Soup?

You might've noticed that Requests and Beautiful Soup were a part of the installed libraries, despite the target being a dynamic website. The reason is that there's one more way to extract dynamic data – grabbing it "midway".

When JavaScript renders content, it's fetching that data from somewhere. Most modern sites make XHR (XMLHttpRequest) or Fetch API calls to backend endpoints, typically returning JSON or HTML fragments. Instead of running an entire browser to execute JavaScript, you can identify these requests and hit the endpoints directly.


Skip the Selenium setup

Let Decodo's Web Scraping API handle JavaScript rendering, CAPTCHAs, and rate limits while you focus on extracting the data.

Scraping dynamic content with Python and Selenium: A step-by-step guide

Method 1: Using Selenium (browser automation)

When you need complete JavaScript execution and DOM interaction, Selenium is your tool.

  1. Set up the driver.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from webdriver_manager.chrome import ChromeDriverManager

def setup_driver():
    """Configure Chrome for scraping"""
    chrome_options = Options()
    chrome_options.add_argument('--headless')  # run without a visible browser window
    chrome_options.add_argument('--no-sandbox')
    chrome_options.add_argument('--disable-dev-shm-usage')  # avoid shared-memory issues in containers
    # webdriver-manager downloads a ChromeDriver that matches your Chrome version
    service = Service(ChromeDriverManager().install())
    driver = webdriver.Chrome(service=service, options=chrome_options)
    return driver

2. Navigate and wait for content.

The function below opens the demo page used throughout this guide, clicks a year to fire the AJAX request, and waits for the results to render. The film and film-* class names reflect the scrapethissite.com markup, so adjust them for your own target.

def scrape_films_selenium(year=2015):
    """Scrape film data by letting the browser execute the JavaScript"""
    driver = setup_driver()
    try:
        driver.get('https://www.scrapethissite.com/pages/ajax-javascript/')
        # Click the year link to trigger the AJAX request
        driver.find_element(By.LINK_TEXT, str(year)).click()
        # Wait until rows with the class "film" appear in the DOM
        WebDriverWait(driver, 10).until(
            EC.presence_of_all_elements_located((By.CLASS_NAME, 'film'))
        )
        films = []
        for row in driver.find_elements(By.CLASS_NAME, 'film'):
            films.append({
                'title': row.find_element(By.CLASS_NAME, 'film-title').text.strip(),
                'nominations': row.find_element(By.CLASS_NAME, 'film-nominations').text,
                'awards': row.find_element(By.CLASS_NAME, 'film-awards').text
            })
        return films
    finally:
        driver.quit()

films = scrape_films_selenium(2015)
print(films[:5])  # Print the first 5 films

The script:

  • Navigates to the page, clicks the chosen year, and waits for elements with the class film to appear
  • Executes all JavaScript in a real browser, triggering the AJAX request
  • Extracts data from the rendered DOM
  • Prints the first 5 films

Method 2: Using Requests (API interception)

This is where things get interesting. Instead of running a browser, let's find what the JavaScript is actually requesting.

Step 1: Inspect network traffic

  1. Open scrapethissite.com/pages/ajax-javascript.
  2. Choose a year to view films (for this example, 2015).
  3. Open Chrome DevTools (F12) (or your browser equivalent).
  4. Navigate to the Network tab.
  5. Reload the page.
  6. Filter by Fetch/XHR to see API calls.
  7. Click through responses to find your target data.

You'll discover a request to: ajax-javascript/?ajax=true&year=2015

Click on it and check the Preview or Response tab – it contains the JSON payload the page renders from.
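
Before writing a full scraper, it's worth sanity-checking the endpoint from a Python shell. A quick sketch using the parameters spotted in DevTools:

import requests

# Hit the endpoint the page's JavaScript calls and inspect the raw JSON
url = 'https://www.scrapethissite.com/pages/ajax-javascript/'
data = requests.get(url, params={'ajax': 'true', 'year': 2015}).json()
print(len(data), data[0])  # record count and the first film entry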

Step 2: Replicate the request

import requests
import json

def scrape_films_requests(year=2015):
    """
    Scrape film data by hitting the API directly
    """
    base_url = 'https://www.scrapethissite.com/pages/ajax-javascript/'
    # These are the parameters the JavaScript sends
    params = {
        'ajax': 'true',
        'year': year
    }
    # Add headers to look like a real browser
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
        'Accept': 'application/json',
        'Referer': 'https://www.scrapethissite.com/pages/ajax-javascript/'
    }
    try:
        response = requests.get(base_url, params=params, headers=headers)
        response.raise_for_status()
        # Parse the JSON response
        films_data = response.json()
        # Clean and structure the data
        films = []
        for film in films_data:
            films.append({
                'title': film.get('title', '').strip(),
                'year': film.get('year', ''),
                'awards': film.get('awards', ''),
                'nominations': film.get('nominations', '')
            })
        return films
    except requests.RequestException as e:
        print(f"Request failed: {e}")
        return []
    except json.JSONDecodeError as e:
        print(f"Failed to parse JSON: {e}")
        return []

# Run it
if __name__ == '__main__':
    results = scrape_films_requests(year=2015)
    for film in results[:5]:  # Print first 5
        print(film)
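
Since the endpoint takes a year parameter, collecting more data is just a loop around this function – with a short delay between requests to stay polite:

import time

def scrape_multiple_years(start_year, end_year):
    """Scrape films across multiple years"""
    all_films = []
    for year in range(start_year, end_year + 1):
        print(f"Scraping {year}...")
        films = scrape_films_requests(year)
        all_films.extend(films)
        # Be polite - add a small delay between requests
        time.sleep(0.5)
    return all_films

# Get films from 2010-2015
films = scrape_multiple_years(2010, 2015)
print(f"Total films scraped: {len(films)}")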

Performance comparison

So how did both of these scripts perform?

Selenium approach:

  • ~5-10 seconds per page
  • ~150-200MB memory per browser instance
  • Handles any JavaScript complexity
  • Harder to scale

Requests approach:

  • ~0.5-1 second per request
  • ~10-20MB memory
  • Only works if you can find the API
  • Easy to scale

For this specific site, the Requests method is 10x faster and uses 90% less memory.
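
Your exact numbers will vary with hardware and network conditions, but you can reproduce the comparison with a small timing harness – a sketch that assumes both scraping functions from above are defined in the same script:

import time

def time_scraper(label, func, *args):
    """Run a scraper once and report how long it took"""
    start = time.perf_counter()
    result = func(*args)
    print(f"{label}: {len(result)} films in {time.perf_counter() - start:.2f}s")

time_scraper("Selenium", scrape_films_selenium, 2015)
time_scraper("Requests", scrape_films_requests, 2015)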

When to use each method

Use Selenium when:

  • You can't find or replicate the API calls
  • The site requires complex interactions (clicking, scrolling, form submissions)
  • Authentication involves cookies or tokens set by JavaScript
  • You're prototyping and need something working quickly

Use Requests when:

  • You've identified the data endpoints
  • The API doesn't have complex authentication
  • You need to scrape at scale
  • Performance and resource usage matter

For production scraping, it's recommended to start with 10 minutes of network inspection. If you can find the API, you'll save hours of execution time down the line.

Using proxies

For production scraping, proxies help you avoid rate limits and IP blocks. Here's how to integrate Decodo's residential proxies into both methods.

Selenium

from seleniumwire import webdriver  # pip install selenium-wire
from selenium.webdriver.chrome.options import Options

def setup_driver_with_proxy():
    chrome_options = Options()
    chrome_options.add_argument('--headless')
    chrome_options.add_argument('--no-sandbox')
    # Configure proxy with authentication (swap in your own credentials and gateway)
    proxy_options = {
        'proxy': {
            'http': 'http://username:password@gate.decodo.com:7000',
            'https': 'http://username:password@gate.decodo.com:7000',
        }
    }
    driver = webdriver.Chrome(options=chrome_options, seleniumwire_options=proxy_options)
    return driver

Requests

import requests

def scrape_with_proxy(url):
    # Format: http://username:password@proxy:port
    proxies = {
        'http': 'http://username:password@gate.decodo.com:7000',
        'https': 'http://username:password@gate.decodo.com:7000'
    }
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'
    }
    response = requests.get(url, proxies=proxies, headers=headers)
    return response

# Usage
response = scrape_with_proxy('https://www.scrapethissite.com/pages/ajax-javascript/?ajax=true&year=2015')
data = response.json()
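
To confirm traffic is actually routed through the proxy, you can hit an IP-echo service first (httpbin.org is used here purely as an example):

# The returned IP should be the proxy's exit node, not your own address
check = scrape_with_proxy('https://httpbin.org/ip')
print(check.json())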

Conclusion

Scraping dynamically-rendered content comes down to understanding how modern web apps actually work. Start by checking the Network tab – if you can intercept the API calls, you'll save yourself the overhead of running a headless browser. When that's not possible, Selenium gives you the complete JavaScript execution you need, just at a higher resource cost. The techniques here work for most dynamic sites, but expect to adapt your approach as you encounter different architectures and anti-scraping measures.

Scale without getting blocked

Decodo's residential proxies rotate through 115M+ real IPs to keep your scrapers running smoothly at any scale.

About the author

Justinas Tamasevicius

Head of Engineering

Justinas Tamaševičius is Head of Engineering with over two decades of expertise in software development. What started as a self-taught passion during his school years has evolved into a distinguished career spanning backend engineering, system architecture, and infrastructure development.


Connect with Justinas via LinkedIn.

All information on Decodo Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Decodo Blog or any third-party websites that may be linked therein.

Frequently asked questions

Which Python module is best for web scraping dynamic pages?

Selenium is the most widely used and reliable option for scraping dynamic websites. Playwright typically offers better performance for the same job, while Requests + Beautiful Soup is the fastest approach if you can intercept the API endpoints directly.
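
If you'd like to try Playwright, here's a minimal sketch against the same demo page (the .film class names assume the scrapethissite.com markup used earlier in this guide):

from playwright.sync_api import sync_playwright  # pip install playwright && playwright install chromium

def scrape_films_playwright(year=2015):
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto('https://www.scrapethissite.com/pages/ajax-javascript/')
        page.click(f'text={year}')       # trigger the AJAX request
        page.wait_for_selector('.film')  # wait for the rendered rows
        titles = page.locator('.film-title').all_text_contents()
        browser.close()
        return [title.strip() for title in titles]

print(scrape_films_playwright(2015)[:5])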

Is Python good for web scraping?

Yes, Python is excellent for web scraping with libraries like Selenium, Beautiful Soup, Scrapy, and Playwright covering every scenario. Its clear syntax, strong community support, and seamless integration with data processing tools make it the top choice for most scraping projects.

How’s a dynamic website different from dynamic content?

In web scraping terms, they're essentially the same thing. Both refer to content rendered by JavaScript after the initial page load, requiring either browser automation or API interception to extract the data.

Can I use a different element selector?

There are several selector types, so use whichever you feel most comfortable with – the best-known are XPath and CSS selectors. You can read about them in our other blog post.
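
As a quick illustration, the same title cell on the demo page can be located with either syntax in Selenium (assuming a driver created with setup_driver() from earlier, and the demo page's markup):

from selenium.webdriver.common.by import By

# CSS selector: a title cell inside a film row
title_css = driver.find_element(By.CSS_SELECTOR, 'tr.film td.film-title')

# Equivalent XPath expression
title_xpath = driver.find_element(By.XPATH, '//tr[contains(@class, "film")]/td[contains(@class, "film-title")]')

print(title_css.text == title_xpath.text)  # True - both point at the same element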
