How to Scrape Images From Any Website With Python
If you need a bunch of images and the thought of saving them one by one already feels tedious, you're not alone. This can be especially draining when you're preparing a dataset for a machine learning project. The good news is that web scraping makes the whole process faster and far more manageable by letting you collect large quantities of images in just a few steps. In this blog post, we'll walk you through a straightforward way to grab images from a static website. We'll use Python, a few handy libraries, and proxies to keep things running smoothly.
Dominykas Niaura
Nov 20, 2025
10 min read

How Python image scraping works
Before scraping images, it helps to understand what's actually happening under the hood. In most cases, the workflow looks like this:
- Access the target page using an HTTP request.
- Parse the HTML to extract image URLs.
- Download the images to your machine for later use.
This flow is straightforward when a site serves the same HTML to every visitor. Things get trickier when parts of the page are generated by JavaScript, which brings us to an important distinction.
Static vs. dynamic websites
A static website delivers fixed HTML. What you see is exactly what's stored on the server, and everyone else sees the same thing. This makes static sites ideal for scraping, since the HTML already contains all the image URLs you need.
A dynamic website generates content on the fly. The server or client-side JavaScript tailors the page to each visitor, often based on factors such as account data, browsing history, location, or real-time information like weather or stock updates. Dynamic websites require a different approach, as these pages may not expose image URLs in the initial HTML, which means you'll need tools that can fully render the page before you extract anything.
Both types are common, and knowing the difference early will help you choose the right approach for your scraper.
Determining whether a website is static or dynamic
A quick way to spot a dynamic site is when it greets you with personalised touches – anything from "Welcome back" to reminders about items you viewed earlier. That kind of tailored behaviour signals that the page is being generated on the fly rather than served as fixed HTML.
More generally, you can look at a few simple indicators. Static sites tend to deliver the same unchanging content to every visitor, usually stored as straightforward HTML files. Dynamic sites mix in elements like user logins, personalised recommendations, search-driven results, or forms that update based on what you enter. You might also notice differences in how URLs behave: static pages usually keep the same address, while dynamic pages often generate new query parameters or change the URL as you interact with them.
It also helps to consider the nature of the content itself. Pages that update frequently (such as weather services, news feeds, or stock information) are almost always dynamic, since the data is pulled from a database or API each time. In contrast, static sites only change when someone updates them manually.
Choosing the right tools for Python image scraping
Python gives you several ways to fetch and process images. Each tool has its own strengths, so picking the right one depends on how the website behaves. Let's review a few popular libraries and when to use each.
Requests + Beautiful Soup
This pairing is usually the most efficient choice for scraping static websites. Requests fetches the page's HTML, and Beautiful Soup makes it easy to navigate that HTML and pull out image URLs directly. The process is quick, lightweight, and ideal when the content you need is already present in the initial source code without any JavaScript manipulation.
urllib
If you want to keep dependencies to a minimum, urllib can handle both page requests and file downloads using only Python's standard library. It's not as streamlined as Requests and Beautiful Soup, but it gets the job done when you need a simple way to access a page and save images without bringing in additional packages.
Selenium
Dynamic or JavaScript-driven websites often load images only after scripts run or user actions take place. In these cases, Selenium is the most reliable solution. It automates a real browser environment, allowing the page to fully render before you extract image URLs. This makes it suitable for more complex scraping tasks where requests alone won't reveal the necessary content.
Playwright
Playwright is a more modern alternative for browser automation and is particularly handy for dynamic, JavaScript-heavy pages. Like Selenium, it controls a real browser, but it offers faster execution, built-in support for headless browsing, and a cleaner API for tasks such as waiting for network activity, handling multiple pages, and working with authenticated proxies. If you're starting from scratch or care about reliability and developer experience, Playwright is often the better choice for scraping images from dynamic sites.
Pillow
Pillow isn't a scraping tool, but it's useful once your images are downloaded. You can use it to resize, convert formats, inspect dimensions, or make other adjustments before storing or processing the files further. It's entirely optional, yet helpful if your workflow involves preparing images for datasets, machine learning models, or further analysis.
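As a quick sketch of that post-processing step (the image below is generated in memory rather than downloaded, just for illustration):

```python
from PIL import Image

# Create a small in-memory image standing in for a downloaded file.
img = Image.new("RGB", (640, 480), color=(200, 30, 30))

# Inspect dimensions and adjust before storing it in a dataset.
print(img.size)
thumb = img.resize((160, 120))
thumb.save("thumb.jpg", format="JPEG")  # convert to JPEG on save
```

The same open/resize/save pattern works on any file you've scraped: `Image.open("photo.png")` instead of `Image.new(...)`.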
What you need for a simple Python image scraper
Let's build two scripts using Python: one for scraping images from static websites and another for collecting them from dynamic pages. But before running the scripts, you'll need a few essentials to get started. Here's what to have ready:
- Python 3.7 or higher. Make sure Python is installed on your system. You can download it from the official Python website. To verify installation, open your terminal and run:
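```shell
python --version
# or, on systems where Python 3 is invoked explicitly:
python3 --version
```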
- A text editor or IDE. You'll need somewhere to write and run your code. Visual Studio Code, PyCharm, or even a simple text editor paired with your system's terminal works fine.
- Requests. This library sends HTTP requests to fetch web pages. It's lightweight and perfect for grabbing HTML from static sites.
- Beautiful Soup. Once you have the HTML, Beautiful Soup parses it and lets you extract specific elements like image tags. It's the go-to tool for navigating HTML structure.
- Playwright. For dynamic websites that load content with JavaScript, Playwright automates a real browser session. It renders pages fully before you scrape them, ensuring you capture images that wouldn't show up in raw HTML.
Install these three libraries with a single command:
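Assuming you use pip, the PyPI package names are requests, beautifulsoup4, and playwright:

```shell
pip install requests beautifulsoup4 playwright
```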
After installing Playwright, you'll also need to download the browser binaries it uses:
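```shell
playwright install
# or, if the command isn't on your PATH:
python -m playwright install
```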
- Proxies. Both scripts use residential proxies to mask your IP address and avoid getting blocked when scraping multiple pages. Without proxies, websites may detect your automated activity and limit access. Residential or rotating proxies work best for this.
Decodo offers residential proxies with a 99.86% success rate, average response times under 0.6 seconds, and a 3-day free trial. Here's how to get started:
- Create an account on the Decodo dashboard.
- On the left panel, select Residential proxies.
- Choose a subscription, Pay As You Go plan, or claim a 3-day free trial.
- In the Proxy setup tab, configure your location and session preferences.
- Copy your proxy credentials for integration into your scraping script.
Get residential proxies
Claim your 3-day free trial of residential proxies and explore full features with unrestricted access.
How to scrape images from static websites
We'll begin with a basic scraper designed for pages that openly display their images without any scripting involved. This approach uses lightweight tools like Requests to fetch the page and Beautiful Soup to parse it, making it a clean, beginner-friendly starting point. Since nothing needs to load dynamically, it's a straightforward way to learn how image extraction works.
Inspecting the target website
Static sites are straightforward because everything is already baked into the page's HTML. To confirm this, you can open the site in your browser, right-click an image, and inspect its <img> tag using the developer tools. Check if it includes attributes like src or srcset, and whether the displayed image uses a full URL or a relative path. With static pages, what you see in DevTools is exactly what you'll scrape.
Extracting image URLs
The static script sends a simple GET request to the website, then feeds its HTML into a parser that searches specifically for <img> tags. Each tag contains attributes that point to where the image file is hosted. The script loops over each of these tags and collects their URLs. Nothing has to load dynamically, so there's no need to render the page or trigger JavaScript – the HTML source already tells us everything.
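As a minimal sketch of that extraction step (the sample HTML below is invented for illustration; in the real script it would be the body of a GET response):

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Sample HTML standing in for a fetched page.
html = """
<html><body>
  <img src="/images/logo.png" alt="logo">
  <img src="https://cdn.example.com/banner.jpg">
  <img alt="no source here">
</body></html>
"""

page_url = "https://example.com/gallery"
soup = BeautifulSoup(html, "html.parser")

image_urls = []
for img in soup.find_all("img"):
    src = img.get("src")
    if not src:
        continue  # skip tags with a missing src attribute
    # urljoin turns relative paths into absolute URLs and
    # leaves already-absolute URLs untouched.
    image_urls.append(urljoin(page_url, src))

print(image_urls)
```

Note how the relative `/images/logo.png` is resolved against the page URL, while the CDN link passes through unchanged.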
Downloading the images
After collecting the URLs, the script moves through them one by one and downloads each image to your machine. It builds the file name from the URL, makes another HTTP request to fetch the image data, and stores it in a local folder. If the site uses relative paths, the script combines them with the main website address to ensure every image link becomes valid. The end result is a folder full of images taken directly from the static page.
Handling common issues
Even static pages can throw curveballs. Some servers reject automated requests unless a browser-like User-Agent header is sent, so adding one can help avoid 4xx errors; when the block is IP-based rather than header-based, residential proxies are the better fix. Other things to keep in mind include broken image tags, duplicate image names, and missing URLs that need to be skipped gracefully. These checks are light and usually enough to scrape successfully as long as the site isn't actively blocking bots.
The full image scraper code for static websites
Save the following code as a .py file and run it in your terminal or preferred IDE. Add your proxy details and paste in the target URL you want to scrape, then start the script. It will scan the site, extract every image source it finds, and print those URLs directly in your terminal. From there, it automatically downloads each file and saves it to a local folder, giving you a clean collection of images ready to use.
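A minimal version of such a script might look like the following. The proxy endpoint, credentials, and target URL are placeholders to replace with your own values (the gate.decodo.com host and port are shown as an assumed example of the residential proxy format; copy the exact values from your dashboard):

```python
import os
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

# Placeholders: replace with your own proxy credentials and target page.
PROXY_USER = "username"
PROXY_PASS = "password"
PROXY_HOST = "gate.decodo.com:7000"  # example endpoint; check your dashboard
TARGET_URL = "https://example.com/gallery"
OUTPUT_DIR = "images"

PROXIES = {
    "http": f"http://{PROXY_USER}:{PROXY_PASS}@{PROXY_HOST}",
    "https": f"http://{PROXY_USER}:{PROXY_PASS}@{PROXY_HOST}",
}
HEADERS = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"}


def filename_from_url(url):
    """Build a local file name from the last path segment of a URL."""
    name = os.path.basename(urlparse(url).path)
    return name or "image"


def extract_image_urls(page_url, html):
    """Collect absolute image URLs from every <img> tag in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    urls = []
    for img in soup.find_all("img"):
        src = img.get("src")
        if src:  # skip tags with no src attribute
            urls.append(urljoin(page_url, src))
    return urls


def main():
    os.makedirs(OUTPUT_DIR, exist_ok=True)
    response = requests.get(TARGET_URL, headers=HEADERS, proxies=PROXIES, timeout=30)
    response.raise_for_status()

    urls = extract_image_urls(TARGET_URL, response.text)
    print(f"Found {len(urls)} image URLs")

    for url in urls:
        print(url)
        try:
            img = requests.get(url, headers=HEADERS, proxies=PROXIES, timeout=30)
            img.raise_for_status()
        except requests.RequestException as exc:
            print(f"  skipped ({exc})")
            continue
        path = os.path.join(OUTPUT_DIR, filename_from_url(url))
        with open(path, "wb") as f:
            f.write(img.content)


if __name__ == "__main__":
    # Guard against running with the placeholder credentials still in place.
    if PROXY_USER == "username":
        print("Replace the placeholder proxy credentials and target URL first.")
    else:
        main()
```

If the pages you scrape reuse file names, consider prefixing each saved file with an index so later downloads don't overwrite earlier ones.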
How to scrape images from dynamic websites
For sites that rely on JavaScript, scraping requires a bit more muscle. Images often appear only after the page has fully rendered, so a standard HTTP request won't be enough. Here we'll use Playwright, a browser automation library, to load the content, interact when needed, and then extract the image sources just as a user would see them.
Inspecting the target website
Dynamic pages often disguise their images behind JavaScript. Opening DevTools may reveal that <img> tags only appear after the page has fully loaded, or that the src attribute doesn't contain a conventional URL. It's common to see base64-encoded blobs in the src attribute, meaning the image is embedded inside the page instead of being hosted separately. Inspecting the page reveals whether the script must wait for elements to appear or decode the image data manually.
Extracting image URLs
Unlike static sites, you can't simply download the HTML and parse it. Instead, the dynamic script launches a browser session in the background, visits the page, and waits for network activity to settle so all JavaScript content has finished loading. Once the page is fully rendered, it queries all <img> elements, just as a user would see them. Some images return normal URLs, while others are base64 strings embedded into the document. The script categorizes both types so they can be handled differently.
Downloading images
For images with regular URLs, the script makes an HTTP request and saves each file to a local folder. Embedded base64 images require a different approach: instead of making a network request, the script decodes the base64 data directly and writes it to disk as a binary file. Both kinds end up in the same image directory, but one comes from the network while the other comes from the page's encoded contents.
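The base64 branch can be sketched like this: given a data URL pulled from an <img> tag's src, split off the header, decode the payload, and write the raw bytes (the tiny GIF below is just a stand-in for a scraped value):

```python
import base64
import os

# A 1x1 transparent GIF, standing in for a base64 src scraped from a page.
data_url = (
    "data:image/gif;base64,"
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)

# Split "data:image/gif;base64,<payload>" into its header and payload.
header, payload = data_url.split(",", 1)
extension = header.split("/")[1].split(";")[0]  # e.g. "gif"

image_bytes = base64.b64decode(payload)

os.makedirs("images", exist_ok=True)
path = os.path.join("images", f"embedded_0.{extension}")
with open(path, "wb") as f:
    f.write(image_bytes)

print(f"Wrote {len(image_bytes)} bytes to {path}")
```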
Handling common issues
Dynamic sites come with their own challenges. Because they rely on JavaScript, the script may need to wait longer for images to load, retry rendering, or pause for network requests to finish. Some pages use anti-bot logic, so scraping through a proxy or using realistic browser headers helps blend in with normal traffic. Missing or broken image sources and repeated filenames are also common, so the script checks for duplicates and skips unusable entries. With these safeguards, even heavily scripted pages become scrape-friendly.
The full image scraper code for dynamic websites
Save the script as a .py file, add your proxy details and target URL, and run it from your terminal or IDE. This script opens the page in a headless browser, waits for the content to render, and then collects every image source it finds. During execution, the script reports its progress – first showing the page it's loading, then how many images were detected, and finally printing each downloaded filename as it's saved. When the job finishes, it confirms how many files were successfully stored and where they can be found on your computer.
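Here's a minimal sketch of such a script. The proxy details and target URL are placeholders (the gate.decodo.com endpoint is an assumed example; use the values from your dashboard), and the Playwright import happens inside main() so the helper functions stay usable even before the browser binaries are installed:

```python
import base64
import os
from urllib.parse import urljoin, urlparse

# Placeholders: replace with your own proxy credentials and target page.
PROXY_SERVER = "http://gate.decodo.com:7000"  # example endpoint
PROXY_USER = "username"
PROXY_PASS = "password"
TARGET_URL = "https://example.com/gallery"
OUTPUT_DIR = "images"


def filename_from_url(url, index):
    """Derive a local file name, falling back to an indexed default."""
    name = os.path.basename(urlparse(url).path)
    return name or f"image_{index}"


def save_base64_image(src, index):
    """Decode a data: URL and write its bytes to the output folder."""
    os.makedirs(OUTPUT_DIR, exist_ok=True)
    header, payload = src.split(",", 1)
    extension = header.split("/")[1].split(";")[0]
    path = os.path.join(OUTPUT_DIR, f"embedded_{index}.{extension}")
    with open(path, "wb") as f:
        f.write(base64.b64decode(payload))
    return path


def main():
    # Imported here so the helpers above work even without Playwright installed.
    from playwright.sync_api import sync_playwright

    os.makedirs(OUTPUT_DIR, exist_ok=True)
    with sync_playwright() as p:
        browser = p.chromium.launch(
            headless=True,
            proxy={"server": PROXY_SERVER,
                   "username": PROXY_USER,
                   "password": PROXY_PASS},
        )
        page = browser.new_page()
        print(f"Loading {TARGET_URL} ...")
        # Wait for network activity to settle so JS-loaded images appear.
        page.goto(TARGET_URL, wait_until="networkidle")

        sources = page.eval_on_selector_all("img", "imgs => imgs.map(i => i.src)")
        print(f"Detected {len(sources)} images")

        for i, src in enumerate(sources):
            if not src:
                continue  # skip images with no usable source
            if src.startswith("data:"):
                path = save_base64_image(src, i)  # embedded image
            else:
                url = urljoin(TARGET_URL, src)    # hosted image
                resp = page.request.get(url)
                if not resp.ok:
                    continue
                path = os.path.join(OUTPUT_DIR, filename_from_url(url, i))
                with open(path, "wb") as f:
                    f.write(resp.body())
            print(f"Saved {path}")
        browser.close()
    print(f"Done - files are in ./{OUTPUT_DIR}")


if __name__ == "__main__":
    # Guard against running with the placeholder credentials still in place.
    if PROXY_USER == "username":
        print("Replace the placeholder proxy credentials and target URL first.")
    else:
        main()
```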
Advanced tips and best practices
Once you have a basic scraper working, you can make it more powerful and easier to maintain by adding a few extra features and safeguards.
Saving your scraped data to a CSV file or database helps you keep track of what you have collected and reuse the data later. Instead of only downloading images, you can store each image URL along with fields such as the page it came from, a timestamp, and any metadata you captured. For simple projects, a CSV file is often enough. For larger workflows or integrations with other tools, pushing records into a database makes filtering and querying much easier.
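For instance, you could append one row per image to a CSV file as you download; the field names and records below are just one possible layout, invented for illustration:

```python
import csv
from datetime import datetime, timezone

# Hypothetical records gathered during a scrape: (image URL, source page).
records = [
    ("https://example.com/images/a.jpg", "https://example.com/gallery"),
    ("https://example.com/images/b.png", "https://example.com/gallery"),
]

with open("scraped_images.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["image_url", "source_page", "scraped_at"])
    for image_url, source_page in records:
        writer.writerow(
            [image_url, source_page, datetime.now(timezone.utc).isoformat()]
        )

print("Wrote scraped_images.csv")
```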
Scraping image metadata can be just as valuable as the images themselves. While you are already looping through <img> elements, you can also read attributes like alt, title, dimensions, or custom data attributes. If the page includes captions, photographer names, tags, or category labels near the image, these can be collected too by inspecting neighbouring elements in the HTML. This is especially useful when you are building datasets for machine learning or want better search and filtering later on.
Many modern sites rely on infinite scroll or lazy loading, which means new images only appear after you scroll or interact with the page. Browser automation tools let you simulate these actions by scrolling in steps, clicking Load more buttons, or waiting for new elements to appear. In practice, the script performs the same actions a user would, then repeats the extraction logic on the newly loaded content.
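With Playwright, a scroll loop might look like the sketch below. The scroll_page helper is a name invented here; it accepts any object with evaluate and wait_for_timeout methods (a real Playwright page in practice) and scrolls a fixed number of times with a pause between rounds:

```python
def scroll_page(page, rounds=5, pause_ms=1000):
    """Scroll to the bottom repeatedly so lazy-loaded images appear.

    `page` is expected to behave like a Playwright page object.
    """
    for _ in range(rounds):
        page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
        page.wait_for_timeout(pause_ms)  # give new content time to load
```

After the loop finishes, rerun the same <img> extraction logic to pick up whatever the scrolling loaded; a fancier version could stop early once the page height stops growing.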
To be a responsible scraper, it is important to add rate limiting and avoid overwhelming the website. Introducing small delays between requests, limiting the number of pages you fetch in a single run, and reusing existing connections reduces load on the server and makes your scraper less likely to be blocked. Using realistic headers and not hammering endpoints with rapid-fire requests goes a long way.
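A simple way to add that delay is a randomized pause between requests; the bounds below are arbitrary examples to tune per site:

```python
import random
import time


def polite_pause(min_s=1.0, max_s=3.0):
    """Sleep for a random interval to avoid hammering the server."""
    delay = random.uniform(min_s, max_s)
    time.sleep(delay)
    return delay

# Call polite_pause() between each page fetch or image download.
```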
Finally, solid error handling and logging turn a fragile script into a reliable tool. Network timeouts, missing attributes, redirects, and unexpected HTML changes are all common. Wrapping your requests and parsing logic in try-except blocks, logging failed URLs, and printing clear messages when something goes wrong will help you debug quickly. Over time, these logs become a useful record of how your scraper behaves and which parts of the site are more prone to issues.
Sample projects and use cases
Image scraping is useful far beyond simple downloading. Once you start gathering images at scale, the same techniques can power a wide range of practical projects.
One common application is building datasets for machine learning. Many computer vision models rely on large, well-organized image collections that include not just files, but also metadata such as labels, descriptions, or categories. By scraping thousands of visuals from curated sites and pairing them with tags or alt text, you can quickly assemble training data for tasks like object recognition, style detection, or recommendation systems.
In marketing research, image scraping helps teams track trends, competitor branding, visual ad themes, or product design developments. Collecting images from landing pages, eCommerce listings, or social campaigns makes it easier to analyze how brands present themselves, what styles get reused across industries, or how visual messaging changes over time. This data can be turned into insights for brand strategy, creative direction, or product positioning.
Content aggregation is another practical use case. Publishers and community platforms can pull visuals from multiple sources and curate them into galleries, feeds, newsletters, or inspiration boards. When combined with metadata such as titles, authors, or categories, it becomes possible to create searchable archives or automatically update content streams. In these contexts, scraped images become more than raw files – they form structured collections that support new products, discovery tools, and editorial work.
On a final note
Scraping images with Python can be as simple or as advanced as the website demands. Static pages let you fetch visuals quickly with tools like Requests and Beautiful Soup, while dynamic, JavaScript-driven platforms call for browser automation with Playwright. Once you learn to inspect pages, extract sources, and save the files, you become capable of gathering visual data efficiently and adapting your approach to virtually any site.
From here, the real value lies in how you use the images you've collected. You might build datasets for machine learning, analyze visuals for marketing or research, or organize them into searchable archives. As your needs grow, you can store metadata in databases, automate repeat scraping, or scale with proxies and cloud tools – turning a simple script into a powerful, reusable workflow.
Scrape smarter with proxies
Boost your scraper with Decodo’s residential proxies and capture images with outstanding success.
About the author

Dominykas Niaura
Technical Copywriter
Dominykas brings a unique blend of philosophical insight and technical expertise to his writing. Starting his career as a film critic and music industry copywriter, he's now an expert in making complex proxy and web scraping concepts accessible to everyone.
Connect with Dominykas via LinkedIn
All information on Decodo Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Decodo Blog or any third-party websites that may be linked therein.


