
Selecting Elements by Class in XPath: Syntax, Examples, and Pitfalls

Class names are often the quickest way to target elements when you scrape a page. But in XPath, they are not as simple as they look. Because HTML stores multiple classes inside a single space-separated attribute value, a selector that seems correct can still match the wrong element or miss the right one entirely. In this blog post, you’ll learn how to select elements by class in XPath, when to use exact or partial matching, and how to avoid common class matching pitfalls.


TL;DR

  • XPath can match class attributes with either exact matching or contains()
  • Exact matching only works when the class value is an exact full-string match
  • contains(@class, "value") is flexible, but it can match unintended substrings
  • The safest pattern for full class matching is contains(concat(" ", normalize-space(@class), " "), " classname ")
  • When class names are unstable or generated dynamically, structural selectors or other attributes are often a better choice

How to write XPath for class name

There are 2 main ways to select elements by class in XPath: exact matching and partial matching. Both work, but they solve slightly different problems.

The most direct version is an exact match:

//element[@class="classname"]

This only works when the class attribute is exactly classname and nothing else. That means it is only reliable when the element has one class, or when you know the full class string exactly as it appears in the HTML.

For example, this XPath matches:

<div class="product-card"></div>

But it does not match:

<div class="product-card featured"></div>

That's where many XPath selectors start to fail. In HTML, multiple classes are stored as a single space-separated string within the same class attribute. XPath does not treat them as separate tokens automatically. It just sees one attribute value.
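
To see the difference in practice, here is a minimal sketch using lxml, the same parser used in the examples later in this post. The markup is made up for illustration.

```python
from lxml import html

# Illustrative markup: one single-class element and one multi-class element.
markup = """
<div class="product-card">Phone</div>
<div class="product-card featured">Laptop</div>
"""
tree = html.fromstring(markup)

# Exact match compares the whole attribute string, not individual classes.
exact = tree.xpath('//div[@class="product-card"]')
exact_text = [el.text_content() for el in exact]
print(exact_text)  # ['Phone']

# Matching the multi-class element exactly requires the full string, in order.
full_string = tree.xpath('//div[@class="product-card featured"]')
full_text = [el.text_content() for el in full_string]
print(full_text)  # ['Laptop']
```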

The more flexible option is partial matching with contains():

//element[contains(@class, "classname")]

This matches whenever the class attribute contains that text anywhere inside the string. It is much more practical when elements carry multiple classes, and you only need to target one of them.

So the short version is this:

  • Use @class="classname" when you need an exact full-string match.
  • Use contains(@class, "classname") when the element may have multiple classes.

That is why partial matching is more common in real-world scraping. Exact matching is cleaner, but it breaks quickly as the class list grows or the order changes.

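If you are weighing XPath against another selector style, it also helps to understand how to choose between XPath and CSS selectors for web scraping. For class matching, the difference is easy to see: a CSS class selector matches whole class names natively, while XPath needs the longer whole-class pattern covered later in this post: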

XPath:

//div[contains(concat(" ", normalize-space(@class), " "), " product-card ")]

CSS:

.product-card


How to use partial class name in XPath

If the element you want can have multiple classes, contains() is usually the first XPath pattern people reach for.

The basic syntax looks like this:

//element[contains(@class, "value")]

This checks whether the class attribute contains the target text anywhere inside the string. That makes it much more flexible than an exact match.

For example, take a look at this XPath:

//div[contains(@class, "card")]

It would match all of these:

<div class="card"></div>
<div class="card featured"></div>
<div class="featured card"></div>

That is why contains() is so common in scraping and automation. It still works when:

  • the element has multiple classes
  • the class order changes
  • the target class appears alongside utility or framework classes

In real pages, that's often the normal case rather than the exception.

The catch is that contains() matches text, not whole class names. So, if you search for:

//button[contains(@class, "btn")]

You may match more than you intended, including classes like:

  • btn
  • btn-primary
  • submit-btn

That makes contains() useful, but not always precise. It's a strong default when you need flexibility, but it can create false positives if the class name you are targeting is short or shared across other class names.

So, the practical rule is simple: use contains() when exact matching is too brittle, but don't assume it's safe just because it returns results.
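
The false-positive risk is easy to reproduce. The following sketch assumes lxml and uses invented button markup to show which elements a loose contains() check actually picks up.

```python
from lxml import html

# Illustrative markup: only the first button has the class "btn" itself.
markup = """
<button class="btn">Save</button>
<button class="btn-primary">Buy</button>
<button class="submit-btn large">Send</button>
<button class="icon">Close</button>
"""
tree = html.fromstring(markup)

# contains() is a substring test on the whole attribute value.
loose = tree.xpath('//button[contains(@class, "btn")]')
matched_classes = [el.get("class") for el in loose]
print(matched_classes)  # ['btn', 'btn-primary', 'submit-btn large']
```

Three of the four buttons match, even though only one of them carries btn as a class of its own.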

Selecting elements with multiple classes

Once an element has more than one class, plain contains(@class, "value") becomes risky. It still works in many cases, but it can also match substrings you didn't intend. The safer approach is to match the class name as a whole token, not just as text inside the attribute string.

The most reliable XPath pattern for that is the space-padding technique:

//div[contains(concat(" ", normalize-space(@class), " "), " classname ")]

It looks heavier than a basic contains() call, but it solves an important problem. normalize-space(@class) cleans up extra whitespace, concat(" ", ..., " ") adds padding around the full class string, and the final match checks for " classname " as a complete class name. That means it will match card without accidentally matching card-body or flashcard.
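
One way to make the pattern less opaque is to evaluate its inner expression on its own. The sketch below assumes lxml and uses a made-up element with deliberately messy whitespace. It prints the padded string that contains() ends up searching.

```python
from lxml import html

# Illustrative element with extra whitespace inside the class attribute.
el = html.fromstring('<div class="  card-body   featured ">Item</div>')

# Evaluate only the inner expression, relative to this element.
padded = str(el.xpath('concat(" ", normalize-space(@class), " ")'))
print(repr(padded))  # ' card-body featured '

# Every class is now wrapped in single spaces, so a padded needle
# can only match a complete class name.
print(" featured " in padded)  # True
print(" card " in padded)      # False: "card-body" is a different token
print("card" in padded)        # True: the plain substring check is looser
```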

If you need to match a specific class combination, you can chain conditions with and:

//div[contains(@class, "card") and contains(@class, "featured")]

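That works when you know the element should contain both class names, regardless of order. Keep in mind that each contains() here is still a substring check, so this selector would also match an element like class="card-body featured". If that matters, wrap each condition in the space-padding pattern. The key thing to remember is that the order of classes in HTML doesn't matter. class="card featured" and class="featured card" mean the same thing, so the goal isn't to match the full string in a specific order. It's to identify the class names that actually make the element unique.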

In practice, the safest rule is simple. If you only need a quick loose match, contains() is fine. If precision matters, especially when the class name is short or likely to appear inside other class names, use the space-padding pattern instead.
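
The two ideas in this section, whole-token matching and chaining with and, can be combined. The sketch below assumes lxml. The has_class helper is a hypothetical convenience function written for this example, and it assumes class names contain no quotes or spaces.

```python
from lxml import html

def has_class(class_name: str) -> str:
    """Build an XPath predicate that matches class_name as a whole token.

    Hypothetical helper; assumes class_name has no quotes or spaces.
    """
    return f'contains(concat(" ", normalize-space(@class), " "), " {class_name} ")'

# Illustrative markup: A and B carry both classes, in different order.
markup = """
<div class="card featured">A</div>
<div class="featured card">B</div>
<div class="card-body featured">C</div>
<div class="card">D</div>
"""
tree = html.fromstring(markup)

# Require both classes as whole tokens, regardless of order.
expr = f'//div[{has_class("card")} and {has_class("featured")}]'
both = [el.text_content() for el in tree.xpath(expr)]
print(both)  # ['A', 'B']
```

Element C is skipped because card-body is not the same token as card, which a plain contains(@class, "card") would not distinguish.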

Practical examples and use cases

The easiest way to make these XPath patterns stick is to see them against real HTML. Below are 3 common cases using Python and lxml. If you want a deeper refresher on the parser itself, the lxml tutorial on parsing HTML and XML is helpful. 

If you're using browser automation instead of plain parsing, our Selenium and Python scraping guide is the closest parallel.

Example 1: Extract all product cards

This is the common "find every card on the page" case:

from lxml import html

# Sample markup: two product cards and one similarly named banner.
markup = """
<div class="product-card featured">Laptop</div>
<div class="product-card">Phone</div>
<div class="product-banner">Sale</div>
"""
tree = html.fromstring(markup)

# Match "product-card" as a whole class token.
cards = tree.xpath('//div[contains(concat(" ", normalize-space(@class), " "), " product-card ")]')
for card in cards:
    print(card.text_content().strip())

This returns the 2 product cards and skips the banner. That's the main benefit of the whole-class match pattern: it stays precise even when class names are similar.

Example 2: Select navigation links

Navigation links often carry multiple classes, so this is a good case for token-aware matching.

from lxml import html

# Sample markup: two navigation links and one footer link.
markup = """
<nav>
<a class="nav-link active" href="/home">Home</a>
<a class="nav-link" href="/pricing">Pricing</a>
<a class="footer-link" href="/contact">Contact</a>
</nav>
"""
tree = html.fromstring(markup)

# Match "nav-link" as a whole class token, ignoring extra classes.
nav_links = tree.xpath('//a[contains(concat(" ", normalize-space(@class), " "), " nav-link ")]')
for link in nav_links:
    print(link.get("href"), link.text_content().strip())

This gives you only the navigation links, even though one of them has an extra active class.

Example 3: Target elements with a dynamic class prefix

Sometimes the class is partly stable and partly generated, like product-123 or product-987. In that case, a partial match is still useful.

from lxml import html

# Sample markup: two product elements and one unrelated profile element.
markup = """
<div class="product-123 card">Item A</div>
<div class="product-456 card">Item B</div>
<div class="profile-123 card">User A</div>
"""
tree = html.fromstring(markup)

# Loose substring match on the stable part of the class name.
products = tree.xpath('//div[contains(@class, "product-")]')
for product in products:
    print(product.text_content().strip())

This is one of the cases where a broader contains() match makes sense, because the changing part is expected. The tradeoff is that you should only use it when the prefix is distinctive enough not to collide with unrelated classes.
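
If the prefix could also appear in the middle of another class name, a stricter variant is possible. The sketch below assumes lxml, pads only the front of the class string, and searches for " product-", so it matches any class that starts with the prefix. The markup, including the byproduct-9 class, is invented for illustration.

```python
from lxml import html

# Illustrative markup: "byproduct-9" contains the prefix but does not start with it.
markup = """
<div class="product-123 card">Item A</div>
<div class="card product-456">Item B</div>
<div class="byproduct-9 card">Other</div>
"""
tree = html.fromstring(markup)

# Plain substring match also picks up "byproduct-9".
loose = tree.xpath('//div[contains(@class, "product-")]')
loose_text = [el.text_content() for el in loose]
print(loose_text)  # ['Item A', 'Item B', 'Other']

# A leading space anchors the match to the start of a class token.
anchored = tree.xpath('//div[contains(concat(" ", normalize-space(@class)), " product-")]')
anchored_text = [el.text_content() for el in anchored]
print(anchored_text)  # ['Item A', 'Item B']
```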

These examples show the real pattern behind class selection in XPath. Start loose when the class signal is stable and broad matching is acceptable. Switch to the whole-class technique when precision matters and substring matches become a problem.

Common pitfalls and best practices

Selecting by class in XPath looks easy until the selector starts matching the wrong elements or stops working after a small frontend change. Most of the trouble comes from a few repeated mistakes.

Don't trust plain contains() too much

The biggest trap is substring matching. A selector like the one below can match btn, btn-primary, and submit-btn. That may be fine in a loose search, but it's not precise:

//button[contains(@class, "btn")]

If you need an exact class token, use the safer pattern:

//button[contains(concat(" ", normalize-space(@class), " "), " btn ")]

That checks the full class name rather than any matching substring.

Do not use exact matching on multi-class elements

This is another common failure point:

//div[@class="card"]

It only works if the element’s class attribute is exactly card. The moment the HTML becomes this, the selector stops matching:

<div class="card featured"></div>

If the element can carry multiple classes, exact matching is usually too brittle.

Be careful with dynamic classes

Modern sites often generate class names that are unstable, hashed, or build-specific, such as css-1a2b3c. Those may work today and break on the next deployment.

When that happens, class-based XPath is usually the wrong anchor. A better option is to look for:

  • data- attributes
  • stable IDs
  • nearby text
  • a stronger structural path
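
As a rough illustration, the sketch below ignores hashed class names and anchors on a data- attribute plus a short structural step instead. It assumes lxml. The data-testid attribute and the markup are assumptions made for this example, and real pages will expose different attributes.

```python
from lxml import html

# Illustrative markup with build-generated class names and a stable data- attribute.
markup = """
<div class="css-1a2b3c" data-testid="product-card"><span class="css-9z8y7x">Laptop</span></div>
<div class="css-4d5e6f" data-testid="banner"><span class="css-9z8y7x">Sale</span></div>
"""
tree = html.fromstring(markup)

# Anchor on the attribute, then step to the child element by structure.
names = tree.xpath('//div[@data-testid="product-card"]/span/text()')
print(names)  # ['Laptop']
```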

This is also where it helps to know when a CSS selector might be simpler than an XPath expression, especially if the page structure is straightforward.

Match the exact case

XPath class matching is case-sensitive. If the HTML says ProductCard and your selector looks for productcard, it won't match.

That sounds obvious, but it's easy to miss when class names are long, framework-generated, or mixed-case.
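
If you cannot control the casing, XPath 1.0 has no lower-case function, but translate() can map upper-case letters to lower-case before the comparison. The following is a sketch with made-up markup, assuming lxml and ASCII-only class names. Whether case-insensitive matching is appropriate depends on the site.

```python
from lxml import html

# Illustrative markup: the same name in two casings, plus a longer class.
markup = """
<div class="ProductCard">A</div>
<div class="productcard">B</div>
<div class="ProductCardList">C</div>
"""
tree = html.fromstring(markup)

# Case-sensitive exact match: only the all-lowercase element.
strict = [el.text_content() for el in tree.xpath('//div[@class="productcard"]')]
print(strict)  # ['B']

# Lower-case the attribute with translate(), then match the whole token.
UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
LOWER = "abcdefghijklmnopqrstuvwxyz"
lowered = f'translate(@class, "{UPPER}", "{LOWER}")'
expr = f'//div[contains(concat(" ", normalize-space({lowered}), " "), " productcard ")]'
relaxed = [el.text_content() for el in tree.xpath(expr)]
print(relaxed)  # ['A', 'B']
```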

Test selectors before you put them in code

A good XPath can still fail if you write it against the wrong DOM state or make a small syntax mistake. The fastest way to catch that early is to test it in the browser first.

In DevTools, you can use:

$x("//your/xpath")

That gives you quick feedback before you move the selector into Python, Selenium, or another automation tool. If you're testing selectors against pages that depend on browser rendering, it also helps to understand how Playwright web scraping fits into the workflow.
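
For the syntax part, a complementary check is possible on the Python side. The helper below is a hypothetical convenience function, not part of lxml. It relies on lxml's etree.XPath class, which compiles an expression and raises XPathSyntaxError when the expression is malformed. It does not tell you whether the selector matches anything on the live page.

```python
from lxml import etree

def is_valid_xpath(expression: str) -> bool:
    """Return True if lxml can compile the expression (hypothetical helper)."""
    try:
        etree.XPath(expression)
        return True
    except etree.XPathSyntaxError:
        return False

good = is_valid_xpath('//div[contains(@class, "card")]')
bad = is_valid_xpath('//div[contains(@class, "card")')  # missing closing bracket
print(good, bad)
```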

The short version is simple: use exact matching only when the full class value is stable, use the space-padding pattern when precision matters, and don't build your selector around class names that look auto-generated or temporary.

Final thoughts

Selecting by class in XPath looks simple, but the details matter. Exact matching with @class="value" only works when the full class string matches exactly. contains(@class, "value") is more flexible, but it can also match the wrong elements if you're not careful.

That's why the safest pattern for precise class matching is usually the space-padding approach with concat(" ", normalize-space(@class), " "). It's a little longer, but it avoids the substring trap that breaks many otherwise “working” selectors.

The bigger lesson is to treat class names as signals, not guarantees. If the classes are stable, XPath can work very well. If they are generated, inconsistent, or likely to change, you're usually better off combining class checks with structure, text, or more stable attributes.


About the author

Mykolas Juodis

Head of Marketing

Mykolas is a seasoned digital marketing professional with over a decade of experience, currently leading the Marketing department in the web data gathering industry. His extensive background in digital marketing, combined with his deep understanding of proxies and web scraping technologies, allows him to bridge the gap between technical solutions and practical business applications.


All information on Decodo Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Decodo Blog or any third-party websites that may be linked therein.

Frequently asked questions

How do you write XPath for class name?

There are 2 common approaches. The first is an exact class match, which only works when the element has exactly that class value and no others. The second is a partial match using contains(), which is more flexible and works better when the element has multiple classes.

How do you use partial class name in XPath?

You use contains() to check whether the class attribute includes a target value anywhere in the string. That makes it useful when elements have more than one class or when class order changes. The trade-off is that it can also match unintended substrings, so if precision matters, it is better to use the whole-class matching pattern rather than a loose partial match.


© 2018-2026 decodo.com (formerly smartproxy.com). All Rights Reserved