
Using CSS Selectors to Find Elements on a Web Page

1. Remembering CSS Selectors

Welcome to our world where HTML pages reveal their secrets not with a snap of a finger, but with a sharp CSS selector. If you think CSS selectors are only for page styling (you know, so your site doesn't look like a scribbled school notebook), it's time to open your third scraper eye. Today we'll look at how CSS selectors can become your favorite tool for finding and extracting data.

CSS selectors, like an affectionate nickname, let us target specific HTML elements. They help define which elements on the page you want to work with. If an HTML page is a maze, then CSS selectors are the red thread that helps you find your way out.

Examples of CSS Selectors

  • Tag: p — selects all <p> elements (paragraphs).
  • Class: .classname — selects all elements with a specific class.
  • ID: #idname — selects the element with a specific ID.
  • Combinations: div > p — selects all <p> elements that are direct children of <div>.
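
To see these four selector types in action, here is a minimal sketch that runs them through BeautifulSoup's select() method. The markup, along with the intro id and the note and box class names, is invented purely for this illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, made up for this illustration only
html = """
<div id="intro" class="box">
    <p class="note">First note</p>
    <section><p>Nested paragraph</p></section>
</div>
<p class="note">Second note</p>
"""

soup = BeautifulSoup(html, "html.parser")

by_tag = soup.select("p")        # every <p> element
by_class = soup.select(".note")  # every element with class "note"
by_id = soup.select("#intro")    # the element with id "intro"
direct = soup.select("div > p")  # <p> elements that are direct children of a <div>

print(len(by_tag))                     # 3
print(len(by_class))                   # 2
print(len(by_id))                      # 1
print([p.get_text() for p in direct])  # ['First note']
```

Note that div > p skips the paragraph nested inside <section>: it is a descendant of the <div>, but not a direct child.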

2. Using Selectors in BeautifulSoup

Goodbye boring life without CSS selectors in BeautifulSoup! It's time to refresh our approach. Picture this: you stumble upon a website and just have to extract all the quotes from great thinkers to impress at your next interview. For this, we use the select() method, which works specifically with CSS selectors.

Methods select() and select_one()

The select() method returns a list of all elements matching your selector. Meanwhile, select_one() grabs only the first matching element (or None if nothing matches), like a search engine that gives you exactly what you need instead of a mile-long list of irrelevant links.
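
The difference between the two methods is easiest to see side by side. The sketch below uses a tiny made-up list (the item class and the deliberately missing absent class are assumptions for the example) and also shows what each method gives back when nothing matches.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, made up for this illustration only
html = '<ul><li class="item">one</li><li class="item">two</li></ul>'
soup = BeautifulSoup(html, "html.parser")

all_items = soup.select(".item")          # every match, in document order
first_item = soup.select_one(".item")     # only the first match
missing_all = soup.select(".absent")      # no matches: an empty list
missing_one = soup.select_one(".absent")  # no matches: None

print([li.get_text() for li in all_items])  # ['one', 'two']
print(first_item.get_text())                # one
print(len(missing_all))                     # 0
print(missing_one)                          # None
```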

Say you have an HTML page containing quotes:

HTML

<div class="quote">
    <h2 class="author">Pushkin</h2>
    <p class="text">Oh Pushkin.</p>
    <a href="https://example.com" class="link">Read more</a>
</div>
<div class="quote">
    <h2 class="author">Lenin</h2>
    <p class="text">Learn, learn, and learn again.</p>
    <a href="https://example.com" class="link">Read more</a>
</div>
<div class="quote">
    <h2 class="author">Stalin</h2>
    <p class="text">No man - no problem.</p>
    <a href="https://example.com" class="link">Read more</a>
</div>

Here's how we can grab them:

Python

from bs4 import BeautifulSoup
import requests

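# Get the HTML code of the page (this demo site uses the same
# .quote, .text and .author class names as the sample above)
response = requests.get('http://quotes.toscrape.com/')
response.raise_for_status()  # stop early on an HTTP error
soup = BeautifulSoup(response.text, 'html.parser')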

# Find all quotes using CSS selectors
quotes = soup.select('.quote')

for quote in quotes:
    text = quote.select_one('.text').get_text()
    author = quote.select_one('.author').get_text()
    print(f'Quote: {text}\nAuthor: {author}\n')

Isn't it almost magical? The .quote class helps us fetch all elements labeled as quotes, while .text and .author are child elements from which we extract the quote's text and the author's name.
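
If you'd rather experiment without a network connection, the same idea can be sketched against the sample markup directly. The version below is only an illustration: it keeps two of the sample blocks in a string, collects each quote into a dictionary (the records name is ours), and also reads the href attribute of the link.

```python
from bs4 import BeautifulSoup

# Two blocks copied from the sample markup, kept inline so no network is needed
html = """
<div class="quote">
    <h2 class="author">Pushkin</h2>
    <p class="text">Oh Pushkin.</p>
    <a href="https://example.com" class="link">Read more</a>
</div>
<div class="quote">
    <h2 class="author">Lenin</h2>
    <p class="text">Learn, learn, and learn again.</p>
    <a href="https://example.com" class="link">Read more</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

records = []
for quote in soup.select(".quote"):
    # Search inside each quote block, not the whole page
    records.append({
        "author": quote.select_one(".author").get_text(strip=True),
        "text": quote.select_one(".text").get_text(strip=True),
        "link": quote.select_one("a.link")["href"],  # attribute access by key
    })

for record in records:
    print(record)
```

Calling select_one() on each quote element rather than on soup keeps every author paired with the right text.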

3. Examples of Searching with CSS Selectors

Let's practice with some examples so your clever brain knows what to do when it sees a div with ten classes. Selectors can be used for more targeted data searches on pages. You can combine them to get exactly what you need.

Selector by Class and Tag

Python

# Find all links in the menu block
menu_links = soup.select('nav.menu a')

for link in menu_links:
    print(link['href'])

Selector by ID

Python

# Extract the main heading of the page
main_heading = soup.select_one('#main-heading')
print(main_heading.text)

Combining Selectors

Python

# Find all paragraphs inside the highlighted section
highlighted_paragraphs = soup.select('.highlighted p')

for paragraph in highlighted_paragraphs:
    print(paragraph.text)
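
The three snippets above assume a soup object that already contains a menu, a main heading, and a highlighted block. Here is a self-contained sketch that puts them together; the markup is hypothetical and exists only so the selectors have something to match.

```python
from bs4 import BeautifulSoup

# Hypothetical page, made up so the selectors have something to match
html = """
<nav class="menu">
    <a href="/home">Home</a>
    <a href="/about">About</a>
</nav>
<a href="/footer">Footer link</a>
<h1 id="main-heading">Welcome</h1>
<div class="highlighted">
    <p>First paragraph.</p>
    <div><p>Nested paragraph.</p></div>
</div>
<p>Plain paragraph.</p>
"""

soup = BeautifulSoup(html, "html.parser")

# Tag + class, then any descendant <a>: the footer link is not included
menu_hrefs = [a["href"] for a in soup.select("nav.menu a")]

# ID selector: at most one element is expected
heading = soup.select_one("#main-heading").get_text()

# A space matches descendants at any depth; ">" matches direct children only
highlighted = [p.get_text() for p in soup.select(".highlighted p")]
direct_only = [p.get_text() for p in soup.select(".highlighted > p")]

print(menu_hrefs)   # ['/home', '/about']
print(heading)      # Welcome
print(highlighted)  # ['First paragraph.', 'Nested paragraph.']
print(direct_only)  # ['First paragraph.']
```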

4. Errors and How to Avoid Them

Your job as a scraper won't always be as easy as a cup of coffee. CSS selectors may fail to find anything when:

  • The page has dynamic content, and the required elements are loaded via JavaScript.
  • You're referencing a selector that doesn't exist (e.g., a typo in the class or ID name).
  • The HTML structure changes, leading to a "horror movie" scene where you can't find your elements.

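To avoid such errors, check that the elements you need are present in the HTML your script actually downloads (not just in the browser after JavaScript has run), double-check your selector syntax, and handle the case where select_one() returns None.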
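
A typo in a selector does not raise an error by itself: select() quietly returns an empty list and select_one() returns None, and the crash comes later when you call a method on that None. One possible way to guard against this is a small helper like the one sketched below. The text_or_default name is our own invention, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup


def text_or_default(node, selector, default=""):
    """Return the stripped text of the first match under node, or default if nothing matches."""
    match = node.select_one(selector)
    return match.get_text(strip=True) if match is not None else default


# Hypothetical markup, made up for this illustration only
html = '<div class="quote"><p class="text">Hello</p></div>'
soup = BeautifulSoup(html, "html.parser")
quote = soup.select_one(".quote")

print(text_or_default(quote, ".text"))              # Hello
print(text_or_default(quote, ".autor", "unknown"))  # unknown (typo in the class name)
print(len(soup.select(".quotes")))                  # 0 (typo: no error, just no matches)
```

Without the guard, quote.select_one(".autor").get_text() would fail with an AttributeError, because there is no get_text() on None.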

Practical Application

Now you have the ability to use CSS selectors in real-world data extraction projects. This skill will come in handy for building tools to analyze and monitor prices, gather news, and even track changes on websites. The beauty of this approach is that purely visual changes to a site's stylesheet don't break your code, because your selectors rely on the HTML structure and its class and ID names, not on the styling itself. If those names or the structure change, the selectors need updating.
