Elements for Dialog Markup

Frontend SELF EN
Level 10 , Lesson 4

1. Basic CSS Concepts for Web Scraping

For successful web scraping, it is key to understand a page's HTML structure and the CSS classes it uses. Knowing how elements are styled and structured lets you select and extract the data you need more accurately. This lesson covers how CSS is linked to HTML, how selectors work, and how the style, class, id, and name attributes help you navigate a page's structure when scraping.

CSS is responsible for styling web pages. For scraping, though, it doubles as a tool for understanding page structure and selecting elements. The key concepts are:

  • Selectors: patterns that match specific HTML elements. They let you pinpoint the data you want.
  • Attributes class, id, and name: labels that help tell elements apart. An id should be unique within a page, a class is usually shared by a group of similar elements, and a name mostly identifies form fields. For scraping, they help isolate the necessary elements, which simplifies data extraction.
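
As a first look at a page, it can help to list which class, id, and name values it actually uses. The sketch below does this with Python's standard html.parser module. The AttributeInventory class and the SAMPLE markup are hypothetical names made up for illustration, not part of the lesson.

```python
from html.parser import HTMLParser


class AttributeInventory(HTMLParser):
    """Record the class, id, and name values seen on each start tag."""

    def __init__(self):
        super().__init__()
        self.found = []  # (tag, attribute, value) tuples in document order

    def handle_starttag(self, tag, attrs):
        for key, value in attrs:
            if key in ("class", "id", "name") and value:
                self.found.append((tag, key, value))


# Hypothetical sample markup, not taken from a real site.
SAMPLE = '<h1 id="title">Phone</h1><p class="price">$99</p><input name="email">'

inventory = AttributeInventory()
inventory.feed(SAMPLE)
inventory.close()
for tag, key, value in inventory.found:
    print(f"<{tag}> {key}={value}")
```

The resulting list gives a rough idea of which selectors are available before any extraction code is written.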

2. Linking CSS to an HTML Document

CSS can be linked to HTML in several ways. Knowing them helps you work out where an element's styles and classes come from, which in turn helps isolate target data.

External File

CSS is often linked as an external file through the <link> tag in the document's <head> section. An external stylesheet can style every page that links to it, and its rules reference the ids and classes used in the markup, which makes it a useful map when scraping.

HTML

<head>
    <link rel="stylesheet" href="styles.css">
</head>
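
A scraper can locate external stylesheets by looking for <link> tags whose rel attribute contains stylesheet. This is a minimal sketch using the standard html.parser module. StylesheetFinder is a hypothetical helper name.

```python
from html.parser import HTMLParser


class StylesheetFinder(HTMLParser):
    """Collect the href of every <link rel="stylesheet"> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # rel may hold several space-separated keywords, in any letter case
        rel = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "stylesheet" in rel and attributes.get("href"):
            self.hrefs.append(attributes["href"])


finder = StylesheetFinder()
finder.feed('<head><link rel="stylesheet" href="styles.css"></head>')
finder.close()
print(finder.hrefs)
```

The collected href values are often relative, so a real scraper would still need to resolve them against the page URL before downloading anything.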
    

Internal Styles

Styles can also be defined within the page itself using the <style> tag, usually placed in the <head>. These rules are a clue to which classes and ids the page uses, and therefore to which selectors will reach the elements you need.

HTML

<head>
<style>
  .price {
    color: red;
  }
</style>
</head>
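
To use a <style> block as a clue, one option is to pull out the class names its rules mention. The sketch below is deliberately naive: it drops the declaration blocks and then looks for .name patterns with a regular expression, so it is not a real CSS parser. StyleCollector and class_names are hypothetical names.

```python
import re
from html.parser import HTMLParser


class StyleCollector(HTMLParser):
    """Gather the raw text found inside <style> elements."""

    def __init__(self):
        super().__init__()
        self.in_style = False
        self.css = []

    def handle_starttag(self, tag, attrs):
        if tag == "style":
            self.in_style = True

    def handle_endtag(self, tag):
        if tag == "style":
            self.in_style = False

    def handle_data(self, data):
        if self.in_style:
            self.css.append(data)


def class_names(css):
    """Return the class names used in selectors (naive, regex-based)."""
    # Remove declaration blocks so values such as url(a.png) are not matched.
    selectors_only = re.sub(r"\{[^}]*\}", " ", css)
    return sorted(set(re.findall(r"\.([A-Za-z_][\w-]*)", selectors_only)))


collector = StyleCollector()
collector.feed("<head><style>.price { color: red; }</style></head>")
collector.close()
found_classes = class_names("".join(collector.css))
print(found_classes)
```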

Inline Styles (attribute style)

Inline styles are written directly in an element's style attribute and affect only that element. When an element has no convenient class or id, a distinctive style value can help identify target data.

HTML

<p style="color: red; font-size: 18px;">Text with inline style</p>
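
To filter on an inline style, it is easier to compare parsed properties than raw strings. The function below is a simple sketch, with the hypothetical name parse_inline_style. It splits on semicolons and colons, so values that themselves contain semicolons would be parsed incorrectly.

```python
def parse_inline_style(style):
    """Turn 'color: red; font-size: 18px;' into a dict of declarations."""
    declarations = {}
    for chunk in style.split(";"):
        if ":" not in chunk:
            continue  # skip empty pieces such as the one after the last ';'
        prop, value = chunk.split(":", 1)
        declarations[prop.strip().lower()] = value.strip()
    return declarations


parsed = parse_inline_style("color: red; font-size: 18px;")
print(parsed)
```

A scraper could then test, for example, whether parsed.get("color") equals "red" instead of relying on exact spacing inside the attribute.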

3. Selectors in CSS

Selectors in CSS are used to apply styles to elements, but for web scraping, their main use is to precisely select elements that contain the data you need. Let's look at the main types of selectors that can be used in web scraping.

Main Types of Selectors

Tag Selector: This selector matches every element with a given tag name (e.g., <p> or <div>). In web scraping, tag selectors are useful for collecting all elements of one kind, such as every paragraph or image on a page.

CSS

p {
  color: blue;
}
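
The scraping counterpart of a tag selector is "collect the text of every <p>". Here is a minimal sketch with the standard html.parser module. TagTextCollector is a hypothetical name, and the code assumes the target tag is not nested inside itself and always has a closing tag.

```python
from html.parser import HTMLParser


class TagTextCollector(HTMLParser):
    """Collect the text content of every element with a given tag name."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.inside = False
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.inside = True
            self.texts.append("")

    def handle_endtag(self, tag):
        if tag == self.tag:
            self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.texts[-1] += data  # also picks up text of child tags like <b>


paragraphs = TagTextCollector("p")
paragraphs.feed(
    "<div><p>First paragraph</p><span>skip me</span>"
    "<p>Second <b>bold</b> paragraph</p></div>"
)
paragraphs.close()
print(paragraphs.texts)
```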
    

Class Selector: This selector matches elements whose class attribute contains a specific class name. A class is written with a period (.) before the name. In web scraping, classes are particularly useful because elements that play the same role, like the items in a list of products, usually share a class.

CSS

.price {
  color: red;
}
HTML

<p class="price">Price: $99</p>
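
A class attribute can hold several space-separated names, so a scraper should match whole names rather than substrings. The sketch below shows one way to do that. ClassTextCollector is a hypothetical name, and the code assumes the matched elements are not nested inside elements with the same tag.

```python
from html.parser import HTMLParser


class ClassTextCollector(HTMLParser):
    """Collect the text of elements whose class list contains class_name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.open_tag = None  # tag name of the element currently captured
        self.texts = []

    def handle_starttag(self, tag, attrs):
        # Split on whitespace so "price sale" matches but "pricey" does not.
        classes = (dict(attrs).get("class") or "").split()
        if self.open_tag is None and self.class_name in classes:
            self.open_tag = tag
            self.texts.append("")

    def handle_endtag(self, tag):
        if tag == self.open_tag:
            self.open_tag = None

    def handle_data(self, data):
        if self.open_tag is not None:
            self.texts[-1] += data


prices = ClassTextCollector("price")
prices.feed(
    '<p class="price">Price: $99</p>'
    '<p class="price sale">Price: $89</p>'
    '<p class="pricey">Not a price</p>'
)
prices.close()
print(prices.texts)
```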

ID Selector: This selector chooses an element with a unique id attribute, marked by the # symbol. In web scraping, id is especially useful for selecting unique elements, such as a headline or a button on the page.

CSS

#product-title {
  font-size: 24px;
}
    
HTML

<h1 id="product-title">Product Name</h1>
    

Attribute Selectors: These selectors pick elements based on specific attributes like name, type, and more. In web scraping, attribute selectors are useful for selecting form elements or specific fields, for instance, selecting fields with a particular name.

CSS

input[name="email"] {
  border: 2px solid blue;
}
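
The same idea as input[name="email"] can be sketched in Python by comparing a start tag's attributes against the wanted value. AttributeMatcher and the FORM markup are hypothetical, and only exact matches of the form tag[attribute="value"] are covered.

```python
from html.parser import HTMLParser


class AttributeMatcher(HTMLParser):
    """Collect start tags that match tag[attribute="value"]."""

    def __init__(self, tag, attribute, value):
        super().__init__()
        self.tag = tag
        self.attribute = attribute
        self.value = value
        self.matches = []  # one dict of attributes per matching element

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == self.tag and attributes.get(self.attribute) == self.value:
            self.matches.append(attributes)


# Hypothetical form markup used only for this example.
FORM = (
    "<form>"
    '<input type="text" name="username">'
    '<input type="email" name="email" value="user@example.com">'
    "</form>"
)

matcher = AttributeMatcher("input", "name", "email")
matcher.feed(FORM)
matcher.close()
print(matcher.matches)
```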
    

Combined Selectors: These selectors let you pick elements precisely by combining multiple criteria. For example, .product-list .price matches only elements with the class price that sit inside an element with the class product-list, so prices elsewhere on the page are ignored.
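
The descendant rule behind .product-list .price can be approximated by keeping a stack of open elements and checking whether any ancestor carries the container class. This is a rough sketch that assumes well-formed markup with explicit closing tags. DescendantFinder is a hypothetical name, and VOID_TAGS is only a partial list of elements that have no closing tag.

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}  # partial list


class DescendantFinder(HTMLParser):
    """Collect text of .target_class elements inside .ancestor_class."""

    def __init__(self, ancestor_class, target_class):
        super().__init__()
        self.ancestor_class = ancestor_class
        self.target_class = target_class
        self.stack = []  # (tag, classes) for each currently open element
        self.capture_depth = None  # stack depth of the captured element
        self.results = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return  # never closed, so keep them off the stack
        classes = set((dict(attrs).get("class") or "").split())
        inside = any(self.ancestor_class in c for _, c in self.stack)
        self.stack.append((tag, classes))
        if self.capture_depth is None and inside and self.target_class in classes:
            self.capture_depth = len(self.stack)
            self.results.append("")

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1][0] == tag:
            if self.capture_depth == len(self.stack):
                self.capture_depth = None
            self.stack.pop()

    def handle_data(self, data):
        if self.capture_depth is not None:
            self.results[-1] += data


finder = DescendantFinder("product-list", "price")
finder.feed(
    '<div class="product-list">'
    '<p class="price">$99</p><p class="price">$89</p>'
    "</div>"
    '<p class="price">$5 shipping</p>'
)
finder.close()
print(finder.results)
```

The last paragraph has the class price but no product-list ancestor, so it is skipped.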

You'll learn more about attribute and combined selectors in the upcoming lectures.

4. Attributes style, class, id and name

Attribute style

The style attribute adds inline styles to an element. It can help distinguish elements that are hard to identify by other attributes, so in web scraping it works as an additional filter for finding specific elements on a page.

HTML

<p style="color: red; font-size: 18px;">This text is highlighted with inline style</p>
    

Attribute class

The class attribute labels a group of elements with the same styles, such as products, prices, or descriptions. When scraping, class helps select a group of elements with the same visual structure, making bulk data extraction easier.

HTML

<p class="price">Price: $99</p>
<p class="price">Price: $89</p>
    
CSS

.price {
  color: red;
}
    

Attribute id

The id attribute should be unique within a document: no two elements on the same page are supposed to share an id. That makes it valuable for extracting one-of-a-kind data. For example, a product title may have its own id, and that identifier can be used to select exactly that title.

HTML

<h1 id="main-title">Product Name</h1>
    
CSS

#main-title {
  font-size: 30px;
}
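
Real pages do not always keep ids unique, so a cautious scraper may want to check before relying on one. The sketch below counts id values with collections.Counter. IdCounter and duplicate_ids are hypothetical names, and the sample markup is made up.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCounter(HTMLParser):
    """Count how many times each id value appears in a document."""

    def __init__(self):
        super().__init__()
        self.ids = Counter()

    def handle_starttag(self, tag, attrs):
        element_id = dict(attrs).get("id")
        if element_id:
            self.ids[element_id] += 1


def duplicate_ids(html):
    """Return the id values that occur more than once, sorted."""
    counter = IdCounter()
    counter.feed(html)
    counter.close()
    return sorted(i for i, n in counter.ids.items() if n > 1)


repeated = duplicate_ids(
    '<h1 id="main-title">A</h1><p id="note">B</p><p id="note">C</p>'
)
print(repeated)
```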
    

Attribute name

The name attribute is often used in form elements and can be applied for precise selection of input fields, such as fields for email or phone number. For web scraping, name is helpful when extracting data from forms.

HTML

<input type="text" name="username" placeholder="Enter your username">
    
CSS

input[name="username"] {
  border: 1px solid #333;
}
    

5. Example of a Page Using CSS and HTML

Below is an example HTML document, followed by CSS rules, that uses the attributes and selectors covered above to mark up elements a scraper might target.

HTML

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Web Scraping Example Page</title>
  <link rel="stylesheet" href="styles.css">
  <style>
    .price {
      color: red;
      font-weight: bold;
    }
  </style>
</head>
<body>
  <h1 id="main-title">Product of the Week</h1>
  <p class="price">Price: $99</p>
  <p class="description">This is a unique product with excellent features.</p>
  
  <form action="/submit" method="post">
    <label for="username">Username:</label>
    <input type="text" id="username" name="username">
    
    <label for="email">Email:</label>
    <input type="email" id="email" name="email">
    
    <button type="submit">Submit</button>
  </form>
</body>
</html>
    
CSS

#main-title {
  font-size: 24px;
  color: green;
}

input[name="username"] {
  border: 1px solid #333;
  padding: 5px;
}
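
Putting the pieces together, a scraper for the example page could read the title by id, the price by class, and the form fields by name. This is one possible sketch with the standard html.parser module, not the only way to do it. ProductPageScraper is a hypothetical name, PAGE is a shortened copy of the body above, and the code assumes the flat structure shown there.

```python
from html.parser import HTMLParser

# Shortened copy of the example page body.
PAGE = """
<h1 id="main-title">Product of the Week</h1>
<p class="price">Price: $99</p>
<p class="description">This is a unique product with excellent features.</p>
<form action="/submit" method="post">
  <input type="text" id="username" name="username">
  <input type="email" id="email" name="email">
</form>
"""


class ProductPageScraper(HTMLParser):
    """Extract the title (by id), price (by class), and field names."""

    def __init__(self):
        super().__init__()
        self.data = {"title": "", "price": "", "fields": []}
        self.current = None  # key in self.data that receives text, if any

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if attributes.get("id") == "main-title":
            self.current = "title"
        elif "price" in classes:
            self.current = "price"
        elif tag == "input" and attributes.get("name"):
            self.data["fields"].append(attributes["name"])

    def handle_endtag(self, tag):
        self.current = None  # fine for this flat page; nested markup needs more care

    def handle_data(self, data):
        if self.current:
            self.data[self.current] += data.strip()


scraper = ProductPageScraper()
scraper.feed(PAGE)
scraper.close()
print(scraper.data)
```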
    