Introduction to XPath and CSS Selectors

Python SELF KO
Level 35, Lesson 4

1. What Are XPath and CSS Selectors?

Today we take one more step into the world of web automation with Selenium and learn about XPath and CSS selectors. These small but powerful tools will be really handy whenever you need to find elements on a web page. Let's dive into the world of selectors and see how to use them for precise, efficient searches.

If an HTML document is a dense forest, XPath and CSS selectors are your map. With them you can find your way to the tree you need (or, in programming terms, the element), even in the middle of a pile of HTML and countless tags.

XPath

XPath (XML Path Language) is a language for navigating XML documents. HTML is not XML, but the browser parses it into the same kind of element tree, so XPath works on HTML pages just as well. An XPath expression reaches an element by following the path you specify through that tree.

CSS Selector

CSS selectors come from the world of Cascading Style Sheets. Don't worry, you don't have to become a designer to learn them! They let you pick elements precisely by type, class, id, state, and so on. It's simpler than it sounds, and you may already have used them without noticing while styling a web page.

2. Using XPath and CSS Selectors in Code

Now that we know what selectors are, let's get straight to practice and see how to work some magic with them in Selenium!

Using XPath

Here's an example of finding an element with XPath in Selenium:

Python

from selenium import webdriver
from selenium.webdriver.common.by import By

# Set up the driver
driver = webdriver.Chrome()

# Open the page
driver.get('https://example.com')

# Find the element by XPath
element = driver.find_element(By.XPATH, '//div[@id="menu"]/ul/li/a')

# Print the element's text
print(element.text)

# Close the browser
driver.quit()

Explanation:

  • //div[@id="menu"]/ul/li/a: this is our XPath. Read it as: "Driver, find the a element inside an li, inside a ul, inside the div with id="menu"." Since find_element returns only the first match, you get the first such link.
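If you want to try this path without launching a browser, here is a minimal sketch using Python's standard xml.etree.ElementTree module. Two assumptions to keep in mind: the page fragment below is made up for illustration, and ElementTree supports only a subset of XPath, so paths must start from the current element (hence the leading dot). Unlike find_element, findall returns every match.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment, written as well-formed XML so ElementTree can parse it.
PAGE = """
<html>
  <body>
    <div id="menu">
      <ul>
        <li><a href="/home">Home</a></li>
        <li><a href="/about">About</a></li>
      </ul>
    </div>
    <div id="footer">
      <ul>
        <li><a href="/contact">Contact</a></li>
      </ul>
    </div>
  </body>
</html>
""".strip()

root = ET.fromstring(PAGE)

# Same path as in the Selenium example, made relative to the root element.
links = root.findall(".//div[@id='menu']/ul/li/a")
menu_texts = [link.text for link in links]

print(menu_texts)  # ['Home', 'About']
```

The link in the footer is skipped because the path only descends from the div whose id is "menu".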

Using CSS Selectors

Now let's see how to do the same thing with a CSS selector:

Python

from selenium import webdriver
from selenium.webdriver.common.by import By

# Set up the driver
driver = webdriver.Chrome()

# Open the page
driver.get('https://example.com')

# Find the element by CSS selector
element = driver.find_element(By.CSS_SELECTOR, 'div#menu > ul > li > a')

# Print the element's text
print(element.text)

# Close the browser
driver.quit()

Explanation:

  • div#menu > ul > li > a: this is our CSS selector. It finds the same element as the XPath above, but with a more compact syntax. The # matches an id, and > means "direct child of".

3. Differences Between XPath and CSS Selectors

So what's the difference between XPath and CSS selectors? Good question! Let's figure out when to use which.

Flexibility vs. simplicity

XPath is more flexible. It can move "up" the DOM tree (from an element to its parent) and combine complex logical conditions, which makes it the better fit for complicated queries. On the other hand, the simplicity and brevity of CSS selectors are often preferable. CSS selectors are easy to read and write, especially when you need to find an element by class or id.
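Moving "up" the tree is easier to see on an example. The sketch below again uses the standard xml.etree.ElementTree module on a made-up product table (the sold-out class is an assumption for illustration). It finds every sold-out cell and then steps up to the row that contains it using the ".." parent step. In Selenium the equivalent full XPath would be //td[@class="sold-out"]/..

```python
import xml.etree.ElementTree as ET

# Hypothetical table; the "sold-out" class is invented for this example.
TABLE = """
<table class="product-table">
  <tbody>
    <tr><td class="product-name">Keyboard</td><td class="sold-out">Sold out</td></tr>
    <tr><td class="product-name">Mouse</td><td class="product-price">$15</td></tr>
    <tr><td class="product-name">Monitor</td><td class="sold-out">Sold out</td></tr>
  </tbody>
</table>
""".strip()

root = ET.fromstring(TABLE)

# Find each sold-out cell, then step UP to its parent row with "..".
sold_out_rows = root.findall(".//td[@class='sold-out']/..")

# From each row, go back down to the name cell.
sold_out_names = [
    row.find("td[@class='product-name']").text for row in sold_out_rows
]

print(sold_out_names)  # ['Keyboard', 'Monitor']
```

The CSS selectors Selenium works with have traditionally been strongest at descending the tree, which is why "find the row that contains this cell" is usually written in XPath.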

Function support

XPath supports functions, from checking text to working with attributes: for example contains() and starts-with(). CSS selectors can do simple substring matching on attributes, but they have no functions and cannot match an element by its text.
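To make the comparison concrete, here is a small side-by-side list of locators, written as plain strings. The tag names and values (a, btn, https, Buy) are just placeholders. Where a CSS equivalent exists it is shown; where it doesn't, the entry is None.

```python
# Each entry: (what we want, XPath expression, CSS selector or None).
LOCATOR_PAIRS = [
    (
        "class attribute contains 'btn'",
        "//a[contains(@class, 'btn')]",
        "a[class*='btn']",
    ),
    (
        "href attribute starts with 'https'",
        "//a[starts-with(@href, 'https')]",
        "a[href^='https']",
    ),
    (
        "text content contains 'Buy'",
        "//a[contains(., 'Buy')]",
        None,  # CSS selectors cannot match on text content
    ),
]

for goal, xpath, css in LOCATOR_PAIRS:
    print(f"{goal}\n  XPath: {xpath}\n  CSS:   {css or 'no equivalent'}")
```

Either string can be passed to find_element with By.XPATH or By.CSS_SELECTOR respectively.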

Speed

In some scenarios CSS selectors work faster. Browsers are heavily optimized for matching CSS selectors, which is why they are the usual default choice for simple lookups.

Syntax

XPath has a more complex syntax, which can be both a strength and a weakness. CSS selectors are easier to write and to learn.

4. Practical Application

Now let's apply what we've learned to a real task. Say you have a web page with a product table and you need to collect every product name and price. Here's how you could do it:

Example using XPath

Python

from selenium import webdriver
from selenium.webdriver.common.by import By

# Set up the driver
driver = webdriver.Chrome()

# Open the page
driver.get('https://example.com/products')

# Find all product rows
products = driver.find_elements(By.XPATH, '//table[@class="product-table"]/tbody/tr')

# Extract the data for each product (the leading dot searches inside the row)
for product in products:
    name = product.find_element(By.XPATH, './/td[@class="product-name"]').text
    price = product.find_element(By.XPATH, './/td[@class="product-price"]').text
    print(f"Product: {name}, Price: {price}")

# Close the browser
driver.quit()

Example using CSS selectors

Python

from selenium import webdriver
from selenium.webdriver.common.by import By

# Set up the driver
driver = webdriver.Chrome()

# Open the page
driver.get('https://example.com/products')

# Find all product rows
products = driver.find_elements(By.CSS_SELECTOR, 'table.product-table > tbody > tr')

# Extract the data for each product
for product in products:
    name = product.find_element(By.CSS_SELECTOR, 'td.product-name').text
    price = product.find_element(By.CSS_SELECTOR, 'td.product-price').text
    print(f"Product: {name}, Price: {price}")

# Close the browser
driver.quit()

5. Quirks and Common Mistakes

There are a few traps to watch for when using XPath and CSS selectors. For example, if you use an absolute XPath, even a small change in the HTML structure can break your script. So try to always use relative paths to stay flexible.
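Here is a minimal sketch of that fragility, once more with the standard xml.etree.ElementTree module so it runs without a browser. Both page fragments are invented: the second is the same page after an imaginary redesign wrapped the menu in a header element. In a real browser the absolute path would be written /html/body/div/ul/li/a.

```python
import xml.etree.ElementTree as ET

# Hypothetical page before the redesign.
ORIGINAL = """
<html><body>
  <div id="menu"><ul><li><a>Home</a></li></ul></div>
</body></html>
""".strip()

# The same page after the menu was wrapped in an extra <header> element.
REDESIGNED = """
<html><body>
  <header><div id="menu"><ul><li><a>Home</a></li></ul></div></header>
</body></html>
""".strip()

ABSOLUTE = "./body/div/ul/li/a"      # spells out every step from the root
RELATIVE = ".//div[@id='menu']//a"   # anchors on a stable attribute instead


def first_text(page, path):
    """Return the text of the first element matching path, or None if nothing matches."""
    element = ET.fromstring(page).find(path)
    return element.text if element is not None else None


results = {
    "absolute, original": first_text(ORIGINAL, ABSOLUTE),
    "absolute, redesigned": first_text(REDESIGNED, ABSOLUTE),
    "relative, original": first_text(ORIGINAL, RELATIVE),
    "relative, redesigned": first_text(REDESIGNED, RELATIVE),
}

for label, text in results.items():
    print(f"{label}: {text}")
```

Only the absolute path on the redesigned page comes back empty; the relative path keeps working because it does not care what sits between the root and the menu.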

CSS selectors, for their part, can become hard to read when they get too complicated. It's important to strike a balance between precision and simplicity.

One more thing worth mentioning is error handling. If an element isn't found, Selenium raises NoSuchElementException. Handle it with a try-except block, or use an explicit wait such as WebDriverWait, to make your scripts more robust.
