Python Selenium: finding child elements in a loop

Question

Kon*_*nov 2 python selenium loops for-loop web-scraping

I need to parse some child elements out of every parent element on the page.

First I build a list of all the articles on the page:

article_elements = driver.find_elements_by_tag_name('article')

Then I try to get the child elements in a for loop and append each result to a list:

for article in article_elements:
    title = article.find_element_by_xpath('//article/h2').text
    share_count = article.find_element_by_xpath('//footer/div/a/span').text
    points = article.find_element_by_xpath('//footer/div[2]/div[1]/div[3]').text
    meta_info_list.append({'title':title, 'share count':share_count, 'points':points})

After the loop finishes, I get the metadata of the same article (the first one) 40 times:

{'share count': u'66', 'points': u'53 points', 'title': u'25+ Random Acts Of Genius Vandalism'}
{'share count': u'66', 'points': u'53 points', 'title': u'25+ Random Acts Of Genius Vandalism'}
{'share count': u'66', 'points': u'53 points', 'title': u'25+ Random Acts Of Genius Vandalism'}
{'share count': u'66', 'points': u'53 points', 'title': u'25+ Random Acts Of Genius Vandalism'}
... 40 times

My whole code:

# coding: utf8
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time

driver = webdriver.Chrome()
driver.set_window_size(1024,768)
driver.get('http://www.boredpanda.com/')

time.sleep(2)

meta_info_list = []

article_elements = driver.find_elements_by_tag_name('article')

for article in article_elements:
    title = article.find_element_by_xpath('//article/h2').text
    share_count = article.find_element_by_xpath('//footer/div/a/span').text
    points = article.find_element_by_xpath('//footer/div[2]/div[1]/div[3]').text
    meta_info_list.append({'title':title, 'share count':share_count, 'points':points})

# Print one dict per article (avoid naming the loop variable `list`, which shadows the built-in)
for meta in meta_info_list:
    print(meta)

Answer 1

ale*_*cxe 6

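The XPath expressions inside the loop must start with a dot so that they are evaluated relative to the current article element. An expression that starts with // searches the whole document, no matter which element you call it on, so every iteration returns the first match on the page: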

for article in article_elements:
    # `article` is already the <article> node, so its heading is the direct child ./h2
    # (.//article/h2 would look for an <article> nested inside this one)
    title = article.find_element_by_xpath('./h2').text
    share_count = article.find_element_by_xpath('.//footer/div/a/span').text
    points = article.find_element_by_xpath('.//footer/div[2]/div[1]/div[3]').text
    meta_info_list.append({'title':title, 'share count':share_count, 'points':points})
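
To see the effect of a context-relative path without launching a browser, here is a minimal sketch that uses the standard library's xml.etree.ElementTree as a stand-in for Selenium. The markup in PAGE is made up for illustration and is not the real page. ElementTree implements only a subset of XPath, so this shows just the relative (dot-prefixed) form, queried once per article:

```python
import xml.etree.ElementTree as ET

# Hypothetical markup standing in for the real page: two articles with different content.
PAGE = """
<body>
  <article><h2>First title</h2><footer><div><a><span>66</span></a></div></footer></article>
  <article><h2>Second title</h2><footer><div><a><span>12</span></a></div></footer></article>
</body>
""".strip()

root = ET.fromstring(PAGE)

rows = []
for article in root.iter("article"):
    # Both paths start with a dot, so they are resolved inside this article only.
    rows.append({
        "title": article.findtext("./h2"),
        "share count": article.findtext(".//footer/div/a/span"),
    })

for row in rows:
    print(row)
```

Each article yields its own title and share count, which is the behavior the dot-prefixed expressions are meant to give in the Selenium loop as well.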

As a side note, you can shorten the code with a list comprehension:

meta_info_list = [{
    # ./h2 is the heading directly under the current <article>
    'title': article.find_element_by_xpath('./h2').text,
    'share count': article.find_element_by_xpath('.//footer/div/a/span').text,
    'points': article.find_element_by_xpath('.//footer/div[2]/div[1]/div[3]').text
} for article in article_elements]
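
Selenium 4 deprecated the find_element_by_* helpers and later 4.x releases removed them, so on a current install the same idea would be written with find_element(By.XPATH, ...). Below is a hedged sketch of that form. The names XPATH, FIELDS and collect_meta are hypothetical and chosen for illustration. To keep the snippet importable without Selenium installed, it passes the literal string "xpath", which is the value of By.XPATH; a real script would import By and use By.XPATH instead.

```python
# "xpath" is the string value of selenium.webdriver.common.by.By.XPATH.
# Assumption for this sketch: a real script would import By and pass By.XPATH.
XPATH = "xpath"

# Paths relative to each <article> element; the leading dot keeps the search
# inside the element that find_element is called on.
FIELDS = {
    "title": "./h2",
    "share count": ".//footer/div/a/span",
    "points": ".//footer/div[2]/div[1]/div[3]",
}


def collect_meta(articles):
    """Return one dict per article, querying each article's own subtree."""
    return [
        {name: article.find_element(XPATH, path).text for name, path in FIELDS.items()}
        for article in articles
    ]
```

With a live driver this might be called as collect_meta(driver.find_elements(By.TAG_NAME, 'article')), after importing By from selenium.webdriver.common.by.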

Archived:	9 years, 10 months ago
Views:	3,645
Last recorded:	9 years, 10 months ago