Selenium: finding elements by class name with two arguments

Vin*_*nce 0 python web-scraping python-3.x selenium-webdriver

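How can I find elements by class name without the output repeating? I have two classes to scrape, hdrlnk and result-price. The code I wrote looks like this: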

x = driver.find_elements_by_class_name(['hdrlnk','result-price'])

It gives me an error. I also tried this other code:

x = driver.find_elements_by_class_name('hdrlnk')
y = driver.find_elements_by_class_name('result-price')
for xs in x:
    for ys in y:
        print(xs.text + ys.text)   

But I got a result like this:

sony 5 disc cd changer$40
sony 5 disc cd changer$70
sony 5 disc cd changer$70
sony 5 disc cd changer$190
sony 5 disc cd changer$190
sony 5 disc cd changer$190
sony 5 disc cd changer$190
sony 5 disc cd changer$10

Part of the HTML structure I am trying to scrape:

<p class="result-info">
    <span class="icon icon-star" role="button" title="save this post in your favorites list">
        <span class="screen-reader-text">favorite this post</span>
    </span>
    <time class="result-date" datetime="2019-11-07 18:20" title="Thu 07 Nov 06:20:56 PM">Nov  7</time>
    <a href="https://vancouver.craigslist.org/rch/ele/d/chandeliers/7015824686.html" data-id="7015824686" class="result-title hdrlnk">CHANDELIERS</a>
    <span class="result-meta">
        <span class="result-price">$800</span>
        <span class="result-hood"> (Richmond)</span>
        <span class="result-tags">
            <span class="pictag">pic</span>
        </span>
        <span class="banish icon icon-trash" role="button">
            <span class="screen-reader-text">hide this posting</span>
        </span>
        <span class="unbanish icon icon-trash red" role="button" aria-hidden="true"></span>
        <a href="#" class="restore-link">
            <span class="restore-narrow-text">restore</span>
            <span class="restore-wide-text">restore this posting</span>
        </a>
    </span>
</p>

The first element is repeated, but I get the correct values for the second element. How do I fix this?

Jef*_*ffC 5

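.find_elements_by_class_name() accepts only a single class name, so passing a list fails. I suggest using a CSS selector for this instead. In the HTML you posted, the price is not nested inside the .hdrlnk link, so select it through the row container, e.g. .result-info .result-price. The code would look like

prices = driver.find_elements_by_css_selector('.result-info .result-price')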

That gives you all the price elements. If you also want the titles, you will have to write a bit more code.

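for heading in driver.find_elements_by_css_selector('.hdrlnk'):
    print(heading.text)
    # following-sibling keeps the search inside this listing; a bare
    # following:: axis would match every later price on the page
    xpath = './following-sibling::span[@class="result-meta"]/span[@class="result-price"]'
    for price in heading.find_elements_by_xpath(xpath):
        print('  ' + price.text)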
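Another way to keep each title with its own price is to loop over the listing rows and search inside each row, rather than collecting two separate lists. The sketch below assumes every listing sits in a .result-info element, as in the HTML from the question. The helper name scrape_listings and the CSS constant are my own, not part of Selenium. It uses the generic find_elements(by, value) call, since the find_elements_by_* helpers are no longer available in recent Selenium 4 releases.

```python
# "css selector" is the string value of selenium's By.CSS_SELECTOR; it is
# spelled out here so the sketch can be exercised without a browser.
CSS = "css selector"


def scrape_listings(root):
    """Return one (title, price) tuple per listing row under `root`.

    `root` is anything exposing Selenium's find_elements(by, value),
    e.g. a WebDriver. price is None when a row has no price element.
    """
    listings = []
    for row in root.find_elements(CSS, ".result-info"):
        titles = row.find_elements(CSS, ".hdrlnk")
        if not titles:
            continue  # not a listing row, skip it
        prices = row.find_elements(CSS, ".result-price")
        price = prices[0].text if prices else None
        listings.append((titles[0].text, price))
    return listings


# Usage with a real browser (assumes `driver` already has the results page open):
# for title, price in scrape_listings(driver):
#     print(title, price or "")
```

Because the price lookup is scoped to one row at a time, a listing without a price simply yields None instead of shifting every later price onto the wrong title.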

For all the options for locating elements, see the documentation.

CSS selector references:
W3C reference
Selenium Tips: CSS Selectors
Taming Advanced CSS Selectors