Display the text inside a tag with BeautifulSoup

Question

R K*_*R K 9 python beautifulsoup web-scraping python-3.x

I am trying to display only the text inside a tag, for example:

<span class="listing-row__price ">$71,996</span>

I only want to display

"$71,996"

My code is:

import requests
from bs4 import BeautifulSoup
from csv import writer

response = requests.get('https://www.cars.com/for-sale/searchresults.action/?mdId=21811&mkId=20024&page=1&perPage=100&rd=99999&searchSource=PAGINATION&showMore=false&sort=relevance&stkTypId=28880&zc=11209')

soup = BeautifulSoup(response.text, 'html.parser')

cars = soup.find_all('span', attrs={'class': 'listing-row__price'})
print(cars)

How do I extract the text from the tags?

Answer 1

Bit*_*han 9

There are several ways to get the text inside a tag.

a) Use the tag's .text attribute.

cars = soup.find_all('span', attrs={'class': 'listing-row__price'})
for tag in cars:
    print(tag.text.strip())

Output:

$71,996
$75,831
$71,412
$75,476
....
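
Because the live page may change or become unreachable, the same approach can be tried offline. The following is only a minimal sketch: the HTML string is hypothetical sample markup modelled on the span from the question, not the actual page source.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup modelled on the span in the question;
# the real page may be structured differently.
html = """
<div>
  <span class="listing-row__price ">$71,996</span>
  <span class="listing-row__price ">
    $75,831
  </span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# class is a multi-valued attribute, so the trailing space in the
# attribute value does not prevent the match.
tags = soup.find_all("span", attrs={"class": "listing-row__price"})

# .text returns the text of each tag; strip() removes surrounding whitespace.
prices = [tag.text.strip() for tag in tags]
print(prices)
```

Collecting the stripped strings into a list, rather than printing inside the loop, keeps them available for later steps such as writing a CSV file.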

b) Use get_text()

for tag in cars:
    print(tag.get_text().strip())
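
get_text() also accepts a separator and a strip flag, which can replace the manual .strip() call when a tag contains nested elements or stray whitespace. A small sketch on hypothetical markup (the HTML below is an illustration, not taken from the page in the question):

```python
from bs4 import BeautifulSoup

# Hypothetical markup with stray whitespace around and inside the tag.
html = "<p>  hello <b> there </b>  </p>"
p = BeautifulSoup(html, "html.parser").p

# No arguments: the strings are concatenated exactly as they appear.
raw = p.get_text()

# Separator plus strip=True: each string is stripped, empty strings are
# dropped, and the remaining pieces are joined with the separator.
clean = p.get_text(" ", strip=True)

print(repr(raw))
print(repr(clean))
```

With strip=True every piece of text is stripped individually, so whitespace between nested elements is normalised too, which a single trailing .strip() would not do.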

c) If that string is the only thing inside the tag, you can also use any of these options:

.string
.contents[0]
next(tag.children)
next(tag.strings)
next(tag.stripped_strings)

i.e.

for tag in cars:
    print(tag.string.strip()) #or uncomment any of the below lines
    #print(tag.contents[0].strip())
    #print(next(tag.children).strip())
    #print(next(tag.strings).strip())
    #print(next(tag.stripped_strings))

Output:

$71,996
$75,831
$71,412
$75,476
$77,001
...

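Note:

.text and .string are not the same. If the tag has more than one child (for example text plus a nested element), .string returns None, whereas .text returns all of the text inside the tag, including the text of nested elements.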

from bs4 import BeautifulSoup
html="""
<p>hello <b>there</b></p>
"""
soup = BeautifulSoup(html, 'html.parser')
p = soup.find('p')
print(p.string)
print(p.text)

Output:

None
hello there
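
One related case is worth checking: when a tag has exactly one child and that child is itself a tag, .string looks through to the child's string instead of returning None. The sketch below uses hypothetical markup to compare the three situations.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: a single nested child, mixed content, and an empty tag.
html = "<p><b>there</b></p><p>hello <b>there</b></p><p></p>"
soup = BeautifulSoup(html, "html.parser")
single_child, mixed, empty = soup.find_all("p")

# Exactly one child: .string recurses into <b> and returns its string.
print(single_child.string)

# Two children (a string and a <b> tag): .string is None.
print(mixed.string)

# No children at all: .string is also None, while .text is an empty string.
print(empty.string, repr(empty.text))
```

Since .string can be None, calling .strip() on it directly raises AttributeError for tags with mixed or empty content; .text or get_text() is the safer choice when the structure is not guaranteed.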

Archived:	6 years, 6 months ago
Views:	8358
Last activity:	4 years, 4 months ago