The*_*heo 3 python beautifulsoup
I have this tag:
<span class="companyName">Actua Corp <acronym title="Central Index Key">CIK</acronym>#: <a href="/cgi-bin/browse-edgar?action=getcompany&CIK=0001085621&owner=include&count=40">0001085621 (see all company filings)</a></span>
How can I get the value inside <span class="companyName">? In this case, that is Actua Corp.
I am open to any approach.
If you just want Actua Corp, you can use next:
r = '<span class="companyName">Actua Corp <acronym title="Central Index Key">CIK</acronym>#: <a href="/cgi-bin/browse-edgar?action=getcompany&CIK=0001085621&owner=include&count=40">0001085621 (see all company filings)</a></span>'
from bs4 import BeautifulSoup
soup = BeautifulSoup(r, 'html.parser')  # name the parser explicitly to avoid the "no parser specified" warning
span = soup.find('span', {'class': 'companyName'})
print(span.next)
>>> Actua Corp
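Note that the printed string keeps the trailing space from the markup, and that this approach depends on the company name being the very first node inside the span. A slightly more defensive variant, offered only as a sketch, asks for the first text node that is a direct child of the span and strips it. The helper name `company_name` is made up for illustration, and the `string=` argument assumes a reasonably recent bs4 release.

```python
from bs4 import BeautifulSoup

html = (
    '<span class="companyName">Actua Corp '
    '<acronym title="Central Index Key">CIK</acronym>#: '
    '<a href="/cgi-bin/browse-edgar?action=getcompany&CIK=0001085621'
    '&owner=include&count=40">0001085621 (see all company filings)</a></span>'
)


def company_name(markup):
    """Hypothetical helper: return the span's own leading text, or None."""
    soup = BeautifulSoup(markup, "html.parser")
    span = soup.find("span", class_="companyName")
    if span is None:
        return None
    # string=True matches text nodes; recursive=False restricts the search
    # to direct children, so text inside <acronym> and <a> is ignored.
    first_text = span.find(string=True, recursive=False)
    return first_text.strip() if first_text else None


name = company_name(html)
print(name)
```

Returning None when the span is missing keeps the caller from hitting an AttributeError on pages that lack the tag.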
If you want all of the text inside the span, you can use text:
print(span.text)
>>> Actua Corp CIK#: 0001085621 (see all company filings)
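The text property concatenates every descendant string exactly as it appears in the markup. If the whitespace matters, get_text also accepts a separator and a strip flag. The sketch below compares the two forms; the variable names `raw` and `tidy` are illustrative only.

```python
from bs4 import BeautifulSoup

html = (
    '<span class="companyName">Actua Corp '
    '<acronym title="Central Index Key">CIK</acronym>#: '
    '<a href="/cgi-bin/browse-edgar?action=getcompany&CIK=0001085621'
    '&owner=include&count=40">0001085621 (see all company filings)</a></span>'
)

span = BeautifulSoup(html, "html.parser").find("span", class_="companyName")

# Same result as span.text: descendant strings joined with nothing in between.
raw = span.get_text()

# Strip each descendant string, then join the pieces with a single space.
tidy = span.get_text(" ", strip=True)

print(repr(raw))
print(repr(tidy))
```

Because the separator is placed between every pair of strings, the second form puts a space between "CIK" and "#:", which the original markup does not have.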