sun*_*ire 5 python beautifulsoup web-scraping python-3.x python-requests
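I'm doing web scraping, but I'm confused about find() and find_all().
For example, when should I use find_all() and when should I use find()?
Also, where can I use these methods, for example inside a for loop or on a ul/li list?
Here is the code I tried: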
from bs4 import BeautifulSoup
import requests
urls = "https://www.flipkart.com/offers-list/latest-launches?screen=dynamic&pk=themeViews%3DAug19-Latest-launch-Phones%3ADTDealcard~widgetType%3DdealCard~contentType%3Dneo&wid=7.dealCard.OMU_5&otracker=hp_omu_Latest%2BLaunches_5&otracker1=hp_omu_WHITELISTED_neo%2Fmerchandising_Latest%2BLaunches_NA_wc_view-all_5"
source = requests.get(urls)
soup = BeautifulSoup(source.content, 'html.parser')
divs = soup.find_all('div', class_='MDGhAp')
names = divs.find_all('a')
full_name = names.find_all('div', class_='iUmrbN').text
print(full_name)
and I get this error:
File "C:/Users/ASUS/Desktop/utube/sunil.py", line 9, in <module>
names = divs.find_all('a')
File "C:\Users\ASUS\AppData\Local\Programs\Python\Python38-32\lib\site-packages\bs4\element.py", line 1601, in __getattr__
raise AttributeError(
AttributeError: ResultSet object has no attribute 'find_all'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?
So can anyone explain where I should use find() and where find_all()?
Perhaps this example makes it clearer:
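from bs4 import BeautifulSoup
html = """
<ul>
<li>First</li>
<li>Second</li>
<li>Third</li>
</ul>
"""
soup = BeautifulSoup(html, 'html.parser')
# find() returns only the first matching tag; looping over that single tag
# iterates its children, which here is just the text "First"
for n in soup.find('li'):
    print(n)
# find_all() returns every matching tag, so this loop visits all three <li>
for n in soup.find_all('li'):
    print(n)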
Output:
First
<li>First</li>
<li>Second</li>
<li>Third</li>
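The output above follows from the return types: find() gives back a single Tag (or None when nothing matches), while find_all() gives back a ResultSet, a list-like collection of tags. The sketch below reuses the same HTML to show the difference; the variable names first, all_items and missing are only illustrative.

```python
from bs4 import BeautifulSoup

html = """
<ul>
<li>First</li>
<li>Second</li>
<li>Third</li>
</ul>
"""
soup = BeautifulSoup(html, 'html.parser')

first = soup.find('li')          # a single Tag: the first <li>
all_items = soup.find_all('li')  # a ResultSet (a list subclass) of Tags
missing = soup.find('table')     # None, because the HTML has no <table>

print(type(first).__name__)            # class name of the find() result
print(first.text)                      # text of the first <li>
print(len(all_items))                  # number of <li> tags found
print([li.text for li in all_items])   # text of every <li>
print(missing)
```

Because a ResultSet is a list, it has no find() or find_all() of its own, which is exactly what the AttributeError in the question is reporting.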
For more information, see https://www.crummy.com/software/BeautifulSoup/bs4/doc/#calling-a-tag-is-like-calling-find-all
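Applied to the code in the question, one possible fix is to loop over each ResultSet and call find() or find_all() on the individual tags inside the loop. This is a minimal sketch, not a tested scraper for the live page: the inline HTML is hypothetical markup standing in for the response, and only the class names MDGhAp and iUmrbN are taken from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the downloaded page; only the class
# names 'MDGhAp' and 'iUmrbN' come from the question.
html = """
<div class="MDGhAp">
  <a href="#"><div class="iUmrbN">Phone A</div></a>
  <a href="#"><div class="iUmrbN">Phone B</div></a>
</div>
<div class="MDGhAp">
  <a href="#"><span>no name here</span></a>
</div>
"""
soup = BeautifulSoup(html, 'html.parser')

full_names = []
# find_all() returns a list of tags, so iterate over it...
for div in soup.find_all('div', class_='MDGhAp'):
    # ...and call find_all() on each individual tag, not on the list
    for link in div.find_all('a'):
        # find() returns one tag or None, so check before reading .text
        name = link.find('div', class_='iUmrbN')
        if name is not None:
            full_names.append(name.text)

print(full_names)
```

With the real page, soup would be built from source.content as in the question; the None check matters because find() returns None for links that contain no matching div.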