bar*_*icz 3 python beautifulsoup
I have the following scraping code:
import requests, bs4

def make_soup():
    url = 'https://www.airbnb.pl/s/Girona--Hiszpania/homes?place_id=ChIJRRrTHsPNuhIRQMqjIeD6AAM&query=Girona%2C%20Hiszpania&refinement_paths%5B%5D=%2Fhomes&allow_override%5B%5D=&s_tag=b5bnciXv'
    response = requests.get(url)
    soup = bs4.BeautifulSoup(response.text, "html.parser")
    return soup

def get_listings():
    soup = make_soup()
    listings = soup.select('._f21qs6')
    number_of_listings = len(listings)
    print("Current number of listings: " + str(number_of_listings))
    while number_of_listings != 18:
        print("Too few listings: " + str(number_of_listings))
        soup = make_soup()
        listings = soup.select('._f21qs6')
        number_of_listings = len(listings)
    print("All fine! The number of listings is: " + str(number_of_listings))
    return listings

new_listings = get_listings()
print(new_listings)
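I think def get_listings() returns listings as a string, so I can't use BeautifulSoup's prettify() on it, and new_listings gets printed as a single block of text.
Is there a way to print new_listings in an HTML-like format, or at least to print each tag on its own line?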
type(new_listings)
# list
shows that new_listings is a list. Try:
print(new_listings[0].prettify())
Run Code Online (Sandbox Code Playgroud)
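To cover every listing rather than only the first, one option is to call prettify() on each element and join the results. The sketch below is only an illustration: the inline HTML and the helper name prettify_all are made up for this example and stand in for the real page and for whatever get_listings() returns.

```python
import bs4

# Hypothetical inline HTML standing in for the downloaded page (assumption).
html = (
    '<div class="_f21qs6"><a href="/rooms/1">First</a></div>'
    '<div class="_f21qs6"><a href="/rooms/2">Second</a></div>'
)

soup = bs4.BeautifulSoup(html, "html.parser")
new_listings = soup.select("._f21qs6")  # list-like collection of Tag objects


def prettify_all(tags):
    """Return one indented HTML string covering every tag in the collection.

    prettify_all is a hypothetical helper, not part of BeautifulSoup.
    """
    return "\n".join(tag.prettify() for tag in tags)


print(prettify_all(new_listings))
```

Each element of the list is a Tag, so prettify() is available per element even though the list itself has no such method.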