I'm having trouble scraping the table from this website. I expected to get the headings, but instead I get:
AttributeError: 'NoneType' object has no attribute 'tbody'
I'm fairly new to web scraping, so any help would be appreciated.
import requests
from bs4 import BeautifulSoup
URL = "https://www.collincad.org/propertysearch?situs_street=Willowgate&situs_street_suffix" \
"=&isd%5B%5D=any&city%5B%5D=any&prop_type%5B%5D=R&prop_type%5B%5D=P&prop_type%5B%5D=MH&active%5B%5D=1&year=2021&sort=G&page_number=1"
s = requests.Session()
page = s.get(URL)
soup = BeautifulSoup(page.content, "lxml")
table = soup.find("table", id="propertysearchresults")
table_data = table.tbody.find_all("tr")
headings = []
for td in table_data[0].find_all("td"):
    headings.append(td.b.text.replace('\n', ' ').strip())
print(headings)
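One way to narrow down the AttributeError: `soup.find` returns `None` when no matching element exists in the HTML that was actually downloaded, so `table.tbody` fails. The sketch below is not a confirmed fix for this site; it only shows a guard around the lookup, run against a small inline HTML sample. The helper name `extract_headings` and the sample column names are hypothetical, and it uses the built-in "html.parser" backend instead of lxml to avoid an extra dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real page may be structured differently.
SAMPLE_HTML = """
<table id="propertysearchresults">
  <tbody>
    <tr><td><b>Property ID</b></td><td><b>Owner
Name</b></td></tr>
    <tr><td>123</td><td>Example</td></tr>
  </tbody>
</table>
"""


def extract_headings(html, table_id="propertysearchresults"):
    """Return the text of the first row's cells, or [] if the table is absent."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id=table_id)
    if table is None:
        # find() found nothing, so there is no table to read headings from.
        return []
    rows = table.find_all("tr")
    if not rows:
        return []
    # Collapse newlines and repeated spaces inside each cell.
    return [" ".join(td.get_text().split()) for td in rows[0].find_all("td")]


print(extract_headings(SAMPLE_HTML))
print(extract_headings("<p>no table here</p>"))
```

If the function returns an empty list for the real page, printing `page.status_code` and searching `page.text` for the table id would show whether the table is present in the response at all.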
python beautifulsoup web-scraping python-3.x python-requests
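The Python code below walks down a tree, but I want it to walk down the tree and take some paths conditionally, based on values. I want to get the LandedPrice for the branch of the tree selected by condition and fulfillmentChannel.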
parsed_results['LowestLanded'] = sku_multi_sku['Summary']['LowestPrices']['LowestPrice']['LandedPrice']['Amount']['value']
This walks down the tree but does not get the value I want, because there are two LowestPrice records (dictionaries) returned, one for each condition and fulfillmentChannel pair. I want to filter on condition=new and fulfillmentChannel=Amazon so that I get only one record. When I was parsing the XML data, I could do this with code like "LowestPrices/LowestPrice[@condition='new'][@fulfillmentChannel='Merchant']/LandedPrice/Amount", but I can't get similar code to work here. How can I do this with dictionaries?
"LowestPrices":{
    "value":"\n ",
    "LowestPrice":[
        {
            "value":"\n ",
            "condition":{
                "value":"new"  # condition new
            },
            "fulfillmentChannel":{
                "value":"Amazon"  # fulfillmentChannel #1
            },
            "LandedPrice":{
                "value":"\n ",
                "CurrencyCode":{
                    "value":"USD"
                },
                "Amount":{
                    "value":"19.57"
                }
            },
            "ListingPrice":{
                "value":"\n ",
                "CurrencyCode":{
                    "value":"USD"
                },
                "Amount":{
                    "value":"19.57"
                }
            },
            "Shipping":{
                "value":"\n ",
                "CurrencyCode":{
                    "value":"USD"
                },
                "Amount":{
                    "value":"0.00"
                }
            }
        },
        {
            "value":"\n ",
            "condition":{
                "value":"new"
            },
            "fulfillmentChannel":{ …
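A minimal sketch of a dictionary equivalent of the XPath filter, assuming LowestPrice is a list of dictionaries shaped like the excerpt above. The helper name `find_landed_price` is hypothetical, and the second record in the sample data (the Merchant one, including its amounts) is made up for illustration, since the original excerpt is truncated.

```python
# Sample data: the first record follows the excerpt in the question;
# the second record is HYPOTHETICAL and exists only to exercise the filter.
sample_lowest_prices = {
    "value": "\n ",
    "LowestPrice": [
        {
            "condition": {"value": "new"},
            "fulfillmentChannel": {"value": "Amazon"},
            "LandedPrice": {
                "CurrencyCode": {"value": "USD"},
                "Amount": {"value": "19.57"},
            },
        },
        {
            "condition": {"value": "new"},
            "fulfillmentChannel": {"value": "Merchant"},
            "LandedPrice": {
                "CurrencyCode": {"value": "USD"},
                "Amount": {"value": "21.00"},
            },
        },
    ],
}


def find_landed_price(lowest_prices, condition, channel):
    """Return the LandedPrice amount of the first matching record, or None."""
    offers = lowest_prices.get("LowestPrice", [])
    # Assumption: a converter might yield a single dict when only one
    # record exists, so normalise that case to a one-element list.
    if isinstance(offers, dict):
        offers = [offers]
    for offer in offers:
        same_condition = offer.get("condition", {}).get("value") == condition
        same_channel = offer.get("fulfillmentChannel", {}).get("value") == channel
        if same_condition and same_channel:
            return offer["LandedPrice"]["Amount"]["value"]
    return None


print(find_landed_price(sample_lowest_prices, "new", "Amazon"))
```

With the variable names from the question, the call would look like `find_landed_price(sku_multi_sku['Summary']['LowestPrices'], 'new', 'Amazon')`. Returning None when nothing matches lets the caller decide how to handle a missing offer instead of raising a KeyError.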