Tho*_*roy 37 python json scrapy web-scraping
How can I use Scrapy to scrape a web request that returns JSON? For example, the JSON looks like this:
{
    "firstName": "John",
    "lastName": "Smith",
    "age": 25,
    "address": {
        "streetAddress": "21 2nd Street",
        "city": "New York",
        "state": "NY",
        "postalCode": "10021"
    },
    "phoneNumber": [
        {
            "type": "home",
            "number": "212 555-1234"
        },
        {
            "type": "fax",
            "number": "646 555-4567"
        }
    ]
}
I want to scrape specific items (e.g. the name and fax in the example above) and save them to CSV.
ale*_*cxe 61
It's the same as using Scrapy's HtmlXPathSelector for HTML responses. The only difference is that you should use the json module to parse the response:
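import json

class MySpider(BaseSpider):
    ...
    def parse(self, response):
        # Decode the response body and parse it as JSON.
        # (Newer Scrapy releases expose the decoded body as response.text.)
        jsonresponse = json.loads(response.body_as_unicode())
        item = MyItem()
        item["firstName"] = jsonresponse["firstName"]
        return item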
Hope that helps.
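The question also asks for the fax number, which sits inside the nested phoneNumber list rather than at the top level. As a rough sketch, independent of Scrapy, the lookup could be done as below. The helper name extract_contact and the SAMPLE constant are made up for illustration; inside a spider the same logic would run on the dict parsed in parse().

```python
import json

# The sample document from the question, used here as stand-in input.
SAMPLE = """
{
    "firstName": "John",
    "lastName": "Smith",
    "age": 25,
    "phoneNumber": [
        {"type": "home", "number": "212 555-1234"},
        {"type": "fax", "number": "646 555-4567"}
    ]
}
"""


def extract_contact(data):
    """Return first name, last name and fax number from the parsed JSON.

    Hypothetical helper: missing keys fall back to an empty string.
    """
    # Take the first phone entry whose type is "fax", if there is one.
    fax = next(
        (
            phone.get("number", "")
            for phone in data.get("phoneNumber", [])
            if phone.get("type") == "fax"
        ),
        "",
    )
    return {
        "firstName": data.get("firstName", ""),
        "lastName": data.get("lastName", ""),
        "fax": fax,
    }


record = extract_contact(json.loads(SAMPLE))
print(record)
```

Using .get() with defaults keeps the lookup from raising KeyError when a record has no fax entry, at the cost of silently producing empty fields.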
小智 7
There is no need to use the json module to parse the response object.
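class MySpider(BaseSpider):
    ...
    def parse(self, response):
        # response.json() is available on text responses in Scrapy 2.2 and later.
        jsonresponse = response.json()
        item = MyItem()
        # .get() returns "" instead of raising KeyError if the key is missing.
        item["firstName"] = jsonresponse.get("firstName", "")
        return item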
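Neither answer covers the "save to CSV" part of the question. One possible approach, sketched here with only the standard library, is to collect the extracted fields as dicts and write them with csv.DictWriter. The function name rows_to_csv and the sample row are assumptions for illustration, not part of either answer.

```python
import csv
import io


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text (hypothetical helper)."""
    buffer = io.StringIO()
    # lineterminator="\n" avoids the default "\r\n" line endings.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Sample row with the values from the JSON in the question.
rows = [{"firstName": "John", "lastName": "Smith", "fax": "646 555-4567"}]
csv_text = rows_to_csv(rows, ["firstName", "lastName", "fax"])
print(csv_text)
```

When the items are returned from a spider, Scrapy's feed exports can also write them to CSV directly, for example with scrapy crawl <spider> -o items.csv, so a hand-written writer like this is mainly useful outside a crawl.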