How to use scrapy shell with a URL and basic authentication credentials?

Question

Roh*_*nil 5 web-crawler basic-authentication scrapy python-2.7 scrapy-shell

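I want to use scrapy shell to test the response data for a URL that requires basic authentication credentials. I checked the scrapy shell documentation but could not find anything about it.

I tried scrapy shell 'http://user:pwd@abc.com' but it did not work. Does anyone know how I can do this?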

Answer 1

If you only want to use the shell, you can do it like this:

$ scrapy shell

And inside the shell:

>>> from w3lib.http import basic_auth_header
>>> from scrapy import Request
>>> # replace the placeholder strings with your real credentials
>>> auth = basic_auth_header('your_user', 'your_password')
>>> req = Request(url="http://example.com", headers={'Authorization': auth})
>>> fetch(req)

This works because fetch updates the shell session with the current request.
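
For reference, the value placed in the Authorization header follows the HTTP Basic scheme: the word Basic followed by the base64 encoding of user:password. The sketch below rebuilds that value with only the standard library, so it can be inspected outside the shell. build_basic_auth_header is a hypothetical helper name, not part of Scrapy or w3lib. It returns a str (w3lib's basic_auth_header returns bytes) and assumes ISO-8859-1 encoding, which I believe matches w3lib's default.

```python
import base64


def build_basic_auth_header(user, password, encoding="ISO-8859-1"):
    """Hypothetical helper: build an HTTP Basic Authorization header value."""
    # Credentials are joined with a colon, then base64-encoded.
    credentials = f"{user}:{password}".encode(encoding)
    token = base64.b64encode(credentials).decode("ascii")
    return f"Basic {token}"


print(build_basic_auth_header("user", "pwd"))  # Basic dXNlcjpwd2Q=
```

Comparing this string with the header sent by the shell request can help confirm that the credentials are encoded as expected.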

Answer 2

Ver*_*int 5

Yes, use the httpauth middleware.

Make sure HttpAuthMiddleware is enabled in the settings, then define:

from scrapy.spiders import CrawlSpider

class MySpider(CrawlSpider):
    http_user = 'username'
    http_pass = 'password'
    ...

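as class variables in the spider.

Also, if the middleware is enabled in the settings, there is no need to specify the login credentials in the URL.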
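
As a rough illustration of the "enabled in the settings" step, a settings.py fragment might look like the sketch below. This is an assumption about a typical project layout: HttpAuthMiddleware is normally part of Scrapy's default downloader middlewares, so an explicit entry is likely only needed if it was disabled earlier, and the order value 300 mirrors what I understand to be its default position.

```python
# settings.py (sketch): explicitly enable the HTTP auth downloader middleware.
# The order value 300 is assumed to match Scrapy's default for this middleware.
DOWNLOADER_MIDDLEWARES = {
    "scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware": 300,
}
```

Setting the value to None instead of a number would disable the middleware, in which case http_user and http_pass on the spider have no effect.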
