Scrapy + Splash = Connection refused

I installed Splash using this link. I followed all the installation steps, but Splash does not work.

My settings.py file:

BOT_NAME = 'Teste'
SPIDER_MODULES = ['Test.spiders']
NEWSPIDER_MODULE = 'Test.spiders'
DOWNLOADER_MIDDLEWARES = {
    'scrapy_splash.SplashCookiesMiddleware': 723,
    'scrapy_splash.SplashMiddleware': 725,
    'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware': 810,
}
SPIDER_MIDDLEWARES = {
    'scrapy_splash.SplashDeduplicateArgsMiddleware': 100,
}
SPLASH_URL = 'http://127.0.0.1:8050/'

When I run scrapy crawl TestSpider, I get:

[scrapy.core.engine] INFO: Spider opened
[scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
[scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET http://www.google.com.br via http://127.0.0.1:8050/render.html> (failed 1 times): Connection was refused by other side: 111: Connection refused.
[scrapy.downloadermiddlewares.retry] …
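A "Connection refused" error at this stage generally means nothing is accepting TCP connections on the host and port that SPLASH_URL points to. The following is a minimal sketch for checking that from Python before starting a crawl. The helper name splash_reachable and the 2-second timeout are assumptions made for illustration; they are not part of Scrapy or scrapy-splash.

```python
import socket
from urllib.parse import urlparse


def splash_reachable(splash_url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the host/port in splash_url succeeds.

    Hypothetical helper for illustration only; it checks that something is
    listening, not that the listener is actually Splash.
    """
    parsed = urlparse(splash_url)
    host = parsed.hostname or "127.0.0.1"
    port = parsed.port or 80
    try:
        # create_connection raises OSError (including ConnectionRefusedError)
        # when nothing accepts the connection or the attempt times out.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Same value as SPLASH_URL in the settings.py above.
    print(splash_reachable("http://127.0.0.1:8050/"))
```

If this prints False, the refusal is happening before Scrapy is involved at all, so the Splash process (or container port mapping) is the first thing to look at rather than the middleware settings.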

web-crawler scrapy scrapy-splash splash-js-render

Score: 2 · Answers: 2 · Views: 2971

Scrapy, Splash: Connection was refused by other side: 10061

I am using Scrapy and Splash on a JavaScript-driven website. However, I cannot get past the Connection was refused by other side: 10061 error.

I get logs like this:

[scrapy.downloadermiddlewares.retry] DEBUG: Retrying 
 <GET https://www2.deloitte.com/ch/en/misc/search.html#country=All#qr=accounting     
 via http://localhost:8050/render.html> (failed 1 times): Connection 
 was refused by other side: 10061: No connection could be made because 
 the target machine actively refused it..

And a traceback pointing to Twisted:

twisted.internet.error.ConnectionRefusedError: Connection was refused 
by other side: 10061: No connection could be made because the target 
machine actively refused it..
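The number in these messages is the operating system's error code for a refused connection: 10061 is WSAECONNREFUSED on Windows, and 111 is ECONNREFUSED on Linux. As a rough sketch, the code can be pulled out of a Twisted log line like the ones above and labeled. The function name refused_code and the lookup table are hypothetical and exist only for this illustration.

```python
import re

# Platform-specific errno values that all mean "connection refused".
REFUSED_CODES = {
    111: "ECONNREFUSED (Linux)",
    61: "ECONNREFUSED (macOS/BSD)",
    10061: "WSAECONNREFUSED (Windows)",
}

# \s+ tolerates the line wrapping seen in the pasted logs.
_PATTERN = re.compile(
    r"Connection\s+was\s+refused\s+by\s+other\s+side:\s+(\d+):"
)


def refused_code(log_text: str):
    """Return (code, label) for a refused-connection message, else None."""
    match = _PATTERN.search(log_text)
    if match is None:
        return None
    code = int(match.group(1))
    return code, REFUSED_CODES.get(code, "unknown")
```

Either code indicates that the target machine rejected the TCP connection, which in these logs is the Splash endpoint at localhost:8050 rather than the website being scraped.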

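I checked all the entries in the settings and tried various USER_AGENTS and ROBOT entries, but had no luck. I also tried starting Splash with --disable-private-mode, which had no effect.

Strangely, simply copy-pasting the same URL into a browser works fine.

I run Scrapy both from the plain command line and through the API. Interestingly, when using the API, of course, clicking the URL of the target in the error message in PyCharm, the hash tags …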

python twisted scrapy docker scrapy-splash

Score: 2 · Answers: 1 · Views: 3249