小编Ash*_*boo的帖子

Multiprocessing of Scrapy Spiders in Parallel Processes

There as several similar questions that I have already read on Stack Overflow. Unfortunately, I lost links of all of them, because my browsing history got deleted unexpectedly.

All of the above questions, couldn't help me. Either, some of them have used CELERY or some of them SCRAPYD, and I want to use the MULTIPROCESSISNG Library. Also, the Scrapy Official Documentation shows how to run multiple spiders on a SINGLE PROCESS, not on MULTIPLE PROCESSES.

None of them couldn't help …

python scrapy web-scraping scrapy-spider python-multiprocessing

Ash*_*boo

2015 06-27

7
推荐指数

1
解决办法

4393
查看次数

从Python脚本将参数传递给Scrapy Spider

我只提到我在发布这个问题之前提到的一些问题(在发布这个问题之前,我目前没有链接到我提到过的所有问题) - :

我可以完全运行此代码,如果我没有传递参数并要求用户从BBSpider类输入(没有主函数 - 在name ="dmoz"行下方),或者将它们作为预定义(即静态)参数.

我的代码在这里.

我基本上试图从Python脚本执行Scrapy蜘蛛而不需要任何其他文件(甚至是设置文件).这就是为什么我在代码本身内部也指定了设置.

这是我执行此脚本时的输出 - :

http://bigbasket.com/ps/?q=apple
2015-06-26 12:12:34 [scrapy] INFO: Scrapy 1.0.0 started (bot: scrapybot)
2015-06-26 12:12:34 [scrapy] INFO: Optional features available: ssl, http11
2015-06-26 12:12:34 [scrapy] INFO: Overridden settings: {}
2015-06-26 12:12:35 [scrapy] INFO: Enabled extensions: CloseSpider, TelnetConsole, LogStats, CoreStats, SpiderState
None
2015-06-26 12:12:35 [scrapy] INFO: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
2015-06-26 12:12:35 [scrapy] INFO: Enabled spider middlewares: HttpErrorMiddleware, …

Run Code Online (Sandbox Code Playgroud)

python arguments scrapy web-scraping scrapy-spider

Ash*_*boo

2017 05-23

5
推荐指数

1
解决办法

2087
查看次数