Related problems and solutions (0)

ReactorNotRestartable error when running Scrapy in a while loop

I get a twisted.internet.error.ReactorNotRestartable error when I execute the following code:

from time import sleep
from scrapy import signals
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings
from scrapy.xlib.pydispatch import dispatcher

result = None

def set_result(item):
    global result  # without this, the assignment only creates a local variable
    result = item

while True:
    process = CrawlerProcess(get_project_settings())
    dispatcher.connect(set_result, signals.item_scraped)

    process.crawl('my_spider')
    process.start()

    if result:
        break
    sleep(3)

It works the first time, and then I get the error. I create the process variable on every iteration, so what is the problem?
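
Some background on the error: a Twisted reactor cannot be started again once it has stopped, and `process.start()` runs the reactor and stops it when the crawl finishes. Creating a new `CrawlerProcess` does not help, because the reactor is shared by the whole Python process. One possible workaround is to run every attempt in a fresh child process. The sketch below is only an illustration under several assumptions: `crawl_once`, `run_in_fresh_process` and `poll_until_result` are hypothetical helper names, the `fork` start method is available (POSIX), and the scraped items are dict-like and picklable.

```python
import multiprocessing as mp
import time

# Assumption: a POSIX system where the "fork" start method is available.
_ctx = mp.get_context("fork")


def crawl_once(spider_name):
    """Run one crawl to completion and return the scraped items as dicts.

    Intended to run inside a child process, where the reactor is still unused.
    """
    # Imported lazily so the parent process never loads the Twisted reactor.
    from scrapy import signals
    from scrapy.crawler import CrawlerProcess
    from scrapy.utils.project import get_project_settings

    items = []

    def collect(item, **kwargs):
        items.append(dict(item))  # assumption: items are dict-like

    process = CrawlerProcess(get_project_settings())
    crawler = process.create_crawler(spider_name)
    crawler.signals.connect(collect, signal=signals.item_scraped)
    process.crawl(crawler)
    process.start()  # blocks until the crawl ends; the reactor then stops for good
    return items


def _child(func, args, queue):
    """Child entry point: send back ("ok", result) or ("error", description)."""
    try:
        queue.put(("ok", func(*args)))
    except Exception as exc:
        queue.put(("error", repr(exc)))


def run_in_fresh_process(func, *args):
    """Call func(*args) in a new child process and return its picklable result."""
    queue = _ctx.Queue()
    child = _ctx.Process(target=_child, args=(func, args, queue))
    child.start()
    status, payload = queue.get()  # read before join() so a large result cannot block
    child.join()
    if status == "error":
        raise RuntimeError(payload)
    return payload


def poll_until_result(func, *args, delay=3.0, max_attempts=None):
    """Repeat func in fresh processes until it returns something truthy."""
    attempt = 0
    while max_attempts is None or attempt < max_attempts:
        attempt += 1
        result = run_in_fresh_process(func, *args)
        if result:
            return result
        time.sleep(delay)
    return None
```

With these helpers, the loop from the question would become something like `items = poll_until_result(crawl_once, 'my_spider')`, run from inside the Scrapy project so that `get_project_settings()` can find the settings. On Windows, the start method would have to be `spawn` and the call placed under `if __name__ == "__main__":`.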

python twisted scrapy python-2.7

18
Votes
4
Answers
10k
Views

Multiprocessing of Scrapy Spiders in Parallel Processes

There are several similar questions on Stack Overflow that I have already read. Unfortunately, I lost the links to all of them because my browsing history was deleted unexpectedly.

None of those questions helped me. Some of them use Celery and others use Scrapyd, whereas I want to use the multiprocessing library. Also, the official Scrapy documentation shows how to run multiple spiders in a single process, not in multiple processes.

None of them could help …
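
To make the goal concrete, here is a minimal sketch of running one spider per operating-system process with the standard multiprocessing library. It is an illustration, not a tested recipe: `run_spider` and `run_in_parallel` are hypothetical helper names, the spider names must be unique and registered in the current Scrapy project, and the `fork` start method (POSIX) is assumed.

```python
import multiprocessing as mp


def run_spider(spider_name):
    """Run a single spider to completion inside the current (child) process."""
    # Imported lazily so each child sets up its own Twisted reactor.
    from scrapy.crawler import CrawlerProcess
    from scrapy.utils.project import get_project_settings

    process = CrawlerProcess(get_project_settings())
    process.crawl(spider_name)
    process.start()  # blocks until this spider has finished


def run_in_parallel(target, names):
    """Start one process per name, wait for all, and return {name: exit code}."""
    ctx = mp.get_context("fork")  # assumption: POSIX system
    workers = [ctx.Process(target=target, args=(name,), name=name) for name in names]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    # Exit code 0 means the child finished without an uncaught exception.
    return {worker.name: worker.exitcode for worker in workers}
```

A call such as `run_in_parallel(run_spider, ["spider_a", "spider_b"])` (placeholder spider names) would then crawl both spiders at the same time, each with its own reactor, and a non-zero exit code would point to the child that failed.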

python scrapy web-scraping scrapy-spider python-multiprocessing

7
Votes
1
Answers
4393
Views