Scrapy - Reactor not Restartable

Given these imports:

from twisted.internet import reactor
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

I have always run this process successfully:

process = CrawlerProcess(get_project_settings())
process.crawl(*args)
# the script will block here until the crawling is finished
process.start() 

But since I moved this code into a web_crawler(self) method, like so:

def web_crawler(self):
    # set up a crawler
    process = CrawlerProcess(get_project_settings())
    process.crawl(*args)
    # the script will block here until the crawling is finished
    process.start() 

    # (...)

    return (result1, result2) 

and started calling the method from an instance of the class, like this:

def __call__(self):
    results1 = test.web_crawler()[1]
    results2 = test.web_crawler()[0]

and running:

test()

I get the following error:

Traceback (most recent call last):
  File "test.py", line 573, in <module> …
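Note that `__call__` above invokes `web_crawler()` twice, so `process.start()` runs twice in the same Python process. Twisted's reactor cannot be started again once it has stopped, which would be consistent with the error named in the title. Below is a minimal sketch of calling the method once and indexing the stored tuple. It makes no claim about the full traceback, which is truncated above. The class name `Test`, the `_started` flag, and the stub body of `web_crawler()` are assumptions for illustration only: the stub stands in for the real Scrapy crawl so the sketch runs without Scrapy installed.

```python
class Test:
    """Hypothetical stand-in for the class in the question."""

    def __init__(self):
        # Mimics a reactor that may only be run once per process.
        self._started = False

    def web_crawler(self):
        # Stub replacing process.crawl(...) and process.start().
        if self._started:
            raise RuntimeError("reactor already ran and cannot be restarted")
        self._started = True
        return ("result1", "result2")

    def __call__(self):
        # Call web_crawler() once and index the stored tuple,
        # instead of calling it once per result.
        results = self.web_crawler()
        results1 = results[1]
        results2 = results[0]
        return results1, results2


test = Test()
print(test())
```

With this shape the blocking crawl runs only once per `test()` call. Calling `test()` a second time in the same process would still try to start the crawl again, so the results would need to be stored if they are required more than once.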

python web-crawler scrapy
