相关疑难解决方法(0)

如何在Python脚本中运行Scrapy

我是Scrapy的新手,我正在寻找一种从Python脚本运行它的方法.我找到了两个解释这个的来源:

http://tryolabs.com/Blog/2011/09/27/calling-scrapy-python-script/

http://snipplr.com/view/67006/using-scrapy-from-a-script/

我无法弄清楚我应该在哪里放置我的蜘蛛代码以及如何从主函数中调用它.请帮忙.这是示例代码:

# This snippet can be used to run scrapy spiders independent of scrapyd or the scrapy command line tool and use it from a script. 
# 
# The multiprocessing library is used in order to work around a bug in Twisted, in which you cannot restart an already running reactor or in this case a scrapy instance.
# 
# [Here](http://groups.google.com/group/scrapy-users/browse_thread/thread/f332fc5b749d401a) is the mailing-list discussion for this snippet. 

#!/usr/bin/python
import os
os.environ.setdefault('SCRAPY_SETTINGS_MODULE', 'project.settings') #Must be at the …
Run Code Online (Sandbox Code Playgroud)

python web-crawler scrapy web-scraping

54
推荐指数
5
解决办法
5万
查看次数

标签 统计

python ×1

scrapy ×1

web-crawler ×1

web-scraping ×1