use*_*697 19 python scrapy-spider six
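This is my first question on Stack Overflow. I recently wanted to use Linked-in-scraper, so I downloaded it and ran "scrapy crawl linkedin.com", which produced the error message below. For reference, I am using Anaconda 2.3.0 and Python 2.7.11. Before running the program, I updated all the related packages (including scrapy and six) with pip.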
Traceback (most recent call last):
File "/Users/byeongsuyu/anaconda/bin/scrapy", line 11, in <module>
sys.exit(execute())
File "/Users/byeongsuyu/anaconda/lib/python2.7/site-packages/scrapy/cmdline.py", line 108, in execute
settings = get_project_settings()
File "/Users/byeongsuyu/anaconda/lib/python2.7/site-packages/scrapy/utils/project.py", line 60, in get_project_settings
settings.setmodule(settings_module_path, priority='project')
File "/Users/byeongsuyu/anaconda/lib/python2.7/site-packages/scrapy/settings/__init__.py", line 285, in setmodule
self.set(key, getattr(module, key), priority)
File "/Users/byeongsuyu/anaconda/lib/python2.7/site-packages/scrapy/settings/__init__.py", line 260, in set
self.attributes[name].set(value, priority)
File "/Users/byeongsuyu/anaconda/lib/python2.7/site-packages/scrapy/settings/__init__.py", line 55, in set
value = BaseSettings(value, priority=priority)
File "/Users/byeongsuyu/anaconda/lib/python2.7/site-packages/scrapy/settings/__init__.py", line 91, in __init__
self.update(values, priority)
File "/Users/byeongsuyu/anaconda/lib/python2.7/site-packages/scrapy/settings/__init__.py", line 317, in update
for name, value in six.iteritems(values):
File "/Users/byeongsuyu/anaconda/lib/python2.7/site-packages/six.py", line 599, in iteritems
return d.iteritems(**kw)
AttributeError: 'list' object has no attribute 'iteritems'
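As far as I can tell, this error occurs because d is a list rather than a dict. Since the error is raised inside scrapy's own code, it may be a problem with the scrapy package or the six package. How can I try to fix this error?
Edit: this is the content of scrapy.cfg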
# Automatically created by: scrapy startproject
#
# For more information about the [deploy] section see:
# http://doc.scrapy.org/topics/scrapyd.html
[settings]
default = linkedIn.settings
[deploy]
#url = http://localhost:6800/
project = linkedIn
Val*_*ntz 33
This is caused by the settings of the LinkedIn scraper:
ITEM_PIPELINES = ['linkedIn.pipelines.LinkedinPipeline']
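However, according to the documentation, ITEM_PIPELINES should be a dict:
To activate an Item Pipeline component you must add its class to the ITEM_PIPELINES setting, like in the following example:
ITEM_PIPELINES = {
    'myproject.pipelines.PricePipeline': 300,
    'myproject.pipelines.JsonWriterPipeline': 800,
}
The integer values you assign to classes in this setting determine the order in which they run: items go through from lower valued to higher valued classes. It's customary to define these numbers in the 0-1000 range.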
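To illustrate the ordering rule quoted above, here is a minimal sketch in plain Python. It is not Scrapy's actual implementation. The helper `pipeline_run_order` is a hypothetical name introduced only for this example, and the pipeline paths are the ones from the documentation example.

```python
# Pipeline paths copied from the documentation example; deliberately listed
# out of order to show that only the integer values matter.
ITEM_PIPELINES = {
    'myproject.pipelines.JsonWriterPipeline': 800,
    'myproject.pipelines.PricePipeline': 300,
}


def pipeline_run_order(pipelines):
    """Return pipeline class paths sorted by their integer value, lowest first.

    Hypothetical helper for illustration only; Scrapy does this internally.
    """
    return [path for path, _ in sorted(pipelines.items(), key=lambda kv: kv[1])]


run_order = pipeline_run_order(ITEM_PIPELINES)
print(run_order)
```

With these values, PricePipeline (300) comes before JsonWriterPipeline (800).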
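According to this question, it used to be a list, which explains why this scraper uses one. So you will either have to ask the scraper's developer to update their code, or set ITEM_PIPELINES yourself.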
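If you set ITEM_PIPELINES yourself, one possible approach is to convert the old list into the dict form in the project's settings.py. The sketch below assumes that; the helper `pipelines_list_to_dict` and its `start` and `step` values are hypothetical choices made for illustration, not part of Scrapy.

```python
def pipelines_list_to_dict(pipelines, start=100, step=100):
    """Convert a legacy list of pipeline paths into the dict form.

    The list order is preserved by assigning increasing integers.
    `start` and `step` are arbitrary illustrative values within the
    customary 0-1000 range.
    """
    if isinstance(pipelines, dict):
        # Already in the expected form; return a copy unchanged.
        return dict(pipelines)
    return {path: start + i * step for i, path in enumerate(pipelines)}


# The list-style setting shipped with the scraper.
legacy_pipelines = ['linkedIn.pipelines.LinkedinPipeline']

# The dict-style setting that current Scrapy versions expect.
ITEM_PIPELINES = pipelines_list_to_dict(legacy_pipelines)
print(ITEM_PIPELINES)
```

With a single pipeline, as here, it is simpler to write the dict literal directly in settings.py, using any integer in the customary range as the value.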