Cru*_*han 5 python scrapy python-2.7 scrapyd
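I am trying to deploy a crawler with four spiders. One of the spiders uses XMLFeedSpider and runs fine both from the shell and from scrapyd, but the others use BaseSpider and all fail with this error when run in scrapyd, even though they run fine from the shell:
TypeError: __init__() got an unexpected keyword argument '_job'
From what I have read, this points to a problem with the __init__ function in my spiders, but I cannot seem to solve it. I do not need an __init__ function, and if I remove it completely I still get the error!
My spider looks like this: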
from scrapy import log
from scrapy.spider import BaseSpider
from scrapy.selector import XmlXPathSelector
from betfeeds_master.items import Odds

# Parameters
MYGLOBAL = 39

class homeSpider(BaseSpider):
    name = "home"
    #con = None

    allowed_domains = ["www.myhome.com"]
    start_urls = [
        "http://www.myhome.com/oddxml.aspx?lang=en&subscriber=mysubscriber",
    ]

    def parse(self, response):
        items = []
        traceCompetition = ""

        xxs = XmlXPathSelector(response)
        oddsobjects = xxs.select("//OO[OddsType='3W' and Sport='Football']")
        for oddsobject in oddsobjects:
            item = Odds()
            item['competition'] = ''.join(oddsobject.select('Tournament/text()').extract())
            if traceCompetition != item['competition']:
                log.msg('Processing %s' % (item['competition'])) #print item['competition']
                traceCompetition = item['competition']
            item['matchDate'] = ''.join(oddsobject.select('Date/text()').extract())
            item['homeTeam'] = ''.join(oddsobject.select('OddsData/HomeTeam/text()').extract())
            item['awayTeam'] = ''.join(oddsobject.select('OddsData/AwayTeam/text()').extract())
            item['lastUpdated'] = ''
            item['bookie'] = MYGLOBAL
            item['home'] = ''.join(oddsobject.select('OddsData/HomeOdds/text()').extract())
            item['draw'] = ''.join(oddsobject.select('OddsData/DrawOdds/text()').extract())
            item['away'] = ''.join(oddsobject.select('OddsData/AwayOdds/text()').extract())
            items.append(item)
        return items
I can add an __init__ function like the following to the spider, but I get exactly the same error.
def __init__(self, *args, **kwargs):
    super(homeSpider, self).__init__(*args, **kwargs)
    pass
Why is this happening, and how can I fix it?
A good answer was given by alecx:
My __init__ function was:
def __init__(self, domain_name):
For the spider to work inside an egg deployed to scrapyd, it should be:
def __init__(self, domain_name, **kwargs):
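This assumes you pass domain_name as a mandatory argument. The added **kwargs lets __init__ accept the extra _job keyword argument named in the error instead of rejecting it.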
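To illustrate why the **kwargs change matters, here is a minimal sketch that runs without Scrapy installed. FakeBaseSpider, StrictSpider, TolerantSpider, launch and the job id are all hypothetical names made up for this example; FakeBaseSpider is only a stand-in that is assumed to accept keyword arguments the way the Scrapy base spider does, and launch only imitates a scheduler that adds a _job argument, as the error message suggests scrapyd does.

```python
# Stand-in for the Scrapy base spider (hypothetical, not the real class).
# Assumption: the base class accepts arbitrary keyword arguments and
# stores them as attributes on the spider instance.
class FakeBaseSpider(object):
    name = None

    def __init__(self, name=None, **kwargs):
        if name is not None:
            self.name = name
        self.__dict__.update(kwargs)


class StrictSpider(FakeBaseSpider):
    """Signature from the answer's 'before' version: no **kwargs."""
    name = "strict"

    def __init__(self, domain_name):
        super(StrictSpider, self).__init__()
        self.domain_name = domain_name


class TolerantSpider(FakeBaseSpider):
    """Signature from the answer's 'after' version: extra kwargs forwarded."""
    name = "tolerant"

    def __init__(self, domain_name, **kwargs):
        super(TolerantSpider, self).__init__(**kwargs)
        self.domain_name = domain_name


def launch(spider_cls, **spider_args):
    """Imitate a scheduler that adds a _job argument to the user's arguments."""
    spider_args["_job"] = "hypothetical-job-id"
    return spider_cls(**spider_args)


# The strict signature rejects the extra keyword argument.
try:
    launch(StrictSpider, domain_name="www.myhome.com")
    strict_error = None
except TypeError as exc:
    strict_error = str(exc)

# The tolerant signature accepts it and passes it on to the base class.
spider = launch(TolerantSpider, domain_name="www.myhome.com")
```

In this sketch StrictSpider fails with a TypeError that mentions '_job', while TolerantSpider is created normally and ends up with both domain_name and _job set. Running the spider directly from the shell would not add the extra argument, which would be consistent with the error appearing only under scrapyd.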