eN_*_*Joy 5 python django scrapy
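A question about foreign keys in Scrapy was asked here before without an accepted answer, so I am re-raising it with a more clearly defined minimal setup:
The django model: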
from django.db import models

class Article(models.Model):
    title = models.CharField(max_length=255)
    content = models.TextField()
    # Django 2.0+ additionally requires an on_delete argument here.
    category = models.ForeignKey('categories.Category', null=True, blank=True)
Note that how `category` is defined is irrelevant here, except that it is a `ForeignKey`. So, in the django shell, this works:
c = Article(title="foo", content="bar", category_id=2)
c.save()
The scrapy item:
class BotsItem(DjangoItem):
    django_model = Article
The scrapy pipeline:
class BotsPipeline(object):
    def process_item(self, item, spider):
        item['category_id'] = 2
        item.save()
        return item
With the above code, scrapy complains:
exceptions.KeyError: 'BotsItem does not support field: category_id'
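The error can be reproduced without Scrapy or Django installed. The sketch below is a simplified stand-in, not Scrapy's actual implementation: it assumes, as the message suggests, that an item only accepts keys declared as fields, and that `DjangoItem` declares the foreign key under its field name `category` rather than its column name `category_id`. `FieldRestrictedItem` and `BotsItemSketch` are hypothetical names used only for illustration.

```python
class FieldRestrictedItem(dict):
    """Simplified stand-in for a Scrapy item; not Scrapy's real code."""

    fields = ()  # names this item is willing to store

    def __setitem__(self, key, value):
        # Reject any key that was not declared as a field.
        if key not in self.fields:
            raise KeyError(
                f"{type(self).__name__} does not support field: {key}"
            )
        super().__setitem__(key, value)


class BotsItemSketch(FieldRestrictedItem):
    # Assumed to mirror the Article model's field names: the ForeignKey
    # appears as `category`, not as `category_id`.
    fields = ("title", "content", "category")


item = BotsItemSketch()
item["category"] = "placeholder"  # accepted: declared field name

try:
    item["category_id"] = 2  # rejected: not a declared field name
    error_message = None
except KeyError as exc:
    error_message = exc.args[0]
```

Under these assumptions, only the key `category` is available on the item, so the foreign key has to be set through it.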
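Fair enough, since `category_id` does not appear as a field in the django model from which the scrapy item is derived. For the record, if we have this pipeline instead (assuming a category named `foo` exists):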
class BotsPipeline(object):
    def process_item(self, item, spider):
        item['category'] = 'foo'
        item.save()
        return item
Now scrapy complains:
exceptions.TypeError: isinstance() arg 2 must be a class, type, or tuple
of classes and types
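So what exactly should be done?
OK, I managed to solve this problem, and I am documenting it here. As the `exceptions.TypeError` above hints, `item['category']` expects an instance of the `Category` class. In my case I am using django-categories, so in the pipeline just replace the assignment with this (assuming `Category` is already populated in the ORM):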
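from categories.models import Category  # model provided by django-categories

class BotsPipeline(object):
    def process_item(self, item, spider):
        # Assign a Category instance, not a raw id or a name string.
        item['category'] = Category.objects.get(id=2)
        item.save()
        return item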
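One possible refinement, offered only as a sketch: the pipeline above looks the category up once per item, and the instance could instead be fetched on first use and reused. To keep the example runnable without Django, the lookup is injected as `fetch_category`, a hypothetical callable standing in for `Category.objects.get`. `CategoryResolvingPipeline`, `FakeCategory` and `fake_fetch` are illustrative names as well, and plain dicts stand in for the `DjangoItem`, so `item.save()` is left out.

```python
class CategoryResolvingPipeline:
    """Sketch: set item['category'] to a model instance, fetched once."""

    def __init__(self, fetch_category, category_id):
        # fetch_category is a hypothetical stand-in for Category.objects.get
        self.fetch_category = fetch_category
        self.category_id = category_id
        self._category = None  # filled on first use

    def process_item(self, item, spider):
        if self._category is None:
            # One lookup for the whole crawl instead of one per item.
            self._category = self.fetch_category(id=self.category_id)
        item['category'] = self._category
        # With a real DjangoItem, item.save() would go here.
        return item


class FakeCategory:
    """Placeholder for a Category row, used only in this sketch."""

    def __init__(self, id):
        self.id = id


calls = []  # records every lookup so the reuse can be observed


def fake_fetch(id):
    calls.append(id)
    return FakeCategory(id)


pipeline = CategoryResolvingPipeline(fake_fetch, category_id=2)
first = pipeline.process_item({}, spider=None)
second = pipeline.process_item({}, spider=None)
```

In a real project, `fetch_category` would presumably be `Category.objects.get`. Whether reusing one instance is appropriate depends on whether categories can change while the crawl is running.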