Using foreign keys with Scrapy DjangoItem

eN_*_*Joy 5 python django scrapy

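The question of foreign keys in Scrapy was asked here before without an accepted answer, so I am raising it again with a more clearly defined minimal setup: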

The Django model:

class Article(models.Model):
    title = models.CharField(max_length=255)
    content = models.TextField()
    category = models.ForeignKey('categories.Category', null=True, blank=True)

How category is defined is irrelevant here, but note that it is a ForeignKey. So, in the Django shell, this works:

c = Article(title="foo", content="bar", category_id=2)
c.save()

The Scrapy item:

class BotsItem(DjangoItem):
    django_model = Article

The Scrapy pipeline:

class BotsPipeline(object):
    def process_item(self, item, spider):
        item['category_id'] = 2
        item.save()
        return item

With the code above, Scrapy complains:

exceptions.KeyError: 'BotsItem does not support field: category_id'

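Fair enough, since category_id does not appear as a field in the Django model from which the Scrapy item is derived. For the record, if we use this pipeline instead (assuming a category named foo exists):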

class BotsPipeline(object):
    def process_item(self, item, spider):
        item['category'] = 'foo'
        item.save()
        return item

Now Scrapy complains:

exceptions.TypeError: isinstance() arg 2 must be a class, type, or tuple
 of classes and types

So what exactly should be done?

eN_*_*Joy 7

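OK, I managed to solve this problem and am documenting it here. As the exceptions.TypeError above implies, item['category'] expects an instance of the Category class. In my case I am using django-categories, so in the pipeline just replace the assignment with this (assuming Category is already populated in the ORM):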

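from categories.models import Category  # model provided by django-categories


class BotsPipeline(object):
    def process_item(self, item, spider):
        # A ForeignKey field must be given a model instance, not a raw id or name.
        item['category'] = Category.objects.get(id=2)
        item.save()
        return item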
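
To make the difference between the two keys concrete, here is a rough sketch of why category_id is rejected while category is accepted. It assumes, as far as I understand it, that the item's allowed keys come from the model's field names rather than from the "<name>_id" attribute Django adds for a ForeignKey. FakeField, FakeCategory, FakeDjangoItem and MODEL_FIELDS are hypothetical stand-ins written for this illustration; none of this is Scrapy's or Django's actual code.

```python
# Simplified stand-ins only: this is NOT Scrapy's or Django's real code.


class FakeField:
    """Mimics a model field: `name` is the field name, `attname` the raw attribute."""

    def __init__(self, name, is_foreign_key=False):
        self.name = name
        # Django exposes a ForeignKey's raw id under "<name>_id".
        self.attname = name + "_id" if is_foreign_key else name


# Hypothetical field list mirroring the Article model from the question.
MODEL_FIELDS = [
    FakeField("title"),
    FakeField("content"),
    FakeField("category", is_foreign_key=True),
]


class FakeCategory:
    """Stand-in for a Category model instance."""

    def __init__(self, pk):
        self.pk = pk


class FakeDjangoItem:
    """Item whose allowed keys are the model's field names, not the attnames."""

    fields = {f.name for f in MODEL_FIELDS}

    def __init__(self):
        self._values = {}

    def __setitem__(self, key, value):
        # Undeclared keys are refused, which is where the KeyError comes from.
        if key not in self.fields:
            raise KeyError(f"{type(self).__name__} does not support field: {key}")
        self._values[key] = value

    def __getitem__(self, key):
        return self._values[key]


item = FakeDjangoItem()

try:
    item["category_id"] = 2  # rejected: "category_id" is an attname, not a field name
except KeyError as exc:
    error_message = exc.args[0]

item["category"] = FakeCategory(pk=2)  # accepted: "category" is a declared field
```

In the real pipeline the accepted value would be the Category instance fetched from the ORM, as in the answer above.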