Scrapy: skip item and continue execution

Question

I'm writing an RSS spider. If the current item has no match, I want the spider to skip the current node and keep running... So far I have this:

        if info.startswith('Foo'):
            item['foo'] = info.split(':')[1]
        else:
            return None

(info is a string that was extracted with an xpath and cleaned up beforehand...)

But I get this exception:

    exceptions.TypeError: You cannot return an "NoneType" object from a spider

So how can I ignore this node and continue execution?

Answer 1

ser*_*yPS 12

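    def parse(self, response):
        # make some manipulations (these produce `info` and `item`)
        if info.startswith('Foo'):
            item['foo'] = info.split(':')[1]
            return [item]
        else:
            # an empty list is a valid "nothing scraped" result; None is not
            return []

But it is better not to use return for this at all: use yield, or simply do nothing.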

    def parse(self, response):
        # make some manipulations (these produce `info` and `item`)
        if info.startswith('Foo'):
            item['foo'] = info.split(':')[1]
            yield item
        else:
            # a bare return just ends the generator without yielding anything
            return
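If the callback loops over several feed nodes, the same idea can be sketched without Scrapy at all: a generator that uses `continue` to skip non-matching nodes and yields an item only for the matching ones. This is a minimal illustration, not the asker's actual spider; the function name `parse_nodes`, the plain-dict item, and the sample strings are assumptions made up for the example.

```python
def parse_nodes(infos):
    """Yield one item per string that starts with 'Foo'; skip the rest."""
    for info in infos:
        if not info.startswith('Foo'):
            # No match: ignore this node and move on to the next one.
            continue
        item = {}  # hypothetical stand-in for a Scrapy item
        # Same extraction as in the question: the text after the colon.
        item['foo'] = info.split(':')[1]
        yield item


# Hypothetical cleaned-up strings, one per feed node.
items = list(parse_nodes(['Foo:bar', 'Other:x', 'Foo:baz']))
print(items)  # [{'foo': 'bar'}, {'foo': 'baz'}]
```

Because the function is a generator, nodes that do not match simply produce nothing, so there is never a None to return.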

Asked:	15 years ago
Viewed:	3910 times
Last active:	8 years, 9 months ago