
Logging to a specific error log file in Scrapy

I start Scrapy logging by doing this:

from scrapy.spider import BaseSpider
from scrapy import log

class MySpider(BaseSpider):
    name = "myspider"

    def __init__(self, name=None, **kwargs):
        LOG_FILE = "logs/spider.log"
        # Reset Twisted's default observer and the started flag
        # so log.start() attaches again for this spider
        log.log.defaultObserver = log.log.DefaultObserver()
        log.log.defaultObserver.start()
        log.started = False
        log.start(LOG_FILE, loglevel=log.INFO)
        super(MySpider, self).__init__(name, **kwargs)

    def parse(self, response):
        # ...
        log.msg('Something went wrong!', log.ERROR)
        raise Exception("Something went wrong!")

        # Somehow write to a separate error log here.

Then I run the spider like this:

scrapy crawl myspider

This stores all the log.INFO data, as well as log.ERROR, in spider.log.

If an error occurs, I would also like to store those details in a separate log file called spider_errors.log. That would make it much easier to search for the errors that occurred, rather than having to scan the entire spider.log file (which can be huge).

Is there a way to do this?
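For illustration, this is the kind of split I am after, sketched with the plain stdlib logging module outside of Scrapy (the logger name and file paths are just placeholders, and the logs/ directory is assumed to exist):

import logging

logger = logging.getLogger("myspider")
logger.setLevel(logging.INFO)

# One handler takes everything at INFO and above...
all_handler = logging.FileHandler("logs/spider.log")
all_handler.setLevel(logging.INFO)
logger.addHandler(all_handler)

# ...and a second handler only takes ERROR and above.
error_handler = logging.FileHandler("logs/spider_errors.log")
error_handler.setLevel(logging.ERROR)
logger.addHandler(error_handler)

logger.info("written to spider.log only")
logger.error("written to both spider.log and spider_errors.log")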

EDIT:

Trying to use PythonLoggingObserver:

def __init__(self, name=None, **kwargs):
    LOG_FILE = 'logs/spider.log'
    ERR_File = 'logs/spider_error.log'

    observer = log.log.PythonLoggingObserver()
    observer.start()

    log.started = …
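Where I was hoping to take this: as far as I understand, PythonLoggingObserver forwards Twisted (and therefore Scrapy) log events to the stdlib logger named 'twisted' by default, so attaching an ERROR-level FileHandler to that logger should catch only the error records. This is an untested sketch based on that assumption (the file path is a placeholder and the logs/ directory is assumed to exist):

import logging
from twisted.python import log as twisted_log

# Forward Twisted/Scrapy log events into the stdlib 'twisted' logger
# ('twisted' is PythonLoggingObserver's default loggerName).
observer = twisted_log.PythonLoggingObserver(loggerName='twisted')
observer.start()

# Only ERROR and above should land in the separate error file.
error_handler = logging.FileHandler('logs/spider_errors.log')
error_handler.setLevel(logging.ERROR)
logging.getLogger('twisted').addHandler(error_handler)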

python logging scrapy web-scraping scrapy-spider

Score: 8 · Answers: 1 · Views: 6364
