Posts by Abd*_*dul

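How does Scrapy's pause/resume work?

Can someone explain to me how Scrapy's pause/resume feature works?

The version of Scrapy I'm using is 0.24.5.

The documentation doesn't provide much detail.

I have the following simple spider: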

from scrapy import Request, Spider


class SampleSpider(Spider):
    name = 'sample'

    def start_requests(self):
        yield Request(url='https://colostate.textbookrack.com/listingDetails?lst_id=1053')
        yield Request(url='https://colostate.textbookrack.com/listingDetails?lst_id=1054')
        yield Request(url='https://colostate.textbookrack.com/listingDetails?lst_id=1055')

    def parse(self, response):
        # Append the URL of every response that was actually processed.
        with open('responses.txt', 'a') as f:
            f.write(response.url + '\n')

I'm running it with:

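from twisted.internet import reactor
from scrapy.crawler import Crawler
from scrapy import log, signals
from scrapy.utils.project import get_project_settings

from scrapyproject.spiders.sample_spider import SampleSpider

spider = SampleSpider()
settings = get_project_settings()
# Persist the crawl state to disk so the crawl can be resumed later.
settings.set('JOBDIR', '/some/path/scrapy_cache')
# Long delay so the spider can be stopped before the requests are processed.
settings.set('DOWNLOAD_DELAY', 10)
crawler = Crawler(settings)
# Stop the Twisted reactor once the spider has closed.
crawler.signals.connect(reactor.stop, signal=signals.spider_closed)
crawler.configure()
crawler.crawl(spider)
crawler.start()
log.start()
reactor.run()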

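As you can see, I enabled the JOBDIR option so that the crawl state is saved.

I set DOWNLOAD_DELAY to 10 seconds so that I can stop the spider before the requests are processed. I expected that the next time I ran the spider, the requests would not be regenerated. That is not the case.

I see a folder named requests.queue in my scrapy_cache folder. However, it is always empty. …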
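One way to check what was actually written to the job directory after stopping the spider is to list every file under it with its size. The following is only a minimal sketch using the standard library; the helper name `summarize_jobdir` is hypothetical and not part of Scrapy, and the path is the JOBDIR value from the code above.

```python
import os


def summarize_jobdir(jobdir):
    """Return {relative_path: size_in_bytes} for every file under jobdir.

    Hypothetical helper for illustration; it only walks the directory and
    does not interpret Scrapy's on-disk formats.
    """
    summary = {}
    for root, _dirs, files in os.walk(jobdir):
        for filename in files:
            full_path = os.path.join(root, filename)
            rel_path = os.path.relpath(full_path, jobdir)
            summary[rel_path] = os.path.getsize(full_path)
    return summary


if __name__ == "__main__":
    # '/some/path/scrapy_cache' is the JOBDIR set in the question.
    for path, size in sorted(summarize_jobdir('/some/path/scrapy_cache').items()):
        print('%8d  %s' % (size, path))
```

An empty folder such as requests.queue contributes no entries to the result, so comparing the output before and after a stopped run shows whether any pending requests were persisted.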

scrapy

7 votes, 2 answers, 4641 views

JavaScript XMLDOM.selectSingleNode on IE9 gives "Unknown method" --> concat

Why does this code give me the following error in IE: "Unknown method. //author[@select = -->concat('tes'<--,'ts')]"?

function a()
{
    try
    {
        var xml = '<?xml version="1.0"?><book><author select="tests">blah</author></book>';

        var doc = new ActiveXObject("Microsoft.XMLDOM");
        doc.loadXML(xml);

        // concat('tes','ts') should evaluate to 'tests', matching the attribute above.
        var node = doc.selectSingleNode("//author[@select = concat('tes','ts')]");
        if(node == null)
        {
            alert("Node is null");
        }
        else
        {
            alert("Node is NOT null");
        }
    } catch(e)
    {
        alert(e.message);
    }
}
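For reference, the result the query is presumably meant to produce can be sketched outside IE. This is a Python illustration rather than the asker's environment, and it rests on an assumption: Python's `xml.etree.ElementTree` supports only a subset of XPath and has no `concat()`, so the string is joined in Python before being placed in the predicate. The helper name `find_author` is hypothetical.

```python
import xml.etree.ElementTree as ET

XML = '<?xml version="1.0"?><book><author select="tests">blah</author></book>'


def find_author(xml_text, select_value):
    """Return the first <author> whose select attribute equals select_value, or None."""
    root = ET.fromstring(xml_text)
    # ElementTree has no concat(), so the caller builds the comparison value.
    # Note: a value containing a single quote would break this simple predicate.
    return root.find(".//author[@select='%s']" % select_value)


# 'tes' + 'ts' is the value that concat('tes','ts') is expected to yield.
node = find_author(XML, 'tes' + 'ts')
print("Node is null" if node is None else "Node is NOT null")
```

With the sample document above, the matching author element is found, which is the outcome the original `selectSingleNode` call appears to be aiming for.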

javascript xpath

3 votes, 1 answer, 5423 views

Tag statistics

javascript ×1

scrapy ×1

xpath ×1