Scrapy with multiple pages

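I created a simple Scrapy project in which I read the total number of pages from the initial page, example.com/full. Now I need to scrape every page from example.com/page-2 through page 100 (if the total is 100). How can I do that?

Any suggestions would be appreciated.

Code: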

import scrapy


class AllSpider(scrapy.Spider):
    name = 'all'
    allowed_domains = ['example.com']
    start_urls = ['https://example.com/full/']
    total_pages = 0

    # parse must be a method of the spider class for Scrapy to call it
    def parse(self, response):
        total_pages = response.xpath("//body/section/div/section/div/div/ul/li[6]/a/text()").extract_first()
        #urls = ('https://example.com/page-{}'.format(i) for i in range(1,total_pages))
        print(total_pages)


Update #1:

I tried urls = ('https://example.com/page-{}'.format(i) for i in range(1,total_pages)), but it doesn't work; I am probably doing something wrong.
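One likely reason the attempt in Update #1 fails is that extract_first() returns a string (or None), while range() needs an integer; in addition, range() excludes its stop value, so the last page would be skipped. The following is a minimal sketch of the URL-building step only, not a confirmed fix for the site in question. The helper name build_page_urls and its base parameter are hypothetical and do not come from the original post.

```python
def build_page_urls(total_pages_text, base="https://example.com/page-{}"):
    """Return the URLs for pages 2 through total_pages, inclusive.

    build_page_urls and base are illustrative names, not part of Scrapy.
    """
    # extract_first() yields a string or None, so guard and convert first.
    if total_pages_text is None:
        return []
    total_pages = int(total_pages_text.strip())
    # range() stops before its second argument, hence the + 1.
    return [base.format(page) for page in range(2, total_pages + 1)]


urls = build_page_urls("100")
print(len(urls), urls[0], urls[-1])
```

Inside parse, each of these URLs would then typically be handed back to Scrapy with something like yield scrapy.Request(url, callback=self.parse_page), where parse_page is an assumed name for a second callback that handles the individual pages.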

Update #2: I have changed my code like this:

class AllSpider(scrapy.Spider):
    name = 'all'
    allowed_domains = ['sanet.st']
    start_urls = ['https://sanet.st/full/']
    total_pages = 0

    def parse(self, response):
        total_pages = response.xpath("//body/section/div/section/div/div/ul/li[6]/a/text()").extract_first()
        for page in range(2, int(total_pages)):
            url = …


scrapy web-scraping python-3.x

Ran*_*dan

2018 09-12

2
Votes

1
Answers

9605
Views
