
Scrapy error: exceptions.ValueError: Missing scheme in request url:

I wrapped the assignment in try/except to avoid the error, but my terminal still shows the error and the log message never appears:

raise ValueError('Missing scheme in request url: %s' % self._url)
exceptions.ValueError: Missing scheme in request url: 

How can I avoid this error when Scrapy doesn't get any image_urls?
Any guidance would be much appreciated.

    try:
        item['image_urls'] = ["".join(image.extract())]
    except Exception:
        log.msg("no image found! url={}".format(response.url), level=log.INFO)
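A likely reason the except branch never runs: the assignment itself does not fail. When nothing is extracted, `"".join(...)` just yields an empty string, so the item ends up holding `[""]`, and the ValueError is raised later when a request is built from that empty URL. That reading is an assumption, since the traceback above is cut off. Below is a minimal sketch of validating the URLs before assigning them, using only the Python 3 standard library. `clean_image_urls` is a hypothetical helper, not part of Scrapy.

```python
from urllib.parse import urljoin, urlparse


def clean_image_urls(raw_urls, base_url):
    """Return absolute http(s) URLs, dropping empty or scheme-less entries.

    Hypothetical helper for illustration; not a Scrapy API.
    """
    cleaned = []
    for raw in raw_urls:
        raw = raw.strip()
        if not raw:
            # An empty string is what produces "Missing scheme in request url: "
            continue
        # Resolve relative paths against the page URL
        absolute = urljoin(base_url, raw)
        if urlparse(absolute).scheme in ("http", "https"):
            cleaned.append(absolute)
    return cleaned
```

In the spider this could be called as `clean_image_urls(image.extract(), response.url)`, setting `item['image_urls']` only when the result is non-empty and logging the "no image found" message otherwise, so an explicit check takes the place of the try/except.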

python scrapy

Score: 10 | Answers: 2 | Views: 8446

Downloading images with Scrapy

I'm just getting started with Scrapy and have hit my first real problem: downloading images. Here is my spider.

from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.selector import HtmlXPathSelector
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
from example.items import ProductItem
from scrapy.utils.response import get_base_url

import re

class ProductSpider(CrawlSpider):
    name = "product"
    allowed_domains = ["domain.com"]
    start_urls = [
            "http://www.domain.com/category/supplies/accessories.do"
    ]

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        items = []
        sites = hxs.select('//td[@class="thumbtext"]')
        number = 0
        for site in sites:
            item = ProductItem()
            xpath = '//div[@class="thumb"]/img/@src'
            item['image_urls'] = site.select(xpath).extract()[number]
            item['image_urls'] = 'http://www.domain.com' + item['image_urls']
            items.append(item)
            number = number + 1
        return items
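One thing in the spider above that may be related, although the question is cut off so this is only a guess: `item['image_urls']` is assigned a single string, whereas Scrapy's images pipeline iterates over that field and expects a list of URLs. Iterating a string yields one character at a time, and a single character is not a valid request URL. A minimal sketch of building list-valued fields follows. Plain dicts stand in for `ProductItem` so it runs without Scrapy, and `build_items` is a hypothetical name.

```python
from urllib.parse import urljoin

BASE_URL = "http://www.domain.com"  # same placeholder domain as the spider above


def build_items(thumb_srcs, base_url=BASE_URL):
    """Build one dict per thumbnail, each with a list-valued 'image_urls'.

    Plain dicts stand in for ProductItem; hypothetical helper for illustration.
    """
    items = []
    for src in thumb_srcs:
        # A list, not a bare string: a string would be iterated character by character
        items.append({"image_urls": [urljoin(base_url, src)]})
    return items
```

In the spider, `thumb_srcs` would be the result of `hxs.select('//div[@class="thumb"]/img/@src').extract()`, which also removes the need for the manual `number` counter.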

When I reference ITEM_PIPELINES …

python scrapy

Score: 7 | Answers: 2 | Views: 10k
