twisted.python.failure.Failure OpenSSL.SSL.Error in Scrapy (version 1.0.4)

Question

I am working on a data scraping project. My code uses Scrapy (version 1.0.4) and Selenium (version 2.47.1).

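import time

from scrapy import Spider
from scrapy.selector import Selector
from scrapy.http import Request
from scrapy.spiders import CrawlSpider
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException

class TradesySpider(CrawlSpider):
    name = 'tradesy'
    start_urls = ['My Start url',]

    def __init__(self, *args, **kwargs):
        # Initialise the base spider before starting the browser
        super(TradesySpider, self).__init__(*args, **kwargs)
        self.driver = webdriver.Firefox()

    def parse(self, response):
        self.driver.get(response.url)
        while True:
            # Read the page the browser currently shows, so links from each
            # paginated view are collected (response itself never changes)
            tradesy_urls = Selector(text=self.driver.page_source).xpath('//div[@id="right-panel"]')
            data_urls = tradesy_urls.xpath('div[@class="item streamline"]/a/@href').extract()
            for link in data_urls:
                url = 'My base url' + link
                yield Request(url=url, callback=self.parse_data)
                time.sleep(10)
            try:
                data_path = self.driver.find_element_by_xpath('//*[@id="page-next"]')
            except NoSuchElementException:
                # No "next" button left: last page reached
                break
            data_path.click()
            time.sleep(10)

    def parse_data(self, response):
        'Scrapy Operations...'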

When I run my code, I get the expected output for some URLs, but for others I get the following error.

2016-01-19 15:45:17 [scrapy] DEBUG: Retrying <GET MY_URL> (failed 1 times): [<twisted.python.failure.Failure OpenSSL.SSL.Error: [('SSL routines', 'SSL3_READ_BYTES', 'ssl handshake failure')]>]

Please suggest a solution for this problem.

Answer 1

eLR*_*uLL 11

According to this reported issue, you can create your own SSL ContextFactory to handle it.

context.py:

from OpenSSL import SSL
from scrapy.core.downloader.contextfactory import ScrapyClientContextFactory


class CustomContextFactory(ScrapyClientContextFactory):
    """
    Custom context factory that allows SSL negotiation.
    """

    def __init__(self):
        # Use SSLv23_METHOD so we can use protocol negotiation
        self.method = SSL.SSLv23_METHOD

settings.py:

DOWNLOADER_CLIENTCONTEXTFACTORY = 'yourproject.context.CustomContextFactory'
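
If you want to check whether the handshake failure really is a protocol-version mismatch before changing the context factory, a rough diagnostic along these lines may help. This is only a sketch and not part of the fix above: it uses the standard library ssl module on Python 3.7+ (so it runs separately from the Scrapy 1.0.4 project), the helper names make_context and probe are made up for illustration, and example.com is a placeholder for the failing host.

```python
import socket
import ssl


def make_context(version):
    """Build a client context pinned to exactly one TLS version (hypothetical helper)."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Diagnostic only: we care about the handshake, not certificate validity.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    ctx.minimum_version = version
    ctx.maximum_version = version
    return ctx


def probe(host, port=443, timeout=5.0):
    """Return {version name: True/False} for each TLS version tried (hypothetical helper)."""
    results = {}
    for version in (ssl.TLSVersion.TLSv1_2, ssl.TLSVersion.TLSv1_3):
        try:
            ctx = make_context(version)
            with socket.create_connection((host, port), timeout=timeout) as sock:
                # The handshake happens inside wrap_socket; failure raises SSLError.
                with ctx.wrap_socket(sock, server_hostname=host):
                    results[version.name] = True
        except (ssl.SSLError, OSError, ValueError):
            # Handshake refused, host unreachable, or version unsupported locally.
            results[version.name] = False
    return results


if __name__ == "__main__":
    # Placeholder host: replace with the site that fails in the crawl.
    print(probe("example.com"))
```

If the server only accepts versions other than the one Scrapy 1.0.4 pins by default, that would be consistent with the negotiation-based SSLv23_METHOD workaround helping.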
