Using Deferred in a Scrapy downloader middleware

Sak*_*Jun 5 python twisted scrapy

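I want to run some blocking code in a Scrapy downloader middleware (it waits for a free proxy). I was planning to use this approach

But it doesn't really fit a downloader middleware, because the value returned by process_request(self, request, spider) is checked with isinstance(response, (Response, Request))

What is the best way to do this?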

小智 1

You can use Twisted's deferToThread to run the blocking code in a separate thread without blocking the main thread

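from twisted.internet.threads import deferToThread


class DownloaderMiddleware:
    def process_request(self, request, spider):
        # Run the blocking call in Twisted's thread pool and hand the Deferred back to Scrapy.
        return deferToThread(self.run_blocking_code_in_different_thread, request, spider)

    def run_blocking_code_in_different_thread(self, request, spider) -> None:
        print("Code will block here on a different thread and won't stop the main thread")
        # get_proxy_blocking_call() is a placeholder for your own blocking proxy lookup.
        request.meta["proxy"] = get_proxy_blocking_call()
        # Return None so Scrapy continues processing this request;
        # returning the Request itself would tell Scrapy to reschedule it.
        return None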
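
As an alternative to deferToThread, the same idea can be sketched with a coroutine. This is a minimal sketch, not part of the original answer: it assumes Scrapy 2.0 or later running with the asyncio reactor (the TWISTED_REACTOR setting) and Python 3.9+ for asyncio.to_thread. The names get_free_proxy, AsyncProxyMiddleware, FakeRequest, and demo are hypothetical and exist only for illustration.

```python
import asyncio
import time


def get_free_proxy() -> str:
    """Hypothetical blocking lookup; stands in for waiting on a free proxy."""
    time.sleep(0.05)  # simulate blocking I/O
    return "http://127.0.0.1:8080"


class AsyncProxyMiddleware:
    async def process_request(self, request, spider):
        # Run the blocking call in a worker thread so the event loop stays free.
        request.meta["proxy"] = await asyncio.to_thread(get_free_proxy)
        # Falls through and returns None, so the request keeps being processed.


class FakeRequest:
    """Minimal stand-in for scrapy.Request, used only to exercise the sketch."""

    def __init__(self):
        self.meta = {}


def demo() -> dict:
    # Drive the coroutine outside Scrapy to show what it does to the request.
    request = FakeRequest()
    asyncio.run(AsyncProxyMiddleware().process_request(request, spider=None))
    return request.meta
```

Calling demo() runs the middleware method against the stand-in request and returns its meta dict with the "proxy" key filled in. Inside a real project only the middleware class and your own proxy lookup would be needed.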