Tornado memory leak on dropped connections

var*_*tec 10 python memory-leaks asynchronous tornado

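I have a setup where Tornado acts as a pass-through to workers. Tornado receives a request, forwards it to N workers, aggregates the results, and sends them back to the client. This works fine, except when a timeout occurs for some reason; then I get a memory leak.

My setup is similar to this pseudocode: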

workers = ["http://worker1.example.com:1234/",
           "http://worker2.example.com:1234/", 
           "http://worker3.example.com:1234/" ...]

class MyHandler(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    def post(self):
        responses = []

        def __callback(response):
            responses.append(response)
            if len(responses) == len(workers):
                self._finish_req(responses)

        for url in workers:
            async_client = tornado.httpclient.AsyncHTTPClient()
            request = tornado.httpclient.HTTPRequest(url, method=self.request.method, body=self.request.body)
            async_client.fetch(request, __callback) 

    def _finish_req(self, responses):
        good_responses = [r for r in responses if not r.error]
        if not good_responses:
            raise tornado.web.HTTPError(500, "\n".join(str(r.error) for r in responses))
        results = aggregate_results(good_responses)
        self.set_header("Content-Type", "application/json")
        self.write(json.dumps(results))
        self.finish()

application = tornado.web.Application([
    (r"/", MyHandler),
])

if __name__ == "__main__":
    ##.. some locking code 
    application.listen()
    tornado.ioloop.IOLoop.instance().start()

What am I doing wrong? Where does the memory leak come from?

Col*_*ean 5

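I don't know the source of the problem, and it seems like the gc should be able to handle it, but there are two things you can try.

The first is to simplify some of the references (it looks like there may still be lingering references to the responses list when the RequestHandler completes):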

class MyHandler(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    def post(self):
        self.responses = []

        for url in workers:
            async_client = tornado.httpclient.AsyncHTTPClient()
            request = tornado.httpclient.HTTPRequest(url, method=self.request.method, body=self.request.body)
            async_client.fetch(request, self._handle_worker_response) 

    def _handle_worker_response(self, response):
        self.responses.append(response)
        if len(self.responses) == len(workers):
            self._finish_req()

    def _finish_req(self):
        ....
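The idea of keeping the bookkeeping in one place and letting go of it once the request is done can be sketched without Tornado at all. The `ResponseCollector` class below is hypothetical (it is not part of Tornado or of the original code); it only illustrates one way to collect a fixed number of responses, fire a callback once, and drop its own references afterwards. Whether this actually cures the leak described in the question is untested.

```python
class ResponseCollector:
    """Hypothetical helper: gathers a fixed number of responses, then fires once."""

    def __init__(self, expected, on_done):
        self.expected = expected
        self.on_done = on_done
        self.responses = []

    def add(self, response):
        if self.on_done is None:
            return  # already finished; ignore late responses
        self.responses.append(response)
        if len(self.responses) == self.expected:
            on_done, responses = self.on_done, self.responses
            # Drop our own references before handing the results over,
            # so the collector no longer keeps the callback or data alive.
            self.on_done = None
            self.responses = []
            on_done(responses)


# Usage with plain strings standing in for worker responses.
results = []
collector = ResponseCollector(3, results.append)
for name in ("worker1", "worker2", "worker3"):
    collector.add(name)
collector.add("late")  # arrives after completion and is ignored
```

In a handler, `on_done` would presumably be something like `self._finish_req`, and `add` would be passed as the fetch callback.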

If that doesn't work, you can always invoke garbage collection manually:

import gc
class MyHandler(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    def post(self):
        ....

    def _finish_req(self):
        ....

    def on_connection_close(self):
        gc.collect()
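Why a manual `gc.collect()` can matter at all: in CPython, objects caught in a reference cycle are not freed by reference counting alone and wait for the cycle collector. The sketch below uses a hypothetical `FakeHandler` class (a stand-in, not a Tornado class) that stores its own bound method, which creates such a cycle. It is only an illustration of the mechanism, not a diagnosis of the leak in the question.

```python
import gc
import weakref


class FakeHandler:
    """Hypothetical stand-in for a request handler; not a Tornado class."""

    def __init__(self):
        self.responses = []
        # Storing a bound method on the instance creates a cycle:
        # handler -> bound method -> handler.
        self.callback = self.on_response

    def on_response(self, response):
        self.responses.append(response)


def handler_survives(run_gc):
    """Return True if the handler is still alive after its last name is deleted."""
    gc.disable()  # keep the automatic collector from interfering
    try:
        handler = FakeHandler()
        probe = weakref.ref(handler)
        del handler
        if run_gc:
            gc.collect()  # the cycle collector frees the handler
        return probe() is not None
    finally:
        gc.enable()
        gc.collect()  # clean up whatever the experiment left behind
```

Under CPython, `handler_survives(False)` reports that the object is still alive, while `handler_survives(True)` reports that it has been freed.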