NDB not clearing memory during a long request

Fra*_*que 7 python google-app-engine memory-leaks task-queue app-engine-ndb

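I'm currently offloading a long-running job to a TaskQueue to calculate connections between NDB entities in the Datastore.

Basically, this queue handles several lists of entity keys that are to be related to another query by the node_in_connected_nodes function in the GetConnectedNodes class: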

from collections import deque

from google.appengine.ext import ndb


class GetConnectedNodes(object):
    """Class for getting the connected nodes from a list of nodes in a paged way"""

    def __init__(self, node_ids, query):
        self.nodes = [ndb.Key('Node', '%s' % x) for x in node_ids]
        self.cursor = 0
        self.MAX_QUERY = 100  # entities fetched per batch
        self.max_connections = len(self.nodes)
        self.connections = deque()
        self.query = query

    def node_in_connected_nodes(self):
        """Generator over the node list.

        Yields (index of the node in the list, sources of the connection)
        for every node whose connections contain self.query. Nodes without
        that connection are skipped.
        """
        while self.cursor < self.max_connections:
            if not self.connections:
                # Queue is empty: start the next batch of asynchronous gets.
                end = min(self.MAX_QUERY, self.max_connections - self.cursor)
                self.connections = deque(ndb.get_multi_async(
                    self.nodes[self.cursor:self.cursor + end]))

            # Futures are in key order; block on the next one.
            node = self.connections.popleft().get_result()

            if self.query in node.connections:
                # yields (current node index in the list, sources)
                yield (self.cursor,
                       node.sources[node.connections.index(self.query)])
            self.cursor += 1

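Here a Node has a repeated property connections, which contains an array of other Node key IDs, and a matching sources array with one entry for each of those connections.

The yielded results are stored in the blobstore.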
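To make the data layout concrete, here is a minimal plain-Python sketch of the parallel-list lookup, with no App Engine dependency. FakeNode and evidence_for are hypothetical names used only for illustration (the real Node is an NDB model), and the source values are made up:

```python
from collections import namedtuple

# Hypothetical stand-in for the NDB Node model: two parallel lists,
# where sources[i] holds the evidence for connections[i].
FakeNode = namedtuple("FakeNode", ["connections", "sources"])


def evidence_for(node, query):
    """Return the sources entry matching `query`, or None if not connected."""
    if query not in node.connections:
        return None
    return node.sources[node.connections.index(query)]


# Made-up example data.
node = FakeNode(connections=["HGNC:4839", "HGNC:3003"],
                sources=["source-a", "source-b"])
print(evidence_for(node, "HGNC:3003"))  # prints source-b
print(evidence_for(node, "HGNC:9999"))  # prints None
```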

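The problem I'm running into is that memory is somehow not released after iterating over the connection function. The following log shows the memory used by App Engine just before each new GetConnectedNodes instance is created: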

I 2012-08-23 16:58:01.643 Prioritizing HGNC:4839 - mem 32
I 2012-08-23 16:59:21.819 Prioritizing HGNC:3003 - mem 380
I 2012-08-23 17:00:00.918 Prioritizing HGNC:8932 - mem 468
I 2012-08-23 17:00:01.424 Prioritizing HGNC:24771 - mem 435
I 2012-08-23 17:00:20.334 Prioritizing HGNC:9300 - mem 417
I 2012-08-23 17:00:48.476 Prioritizing HGNC:10545 - mem 447
I 2012-08-23 17:01:01.489 Prioritizing HGNC:12775 - mem 485
I 2012-08-23 17:01:46.084 Prioritizing HGNC:2001 - mem 564
C 2012-08-23 17:02:18.028 Exceeded soft private memory limit with 628.609 MB after servicing 1 requests total

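Apart from some fluctuation, memory just keeps increasing, even though none of the previous values are accessed anymore. I found it hard to debug this or to work out whether I have a memory leak somewhere, but I seem to have traced it to this class. Any help would be much appreciated.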

Luk*_*kas 10

We had similar issues (with long-running requests). We solved them by turning off the default ndb cache. You can read more about it here
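One way to picture why a per-request cache matters in a long request is the toy model below. It is plain Python and not NDB's actual implementation; ToyContext and scan are hypothetical names, and the only assumption is that a cache which lives for the whole request keeps a reference to every entity it has served:

```python
class ToyContext(object):
    """Toy model of a per-request entity cache (not NDB's real code)."""

    def __init__(self, use_cache=True):
        self.use_cache = use_cache
        self._cache = {}

    def get(self, key):
        if self.use_cache and key in self._cache:
            return self._cache[key]
        entity = {"key": key, "payload": "x" * 1000}  # pretend datastore read
        if self.use_cache:
            # The cache keeps the entity referenced until the request ends.
            self._cache[key] = entity
        return entity

    def cached_count(self):
        return len(self._cache)


def scan(context, n):
    """Read n distinct entities once each and report how many stay cached."""
    for i in range(n):
        context.get("Node-%d" % i)
    return context.cached_count()


print(scan(ToyContext(use_cache=True), 5000))   # prints 5000
print(scan(ToyContext(use_cache=False), 5000))  # prints 0
```

With the cache on, every entity read during the request stays reachable even though the loop never looks at it again, which matches the steadily rising memory in the log.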

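  • Ah, I missed that this is a long-running request. Sorry. Indeed, NDB's context cache keeps collecting more objects. If it's one specific model class, you can put _use_cache = False in the class body to avoid caching it. Or you can call ndb.get_context().clear_cache() at the top of the loop. (9 upvotes)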
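The "clear the cache at the top of the loop" suggestion could look roughly like the sketch below. process_in_batches and handle are hypothetical names, and the ndb module is passed in as a parameter so the snippet can be exercised without the App Engine SDK; on App Engine it would be google.appengine.ext.ndb:

```python
def process_in_batches(ndb, keys, handle, batch_size=100):
    """Fetch `keys` in batches, clearing the context cache before each batch.

    `ndb` is injected by the caller; on App Engine pass the module obtained
    with `from google.appengine.ext import ndb`.
    """
    for start in range(0, len(keys), batch_size):
        # Drop entities cached by earlier batches so they can be collected.
        ndb.get_context().clear_cache()
        for entity in ndb.get_multi(keys[start:start + batch_size]):
            handle(entity)

# Per-model alternative mentioned above: set `_use_cache = False`
# in the body of the ndb.Model subclass instead of clearing manually.
```

Clearing once per batch rather than once per entity keeps the cache small without calling into the context on every iteration. The batch size of 100 simply mirrors MAX_QUERY in the question.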