Python memory consumption: objects and process

Dej*_*ell 7 python garbage-collection

I wrote the following code:

from hurry.filesize import size
from pysize import get_size
import os
import psutil

def load_objects():
    process = psutil.Process(os.getpid())
    print "start method"
    print "process consumes " + size(process.memory_info().rss)
    objects = make_a_call()
    print "total size of objects is " + size(get_size(objects))
    print "process consumes " + size(process.memory_info().rss)
    print "exit method"

def main():
    process = psutil.Process(os.getpid())
    print "process consumes " + size(process.memory_info().rss)
    load_objects()
    print "process consumes " + size(process.memory_info().rss)

get_size() returns the memory consumption of the objects (it uses the pysize code shown further below).
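For context, `sys.getsizeof` alone reports only the shallow size of a container; a recursive helper such as `pysize.get_size` also walks the elements. A minimal Python 3 illustration of the difference (this is not the actual pysize implementation):

```python
import sys

data = list(range(1000))

# shallow: just the list header and its array of element pointers
shallow = sys.getsizeof(data)

# deep: additionally count every element object reachable from the list
deep = shallow + sum(sys.getsizeof(i) for i in data)

print(shallow, deep)  # the elements dominate the deep total
```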

I get the following output:

process consumes 21M
start method
total size of objects is 20M
process consumes 29M
exit method
process consumes 29M
  1. How can the objects consume 20M if the process itself only consumes 8M more?
  2. When I exit the method, shouldn't the memory drop back to 21M, since the garbage collector would clear the consumed memory?

ffe*_*ast 4

  1. It is most likely due to inaccuracy in your measurement code.

Here is a fully working (Python 2.7) example that exhibits the same problem (I updated your original code slightly for simplicity):

from hurry.filesize import size
from pysize import get_size
import os
import psutil


def make_a_call():
    return range(1000000)

def load_objects():
    process = psutil.Process(os.getpid())
    print "start method"
    print "process consumes ", size(process.memory_info().rss)
    objects = make_a_call()
    # FIXME
    print "total size of objects is ", size(get_size(objects))
    print "process consumes ", size(process.memory_info().rss)
    print "exit method"

def main():
    process = psutil.Process(os.getpid())
    print "process consumes " + size(process.memory_info().rss)
    load_objects()
    print "process consumes " + size(process.memory_info().rss)


main()

Here is the output:

process consumes 7M
start method
process consumes  7M
total size of objects is  30M
process consumes  124M
exit method
process consumes 124M

The difference is about 100MB.


Here is the fixed version of the code:

from hurry.filesize import size
from pysize import get_size
import os
import psutil


def make_a_call():
    return range(1000000)

def load_objects():
    process = psutil.Process(os.getpid())
    print "start method"
    print "process consumes ", size(process.memory_info().rss)
    objects = make_a_call()
    print "process consumes ", size(process.memory_info().rss)
    print "total size of objects is ", size(get_size(objects))
    print "exit method"

def main():
    process = psutil.Process(os.getpid())
    print "process consumes " + size(process.memory_info().rss)
    load_objects()
    print "process consumes " + size(process.memory_info().rss)


main()

Here is the updated output:

process consumes 7M
start method
process consumes  7M
process consumes  38M
total size of objects is  30M
exit method
process consumes 124M

Do you spot the difference? You compute the object size before measuring the final process size, and the computation itself causes the additional memory consumption. Let's check why that happens - here is the source of https://github.com/bosswissam/pysize/blob/master/pysize.py:

import sys
import inspect

def get_size(obj, seen=None):
    """Recursively finds size of objects in bytes"""
    size = sys.getsizeof(obj)
    if seen is None:
        seen = set()
    obj_id = id(obj)
    if obj_id in seen:
        return 0
    # Important mark as seen *before* entering recursion to gracefully handle
    # self-referential objects
    seen.add(obj_id)
    if hasattr(obj, '__dict__'):
        for cls in obj.__class__.__mro__:
            if '__dict__' in cls.__dict__:
                d = cls.__dict__['__dict__']
                if inspect.isgetsetdescriptor(d) or inspect.ismemberdescriptor(d):
                    size += get_size(obj.__dict__, seen)
                break
    if isinstance(obj, dict):
        size += sum((get_size(v, seen) for v in obj.values()))
        size += sum((get_size(k, seen) for k in obj.keys()))
    elif hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes, bytearray)):
        size += sum((get_size(i, seen) for i in obj))
    return size

There is a lot going on here! The most notable part: it keeps every object it has seen in a set, in order to resolve circular references. If you remove that line, it does not consume nearly as much memory in either case.
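The extra footprint can be observed directly: the sizer allocates a growing `seen` set, plus one `id()` integer per visited object, while it runs. Below is a Python 3 sketch using a simplified, hypothetical stand-in for `get_size` and `tracemalloc` to watch the measurement's own allocations:

```python
import sys
import tracemalloc

def naive_size(obj, seen=None):
    # simplified, hypothetical stand-in for pysize.get_size
    if seen is None:
        seen = set()
    if id(obj) in seen:
        return 0
    seen.add(id(obj))  # one id() int allocated per visited object
    total = sys.getsizeof(obj)
    if isinstance(obj, (list, tuple)):
        total += sum(naive_size(i, seen) for i in obj)
    return total

data = list(range(100000))

tracemalloc.start()
naive_size(data)
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

# peak reflects the 'seen' set and its entries: the sizer itself
# drives up the process footprint while it measures
print(peak)
```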

  2. First of all, this behavior depends heavily on whether you run CPython or something else. With CPython it can happen because it is not always possible to return memory to the OS immediately.

Here is a good article on the subject:


If you create a large object and delete it again, Python has probably released the memory, but the memory allocators involved don't necessarily return the memory to the operating system, so it may look as if the Python process uses much more virtual memory than it actually uses.
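On the garbage-collector side of question 2: CPython does reclaim unreachable objects, including reference cycles, but as noted above the allocator does not necessarily hand the freed memory back to the OS right away, so RSS can stay flat. A Python 3 sketch of the reclamation half (the `Node` class is purely illustrative):

```python
import gc

class Node(object):
    def __init__(self):
        self.ref = None

a, b = Node(), Node()
a.ref, b.ref = b, a   # reference cycle: refcounting alone cannot free this
del a, b

# the cyclic collector finds the unreachable pair and frees it,
# even though RSS as seen by the OS may not shrink at this point
unreachable = gc.collect()
print(unreachable)
```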
