Dej*_*ell 7 python garbage-collection
I wrote the following code:
from hurry.filesize import size
from pysize import get_size
import os
import psutil

def load_objects():
    process = psutil.Process(os.getpid())
    print "start method"
    # resident set size before the objects are loaded
    print "process consumes " + size(process.memory_info().rss)
    objects = make_a_call()
    # get_size() returns bytes; size() formats them for display
    print "total size of objects is " + size(get_size(objects))
    print "process consumes " + size(process.memory_info().rss)
    print "exit method"

def main():
    process = psutil.Process(os.getpid())
    print "process consumes " + size(process.memory_info().rss)
    load_objects()
    # measured again after load_objects() has returned
    print "process consumes " + size(process.memory_info().rss)
get_size() returns the memory consumption of the objects using this code.
I get the following output:
process consumes 21M
start method
total size of objects is 20M
process consumes 29M
exit method
process consumes 29M
Here is a fully working (Python 2.7) example that shows the same problem (I slightly updated the original code for simplicity):
from hurry.filesize import size
from pysize import get_size
import os
import psutil


def make_a_call():
    return range(1000000)

def load_objects():
    process = psutil.Process(os.getpid())
    print "start method"
    print "process consumes ", size(process.memory_info().rss)
    objects = make_a_call()
    # FIXME
    print "total size of objects is ", size(get_size(objects))
    print "process consumes ", size(process.memory_info().rss)
    print "exit method"

def main():
    process = psutil.Process(os.getpid())
    print "process consumes " + size(process.memory_info().rss)
    load_objects()
    print "process consumes " + size(process.memory_info().rss)


main()

Here is the output:
process consumes 7M
start method
process consumes 7M
total size of objects is 30M
process consumes 124M
exit method
process consumes 124M

The difference is about 100 MB.
Here is the fixed version of the code:
from hurry.filesize import size
from pysize import get_size
import os
import psutil


def make_a_call():
    return range(1000000)

def load_objects():
    process = psutil.Process(os.getpid())
    print "start method"
    print "process consumes ", size(process.memory_info().rss)
    objects = make_a_call()
    # measure the process first, then compute the object size
    print "process consumes ", size(process.memory_info().rss)
    print "total size of objects is ", size(get_size(objects))
    print "exit method"

def main():
    process = psutil.Process(os.getpid())
    print "process consumes " + size(process.memory_info().rss)
    load_objects()
    print "process consumes " + size(process.memory_info().rss)


main()

Here is the updated output:
process consumes 7M
start method
process consumes 7M
process consumes 38M
total size of objects is 30M
exit method
process consumes 124M

Did you spot the difference? In the original code you computed the object size before measuring the final process size, and that computation itself consumes additional memory.
Let's check why this happens. Here is the source of https://github.com/bosswissam/pysize/blob/master/pysize.py:
import sys
import inspect

def get_size(obj, seen=None):
    """Recursively finds size of objects in bytes"""
    size = sys.getsizeof(obj)
    if seen is None:
        seen = set()
    obj_id = id(obj)
    if obj_id in seen:
        return 0
    # Important mark as seen *before* entering recursion to gracefully handle
    # self-referential objects
    seen.add(obj_id)
    if hasattr(obj, '__dict__'):
        for cls in obj.__class__.__mro__:
            if '__dict__' in cls.__dict__:
                d = cls.__dict__['__dict__']
                if inspect.isgetsetdescriptor(d) or inspect.ismemberdescriptor(d):
                    size += get_size(obj.__dict__, seen)
                break
    if isinstance(obj, dict):
        size += sum((get_size(v, seen) for v in obj.values()))
        size += sum((get_size(k, seen) for k in obj.keys()))
    elif hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes, bytearray)):
        size += sum((get_size(i, seen) for i in obj))
    return size

A lot is going on here!
Most notably, it records the id of every object it has visited in a set so that it can handle circular references. If you remove the line that adds to that set, it won't consume that much memory in either case.
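To get a feel for how much that bookkeeping can cost, here is a rough sketch. It is written for Python 3 (unlike the Python 2.7 code above), and the helper name seen_set_cost is made up for this illustration; it is not part of pysize. It builds the same kind of set of ids and reports the size of the set's hash table and of the integer objects holding the ids, which sys.getsizeof counts separately.

```python
import sys


def seen_set_cost(n):
    """Estimate the bytes needed to remember the ids of n distinct objects."""
    objects = [object() for _ in range(n)]  # n distinct live objects
    seen = set()
    for obj in objects:
        seen.add(id(obj))  # same bookkeeping as get_size's seen set
    # getsizeof(seen) counts only the set's own hash table ...
    table_bytes = sys.getsizeof(seen)
    # ... the int objects that hold the ids are separate allocations.
    id_bytes = sum(sys.getsizeof(i) for i in seen)
    return table_bytes, id_bytes


if __name__ == "__main__":
    table_bytes, id_bytes = seen_set_cost(1000000)
    print("set table: %.1f MiB" % (table_bytes / 2.0 ** 20))
    print("id ints:   %.1f MiB" % (id_bytes / 2.0 ** 20))
```

On a typical 64-bit CPython build both figures should come out in the tens of megabytes for a million objects, which is the same order of magnitude as the gap in the output above. Exact numbers depend on the interpreter version and platform.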
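Here is a good article on this topic:

"If you create a large object and delete it again, Python has probably released the memory, but the memory allocators involved don't necessarily return the memory to the operating system, so it may look as if the Python process uses a lot more virtual memory than it actually uses."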
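The first half of that statement can be checked from inside the interpreter. The sketch below is only an illustration, written for Python 3 with the standard tracemalloc module; it is not taken from the article, and the helper name traced_alloc_and_free is made up. It reports how much memory Python itself considers allocated before, while, and after holding a large list.

```python
import gc
import tracemalloc


def traced_alloc_and_free(n):
    """Return Python-level allocated bytes before, during and after holding n ints."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    data = list(range(n))  # allocate a large list of int objects
    during, _ = tracemalloc.get_traced_memory()
    del data               # drop the only reference to the list
    gc.collect()           # not strictly needed here, but makes the intent explicit
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return before, during, after


if __name__ == "__main__":
    before, during, after = traced_alloc_and_free(1000000)
    print("before: %d bytes" % before)
    print("during: %d bytes" % during)
    print("after:  %d bytes" % after)
```

The traced figure should fall back close to its starting value after the del. Printing process.memory_info().rss from psutil at the same three points lets you compare this with what the operating system reports; that value may stay higher, which is the effect the quote describes.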