F.X*_*.X. 5 python memory-leaks numpy scipy objgraph
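I'm trying to track down the origin of a nasty memory leak in a Python/NumPy program that uses C/Cython extensions and `multiprocessing`.
Each subprocess processes a list of images and, for each image, sends the output array (typically about 200-300 MB) to the main process through a `Queue`. A fairly standard map/reduce setup.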
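For readers who want a concrete picture of that setup, here is a minimal sketch of workers feeding results to the main process through a queue. It is not the original program: `process_image`, `run_map_reduce`, the 1 KiB payload, and the explicit `fork` start method (chosen because the question is about Linux) are all assumptions made for illustration.

```python
import multiprocessing as mp


def process_image(image_id):
    # Hypothetical stand-in for the heavy C/Cython processing step.
    # The real program would return a 200-300 MB NumPy array here.
    return bytes(1024)


def worker(image_ids, out_queue):
    # "Map" side: process each image and ship the result to the parent.
    for image_id in image_ids:
        out_queue.put((image_id, process_image(image_id)))
    out_queue.put(None)  # sentinel: this worker has finished


def run_map_reduce(image_lists):
    ctx = mp.get_context("fork")  # assumption: a Unix platform with fork
    queue = ctx.Queue()
    procs = [ctx.Process(target=worker, args=(ids, queue)) for ids in image_lists]
    for proc in procs:
        proc.start()

    results = {}
    finished = 0
    # "Reduce" side: drain the queue before joining, so that workers never
    # block while trying to flush large payloads.
    while finished < len(procs):
        item = queue.get()
        if item is None:
            finished += 1
        else:
            image_id, payload = item
            results[image_id] = len(payload)

    for proc in procs:
        proc.join()
    return results
```

Calling `run_map_reduce([[1, 2], [3]])` starts two workers and returns the payload size for each image id.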
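As you can imagine, a memory leak can grow to enormous proportions here, and having multiple processes happily go past 20 GB of RAM when they should only need 5-6 GB is... annoying.
I've tried running a debug build of Python through Valgrind and quadruple-checked my extensions for memory leaks, but found nothing.
I've checked my Python code for dangling references to my arrays, and also used NumPy's allocation tracker to check whether my arrays were actually released. They were.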
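One way to run this kind of check from pure Python is to put a weak reference on an array and see whether it dies once the last name is deleted. The sketch below is only an illustration of that idea, not the checks actually performed in the question; the array shape is borrowed from the 752 x 480 images mentioned in the log, and `probe` and `corner` are made-up names. It also shows a common trap: a small view keeps the whole base array alive.

```python
import gc
import weakref

import numpy as np

# An 8-bit image-sized array, plus a weak reference that does not keep it alive.
image = np.zeros((480, 752), dtype=np.uint8)
probe = weakref.ref(image)

# A small view still points at the full buffer through its .base attribute.
corner = image[:8, :8]

del image
gc.collect()
alive_with_view = probe() is not None  # the view keeps the base array alive

del corner
gc.collect()
released = probe() is None  # no references left, so the array is gone
```

A check like this only sees references held on the Python side of the program's logic; it says nothing about how the allocator returns memory to the operating system.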
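The last thing I did was attach GDB to one of my processes (this bad boy is now running at 27 GB of RAM and counting) and dump a large part of the heap to disk. To my surprise, the dumped file was full of zeros! About 7 GB of zeros.
Is this standard memory allocation behavior in Python/NumPy? Am I missing something obvious that would explain why so much memory is used for nothing? How can I manage the memory properly?
EDIT: For the record, I'm running NumPy 1.7.1 and Python 2.7.3.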
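EDIT 2: I've been monitoring the process with `strace`, and it seems to keep raising the program break of each process (using the `brk()` system call).
Does CPython actually release memory properly? What about C extensions and NumPy arrays? Who decides when to call `brk()`: Python itself, or the underlying library (`libc`, ...)?
Below is a sample commented strace log from one iteration (i.e. one set of input images). Note that the break keeps increasing, but I made sure (with `objgraph`) that no meaningful NumPy array is kept alive inside the Python interpreter.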
# Reading .inf files with metadata
# Pretty small, no brk()
open("1_tif_all/AIR00642_1.inf", O_RDONLY) = 6
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f9387fff000
munmap(0x7f9387fff000, 4096) = 0
open("1_tif_all/AIR00642_2.inf", O_RDONLY) = 6
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f9387fff000
munmap(0x7f9387fff000, 4096) = 0
open("1_tif_all/AIR00642_3.inf", O_RDONLY) = 6
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f9387fff000
munmap(0x7f9387fff000, 4096) = 0
open("1_tif_all/AIR00642_4.inf", O_RDONLY) = 6
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f9387fff000
munmap(0x7f9387fff000, 4096) = 0
# This is where I'm starting the heavy processing
write(2, "[INFO/MapProcess-1] Shot 642: Da"..., 68) = 68
write(2, "[INFO/MapProcess-1] Shot 642: Vi"..., 103) = 103
write(2, "[INFO/MapProcess-1] Shot 642: Re"..., 66) = 66
# I'm opening a .tif image (752 x 480, 8-bit, 1 channel)
open("1_tif_all/AIR00642_3.tif", O_RDONLY) = 6
read(6, "II*\0JC\4\0", 8) = 8
mmap(NULL, 279600, PROT_READ, MAP_SHARED, 6, 0) = 0x7f9387fbb000
munmap(0x7f9387fbb000, 279600) = 0
write(2, "[INFO/MapProcess-1] Shot 642: Pr"..., 53) = 53
# Another .tif
open("1_tif_all/AIR00642_4.tif", O_RDONLY) = 6
read(6, "II*\0\266\374\3\0", 8) = 8
mmap(NULL, 261532, PROT_READ, MAP_SHARED, 6, 0) = 0x7f9387fc0000
munmap(0x7f9387fc0000, 261532) = 0
write(2, "[INFO/MapProcess-1] Shot 642: Pr"..., 51) = 51
brk(0x1aea97000) = 0x1aea97000
# Another .tif
open("1_tif_all/AIR00642_1.tif", O_RDONLY) = 6
read(6, "II*\0\220\253\4\0", 8) = 8
mmap(NULL, 306294, PROT_READ, MAP_SHARED, 6, 0) = 0x7f9387fb5000
munmap(0x7f9387fb5000, 306294) = 0
brk(0x1af309000) = 0x1af309000
write(2, "[INFO/MapProcess-1] Shot 642: Pr"..., 53) = 53
brk(0x1b03da000) = 0x1b03da000
# Another .tif
open("1_tif_all/AIR00642_2.tif", O_RDONLY) = 6
mmap(NULL, 345726, PROT_READ, MAP_SHARED, 6, 0) = 0x7f9387fab000
munmap(0x7f9387fab000, 345726) = 0
brk(0x1b0c42000) = 0x1b0c42000
write(2, "[INFO/MapProcess-1] Shot 642: Pr"..., 51) = 51
# I'm done reading my images
write(2, "[INFO/MapProcess-1] Shot 642: Fi"..., 72) = 72
# Allocating some more arrays for additional variables
# Increases by about 8M at a time
brk(0x1b1453000) = 0x1b1453000
brk(0x1b1c63000) = 0x1b1c63000
brk(0x1b2473000) = 0x1b2473000
brk(0x1b2c84000) = 0x1b2c84000
brk(0x1b3494000) = 0x1b3494000
brk(0x1b3ca5000) = 0x1b3ca5000
# What are these mmap calls doing here?
mmap(NULL, 270594048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f9377df1000
mmap(NULL, 270594048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f9367be2000
mmap(NULL, 270594048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f93579d3000
mmap(NULL, 270594048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f93477c4000
mmap(NULL, 270594048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f93375b5000
munmap(0x7f93579d3000, 270594048) = 0
munmap(0x7f93477c4000, 270594048) = 0
mmap(NULL, 270594048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f93579d3000
munmap(0x7f93375b5000, 270594048) = 0
mmap(NULL, 50737152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f9354970000
munmap(0x7f9354970000, 50737152) = 0
brk(0x1b4cc6000) = 0x1b4cc6000
brk(0x1b5ce7000) = 0x1b5ce7000
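The sizes in the log can be checked with a little arithmetic. The sketch below parses a few of the `brk()` lines above and computes how far the break moved each time; `brk_deltas` and `BRK_RE` are names invented for this example. As general background rather than a diagnosis of this program: with glibc, allocations above the mmap threshold are normally served by anonymous `mmap` and handed back with `munmap` when freed, while smaller ones come from the `brk` heap, which would be consistent with the mmap/munmap pairs above matching the size of one output array.

```python
import re

# Matches successful calls such as: brk(0x1b1453000) = 0x1b1453000
BRK_RE = re.compile(r"brk\((0x[0-9a-f]+)\)\s*=\s*(0x[0-9a-f]+)")


def brk_deltas(log_text):
    """Return the growth in bytes between consecutive brk() results."""
    breaks = [int(match.group(2), 16) for match in BRK_RE.finditer(log_text)]
    return [after - before for before, after in zip(breaks, breaks[1:])]


# Four consecutive lines copied from the log above.
LOG = """\
brk(0x1b1453000) = 0x1b1453000
brk(0x1b1c63000) = 0x1b1c63000
brk(0x1b2473000) = 0x1b2473000
brk(0x1b2c84000) = 0x1b2c84000
"""

deltas = brk_deltas(LOG)
deltas_mib = [delta / 2**20 for delta in deltas]  # each step is a bit over 8 MiB

# Size of each anonymous mmap region in the log, in MiB (roughly 258 MiB).
mmap_mib = 270594048 / 2**20
```

The first two steps come out at 0x810000 bytes (8 MiB plus 64 KiB), which matches the "about 8M at a time" comment in the log.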
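EDIT 3: Is freeing handled differently for small and large NumPy arrays? That may be relevant. I'm more and more convinced that I simply allocate too many arrays, which are not released to the system because that really is standard behavior. I'll try to preallocate my arrays and reuse them as needed.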
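A minimal sketch of what preallocating and reusing could look like, assuming the per-image work can write into an existing buffer through NumPy's `out=` argument. The function `process_all` and the scaling operation are placeholders, not the actual processing from the question.

```python
import numpy as np


def process_all(images, scale):
    """Scale every image into one reusable buffer and return the per-image sums."""
    # One float64 scratch buffer, allocated once and reused for every image.
    scratch = np.empty_like(images[0], dtype=np.float64)
    sums = []
    for image in images:
        # out= writes the result into scratch instead of allocating a new array.
        np.multiply(image, scale, out=scratch)
        sums.append(float(scratch.sum()))
    return sums
```

This only works as written when all images share the same shape; otherwise the buffer would need to be sized for the largest image and sliced per iteration.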
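Argh. I really should have checked those C extensions a fifth time.
I forgot to decrement the reference count of one of the temporary NumPy arrays I allocate from C. That array never left the C code, so I didn't see that I needed to deallocate it.
I still have no idea why it didn't show up in `objgraph`.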
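The effect of a forgotten decrement can be reproduced from Python by calling the C API's `Py_IncRef` through `ctypes`. This is only a simulation of the bug described above, not the extension code itself: the class `Temp` stands in for the temporary array, and `buggy_c_function` and `fixed_c_function` are hypothetical names.

```python
import ctypes
import gc
import weakref

# Py_IncRef and Py_DecRef are the function forms of Py_INCREF and Py_DECREF.
_incref = ctypes.pythonapi.Py_IncRef
_incref.argtypes = [ctypes.py_object]
_incref.restype = None
_decref = ctypes.pythonapi.Py_DecRef
_decref.argtypes = [ctypes.py_object]
_decref.restype = None


class Temp:
    """Hypothetical stand-in for a temporary array created inside C code."""


def buggy_c_function():
    tmp = Temp()
    probe = weakref.ref(tmp)
    _incref(tmp)  # extra reference taken "in C" and never released
    return probe


def fixed_c_function():
    tmp = Temp()
    probe = weakref.ref(tmp)
    _incref(tmp)
    _decref(tmp)  # the matching decrement that was missing
    return probe


leaked = buggy_c_function()
clean = fixed_c_function()
gc.collect()

leak_survives = leaked() is not None  # nothing in Python names it, yet it lives
clean_released = clean() is None      # freed as soon as the function returned
```

In the buggy variant no Python variable refers to the object after the function returns, yet its reference count never reaches zero, so it is never freed.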