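In my application I reuse an existing MobileNet trained on ImageNet and retrain only the output layer on the flowers dataset, using 5 classes. The retrained model is saved to disk. Afterwards, the model is loaded and evaluated over several iterations, which eventually exhausts memory and crashes the whole application. After some diagnosis, I realized that the leak comes from the Keras model.evaluate() method. The problem can be reproduced with this standalone example: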
import os
import resource

import keras
import numpy as np

if __name__ == '__main__':
    # ru_maxrss is the peak resident set size of the process (kilobytes on Linux).
    init_alloc = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    for it in range(4):
        # Random validation batch: 64 images of 224x224x3, one-hot labels over 5 classes.
        x_valid = np.random.uniform(0, 1, (64, 224, 224, 3)).astype(np.float32)
        y_valid = keras.utils.to_categorical(np.random.uniform(0, 5, (64, )).astype(np.int32), 5)
        start_alloc = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        # Reload the retrained model from disk on every iteration.
        model = keras.models.load_model(os.path.abspath(os.path.join('.', 'mobilenet_flowers.h5')),
                                        custom_objects={'relu6': keras.applications.mobilenet.relu6,
                                                        'DepthwiseConv2D': keras.applications.mobilenet.DepthwiseConv2D})
        loss, _ = model.evaluate(x_valid, y_valid, batch_size=64, verbose=False)
        # Attempted cleanup: reset the backend session and drop the model reference.
        keras.backend.clear_session()
        del model
        end_alloc = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        print('Iteration %d:' % it)
        print(' Memory alloc before evaluate() is %7d kilobytes' % start_alloc)
        print(' Memory alloc after evaluate() is %7d kilobytes' % end_alloc)
        print(' Memory alloc loss for evaluate is %7d kilobytes\n' % (end_alloc - start_alloc))
    exit_alloc = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    print('Memory alloc before loop is %7d kilobytes' % init_alloc)
    print('Memory alloc after loop is %7d kilobytes' % exit_alloc)
    print('Memory alloc difference is %7d kilobytes' % (exit_alloc - init_alloc))
When I run the script, it prints the following:
Iteration 0:
 Memory alloc before evaluate() is  251864 kilobytes
 Memory alloc after evaluate() is  901696 kilobytes
 Memory alloc loss for evaluate is  649832 kilobytes

Iteration 1:
 Memory alloc before evaluate() is  901696 kilobytes
 Memory alloc after evaluate() is 1036780 kilobytes
 Memory alloc loss for evaluate is  135084 kilobytes

Iteration 2:
 Memory alloc before evaluate() is 1036780 kilobytes
 Memory alloc after evaluate() is 1148692 kilobytes
 Memory alloc loss for evaluate is  111912 kilobytes

Iteration 3:
 Memory alloc before evaluate() is 1148692 kilobytes
 Memory alloc after evaluate() is 1190804 kilobytes
 Memory alloc loss for evaluate is   42112 kilobytes

Memory alloc before loop is  138792 kilobytes
Memory alloc after loop is 1190804 kilobytes
Memory alloc difference is 1052012 kilobytes
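One caveat when reading the figures above: `ru_maxrss` reports the peak resident set size of the process, so it can never decrease, and by itself it cannot show whether memory was released between iterations. A minimal sketch, assuming Linux (it reads `/proc/self/statm`), that logs the current resident set size next to the peak. The helper names `peak_rss_kb` and `current_rss_kb` are hypothetical and not part of the original script:

```python
import resource


def peak_rss_kb():
    """Peak resident set size so far (ru_maxrss is in kilobytes on Linux)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def current_rss_kb():
    """Current resident set size in kilobytes, or None if /proc is unavailable."""
    try:
        with open('/proc/self/statm') as f:
            # Second field of statm is the number of resident pages.
            resident_pages = int(f.read().split()[1])
    except (OSError, IndexError, ValueError):
        return None
    return resident_pages * resource.getpagesize() // 1024


if __name__ == '__main__':
    print('start:     peak=%s current=%s' % (peak_rss_kb(), current_rss_kb()))
    buf = bytearray(50 * 1024 * 1024)  # allocate a zero-filled 50 MB buffer
    print('allocated: peak=%s current=%s' % (peak_rss_kb(), current_rss_kb()))
    del buf
    print('released:  peak=%s current=%s' % (peak_rss_kb(), current_rss_kb()))
```

Calling `current_rss_kb()` in place of the `ru_maxrss` reads inside the loop would distinguish the two cases: if the current value keeps climbing after `del model`, memory is really being retained; if it falls back while the peak stays high, part of the reported growth is a measurement artifact.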
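Any suggestions as to what might be wrong? After browsing the forums I tried adding K.clear_session(), but as you can see in the code, that did not help. The model is temporarily hosted at https://ufile.io/rgaxs.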
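Since clearing the session did not help, one general way to bound memory growth, wherever the leak actually sits, is to run each load-and-evaluate step in a short-lived child interpreter so that the operating system reclaims everything when it exits. A minimal sketch using only the standard library. `CHILD_CODE` and `evaluate_in_child` are hypothetical names, and the child here only allocates a scratch buffer as a stand-in; a real version would import keras, call `keras.models.load_model` and `model.evaluate`, and print the loss instead:

```python
import json
import subprocess
import sys

# Hypothetical child program (stand-in for the Keras load + evaluate step).
# It receives the iteration number as argv[1] and prints a JSON result.
CHILD_CODE = """
import json, sys
iteration = int(sys.argv[1])
scratch = bytearray(10 * 1024 * 1024)  # memory held only by the child
print(json.dumps({'iteration': iteration, 'scratch_bytes': len(scratch)}))
"""


def evaluate_in_child(iteration):
    """Run one step in a fresh interpreter and return its parsed JSON result."""
    out = subprocess.check_output(
        [sys.executable, '-c', CHILD_CODE, str(iteration)])
    return json.loads(out.decode('utf-8'))


if __name__ == '__main__':
    for it in range(4):
        # Each call starts and tears down its own process, so nothing
        # allocated by the child can accumulate in the parent.
        print(evaluate_in_child(it))
```

The trade-off is the cost of starting an interpreter and reloading the model on every call, which is acceptable only when evaluations are infrequent.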
Some additional information about my environment:
== cat /etc/issue ===============================================
Linux 4.10.0-38-generic #42~16.04.1-Ubuntu SMP Tue Oct 10 16:32:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
VERSION="16.04.3 LTS (Xenial Xerus)"
VERSION_ID="16.04"
VERSION_CODENAME=xenial
== are we in docker =============================================
No
== compiler =====================================================
c++ (Ubuntu 5.4.0-6ubuntu1~16.04.5) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
== check pips ===================================================
numpy (1.12.1)
numpydoc (0.7.0)
protobuf (3.5.0)
tensorflow (1.4.0)
tensorflow-tensorboard (0.4.0rc3)
== check for virtualenv =========================================
False
== tensorflow import ============================================
tf.VERSION = 1.4.0
tf.GIT_VERSION = v1.4.0-rc1-11-g130a514
tf.COMPILER_VERSION = v1.4.0-rc1-11-g130a514
keras.VERSION = 2.0.9