Mad*_*dhi — python, memory, tensorflow, resnet, keras-layer
I'm trying to build a classifier using ResNet50 pre-trained weights. The codebase is implemented entirely with the Keras high-level TensorFlow API. The full code is posted at the GitHub link below.
The pre-trained model file is 94.7 MB.
I load the pre-trained file with
new_model = Sequential()
new_model.add(ResNet50(include_top=False,
                       pooling='avg',
                       weights=resnet_weight_paths))
and fit the model:
train_generator = data_generator.flow_from_directory(
    'path_to_the_training_set',
    target_size=(IMG_SIZE, IMG_SIZE),
    batch_size=12,
    class_mode='categorical'
)

validation_generator = data_generator.flow_from_directory(
    'path_to_the_validation_set',
    target_size=(IMG_SIZE, IMG_SIZE),
    class_mode='categorical'
)
# fit the model
new_model.fit_generator(
    train_generator,
    steps_per_epoch=3,
    validation_data=validation_generator,
    validation_steps=1
)
In the training dataset I have two folders, dog and cat, each holding almost 10,000 images. When I run the script, I get the following warnings:
Epoch 1/1
2018-05-12 13:04:45.847298: W tensorflow/core/framework/allocator.cc:101] Allocation of 38535168 exceeds 10% of system memory.
2018-05-12 13:04:46.845021: W tensorflow/core/framework/allocator.cc:101] Allocation of 37171200 exceeds 10% of system memory.
2018-05-12 13:04:47.552176: W tensorflow/core/framework/allocator.cc:101] Allocation of 37171200 exceeds 10% of system memory.
2018-05-12 13:04:48.199240: W tensorflow/core/framework/allocator.cc:101] Allocation of 37171200 exceeds 10% of system memory.
2018-05-12 13:04:48.918930: W tensorflow/core/framework/allocator.cc:101] Allocation of 37171200 exceeds 10% of system memory.
2018-05-12 13:04:49.274137: W tensorflow/core/framework/allocator.cc:101] Allocation of 19267584 exceeds 10% of system memory.
2018-05-12 13:04:49.647061: W tensorflow/core/framework/allocator.cc:101] Allocation of 19267584 exceeds 10% of system memory.
2018-05-12 13:04:50.028839: W tensorflow/core/framework/allocator.cc:101] Allocation of 19267584 exceeds 10% of system memory.
2018-05-12 13:04:50.413735: W tensorflow/core/framework/allocator.cc:101] Allocation of 19267584 exceeds 10% of system memory.
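For context, the sizes in those warnings are consistent with single intermediate activation tensors of ResNet50 at the batch size of 12 used above. A quick sanity check — the (56, 56, 256) shape here is an assumption based on the standard ResNet50 conv2 stage output, not something the log states:

```python
# One float32 tensor of shape (batch, 56, 56, 256) — a plausible ResNet50
# intermediate activation — at the batch_size of 12 used in the generator:
batch, h, w, c, dtype_bytes = 12, 56, 56, 256, 4
allocation = batch * h * w * c * dtype_bytes
print(allocation)  # 38535168 — matches the first warning
```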
Any ideas on how to optimize the way the pre-trained model is loaded, or how to get rid of this warning message?
Thanks!
Try reducing the batch_size attribute to a small number (like 1, 2 or 3). Example:
train_generator = data_generator.flow_from_directory(
    'path_to_the_training_set',
    target_size=(IMG_SIZE, IMG_SIZE),
    batch_size=2,
    class_mode='categorical'
)
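Per-batch memory scales linearly with batch_size, which is why this helps. A rough back-of-the-envelope check against the warning above — this treats the 38535168 bytes as one per-batch allocation at batch_size 12, which is an assumption, since the log does not name the tensor:

```python
old = 38535168          # bytes reported in the warning at batch_size = 12
new = old * 2 // 12     # expected size of the same allocation at batch_size = 2
print(new)              # 6422528
```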
Alternatively, you can set the environment variable TF_CPP_MIN_LOG_LEVEL=2 to filter out info and warning messages. I found that on this GitHub issue, where they complain about the same output. To do it from Python, you can use the solution from here:
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'  # must be set before importing tensorflow
import tensorflow as tf
With that, you can even switch it on and off at will. I test for the maximum possible batch size before running my code, and I can disable warnings and errors while doing this.
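That batch-size probe can be sketched as a loop that tries candidate sizes from large to small and keeps the first one that survives a training step. Both `probe_max_batch_size` and `run_step` are hypothetical names for illustration; a real OOM surfaces as different exception types depending on the backend (e.g. `ResourceExhaustedError` on GPU):

```python
def probe_max_batch_size(run_step, candidates=(64, 32, 16, 8, 4, 2, 1)):
    """Return the largest candidate batch size for which run_step(bs) succeeds."""
    for bs in candidates:
        try:
            run_step(bs)   # e.g. one model.fit(...) step on a small sample
            return bs
        except Exception:  # catch the backend's out-of-memory error
            continue
    return None
```

With TF_CPP_MIN_LOG_LEVEL set beforehand, the probe runs quietly even while the failing sizes trigger allocator warnings.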
I was running a small model on CPU and had the same issue. Adding os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' solved it.
I was having the same problem while running Tensorflow container with Docker and Jupyter notebook. I was able to fix this problem by increasing the container memory.
On Mac OS, you can easily do this from:
Docker Icon > Preferences > Advanced > Memory
Drag the scrollbar to maximum (e.g. 4GB). Apply and it will restart the Docker engine.
Now run your TensorFlow container again.
It was handy to use the docker stats command in a separate terminal. It shows the container memory usage in real time, so you can see how much memory consumption is growing:
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
3170c0b402cc mytf 0.04% 588.6MiB / 3.855GiB 14.91% 13.1MB / 3.06MB 214MB / 3.13MB 21
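If you'd rather check from inside the notebook than keep a second terminal open, the Python process's own peak memory can be read with the standard library. This is a rough per-process complement to docker stats, not a container-level number; note that ru_maxrss is reported in kilobytes on Linux but bytes on macOS:

```python
import resource

# Peak resident set size of the current Python process so far.
peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print(peak)
```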
I ran into the same problem, and I concluded that there are two factors to consider when you see this error:

1. batch_size — this controls how much data is processed in each step
2. image_size — the larger the image dimensions, the more data there is to process

Because of these two factors, the RAM cannot hold all the required data.

To solve the problem I tried two things: first, changing batch_size from 32 to 3 or 2; second, reducing image_size from (608, 608) to (416, 416).
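The combined effect of the two factors can be estimated with simple arithmetic: input memory grows linearly with batch_size and quadratically with the image side. Here `input_batch_bytes` is a hypothetical helper for illustration; real usage is dominated by activations, which scale similarly:

```python
def input_batch_bytes(batch_size, image_size, channels=3, dtype_bytes=4):
    """Rough size of one float32 input batch; activations add much more on top."""
    return batch_size * image_size * image_size * channels * dtype_bytes

heavy = input_batch_bytes(32, 608)  # original settings: 141950976 bytes (~142 MB)
light = input_batch_bytes(2, 416)   # reduced settings:    4153344 bytes (~4 MB)
```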