Posts by Bar*_*den

How can I programmatically determine the available GPU memory with TensorFlow?

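For a vector quantization (k-means) program, I would like to know how much memory is available on the current GPU (if there is one). I need this to choose an optimal batch size, so that the whole dataset is processed in as few batches as possible.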
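To make the goal concrete, here is a minimal sketch of the batch-size arithmetic, assuming the free memory in bytes is already known. The function name `plan_batches` and the `headroom` factor are hypothetical; the sketch only counts the float32 input batch itself and leaves the rest of the memory for intermediates, which a real k-means step would need to budget more carefully.

```python
import math


def plan_batches(n_samples, dim, free_bytes, bytes_per_value=4, headroom=0.5):
    """Return (batch_size, n_batches) so that one batch of input data fits.

    headroom is the fraction of free memory a single batch may occupy
    (assumption: the remainder is kept for intermediate tensors).
    """
    if n_samples <= 0 or dim <= 0:
        raise ValueError("n_samples and dim must be positive")
    bytes_per_sample = dim * bytes_per_value
    usable = int(free_bytes * headroom)
    batch_size = min(n_samples, usable // bytes_per_sample)
    if batch_size < 1:
        raise MemoryError("not even a single sample fits into the usable memory")
    n_batches = math.ceil(n_samples / batch_size)
    return batch_size, n_batches


# Hypothetical numbers: 4000 samples of dimension 250000, 3 GB free.
print(plan_batches(4000, 250000, 3_000_000_000))
```

With these made-up numbers, half of the free memory holds 1500 samples, so the dataset is covered in 3 batches.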

I wrote the following test program:

import tensorflow as tf
import numpy as np
from kmeanstf import KMeansTF
print("GPU Available: ", tf.test.is_gpu_available())

nn=1000
dd=250000
# size of one (nn, dd) float32 tensor: 4 bytes per value
print("{:,d} bytes".format(nn*dd*4))
dic = {}
# allocate four such tensors one after another and keep them all alive
for x in "ABCD":
    dic[x]=tf.random.normal((nn,dd))
    print(x,dic[x][:1,:2])

print("done...")


This is typical output on my system (Ubuntu 18.04 LTS, GTX-1060 6GB). Note the core dump.

python misc/maxmem.py 
GPU Available:  True
1,000,000,000 bytes
A tf.Tensor([[-0.23787294 -2.0841186 ]], shape=(1, 2), dtype=float32)
B tf.Tensor([[ 0.23762687 -1.1229591 ]], shape=(1, 2), dtype=float32)
C tf.Tensor([[-1.2672468   0.92139906]], shape=(1, 2), dtype=float32)
2020-01-02 17:35:05.988473: W tensorflow/core/common_runtime/bfc_allocator.cc:419] Allocator (GPU_0_bfc) ran out of memory trying to …
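One way to get a number before TensorFlow starts allocating is to ask the driver through `nvidia-smi`. The sketch below is a workaround rather than a TensorFlow API; the helper names `parse_free_mib` and `query_gpu_memory` are hypothetical, and it assumes an NVIDIA GPU with `nvidia-smi` on the PATH. By default TensorFlow reserves most of the GPU memory once it initializes the device, so the query is probably only meaningful when run before the first GPU operation.

```python
import shutil
import subprocess


def parse_free_mib(smi_output):
    """Parse 'free, total' lines (MiB, one line per GPU) into int tuples."""
    result = []
    for line in smi_output.strip().splitlines():
        free, total = (int(field) for field in line.split(","))
        result.append((free, total))
    return result


def query_gpu_memory():
    """Return [(free_MiB, total_MiB), ...] or None if the query is not possible."""
    if shutil.which("nvidia-smi") is None:
        return None
    cmd = [
        "nvidia-smi",
        "--query-gpu=memory.free,memory.total",
        "--format=csv,noheader,nounits",
    ]
    try:
        out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
        return parse_free_mib(out)
    except (subprocess.CalledProcessError, OSError, ValueError):
        return None


print(query_gpu_memory())
```

The result is reported in MiB, so it has to be multiplied by 1024 * 1024 before comparing it with a byte count such as `nn*dd*4`.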


python gpu tensorflow

Bar*_*den

Score: 14, answers: 2, views: 10k

How can I check/free GPU memory in TensorFlow 2.0b?

In my TensorFlow 2.0b program I get an error like this:

    ResourceExhaustedError: OOM when allocating tensor with shape[727272703] and type int8 on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc [Op:TopKV2]


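The error occurs after many GPU-based operations in this program have already executed successfully.

I would like to free all GPU memory associated with these past operations to avoid the error above. How can I do this in TensorFlow 2.0b? And how can I check the memory usage from within my program?

I could only find related information that relies on tf.Session(), which is no longer available in TensorFlow 2.0.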
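For checking usage without a session, more recent TensorFlow 2.x releases (2.5 or later, as far as I know, so not the 2.0 beta discussed here) provide `tf.config.experimental.get_memory_info`. The following is a minimal sketch under that assumption; the wrapper name `gpu_memory_in_use` is hypothetical, and it returns None when TensorFlow, a GPU, or the API is missing.

```python
def gpu_memory_in_use(device="GPU:0"):
    """Return (current_bytes, peak_bytes) allocated by TensorFlow, or None."""
    try:
        import tensorflow as tf
    except ImportError:
        return None  # TensorFlow is not installed
    try:
        if not tf.config.list_physical_devices("GPU"):
            return None  # no GPU visible to TensorFlow
        info = tf.config.experimental.get_memory_info(device)
    except (AttributeError, ValueError):
        return None  # API missing in this TensorFlow version, or unknown device
    return info["current"], info["peak"]


print(gpu_memory_in_use())
```

The numbers describe what TensorFlow's own allocator has handed out, not what the driver reports. Dropping the last Python reference to a tensor (for example with `del`) should make its memory reusable by that allocator, although it is generally not returned to the operating system while the process runs.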

gpu python-3.x tensorflow2.0

Bar*_*den

Score: 6, answers: 1, views: 2953