What is the difference between the memory usage metrics of "nvidia-smi" and "nvidia-smi dmon"?

wus*_*eng 2 nvidia tensorflow

The Memory-Usage I get from nvidia-smi looks like this:

$ nvidia-smi -i 0,1
Wed Mar  4 16:20:07 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.113      Driver Version: 418.113      CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:18:00.0 Off |                  N/A |
| 27%   37C    P8     1W / 250W |  10789MiB / 10989MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 208...  Off  | 00000000:3B:00.0 Off |                  N/A |
| 41%   50C    P2    67W / 250W |  10893MiB / 10989MiB |      2%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0    231853      C   tensorflow_model_server                    10779MiB |
|    1    120908      C   python                                     10883MiB |
+-----------------------------------------------------------------------------+

Memory-Usage is about 99%, but when I switch to nvidia-smi dmon I get this:

$ nvidia-smi dmon -i 0,1
# gpu   pwr gtemp mtemp    sm   mem   enc   dec  mclk  pclk
# Idx     W     C     C     %     %     %     %   MHz   MHz
    0     1    37     -     0     0     0     0   405   300
    1    67    50     -     0     0     0     0  6800  1350
    0     1    37     -     0     0     0     0   405   300
    1    67    50     -     0     0     0     0  6800  1350
    0     1    37     -     0     0     0     0   405   300
    1    67    50     -     0     0     0     0  6800  1350

Here mem % is 0%, and at other times it ranges from 0 to 3%.

Why is there such a difference?

Mik*_*oul 5

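Memory-Usage from nvidia-smi is the memory usage: how much device memory is allocated.

mem % from nvidia-smi dmon is the memory utilization: how much of the time device memory is actually being accessed.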

Memory-Usage = used memory / total memory.

Utilization = time during the past sample period in which global (device) memory was being read or written / length of the sample period * 100%
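To make the distinction concrete, here is a minimal Python sketch of the two formulas above. The function names are made up for illustration and are not part of nvidia-smi or any NVIDIA library. The memory figures are taken from GPU 0 in the question; the busy time is an assumed value for an idle process.

```python
def memory_usage_percent(used_mib: float, total_mib: float) -> float:
    """Share of device memory currently allocated (used / total)."""
    return used_mib / total_mib * 100.0


def memory_utilization_percent(busy_seconds: float, sample_seconds: float) -> float:
    """Share of the sample period during which device memory was read or written."""
    return busy_seconds / sample_seconds * 100.0


# GPU 0 from the question: almost all of the memory is allocated ...
usage = memory_usage_percent(10789, 10989)

# ... but an idle process does not touch it.
# busy_seconds=0.0 is an assumed value, not a measurement.
utilization = memory_utilization_percent(busy_seconds=0.0, sample_seconds=1.0)

print(f"usage: {usage:.1f}%  utilization: {utilization:.1f}%")
# prints: usage: 98.2%  utilization: 0.0%
```

So a process can hold nearly all of the memory (high Memory-Usage) while rarely reading or writing it (low mem %), which matches the two outputs in the question.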

  • @MrArsGravis In the [nvidia-smi command documentation](https://developer.download.nvidia.cn/compute/DCGM/docs/nvidia-smi-367.38.pdf), page 13 explains memory utilization and page 18 explains memory usage. (3 upvotes)