How to use the NVIDIA K80?

min*_*ing 0 cuda

The machine is set up with 4 NVIDIA K80s, and nvidia-smi reports information for all 4 cards, with GPU IDs 0, 1, 2, 3. Each K80 shows two types of GPU memory, FB and BAR1, both listed as 12 GB. However, CUDA programs only ever use the FB memory, while the BAR1 memory appears to remain free. When a CUDA program tries to allocate more than 12 GB of GPU memory on a card, an out-of-memory error occurs, yet the BAR1 memory is still unused.

How can the BAR1 memory be used correctly in this setup?
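A minimal sketch of how the 12 GB limit shows up from a CUDA program: `cudaMemGetInfo` reports only the FB (device) memory per GPU, so on each GK210 it will never report more than ~12 GB total, regardless of what nvidia-smi shows for BAR1. (This is a generic illustration, not code from the question.)

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int dev = 0; dev < count; ++dev) {
        cudaSetDevice(dev);
        size_t freeB = 0, totalB = 0;
        // cudaMemGetInfo reports only the FB (device) memory;
        // BAR1 is not a separate allocatable pool.
        cudaMemGetInfo(&freeB, &totalB);
        printf("GPU %d: %zu MiB free of %zu MiB total\n",
               dev, freeB >> 20, totalB >> 20);
    }
    return 0;
}
```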

Update: partial output of nvidia-smi

      Compute Mode:
         < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
    > Peer access from Tesla K80 (GPU0) -> Tesla K80 (GPU1) : Yes
    > Peer access from Tesla K80 (GPU0) -> Tesla K80 (GPU2) : No
    > Peer access from Tesla K80 (GPU0) -> Tesla K80 (GPU3) : No
    > Peer access from Tesla K80 (GPU1) -> Tesla K80 (GPU0) : Yes
    > Peer access from Tesla K80 (GPU1) -> Tesla K80 (GPU2) : No
    > Peer access from Tesla K80 (GPU1) -> Tesla K80 (GPU3) : No
    > Peer access from Tesla K80 (GPU2) -> Tesla K80 (GPU0) : No
    > Peer access from Tesla K80 (GPU2) -> Tesla K80 (GPU1) : No
    > Peer access from Tesla K80 (GPU2) -> Tesla K80 (GPU3) : Yes
    > Peer access from Tesla K80 (GPU3) -> Tesla K80 (GPU0) : No
    > Peer access from Tesla K80 (GPU3) -> Tesla K80 (GPU1) : No
    > Peer access from Tesla K80 (GPU3) -> Tesla K80 (GPU2) : Yes
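The peer-access matrix above can also be queried programmatically with `cudaDeviceCanAccessPeer`; a hedged sketch (assuming the same 4-GPU setup) that reproduces the same Yes/No table:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int from = 0; from < count; ++from) {
        for (int to = 0; to < count; ++to) {
            if (from == to) continue;
            int canAccess = 0;
            // Returns 1 only for GPU pairs that can do P2P over PCIe,
            // e.g. the two GK210s on the same K80 board.
            cudaDeviceCanAccessPeer(&canAccess, from, to);
            printf("Peer access GPU%d -> GPU%d : %s\n",
                   from, to, canAccess ? "Yes" : "No");
        }
    }
    return 0;
}
```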

Mic*_*idl 5

From the nvidia-smi man page:

BAR1 Memory Usage
       BAR1 is used to map the FB (device memory) so that it can  be  directly
       accessed  by  the CPU or by 3rd party devices (peer-to-peer on the PCIe
       bus).

BAR1 is a virtual address space that maps the device memory so it can be accessed by DMA from the host and/or other DMA-capable devices. In other words, BAR1 is not additional physical memory, and each GK210B GPU on the K80 has only 12 GB of VRAM (as stated in the specifications). When that memory is exhausted, you really are out of memory.
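Since BAR1 exists to back direct CPU and peer-to-peer access rather than to hold extra data, the typical way it gets exercised is a P2P copy between the two GK210s on one board. A hedged sketch (assuming GPU 0 and GPU 1 are the pair that reported "Peer access ... Yes" above):

```cuda
#include <cuda_runtime.h>

int main() {
    const size_t bytes = 1 << 20;  // 1 MiB, for illustration
    float *src = nullptr, *dst = nullptr;

    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0);  // mapping is backed by BAR1 windows
    cudaMalloc(&src, bytes);

    cudaSetDevice(1);
    cudaDeviceEnablePeerAccess(0, 0);
    cudaMalloc(&dst, bytes);

    // Direct device-to-device copy over PCIe, no staging through host memory.
    cudaMemcpyPeer(dst, 1, src, 0, bytes);

    cudaSetDevice(0);
    cudaFree(src);
    cudaSetDevice(1);
    cudaFree(dst);
    return 0;
}
```

None of this increases the allocatable memory per GPU; allocations still come out of the 12 GB of FB memory on each GK210.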