"'devices' property is not allowed" when creating docker-compose with an NVIDIA GPU

Sul*_*tha 6 gpu nvidia docker docker-compose nvidia-docker

Description of the issue

Context information (for bug reports)

Output of docker-compose version

docker-compose version 1.17.1, build unknown
docker-py version: 2.5.1
CPython version: 2.7.17
OpenSSL version: OpenSSL 1.1.1  11 Sep 2018

Output of docker version

Client:
 Version:           19.03.6
 API version:       1.40
 Go version:        go1.12.17
 Git commit:        369ce74a3c
 Built:             Fri Dec 18 12:21:44 2020
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          19.03.6
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.17
  Git commit:       369ce74a3c
  Built:            Thu Dec 10 13:23:49 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.3.3-0ubuntu1~18.04.4
  GitCommit:        
 runc:
  Version:          spec: 1.0.1-dev
  GitCommit:        
 docker-init:
  Version:          0.18.0
  GitCommit:        

Output of docker-compose config (make sure to add the relevant -f and other flags)

ERROR: The Compose file './docker-compose.yml' is invalid because:
services.testserver.deploy.resources.reservations value Additional properties are not allowed ('devices' was unexpected)

Steps to reproduce the issue

1. Create a Dockerfile that simply pulls the nvidia cuda image and runs a command to check for the NVIDIA GPU:

FROM nvidia/cuda:10.2-base
CMD nvidia-smi

2. When we build the image and run it without docker compose, it works like a charm:

docker image build testserver/ -t testserverimage
docker run --gpus all -exec -it testserverimage

This shows the NVIDIA GPU device:

Sat Feb 20 13:10:46 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00001918:00:00.0 Off |                    0 |
| N/A   52C    P0    71W / 149W |   7897MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

3. Now create docker-compose.yml:

version: "3.5"

services:
  testserver:
    image: nvidia/cuda:10.2-base
    build: './modelserver'
    deploy:
      resources:
        reservations:
          devices:
            - capabilities: [gpu]
              driver: nvidia

Observed result

ERROR: The Compose file './docker-compose.yml' is invalid because:
services.testserver.deploy.resources.reservations value Additional properties are not allowed ('devices' was unexpected)

Expected result

Sat Feb 20 13:10:46 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00001918:00:00.0 Off |                    0 |
| N/A   52C    P0    71W / 149W |   7897MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

Stack trace / full error message

ERROR: The Compose file './docker-compose.yml' is invalid because:
services.testserver.deploy.resources.reservations value Additional properties are not allowed ('devices' was unexpected)

Additional information

OS version / distribution, docker-compose install method, and other OS information:

NAME="Ubuntu"
VERSION="18.04.5 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.5 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic

Docker Compose installation:

sudo apt  install docker-compose

zig*_*arn 14

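From the documentation at https://docs.docker.com/compose/gpu-support/#enabling-gpu-access-to-service-containers:

Docker Compose v1.28.0+ allows you to define GPU reservations using the device structure defined in the Compose Specification.

Your docker-compose version is 1.17.1, which is why the schema rejects `devices` under `deploy.resources.reservations`. You need to upgrade docker-compose to at least 1.28.0.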
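
To make the version check concrete, here is a minimal shell sketch that compares a Compose version against the 1.28.0 minimum quoted above. The helper names `version_ge` and `check_compose` are made up for this example, and the comparison assumes GNU `sort -V` is available (as it is on Ubuntu). If `docker-compose` is not on the PATH, it falls back to the 1.17.1 value from the question.

```shell
#!/bin/sh
# Minimum Compose release that accepts `devices` under
# deploy.resources.reservations, per the documentation quoted above.
REQUIRED="1.28.0"

# version_ge A B: succeeds when version A >= version B.
# Assumption: `sort -V` (GNU coreutils version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: report whether a given Compose version needs upgrading.
check_compose() {
    if version_ge "$1" "$REQUIRED"; then
        echo "ok: $1 supports GPU device reservations"
    else
        echo "upgrade needed: $1 < $REQUIRED"
    fi
}

# Check the installed version if docker-compose is present;
# otherwise fall back to the version reported in the question.
if command -v docker-compose >/dev/null 2>&1; then
    check_compose "$(docker-compose version --short)"
else
    check_compose "1.17.1"
fi
```

Since the apt package in the question provides 1.17.1, upgrading probably means installing Compose some other way, for example from the official release binaries. That step is not shown here.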