Docker X server for NVIDIA OpenGL applications (no X on the host)

DTS*_*SED 5 opengl ubuntu nvidia headless docker

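I'm trying to create a Docker image that runs an X server for headless OpenGL applications using an NVIDIA GPU (useful for creating textures, running Unity3D without a screen, and so on). In this setup the host does not run an X server; I want to do everything inside the container.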

I'm using this Dockerfile for the image:

FROM ubuntu:18.04

ENV DEBIAN_FRONTEND=noninteractive

RUN apt update && \
    apt install -y \
        libglvnd0 \
        libgl1 \
        libglx0 \
        libegl1 \
        libgles2 \
        xserver-xorg-video-nvidia-440

COPY xorg.conf.nvidia-headless /etc/X11/xorg.conf

ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=graphics
ENV DISPLAY=:1

ENTRYPOINT ["/bin/bash"]

I generated xorg.conf.nvidia-headless with nvidia-xconfig:

Section "ServerLayout"
    Identifier     "Layout0"
    Screen      0  "Screen0"
EndSection

Section "Files"
EndSection

Section "Module"
    Load           "dbe"
    Load           "extmod"
    Load           "type1"
    Load           "freetype"
    Load           "glx"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    Option         "UseDisplayDevice" "None"
    SubSection     "Display"
        Virtual     1920 1080
        Depth       24
    EndSubSection
EndSection

I run Docker with --privileged and --gpus all, using nvidia-docker, and share the device with --device=/dev/dri/card0. Inside the container, nvidia-smi runs perfectly. Once the container is up, I start an X server:

Xorg -noreset +extension GLX +extension RANDR +extension RENDER -logfile ./xserver.log vt1 :1

But it fails with an error:

(EE) 
Fatal server error:
(EE) no screens found(EE) 
(EE) 

Here is the full log:

X.Org X Server 1.19.6
Release Date: 2017-12-20
[  1296.109] X Protocol Version 11, Revision 0
[  1296.109] Build Operating System: Linux 4.4.0-168-generic x86_64 Ubuntu
[  1296.109] Current Operating System: Linux ubuntu 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64
[  1296.109] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-4.15.0-112-generic root=UUID=8f2dc01d-1666-4abd-9bd1-cfe0a20afdf1 ro splash quiet vt.handoff=1
[  1296.109] Build Date: 14 November 2019  06:20:00PM
[  1296.109] xorg-server 2:1.19.6-1ubuntu4.4 (For technical support please see http://www.ubuntu.com/support) 
[  1296.109] Current version of pixman: 0.34.0
[  1296.109]    Before reporting problems, check http://wiki.x.org
    to make sure that you have the latest version.
[  1296.109] Markers: (--) probed, (**) from config file, (==) default setting,
    (++) from command line, (!!) notice, (II) informational,
    (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
[  1296.110] (++) Log file: "./xserver.log", Time: Wed Aug 19 08:38:46 2020
[  1296.110] (==) Using config file: "/etc/X11/xorg.conf"
[  1296.110] (==) Using system config directory "/usr/share/X11/xorg.conf.d"
[  1296.111] (==) ServerLayout "Layout0"
[  1296.111] (**) |-->Screen "Screen0" (0)
[  1296.111] (**) |   |-->Monitor "Monitor0"
[  1296.112] (**) |   |-->Device "Device0"
[  1296.112] (**) |-->Input Device "Keyboard0"
[  1296.112] (**) |-->Input Device "Mouse0"
[  1296.112] (==) Automatically adding devices
[  1296.112] (==) Automatically enabling devices
[  1296.112] (==) Automatically adding GPU devices
[  1296.112] (==) Automatically binding GPU devices
[  1296.112] (==) Max clients allowed: 256, resource mask: 0x1fffff
[  1296.114] (WW) The directory "/usr/share/fonts/X11/cyrillic" does not exist.
[  1296.114]    Entry deleted from font path.
[  1296.114] (WW) The directory "/usr/share/fonts/X11/100dpi/" does not exist.
[  1296.114]    Entry deleted from font path.
[  1296.114] (WW) The directory "/usr/share/fonts/X11/75dpi/" does not exist.
[  1296.114]    Entry deleted from font path.
[  1296.114] (WW) The directory "/usr/share/fonts/X11/Type1" does not exist.
[  1296.114]    Entry deleted from font path.
[  1296.114] (WW) The directory "/usr/share/fonts/X11/100dpi" does not exist.
[  1296.114]    Entry deleted from font path.
[  1296.114] (WW) The directory "/usr/share/fonts/X11/75dpi" does not exist.
[  1296.114]    Entry deleted from font path.
[  1296.114] (==) FontPath set to:
    /usr/share/fonts/X11/misc,
    built-ins
[  1296.114] (==) ModulePath set to "/usr/lib/xorg/modules"
[  1296.114] (WW) Hotplugging is on, devices using drivers 'kbd', 'mouse' or 'vmmouse' will be disabled.
[  1296.114] (WW) Disabling Keyboard0
[  1296.114] (WW) Disabling Mouse0
[  1296.115] (II) Loader magic: 0x55dca9edc020
[  1296.115] (II) Module ABI versions:
[  1296.115]    X.Org ANSI C Emulation: 0.4
[  1296.115]    X.Org Video Driver: 23.0
[  1296.115]    X.Org XInput driver : 24.1
[  1296.115]    X.Org Server Extension : 10.0
[  1296.116] (EE) dbus-core: error connecting to system bus: org.freedesktop.DBus.Error.FileNotFound (Failed to connect to socket /var/run/dbus/system_bus_socket: No such file or directory)
[  1296.116] (++) using VT number 1

[  1296.116] (II) systemd-logind: logind integration requires -keeptty and -keeptty was not provided, disabling logind integration
[  1296.116] (II) xfree86: Adding drm device (/dev/dri/card0)
[  1296.119] (**) OutputClass "nvidia" ModulePath extended to "/usr/lib/x86_64-linux-gnu/nvidia/xorg,/usr/lib/xorg/modules"
[  1296.122] (--) PCI:*(0:1:0:0) 10de:100c:1043:84b7 rev 161, Mem @ 0xf9000000/16777216, 0xd0000000/134217728, 0xd8000000/33554432, I/O @ 0x0000e000/128, BIOS @ 0x????????/131072
[  1296.122] (II) LoadModule: "glx"
[  1296.123] (II) Loading /usr/lib/xorg/modules/extensions/libglx.so
[  1296.131] (EE) Failed to load /usr/lib/xorg/modules/extensions/libglx.so: /usr/lib/xorg/modules/extensions/libglx.so: undefined symbol: glxServer
[  1296.131] (II) UnloadModule: "glx"
[  1296.131] (II) Unloading glx
[  1296.131] (EE) Failed to load module "glx" (loader failed, 7)
[  1296.131] (II) LoadModule: "nvidia"
[  1296.131] (II) Loading /usr/lib/x86_64-linux-gnu/nvidia/xorg/nvidia_drv.so
[  1296.138] (II) Module nvidia: vendor="NVIDIA Corporation"
[  1296.139]    compiled for 1.6.99.901, module version = 1.0.0
[  1296.139]    Module class: X.Org Video Driver
[  1296.140] (II) NVIDIA dlloader X Driver  440.100  Fri May 29 08:21:27 UTC 2020
[  1296.140] (II) NVIDIA Unified Driver for all Supported NVIDIA GPUs
[  1296.141] (II) Loading sub module "fb"
[  1296.141] (II) LoadModule: "fb"
[  1296.141] (II) Loading /usr/lib/xorg/modules/libfb.so
[  1296.143] (II) Module fb: vendor="X.Org Foundation"
[  1296.143]    compiled for 1.19.6, module version = 1.0.0
[  1296.143]    ABI class: X.Org ANSI C Emulation, version 0.4
[  1296.143] (II) Loading sub module "wfb"
[  1296.143] (II) LoadModule: "wfb"
[  1296.143] (II) Loading /usr/lib/xorg/modules/libwfb.so
[  1296.144] (II) Module wfb: vendor="X.Org Foundation"
[  1296.144]    compiled for 1.19.6, module version = 1.0.0
[  1296.144]    ABI class: X.Org ANSI C Emulation, version 0.4
[  1296.144] (II) Loading sub module "ramdac"
[  1296.144] (II) LoadModule: "ramdac"
[  1296.144] (II) Module "ramdac" already built-in
[  1296.145] (EE) NVIDIA: Failed to initialize the NVIDIA kernel module. Please see the
[  1296.145] (EE) NVIDIA:     system's kernel log for additional error messages and
[  1296.145] (EE) NVIDIA:     consult the NVIDIA README for details.
[  1296.145] (EE) NVIDIA: Failed to initialize the NVIDIA kernel module. Please see the
[  1296.145] (EE) NVIDIA:     system's kernel log for additional error messages and
[  1296.145] (EE) NVIDIA:     consult the NVIDIA README for details.
[  1296.145] (EE) NVIDIA: Failed to initialize the NVIDIA kernel module. Please see the
[  1296.145] (EE) NVIDIA:     system's kernel log for additional error messages and
[  1296.145] (EE) NVIDIA:     consult the NVIDIA README for details.
[  1296.145] (EE) No devices detected.
[  1296.145] (II) Applying OutputClass "nvidia" to /dev/dri/card0
[  1296.145]    loading driver: nvidia
[  1296.145] (==) Matched nvidia as autoconfigured driver 0
[  1296.145] (==) Matched nouveau as autoconfigured driver 1
[  1296.145] (==) Matched nouveau as autoconfigured driver 2
[  1296.145] (==) Matched modesetting as autoconfigured driver 3
[  1296.145] (==) Matched fbdev as autoconfigured driver 4
[  1296.145] (==) Matched vesa as autoconfigured driver 5
[  1296.145] (==) Assigned the driver to the xf86ConfigLayout
[  1296.145] (II) LoadModule: "nvidia"
[  1296.145] (II) Loading /usr/lib/x86_64-linux-gnu/nvidia/xorg/nvidia_drv.so
[  1296.145] (II) Module nvidia: vendor="NVIDIA Corporation"
[  1296.145]    compiled for 1.6.99.901, module version = 1.0.0
[  1296.145]    Module class: X.Org Video Driver
[  1296.145] (II) UnloadModule: "nvidia"
[  1296.145] (II) Unloading nvidia
[  1296.145] (II) Failed to load module "nvidia" (already loaded, 21980)
[  1296.145] (II) LoadModule: "nouveau"
[  1296.146] (WW) Warning, couldn't open module nouveau
[  1296.146] (II) UnloadModule: "nouveau"
[  1296.146] (II) Unloading nouveau
[  1296.146] (EE) Failed to load module "nouveau" (module does not exist, 0)
[  1296.146] (II) LoadModule: "modesetting"
[  1296.146] (II) Loading /usr/lib/xorg/modules/drivers/modesetting_drv.so
[  1296.147] (II) Module modesetting: vendor="X.Org Foundation"
[  1296.147]    compiled for 1.19.6, module version = 1.19.6
[  1296.147]    Module class: X.Org Video Driver
[  1296.147]    ABI class: X.Org Video Driver, version 23.0
[  1296.147] (II) LoadModule: "fbdev"
[  1296.147] (WW) Warning, couldn't open module fbdev
[  1296.147] (II) UnloadModule: "fbdev"
[  1296.147] (II) Unloading fbdev
[  1296.147] (EE) Failed to load module "fbdev" (module does not exist, 0)
[  1296.147] (II) LoadModule: "vesa"
[  1296.147] (WW) Warning, couldn't open module vesa
[  1296.147] (II) UnloadModule: "vesa"
[  1296.147] (II) Unloading vesa
[  1296.147] (EE) Failed to load module "vesa" (module does not exist, 0)
[  1296.147] (II) NVIDIA dlloader X Driver  440.100  Fri May 29 08:21:27 UTC 2020
[  1296.147] (II) NVIDIA Unified Driver for all Supported NVIDIA GPUs
[  1296.147] (II) modesetting: Driver for Modesetting Kernel Drivers: kms
[  1296.147] (WW) xf86OpenConsole: setpgid failed: Operation not permitted
[  1296.147] (WW) xf86OpenConsole: setsid failed: Operation not permitted
[  1296.147] (EE) NVIDIA: Failed to initialize the NVIDIA kernel module. Please see the
[  1296.147] (EE) NVIDIA:     system's kernel log for additional error messages and
[  1296.147] (EE) NVIDIA:     consult the NVIDIA README for details.
[  1296.147] (EE) NVIDIA: Failed to initialize the NVIDIA kernel module. Please see the
[  1296.147] (EE) NVIDIA:     system's kernel log for additional error messages and
[  1296.147] (EE) NVIDIA:     consult the NVIDIA README for details.
[  1296.147] (WW) Falling back to old probe method for modesetting
[  1296.147] (EE) Screen 0 deleted because of no matching config section.
[  1296.147] (II) UnloadModule: "modesetting"
[  1296.147] (EE) Device(s) detected, but none match those in the config file.
[  1296.147] (EE) 
Fatal server error:
[  1296.147] (EE) no screens found(EE) 
[  1296.147] (EE) 
Please consult the The X.Org Foundation support 
     at http://wiki.x.org
 for help. 
[  1296.147] (EE) Please also check the log file at "./xserver.log" for additional information.
[  1296.147] (EE) 
[  1296.149] (EE) Server terminated with error (1). Closing log file.

Can someone help me solve this? It will run on a headless machine with an NVIDIA GPU.

dat*_*olf 3

First things first: if you want headless OpenGL, don't use an X server!

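It has been many years since an X server was required to talk to the GPU. You can do headless rendering perfectly well without one. NVIDIA has a good article on how to do this: https://developer.nvidia.com/blog/egl-eye-opengl-visualization-without-x-server/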

The gist is that you set up the context with EGL and make it current without a surface by calling eglMakeCurrent(eglDpy, EGL_NO_SURFACE, EGL_NO_SURFACE, eglCtx);

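You still need the NVIDIA driver for Xorg, because it also contains all the off-screen functionality, but there is one important caveat: the NVIDIA userland driver must match the version of the nvidia kernel module on the host system. If you bundle the driver inside the Docker container, you are essentially tying that Docker image to one particular kernel module version on the host. That is not a desirable situation. Instead, you should configure the Docker image to bind in the driver and the OpenGL implementation libraries from the host system. Unfortunately, there is no common location where those libraries and drivers can be found, so it takes a bit more effort to pull all of them in reliably. But don't despair, NVIDIA has already done that work for you: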

https://gitlab.com/nvidia/container-images/opengl

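Also, to set up an off-screen context reliably, it helps to unset the DISPLAY variable. Because NVIDIA built all of its Vulkan and EGL support on top of its Xorg driver, a few code paths evaluate that variable, and unsetting it helps nudge all of them in the right direction. So, in your program, call unsetenv("DISPLAY") before setting up the OpenGL context.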
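A minimal sketch of that last step in C, assuming a POSIX system. The helper names clear_display and display_is_set are hypothetical and only for illustration. It uses unsetenv to remove the variable; setenv with a NULL value is not a defined way to remove one, and with its overwrite argument set to 0 it would leave an existing DISPLAY untouched anyway.

```c
#define _POSIX_C_SOURCE 200112L  /* expose unsetenv() under strict -std= modes */
#include <stdlib.h>

/*
 * Hypothetical helper (not from the answer): remove DISPLAY from this
 * process's environment. Intended to be called before any EGL/OpenGL
 * initialisation. Returns 0 on success and -1 on error, like unsetenv().
 */
int clear_display(void)
{
    return unsetenv("DISPLAY");
}

/* Returns 1 if DISPLAY is currently set in the environment, 0 otherwise. */
int display_is_set(void)
{
    return getenv("DISPLAY") != NULL;
}
```

Calling clear_display() once at program start, before the EGL display is obtained, should be enough; the change also carries over to any child processes started afterwards, since they inherit the environment.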

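  • You're assuming that I'm writing the OpenGL code in my own program, but that's not the case. I want to use an already released application that uses OpenGL internally, and I can't change it. As for the NVIDIA OpenGL image, it does essentially the same thing as my Dockerfile: it uses glvnd to redirect OpenGL calls to the NVIDIA driver. (3 upvotes)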