Why does a self-deleting global Vulkan instance segfault only when a layer is added?

Log*_*nes 1 c++ raii c++14 vulkan

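I'm using a global std::shared_ptr to handle automatic deletion of my Vulkan VkInstance. The pointer has a custom deleter that calls vkDestroyInstance when it goes out of scope. Everything works as expected until I enable the VK_LAYER_LUNARG_standard_validation layer, at which point vkDestroyInstance causes a segfault.

I've added a minimal example below that reproduces the problem.

minimal.cpp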

#include <vulkan/vulkan.h>
#include <iostream>
#include <memory>
#include <vector>
#include <cstdlib>

// The global self deleting instance
std::shared_ptr<VkInstance> instance;

int main()
{
    std::vector<const char *> extensions = {VK_EXT_DEBUG_REPORT_EXTENSION_NAME};
    std::vector<const char *> layers = {};
    // Uncomment to cause segfault:
    // layers.emplace_back("VK_LAYER_LUNARG_standard_validation");

    VkApplicationInfo app_info = {};
    app_info.sType = VK_STRUCTURE_TYPE_APPLICATION_INFO;
    app_info.pApplicationName = "Wat";
    app_info.applicationVersion = VK_MAKE_VERSION(1, 0, 0);
    app_info.pEngineName = "No Engine";
    app_info.engineVersion = VK_MAKE_VERSION(1, 0, 0);
    app_info.apiVersion = VK_API_VERSION_1_0;

    VkInstanceCreateInfo instance_info = {};
    instance_info.sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO;
    instance_info.pApplicationInfo = &app_info;
    instance_info.enabledExtensionCount =
        static_cast<uint32_t>(extensions.size());
    instance_info.ppEnabledExtensionNames = extensions.data();
    instance_info.enabledLayerCount = static_cast<uint32_t>(layers.size());
    instance_info.ppEnabledLayerNames = layers.data();

    // Handles auto deletion of the instance when it goes out of scope
    auto deleter = [](VkInstance *pInstance)
    {
        if (*pInstance)
        {
            vkDestroyInstance(*pInstance, nullptr);
            std::cout << "Deleted instance" << std::endl;
        }
        delete pInstance;
    };

    instance = std::shared_ptr<VkInstance>(new VkInstance(nullptr), deleter);
    if (vkCreateInstance(&instance_info, nullptr, instance.get()) != VK_SUCCESS)
    {
        std::cerr << "Failed to create a Vulkan instance" << std::endl;
        return EXIT_FAILURE;
    }
    std::cout << "Created instance" << std::endl;

    // When the program exits, everything should clean up nicely?
    return EXIT_SUCCESS;
}

When I run the program above as is, the output is what I expect:

$ g++-7 -std=c++14 minimal.cpp -isystem $VULKAN_SDK/include -L$VULKAN_SDK/lib -lvulkan -o minimal
$ ./minimal 
Created instance
Deleted instance
$

However, as soon as I add the VK_LAYER_LUNARG_standard_validation line back in:

// Uncomment to cause segfault:
layers.emplace_back("VK_LAYER_LUNARG_standard_validation");

I get

$ g++-7 -std=c++14 minimal.cpp -isystem $VULKAN_SDK/include -L$VULKAN_SDK/lib -lvulkan -o minimal
$ ./minimal 
Created instance
Segmentation fault (core dumped)
$

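Running under gdb, a backtrace shows the segfault occurring inside vkDestroyInstance (in the threading layer's threading::DestroyInstance):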

$ g++-7 -std=c++14 -g minimal.cpp -isystem $VULKAN_SDK/include -L$VULKAN_SDK/lib -lvulkan -o minimal
$ gdb -ex run ./minimal 
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
...
Starting program: /my/path/stackoverflow/vulkan/minimal 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Created instance

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff24c4334 in threading::DestroyInstance(VkInstance_T*, VkAllocationCallbacks const*) () from /my/path/Vulkan/1.1.77.0/x86_64/lib/libVkLayer_threading.so
(gdb) bt
#0  0x00007ffff24c4334 in threading::DestroyInstance(VkInstance_T*, VkAllocationCallbacks const*) () from /my/path/Vulkan/1.1.77.0/x86_64/lib/libVkLayer_threading.so
#1  0x00007ffff7bad243 in vkDestroyInstance () from /my/path/Vulkan/1.1.77.0/x86_64/lib/libvulkan.so.1
#2  0x000000000040105c in <lambda(VkInstance_T**)>::operator()(VkInstance *) const (__closure=0x617c90, pInstance=0x617c60) at minimal.cpp:38
#3  0x000000000040199a in std::_Sp_counted_deleter<VkInstance_T**, main()::<lambda(VkInstance_T**)>, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_M_dispose(void) (this=0x617c80) at /usr/include/c++/7/bits/shared_ptr_base.h:470
#4  0x0000000000401ef0 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x617c80) at /usr/include/c++/7/bits/shared_ptr_base.h:154
#5  0x0000000000401bc7 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x6052d8 <instance+8>, __in_chrg=<optimized out>) at /usr/include/c++/7/bits/shared_ptr_base.h:684
#6  0x0000000000401b6a in std::__shared_ptr<VkInstance_T*, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x6052d0 <instance>, __in_chrg=<optimized out>) at /usr/include/c++/7/bits/shared_ptr_base.h:1123
#7  0x0000000000401b9c in std::shared_ptr<VkInstance_T*>::~shared_ptr (this=0x6052d0 <instance>, __in_chrg=<optimized out>) at /usr/include/c++/7/bits/shared_ptr.h:93
#8  0x00007ffff724bff8 in __run_exit_handlers (status=0, listp=0x7ffff75d65f8 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:82
#9  0x00007ffff724c045 in __GI_exit (status=<optimized out>) at exit.c:104
#10 0x00007ffff7232837 in __libc_start_main (main=0x40108c <main()>, argc=1, argv=0x7fffffffdcf8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffdce8) at ../csu/libc-start.c:325
#11 0x0000000000400ed9 in _start ()
(gdb) 

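The problem goes away if I use a local instance (inside the main function) instead of a global one, so I suspect I don't fully understand some nuance of the Vulkan linker when layers are in use.

In my actual application I want to use a lazily instantiated static class to keep track of all my Vulkan objects, so I run into the same problem when the program exits.

Setup

  • g++: 7.3.0
  • OS: Ubuntu 16.04
  • Nvidia driver: 390.67 (also tried 396)
  • Vulkan SDK: 1.1.77.0 (also tried 1.1.73)
  • GPU: GeForce GTX TITAN (dual SLI, if that matters?)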

Yak*_*ont 5

Global state is a bad idea. In most cases, globals are destroyed in no useful order relative to one another.

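Clean up your state in main, not at static destruction time. Simple objects that depend only on memory (a small step up from POD) and have no cross-dependencies tend not to cause problems, but go any further than that and you are in a hornet's nest.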
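As a minimal sketch of "clean up in main", assuming the global from the question has to stay: release it explicitly as the last step of main so the deleter runs while the process is still fully alive. `FakeHandle`, `fake_destroy`, `create_global_instance`, and `shutdown_global_instance` are hypothetical stand-ins, not Vulkan API; they exist only so the sketch compiles without the SDK. In the real program the deleter would call vkDestroyInstance instead.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for VkInstance and vkDestroyInstance.
using FakeHandle = int*;

int destroy_calls = 0;  // counts how often the stand-in destroy function ran

void fake_destroy(FakeHandle) { ++destroy_calls; }

// Same shape as the question: a global shared_ptr with a custom deleter.
std::shared_ptr<FakeHandle> g_instance;

void create_global_instance() {
    assert(g_instance == nullptr);  // create at most once
    static int backing = 0;         // only gives the fake handle a non-null value
    auto deleter = [](FakeHandle* p) {
        if (*p) fake_destroy(*p);
        delete p;
    };
    g_instance = std::shared_ptr<FakeHandle>(new FakeHandle(&backing), deleter);
}

// Call this as the last step of main(), before returning.
void shutdown_global_instance() {
    g_instance.reset();  // the deleter runs here, not during static destruction
    assert(g_instance == nullptr);
}
```

After `shutdown_global_instance()` the global is empty, so its destructor at process exit has nothing left to destroy. This only helps if no other copy of the shared_ptr is still alive at that point.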

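Your global shared_ptr is being cleaned up, and its destruction code runs after some arbitrary global state inside Vulkan has already been cleaned up. That causes the segfault. The interesting question here is not "why does this segfault" but "how can I avoid this segfault". The answer is "stop using global state"; nothing else really works.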
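One way to read "stop using global state" is sketched below: an owner object that lives in a scope inside main and is passed by reference to whatever needs it. Everything named here (`FakeInstance_T`, `fake_create_instance`, `fake_destroy_instance`, `InstanceDeleter`, `Context`, `run_application`) is a hypothetical stand-in so the example builds without Vulkan. With the real API the deleter would presumably call vkDestroyInstance, and the context would also hold the other Vulkan objects the application tracks.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for the opaque VkInstance handle and its
// create/destroy functions.
struct FakeInstance_T {};
using FakeInstance = FakeInstance_T*;

int live_instances = 0;  // number of handles created and not yet destroyed

FakeInstance fake_create_instance() {
    ++live_instances;
    return new FakeInstance_T;
}

void fake_destroy_instance(FakeInstance handle) {
    --live_instances;
    delete handle;
}

// The handle is already a pointer, so unique_ptr can store it directly
// and call the destroy function through a stateless deleter.
struct InstanceDeleter {
    void operator()(FakeInstance handle) const {
        if (handle) fake_destroy_instance(handle);
    }
};
using InstanceOwner = std::unique_ptr<FakeInstance_T, InstanceDeleter>;

// Owned by main and passed by reference, instead of living in a global.
struct Context {
    InstanceOwner instance;
};

Context make_context() {
    return Context{InstanceOwner(fake_create_instance())};
}

int use_context(const Context& ctx) {
    assert(ctx.instance != nullptr);
    return live_instances;
}

// Stands in for the body of main().
int run_application() {
    int seen = 0;
    {
        Context ctx = make_context();
        seen = use_context(ctx);
    }  // ctx is destroyed here, before any static destructor runs
    return seen;
}
```

Because `ctx` has automatic storage duration, its destructor runs when the inner scope ends, which is before main returns and therefore before any static teardown. This matches the observation in the question that a local instance does not crash.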

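  • The whole point of having a `VkInstance` (n.b. `VkAllocationCallbacks`) is that there is no global state in Vulkan. It would be nice to follow the same principle. If that is what you have run into, you should report it at [KhronosGroup/Vulkan-ValidationLayers](https://github.com/KhronosGroup/Vulkan-ValidationLayers/issues). Running with symbols to get the offending line number in "threading.cpp" would help. (2 upvotes)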