How to get CUDA 11.6 working with CMake 3.23 and MSVC 2019

Thi*_*ult 5 cuda cmake buildconfiguration visual-studio-2019

I can't find a way to use the CUDA language in a CMake project on Windows with the standard MSVC 2019 compiler.

I am trying to configure and build the hello-cmake-cuda repository (also described in this blog post).

Contents of CMakeLists.txt:

    cmake_minimum_required(VERSION 3.8 FATAL_ERROR)
    project(hello LANGUAGES CXX CUDA)
    enable_language(CUDA)
    add_executable(hello hello.cu)

Here is the output of the cmake .. command run from the build directory:

    PS C:\GitRepo\cuda_hello\build> cmake ..
    -- Selecting Windows SDK version 10.0.18362.0 to target Windows 10.0.22000.
    CMake Error at C:/Program Files/CMake/share/cmake-3.23/Modules/CMakeDetermineCUDACompiler.cmake:311 (message):
      CMAKE_CUDA_ARCHITECTURES must be valid if set.
    Call Stack (most recent call first):
      CMakeLists.txt:5 (project)


    -- Configuring incomplete, errors occurred!
    See also "C:/GitRepo/cuda_hello/build/CMakeFiles/CMakeOutput.log".
    See also "C:/GitRepo/cuda_hello/build/CMakeFiles/CMakeError.log".

This means that architectures_tested, from CMakeDetermineCUDACompiler.cmake:311, is empty...

How can I get CMake to finish its configuration and build this simple program?

My development environment

  • Operating system: Windows 11, version 10.0.22000, build 22000
  • Compiler: Microsoft Visual Studio Community 2019, version 16.11.11
  • CMake version: 3.23
  • CUDA version: 11.6

I have tried different versions of each piece of software but still run into the same problem. For now I have decided to stay with these versions.

My GPU is configured correctly: it shows up in nvidia-smi, and I am also able to build and run the deviceQuery CUDA sample:

    CUDA Device Query (Runtime API) version (CUDART static linking)

    Detected 1 CUDA Capable device(s)

    Device 0: "NVIDIA GeForce GTX 1650"
      CUDA Driver Version / Runtime Version          11.6 / 11.6
      CUDA Capability Major/Minor version number:    7.5
      etc. etc. ...

    deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.6, CUDA Runtime Version = 11.6, NumDevs = 1
    Result = PASS

My PATH environment variable:

    PS C:\GitRepo\hello-cuda-cmake-master> $env:path -split ";"
    C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\bin
    C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\libnvvp
    C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\bin
    C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\libnvvp

    C:\Program Files (x86)\Common Files\Oracle\Java\javapath
    C:\Python38\Scripts\
    C:\Python38\
    C:\Windows\system32
    C:\Windows
    C:\Windows\System32\Wbem
    C:\Windows\System32\WindowsPowerShell\v1.0\
    C:\Windows\System32\OpenSSH\
    C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common
    C:\Program Files\NVIDIA Corporation\NVIDIA NvDLISR
    C:\Program Files\PuTTY\
    C:\Program Files (x86)\PuTTY\
    C:\Program Files\Microsoft SQL Server\110\Tools\Binn\
    C:\Program Files\TortoiseSVN\bin
    C:\Program Files\TortoiseGit\bin
    C:\Program Files\Microsoft VS Code\bin
    C:\WINDOWS\system32
    C:\WINDOWS
    C:\WINDOWS\System32\Wbem
    C:\WINDOWS\System32\WindowsPowerShell\v1.0\
    C:\WINDOWS\System32\OpenSSH\
    C:\Program Files\Docker\Docker\resources\bin
    C:\ProgramData\DockerDesktop\version-bin
    C:\Program Files\Git\cmd
    C:\WINDOWS\system32
    C:\WINDOWS
    C:\WINDOWS\System32\Wbem
    C:\WINDOWS\System32\WindowsPowerShell\v1.0\
    C:\WINDOWS\System32\OpenSSH\
    C:\Program Files\NVIDIA Corporation\Nsight Compute 2022.1.1\
    C:\Program Files\CMake\bin
    C:\Ruby30-x64\bin
    C:\Users\Thibault GEFFROY\.cargo\bin
    C:\Users\Thibault GEFFROY\AppData\Local\Microsoft\WindowsApps
    C:\Program Files\OpenCppCoverage
    C:\intelFPGA\20.1\modelsim_ase\win32aloem

What I tried without success

If I try to set the desired CMAKE_CUDA_ARCHITECTURES

    set(CMAKE_CUDA_ARCHITECTURES 75)

I get:

    PS C:\GitRepo\cuda_hello\build> cmake ..
    -- Selecting Windows SDK version 10.0.18362.0 to target Windows 10.0.22000.
    -- The CUDA compiler identification is unknown
    CMake Error at C:/Program Files/CMake/share/cmake-3.23/Modules/CMakeDetermineCUDACompiler.cmake:654 (message):
      The CMAKE_CUDA_ARCHITECTURES:

        75

      do not all work with this compiler.  Try:



      instead.
    Call Stack (most recent call first):
      CMakeLists.txt:5 (project)


    -- Configuring incomplete, errors occurred!
    See also "C:/GitRepo/cuda_hello/build/CMakeFiles/CMakeOutput.log".
    See also "C:/GitRepo/cuda_hello/build/CMakeFiles/CMakeError.log".
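For reference, here is a minimal sketch of how CMAKE_CUDA_ARCHITECTURES is commonly set: before project(), and guarded so that a value passed on the command line wins. The value 75 corresponds to the compute capability 7.5 reported by deviceQuery for the GTX 1650. The placement is an assumption about what was tried, and it would not by itself address the "CUDA compiler identification is unknown" line, which suggests nvcc could not compile CMake's test file at all.

```cmake
cmake_minimum_required(VERSION 3.18 FATAL_ERROR)

# Set a default architecture before project() so that it is visible
# while CMake detects the CUDA compiler. A value passed with
# -DCMAKE_CUDA_ARCHITECTURES=... on the command line takes precedence.
if(NOT DEFINED CMAKE_CUDA_ARCHITECTURES)
  set(CMAKE_CUDA_ARCHITECTURES 75)  # compute capability 7.5 (GTX 1650)
endif()

# Listing CUDA here already enables the language; a separate
# enable_language(CUDA) call is not needed.
project(hello LANGUAGES CXX CUDA)

add_executable(hello hello.cu)
```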

If I try to use the FindCUDA module to set CMAKE_CUDA_ARCHITECTURES (the solution given here by @alfC), I get:

    PS C:\GitRepo\cuda_hello\build> cmake ..
    CMake Error at C:/Program Files/CMake/share/cmake-3.23/Modules/FindCUDA/select_compute_arch.cmake:120 (file):
      file failed to open for writing (Permission denied):

        /detect_cuda_compute_capabilities.cpp
    Call Stack (most recent call first):
      CMakeLists.txt:4 (CUDA_DETECT_INSTALLED_GPUS)


    CMake Error: The source directory "CMAKE_FLAGS" does not exist.
    Specify --help for usage, or press the help button on the CMake GUI.
    CMake Error at C:/Program Files/CMake/share/cmake-3.23/Modules/FindCUDA/select_compute_arch.cmake:141 (try_run):
      Failed to configure test project build system.
    Call Stack (most recent call first):
      CMakeLists.txt:4 (CUDA_DETECT_INSTALLED_GPUS)


    CMake Error: TRY_COMPILE attempt to remove -rf directory that does not contain CMakeTmp:/detect_cuda_compute_capabilities.cpp
    -- Configuring incomplete, errors occurred!
    See also "C:/GitRepo/cuda_hello/build/CMakeFiles/CMakeOutput.log".
    See also "C:/GitRepo/cuda_hello/build/CMakeFiles/CMakeError.log".

Finally, if I try calling find_package(CUDA), I get:

    PS C:\GitRepo\cuda_hello\build> cmake ..
    CMake Error at C:/Program Files/CMake/share/cmake-3.23/Modules/FindCUDA.cmake:677 (cmake_initialize_per_config_variable):
      Unknown CMake command "cmake_initialize_per_config_variable".
    Call Stack (most recent call first):
      CMakeLists.txt:2 (find_package)


    -- Configuring incomplete, errors occurred!
    See also "C:/GitRepo/cuda_hello/build/CMakeFiles/CMakeOutput.log".
    See also "C:/GitRepo/cuda_hello/build/CMakeFiles/CMakeError.log".
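The call stack points at CMakeLists.txt:2, which suggests find_package(CUDA) ran before project(). As far as I know, cmake_initialize_per_config_variable is a helper that CMake only defines once project() has loaded its platform modules, so this particular error is likely an ordering problem rather than a CUDA one. A sketch with project() first follows; FindCUDA has been deprecated since CMake 3.10, so it is shown only to illustrate the ordering.

```cmake
cmake_minimum_required(VERSION 3.8 FATAL_ERROR)

# project() comes first: it loads the platform and language modules
# that define helpers such as cmake_initialize_per_config_variable.
project(hello LANGUAGES CXX)

# Legacy module, deprecated since CMake 3.10; shown only for the ordering.
find_package(CUDA REQUIRED)

# cuda_add_executable is provided by the FindCUDA module.
cuda_add_executable(hello hello.cu)
```

The same ordering may also explain the earlier attempt that tried to write to /detect_cuda_compute_capabilities.cpp: before project() runs, the project binary directory is not yet set, so the path would start at the filesystem root.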

Edit 1:

In reply to @einpoklum's solution

Thank you for the suggestion, but it does not work either.

Here is the output of the cmake -B build command run in the repository:

    PS C:\GitRepo\hello-cuda-cmake-master> cmake -B build
    -- Building for: Visual Studio 16 2019
    -- Selecting Windows SDK version 10.0.18362.0 to target Windows 10.0.22000.
    -- The CUDA compiler identification is unknown
    CMake Error at C:/Program Files/CMake/share/cmake-3.23/Modules/CMakeDetermineCUDACompiler.cmake:633 (message):
      Failed to detect a default CUDA architecture.



      Compiler output:

    Call Stack (most recent call first):
      CMakeLists.txt:2 (project)


    -- Configuring incomplete, errors occurred!
    See also "C:/GitRepo/hello-cuda-cmake-master/build/CMakeFiles/CMakeOutput.log".
    See also "C:/GitRepo/hello-cuda-cmake-master/build/CMakeFiles/CMakeError.log".

The output is the same from PowerShell and from the MSVC command prompt.

Here are the CMake variables and their values when using cmake-gui:

(screenshot: CMake GUI)

When I run a plain nvcc build command, nvcc hello.cu, from the MSVC command prompt, I get:

    nvcc fatal   : Could not set up the environment for Microsoft Visual Studio using 'c:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.29.30133/bin/HostX86/x86/../../../../../../../VC/Auxiliary/Build/vcvars64.bat'

However, the path is valid, and the vcvars64.bat script exists at that location.

If I add find_package(CUDAToolkit) to CMakeLists.txt

New CMakeLists.txt:

    cmake_minimum_required(VERSION 3.18 FATAL_ERROR)
    find_package(CUDAToolkit)
    project(hello LANGUAGES CUDA)
    add_executable(hello hello.cu)

Output:

    PS C:\GitRepo\hello-cuda-cmake-master> cmake -B build
    -- Building for: Visual Studio 16 2019
    -- Found CUDAToolkit: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.6/include (found version "11.6.124")
    -- Selecting Windows SDK version 10.0.18362.0 to target Windows 10.0.22000.
    -- The CUDA compiler identification is unknown
    CMake Error at C:/Program Files/CMake/share/cmake-3.23/Modules/CMakeDetermineCUDACompiler.cmake:633 (message):
      Failed to detect a default CUDA architecture.



      Compiler output:

    Call Stack (most recent call first):
      CMakeLists.txt:3 (project)


    -- Configuring incomplete, errors occurred!
    See also "C:/GitRepo/hello-cuda-cmake-master/build/CMakeFiles/CMakeOutput.log".
    See also "C:/GitRepo/hello-cuda-cmake-master/build/CMakeFiles/CMakeError.log".
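Note that find_package(CUDAToolkit) locates the toolkit's headers and libraries; it does not enable the CUDA language or change how nvcc is detected, which is consistent with the output here: the toolkit is found, yet project() still fails. A sketch of what the module is typically used for, linking a host-only C++ target against the CUDA runtime, is below. The target name hello_host and the file main.cpp are hypothetical.

```cmake
cmake_minimum_required(VERSION 3.18 FATAL_ERROR)
project(hello LANGUAGES CXX)

# Finds the toolkit and defines imported targets such as CUDA::cudart.
find_package(CUDAToolkit REQUIRED)

# main.cpp is a hypothetical host-only source that calls the CUDA runtime API.
add_executable(hello_host main.cpp)
target_link_libraries(hello_host PRIVATE CUDA::cudart)
```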

Edit 2:

I am trying to build the CUDA sample BlackScholes without CMake, using the MSVC 2019 solution that ships with it.

I end up with this error:

    Severity: Error
    Code: MSB3721
    Description: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\bin\nvcc.exe" -gencode=arch=compute_35,code=\"sm_35,compute_35\" -gencode=arch=compute_37,code=\"sm_37,compute_37\" -gencode=arch=compute_50,code=\"sm_50,compute_50\" -gencode=arch=compute_52,code=\"sm_52,compute_52\" -gencode=arch=compute_60,code=\"sm_60,compute_60\" -gencode=arch=compute_61,code=\"sm_61,compute_61\" -gencode=arch=compute_70,code=\"sm_70,compute_70\" -gencode=arch=compute_75,code=\"sm_75,compute_75\" -gencode=arch=compute_80,code=\"sm_80,compute_80\" -gencode=arch=compute_86,code=\"sm_86,compute_86\" --use-local-env -ccbin "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\bin\HostX86\x64" -x cu   -I./ -I../../../Common -I./ -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\/include" -I../../../Common -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\include"  -G   --keep-dir x64\Debug  -maxrregcount=0  --machine 64 --compile -cudart static -Xcompiler "/wd 4819"  --threads 0 -g  -DWIN32 -DWIN32 -D_MBCS -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Fdx64/Debug/vc142.pdb /FS /Zi /RTC1 /MTd " -o "C:\ProgramData\NVIDIA Corporation\CUDA Samples\v11.6\cuda-samples\Samples\5_Domain_Specific\BlackScholes\x64\Debug\BlackScholes.cu.obj" "C:\ProgramData\NVIDIA Corporation\CUDA Samples\v11.6\cuda-samples\Samples\5_Domain_Specific\BlackScholes\BlackScholes.cu"" exited with code 1.
    Project: BlackScholes
    File: C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\MSBuild\Microsoft\VC\v160\BuildCustomizations\CUDA 11.6.targets
    Line: 790
    Suppression State:

When I build the BlackScholes sample under WSL 2 Ubuntu 20.4, using the following CUDA installation and these instructions, I get the following output:

    $ sudo make BlackScholes
    /usr/local/cuda/bin/nvcc -ccbin g++ -I../../../Common  -m64    -maxrregcount=16 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o BlackScholes.o -c BlackScholes.cu
    nvcc warning : The 'compute_35', 'compute_37', 'sm_35', and 'sm_37' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    ptxas warning : For profile sm_86 adjusting per thread register count of 16 to lower bound of 24
    ptxas warning : For profile sm_80 adjusting per thread register count of 16 to lower bound of 24
    ptxas warning : For profile sm_70 adjusting per thread register count of 16 to lower bound of 24
    ptxas warning : For profile sm_75 adjusting per thread register count of 16 to lower bound of 24
    /usr/local/cuda/bin/nvcc -ccbin g++ -I../../../Common  -m64    -maxrregcount=16 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o BlackScholes_gold.o -c BlackScholes_gold.cpp
    nvcc warning : The 'compute_35', 'compute_37', 'sm_35', and 'sm_37' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    /usr/local/cuda/bin/nvcc -ccbin g++   -m64      -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o BlackScholes BlackScholes.o BlackScholes_gold.o
    nvcc warning : The 'compute_35', 'compute_37', 'sm_35', and 'sm_37' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    mkdir -p ../../../bin/x86_64/linux/release
    cp BlackScholes ../../../bin/x86_64/linux/release


    $ ./BlackScholes
    [./BlackScholes] - Starting...
    GPU Device 0: "Turing" with compute capability 7.5

    Initializing data...
    ...allocating CPU memory for options.
    ...allocating GPU memory for options.
    ...generating input data in CPU mem.
    ...copying input data to GPU mem.
    Data init done.

    Executing Black-Scholes GPU kernel (512 iterations)...
    Options count             : 8000000
    BlackScholesGPU() time    : 0.722482 msec
    Effective memory bandwidth: 110.729334 GB/s
    Gigaoptions per second    : 11.072933

    BlackScholes, Throughput = 11.0729 GOptions/s, Time = 0.00072 s, Size = 8000000 options, NumDevsUsed = 1, Workgroup = 128

    Reading back GPU results...
    Checking the results...
    ...running CPU calculations.

    Comparing the results...
    L1 norm: 1.741792E-07
    Max absolute error: 1.192093E-05

    Shutting down...
    ...releasing GPU memory.
    ...releasing CPU memory.
    Shutdown done.

    [BlackScholes] - Test Summary

    NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

    Test passed

小智 1

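I ran into the same problem. The main issue is that it does not work with CMake 3.23.2.

The steps I took to solve it:

  1. Uninstall every CUDA version installed on the machine
  2. Remove all CUDA-related environment variables
  3. Install CUDA v12.2
  4. Install the latest CMake, 3.27.4 (GUI version)
  5. Copy the CUDA Visual Studio integration files into Visual Studio 2022 (link)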
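After a reinstall along these lines, one way to confirm that CMake can actually use the CUDA compiler is the CheckLanguage module, which probes for a language without aborting the configure step when it is missing. The following is a minimal sketch, assuming the same hello.cu source as in the question; the wording of the messages is illustrative only.

```cmake
cmake_minimum_required(VERSION 3.18 FATAL_ERROR)
project(hello LANGUAGES CXX)

# check_language() tries to find a usable compiler for the language and
# sets CMAKE_CUDA_COMPILER to its path, or to NOTFOUND on failure.
include(CheckLanguage)
check_language(CUDA)

if(CMAKE_CUDA_COMPILER)
  enable_language(CUDA)
  message(STATUS "CUDA compiler: ${CMAKE_CUDA_COMPILER} (version ${CMAKE_CUDA_COMPILER_VERSION})")
  add_executable(hello hello.cu)
else()
  message(FATAL_ERROR "No working CUDA compiler found; check the CUDA toolkit and its Visual Studio integration")
endif()
```

If the status line prints the expected nvcc path and version, the toolkit and the Visual Studio integration are at least visible to CMake.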