Tags: installation, ubuntu, cuda, nvidia
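While setting up a deep learning environment, I found that torch.cuda.is_available() always returns False in PyTorch. I tried several PyTorch versions: the CPU build installs fine, but the GPU build does not. CUDA may have been installed incorrectly on this server in the past (nvcc --version does not work, yet I can see many directories such as CUDA-11.4), so I tried to install CUDA 12.1 and remove the old files. I still cannot install CUDA.
When I first ran nvidia-smi, the output was: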
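A minimal sketch of how the symptom above could be narrowed down from Python, assuming PyTorch may or may not be importable in the current environment. The function name describe_torch_cuda is hypothetical. torch.version.cuda is None on CPU-only builds, and in that case is_available() stays False whatever is installed on the system.

```python
import importlib.util


def describe_torch_cuda():
    """Collect the facts that usually explain a False from is_available()."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in this environment at all.
        return {
            "torch_installed": False,
            "built_with_cuda": None,
            "cuda_available": False,
        }

    import torch

    return {
        "torch_installed": True,
        # None on CPU-only builds, a version string on CUDA builds.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in describe_torch_cuda().items():
        print(f"{key}: {value}")
```

If built_with_cuda is None, the installed wheel is a CPU build and the problem is on the PyTorch side rather than the driver or toolkit side.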
Mon Apr 24 11:16:34 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02 Driver Version: 530.30.02 CUDA Version: 12.1 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 4090 On | 00000000:05:00.0 Off | Off |
| 0% 42C P8 12W / 450W| 1MiB / 24564MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
It shows that the current NVIDIA driver version is 530.30.02 and that the highest CUDA version it supports is 12.1. I then downloaded CUDA 12.1 and installed it with the following commands:
wget https://developer.download.nvidia.com/compute/cuda/12.1.1/local_installers/cuda_12.1.1_530.30.02_linux.run
sudo sh cuda_12.1.1_530.30.02_linux.run
The installer then showed me a screen like this (screenshot: CUDA Installer). I left everything at its defaults and continued the installation:
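For reference, the driver and CUDA versions in the nvidia-smi header shown earlier can also be read programmatically. This is only a sketch: the helper names parse_smi_header and version_tuple are hypothetical, and the regular expression assumes the header wording of the output in this post.

```python
import re

# Matches the header line of the nvidia-smi output shown in this post.
HEADER_RE = re.compile(
    r"Driver Version:\s*(?P<driver>[\d.]+)\s+CUDA Version:\s*(?P<cuda>[\d.]+)"
)


def parse_smi_header(text):
    """Return (driver_version, max_cuda_version) from nvidia-smi output, or None."""
    match = HEADER_RE.search(text)
    if match is None:
        return None
    return match.group("driver"), match.group("cuda")


def version_tuple(version):
    """Turn '530.30.02' into (530, 30, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


sample = "| NVIDIA-SMI 530.30.02 Driver Version: 530.30.02 CUDA Version: 12.1 |"
driver, max_cuda = parse_smi_header(sample)
print("driver:", driver, "max CUDA:", max_cuda)

# The toolkit being installed here is 12.1, so it must not exceed max_cuda.
toolkit = "12.1"
print("toolkit supported:", version_tuple(toolkit) <= version_tuple(max_cuda))
```

Comparing tuples of integers avoids the pitfalls of comparing version strings character by character.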
Installation failed. See log at /var/log/cuda-installer.log for details.
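I then opened cuda-installer.log (screenshot: cuda-installer.log). Its first line says the driver is not installed, yet nvidia-smi shows that the driver is installed. Why?
Next I tried deselecting the driver in the CUDA installer (screenshot: driver not selected). It then printed the following warning: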
===========
= Summary =
===========
Driver: Not Selected
Toolkit: Installed in /usr/local/cuda-12.1/
Please make sure that
- PATH includes /usr/local/cuda-12.1/bin
- LD_LIBRARY_PATH includes /usr/local/cuda-12.1/lib64, or, add /usr/local/cuda-12.1/lib64 to /etc/ld.so.conf and run ldconfig as root
To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-12.1/bin
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 530.00 is required for CUDA 12.1 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
sudo <CudaInstaller>.run --silent --driver
At this point nvidia-smi still works, but nvcc --version prints "command not found".
I then looked into other ways of installing CUDA, for example:
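The installer summary above asks for PATH and LD_LIBRARY_PATH to include the toolkit directories, which would explain why nvcc is not found. A rough sketch of a check, assuming the toolkit went to the default /usr/local prefix reported in the summary; find_nvcc and export_lines are hypothetical helper names.

```python
import glob
import os
import shutil


def find_nvcc(search_root="/usr/local"):
    """Return (nvcc found on PATH or None, nvcc binaries under search_root)."""
    on_path = shutil.which("nvcc")
    pattern = os.path.join(search_root, "cuda*", "bin", "nvcc")
    candidates = sorted(p for p in glob.glob(pattern) if os.access(p, os.X_OK))
    return on_path, candidates


def export_lines(cuda_home):
    """Shell lines that add one toolkit to PATH and LD_LIBRARY_PATH."""
    return [
        'export PATH="' + cuda_home + '/bin:$PATH"',
        'export LD_LIBRARY_PATH="' + cuda_home
        + '/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"',
    ]


on_path, candidates = find_nvcc()
if on_path:
    print("nvcc already on PATH:", on_path)
elif candidates:
    # Sorting is lexicographic; pick the directory by hand if several exist.
    cuda_home = os.path.dirname(os.path.dirname(candidates[-1]))
    print("\n".join(export_lines(cuda_home)))
else:
    print("no nvcc found under /usr/local")
```

The printed export lines can be pasted into the shell or appended to ~/.bashrc; the script itself changes nothing.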
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.1.1/local_installers/cuda-repo-ubuntu2204-12-1-local_12.1.1-530.30.02-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2204-12-1-local_12.1.1-530.30.02-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-12-1-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda
That did not work either; the output was:
(base) root@6f0f4f1d5e21:~/zyx/test# sudo apt-get -y install cuda
Reading package lists... Done
Building dependency tree
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:
The following packages have unmet dependencies:
cuda : Depends: cuda-12-1 (>= 12.1.1) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
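I had the same problem with the APT packages. I walked the "is not going to be installed" dependency tree by running apt install on each package that apt said would not be installed, until I found one that could be installed. It turned out to be libnvidia-extra-530. So the following worked (following the documentation):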
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
sudo apt update
sudo apt upgrade
sudo apt install libnvidia-extra-530
sudo apt install cuda
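The manual dependency walk described above could be partly automated. The following is only a sketch, not the answerer's actual procedure: it uses apt-get's simulation mode (-s) instead of real installs, it only recognises the "but it is not going to be installed" wording seen in this post, and the function names are hypothetical.

```python
import re
import subprocess

# Only handles the wording that appears in the apt output in this post.
BLOCKED_RE = re.compile(
    r"Depends:\s*(\S+)(?:\s*\([^)]*\))?\s+but it is not going to be installed"
)


def blocked_dependencies(apt_output):
    """Package names that apt reports as 'not going to be installed'."""
    return BLOCKED_RE.findall(apt_output)


def simulate_install(package):
    """Run apt-get in simulation mode (-s); nothing is actually installed."""
    result = subprocess.run(
        ["apt-get", "-s", "install", package],
        capture_output=True, text=True, check=False,
    )
    return result.returncode, result.stdout + result.stderr


def find_installable(package, simulate=simulate_install, seen=None):
    """Walk blocked dependencies depth-first and return the packages
    whose simulated install succeeds."""
    seen = set() if seen is None else seen
    if package in seen:
        return []
    seen.add(package)
    returncode, output = simulate(package)
    if returncode == 0:
        return [package]
    found = []
    for dep in blocked_dependencies(output):
        found.extend(find_installable(dep, simulate, seen))
    return found


sample = " cuda : Depends: cuda-12-1 (>= 12.1.1) but it is not going to be installed"
print(blocked_dependencies(sample))
```

On a real machine, find_installable("cuda") would list candidate packages to install first; whether installing them resolves the conflict still has to be checked by hand.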