TensorFlow cannot open libcuda.so.1

Qub*_*bix 10 cuda nvidia tensorflow

I have a laptop with a GeForce 940 MX, and I want to run TensorFlow on the GPU. I installed everything following their tutorial page, and now when I import TensorFlow I get:

>>> import tensorflow as tf
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened  CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:119] Couldn't open CUDA library libcuda.so.1. LD_LIBRARY_PATH: 
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:165] hostname: workLaptop
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: Not found: was unable to find libcuda.so DSO loaded into this program
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: Permission denied: could not open driver version path for reading: /proc/driver/nvidia/version
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1092] LD_LIBRARY_PATH: 
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1093] failed to find libcuda.so on this system: Failed precondition: could not dlopen DSO: libcuda.so.1; dlerror: libnvidia-fatbinaryloader.so.367.57: cannot open shared object file: No such file or directory
 I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally
>>> 

After that, I think it simply falls back to running on the CPU.

Edit: I removed everything and started over from scratch. Now I get:

>>> import tensorflow
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:119] Couldn't open CUDA library libcuda.so.1. LD_LIBRARY_PATH: :/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:165] hostname: workLaptop
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: Not found: was unable to find libcuda.so DSO loaded into this program
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: Permission denied: could not open driver version path for reading: /proc/driver/nvidia/version
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1092] LD_LIBRARY_PATH: :/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1093] failed to find libcuda.so on this system: Failed precondition: could not dlopen DSO: libcuda.so.1; dlerror: libnvidia-fatbinaryloader.so.367.57: cannot open shared object file: No such file or directory
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally

小智 9

libcuda.so.1 is a symlink to a file that is specific to your NVIDIA driver version. It may be pointing to the wrong version, or it may not exist at all.

# See where the link is pointing.
ls -la /usr/lib/x86_64-linux-gnu/libcuda.so.1
# My result:
# lrwxrwxrwx 1 root root 19 Feb 22 20:40 \
# /usr/lib/x86_64-linux-gnu/libcuda.so.1 -> ./libcuda.so.375.39

# Make sure it is pointing to the right version. 
# Compare it with the installed NVIDIA driver.
nvidia-smi

# Replace libcuda.so.1 with a link to the correct version
cd /usr/lib/x86_64-linux-gnu
sudo ln -f -s libcuda.so.<yournvidia.version> libcuda.so.1
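
The `ls -la` check can also be done programmatically. The following is a minimal Python sketch, not part of the original answer; the helper name `describe_link` is hypothetical. It only reports whether a path is a symlink and whether its target exists; it does not compare the target against the installed driver version.

```python
import os


def describe_link(path):
    """Classify a path as 'missing', 'not-a-symlink', 'dangling', or 'ok'.

    Returns (status, target), where target is the raw symlink target
    or None when the path is not a symlink.
    """
    # lexists is True even for a symlink whose target is gone
    if not os.path.lexists(path):
        return ("missing", None)
    if not os.path.islink(path):
        return ("not-a-symlink", None)
    target = os.readlink(path)
    # os.path.exists follows the link, so it is False for a dangling symlink
    status = "ok" if os.path.exists(path) else "dangling"
    return (status, target)


if __name__ == "__main__":
    # Path taken from the commands above; adjust for your distribution.
    print(describe_link("/usr/lib/x86_64-linux-gnu/libcuda.so.1"))
```

If the status is `dangling`, or the version suffix of the target differs from the driver version that `nvidia-smi` reports, the `ln -f -s` step above is the fix to try.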

Now, in the same way (with `ln -s`), make another symlink from libcuda.so.1 to a link of the same name in your LD_LIBRARY_PATH directory.

You may also find that you need to create a link named libcuda.so in /usr/lib/x86_64-linux-gnu that points to libcuda.so.1.
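
To see which directories LD_LIBRARY_PATH actually names, and whether any of them already holds a libcuda.so.1 entry, a small Python sketch along these lines may help. The function names `library_path_dirs` and `dirs_containing` are hypothetical, and for simplicity the sketch skips empty entries such as the one produced by the leading `:` in the question's log.

```python
import os


def library_path_dirs(value):
    """Split an LD_LIBRARY_PATH-style string into its non-empty directories."""
    return [d for d in value.split(":") if d]


def dirs_containing(value, name="libcuda.so.1"):
    """Return the listed directories that contain an entry called `name`.

    lexists is used so that a dangling symlink still counts as present.
    """
    return [
        d for d in library_path_dirs(value)
        if os.path.lexists(os.path.join(d, name))
    ]


if __name__ == "__main__":
    value = os.environ.get("LD_LIBRARY_PATH", "")
    print("directories:", library_path_dirs(value))
    print("with libcuda.so.1:", dirs_containing(value))
```

With the value shown in the question's second log, the directories would be /usr/local/cuda/lib64 and /usr/local/cuda/extras/CUPTI/lib64.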

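  • ***Now, in the same way, make another symlink from libcuda.so.1 to a link of the same name in your LD_LIBRARY_PATH directory.*** How exactly is this done? What is my "LD_LIBRARY_PATH directory"? Many thanks! (2 upvotes)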

Rod*_*oza 8

In case anyone is still running into this: first, make sure you pass the --runtime=nvidia argument when running the container.

docker run --runtime=nvidia -t tensorflow/serving:latest-gpu

where tensorflow/serving:latest-gpu is the name of your Docker image.


wor*_*ise 7

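In my case, which I just solved, the fix was to update the GPU driver to the latest version and install the CUDA toolkit. First, add the PPA and install the GPU driver: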

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install nvidia-390

After I added the PPA, it listed the available driver versions, and 390 was the latest "stable" version shown.

Then install the CUDA toolkit:

sudo apt install nvidia-cuda-toolkit

Then reboot:

sudo reboot

This updated the driver to a newer version than the 390 originally installed in the first step (it was 410; this is a p2.xlarge instance on AWS).
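
After the reboot, one quick way to check whether the driver library can be loaded, without importing TensorFlow, is to try opening it from Python with `ctypes`. This is a sketch under the assumption that the library name is libcuda.so.1, as in the question's log; the helper name `try_load` is hypothetical.

```python
import ctypes


def try_load(name="libcuda.so.1"):
    """Try to load a shared library by name; return (loaded, error_message)."""
    try:
        ctypes.CDLL(name)
        return (True, "")
    except OSError as exc:
        # The message is the loader's error text, similar to the dlerror
        # line that TensorFlow prints in the question's log.
        return (False, str(exc))


if __name__ == "__main__":
    print(try_load())
```

A result of `(False, ...)` with a message naming another missing library (such as the libnvidia-fatbinaryloader line in the question) suggests the driver installation is still incomplete or mismatched.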