How to install libcusolver.so.11

puk*_*puk 12 cuda tensorflow

I am trying to install TensorFlow, but it asks for libcusolver.so.11 and I only have libcusolver.so.10. Can someone tell me what I am doing wrong?

Here are my Ubuntu, NVIDIA and CUDA versions:

$ uname -a
Linux *****-dev-01 5.4.0-42-generic #46-Ubuntu SMP Fri Jul 10 00:24:02 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

$ nvidia-smi --query-gpu=gpu_name --format=csv | tail -n 1
GeForce GTX 1650

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Thu_Jun_11_22:26:38_PDT_2020
Cuda compilation tools, release 11.0, V11.0.194
Build cuda_11.0_bu.TC445_37.28540450_0

Here is how I am building TensorFlow:

$ git clone https://github.com/tensorflow/tensorflow.git
$ cd ./tensorflow
$ git checkout tags/v2.2.0
$ ./configure
$ bazel build --config=v2 --config=cuda --config=monolithic --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-msse4.1 --copt=-msse4.2 --copt=-Wno-sign-compare //tensorflow:libtensorflow_cc.so

Here is the error I get:

ERROR: An error occurred during the fetch of repository 'local_config_cuda':
    Traceback (most recent call last):
     File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 1210
         _create_local_cuda_repository(<1 more arguments>)
     File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 934, in _create_local_cuda_repository
         _find_libs(repository_ctx, <2 more arguments>)
     File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 577, in _find_libs
         _check_cuda_libs(repository_ctx, <2 more arguments>)
     File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 479, in _check_cuda_libs
         execute(repository_ctx, <1 more arguments>)
     File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/remote_config/common.bzl", line 208, in execute
         fail(<1 more arguments>)
 Repository command failed
 No library found under: /usr/local/cuda/lib64/libcusolver.so.11
 ERROR: Skipping '//tensorflow:libtensorflow_cc.so': no such package '@local_config_cuda//cuda': Traceback (most recent call last):
     File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 1210
         _create_local_cuda_repository(<1 more arguments>)
     File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 934, in _create_local_cuda_repository
         _find_libs(repository_ctx, <2 more arguments>)
     File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 577, in _find_libs
         _check_cuda_libs(repository_ctx, <2 more arguments>)
     File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 479, in _check_cuda_libs
         execute(repository_ctx, <1 more arguments>)
     File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/remote_config/common.bzl", line 208, in execute
         fail(<1 more arguments>)
 Repository command failed
 No library found under: /usr/local/cuda/lib64/libcusolver.so.11
 WARNING: Target pattern parsing failed.
 ERROR: no such package '@local_config_cuda//cuda': Traceback (most recent call last):
     File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 1210
         _create_local_cuda_repository(<1 more arguments>)
     File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 934, in _create_local_cuda_repository
         _find_libs(repository_ctx, <2 more arguments>)
     File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 577, in _find_libs
         _check_cuda_libs(repository_ctx, <2 more arguments>)
     File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 479, in _check_cuda_libs
         execute(repository_ctx, <1 more arguments>)
     File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/remote_config/common.bzl", line 208, in execute
         fail(<1 more arguments>)
 Repository command failed
 No library found under: /usr/local/cuda/lib64/libcusolver.so.11
 INFO: Elapsed time: 1.998s
 INFO: 0 processes.
 FAILED: Build did NOT complete successfully (0 packages loaded)
     currently loading: tensorflow
 NORMAL   test.log

tal*_*ies 12

"Can someone tell me what I am doing wrong?"

Nothing.

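As noted in the comments, the CUDA 11.0 release does not contain a version 11.0 of cuSolver. Evidently there is logic built into bazel that automatically derives the names of the component libraries from the major version of the toolkit it detects. That logic is incorrect for the CUDA toolkit you have. I would report this as a bug to the bazel developers. You might be able to override it explicitly somehow, but I can't tell you how.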
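
One way to confirm the mismatch described above is to list the cuSolver libraries that are actually installed. The following is only a sketch: the default directory is taken from the error message in the question, and the function name `list_cusolver_libs` is made up for illustration. On a setup like the asker's, it would be expected to show `libcusolver.so.10` entries and nothing ending in `.so.11`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every libcusolver shared object found in a directory.
# The default directory comes from the error message; adjust it for your install.
list_cusolver_libs() {
    local dir="${1:-/usr/local/cuda/lib64}"
    local lib
    local found=0
    for lib in "$dir"/libcusolver.so*; do
        # An unmatched glob stays a literal string, so test that the entry exists.
        if [ -e "$lib" ] || [ -L "$lib" ]; then
            basename "$lib"
            found=1
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "no libcusolver found in $dir" >&2
        return 1
    fi
}

# Check the default location; a missing library is reported but not fatal.
list_cusolver_libs || true
```

Passing a different directory, for example `list_cusolver_libs /usr/local/cuda-11.0/targets/x86_64-linux/lib`, checks that location instead.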


Ale*_*nko 9

If you want a concrete workaround, find libcusolver.so.10 on your machine and create a symlink named libcusolver.so.11 that points to it.

The following command solved the problem for me:

sudo ln -s /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcusolver.so.10 /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcusolver.so.11

Credit: https://github.com/tensorflow/tensorflow/issues/43947

  • If this still does not work, additionally try `export LD_LIBRARY_PATH=/usr/local/cuda-11.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}` (3 upvotes)
  • Weirdest thing ever: this did not help. I then also had to symlink it into my virtual environment: `# sudo ln -s /usr/local/cuda/targets/x86_64-linux/lib/libcusolver.so.11 .venv/lib/python3.9/site-packages/tensorflow/python/libcusolver.so.11` (3 upvotes)
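
The symlink workaround above can be made a little safer by checking what exists before linking. This is a minimal sketch, not an official fix: the function name `link_cusolver_11` is hypothetical, and the library directory must be supplied by the caller. It only makes the version 10 library reachable under the `.so.11` name, so the underlying version mismatch remains.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create libcusolver.so.11 as a symlink to libcusolver.so.10
# inside the given library directory, without overwriting anything that exists.
link_cusolver_11() {
    local libdir="$1"
    local target="$libdir/libcusolver.so.10"
    local link="$libdir/libcusolver.so.11"

    if [ ! -e "$target" ]; then
        echo "missing $target; nothing to link" >&2
        return 1
    fi
    if [ -e "$link" ] || [ -L "$link" ]; then
        echo "$link already exists; leaving it alone" >&2
        return 0
    fi
    ln -s "$target" "$link"
}

# Only act when a directory is passed on the command line.
if [ "$#" -ge 1 ]; then
    link_cusolver_11 "$1"
fi
```

Saved as a script (the file name is up to you), it could be run with root privileges against the directory used in the answer, for example `sudo bash link_cusolver.sh /usr/local/cuda-11.0/targets/x86_64-linux/lib`.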

tih*_*iho 6

In case anyone else runs into this: for me the problem was that I was using CUDA 11.0, while newer TensorFlow versions require CUDA 11.2.
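
To see which toolkit release is installed before comparing it with what your TensorFlow version expects, the release number can be pulled out of the `nvcc --version` output shown in the question. A rough sketch, assuming the output keeps the `release X.Y` wording seen there; the function name `cuda_release` is made up, and the CUDA release a given TensorFlow version needs should be checked against TensorFlow's own documentation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `nvcc --version` output on stdin and print the
# toolkit release number found after the word "release".
cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

if command -v nvcc >/dev/null 2>&1; then
    echo "CUDA toolkit release: $(nvcc --version | cuda_release)"
else
    echo "nvcc not found on PATH" >&2
fi
```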