I am trying to build TensorFlow, but it requires libcusolver.so.11 and I only have libcusolver.so.10. Can someone tell me what I am doing wrong?
Here are my Ubuntu, NVIDIA, and CUDA versions:
$ uname -a
Linux *****-dev-01 5.4.0-42-generic #46-Ubuntu SMP Fri Jul 10 00:24:02 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
$ nvidia-smi --query-gpu=gpu_name --format=csv | tail -n 1
GeForce GTX 1650
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Thu_Jun_11_22:26:38_PDT_2020
Cuda compilation tools, release 11.0, V11.0.194
Build cuda_11.0_bu.TC445_37.28540450_0
Here is how I build TensorFlow:
$ git clone https://github.com/tensorflow/tensorflow.git
$ cd ./tensorflow
$ git checkout tags/v2.2.0
$ ./configure
$ bazel build --config=v2 --config=cuda --config=monolithic --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-msse4.1 --copt=-msse4.2 --copt=-Wno-sign-compare //tensorflow:libtensorflow_cc.so
Here is the error I get:
ERROR: An error occurred during the fetch of repository 'local_config_cuda':
Traceback (most recent call last):
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 1210
_create_local_cuda_repository(<1 more arguments>)
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 934, in _create_local_cuda_repository
_find_libs(repository_ctx, <2 more arguments>)
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 577, in _find_libs
_check_cuda_libs(repository_ctx, <2 more arguments>)
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 479, in _check_cuda_libs
execute(repository_ctx, <1 more arguments>)
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/remote_config/common.bzl", line 208, in execute
fail(<1 more arguments>)
Repository command failed
No library found under: /usr/local/cuda/lib64/libcusolver.so.11
ERROR: Skipping '//tensorflow:libtensorflow_cc.so': no such package '@local_config_cuda//cuda': Traceback (most recent call last):
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 1210
_create_local_cuda_repository(<1 more arguments>)
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 934, in _create_local_cuda_repository
_find_libs(repository_ctx, <2 more arguments>)
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 577, in _find_libs
_check_cuda_libs(repository_ctx, <2 more arguments>)
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 479, in _check_cuda_libs
execute(repository_ctx, <1 more arguments>)
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/remote_config/common.bzl", line 208, in execute
fail(<1 more arguments>)
Repository command failed
No library found under: /usr/local/cuda/lib64/libcusolver.so.11
WARNING: Target pattern parsing failed.
ERROR: no such package '@local_config_cuda//cuda': Traceback (most recent call last):
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 1210
_create_local_cuda_repository(<1 more arguments>)
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 934, in _create_local_cuda_repository
_find_libs(repository_ctx, <2 more arguments>)
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 577, in _find_libs
_check_cuda_libs(repository_ctx, <2 more arguments>)
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 479, in _check_cuda_libs
execute(repository_ctx, <1 more arguments>)
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/remote_config/common.bzl", line 208, in execute
fail(<1 more arguments>)
Repository command failed
No library found under: /usr/local/cuda/lib64/libcusolver.so.11
INFO: Elapsed time: 1.998s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
currently loading: tensorflow
tal*_*ies 12
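> Can someone tell me what I am doing wrong?
Nothing.
As noted in the comments, the CUDA 11.0 release does not include a version 11 of cuSolver; it ships libcusolver.so.10. The build evidently contains logic that automatically derives the names of the component libraries from the major version of the toolkit it detects (the traceback points at third_party/gpus/cuda_configure.bzl in the TensorFlow tree). That logic is incorrect for the CUDA toolkit you have. I would report this as a bug to the developers of that build logic. You may be able to override it explicitly somehow, but I can't tell you how to go about it.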
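To see the mismatch described above on your own machine, a minimal sketch along these lines may help. The helper names `toolkit_major` and `cusolver_sonames` are hypothetical and are not part of CUDA, bazel, or TensorFlow; the sketch only reads `nvcc --version` output and lists files, and it assumes the libraries live directly in the directory you pass in.

```shell
# Hypothetical helper: read `nvcc --version` output on stdin and print
# the toolkit's major version (the digits before the first dot after "release").
toolkit_major() {
    sed -n 's/.*release \([0-9][0-9]*\)\..*/\1/p'
}

# Hypothetical helper: print the file names of every libcusolver.so.*
# found directly inside the directory given as the first argument.
cusolver_sonames() {
    for f in "$1"/libcusolver.so.*; do
        # The glob stays unexpanded when nothing matches, so check existence.
        if [ -e "$f" ]; then
            basename "$f"
        fi
    done
}
```

Usage, assuming the library directory from the error message: `nvcc --version | toolkit_major` followed by `cusolver_sonames /usr/local/cuda/lib64`. On a setup like the one in the question, the first should report the toolkit major version while the second lists only names starting with libcusolver.so.10.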
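If you want a concrete workaround, find libcusolver.so.10 on your machine and create a symbolic link named libcusolver.so.11 that points to it.
The following command solved the problem for me: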
sudo ln -s /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcusolver.so.10 /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcusolver.so.11
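A slightly more defensive variant of the same workaround can be sketched as a shell function. `link_cusolver` is a hypothetical helper name, and the sketch assumes both files should live in the same directory; it refuses to run if libcusolver.so.10 is missing and leaves an existing libcusolver.so.11 untouched.

```shell
# Hypothetical helper: link_cusolver <lib_dir>
# Creates <lib_dir>/libcusolver.so.11 as a symlink to <lib_dir>/libcusolver.so.10.
link_cusolver() {
    lib_dir="$1"
    src="$lib_dir/libcusolver.so.10"
    dst="$lib_dir/libcusolver.so.11"

    # Nothing to link to: report the problem and fail.
    if [ ! -e "$src" ]; then
        echo "not found: $src" >&2
        return 1
    fi

    # Do not overwrite a real library or an existing link.
    if [ -e "$dst" ] || [ -L "$dst" ]; then
        echo "already present: $dst"
        return 0
    fi

    ln -s "$src" "$dst"
}
```

For example, `link_cusolver /usr/local/cuda-11.0/targets/x86_64-linux/lib`. Writing to that directory normally needs root, and a shell function cannot be prefixed with sudo directly, so run it from a shell that already has write access there.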
Credit: https://github.com/tensorflow/tensorflow/issues/43947