Fixing TensorFlow GPU "Could not load dynamic library" errors

9 python tensorflow

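I want to use my GPU with TensorFlow.

I have tried the answers to "Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation".

Unfortunately, I keep getting the error Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found. How can I fix this? Python version: 3.8.3, CUDA 10.1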

2020-11-03 12:30:28.832014: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2020-11-03 12:30:28.832688: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cublas64_11.dll'; dlerror: cublas64_11.dll not found
2020-11-03 12:30:28.833342: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cublasLt64_11.dll'; dlerror: cublasLt64_11.dll not found
2020-11-03 12:30:28.833994: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cufft64_10.dll'; dlerror: cufft64_10.dll not found
2020-11-03 12:30:28.834645: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'curand64_10.dll'; dlerror: curand64_10.dll not found
2020-11-03 12:30:28.835297: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cusolver64_10.dll'; dlerror: cusolver64_10.dll not found
2020-11-03 12:30:28.835948: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cusparse64_11.dll'; dlerror: cusparse64_11.dll not found
2020-11-03 12:30:28.836594: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cudnn64_8.dll'; dlerror: cudnn64_8.dll not found
2020-11-03 12:30:28.836789: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1761] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2020-11-03 12:30:28.837575: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-11-03 12:30:28.838495: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1265] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-03 12:30:28.838708: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1271]      
2020-11-03 12:30:28.838831: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
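To see at a glance which libraries the log is complaining about, the warnings can be reduced to a list of file names. The following is a minimal sketch, not part of the original question; the function name `missing_libraries` is hypothetical, and the regular expression assumes the warning wording shown in the log above.

```python
import re

# Matches the dso_loader warning wording shown in the log above.
MISSING_LIB = re.compile(r"Could not load dynamic library '([^']+)'")


def missing_libraries(log_text):
    """Return the library names TensorFlow failed to load, in log order, without duplicates."""
    seen = []
    for name in MISSING_LIB.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


if __name__ == "__main__":
    sample = (
        "W tensorflow/stream_executor/platform/default/dso_loader.cc:60] "
        "Could not load dynamic library 'cudart64_110.dll'; "
        "dlerror: cudart64_110.dll not found\n"
        "W tensorflow/stream_executor/platform/default/dso_loader.cc:60] "
        "Could not load dynamic library 'cudnn64_8.dll'; "
        "dlerror: cudnn64_8.dll not found\n"
    )
    print(missing_libraries(sample))  # ['cudart64_110.dll', 'cudnn64_8.dll']
```

Applied to the full log, this yields the eight DLL names that TensorFlow expected to find.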

Cyp*_*erX 11

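Solution

I suggest using conda (Anaconda/Miniconda) to create a separate environment and installing tensorflow-gpu, cudnn, and cudatoolkit into it. Miniconda has a much smaller footprint than Anaconda, so if you do not have conda yet, I suggest installing Miniconda.

Quick Installation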

# Quick and dirty: with channel specification
conda create -n tf_gpu_env python=3.8 
conda activate tf_gpu_env
conda install tensorflow-gpu -c anaconda
conda install cudnn -c conda-forge 
conda install cudatoolkit -c anaconda

Check whether TensorFlow is using the GPU

For more details, see: https://www.tensorflow.org/guide/gpu

# Sanity check for validating 
# visibility of the GPUs to TF
import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
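If the count above is 0, it can help to know which CUDA and cuDNN versions the installed TensorFlow binary was built against. The sketch below is an addition rather than part of the original answer; the helper name `describe_tf_gpu` is hypothetical. It assumes `tf.sysconfig.get_build_info()`, which is only present in TensorFlow 2.3 and later, and it returns `None` when TensorFlow cannot be imported.

```python
def describe_tf_gpu():
    """Return TensorFlow's view of CUDA/GPU support, or None if TensorFlow is not importable."""
    try:
        import tensorflow as tf
    except ImportError:
        return None

    info = {
        "tf_version": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
    }
    # get_build_info() exists in TensorFlow 2.3+; the CUDA keys are
    # absent on CPU-only builds, hence the .get() calls.
    get_build_info = getattr(tf.sysconfig, "get_build_info", None)
    build = get_build_info() if get_build_info else {}
    info["cuda_version"] = build.get("cuda_version")
    info["cudnn_version"] = build.get("cudnn_version")
    return info


if __name__ == "__main__":
    print(describe_tf_gpu())
```

A `cuda_version` that differs from the CUDA toolkit actually installed would be consistent with the "Could not load dynamic library" warnings in the question.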

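Reproducible installation with an environment file: environment.yml

Although you can quickly create a conda (Anaconda/Miniconda) environment as shown above, it is preferable to make the installation as reproducible as possible. Enter the environment.yml file.

Save the environment file at the root of your repository (or project) and run the following command to install all packages into an isolated conda environment (here named tfgpu_env) from the environment file. The top of the environment file contains some useful commands.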

conda env create -f environment.yml

Save the following as environment.yml in your repository, and consider pinning the following three libraries.

An example:

  • tensorflow-gpu=2.4
  • cudnn=8
  • cudatoolkit=11

## filename: environment.yml

## Environment File Definition

name: tfgpu_env # tensorflow-gpu environment
channels:
  - conda-forge
  - anaconda
  - defaults
dependencies:
  - python=3.8
  ## Core Necessities
  - numpy # -c conda-forge, anaconda
  - pandas # -c conda-forge, anaconda
  - tabulate # -c conda-forge, anaconda  # necessary for df.to_markdown() in pandas
  - scipy # -c conda-forge, anaconda
  - matplotlib # -c conda-forge, anaconda
  ## Jupyter Related
  - jupyter # -c anaconda, conda-forge
  - jupyterlab # -c anaconda, conda-forge
  - jupyter_dashboards # -c conda-forge  (see: https://medium.com/plotly/introducing-jupyterdash-811f1f57c02e)
  - jupyter_contrib_nbextensions # -c conda-forge
  ## Progressbar
  - tqdm # -c conda-forge, anaconda
  ## Machine Learning
  - tensorflow-gpu # -c anaconda | version: 2.4.1 (linux), 2.3.0 (windows)
  # - tensorflow # -c anaconda | version: 2.2.0 (linux), 2.1.0 (windows)
  - cudnn # -c conda-forge | version: 8.1.0.77 (linux/windows)
  #       # -c anaconda | version: 7.6.5 (linux/windows)
  - cudatoolkit # -c conda-forge (linux/windows)
  #             # -c anaconda | version: 11.0.221 (linux/windows)
  - scikit-learn # -c conda-forge, anaconda
  ## Hyperparameter Optimization
  - optuna # -c conda-forge # works for pytorch, tf/keras, mxnet, scikit-learn, xgboost, lightgbm
  - keras-tuner # -c conda-forge
  ## Image Processing
  - opencv # -c conda-forge, anaconda
  - imageio # -c anaconda, conda-forge
  ## Image Augmentation
  - albumentations # -c conda-forge
  - imgaug # -c conda-forge
  ## Code Linting
  - pylint # -c conda-forge, anaconda
  - autopep8 # -c conda-forge, anaconda
  ## Installations with pip
  - pip:
    ## Web App Framework
    # - Flask-Testing
    - streamlit # https://docs.streamlit.io/en/stable/troubleshooting/clean-install.html
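The pinning advice above can be checked mechanically. The following is a rough sketch, not part of the original answer: the function name `unpinned_gpu_packages` is hypothetical, and it scans the file line by line instead of using a YAML parser, so it only understands the simple `- name` and `- name=version` entries used in the file above.

```python
import re

# The three libraries the answer suggests pinning.
GPU_STACK = ("tensorflow-gpu", "cudnn", "cudatoolkit")


def unpinned_gpu_packages(env_text, packages=GPU_STACK):
    """Return the packages from `packages` that appear in the env file without a version."""
    unpinned = []
    for raw in env_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line.startswith("- "):
            continue  # not a list entry (or a fully commented-out line)
        spec = line[2:].strip()
        name = re.split(r"[=<>! ]", spec, maxsplit=1)[0]
        if name in packages and spec == name:
            unpinned.append(name)
    return unpinned


if __name__ == "__main__":
    sample = """\
dependencies:
  - python=3.8
  - tensorflow-gpu # -c anaconda
  - cudnn=8
  - cudatoolkit
"""
    print(unpinned_gpu_packages(sample))  # ['tensorflow-gpu', 'cudatoolkit']
```

Run against the environment.yml shown above, it would report all three libraries, since none of them carries a version there.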

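Useful Instructions

You may want to copy and paste the following instructions into the environment file itself, so that they are always at hand.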

# Instruction:
#-----------------------------------------------------------
#
## For an environment installed locally (under: ./.venv)
# mkdir -p .venv && cd .venv
# conda env create --prefix . -f ../environment.yml
## For Updating local environment
# cd .venv
# conda env update --prefix . -f ../environment.yml  --prune
#
## For an environment installed globally
## with a name: fav_env 
# NOTE: The env-name is stored inside the 
#       environment.yml file.
# conda env create -f environment.yml
## For Updating global environment from env-file
# conda env update -f ./environment.yml  --prune
#
## Update conda itself
# conda update -n base -c defaults conda
#
## Creating a global environment in one-line: py37, py38
# conda create -n py37 python=3.7
# conda create -n py38 python=3.8
#
### In each of the envs: base, py37, py38
# conda install jupyter jupyterlab numpy scipy pandas matplotlib scikit-learn scikit-image tqdm plotly imageio requests pylint autopep8 tabulate opencv
#
## Export a platform independent copy of an environment
#  conda env export --from-history > path/to/environment.yml
### Make exports directory (if not present already) and export
# $targetDir = "conda_exports"
# mkdir ./$targetDir
# conda env export --from-history > ./$targetDir/exported_environment.yml

References

  1. How to manage packages with Conda

  2. https://docs.anaconda.com/anaconda/user-guide/tasks/tensorflow/

  3. https://www.tensorflow.org/guide/gpu

  4. https://www.fastwebhost.in/blog/how-to-find-if-linux-is-running-on-32-bit-or-64-bit/

  5. Miniconda installation: https://docs.conda.io/en/latest/miniconda.html


小智 3

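Well, you can see that your TensorFlow installation is looking for CUDA libraries of versions 11 and 10, while you have 10.1. So to fix this, install the correct CUDA version. Why it is looking for 3 different versions, I do not know. But you can find the valid combinations of CUDA, TensorFlow, and cuDNN here.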
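One way to confirm what is actually installed is to look for the requested DLLs in the directories listed in PATH. The sketch below is an addition, not part of the original answer; the helper name `find_on_path` is hypothetical, and it assumes the CUDA and cuDNN `bin` directories are expected to be on PATH, as the TensorFlow GPU install guide describes for Windows.

```python
import os


def find_on_path(filenames, path=None):
    """Map each file name to the first PATH directory that contains it, or None if absent."""
    if path is None:
        path = os.environ.get("PATH", "")
    dirs = [d for d in path.split(os.pathsep) if d]
    found = {}
    for name in filenames:
        found[name] = next(
            (d for d in dirs if os.path.isfile(os.path.join(d, name))), None
        )
    return found


if __name__ == "__main__":
    # DLL names taken from the warnings in the question's log.
    wanted = ["cudart64_110.dll", "cublas64_11.dll", "cudnn64_8.dll"]
    for name, where in find_on_path(wanted).items():
        print(f"{name}: {where or 'not found on PATH'}")
```

With only CUDA 10.1 installed, one would expect to find `cudart64_101.dll` but not `cudart64_110.dll`, which matches the mismatch described above.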

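Edit: removed 8 from the CUDA versions; TensorFlow is actually looking for cuDNN version 8. So do not forget to install cuDNN as well. (My guess is that you are installing the latest version of TensorFlow, which is why it is looking for the latest CUDA and cuDNN versions.)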
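The idea of a "valid combination" can be expressed as a small lookup table. This is an illustrative sketch only: the table is partial, the version pairs are recalled from TensorFlow's published tested build configurations and should be verified against the official table, and the names `TESTED_CONFIGS` and `expected_cuda_stack` are hypothetical.

```python
# TensorFlow GPU release -> (CUDA, cuDNN). Partial and illustrative;
# check the official tested-configurations table before relying on it.
TESTED_CONFIGS = {
    "2.4": ("11.0", "8.0"),
    "2.3": ("10.1", "7.6"),
    "2.2": ("10.1", "7.6"),
    "2.1": ("10.1", "7.6"),
    "2.0": ("10.0", "7.4"),
}


def expected_cuda_stack(tf_version):
    """Return (cuda, cudnn) for a version string such as '2.3.1', or None if unknown."""
    major_minor = ".".join(tf_version.split(".")[:2])
    return TESTED_CONFIGS.get(major_minor)


if __name__ == "__main__":
    print(expected_cuda_stack("2.4.0"))  # ('11.0', '8.0')
    print(expected_cuda_stack("2.3.1"))  # ('10.1', '7.6')
```

Read this way, the request for `cudart64_110.dll` and `cudnn64_8.dll` in the question's log suggests a TensorFlow 2.4 build, which would not match an installed CUDA 10.1.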

  • Thank you very much! I installed everything again. With "Tensorflow 2.3.1", "CUDNN 7.6" and "Cuda 10.1" I still get the error.. :/ (2 upvotes)