PyTorch error: CUDA error: CUBLAS_STATUS_INTERNAL_ERROR when calling `cublasCreate(handle)`

Sha*_*oon 7 python pytorch

I have a very simple example:

import torch

if __name__ == "__main__":
    DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    m = torch.nn.Linear(20, 30).to(DEVICE)
    input = torch.randn(128, 20).to(DEVICE)
    output = m(input)
    print('output', output.size())
    exit()

I get:

Traceback (most recent call last):
  File "test.py", line 9, in <module>
    output = m(input)
  File "/home/shamoon/.local/share/virtualenvs/speech-reconstruction-7HMT9fTW/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/shamoon/.local/share/virtualenvs/speech-reconstruction-7HMT9fTW/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 94, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/shamoon/.local/share/virtualenvs/speech-reconstruction-7HMT9fTW/lib/python3.8/site-packages/torch/nn/functional.py", line 1753, in linear
    return torch._C._nn.linear(input, weight, bias)
RuntimeError: CUDA error: CUBLAS_STATUS_INTERNAL_ERROR when calling `cublasCreate(handle)`

I'm using PyTorch 1.7.1. Any help would be greatly appreciated.

Thanks.

Edit: the updated output of `python -m torch.utils.collect_env` is:

Collecting environment information...
PyTorch version: 1.8.0
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.2 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: 11.1.0
CMake version: version 3.18.4

Python version: 3.8 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: 
GPU 0: TITAN RTX
GPU 1: TITAN RTX
GPU 2: TITAN RTX
GPU 3: TITAN RTX
GPU 4: TITAN RTX
GPU 5: TITAN RTX
GPU 6: TITAN RTX
GPU 7: TITAN RTX

Nvidia driver version: 460.39
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.20.1
[pip3] torch==1.8.0
[pip3] torchaudio==0.8.0
[pip3] torchsummary==1.5.1
[conda] Could not collect

Jer*_*mie 2

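As your log shows, PyTorch 1.8 is installed, not 1.7.1. If that is not the environment you actually run the script in, please post the log again using the correct python binary.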
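One way to confirm which interpreter and which PyTorch build are actually in play is to print them from inside the same environment that runs `test.py`. The following is only a minimal sketch and not part of the original post; the helper name `describe_environment` is hypothetical.

```python
import importlib.util
import sys


def describe_environment():
    """Report the running interpreter and, if importable, the torch build."""
    info = {"python": sys.executable, "torch": None, "cuda_build": None}
    # Only import torch if this interpreter can actually find it.
    if importlib.util.find_spec("torch") is not None:
        import torch

        info["torch"] = torch.__version__
        info["cuda_build"] = torch.version.cuda  # None for CPU-only builds
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If the reported `torch` value differs from what you expect, the script is probably being run by a different interpreter or virtualenv than the one you installed into.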

I had exactly the same problem with 1.8. Downgrading to 1.7.1 fixed it (as mentioned in a Huggingface Transformers GitHub issue).
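After changing versions, a quick check is to run one small matrix multiply on the GPU, since the traceback in the question shows the error surfacing on the first linear-layer call, when the cuBLAS handle is created. This is a sketch under that assumption, not a definitive diagnostic; `cublas_smoke_test` is a hypothetical helper name, and the tensor shapes simply mirror the question's example.

```python
def cublas_smoke_test():
    """Run one small matmul and return a short status string."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this interpreter"

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    try:
        a = torch.randn(128, 20, device=device)
        b = torch.randn(20, 30, device=device)
        out = a @ b  # on a CUDA device this matmul goes through cuBLAS
        if device.type == "cuda":
            torch.cuda.synchronize()  # surface asynchronous CUDA errors here
        return f"ok on {device.type}: output {tuple(out.shape)}"
    except RuntimeError as err:
        return f"failed on {device.type}: {err}"


if __name__ == "__main__":
    print(cublas_smoke_test())
```

A status starting with "failed on cuda" would suggest the installed build still hits the same error; "ok on cpu" only means no GPU was visible to this interpreter.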