Posts by Ald*_*lda

NVidia driver stopped working on an AWS EC2 instance with Ubuntu 16.04 and a Tesla K80 GPU

I have been running TensorFlow code on an AWS EC2 instance with a Tesla K80 GPU. I installed CUDA 9.0 and cuDNN 7.1.4, and I am using TF 1.12, all on Ubuntu 16.04.

Everything worked fine until yesterday, but today the NVidia driver seems to have stopped working for some reason:

ubuntu@ip-10-0-0-13:~$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

I checked the installed driver packages:

ubuntu@ip-10-0-0-13:~$ dpkg -l | grep nvidia
rc  nvidia-367                              367.48-0ubuntu1                            amd64        NVIDIA binary driver - version 367.48
ii  nvidia-396                              396.37-0ubuntu1                            amd64        NVIDIA binary driver - version 396.37
ii  nvidia-396-dev                          396.37-0ubuntu1                            amd64        NVIDIA binary Xorg driver development files
ii  nvidia-machine-learning-repo-ubuntu1604 1.0.0-1                                    amd64        nvidia-machine-learning repository configuration files
ii  nvidia-modprobe …
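For anyone reading the `dpkg -l` listing: the two-letter code at the start of each row is the package's desired state followed by its current state, so `ii` means installed and `rc` means removed with configuration files left behind. A minimal Python sketch that decodes those codes is below. The function names `parse_dpkg_row` and `installed_driver_packages` are hypothetical helpers, not part of any NVIDIA or Debian tool, and the sample rows are copied from the listing with their whitespace shortened.

```python
import re

# First letter of the status column: the state that was requested.
DESIRED = {"u": "unknown", "i": "install", "h": "hold", "r": "remove", "p": "purge"}

# Second letter of the status column: the state the package is actually in.
CURRENT = {
    "n": "not-installed",
    "c": "config-files",
    "H": "half-installed",
    "U": "unpacked",
    "F": "half-configured",
    "W": "triggers-awaiting",
    "t": "triggers-pending",
    "i": "installed",
}


def parse_dpkg_row(row):
    """Return (package, version, desired, current) for one `dpkg -l` row.

    Returns None for header lines and anything else that does not look
    like a package row. Hypothetical helper for illustration only.
    """
    fields = row.split()
    if len(fields) < 3:
        return None
    flags = fields[0]
    if len(flags) not in (2, 3) or flags[0] not in DESIRED or flags[1] not in CURRENT:
        return None
    return fields[1], fields[2], DESIRED[flags[0]], CURRENT[flags[1]]


def installed_driver_packages(rows):
    """List (name, version) for fully installed packages named nvidia-<number>."""
    found = []
    for row in rows:
        parsed = parse_dpkg_row(row)
        if parsed is None:
            continue
        name, version, _desired, current = parsed
        if current == "installed" and re.fullmatch(r"nvidia-\d+", name):
            found.append((name, version))
    return found


# Rows taken from the listing in the question (column padding shortened).
SAMPLE_ROWS = [
    "rc  nvidia-367  367.48-0ubuntu1  amd64  NVIDIA binary driver - version 367.48",
    "ii  nvidia-396  396.37-0ubuntu1  amd64  NVIDIA binary driver - version 396.37",
    "ii  nvidia-396-dev  396.37-0ubuntu1  amd64  NVIDIA binary Xorg driver development files",
]

if __name__ == "__main__":
    for sample_row in SAMPLE_ROWS:
        print(parse_dpkg_row(sample_row))
    print(installed_driver_packages(SAMPLE_ROWS))
```

Read this way, the listing says the 396.37 driver package is installed and the 367.48 package has been removed with its configuration files left in place. This only describes package state; it does not show whether the kernel module is actually loaded, which is what `nvidia-smi` is complaining about.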

gpu nvidia amazon-ec2 amazon-web-services tensorflow
