How can I find out why Ubuntu 20.04 freezes?

Cri*_*ova 5 nvidia amd-ryzen 20.04

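Two weeks ago I installed Ubuntu 20.04 on my AMD Ryzen 5 1600 / Nvidia GTX 1070 machine, but every now and then Ubuntu freezes completely.

The keyboard and screen stop responding entirely; the mouse sometimes keeps moving. I tried the magic SysRq key, but it did nothing. I also tried Alt+F1 and got no response from the system either. Basically, all I can do is press the power button to reboot.

I suspect Nvidia, but I don't know how to verify that.

nvidia-smi shows driver version 440.100.

I found these entries in /var/log/Xorg.1.log.old, logged around the times my computer crashed.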

[  1223.234] (EE) client bug: timer event2 debounce: scheduled expiry is in the past (-22ms), your system is too slow  
[  1223.234] (EE) client bug: timer event2 debounce short: scheduled expiry is in the past (-35ms), your system is too slow  
[  1488.529] (EE) client bug: timer event2 debounce: scheduled expiry is in the past (-0ms), your system is too slow  
[  1488.529] (EE) client bug: timer event2 debounce short: scheduled expiry is in the past (-13ms), your system is too slow  
[  5125.223] (EE) client bug: timer event2 debounce: scheduled expiry is in the past (-14ms), your system is too slow  
[  5125.223] (EE) client bug: timer event2 debounce short: scheduled expiry is in the past (-27ms), your system is too slow  
[  6038.321] (EE) client bug: timer event2 debounce short: scheduled expiry is in the past (-9ms), your system is too slow  
[  6206.894] (EE) client bug: timer event2 debounce: scheduled expiry is in the past (-3ms), your system is too slow  
[  6206.894] (EE) client bug: timer event2 debounce short: scheduled expiry is in the past (-16ms), your system is too slow  
[  6409.650] (EE) client bug: timer event2 debounce: scheduled expiry is in the past (-9ms), your system is too slow  
[  6409.650] (EE) client bug: timer event2 debounce short: scheduled expiry is in the past (-22ms), your system is too slow  
[ 10930.426] (EE) client bug: timer event2 debounce: scheduled expiry is in the past (-7ms), your system is too slow  
[ 10930.426] (EE) client bug: timer event2 debounce short: scheduled expiry is in the past (-20ms), your system is too slow  

Output of free -h:

              total        used        free      shared  buff/cache   available
Mem:           15Gi       2.5Gi        11Gi       393Mi       1.9Gi        12Gi
Swap:         2.0Gi          0B       2.0Gi

Output of sysctl vm.swappiness:

vm.swappiness = 60

Output of sudo lshw -C memory:

  *-firmware                
       description: BIOS
       vendor: American Megatrends Inc.
       physical id: 0
       version: 1.L0
       date: 12/28/2018
       size: 64KiB
       capacity: 16MiB
       capabilities: pci upgrade shadowing cdboot bootselect socketedrom edd int13floppy1200 int13floppy720 int13floppy2880 int5printscreen int9keyboard int14serial int17printer acpi usb biosbootspecification uefi
  *-memory
       description: System Memory
       physical id: f
       slot: System board or motherboard
       size: 16GiB
     *-bank:0
          description: 2933 MHz (0.3 ns) [empty]
          product: Unknown
          vendor: Unknown
          physical id: 0
          serial: Unknown
          slot: DIMM 0
          clock: 2933MHz (0.3ns)
     *-bank:1
          description: DIMM DDR4 Synchronous Unbuffered (Unregistered) 2933 MHz (0.3 ns)
          product: CMK16GX4M2B3200C16
          vendor: Unknown
          physical id: 1
          serial: 00000000
          slot: DIMM 1
          size: 8GiB
          width: 64 bits
          clock: 2933MHz (0.3ns)
     *-bank:2
          description: 2933 MHz (0.3 ns) [empty]
          product: Unknown
          vendor: Unknown
          physical id: 2
          serial: Unknown
          slot: DIMM 0
          clock: 2933MHz (0.3ns)
     *-bank:3
          description: DIMM DDR4 Synchronous Unbuffered (Unregistered) 2933 MHz (0.3 ns)
          product: CMK16GX4M2B3200C16
          vendor: Unknown
          physical id: 3
          serial: 00000000
          slot: DIMM 1
          size: 8GiB
          width: 64 bits
          clock: 2933MHz (0.3ns)
  *-cache:0
       description: L1 cache
       physical id: 11
       slot: L1 - Cache
       size: 576KiB
       capacity: 576KiB
       clock: 1GHz (1.0ns)
       capabilities: pipeline-burst internal write-back unified
       configuration: level=1
  *-cache:1
       description: L2 cache
       physical id: 12
       slot: L2 - Cache
       size: 3MiB
       capacity: 3MiB
       clock: 1GHz (1.0ns)
       capabilities: pipeline-burst internal write-back unified
       configuration: level=2
  *-cache:2
       description: L3 cache
       physical id: 13
       slot: L3 - Cache
       size: 16MiB
       capacity: 16MiB
       clock: 1GHz (1.0ns)
       capabilities: pipeline-burst internal write-back unified
       configuration: level=3

Output of grep -i swap /etc/fstab:

/swapfile                                 none            swap    sw              0       0

Output of sudo dmidecode -s bios-version:

1.L0

Added a screenshot of Software & Updates.

Update, August 6:

Crash files, and the GNOME Shell extensions as listed in the terminal

hey*_*ema 1

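BIOS

MSI B350 Tomahawk

You have BIOS version 1.L0, dated December 28, 2018.

A newer BIOS is available. Its numbering/naming convention differs from that of the version you have now, which is unusual. Contact MSI support and ask about this.

Note: confirm that I have the right web page for your motherboard model.

Note: do not download/use/install the latest BETA version.

Note: make a backup before updating the BIOS.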
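
Before contacting MSI, it may help to have the exact board and BIOS identifiers at hand. The following is a minimal sketch, assuming the kernel exposes DMI data under /sys/class/dmi/id (typical on desktop PCs, but not guaranteed); the helper name dmi_field is made up for this example.

```shell
#!/bin/sh
# Sketch: print board and BIOS identifiers from sysfs, which needs no root.
# Assumption: DMI data is available under /sys/class/dmi/id.

# dmi_field NAME: print the contents of one DMI file, or "unknown"
dmi_field() {
    path="/sys/class/dmi/id/$1"
    if [ -r "$path" ]; then
        cat "$path"
    else
        echo "unknown"
    fi
}

echo "Board vendor: $(dmi_field board_vendor)"
echo "Board name:   $(dmi_field board_name)"
echo "BIOS version: $(dmi_field bios_version)"
echo "BIOS date:    $(dmi_field bios_date)"

# Similar queries through dmidecode (these need root):
#   sudo dmidecode -s baseboard-product-name
#   sudo dmidecode -s bios-release-date
```

The board name printed here is what should match the model on the MSI download page.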

Swap

Let's increase /swapfile from 2G to 4G.

Note: incorrect use of dd can cause data loss. Copying and pasting the commands is recommended.

sudo swapoff -a           # turn off swap
sudo rm -i /swapfile      # remove old /swapfile

sudo dd if=/dev/zero of=/swapfile bs=1M count=4096

sudo chmod 600 /swapfile  # set proper file protections
sudo mkswap /swapfile     # init /swapfile
sudo swapon /swapfile     # turn on swap
free -h                   # confirm 16G RAM and 4G swap
reboot                    # reboot and verify operation

Add this line to /etc/fstab if it is not already there (your grep output shows that it is)...

/swapfile    none    swap    sw      0   0
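
After the reboot, a quick check can confirm that exactly one swap entry is configured and that the new swap file is active. This is only a sketch: the function name count_swap_entries is invented here, and it assumes the usual whitespace-separated /etc/fstab layout with the filesystem type in the third column.

```shell
#!/bin/sh
# count_swap_entries FILE: number of non-comment lines whose type column is "swap"
count_swap_entries() {
    awk '$1 !~ /^#/ && $3 == "swap" { n++ } END { print n + 0 }' "$1"
}

# Report on the real fstab only if it is readable on this machine
if [ -r /etc/fstab ]; then
    echo "swap entries in /etc/fstab: $(count_swap_entries /etc/fstab)"
fi

# List the active swap areas and their sizes, if swapon is available
if command -v swapon >/dev/null 2>&1; then
    swapon --show || true
fi
```

A count of 1 plus a 4G /swapfile in the swapon listing would match what the steps above are meant to produce.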

Nvidia

You have Nvidia driver version 440.100.

Software & Updates shows this as the current version. However, a newer version, 450.57, is available for download.

Note: make a backup before updating the Nvidia driver.
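
One way to look for evidence that the Nvidia driver is involved is to search the kernel log of the previous boot for the driver's Xid error reports, which it logs with an "NVRM: Xid" prefix. The sketch below assumes the journal is persistent so that the previous boot is still available; gpu_errors is a made-up helper name. An empty result does not clear the GPU, because a hard freeze can stop the system before anything is written to disk.

```shell
#!/bin/sh
# gpu_errors: filter kernel log text on stdin for Nvidia Xid error lines
gpu_errors() {
    grep 'NVRM: Xid' || true   # "|| true": no match is not treated as a failure
}

# Search the kernel messages of the previous boot (-b -1), if journalctl exists.
# Errors (for example, no previous boot stored) are silenced on purpose.
if command -v journalctl >/dev/null 2>&1; then
    journalctl -k -b -1 --no-pager 2>/dev/null | gpu_errors
fi
```

Running this right after a forced reboot keeps -b -1 pointed at the boot that froze.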

Update #1:

Since you had to force the computer to power off, let's check your file system...

  • Boot from a Ubuntu Live DVD/USB in "Try Ubuntu" mode
  • Open a terminal window with Ctrl+Alt+T
  • Type sudo fdisk -l
  • Identify the /dev/sdXX device name for your "Linux Filesystem"
  • Type sudo fsck -f /dev/sdXX, replacing sdXX with the device name you found earlier
  • Repeat the fsck command if there were errors
  • Type reboot
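
The device lookup in the steps above can be sketched as a small filter over the fdisk listing. It assumes a GPT disk, where fdisk labels such partitions "Linux filesystem" (on MBR disks the type is shown as just "Linux", which this filter would miss); linux_partitions is a hypothetical helper name.

```shell
#!/bin/sh
# linux_partitions: read `fdisk -l` output on stdin and print the device name
# (first column) of every partition line that mentions "Linux filesystem"
linux_partitions() {
    awk 'tolower($0) ~ /linux filesystem/ && $1 ~ /^\/dev\// { print $1 }'
}

# Usage from the live session (left as comments because both need root):
#   sudo fdisk -l | linux_partitions
#   sudo fsck -f /dev/sdXX    # sdXX is a placeholder for a name printed above
```

The check is run from the live session so that the filesystem is not mounted while fsck works on it.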

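Update #2:

Install the BIOS update... but first contact MSI support to confirm which BIOS update file version you need... because their naming convention appears to have changed.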

You have many GNOME Shell extensions installed, and any one of them could be causing the freezes. They are also installed in the "wrong" place, because they are installed system-wide rather than per user. You can see them in the /usr/share/gnome-shell/extensions directory listing; they all end in gcampax.github.com.

The safest way to remove them is to visit https://extensions.gnome.org/local/ and remove every extension except these three...

drwxr-xr-x 2 root root 4.0K Jun 11 08:20 'desktop-icons@csoriano'/
drwxr-xr-x 3 root root 4.0K May 12 15:17 'ubuntu-appindicators@ubuntu.com'/
drwxr-xr-x 3 root root 4.0K Jun 18 09:12 'ubuntu-dock@ubuntu.com'/

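If the system then runs well and does not freeze for a while, manually reinstall your favorite extensions one at a time, rather than installing an extension pack/zip file.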
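
To see which system-wide extensions are candidates for removal, the directory can be listed with the three stock Ubuntu extensions filtered out. This is a rough sketch; extra_extensions is a name invented for this example, and it only prints names without removing anything.

```shell
#!/bin/sh
# extra_extensions DIR: print the extension directories under DIR other than
# the three stock Ubuntu extensions shown in the listing above
extra_extensions() {
    for entry in "$1"/*/; do
        [ -d "$entry" ] || continue      # skip the unexpanded glob if DIR is empty or missing
        name=$(basename "$entry")
        case "$name" in
            desktop-icons@csoriano|ubuntu-appindicators@ubuntu.com|ubuntu-dock@ubuntu.com) ;;
            *) echo "$name" ;;
        esac
    done
}

extra_extensions /usr/share/gnome-shell/extensions
```

Keeping a copy of this list makes it easier to reinstall favorites one by one later and to notice which one brings the freezes back.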