Mar*_*ter 6 kernel debian gpu amd
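I have a freshly installed machine running Debian Buster. The GPU is a Radeon FirePro W2100. After a few hours of use, the machine suddenly freezes, the display switches to "white noise", and the machine becomes unusable.
In the logs I see many errors like these: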
kernel: radeon 0000:65:00.0: ring 0 stalled for more than 10240msec
kernel: radeon 0000:65:00.0: GPU lockup (current fence id 0x0000000000039bff last fence id 0x0000000000039c42 on ring 0)
kernel: radeon 0000:65:00.0: failed to get a new IB (-35)
kernel: [drm:ffffffff816219d0] *ERROR* Couldn't update BO_VA (-35)
kernel: radeon 0000:65:00.0: failed to get a new IB (-35)
Followed by:
kernel: radeon 0000:65:00.0: ring 0 stalled for more than 10032msec
kernel: radeon 0000:65:00.0: GPU lockup (current fence id 0x0000000000039bff last fence id 0x0000000000039c42 on ring 0)
What do these errors mean, and how can I fix the problem?
Is this a hardware or a software issue?
I get "radeon 0000:04:00.0: ring 0 stalled for more than 10240msec" on my [AMD/ATI] RV620 GL [FirePro 2450] after running the Opera web browser for a few minutes under Ubuntu 20.04.5 LTS. Firefox and other programs cause no problems; only Opera triggers it.
[128524.943553] radeon 0000:04:00.0: ring 0 stalled for more than 10240msec
[128524.943565] radeon 0000:04:00.0: GPU lockup (current fence id 0x000000000029caf6 last fence id 0x000000000029cafc on ring 0)
[128524.955392] radeon 0000:04:00.0: Saved 185 dwords of commands on ring 0.
[128524.955409] radeon 0000:04:00.0: GPU softreset: 0x00000009
[128524.955413] radeon 0000:04:00.0: R_008010_GRBM_STATUS = 0xA2303030
[128524.955417] radeon 0000:04:00.0: R_008014_GRBM_STATUS2 = 0x00000003
[128524.955420] radeon 0000:04:00.0: R_000E50_SRBM_STATUS = 0x200010C0
[128524.955423] radeon 0000:04:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000
[128524.955426] radeon 0000:04:00.0: R_008678_CP_STALLED_STAT2 = 0x00008002
[128524.955429] radeon 0000:04:00.0: R_00867C_CP_BUSY_STAT = 0x00008086
[128524.955432] radeon 0000:04:00.0: R_008680_CP_STAT = 0x80018645
[128524.955435] radeon 0000:04:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
[128525.013038] radeon 0000:04:00.0: R_008020_GRBM_SOFT_RESET=0x00007FEF
[128525.013097] radeon 0000:04:00.0: SRBM_SOFT_RESET=0x00000100
[128525.015187] radeon 0000:04:00.0: R_008010_GRBM_STATUS = 0xA0003030
[128525.015191] radeon 0000:04:00.0: R_008014_GRBM_STATUS2 = 0x00000003
[128525.015195] radeon 0000:04:00.0: R_000E50_SRBM_STATUS = 0x200080C0
[128525.015198] radeon 0000:04:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000
[128525.015201] radeon 0000:04:00.0: R_008678_CP_STALLED_STAT2 = 0x00000000
[128525.015204] radeon 0000:04:00.0: R_00867C_CP_BUSY_STAT = 0x00000000
[128525.015207] radeon 0000:04:00.0: R_008680_CP_STAT = 0x80100000
[128525.015210] radeon 0000:04:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
[128525.015220] radeon 0000:04:00.0: GPU reset succeeded, trying to resume
[128525.031584] [drm] PCIE gen 2 link speeds already enabled
[128525.034184] [drm] PCIE GART of 512M enabled (table at 0x0000000000142000).
[128525.034222] radeon 0000:04:00.0: WB enabled
[128525.034224] radeon 0000:04:00.0: fence driver on ring 0 use gpu addr 0x0000000010000c00
[128525.034579] radeon 0000:04:00.0: fence driver on ring 5 use gpu addr 0x00000000000521d0
[128525.034797] debugfs: File 'radeon_ring_gfx' in directory '0' already present!
[128525.066237] [drm] ring test on 0 succeeded in 1 usecs
[128525.066242] debugfs: File 'radeon_ring_uvd' in directory '0' already present!
[128525.240884] [drm] ring test on 5 succeeded in 1 usecs
[128525.240893] [drm] UVD initialized successfully.
[128535.695467] radeon 0000:04:00.0: ring 0 stalled for more than 10456msec
[128535.695479] radeon 0000:04:00.0: GPU lockup (current fence id 0x000000000029caf8 last fence id 0x000000000029cafc on ring 0)
[128535.697433] [drm:r600_ib_test [radeon]] *ERROR* radeon: fence wait failed (-35).
[128535.697551] [drm:radeon_ib_ring_tests [radeon]] *ERROR* radeon: failed testing IB on GFX ring (-35).
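To compare lockups across logs like the ones above, a small script can pull out the fence IDs and the error code. This is only a rough sketch: the function names and regular expressions are my own, written against the message formats shown here. It assumes that "current fence id" is the last fence the GPU completed and "last fence id" is the last one the driver submitted, so their difference is the number of fences still outstanding. The mapping of 35 to EDEADLK is the generic Linux errno value; I am assuming the logs come from such a system.

```python
import re

# Matches the "GPU lockup" message format seen in the logs above.
LOCKUP_RE = re.compile(
    r"GPU lockup \(current fence id (0x[0-9a-fA-F]+) "
    r"last fence id (0x[0-9a-fA-F]+) on ring (\d+)\)"
)

# Matches a trailing negative error code such as "(-35)" or "(-35).".
ERRCODE_RE = re.compile(r"\((-\d+)\)\.?\s*$")

# Assumption: generic Linux errno numbering, where 35 is EDEADLK.
LINUX_ERRNO_NAMES = {35: "EDEADLK"}


def parse_lockup(line):
    """Return ring and fence info for a 'GPU lockup' line, else None."""
    m = LOCKUP_RE.search(line)
    if m is None:
        return None
    current = int(m.group(1), 16)
    last = int(m.group(2), 16)
    return {
        "ring": int(m.group(3)),
        "current": current,
        "last": last,
        # Assumed meaning: fences submitted but not yet completed.
        "pending": last - current,
    }


def parse_error_code(line):
    """Return the errno name (or number as text) for a trailing '(-N)', else None."""
    m = ERRCODE_RE.search(line)
    if m is None:
        return None
    code = -int(m.group(1))
    return LINUX_ERRNO_NAMES.get(code, str(code))


if __name__ == "__main__":
    sample = (
        "radeon 0000:04:00.0: GPU lockup (current fence id "
        "0x000000000029caf6 last fence id 0x000000000029cafc on ring 0)"
    )
    print(parse_lockup(sample))
    print(parse_error_code("radeon 0000:65:00.0: failed to get a new IB (-35)"))
```

Under that reading, the Opera log has 6 fences outstanding at the first lockup, and the current fence ID moves only from ...caf6 to ...caf8 after the reset, which fits the failed IB test that follows.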