Ath10k and QCA6174 causing PCIe errors, firmware crashes, and dropped connections?

Kaz*_*lfe 6 firmware wireless atheros drivers 18.04

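I recently (re)installed Ubuntu 18.04 on my Razer Blade Pro (2017). My wireless card performs terribly and drops the connection frequently. Checking dmesg for Atheros messages turns up the following (nasty) crash: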

[ 6709.200017] ath10k_pci 0000:3c:00.0: firmware crashed! (guid 01e29e97-0ee6-4538-8756-764abe49705f)
[ 6709.200048] ath10k_pci 0000:3c:00.0: qca6174 hw3.2 target 0x05030000 chip_id 0x00340aff sub 1a56:1535
[ 6709.200056] ath10k_pci 0000:3c:00.0: kconfig debug 0 debugfs 1 tracing 1 dfs 0 testmode 0
[ 6709.201666] ath10k_pci 0000:3c:00.0: firmware ver WLAN.RM.4.4.1-00079-QCARMSWPZ-1 api 6 features wowlan,ignore-otp crc32 fd869beb
[ 6709.202773] ath10k_pci 0000:3c:00.0: board_file api 2 bmi_id N/A crc32 20d869c3
[ 6709.202784] ath10k_pci 0000:3c:00.0: htt-ver 3.47 wmi-op 4 htt-op 3 cal otp max-sta 32 raw 0 hwcrypto 1
[ 6709.204809] ath10k_pci 0000:3c:00.0: firmware register dump:
[ 6709.204822] ath10k_pci 0000:3c:00.0: [00]: 0x05030000 0x000015B3 0x009E6FD4 0x00955B31
[ 6709.204830] ath10k_pci 0000:3c:00.0: [04]: 0x009E6FD4 0x00060730 0x0000001D 0x00473AD4
[ 6709.204838] ath10k_pci 0000:3c:00.0: [08]: 0x0049C59C 0x0044DEB4 0x004290B0 0x00449AB0
[ 6709.204847] ath10k_pci 0000:3c:00.0: [12]: 0x00000009 0xFFFFFFFF 0x00952F6C 0x00952F77
[ 6709.204854] ath10k_pci 0000:3c:00.0: [16]: 0x00952CC4 0x0091080D 0x00000000 0x0091080D
[ 6709.204862] ath10k_pci 0000:3c:00.0: [20]: 0x409E6FD4 0x0040E818 0x00405820 0x0049C464
[ 6709.204870] ath10k_pci 0000:3c:00.0: [24]: 0x809E9395 0x0040E878 0x0049C6E8 0xC09E6FD4
[ 6709.204879] ath10k_pci 0000:3c:00.0: [28]: 0x80932EF9 0x0040EA68 0x0040A054 0x00000009
[ 6709.204887] ath10k_pci 0000:3c:00.0: [32]: 0x809F8C46 0x0040EA98 0x0041201C 0x00000004
[ 6709.204894] ath10k_pci 0000:3c:00.0: [36]: 0x80911210 0x0040EAC8 0x00000005 0x004040F4
[ 6709.204902] ath10k_pci 0000:3c:00.0: [40]: 0x80911154 0x0040EB28 0x00400000 0x00000000
[ 6709.204910] ath10k_pci 0000:3c:00.0: [44]: 0x8091122D 0x0040EB48 0x00000000 0x00400600
[ 6709.204922] ath10k_pci 0000:3c:00.0: [48]: 0x40910024 0x0040EB78 0x0040AB98 0x0040AB98
[ 6709.204930] ath10k_pci 0000:3c:00.0: [52]: 0x00000000 0x0040EB98 0x009BB001 0x00040020
[ 6709.204938] ath10k_pci 0000:3c:00.0: [56]: 0x809EDA21 0x0040E938 0x00499F10 0x00000000
[ 6709.204944] ath10k_pci 0000:3c:00.0: Copy Engine register dump:
[ 6709.204967] ath10k_pci 0000:3c:00.0: [00]: 0x00034400  14  14   3   3
[ 6709.204990] ath10k_pci 0000:3c:00.0: [01]: 0x00034800  17  17 510 511
[ 6709.205012] ath10k_pci 0000:3c:00.0: [02]: 0x00034c00   5   5  68  69
[ 6709.205034] ath10k_pci 0000:3c:00.0: [03]: 0x00035000  27  27  29  27
[ 6709.205057] ath10k_pci 0000:3c:00.0: [04]: 0x00035400 131 131 131  67
[ 6709.205079] ath10k_pci 0000:3c:00.0: [05]: 0x00035800   0   0  64   0
[ 6709.205101] ath10k_pci 0000:3c:00.0: [06]: 0x00035c00  26  26  24  24
[ 6709.205123] ath10k_pci 0000:3c:00.0: [07]: 0x00036000   1   1   1   1
[ 6710.053042] ath10k_pci 0000:3c:00.0: Unknown eventid: 118809
[ 6710.056101] ath10k_pci 0000:3c:00.0: Unknown eventid: 90118
[ 6710.153420] ath10k_pci 0000:3c:00.0: device successfully recovered
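To get a feel for how often the firmware crashes, the timestamps of the "firmware crashed!" lines can be pulled out of the kernel log. This is only a rough sketch: crash_times is a hypothetical helper name, and the sample line is the crash line from the log above.

```shell
# Print the kernel timestamp (seconds since boot) of every
# ath10k "firmware crashed!" line read from stdin.
crash_times() {
    sed -n 's/^\[ *\([0-9.]*\)] .*firmware crashed!.*/\1/p'
}

# Sample line copied from the dmesg output above.
echo '[ 6709.200017] ath10k_pci 0000:3c:00.0: firmware crashed! (guid 01e29e97-0ee6-4538-8756-764abe49705f)' | crash_times
```

On a live system the same helper can be fed from the kernel log with `dmesg | crash_times`.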

There are also the following entries related to the wireless card:

[ 7403.617792] pcieport 0000:00:1c.6: AER: Corrected error received: id=00e6
[ 7403.617797] pcieport 0000:00:1c.6: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e6(Transmitter ID)
[ 7403.617800] pcieport 0000:00:1c.6:   device [8086:a116] error status/mask=00001000/00002000
[ 7403.617802] pcieport 0000:00:1c.6:    [12] Replay Timer Timeout 
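The id=00e6 field in the AER lines is a 16-bit PCI requester ID. A minimal sketch for decoding it, assuming the standard packing of 8 bits of bus, 5 bits of device, and 3 bits of function; the helper name aer_id_to_bdf is made up for illustration.

```shell
# Decode a 16-bit AER requester ID (hex, without the 0x prefix)
# into the usual bus:device.function notation.
aer_id_to_bdf() {
    id=$((0x$1))
    printf '%02x:%02x.%x\n' $((id >> 8)) $(((id >> 3) & 0x1f)) $((id & 0x7))
}

# The ID reported in the log above.
aer_id_to_bdf 00e6
```

For 00e6 this yields 00:1c.6, the same pcieport device named at the start of each AER line. Bit 12 set in the error status (00001000) corresponds to the "[12] Replay Timer Timeout" line that the kernel prints.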

The lspci output for the card is as follows:

3c:00.0 Network controller: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter (rev 32)
    Subsystem: Bigfoot Networks, Inc. QCA6174 802.11ac Wireless Network Adapter
    Flags: bus master, fast devsel, latency 0, IRQ 145
    Memory at dc200000 (64-bit, non-prefetchable) [size=2M]
    Capabilities: [40] Power Management version 3
    Capabilities: [50] MSI: Enable+ Count=1/8 Maskable+ 64bit-
    Capabilities: [70] Express Endpoint, MSI 00
    Capabilities: [100] Advanced Error Reporting
    Capabilities: [148] Virtual Channel
    Capabilities: [168] Device Serial Number 00-00-00-00-00-00-00-00
    Capabilities: [178] Latency Tolerance Reporting
    Capabilities: [180] L1 PM Substates
    Kernel driver in use: ath10k_pci
    Kernel modules: ath10k_pci

-[0000:00]-+-00.0  Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers
           +- ...
           +-1c.0-[02-3a]--
           +-1c.4-[3b]----00.0  ...
           +-1c.6-[3c]----00.0  Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter
           +-1d.0-[3d]----00.0  ...
           +- ...

Loading the card (at boot) produces the following dmesg output:

[   29.432791] ath10k_pci 0000:3c:00.0: enabling device (0000 -> 0002)
[   29.433628] ath10k_pci 0000:3c:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
[   29.721996] ath10k_pci 0000:3c:00.0: Direct firmware load for ath10k/pre-cal-pci-0000:3c:00.0.bin failed with error -2
[   29.722023] ath10k_pci 0000:3c:00.0: Direct firmware load for ath10k/cal-pci-0000:3c:00.0.bin failed with error -2
[   29.725059] ath10k_pci 0000:3c:00.0: qca6174 hw3.2 target 0x05030000 chip_id 0x00340aff sub 1a56:1535
[   29.725061] ath10k_pci 0000:3c:00.0: kconfig debug 0 debugfs 1 tracing 1 dfs 0 testmode 0
[   29.725481] ath10k_pci 0000:3c:00.0: firmware ver WLAN.RM.4.4.1-00079-QCARMSWPZ-1 api 6 features wowlan,ignore-otp crc32 fd869beb
[   29.791271] ath10k_pci 0000:3c:00.0: board_file api 2 bmi_id N/A crc32 20d869c3
[   30.386364] ath10k_pci 0000:3c:00.0: Unknown eventid: 118809
[   30.389342] ath10k_pci 0000:3c:00.0: Unknown eventid: 90118
[   30.389967] ath10k_pci 0000:3c:00.0: htt-ver 3.47 wmi-op 4 htt-op 3 cal otp max-sta 32 raw 0 hwcrypto 1
[   30.471606] ath: EEPROM regdomain: 0x6c
[   30.471606] ath: EEPROM indicates we should expect a direct regpair map
[   30.471607] ath: Country alpha2 being used: 00
[   30.471608] ath: Regpair used: 0x6c
[   30.475073] ath10k_pci 0000:3c:00.0 wlp60s0: renamed from wlan0
[   31.698248] ath10k_pci 0000:3c:00.0: Unknown eventid: 118809
[   31.701166] ath10k_pci 0000:3c:00.0: Unknown eventid: 90118

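It should be noted that my system does not have an hw3.2 directory in /lib/firmware/ath10k/QCA6174. I have version 1.173.1 of linux-firmware installed, and there are no proprietary drivers that appear to apply to my wireless card. The obligatory AIO script results are available on Pastebin.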

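After my wireless card crashes, I can usually restore the connection by turning WiFi off and back on in the GNOME menu, but it is annoying to do that every time the wireless crashes (which happens anywhere from a few minutes to a few hours after the previous crash). This worked fine in 16.04 HWE before I had to uninstall Linux, so I'm not sure why 18.04 brings a whole new set of problems, but evidently they exist now.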
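The same off/on toggle can also be done from a terminal. A minimal sketch, assuming NetworkManager manages the card and nmcli is installed; the wifi_toggle function and the NMCLI override variable are hypothetical names used only for illustration.

```shell
# Command used to talk to NetworkManager. It can be overridden
# (for example NMCLI=echo) for a dry run; this override is an assumption
# made for the sketch, not something nmcli itself provides.
NMCLI=${NMCLI:-nmcli}

# Turn the WiFi radio off, wait a moment, and turn it back on.
# $1 is the pause in seconds (defaults to 2).
wifi_toggle() {
    "$NMCLI" radio wifi off
    sleep "${1:-2}"
    "$NMCLI" radio wifi on
}
```

Calling `wifi_toggle` (or `wifi_toggle 5` for a longer pause) after a crash should have the same effect as the menu toggle, though it does nothing to prevent the crash itself.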

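I assume this is a kernel-related bug (though I have not filed a report on it yet), but I would like to know whether there are any workarounds that would make my wireless connection last longer than ten minutes and/or stop the PCIe bus error messages from cluttering my system log.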

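Apart from replacing the wireless card and waiting for an official fix, what can I do to improve wireless performance (and stop the crashes)?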

Kaz*_*lfe 8

Warning: this is only a partial solution!

While the main problem (wifi dropouts and crashes) seems to be resolved, the AER Corrected Error message still spams the logs. At least the wifi is more consistent now.

A comment by Bernard Wei led to the ath10k firmware repository, which contains an update for the hw3.0 chain.

Downloading firmware-6.bin_WLAN.RM.4.4.1-00110-QCARMSWP-1 and using it to replace firmware-6.bin in /lib/firmware/ath10k/QCA6174/hw3.0, followed by a reboot, gave a much more stable wireless experience.

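# Go to the hw3.0 firmware directory
cd /lib/firmware/ath10k/QCA6174/hw3.0
# Keep the existing firmware as a backup
sudo mv firmware-6.bin firmware-6.bin.old
# Download the newer firmware and save it as firmware-6.bin
sudo wget https://github.com/kvalo/ath10k-firmware/raw/master/QCA6174/hw3.0/4.4.1/firmware-6.bin_WLAN.RM.4.4.1-00110-QCARMSWP-1 -O firmware-6.bin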
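After the reboot, one way to check which firmware the driver actually loaded is to pull the version string out of the ath10k_pci "firmware ver" line in the kernel log. A small sketch; fw_version is a hypothetical helper name, and the sample line is taken from the boot log in the question (so it shows the old version).

```shell
# Print the firmware version from ath10k_pci "firmware ver ..." lines on stdin.
fw_version() {
    sed -n 's/.*ath10k_pci .*firmware ver \([^ ]*\) api .*/\1/p'
}

# Sample line copied from the dmesg output in the question.
echo '[   29.725481] ath10k_pci 0000:3c:00.0: firmware ver WLAN.RM.4.4.1-00079-QCARMSWPZ-1 api 6 features wowlan,ignore-otp crc32 fd869beb' | fw_version
```

On the live system this would be run as `dmesg | fw_version`. With the replaced file the reported version would presumably be the 00110 build named in the download, although that expectation is an assumption based only on the file name.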

Note, however, that the following lines now appear in the system log:

[   21.482256] ath10k_pci 0000:3c:00.0: Unknown eventid: 3
[   21.498398] ath10k_pci 0000:3c:00.0: Unknown eventid: 118809
[   21.501401] ath10k_pci 0000:3c:00.0: Unknown eventid: 90118

Now... to wait for it to actually land in the linux-firmware package. And for the AER errors to be fixed...