I have a Seagate ST2000DM001 2TB Barracuda SATA3 disk that produces errors similar to this:
[Tue Jun 14 10:02:06 2022] ata2.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
[Tue Jun 14 10:02:06 2022] ata2.00: failed command: WRITE FPDMA QUEUED
[Tue Jun 14 10:02:06 2022] ata2.00: cmd 61/00:00:00:48:9f/02:00:b2:00:00/40 tag 0 ncq 262144 out
[Tue Jun 14 10:02:06 2022] res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[Tue Jun 14 10:02:06 2022] ata2.00: status: { DRDY }
[Tue Jun 14 10:02:06 2022] ata2: hard resetting link
[Tue Jun 14 10:02:16 2022] ata2: softreset failed (1st FIS failed)
[Tue Jun 14 10:02:16 2022] ata2: hard resetting link
[Tue Jun 14 10:02:26 2022] ata2: softreset failed (1st FIS failed)
[Tue Jun 14 10:02:26 2022] ata2: hard resetting link
[Tue Jun 14 10:02:42 2022] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[Tue Jun 14 10:02:42 2022] ata2.00: configured for UDMA/133
[Tue Jun 14 10:02:42 2022] ata2.00: device reported invalid CHS sector 0
[Tue Jun 14 10:02:42 2022] ata2: EH complete
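I tested the disk with different cables on different machines, and the errors persist. It looks like a clear-cut case of a failing disk, but there is a twist. Checking for the errors during a very long operation, mkfs.ext4 -c -c, revealed a periodic pattern: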
[Mon Jun 13 10:47:02 2022] ata2.00: failed command: WRITE FPDMA QUEUED
[Mon Jun 13 11:51:08 2022] ata2.00: failed command: WRITE FPDMA QUEUED
[Mon Jun 13 12:55:14 2022] ata2.00: failed command: WRITE FPDMA QUEUED
[Mon Jun 13 14:01:21 2022] ata2.00: failed command: READ FPDMA QUEUED
[Mon Jun 13 15:08:27 2022] ata2.00: failed command: READ FPDMA QUEUED
[Mon Jun 13 16:15:33 2022] ata2.00: failed command: READ FPDMA QUEUED
[Mon Jun 13 17:22:39 2022] ata2.00: failed command: WRITE FPDMA QUEUED
[Mon Jun 13 18:29:43 2022] ata2.00: failed command: WRITE FPDMA QUEUED
[Mon Jun 13 19:36:49 2022] ata2.00: failed command: WRITE FPDMA QUEUED
[Mon Jun 13 20:43:55 2022] ata2.00: failed command: WRITE FPDMA QUEUED
[Mon Jun 13 21:50:02 2022] ata2.00: failed command: READ FPDMA QUEUED
[Mon Jun 13 22:57:08 2022] ata2.00: failed command: READ FPDMA QUEUED
[Tue Jun 14 00:04:14 2022] ata2.00: failed command: READ FPDMA QUEUED
[Tue Jun 14 01:11:17 2022] ata2.00: failed command: WRITE FPDMA QUEUED
[Tue Jun 14 02:15:24 2022] ata2.00: failed command: WRITE FPDMA QUEUED
[Tue Jun 14 03:19:30 2022] ata2.00: failed command: WRITE FPDMA QUEUED
[Tue Jun 14 04:26:36 2022] ata2.00: failed command: READ FPDMA QUEUED
[Tue Jun 14 05:33:42 2022] ata2.00: failed command: READ FPDMA QUEUED
[Tue Jun 14 06:40:48 2022] ata2.00: failed command: READ FPDMA QUEUED
[Tue Jun 14 07:47:54 2022] ata2.00: failed command: WRITE FPDMA QUEUED
[Tue Jun 14 08:55:00 2022] ata2.00: failed command: WRITE FPDMA QUEUED
[Tue Jun 14 10:02:06 2022] ata2.00: failed command: WRITE FPDMA QUEUED
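That is one error almost exactly every 1 hour 7 minutes. I thought it might be related to smartd, but smartd is not running. So I am stuck: what kind of hardware failure would produce errors with a period of 1 hour 7 minutes? Any ideas would be much appreciated.
Regards,
Nicolas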
Mar*_*ler 21
An interval of 1 hour 7 minutes is almost exactly 4000 seconds, within the precision of a cheap oscillator.
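The 4000-second figure can be checked directly against the timestamps in the question. The following is only a rough sketch in Python: the helper names (`parse_failures`, `intervals_seconds`) are made up for illustration, the log lines are copied from the question, and it assumes the human-readable timestamps are accurate to the second, which kernel log timestamps converted to wall-clock time are not guaranteed to be.

```python
import re
from datetime import datetime

# "failed command" lines copied verbatim from the question.
LOG = """\
[Mon Jun 13 10:47:02 2022] ata2.00: failed command: WRITE FPDMA QUEUED
[Mon Jun 13 11:51:08 2022] ata2.00: failed command: WRITE FPDMA QUEUED
[Mon Jun 13 12:55:14 2022] ata2.00: failed command: WRITE FPDMA QUEUED
[Mon Jun 13 14:01:21 2022] ata2.00: failed command: READ FPDMA QUEUED
[Mon Jun 13 15:08:27 2022] ata2.00: failed command: READ FPDMA QUEUED
[Mon Jun 13 16:15:33 2022] ata2.00: failed command: READ FPDMA QUEUED
[Mon Jun 13 17:22:39 2022] ata2.00: failed command: WRITE FPDMA QUEUED
[Mon Jun 13 18:29:43 2022] ata2.00: failed command: WRITE FPDMA QUEUED
[Mon Jun 13 19:36:49 2022] ata2.00: failed command: WRITE FPDMA QUEUED
[Mon Jun 13 20:43:55 2022] ata2.00: failed command: WRITE FPDMA QUEUED
[Mon Jun 13 21:50:02 2022] ata2.00: failed command: READ FPDMA QUEUED
[Mon Jun 13 22:57:08 2022] ata2.00: failed command: READ FPDMA QUEUED
[Tue Jun 14 00:04:14 2022] ata2.00: failed command: READ FPDMA QUEUED
[Tue Jun 14 01:11:17 2022] ata2.00: failed command: WRITE FPDMA QUEUED
[Tue Jun 14 02:15:24 2022] ata2.00: failed command: WRITE FPDMA QUEUED
[Tue Jun 14 03:19:30 2022] ata2.00: failed command: WRITE FPDMA QUEUED
[Tue Jun 14 04:26:36 2022] ata2.00: failed command: READ FPDMA QUEUED
[Tue Jun 14 05:33:42 2022] ata2.00: failed command: READ FPDMA QUEUED
[Tue Jun 14 06:40:48 2022] ata2.00: failed command: READ FPDMA QUEUED
[Tue Jun 14 07:47:54 2022] ata2.00: failed command: WRITE FPDMA QUEUED
[Tue Jun 14 08:55:00 2022] ata2.00: failed command: WRITE FPDMA QUEUED
[Tue Jun 14 10:02:06 2022] ata2.00: failed command: WRITE FPDMA QUEUED
"""

# Captures the bracketed timestamp and the first word of the failed command.
FAILURE = re.compile(r"^\[([^\]]+)\] .*failed command: (\w+)")


def parse_failures(text):
    """Return a list of (timestamp, command) pairs for each failure line."""
    events = []
    for line in text.splitlines():
        match = FAILURE.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%a %b %d %H:%M:%S %Y")
            events.append((stamp, match.group(2)))
    return events


def intervals_seconds(events):
    """Return the gaps, in seconds, between consecutive failures."""
    times = [stamp for stamp, _ in events]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


events = parse_failures(LOG)
gaps = intervals_seconds(events)
mean_gap = sum(gaps) / len(gaps)

for (stamp, command), gap in zip(events[1:], gaps):
    print(f"{stamp}  {command:<5}  +{gap:.0f} s")
print(f"mean gap: {mean_gap:.0f} s over {len(gaps)} intervals")
```

On the data above the individual gaps range from 3846 to 4026 seconds, with a mean of roughly 3986 seconds, so "about 4000 seconds" is a fair summary even though the gaps are not perfectly constant.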
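A period that round suggests that something in the SATA drive's firmware or the SATA controller's firmware may be doing this automatically.
Basically, the cause could be anything. For example, the drive firmware might reset every 4000 seconds when some component-check subroutine fails. Or the SATA controller firmware might reset every 4000 seconds when it tries to renegotiate the link and fails, or anything else. (Neither of these two examples is any more likely than any other explanation.)
The only thing the timing tells you is that software decided to do this, whether that software runs as part of the operating system, the controller firmware, or the drive firmware. It might be a software bug, or it might be the genuine detection of a hardware fault.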
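So this is really hard to diagnose. If both the controller and the drive are already on their latest firmware versions (fwupdmgr get-updates is your friend for both), then.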