SSD 问题:CRC 错误不断增加,冻结,有时是只读的

Mua*_*rif 6 hardware ssd disk smart ubuntu-gnome

我的笔记本电脑 SSD出现故障,自上次发布以来,错误数量猛增。

这个驱动器死了/快死了吗?
它现在已经打开,我正在写这个 - 我已经备份了所有数据,但我仍然不确定它是否可用?

联系制造商并没有多大帮助:他们让我安装 Windows 并从那里运行磁盘检查实用程序,或者将其作为外部驱动器连接到 Windows 主机并在那里进行测试。
我两者都做了,没有遇到错误。

我还使用他们提供的实用程序对其进行了检查(请参见下面的屏幕截图)。然后我用我用clonezilla制作的镜像返回到Ubuntu,发现SATA PHY错误计数接近300个错误!

我还检查了连接器,但由于 SSD 在笔记本电脑中,我无法更换电缆(很容易)。

这些是制造商的实用程序生成的测试结果

检测结果

然后smartctl在 Ubuntu上输出:

smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.14.0-041400-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     SPCC Solid State Disk
Serial Number:    XXXXXXXXXX
Firmware Version: S9FM02.8
User Capacity:    120,034,123,776 bytes [120 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-3 (minor revision not indicated)
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sun Feb 18 02:22:56 2018 EET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.
Total time to complete Offline 
data collection:        (   30) seconds.
Offline data collection
capabilities:            (0x7b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:    (   1) minutes.
Extended self-test routine
recommended polling time:    (   2) minutes.
Conveyance self-test routine
recommended polling time:    (   2) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000a   100   100   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       6352
 12 Power_Cycle_Count       0x0012   100   100   000    Old_age   Always       -       2717
168 Unknown_Attribute       0x0012   100   100   000    Old_age   Always       -       0
170 Unknown_Attribute       0x0013   100   100   010    Pre-fail  Always       -       25
173 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       105447539
192 Power-Off_Retract_Count 0x0012   100   100   000    Old_age   Always       -       77
194 Temperature_Celsius     0x0023   070   070   000    Pre-fail  Always       -       30
196 Reallocated_Event_Count 0x0000   100   100   000    Old_age   Offline      -       0
218 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       15431
241 Total_LBAs_Written      0x0012   100   100   000    Old_age   Always       -       6281157

SMART Error Log Version: 1
ATA Error Count: 298 (device log contains only the most recent five errors)
    CR = Command Register [HEX]
    FR = Features Register [HEX]
    SC = Sector Count Register [HEX]
    SN = Sector Number Register [HEX]
    CL = Cylinder Low Register [HEX]
    CH = Cylinder High Register [HEX]
    DH = Device/Head Register [HEX]
    DC = Device Command Register [HEX]
    ER = Error register [HEX]
    ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 298 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 01 01 00 00 00

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ff d5 01 01 00 00 00 ff      00:11:08.077  [VENDOR SPECIFIC]
  ca 00 80 b0 8f 12 e1 00      00:11:08.076  WRITE DMA
  ca 00 80 30 8f 12 e1 00      00:11:08.076  WRITE DMA
  ca 00 80 b0 8e 12 e1 00      00:11:08.075  WRITE DMA
  ca 00 80 30 8e 12 e1 00      00:11:08.074  WRITE DMA

Error 297 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 01 01 00 00 00

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ff d5 01 01 00 00 00 ff      00:11:08.039  [VENDOR SPECIFIC]
  ca 00 80 b0 7c 12 e1 00      00:11:08.038  WRITE DMA
  ca 00 80 30 7c 12 e1 00      00:11:08.038  WRITE DMA
  ca 00 80 b0 7b 12 e1 00      00:11:08.037  WRITE DMA
  ca 00 80 30 7b 12 e1 00      00:11:08.037  WRITE DMA

Error 296 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 01 01 00 00 00

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ff d5 01 01 00 00 00 ff      00:11:07.974  [VENDOR SPECIFIC]
  ca 00 80 b0 48 12 e1 00      00:11:07.973  WRITE DMA
  ca 00 80 30 48 12 e1 00      00:11:07.972  WRITE DMA
  ca 00 80 b0 47 12 e1 00      00:11:07.972  WRITE DMA
  ca 00 80 30 47 12 e1 00      00:11:07.972  WRITE DMA

Error 295 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 01 01 00 00 00

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ff d5 01 01 00 00 00 ff      00:11:07.927  [VENDOR SPECIFIC]
  ca 00 80 b0 2a 12 e1 00      00:11:07.926  WRITE DMA
  ca 00 80 30 2a 12 e1 00      00:11:07.925  WRITE DMA
  ca 00 80 b0 29 12 e1 00      00:11:07.925  WRITE DMA
  ca 00 80 30 29 12 e1 00      00:11:07.924  WRITE DMA

Error 294 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 01 01 00 00 00

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ff d5 01 01 00 00 00 ff      00:11:07.899  [VENDOR SPECIFIC]
  ca 00 80 b0 22 12 e1 00      00:11:07.898  WRITE DMA
  ca 00 80 30 22 12 e1 00      00:11:07.897  WRITE DMA
  ca 00 80 b0 21 12 e1 00      00:11:07.897  WRITE DMA
  ca 00 80 30 21 12 e1 00      00:11:07.896  WRITE DMA

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      6288         -
# 2  Conveyance offline  Completed without error       00%      6285         -
# 3  Short offline       Completed without error       00%      6285         -
# 4  Extended offline    Completed without error       00%      6283         -
# 5  Extended offline    Completed without error       00%      6283         -
# 6  Short offline       Completed without error       00%      6283         -
# 7  Extended offline    Completed without error       00%      6262         -
# 8  Conveyance offline  Completed without error       00%      6262         -
# 9  Conveyance offline  Completed without error       00%      6262         -
#10  Extended offline    Completed without error       00%      6262         -
#11  Short offline       Completed without error       00%      6262         -
#12  Conveyance offline  Completed without error       00%      6211         -
#13  Extended offline    Completed without error       00%      6211         -
#14  Short offline       Completed without error       00%      6211         -
#15  Short offline       Completed without error       00%      6075         -
#16  Conveyance offline  Completed without error       00%      5564         -
#17  Extended offline    Completed without error       00%      5564         -
#18  Short offline       Completed without error       00%      5564         -
#19  Conveyance offline  Completed without error       00%      5319         -
#20  Short offline       Completed without error       00%      5319         -
#21  Conveyance offline  Completed without error       00%      4403         -

SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Run Code Online (Sandbox Code Playgroud)

Rob*_*edl 6

更换您的 SSD

人们在评论中尝试了很多东西,但是这个SSD似乎有一些问题。

从 SMART 读数来看,您的驱动器没有看到很多动作(每天约 250 次电源,写入约 6 TB)并且您说它已经使用了大约 2 年。这应该在保修范围内!

我的建议是

  • 立即备份所有数据(虽然你说你已经涵盖了)
  • 移除/更换 SSD(当然取决于您的预算)
  • 将磁盘发送给制造商进行更换

您的“ Slim S70 ”磁盘应该在Silicon Power的5年保修范围内 保修单

只需在此处向他们发送RMA 请求即可。

  • @MuaadElSharif,减少损失并更换它。你已经花了很多时间了! (2认同)

Win*_*nix 3

您在 2017 年 5 月 11 日之前的某个时间更新了 SSD 固件。不过, 2017 年 9 月发布了新版本,您应该使用 Windows 应用它。


运行fstrim以丢弃文件系统中未使用的块:

$ sudo fstrim --verbose --all
/mnt/c: 16 EiB (18446744073709551615 bytes) trimmed
/mnt/e: 16 EiB (18446744073709551615 bytes) trimmed
/: 23.4 GiB (25132920832 bytes) trimmed
Run Code Online (Sandbox Code Playgroud)

就我而言,Windows 10 分区的结果非常/mnt/c出色/mnt/e。所以我检查了文件,数据没有受到任何损害。


fsck -f未安装分区时,使用 Live-USB 启动后在 SSD 上运行。fsck -f另一个选项是从 grub运行-如何在卸载硬盘时使用可引导 USB 记忆棒对硬盘进行 fsck?


正如评论中提到的,坏的 SATA 电缆可能会导致错误。但正如这个答案指出的那样,连接松动也会导致错误。要排除连接不良/松动的情况,请从 SSD 上拔下插头,将压缩空气吹过插头和驱动器上的公插针,然后牢固地重新安装电缆。


你的时间值多少钱?

最后一个问题是你的时间值多少钱。假设您花了 10 个小时解决这个问题,那么每小时的费用为 5 美元,因为许多全新的 120GB SATA III SSD 都可以从ebay.com购买


2018 年 2 月 23 日更新

我今晚读了所有其他答案。一个答案说退货。但如果你这样做了并且他们没有发现任何问题,他们只会将其退回,而你将在 2 周到 2 个月内无法开车。

另一个答案说 smartctl 报告驱动器没有任何问题。

在这个答案中,我建议运行fsck -f,您回复说没有报告错误。

fsck每次启动都运行

作为否定答案(返回它)和肯定答案(没有任何问题)之间的折衷方案,我倾向于在每次启动时运行fsck。如果发现错误,启动将暂停,您可以阅读错误消息。总结一下链接的使用:

sudo tune2fs -c 1 /dev/sdX
Run Code Online (Sandbox Code Playgroud)

注意:替换X为您的驱动器盘符,即ab

如果一个月后没有出现错误,请将值从 更改130我认为对于大多数系统而言典型的值。在典型的 SSD 上fsck运行速度很快。

清洁并重新安装 SATA 电缆

其他人提到更换 SATA 电缆,这对于笔记本电脑来说是有问题的。作为折衷方案,请考虑拔下驱动器侧的所有电缆,在公端和母端使用压缩空气,然后将电缆牢固地插回。