My machine crashed a few times this week. I ran a smartmontools test and got the following results:
=== START OF INFORMATION SECTION ===
Model Family: Fujitsu MJA BH
Device Model: FUJITSU MJA2250BH G2
Serial Number: K94PT972B7RS
LU WWN Device Id: 5 00000e 043bcbddd
Firmware Version: 8919
User Capacity: 250,059,350,016 bytes [250 GB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 3f
Local Time is: Mon Feb 10 09:24:22 2014 IST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 118) The previous self-test completed having
the read element of the test failed.
Total time to complete Offline
data collection: ( 783) seconds.
Offline data collection
capabilities: (0x51) SMART execute Offline immediate.
No Auto Offline data collection support.
Suspend Offline collection upon new
command.
No Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 111) minutes.
SCT capabilities: (0x003f) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 100 078 046 Pre-fail Always - 41112
2 Throughput_Performance 0x0025 253 253 030 Pre-fail Offline - 33619968
3 Spin_Up_Time 0x0023 100 100 025 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 099 099 000 Old_age Always - 4448
5 Reallocated_Sector_Ct 0x0033 253 253 024 Pre-fail Always - 0
7 Seek_Error_Rate 0x002f 100 100 047 Pre-fail Always - 2140
8 Seek_Time_Performance 0x0025 253 253 019 Pre-fail Offline - 0
9 Power_On_Hours 0x0032 089 089 000 Old_age Always - 5655
10 Spin_Retry_Count 0x0033 253 253 020 Pre-fail Always - 0
11 Calibration_Retry_Count 0x0032 253 253 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 4319
180 Unused_Rsvd_Blk_Cnt_Tot 0x002f 100 100 098 Pre-fail Always - 0
182 Erase_Fail_Count_Total 0x0032 100 100 000 Old_age Always - 0
183 Runtime_Bad_Block 0x0032 253 100 000 Old_age Always - 327680
184 End-to-End_Error 0x0033 253 253 097 Pre-fail Always - 0
185 Unknown_Attribute 0x0030 100 100 000 Old_age Offline - 2
186 Unknown_Attribute 0x0032 253 253 000 Old_age Always - 1441792
187 Reported_Uncorrect 0x0032 100 026 000 Old_age Always - 281470684365183
188 Command_Timeout 0x0032 100 099 000 Old_age Always - 1
189 High_Fly_Writes 0x003a 253 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 067 050 045 Old_age Always - 33 (Min/Max 23/33)
191 G-Sense_Error_Rate 0x0032 253 098 000 Old_age Always - 16580617
192 Power-Off_Retract_Count 0x0032 096 096 000 Old_age Always - 71566404
193 Load_Cycle_Count 0x0032 099 099 000 Old_age Always - 35363
195 Hardware_ECC_Recovered 0x003a 253 253 000 Old_age Always - 20430
196 Reallocated_Event_Count 0x0032 253 253 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 100 087 000 Old_age Always - 1
198 Offline_Uncorrectable 0x0030 253 253 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 253 253 000 Old_age Always - 0
SMART Error Log Version: 1
ATA Error Count: 517 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 517 occurred at disk power-on lifetime: 5654 hours (235 days + 14 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 00 00 00 a0
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ec 00 00 00 00 00 a0 08 00:03:39.320 IDENTIFY DEVICE
c8 00 80 80 28 97 ec 08 00:03:30.939 READ DMA
c8 00 80 20 2a 97 ec 08 00:03:27.409 READ DMA
c8 00 90 c0 5b e2 e5 08 00:03:27.394 READ DMA
ca 00 98 00 9b 98 ec 08 00:03:27.393 WRITE DMA
Error 516 occurred at disk power-on lifetime: 5654 hours (235 days + 14 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 00 00 00 a0
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ec 00 00 00 00 00 a0 08 00:03:23.216 IDENTIFY DEVICE
c8 00 40 40 28 97 ec 08 00:03:14.822 READ DMA
ef 10 02 00 00 00 a0 08 00:03:14.821 SET FEATURES [Reserved for Serial ATA]
ec 00 00 00 00 00 a0 08 00:03:14.819 IDENTIFY DEVICE
ef 03 45 00 00 00 a0 08 00:03:14.819 SET FEATURES [Set transfer mode]
Error 515 occurred at disk power-on lifetime: 5654 hours (235 days + 14 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 00 00 00 a0
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ec 00 00 00 00 00 a0 08 00:03:14.815 IDENTIFY DEVICE
c8 00 40 40 28 97 ec 08 00:03:06.445 READ DMA
c8 00 08 18 2a 97 ec 08 00:03:04.772 READ DMA
ef 10 02 00 00 00 a0 08 00:03:04.772 SET FEATURES [Reserved for Serial ATA]
ec 00 00 00 00 00 a0 08 00:03:04.770 IDENTIFY DEVICE
Error 514 occurred at disk power-on lifetime: 5654 hours (235 days + 14 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 03 1d 2a 97 ec Error: UNC 3 sectors at LBA = 0x0c972a1d = 211233309
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 08 18 2a 97 ec 08 00:03:00.416 READ DMA
ef 10 02 00 00 00 a0 08 00:03:00.415 SET FEATURES [Reserved for Serial ATA]
ec 00 00 00 00 00 a0 08 00:03:00.413 IDENTIFY DEVICE
ef 03 45 00 00 00 a0 08 00:03:00.413 SET FEATURES [Set transfer mode]
ef 10 02 00 00 00 a0 08 00:03:00.413 SET FEATURES [Reserved for Serial ATA]
Error 513 occurred at disk power-on lifetime: 5654 hours (235 days + 14 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 03 1d 2a 97 ec Error: UNC 3 sectors at LBA = 0x0c972a1d = 211233309
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 08 18 2a 97 ec 08 00:02:56.010 READ DMA
ea 00 00 00 00 00 a0 08 00:02:55.973 FLUSH CACHE EXT
35 00 08 20 44 d6 e0 08 00:02:55.973 WRITE DMA EXT
ea 00 00 00 00 00 a0 08 00:02:55.949 FLUSH CACHE EXT
35 00 38 e8 43 d6 e0 08 00:02:55.949 WRITE DMA EXT
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed: read failure 60% 5618 201724230
# 2 Short offline Completed without error 00% 5617 -
# 3 Short offline Completed without error 00% 5617 -
# 4 Extended offline Completed without error 00% 5600 -
# 5 Short offline Completed: read failure 90% 5595 239457889
# 6 Short offline Completed: read failure 90% 5595 239457889
# 7 Short captive Completed without error 00% 5305 -
# 8 Short captive Completed without error 00% 5301 -
# 9 Short captive Completed without error 00% 5301 -
#10 Short captive Completed without error 00% 5301 -
#11 Short captive Completed: read failure 90% 5301 214242167
#12 Extended offline Completed: read failure 60% 4819 176075039
#13 Short offline Completed without error 00% 4819 -
#14 Short offline Aborted by host 90% 214 -
#15 Short offline Aborted by host 90% 214 -
#16 Short offline Completed without error 00% 214 -
#17 Short offline Completed without error 00% 214 -
#18 Short offline Completed without error 00% 4 -
#19 Short offline Completed without error 00% 3 -
#20 Short offline Completed without error 00% 2 -
#21 Short offline Completed without error 00% 1 -
4 of 5 failed self-tests are outdated by newer successful extended offline self-test # 4
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
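Can someone tell me what this means? Should I replace the hard disk immediately?
Update: As landroni suggested, I ran the short and extended self-tests using gsmartcontrol. The short self-test ran without reporting any errors. The extended test aborted at 40% because of an error. Here is a paste from the self-test log: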
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.2.0-51-generic] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Model Family: Fujitsu MJA BH
Device Model: FUJITSU MJA2250BH G2
Serial Number: K94PT972B7RS
LU WWN Device Id: 5 00000e 043bcbddd
Firmware Version: 8919
User Capacity: 250,059,350,016 bytes [250 GB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 3f
Local Time is: Sun Feb 23 21:13:50 2014 IST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 118) The previous self-test completed having
the read element of the test failed.
Total time to complete Offline
data collection: ( 783) seconds.
Offline data collection
capabilities: (0x51) SMART execute Offline immediate.
No Auto Offline data collection support.
Suspend Offline collection upon new
command.
No Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 111) minutes.
SCT capabilities: (0x003f) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 100 078 046 Pre-fail Always - 124861
2 Throughput_Performance 0x0025 253 253 030 Pre-fail Offline - 33619968
3 Spin_Up_Time 0x0023 100 100 025 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 099 099 000 Old_age Always - 4489
5 Reallocated_Sector_Ct 0x0033 253 253 024 Pre-fail Always - 0
7 Seek_Error_Rate 0x002f 100 100 047 Pre-fail Always - 1157
8 Seek_Time_Performance 0x0025 253 253 019 Pre-fail Offline - 0
9 Power_On_Hours 0x0032 089 089 000 Old_age Always - 5693
10 Spin_Retry_Count 0x0033 253 253 020 Pre-fail Always - 0
11 Calibration_Retry_Count 0x0032 253 253 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 4342
180 Unused_Rsvd_Blk_Cnt_Tot 0x002f 100 100 098 Pre-fail Always - 0
182 Erase_Fail_Count_Total 0x0032 100 100 000 Old_age Always - 0
183 Runtime_Bad_Block 0x0032 253 100 000 Old_age Always - 327680
184 End-to-End_Error 0x0033 253 253 097 Pre-fail Always - 0
185 Unknown_Attribute 0x0030 100 100 000 Old_age Offline - 2
186 Unknown_Attribute 0x0032 253 253 000 Old_age Always - 1441792
187 Reported_Uncorrect 0x0032 100 026 000 Old_age Always - 281470684365183
188 Command_Timeout 0x0032 100 099 000 Old_age Always - 1
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 059 050 045 Old_age Always - 41 (Min/Max 37/42)
191 G-Sense_Error_Rate 0x0032 253 098 000 Old_age Always - 16580617
192 Power-Off_Retract_Count 0x0032 096 096 000 Old_age Always - 71566404
193 Load_Cycle_Count 0x0032 099 099 000 Old_age Always - 35590
195 Hardware_ECC_Recovered 0x003a 253 253 000 Old_age Always - 68959
196 Reallocated_Event_Count 0x0032 253 253 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 100 087 000 Old_age Always - 1
198 Offline_Uncorrectable 0x0030 253 253 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 253 253 000 Old_age Always - 0
SMART Error Log Version: 1
ATA Error Count: 519 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 519 occurred at disk power-on lifetime: 5685 hours (236 days + 21 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 03 10 00 00 00 Error:
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
00 00 01 01 00 00 00 ff 00:01:40.036 NOP [Abort queued commands]
00 00 01 01 00 00 00 ff 00:01:30.023 NOP [Abort queued commands]
00 00 01 01 00 00 00 ff 00:01:20.011 NOP [Abort queued commands]
2f 00 01 10 00 00 a0 08 00:01:15.009 READ LOG EXT
60 08 38 f0 68 47 40 08 00:01:08.725 READ FPDMA QUEUED
Error 518 occurred at disk power-on lifetime: 5685 hours (236 days + 21 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 41 03 d8 5b e2 40 Error: UNC at LBA = 0x00e25bd8 = 14834648
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 08 38 f0 68 47 40 08 00:01:08.725 READ FPDMA QUEUED
60 08 30 40 09 84 40 08 00:01:08.568 READ FPDMA QUEUED
61 08 28 70 09 9d 40 08 00:01:08.243 WRITE FPDMA QUEUED
61 a0 20 00 55 d6 40 08 00:01:07.961 WRITE FPDMA QUEUED
61 08 18 68 09 9d 40 08 00:01:07.594 WRITE FPDMA QUEUED
Error 517 occurred at disk power-on lifetime: 5654 hours (235 days + 14 hours)
When the command that caused the error occurred, the device was active or idle.
After comman
landroni 10
You can install gsmartcontrol by running sudo apt-get install gsmartcontrol
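Using gsmartcontrol:
Run the short self-test, then the extended self-test. If the extended test also passes, there is probably no reason to panic. However, if the tests detect bad blocks, you may want to make a backup with ddrescue as soon as possible and then try to work out what is wrong with your hard drive. It may be failing, or it may just have a few isolated bad sectors.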
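For anyone who prefers the command line over gsmartcontrol, here is a rough sketch of the same steps with smartctl. /dev/sdX and the backup paths are placeholders, and failed_selftests is a hypothetical helper name of mine, not part of smartmontools. It just pulls the read failures out of the self-test log so you can see the lifetime hour and the first bad LBA of each one.

```shell
#!/bin/sh
# Commands to start the tests and take a backup (placeholders, run as root):
#   sudo smartctl -t short /dev/sdX      # short self-test
#   sudo smartctl -t long /dev/sdX       # extended self-test
#   sudo smartctl -l selftest /dev/sdX | failed_selftests
#   sudo ddrescue -n /dev/sdX /mnt/backup/disk.img /mnt/backup/disk.map

# Read `smartctl -l selftest` output on stdin and print
# "<lifetime hours> <LBA of first error>" for every self-test
# that ended with a read failure.
failed_selftests() {
    awk '/^# *[0-9]+/ && /read failure/ { print $(NF-1), $NF }'
}
```

Fed the self-test log from the question, this would list five entries, the newest being hour 5618 at LBA 201724230.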
See also:
Update:
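Given that there seem to be only a few bad sectors, you can try telling the filesystem which bad sectors to avoid by using fsck.ext3 -c. But please read man fsck.ext3 before using it (assuming ext3 is your filesystem).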
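A side note on units: SMART reports errors as disk LBAs (512-byte sectors on this drive), while the filesystem keeps its bad-block list in filesystem blocks. Below is a small sketch of the conversion. The partition start sector (2048) and block size (4096) are made-up example values, and lba_to_fs_block is just an illustrative name. You would take the real numbers from fdisk -l and tune2fs -l, and the result only makes sense if the LBA actually lies inside that partition.

```shell
#!/bin/sh
# Convert a disk LBA (512-byte sectors) to a filesystem block number.
# Arguments: LBA, partition start sector, filesystem block size in bytes.
lba_to_fs_block() {
    lba=$1 start=$2 bsize=$3
    echo $(( (lba - start) * 512 / bsize ))
}

# LBA of the most recent failed extended test in the question,
# with an ASSUMED partition start of 2048 and 4096-byte blocks.
lba_to_fs_block 201724230 2048 4096   # prints 25215272
```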
See:
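I recently had a similar problem, with SMART reporting 9 bad blocks. I booted from live media and repaired the ext4 filesystem with e2fsck -c /dev/SDx, where SDx is the problem drive (sda in my case). This produced several short reads, which I ignored and forced a rewrite for, and it found and fixed 5 inodes with multiply-claimed blocks.
If the drive contains critical data, you should of course back it up with a proper strategy before doing anything else. If it does not, as in my case, read on. dmesg reported almost twice as many bad sectors as SMART had found, so I ran e2fsck -cc /dev/SDx, where SDx is the problem drive, to perform a non-destructive read/write test. This is obviously a time-consuming process. However, my goal was only to squeeze a little more life out of what was, for all intents and purposes, a scratch drive used for experiments with no critical data while I waited for the replacement drive to be delivered, so I felt it might be worth it. After an hour, with the TB drive 15% complete, I was not so sure, but since the replacement was still 3 days away I stuck with it. In the end, all the bad sectors were added to the bad-block inode list, which prevents them from being allocated to a file or directory.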
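Roughly, the two scans described above could be wrapped like this. It is only a sketch: run and badblock_scan are hypothetical helper names, /dev/sdX1 is a placeholder, and the filesystem has to be unmounted (for example when booted from live media). By default it only prints what it would do, because these commands are slow and should not be fired at the wrong device. dumpe2fs -b afterwards lists the blocks the filesystem has recorded as bad.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Usage: badblock_scan DEVICE [ro|rw]
#   ro = e2fsck -c  (read-only bad block scan)
#   rw = e2fsck -cc (non-destructive read-write scan, much slower)
badblock_scan() {
    dev=$1 mode=${2:-ro}
    case $mode in
        ro) run e2fsck -c "$dev" ;;
        rw) run e2fsck -cc "$dev" ;;
        *)  echo "mode must be ro or rw" >&2; return 1 ;;
    esac
    run dumpe2fs -b "$dev"   # list blocks now marked as bad
}
```

Calling badblock_scan /dev/sdX1 rw with the default settings just prints the e2fsck -cc and dumpe2fs -b commands so you can check them first.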