Is this a serious RAID error?

San*_*dra 9 linux raid hardware-raid

If I run the following

/opt/MegaRAID/MegaCli/MegaCli -LDInfo -Lall -aAll -NoLog  > /tmp/tmp
/opt/MegaRAID/MegaCli/MegaCli -LDPDInfo     -aAll -NoLog >> /tmp/tmp

then I see these errors

Media Error Count: 11
Other Error Count: 5

What do they mean? Are they critical?

Full output:

Adapter 0 -- Virtual Drive Information:
Virtual Disk: 0 (target id: 0)
Name:Virtual Disk 0
RAID Level: Primary-5, Secondary-0, RAID Level Qualifier-3
Size:951296MB
State: Optimal
Stripe Size: 64kB
Number Of Drives:5
Span Depth:1
Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Access Policy: Read/Write
Disk Cache Policy: Disk's Default


Adapter #0

Number of Virtual Disks: 1
Virtual Disk: 0 (target id: 0)
Name:Virtual Disk 0
RAID Level: Primary-5, Secondary-0, RAID Level Qualifier-3
Size:951296MB
State: Optimal
Stripe Size: 64kB
Number Of Drives:5
Span Depth:1
Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Access Policy: Read/Write
Disk Cache Policy: Disk's Default
Number of Spans: 1
Span: 0 - Number of PDs: 5
PD: 0 Information
Enclosure Device ID: N/A
Slot Number: 0
Device Id: 0
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
Raw Size: 238418MB [0x1d1a94a2 Sectors]
Non Coerced Size: 237906MB [0x1d0a94a2 Sectors]
Coerced Size: 237824MB [0x1d080000 Sectors]
Firmware state: Online
SAS Address(0): 0x1221000000000000
Connected Port Number: 0 
Inquiry Data: ATA     WDC WD2500JS-75N2E04     WD-WCANK9523610

PD: 1 Information
Enclosure Device ID: N/A
Slot Number: 1
Device Id: 1
Sequence Number: 2
Media Error Count: 11
Other Error Count: 5
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
Raw Size: 238418MB [0x1d1a94a2 Sectors]
Non Coerced Size: 237906MB [0x1d0a94a2 Sectors]
Coerced Size: 237824MB [0x1d080000 Sectors]
Firmware state: Online
SAS Address(0): 0x1221000001000000
Connected Port Number: 1 
Inquiry Data: ATA     WDC WD2500JS-75N2E04     WD-WCANK9507278

PD: 2 Information
Enclosure Device ID: N/A
Slot Number: 2
Device Id: 2
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
Raw Size: 238418MB [0x1d1a94a2 Sectors]
Non Coerced Size: 237906MB [0x1d0a94a2 Sectors]
Coerced Size: 237824MB [0x1d080000 Sectors]
Firmware state: Online
SAS Address(0): 0x1221000002000000
Connected Port Number: 2 
Inquiry Data: ATA     WDC WD2500JS-75N2E04     WD-WCANK9504713

PD: 3 Information
Enclosure Device ID: N/A
Slot Number: 3
Device Id: 3
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
Raw Size: 238418MB [0x1d1a94a2 Sectors]
Non Coerced Size: 237906MB [0x1d0a94a2 Sectors]
Coerced Size: 237824MB [0x1d080000 Sectors]
Firmware state: Online
SAS Address(0): 0x1221000003000000
Connected Port Number: 3 
Inquiry Data: ATA     WDC WD2500JS-75N2E04     WD-WCANK9503028

PD: 4 Information
Enclosure Device ID: N/A
Slot Number: 4
Device Id: 4
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
Raw Size: 238418MB [0x1d1a94a2 Sectors]
Non Coerced Size: 237906MB [0x1d0a94a2 Sectors]
Coerced Size: 237824MB [0x1d080000 Sectors]
Firmware state: Online
SAS Address(0): 0x1221000004000000
Connected Port Number: 4 
Inquiry Data: ATA     WDC WD2500JS-75N2E04     WD-WCANK9503793

Paw*_*cki 10

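The drive in slot 1 is in trouble. It is RAID 5, so your data is still protected, but you have effectively lost redundancy because one disk is no longer reliable. A media error means the drive has run out of spare sectors to remap bad sectors to ( http://kb.lsi.com/KnowledgebaseArticle15809.aspx http://mycusthelp.info/LSI/_cs/AnswerDetail.aspx?inc=7468 ). If it were my data, I would take extra care with the backup, then remove the drive, replace it with a new one, and resync the array. Some vendors (IBM, for example) accept an RMA based on predictive failure indicators; others do not. If your vendor does not accept a disk with bad, unremappable sectors as failed, take it out of the array and test it in a test system. It should fail within a reasonable time.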

Edit:

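The media error count is non-zero only for the disk with slot ID 1. In the log you provided, every entry carries its slot ID. It is strange that the RAID reports its state as Optimal even though there are media errors on the disk. Still, I would not trust that disk.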
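The per-slot check described above can be done with a small filter instead of by eye. This is a minimal sketch, assuming the field names and their order (`Slot Number`, then `Media Error Count`, then `Other Error Count`) match the output in the question; the function name `flag_bad_slots` is made up for illustration.

```shell
# Hypothetical helper: list slots whose Media or Other Error Count is non-zero.
# $1 = file containing MegaCli -LDPDInfo output (e.g. /tmp/tmp from the question).
flag_bad_slots() {
    awk -F': *' '
        /^Slot Number:/       { slot  = $2 }      # remember the current slot
        /^Media Error Count:/ { media = $2 + 0 }  # force numeric value
        /^Other Error Count:/ {
            other = $2 + 0
            if (media > 0 || other > 0)
                printf "slot %s: media=%d other=%d\n", slot, media, other
        }
    ' "$1"
}
```

Called as `flag_bad_slots /tmp/tmp`, it should list only slot 1 for the output posted in the question.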

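A RAID 5 built from n disks of equal size gives you the capacity of (n-1) disks, because one disk's worth of space holds the redundancy data. So if you have 6 disks of 250 GB and 1 TB of usable space, they are most likely arranged as a 5-disk RAID 5 (which gives you 4 x 250 GB of usable space) plus 1 spare.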
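The (n-1) rule can be checked against the numbers in the posted output. A minimal sketch, assuming the controller computes the virtual disk size from the coerced per-disk size; the function name `raid5_usable_mb` is made up for illustration.

```shell
# RAID 5 usable capacity = (n - 1) * per-disk size.
# $1 = number of member disks, $2 = per-disk (coerced) size in MB
raid5_usable_mb() {
    echo $(( ($1 - 1) * $2 ))
}

# 5 disks with a coerced size of 237824 MB each, as in the output above
raid5_usable_mb 5 237824   # prints 951296
```

The result matches the `Size:951296MB` line reported for Virtual Disk 0.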


pQd*_*pQd 7

Actually, smartctl can give you detailed information about each disk in a MegaRAID array. To get information about physical disk #0, run:

smartctl -a -d megaraid,0 /dev/sda|less

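As Pawel rightly pointed out, it is most probably reallocated sectors, but I have seen a few cases where communication problems [visible in smartctl -l xerror -d megaraid,5 /dev/sda] were reported as Media Error Count.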
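To look at every member disk rather than one at a time, the same smartctl call can be looped over the device IDs. A rough sketch, assuming the IDs are the `Device Id` values from the MegaCli output (0 to 4 here) and that `/dev/sda` is the block device exposed by the controller, as in the command above; the function name and the `SMARTCTL` override are made up for illustration.

```shell
# Hypothetical helper: run "smartctl -a" for each MegaRAID device ID.
# $1 = block device behind the controller, remaining args = device IDs.
# SMARTCTL may be set to another command (e.g. echo) for a dry run.
smart_all() {
    dev="$1"; shift
    for id in "$@"; do
        echo "== megaraid device $id =="
        "${SMARTCTL:-smartctl}" -a -d "megaraid,$id" "$dev"
    done
}
```

For the array in the question this would be `smart_all /dev/sda 0 1 2 3 4 | less`. Setting `SMARTCTL=echo` first prints the commands without running them.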