Software RAID6: recovering from a second disk failure

0 mdadm raid6

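A disk failed in my hosted system and I replaced the faulty drive. During the recovery, disk errors occurred on a different drive.

When the original failure occurred: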

md2 : active raid6 sdf3[5](F) sdd3[3] sdg3[6] sdc3[2] sdb3[7] sde3[4] sdd3[3] sda3[0]
      104849920 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/6] [UUUUU_U]

After the repair, I added the drive:

root@rescue ~ # mdadm /dev/md2 -a /dev/sdf3
mdadm: added /dev/sdf3
root@rescue ~ # cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4] 
md2 : active raid6 sdf3[7] sda3[0] sdg3[6] sde3[4] sdd3[3] sdc3[2]
      104849920 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/5] [U_UUU_U]
      [>....................]  recovery =  0.9% (200576/20969984) finish=5.1min speed=66858K/sec

It seems sda3 has disappeared from the array.

After the rebuild finished:

md2 : active raid6 sdf3[7](S) sda3[0] sdg3[6] sde3[4](F) sdd3[3] sdc3[2]
      104849920 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/4] [U_UU__U]

According to the error log, the rebuild probably stopped at this error:

Jul 18 13:17:02 rescue kernel: [ 3648.976435] sd 6:0:0:0: [sde] Unhandled sense code
Jul 18 13:17:02 rescue kernel: [ 3648.976441] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jul 18 13:17:02 rescue kernel: [ 3648.976445] Sense Key : Medium Error [current] [descriptor]
Jul 18 13:17:02 rescue kernel: [ 3648.976451]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00 
Jul 18 13:17:02 rescue kernel: [ 3648.976464] sd 6:0:0:0: [sde]  
Jul 18 13:17:02 rescue kernel: [ 3648.976470] sd 6:0:0:0: [sde] CDB: 
Jul 18 13:17:02 rescue kernel: [ 3649.063660] md/raid:md2: read error not correctable (sector 13785320 on sde3).
Jul 18 13:17:02 rescue kernel: [ 3649.063664] md/raid:md2: read error not correctable (sector 13785328 on sde3).
Jul 18 13:17:02 rescue kernel: [ 3649.063667] md/raid:md2: read error not correctable (sector 13785336 on sde3).
Jul 18 13:17:02 rescue kernel: [ 3649.063670] md/raid:md2: read error not correctable (sector 13785344 on sde3).
Jul 18 13:17:02 rescue kernel: [ 3649.063672] md/raid:md2: read error not correctable (sector 13785352 on sde3).
Jul 18 13:17:02 rescue kernel: [ 3649.063675] md/raid:md2: read error not correctable (sector 13785360 on sde3).
Jul 18 13:17:02 rescue kernel: [ 3649.063678] md/raid:md2: read error not correctable (sector 13785368 on sde3).
Jul 18 13:17:02 rescue kernel: [ 3649.063681] md/raid:md2: read error not correctable (sector 13785376 on sde3).
Jul 18 13:17:02 rescue kernel: [ 3649.063684] md/raid:md2: read error not correctable (sector 13785384 on sde3).
Jul 18 13:17:02 rescue kernel: [ 3649.063748] ata7: EH complete
Jul 18 13:17:02 rescue kernel: [ 3649.121786] md: md2: recovery done.
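
To see which part of sde3 the rebuild tripped over, the sector numbers can be pulled out of the kernel messages. The following is only a rough sketch: `bad_sectors` is a hypothetical helper name, and the two sample lines are copied from the log above rather than read from a live system.

```shell
#!/bin/sh
# Print the sector numbers from md "read error not correctable" lines on stdin.
# bad_sectors is a hypothetical helper, not part of mdadm or the kernel tools.
bad_sectors() {
    sed -n 's/.*read error not correctable (sector \([0-9][0-9]*\) on .*/\1/p'
}

# Two lines copied from the log above; on a live system the input would
# come from the kernel log instead (for example the output of dmesg).
log='Jul 18 13:17:02 rescue kernel: [ 3649.063664] md/raid:md2: read error not correctable (sector 13785328 on sde3).
Jul 18 13:17:02 rescue kernel: [ 3649.063684] md/raid:md2: read error not correctable (sector 13785384 on sde3).'

# Show the lowest and highest affected sector in the sample.
printf '%s\n' "$log" | bad_sectors | sort -n | sed -n '1p;$p'
```

In the full log the reported sectors run from 13785320 to 13785384 in steps of 8, so the unreadable area on sde3 is one small contiguous region rather than errors scattered across the disk.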

At this point, is there any way to recover (for example, by re-adding sda3)?

Mad*_*ter 6

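You are now three HDDs down on a software RAID6 (not two: note that the status is [U_UU__U]), and two of them appear to have failed during the RAID rebuild. It's time to discard most of that hardware and restore from backup.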
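
The count of three comes straight from the status string: every `_` in `[U_UU__U]` is a missing or failed slot, and RAID6 can only survive the loss of two members. A minimal sketch of that check, assuming the status string has been copied by hand from /proc/mdstat (`count_missing` is a hypothetical helper name, not an mdadm feature):

```shell
#!/bin/sh
# Count missing members in an md status string such as "[U_UU__U]".
# Each "U" is an in-sync member; each "_" is a missing or failed slot.
count_missing() {
    # Delete every character except "_" and count what is left.
    printf '%s' "$1" | tr -cd '_' | wc -c | tr -d ' '
}

status='[U_UU__U]'   # copied from the final /proc/mdstat output in the question
missing=$(count_missing "$status")
tolerated=2          # RAID6 keeps two parity blocks per stripe

if [ "$missing" -gt "$tolerated" ]; then
    echo "$missing members missing: beyond RAID6 redundancy"
else
    echo "$missing members missing: within RAID6 redundancy"
fi
```

Run against the earlier states from the question, `[UUUUU_U]` gives one missing member and `[U_UUU_U]` gives two, both still within what RAID6 tolerates. Only the final state crosses the limit.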