I'm quite new to the Linux scene, and I barely have enough experience to actually consider myself someone who should be trusted with this system :P
Anyway, long story short - I decided to go with Linux RAID 5 because I figured it would be more stable than running it under Windows.
The RAID recently failed to mount, and I'm fairly sure it ran into trouble while attempting a rebuild.
Trying to assemble the array now, mdadm
reports Device or resource busy - but as far as I can tell, nothing has it mounted or busy. Google suggests dmraid might be the culprit - but attempting to remove it shows it isn't installed.
The system is a 12-drive RAID-5, but it seems two of the drives don't have the correct superblock data.
I've included the output of most of the common commands below.
cat /proc/mdstat
erwin@erwin-ubuntu:~$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : inactive sdd1[10](S) sde1[2](S) sdf1[11](S) sdg1[6](S) sdm1[4](S) sdl1[9](S) sdk1[5](S) sdj1[7](S) sdi1[13](S) sdc1[8](S) sdb1[0](S) sda1[3](S)
11721120064 blocks
unused devices: <none>
Details
erwin@erwin-ubuntu:~$ sudo mdadm --detail /dev/md0
mdadm: md device /dev/md0 does not appear to be active.
erwin@erwin-ubuntu:~$
mdadm --examine
Note the odd part - I don't know why, but my system drive is normally sda
- now it's suddenly sdh
- and no, I haven't moved any physical wiring?
erwin@erwin-ubuntu:~$ sudo mdadm --examine /dev/sd*1
/dev/sda1:
Magic : a92b4efc
Version : 0.90.00
UUID : 7964c122:1ec1e9ff:efb010e8:fc8e0ce0 (local to host erwin-ubuntu)
Creation Time : Sun Oct 10 11:54:54 2010
Raid Level : raid5
Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
Array Size : 10744359296 (10246.62 GiB 11002.22 GB)
Raid Devices : 12
Total Devices : 12
Preferred Minor : 0
Update Time : Mon Dec 5 19:24:00 2011
State : clean
Active Devices : 10
Working Devices : 11
Failed Devices : 2
Spare Devices : 1
Checksum : 35a1bcd - correct
Events : 74295
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 3 8 97 3 active sync /dev/sdg1
0 0 8 113 0 active sync /dev/sdh1
1 1 0 0 1 faulty removed
2 2 0 0 2 faulty removed
3 3 8 97 3 active sync /dev/sdg1
4 4 8 81 4 active sync /dev/sdf1
5 5 8 49 5 active sync /dev/sdd1
6 6 8 193 6 active sync /dev/sdm1
7 7 8 33 7 active sync /dev/sdc1
8 8 8 129 8 active sync /dev/sdi1
9 9 8 65 9 active sync /dev/sde1
10 10 8 145 10 active sync /dev/sdj1
11 11 8 177 11 active sync /dev/sdl1
12 12 8 161 12 faulty /dev/sdk1
/dev/sdb1:
Magic : a92b4efc
Version : 0.90.00
UUID : 7964c122:1ec1e9ff:efb010e8:fc8e0ce0 (local to host erwin-ubuntu)
Creation Time : Sun Oct 10 11:54:54 2010
Raid Level : raid5
Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
Array Size : 10744359296 (10246.62 GiB 11002.22 GB)
Raid Devices : 12
Total Devices : 12
Preferred Minor : 0
Update Time : Mon Dec 5 19:24:00 2011
State : clean
Active Devices : 10
Working Devices : 11
Failed Devices : 2
Spare Devices : 1
Checksum : 35a1bd7 - correct
Events : 74295
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 0 8 113 0 active sync /dev/sdh1
0 0 8 113 0 active sync /dev/sdh1
1 1 0 0 1 faulty removed
2 2 0 0 2 faulty removed
3 3 8 97 3 active sync /dev/sdg1
4 4 8 81 4 active sync /dev/sdf1
5 5 8 49 5 active sync /dev/sdd1
6 6 8 193 6 active sync /dev/sdm1
7 7 8 33 7 active sync /dev/sdc1
8 8 8 129 8 active sync /dev/sdi1
9 9 8 65 9 active sync /dev/sde1
10 10 8 145 10 active sync /dev/sdj1
11 11 8 177 11 active sync /dev/sdl1
12 12 8 161 12 faulty /dev/sdk1
/dev/sdc1:
Magic : a92b4efc
Version : 0.90.00
UUID : 7964c122:1ec1e9ff:efb010e8:fc8e0ce0 (local to host erwin-ubuntu)
Creation Time : Sun Oct 10 11:54:54 2010
Raid Level : raid5
Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
Array Size : 10744359296 (10246.62 GiB 11002.22 GB)
Raid Devices : 12
Total Devices : 12
Preferred Minor : 0
Update Time : Mon Dec 5 19:24:00 2011
State : clean
Active Devices : 10
Working Devices : 11
Failed Devices : 2
Spare Devices : 1
Checksum : 35a1bf7 - correct
Events : 74295
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 8 8 129 8 active sync /dev/sdi1
0 0 8 113 0 active sync /dev/sdh1
1 1 0 0 1 faulty removed
2 2 0 0 2 faulty removed
3 3 8 97 3 active sync /dev/sdg1
4 4 8 81 4 active sync /dev/sdf1
5 5 8 49 5 active sync /dev/sdd1
6 6 8 193 6 active sync /dev/sdm1
7 7 8 33 7 active sync /dev/sdc1
8 8 8 129 8 active sync /dev/sdi1
9 9 8 65 9 active sync /dev/sde1
10 10 8 145 10 active sync /dev/sdj1
11 11 8 177 11 active sync /dev/sdl1
12 12 8 161 12 faulty /dev/sdk1
/dev/sdd1:
Magic : a92b4efc
Version : 0.90.00
UUID : 7964c122:1ec1e9ff:efb010e8:fc8e0ce0 (local to host erwin-ubuntu)
Creation Time : Sun Oct 10 11:54:54 2010
Raid Level : raid5
Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
Array Size : 10744359296 (10246.62 GiB 11002.22 GB)
Raid Devices : 12
Total Devices : 12
Preferred Minor : 0
Update Time : Mon Dec 5 19:24:00 2011
State : clean
Active Devices : 10
Working Devices : 11
Failed Devices : 2
Spare Devices : 1
Checksum : 35a1c0b - correct
Events : 74295
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 10 8 145 10 active sync /dev/sdj1
0 0 8 113 0 active sync /dev/sdh1
1 1 0 0 1 faulty removed
2 2 0 0 2 faulty removed
3 3 8 97 3 active sync /dev/sdg1
4 4 8 81 4 active sync /dev/sdf1
5 5 8 49 5 active sync /dev/sdd1
6 6 8 193 6 active sync /dev/sdm1
7 7 8 33 7 active sync /dev/sdc1
8 8 8 129 8 active sync /dev/sdi1
9 9 8 65 9 active sync /dev/sde1
10 10 8 145 10 active sync /dev/sdj1
11 11 8 177 11 active sync /dev/sdl1
12 12 8 161 12 faulty /dev/sdk1
/dev/sde1:
Magic : a92b4efc
Version : 0.90.00
UUID : 7964c122:1ec1e9ff:efb010e8:fc8e0ce0 (local to host erwin-ubuntu)
Creation Time : Sun Oct 10 11:54:54 2010
Raid Level : raid5
Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
Array Size : 10744359296 (10246.62 GiB 11002.22 GB)
Raid Devices : 12
Total Devices : 12
Preferred Minor : 0
Update Time : Mon Dec 5 08:05:07 2011
State : clean
Active Devices : 11
Working Devices : 12
Failed Devices : 1
Spare Devices : 1
Checksum : 3597cbb - correct
Events : 74284
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 2 8 161 2 active sync /dev/sdk1
0 0 8 113 0 active sync /dev/sdh1
1 1 0 0 1 faulty removed
2 2 8 161 2 active sync /dev/sdk1
3 3 8 97 3 active sync /dev/sdg1
4 4 8 81 4 active sync /dev/sdf1
5 5 8 49 5 active sync /dev/sdd1
6 6 8 193 6 active sync /dev/sdm1
7 7 8 33 7 active sync /dev/sdc1
8 8 8 129 8 active sync /dev/sdi1
9 9 8 65 9 active sync /dev/sde1
10 10 8 145 10 active sync /dev/sdj1
11 11 8 177 11 active sync /dev/sdl1
12 12 8 17 12 spare /dev/sdb1
/dev/sdf1:
Magic : a92b4efc
Version : 0.90.00
UUID : 7964c122:1ec1e9ff:efb010e8:fc8e0ce0 (local to host erwin-ubuntu)
Creation Time : Sun Oct 10 11:54:54 2010
Raid Level : raid5
Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
Array Size : 10744359296 (10246.62 GiB 11002.22 GB)
Raid Devices : 12
Total Devices : 12
Preferred Minor : 0
Update Time : Mon Dec 5 19:24:00 2011
State : clean
Active Devices : 10
Working Devices : 11
Failed Devices : 2
Spare Devices : 1
Checksum : 35a1c2d - correct
Events : 74295
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 11 8 177 11 active sync /dev/sdl1
0 0 8 113 0 active sync /dev/sdh1
1 1 0 0 1 faulty removed
2 2 0 0 2 faulty removed
3 3 8 97 3 active sync /dev/sdg1
4 4 8 81 4 active sync /dev/sdf1
5 5 8 49 5 active sync /dev/sdd1
6 6 8 193 6 active sync /dev/sdm1
7 7 8 33 7 active sync /dev/sdc1
8 8 8 129 8 active sync /dev/sdi1
9 9 8 65 9 active sync /dev/sde1
10 10 8 145 10 active sync /dev/sdj1
11 11 8 177 11 active sync /dev/sdl1
12 12 8 161 12 faulty /dev/sdk1
/dev/sdg1:
Magic : a92b4efc
Version : 0.90.00
UUID : 7964c122:1ec1e9ff:efb010e8:fc8e0ce0 (local to host erwin-ubuntu)
Creation Time : Sun Oct 10 11:54:54 2010
Raid Level : raid5
Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
Array Size : 10744359296 (10246.62 GiB 11002.22 GB)
Raid Devices : 12
Total Devices : 12
Preferred Minor : 0
Update Time : Mon Dec 5 19:24:00 2011
State : clean
Active Devices : 10
Working Devices : 11
Failed Devices : 2
Spare Devices : 1
Checksum : 35a1c33 - correct
Events : 74295
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 6 8 193 6 active sync /dev/sdm1
0 0 8 113 0 active sync /dev/sdh1
1 1 0 0 1 faulty removed
2 2 0 0 2 faulty removed
3 3 8 97 3 active sync /dev/sdg1
4 4 8 81 4 active sync /dev/sdf1
5 5 8 49 5 active sync /dev/sdd1
6 6 8 193 6 active sync /dev/sdm1
7 7 8 33 7 active sync /dev/sdc1
8 8 8 129 8 active sync /dev/sdi1
9 9 8 65 9 active sync /dev/sde1
10 10 8 145 10 active sync /dev/sdj1
11 11 8 177 11 active sync /dev/sdl1
12 12 8 161 12 faulty /dev/sdk1
mdadm: No md superblock detected on /dev/sdh1.
/dev/sdi1:
Magic : a92b4efc
Version : 0.90.00
UUID : 7964c122:1ec1e9ff:efb010e8:fc8e0ce0 (local to host erwin-ubuntu)
Creation Time : Sun Oct 10 11:54:54 2010
Raid Level : raid5
Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
Array Size : 10744359296 (10246.62 GiB 11002.22 GB)
Raid Devices : 12
Total Devices : 12
Preferred Minor : 0
Update Time : Mon Dec 5 19:24:00 2011
State : clean
Active Devices : 10
Working Devices : 11
Failed Devices : 2
Spare Devices : 1
Checksum : 35a1b8b - correct
Events : 74295
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 13 8 17 13 spare /dev/sdb1
0 0 8 113 0 active sync /dev/sdh1
1 1 0 0 1 faulty removed
2 2 0 0 2 faulty removed
3 3 8 97 3 active sync /dev/sdg1
4 4 8 81 4 active sync /dev/sdf1
5 5 8 49 5 active sync /dev/sdd1
6 6 8 193 6 active sync /dev/sdm1
7 7 8 33 7 active sync /dev/sdc1
8 8 8 129 8 active sync /dev/sdi1
9 9 8 65 9 active sync /dev/sde1
10 10 8 145 10 active sync /dev/sdj1
11 11 8 177 11 active sync /dev/sdl1
12 12 8 161 12 faulty /dev/sdk1
/dev/sdj1:
Magic : a92b4efc
Version : 0.90.00
UUID : 7964c122:1ec1e9ff:efb010e8:fc8e0ce0 (local to host erwin-ubuntu)
Creation Time : Sun Oct 10 11:54:54 2010
Raid Level : raid5
Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
Array Size : 10744359296 (10246.62 GiB 11002.22 GB)
Raid Devices : 12
Total Devices : 12
Preferred Minor : 0
Update Time : Mon Dec 5 19:24:00 2011
State : clean
Active Devices : 10
Working Devices : 11
Failed Devices : 2
Spare Devices : 1
Checksum : 35a1b95 - correct
Events : 74295
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 7 8 33 7 active sync /dev/sdc1
0 0 8 113 0 active sync /dev/sdh1
1 1 0 0 1 faulty removed
2 2 0 0 2 faulty removed
3 3 8 97 3 active sync /dev/sdg1
4 4 8 81 4 active sync /dev/sdf1
5 5 8 49 5 active sync /dev/sdd1
6 6 8 193 6 active sync /dev/sdm1
7 7 8 33 7 active sync /dev/sdc1
8 8 8 129 8 active sync /dev/sdi1
9 9 8 65 9 active sync /dev/sde1
10 10 8 145 10 active sync /dev/sdj1
11 11 8 177 11 active sync /dev/sdl1
12 12 8 161 12 faulty /dev/sdk1
/dev/sdk1:
Magic : a92b4efc
Version : 0.90.00
UUID : 7964c122:1ec1e9ff:efb010e8:fc8e0ce0 (local to host erwin-ubuntu)
Creation Time : Sun Oct 10 11:54:54 2010
Raid Level : raid5
Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
Array Size : 10744359296 (10246.62 GiB 11002.22 GB)
Raid Devices : 12
Total Devices : 12
Preferred Minor : 0
Update Time : Mon Dec 5 19:24:00 2011
State : clean
Active Devices : 10
Working Devices : 11
Failed Devices : 2
Spare Devices : 1
Checksum : 35a1ba1 - correct
Events : 74295
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 5 8 49 5 active sync /dev/sdd1
0 0 8 113 0 active sync /dev/sdh1
1 1 0 0 1 faulty removed
2 2 0 0 2 faulty removed
3 3 8 97 3 active sync /dev/sdg1
4 4 8 81 4 active sync /dev/sdf1
5 5 8 49 5 active sync /dev/sdd1
6 6 8 193 6 active sync /dev/sdm1
7 7 8 33 7 active sync /dev/sdc1
8 8 8 129 8 active sync /dev/sdi1
9 9 8 65 9 active sync /dev/sde1
10 10 8 145 10 active sync /dev/sdj1
11 11 8 177 11 active sync /dev/sdl1
12 12 8 161 12 faulty /dev/sdk1
/dev/sdl1:
Magic : a92b4efc
Version : 0.90.00
UUID : 7964c122:1ec1e9ff:efb010e8:fc8e0ce0 (local to host erwin-ubuntu)
Creation Time : Sun Oct 10 11:54:54 2010
Raid Level : raid5
Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
Array Size : 10744359296 (10246.62 GiB 11002.22 GB)
Raid Devices : 12
Total Devices : 12
Preferred Minor : 0
Update Time : Mon Dec 5 19:24:00 2011
State : clean
Active Devices : 10
Working Devices : 11
Failed Devices : 2
Spare Devices : 1
Checksum : 35a1bb9 - correct
Events : 74295
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 9 8 65 9 active sync /dev/sde1
0 0 8 113 0 active sync /dev/sdh1
1 1 0 0 1 faulty removed
2 2 0 0 2 faulty removed
3 3 8 97 3 active sync /dev/sdg1
4 4 8 81 4 active sync /dev/sdf1
5 5 8 49 5 active sync /dev/sdd1
6 6 8 193 6 active sync /dev/sdm1
7 7 8 33 7 active sync /dev/sdc1
8 8 8 129 8 active sync /dev/sdi1
9 9 8 65 9 active sync /dev/sde1
10 10 8 145 10 active sync /dev/sdj1
11 11 8 177 11 active sync /dev/sdl1
12 12 8 161 12 faulty /dev/sdk1
/dev/sdm1:
Magic : a92b4efc
Version : 0.90.00
UUID : 7964c122:1ec1e9ff:efb010e8:fc8e0ce0 (local to host erwin-ubuntu)
Creation Time : Sun Oct 10 11:54:54 2010
Raid Level : raid5
Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
Array Size : 10744359296 (10246.62 GiB 11002.22 GB)
Raid Devices : 12
Total Devices : 12
Preferred Minor : 0
Update Time : Mon Dec 5 19:24:00 2011
State : clean
Active Devices : 10
Working Devices : 11
Failed Devices : 2
Spare Devices : 1
Checksum : 35a1bbf - correct
Events : 74295
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 4 8 81 4 active sync /dev/sdf1
0 0 8 113 0 active sync /dev/sdh1
1 1 0 0 1 faulty removed
2 2 0 0 2 faulty removed
3 3 8 97 3 active sync /dev/sdg1
4 4 8 81 4 active sync /dev/sdf1
5 5 8 49 5 active sync /dev/sdd1
6 6 8 193 6 active sync /dev/sdm1
7 7 8 33
First of all, drive re-lettering does sometimes happen, depending on how your machine is set up. Drive letters haven't been expected to be stable across reboots for, well, a while now. So your drives moving around on you is no big deal.
Assuming dmraid and device-mapper aren't using your devices:
Well, mdadm --stop /dev/md0
will probably take care of your busy message; I suspect that's why it's complaining. Then you can try your assemble line again. If it still doesn't work, --stop again, then assemble with --run
(without --run, --assemble --scan won't start a degraded array). Then you can remove and re-add the failed disk and let it attempt the rebuild.
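A minimal sketch of that sequence, assuming the array is /dev/md0 and the twelve members are the /dev/sd?1 partitions from the --examine output above (sdh excluded, since that's the system drive). Which member to re-add depends on what --examine tells you - /dev/sde1 here is just the likely candidate:

sudo mdadm --stop /dev/md0                                    # clears the "busy" state
sudo mdadm --assemble /dev/md0 /dev/sd[abcdefgijklm]1         # retry the assemble
sudo mdadm --stop /dev/md0                                    # if it refuses to start...
sudo mdadm --assemble --run /dev/md0 /dev/sd[abcdefgijklm]1   # ...force-start it degraded
sudo mdadm /dev/md0 --remove /dev/sde1                        # drop the stale member, if still listed
sudo mdadm /dev/md0 --add /dev/sde1                           # re-add it; the rebuild begins
cat /proc/mdstat                                              # watch the rebuild progress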
/dev/sde is outdated (look at the Events counter). The others look fine at first glance, so I think there's a good chance this actually won't be difficult.
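A quick way to eyeball the event counters side by side (a sketch, using the same device names as above):

sudo mdadm --examine /dev/sd[abcdefgijklm]1 | grep -E '^/dev|Events'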
You should not zero any superblocks - the risk of data loss is too high. If --run doesn't work, I think you'll want to find someone locally (or who can ssh in) who knows what he/she is doing to attempt the repair.
"Not enough to start the array" is never good news to get from mdadm. It means mdadm has found 10 drives out of your 12-drive RAID5 array, and I hope you're aware that RAID5 can only survive one failure, not two.
Well, let's try to piece together what happened. First, the drive letters changed across a reboot, which is annoying for us trying to figure this out, but mdraid couldn't care less about that. Reading through your mdadm output, here's the remapping that occurred (sorted by raid disk #):
00 sdh1 -> sdb1
02 sdk1 -> sde1 [OUTDATED]
03 sdg1 -> sda1
04 sdf1 -> sdm1
05 sdd1 -> sdk1
06 sdm1 -> sdg1
07 sdc1 -> sdj1
08 sdi1 -> sdc1
09 sde1 -> sdl1
10 sdj1 -> sdd1
11 sdl1 -> sdf1
13 sdb1 -> sdi1 [SPARE]
The "Events" counter on #02 is lower than on the rest. That means it dropped out of the array at some point.
It would help if you know some of the history of this array - e.g., is "12-drive RAID5 with 1 hot spare" correct?
I'm not too sure what sequence of failures led to this, though. It seems that at some point, device #1 failed, and a rebuild onto device #12 began.
But I can't quite work out what happened next. Maybe you have logs - or an admin to ask. Here's what I can't explain:
Somehow, #12 became #13. Somehow, #2 became #12.
So the rebuild onto #12 should have finished, and #12 would then have become #1. Maybe it didn't - maybe the rebuild failed for some reason. Then perhaps #2 failed - or #2 failed, which is why the rebuild never finished, and someone tried to remove and re-add #2? That could have made it #12. Then perhaps the spare was removed and re-added, making it #13.
Which would mean, of course, that at this point you'd had a two-disk failure. OK. That makes sense.
If that's what happened, then you've suffered a two-disk failure. That means you've lost data. What you do next depends on how important that data is (also considering how good your backups are).
If the data is extremely valuable (and you don't have good backups), contact data recovery specialists. Otherwise:
If the data is valuable enough, you should use dd
to image all the disks involved (you can use a larger disk with one file per disk to save money - e.g., a 2 or 3 TB external drive). Then make a copy of the images, and do the recovery work on that copy (you can use loop devices for this).
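A sketch of that imaging step for one member, assuming /dev/sda is the member and the external drive is mounted at /mnt/external (repeat for each disk; GNU ddrescue is a common alternative that handles read errors more gracefully):

sudo dd if=/dev/sda of=/mnt/external/sda.img bs=1M conv=noerror,sync   # image the disk, padding unreadable blocks
cp /mnt/external/sda.img /mnt/external/sda-work.img                    # keep the original image pristine
sudo losetup --find --show /mnt/external/sda-work.img                  # attach the copy; prints e.g. /dev/loop0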
Get more spares. You probably have one dead disk, and you have at least a few questionable disks - smartctl
may be able to tell you more.
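For instance (a sketch; run it against each suspect drive):

sudo smartctl -H /dev/sda        # overall health verdict
sudo smartctl -a /dev/sda        # full attributes - watch Reallocated_Sector_Ct and Current_Pending_Sector
sudo smartctl -t long /dev/sda   # start a long self-test; read the result later with -a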
Add --force
to your --assemble
line. This will make mdadm use the outdated disk anyway. That means some sectors will now hold outdated data and some won't. Add one of the new disks as a spare and let the rebuild finish. Hopefully you won't hit any bad blocks (which would make the rebuild fail; I believe the only answer then is to get the disk to remap them). Next, fsck -f
the filesystem. There will likely be errors. Once they're fixed, mount it and see what shape your data is in.
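Putting those steps together, a sketch under the same device-name assumptions, with /dev/sdn1 standing in for a hypothetical fresh spare and the filesystem assumed to sit directly on /dev/md0:

sudo mdadm --assemble --force --run /dev/md0 /dev/sd[abcdefgijklm]1   # accept the outdated member
sudo mdadm /dev/md0 --add /dev/sdn1    # hypothetical new disk as spare; rebuild starts
cat /proc/mdstat                       # wait until the rebuild completes
sudo fsck -f /dev/md0                  # then check the filesystem
sudo mount -o ro /dev/md0 /mnt/raid    # mount read-only first and inspect your data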
Don't build 12-disk RAID5 arrays in the future - the probability of a double disk failure is too high. Use RAID6 or RAID10 instead. Also, make sure you routinely scrub the array for bad blocks ( echo check > /sys/block/md0/md/sync_action
).
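Debian and Ubuntu ship a checkarray cron job with the mdadm package that does this; a bare-bones equivalent would be a monthly cron entry like the following (a sketch - the schedule and file name are assumptions):

# /etc/cron.d/md-scrub: start a scrub at 03:00 on the 1st of each month
0 3 1 * * root echo check > /sys/block/md0/md/sync_action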