I had a problem with my system (a failed internal power cable). When I got the system back up and running, rebuilding arrays and so on, I appear to have ended up in a situation where the pvs command (and vgs and lvs) reports No device found for PV <UUID>, yet the logical volumes on the supposedly missing physical volume can be mounted successfully, since their DM devices exist and are mapped in /dev/mapper.
The PV device is an md-raid RAID10 array, which looks fine; it just, confusingly, does not appear in the pvs output.
I assume this is some issue of internal tables being out of sync. How do I get things mapped correctly (without a reboot, which I assume would fix it)?
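The obvious things to try first (a minimal sketch, assuming the lvm2-lvmetad daemon is in use and managed by systemd) would be along these lines:

# pvscan --cache                  # ask lvmetad to rescan all devices and rebuild its cache
# systemctl restart lvm2-lvmetad  # or restart the metadata daemon outright
# pvs                             # check whether the PV shows up again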
Update:
Rebooting did not fix the problem. I believe the issue comes down to the "missing" PV (/dev/md99) being configured as a RAID10 far-2 array built from a 750GB disk (/dev/sdk) and a RAID0 array (/dev/md90) made from a 250GB disk (/dev/sdh) and a 500GB disk (/dev/sdl). From the pvscan -vvv output it appears that the lvm2 signature is being found on /dev/sdh rather than on /dev/md99:
Asking lvmetad for VG f1bpcw-oavs-1SlJ-0Gxf-4YZI-AiMD-WGAErL (name unknown)
Setting response to OK
Setting response to OK
Setting name to b
Setting metadata/format to lvm2
Metadata cache has no info for vgname: "b"
Setting id to AzKyTe-5Ut4-dxgq-txEc-7V9v-Bkm5-mOeMBN
Setting format to lvm2
Setting device to 2160
Setting dev_size to 1464383488
Setting label_sector to 1
Opened /dev/sdh RO O_DIRECT
/dev/sdh: size is 488397168 sectors
/dev/sdh: block size is 4096 bytes
/dev/sdh: physical block size is 512 bytes
Closed /dev/sdh
/dev/sdh: size is 488397168 sectors
Opened /dev/sdh RO O_DIRECT
/dev/sdh: block size is 4096 bytes
/dev/sdh: physical block size is 512 bytes
Closed /dev/sdh
/dev/sdh: Skipping md component device
No device found for PV AzKyTe-5Ut4-dxgq-txEc-7V9v-Bkm5-mOeMBN.
Allocated VG b at 0x7fdeb00419f0.
Couldn't find device with uuid AzKyTe-5Ut4-dxgq-txEc-7V9v-Bkm5-mOeMBN.
Freeing VG b at 0x7fdeb00419f0.
The only reference to /dev/md99 (which ought to be the PV) is when it is added to the device cache.
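For context, a sketch of how a nested layout like this is created (illustrative only; the exact options originally used are not recorded here):

# mdadm --create /dev/md90 --level=raid0 --raid-devices=2 /dev/sdh /dev/sdl
# mdadm --create /dev/md99 --level=raid10 --layout=f2 --raid-devices=2 /dev/sdk /dev/md90
# pvcreate /dev/md99              # md99 is the PV; sdh is only an *indirect* component of it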
Update 2:
Stopping lvm2-lvmetad and repeating the pvscan confirms that the problem is really that the system is getting confused over which PV to use, because it finds two with the same UUID:
Using /dev/sdh
Opened /dev/sdh RO O_DIRECT
/dev/sdh: block size is 4096 bytes
/dev/sdh: physical block size is 512 bytes
/dev/sdh: lvm2 label detected at sector 1
Found duplicate PV AzKyTe5Ut4dxgqtxEc7V9vBkm5mOeMBN: using /dev/sdh not /dev/md99
/dev/sdh: PV header extension version 1 found
Incorrect metadata area header checksum on /dev/sdh at offset 4096
Closed /dev/sdh
Opened /dev/sdh RO O_DIRECT
/dev/sdh: block size is 4096 bytes
/dev/sdh: physical block size is 512 bytes
Incorrect metadata area header checksum on /dev/sdh at offset 4096
Closed /dev/sdh
Opened /dev/sdh RO O_DIRECT
/dev/sdh: block size is 4096 bytes
/dev/sdh: physical block size is 512 bytes
Closed /dev/sdh
Incorrect metadata area header checksum on /dev/sdh at offset 4096
Telling lvmetad to store PV /dev/sdh (AzKyTe-5Ut4-dxgq-txEc-7V9v-Bkm5-mOeMBN)
Setting response to OK
Since this configuration is only meant to be temporary, I think I had better just rearrange my disk usage.
Unless someone can tell me how to explicitly override the order in which pvscan examines devices?
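The closest mechanism I know of is not an ordering override but a device filter in /etc/lvm/lvm.conf (untested here); with lvmetad running it is global_filter, not filter, that the daemon honours:

devices {
    # Reject the component disk so that only /dev/md99 is scanned for lvm2 labels.
    global_filter = [ "r|^/dev/sdh$|", "a|.*|" ]
}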
The problem seems to be that pvscan gets confused by seeing the same UUID on both a component device of a RAID array and on the RAID array itself. Normally, I assume, this is avoided by recognising that the device is a direct component of the array. In my case I had created a situation in which the device was not directly a component of the RAID device that should be the PV.
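A quick way to see what md knows about each layer (a sketch; given the layout above, the superblocks sit one level apart):

# mdadm --examine /dev/sdh        # shows the superblock marking sdh as a member of md90
# mdadm --examine /dev/md90       # shows the superblock marking md90 as a member of md99
# cat /proc/mdstat                # lists the assembled arrays and their components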
My solution was to back up the LVs, force the array to degrade, and then reconfigure the disks so as to avoid multi-level RAID. Note that after another reboot the device letters had changed: 500GB = /dev/sdi, 250GB = /dev/sdj, 750GB = /dev/sdk.
# mdadm /dev/md99 --fail --force /dev/md90
# mdadm /dev/md99 --remove failed
# mdadm --stop /dev/md90
# wipefs -a /dev/sdi /dev/sdj # wipe components
# systemctl stop lvm2-lvmetad
# pvscan -vvv
# pvs
..... /dev/md99 is now correctly reported as the PV for VG b
# fdisk /dev/sdi
...... Create 2 partitions of equal size, i.e. 250Gb
# fdisk /dev/sdj
...... Create a single 250Gb partition
# mdadm /dev/md91 --create -lraid5 -n3 /dev/sdi1 /dev/sdj1 missing
# mdadm /dev/md92 --create -lraid1 -n2 /dev/sdi2 missing
# pvcreate /dev/md91 /dev/md92
# vgextend b /dev/md91 /dev/md92
# pvmove /dev/md99
# vgreduce b /dev/md99
# pvremove /dev/md99
# mdadm --stop /dev/md99
# wipefs -a /dev/sdk
# fdisk /dev/sdk
..... Create 3 250Gb partitions
# mdadm /dev/md91 --add /dev/sdk1
# mdadm /dev/md92 --add /dev/sdk2
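To confirm the migration, something along these lines (a sketch) should show VG b backed only by the new arrays:

# pvs -o pv_name,vg_name,pv_size  # should list /dev/md91 and /dev/md92, with no duplicate-UUID warnings
# lvs b                           # the LVs should be intact after the pvmove
# cat /proc/mdstat                # md91/md92 rebuilding onto the newly added sdk partitions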
The moral of the story:
Do not introduce too many levels of indirection into your file storage!