I have three 1 TB drives and three 500 GB drives. At the moment each size group sits in its own RAID 5, and both arrays are in a single LVM volume group (with striped LVs).
I've found this far too slow for my workload, which is dominated by small random writes. I've already experimented with the stripe size at both the RAID level and the LVM stripe level, and with increasing the stripe cache and read-ahead buffer sizes. I've also disabled NCQ, as is usually advised.
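For reference, that tuning was along these lines (the exact values here are illustrative, not necessarily the ones I settled on):

# Raise the RAID 5 stripe cache (in pages, per array):
echo 8192 > /sys/block/md11/md/stripe_cache_size
echo 8192 > /sys/block/md12/md/stripe_cache_size

# Increase read-ahead on the arrays (in 512-byte sectors):
blockdev --setra 4096 /dev/md11
blockdev --setra 4096 /dev/md12

# Disable NCQ on each member disk by forcing its queue depth to 1:
for d in sdc sdd sde sdf sdg sdh; do
    echo 1 > /sys/block/$d/device/queue_depth
done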
So I'm done with Linux software RAID 5. Without a dedicated controller it's of no use for my purposes.
I'm adding one more 1 TB drive and one more 500 GB drive, so I'll have four of each.
How would you configure the eight drives for the best small-random-write performance? Excluding plain RAID 0, of course, since the whole point of this setup is obviously redundancy as well. I've considered putting the four 500 GB disks into two RAID 0 pairs and adding those to a RAID 10 with the other four 1 TB drives, for a six-device RAID 10, but I'm not sure that's the best solution. What do you say?
Edit: there is no budget left for further hardware upgrades. What I'm really asking is this: given that the four 1 TB drives can quite straightforwardly become a RAID 10, what should I do with the four 500 GB drives so that they fit best alongside the 4x1TB RAID 10 without becoming a redundancy or performance problem? My other thought was to put all four 500 GB drives into their own RAID 10 and then use LVM to add that capacity to the 4x1TB RAID 10. Can you think of anything better?
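To be concrete, what I had in mind is roughly the following (a sketch only; the device names are placeholders, the chunk size is just an example, and this assumes the data has been migrated off and the arrays rebuilt):

# 4 x 1 TB as one RAID 10 (sd[abcd]1 are placeholder partitions):
mdadm --create /dev/md20 --level=10 --raid-devices=4 --chunk=256 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

# 4 x 500 GB as a second RAID 10 (again, placeholder device names):
mdadm --create /dev/md21 --level=10 --raid-devices=4 --chunk=256 /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1

# Pool both arrays in the volume group:
pvcreate /dev/md20 /dev/md21
vgextend array /dev/md20 /dev/md21

# Allocate each LV from a single array (linear, not striped across arrays),
# e.g. keep the main share entirely on the 1 TB set:
lvcreate -n data -L 1T array /dev/md20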
Another edit: the existing array is laid out as follows:
A 1 TB ext4-formatted LVM striped file share. Shared to two Macs via AFP.
A 500 GB LVM logical volume exported via iSCSI to a Mac, formatted as HFS+. Used as a Time Machine backup.
A 260 GB LVM logical volume exported via iSCSI to a Mac, formatted as HFS+. Used as a Time Machine backup.
A 200 GB ext4-formatted LVM partition, used as a disk device for a virtualised OS installation.
An LVM snapshot of the 500 GB Time Machine backup.
One thing I haven't tried yet is replacing the Time Machine LVs with files on an ext4 filesystem (so that the iSCSI exports point at files rather than block devices). I have a feeling that would fix my speed problem, but it would stop me from taking snapshots of those volumes, so I'm not sure the trade-off is worth it.
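For what it's worth, the idea would be something like this (untried; the IQN and paths are made up, and the exact config depends on which iSCSI target software is in use):

# Create a sparse 500 GB backing file on the ext4 share:
truncate -s 500G /mnt/array/data/etm.img

# Then repoint the iSCSI LUN at the file instead of the LV. With an
# IET-style target the /etc/ietd.conf entry would look roughly like:
#   Target iqn.2010-10.local.myserver:etm
#       Lun 0 Path=/mnt/array/data/etm.img,Type=fileio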
In the future I intend to move the iPhoto and iTunes libraries from the Macs onto the server, on another HFS+ iSCSI mount; testing that is how I first noticed the dismal random-write performance.
If you're curious, I used the information in the RAID Math section of http://wiki.centos.org/HowTos/Disk_Optimization to work out how to set everything up for the ext4 partition (and I'm seeing excellent performance from it as a result), but it doesn't seem to have done the iSCSI-shared HFS+ volumes any good.
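For reference, the arithmetic from that page, applied to this striped LV, works out roughly as follows (a sketch of the mkfs invocation following the chunk/block formula; note that dumpe2fs on my filesystem currently reports stride 128, so this is the formula rather than my exact command):

# LVM stripe size = 256 KiB across 2 PVs -> full stripe = 512 KiB; ext4 block = 4 KiB
# stride       = stripe size / block size = 256 KiB / 4 KiB = 64
# stripe-width = full stripe / block size = 512 KiB / 4 KiB = 128
mkfs.ext4 -b 4096 -E stride=64,stripe-width=128 /dev/array/data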
More details:
output of lvdisplay:
--- Logical volume ---
LV Name /dev/array/data
VG Name array
LV UUID 2Lgn1O-q1eA-E1dj-1Nfn-JS2q-lqRR-uEqzom
LV Write Access read/write
LV Status available
# open 1
LV Size 1.00 TiB
Current LE 262144
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 2048
Block device 251:0
--- Logical volume ---
LV Name /dev/array/etm
VG Name array
LV UUID KSwnPb-B38S-Lu2h-sRTS-MG3T-miU2-LfCBU2
LV Write Access read/write
LV snapshot status source of
/dev/array/etm-snapshot [active]
LV Status available
# open 1
LV Size 500.00 GiB
Current LE 128000
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 2048
Block device 251:1
--- Logical volume ---
LV Name /dev/array/jtm
VG Name array
LV UUID wZAK5S-CseH-FtBo-5Fuf-J3le-fVed-WzjpOo
LV Write Access read/write
LV Status available
# open 1
LV Size 260.00 GiB
Current LE 66560
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 2048
Block device 251:2
--- Logical volume ---
LV Name /dev/array/mappingvm
VG Name array
LV UUID 69k2D7-XivP-Zf4o-3SVg-QAbD-jP9W-cG8foD
LV Write Access read/write
LV Status available
# open 0
LV Size 200.00 GiB
Current LE 51200
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 2048
Block device 251:3
--- Logical volume ---
LV Name /dev/array/etm-snapshot
VG Name array
LV UUID 92x9Eo-yFTY-90ib-M0gA-icFP-5kC6-gd25zW
LV Write Access read/write
LV snapshot status active destination for /dev/array/etm
LV Status available
# open 0
LV Size 500.00 GiB
Current LE 128000
COW-table size 500.00 GiB
COW-table LE 128000
Allocated to snapshot 44.89%
Snapshot chunk size 4.00 KiB
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 2048
Block device 251:7
output of pvs --align -o pv_name,pe_start,stripe_size,stripes
PV 1st PE Stripe #Str
/dev/md0 192.00k 0 1
/dev/md0 192.00k 0 1
/dev/md0 192.00k 0 1
/dev/md0 192.00k 0 1
/dev/md0 192.00k 0 0
/dev/md11 512.00k 256.00k 2
/dev/md11 512.00k 256.00k 2
/dev/md11 512.00k 256.00k 2
/dev/md11 512.00k 0 1
/dev/md11 512.00k 0 1
/dev/md11 512.00k 0 0
/dev/md12 512.00k 256.00k 2
/dev/md12 512.00k 256.00k 2
/dev/md12 512.00k 256.00k 2
/dev/md12 512.00k 0 0
output of cat /proc/mdstat
md12 : active raid5 sdc1[1] sde1[0] sdh1[2]
976770560 blocks level 5, 256k chunk, algorithm 2 [3/3] [UUU]
md11 : active raid5 sdg1[2] sdf1[0] sdd1[1]
1953521152 blocks level 5, 256k chunk, algorithm 2 [3/3] [UUU]
output of vgdisplay:
--- Volume group ---
VG Name array
System ID
Format lvm2
Metadata Areas 2
Metadata Sequence No 8
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 5
Open LV 3
Max PV 0
Cur PV 2
Act PV 2
VG Size 2.73 TiB
PE Size 4.00 MiB
Total PE 715402
Alloc PE / Size 635904 / 2.43 TiB
Free PE / Size 79498 / 310.54 GiB
VG UUID PGE6Oz-jh96-B0Qc-zN9e-LKKX-TK6y-6olGJl
output of dumpe2fs /dev/array/data | head -n 100 (or so)
dumpe2fs 1.41.12 (17-May-2010)
Filesystem volume name: <none>
Last mounted on: /mnt/array/data
Filesystem UUID: b03e8fbb-19e5-479e-a62a-0dca0d1ba567
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: (none)
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 67108864
Block count: 268435456
Reserved block count: 13421772
Free blocks: 113399226
Free inodes: 67046222
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 960
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
RAID stride: 128
RAID stripe width: 128
Flex block group size: 16
Filesystem created: Thu Jul 29 22:51:26 2010
Last mount time: Sun Oct 31 14:26:40 2010
Last write time: Sun Oct 31 14:26:40 2010
Mount count: 1
Maximum mount count: 22
Last checked: Sun Oct 31 14:10:06 2010
Check interval: 15552000 (6 months)
Next check after: Fri Apr 29 14:10:06 2011
Lifetime writes: 677 GB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: 9e6a9db2-c179-495a-bd1a-49dfb57e4020
Journal backup: inode blocks
Journal features: journal_incompat_revoke
Journal size: 128M
Journal length: 32768
Journal sequence: 0x000059af
Journal start: 1
output of lvs array --aligned -o seg_all,lv_all
Type #Str Stripe Stripe Region Region Chunk Chunk Start Start SSize Seg Tags PE Ranges Devices LV UUID LV Attr Maj Min Rahead KMaj KMin KRahead LSize #Seg Origin OSize Snap% Copy% Move Convert LV Tags Log Modules
striped 2 256.00k 256.00k 0 0 0 0 0 0 1.00t /dev/md11:0-131071 /dev/md12:0-131071 /dev/md11(0),/dev/md12(0) 2Lgn1O-q1eA-E1dj-1Nfn-JS2q-lqRR-uEqzom data -wi-ao -1 -1 auto 251 0 1.00m 1.00t 1 0
striped 2 256.00k 256.00k 0 0 0 0 0 0 500.00g /dev/md11:131072-195071 /dev/md12:131072-195071 /dev/md11(131072),/dev/md12(131072) KSwnPb-B38S-Lu2h-sRTS-MG3T-miU2-LfCBU2 etm owi-ao -1 -1 auto 251 1 1.00m 500.00g 1 500.00g snapshot
linear 1 0 0 0 0 4.00k 4.00k 0 0 500.00g /dev/md11:279552-407551 /dev/md11(279552) 92x9Eo-yFTY-90ib-M0gA-icFP-5kC6-gd25zW etm-snapshot swi-a- -1 -1 auto 251 7 1.00m 500.00g 1 etm 500.00g 44.89 snapshot
striped 2 256.00k 256.00k 0 0 0 0 0 0 260.00g /dev/md11:195072-228351 /dev/md12:195072-228351 /dev/md11(195072),/dev/md12(195072) wZAK5S-CseH-FtBo-5Fuf-J3le-fVed-WzjpOo jtm -wi-ao -1 -1 auto 251 2 1.00m 260.00g 1 0
linear 1 0 0 0 0 0 0 0 0 200.00g /dev/md11:228352-279551 /dev/md11(228352) 69k2D7-XivP-Zf4o-3SVg-QAbD-jP9W-cG8foD mappingvm -wi-a- -1 -1 auto 251 3 1.00m 200.00g 1 0
cat /sys/block/md11/queue/logical_block_size
512
cat /sys/block/md11/queue/physical_block_size
512
cat /sys/block/md11/queue/optimal_io_size
524288
cat /sys/block/md11/queue/minimum_io_size
262144
cat /sys/block/md12/queue/minimum_io_size
262144
cat /sys/block/md12/queue/optimal_io_size
524288
cat /sys/block/md12/queue/logical_block_size
512
cat /sys/block/md12/queue/physical_block_size
512
Edit: so nobody can tell me whether there's a problem here? No specific suggestions at all? Hmm.
Sorry, but RAID 5 is always going to be bad for small writes unless the controller has plenty of cache: the parity calculation involves a lot of extra reads and writes.
Your best bet is RAID 10 on a hardware controller. For truly screaming performance, get something like an Adaptec controller and make half of the drives SSDs... that way all reads go to the SSDs, which gives you a great deal of performance, although writes obviously still have to hit both halves. I'm not sure Linux software RAID can do the same.
The rest depends entirely on your usage pattern, and you've told us basically nothing about that.
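If nothing else, measure the small random writes yourself so you have numbers to compare layouts against; something like fio would do (a rough sketch, adjust the path and sizes to suit):

# 4 KiB random-write test against a scratch file on the filesystem in question:
fio --name=randwrite --directory=/mnt/array/data \
    --rw=randwrite --bs=4k --size=1G --iodepth=32 \
    --ioengine=libaio --direct=1 --runtime=60 --time_based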