fav*_*adi 5 linux hard-drive io iops
I'm having a hard time finding out what is causing high iowait on my server.
Here is the output of iostat -xm 5 5:
Linux 2.6.32-358.6.1.el6.x86_64 (prod-1.localdomain) 09/28/2013 _x86_64_ (16 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
6.98 0.05 3.72 3.54 0.00 85.71
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sdb 0.08 120.88 30.27 1.72 0.96 0.48 92.20 0.34 10.67 3.79 12.13
sda 7.63 37.19 8.96 4.89 0.35 0.16 76.40 0.16 11.63 2.19 3.04
avg-cpu: %user %nice %system %iowait %steal %idle
5.41 0.00 6.20 37.65 0.00 50.74
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sdb 0.00 109.80 186.20 1.40 3.75 0.43 45.66 98.21 519.80 5.33 100.00
sda 33.20 3.40 18.00 2.00 0.37 0.02 40.32 0.07 3.41 3.17 6.34
avg-cpu: %user %nice %system %iowait %steal %idle
5.55 0.00 7.42 30.06 0.00 56.97
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sdb 0.00 0.00 196.00 0.00 3.91 0.00 40.85 100.41 506.01 5.10 100.00
sda 0.00 2.40 1.80 2.60 0.05 0.02 30.91 0.01 2.95 2.73 1.20
avg-cpu: %user %nice %system %iowait %steal %idle
5.71 0.00 7.04 31.76 0.00 55.49
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sdb 0.00 100.00 189.00 1.20 3.72 0.40 44.33 95.32 514.88 5.26 100.00
sda 33.20 4.20 19.20 5.20 0.39 0.04 35.80 0.02 1.01 0.79 1.92
avg-cpu: %user %nice %system %iowait %steal %idle
61.93 0.00 10.08 14.99 0.00 12.99
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sdb 0.00 0.40 185.40 1.40 3.76 0.01 41.31 83.22 431.16 5.28 98.62
sda 33.20 5.40 9.60 4.00 0.21 0.04 37.65 0.02 1.24 1.04 1.42
As you can see, every metric looks normal except await and %util, which are extremely high. So I suspect something may be wrong with /dev/sdb.
But smartctl does not report anything useful.
smartctl 5.43 2012-06-30 r3573 [x86_64-linux-2.6.32-358.6.1.el6.x86_64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Model Family: Western Digital RE4 Serial ATA
Device Model: WDC WD2003FYYS-02W0B1
Serial Number: WD-WMAY04093732
LU WWN Device Id: 5 0014ee 05877b196
Firmware Version: 01.01D02
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Sat Sep 28 09:05:30 2013 ICT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x84) Offline data collection activity
was suspended by an interrupting command from host.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (29160) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 283) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x303f) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
3 Spin_Up_Time 0x0027 253 253 021 Pre-fail Always - 9100
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 42
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0
9 Power_On_Hours 0x0032 090 090 000 Old_age Always - 7373
10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 40
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 31
193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 10
194 Temperature_Celsius 0x0022 123 107 000 Old_age Always - 29
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
I'm stuck here and don't know what to do next to solve the problem.
Any help would be greatly appreciated!
Update:
@Michael Hampton
My self-test log shows nothing of interest. smartctl -l selftest /dev/sdb
smartctl 5.43 2012-06-30 r3573 [x86_64-linux-2.6.32-358.6.1.el6.x86_64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 7380
@米菲
My pidstat -d 1 30 results.
Linux 2.6.32-358.6.1.el6.x86_64 (cass-23_120.localdomain) 09/28/2013 _x86_64_ (16 CPU)
05:57:43 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:57:44 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:57:45 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:57:46 PM 1555 736.00 0.00 0.00 java
05:57:46 PM 16698 0.00 4.00 0.00 java
05:57:46 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:57:47 PM 552 0.00 68.00 0.00 jbd2/sda3-8
05:57:47 PM 1555 352.00 0.00 0.00 java
05:57:47 PM 16698 0.00 12.00 0.00 java
05:57:47 PM 18074 0.00 4.00 0.00 java
05:57:47 PM 19295 1564.00 0.00 0.00 java
05:57:47 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:57:48 PM 1554 3128.00 8.00 4.00 xinetd
05:57:48 PM 1570 840.00 0.00 0.00 gmond
05:57:48 PM 2183 0.00 4.00 0.00 java
05:57:48 PM 2394 64.00 0.00 0.00 rsync
05:57:48 PM 2395 324.00 0.00 0.00 ssh
05:57:48 PM 13280 28.00 0.00 0.00 downloadm_new.s
05:57:48 PM 19295 1724.00 0.00 0.00 java
05:57:48 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:57:49 PM 19295 1744.00 0.00 0.00 java
05:57:49 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:57:50 PM 1083 0.00 8.00 0.00 flush-8:0
05:57:50 PM 1086 0.00 8.00 0.00 java
05:57:50 PM 2183 0.00 12.00 0.00 java
05:57:50 PM 13280 388.00 0.00 0.00 downloadm_new.s
05:57:50 PM 18074 0.00 4.00 0.00 java
05:57:50 PM 19295 1728.00 0.00 0.00 java
05:57:50 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:57:51 PM 2183 0.00 4.00 0.00 java
05:57:51 PM 2400 8.00 0.00 0.00 sleep
05:57:51 PM 18074 0.00 4.00 0.00 java
05:57:51 PM 19295 1680.00 0.00 0.00 java
05:57:51 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:57:52 PM 552 0.00 28.00 0.00 jbd2/sda3-8
05:57:52 PM 1112 0.00 4.00 0.00 jbd2/sda4-8
05:57:52 PM 2183 0.00 8.00 0.00 java
05:57:52 PM 16698 0.00 4.00 0.00 java
05:57:52 PM 18074 0.00 4.00 0.00 java
05:57:52 PM 19295 1672.00 0.00 0.00 java
05:57:52 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:57:53 PM 1555 376.00 20.00 0.00 java
05:57:53 PM 1570 792.00 0.00 0.00 gmond
05:57:53 PM 19295 1568.00 8.00 0.00 java
05:57:53 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:57:54 PM 3734 844.00 188.00 0.00 java
05:57:54 PM 19295 1672.00 0.00 0.00 java
05:57:54 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:57:55 PM 1083 0.00 20.00 0.00 flush-8:0
05:57:55 PM 3734 2620.00 1156.00 0.00 java
05:57:55 PM 4327 0.00 8.00 0.00 java
05:57:55 PM 9677 0.00 8.00 0.00 java
05:57:55 PM 16613 0.00 8.00 0.00 java
05:57:55 PM 19295 1272.00 8.00 0.00 java
05:57:55 PM 19426 0.00 8.00 0.00 java
05:57:55 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:57:56 PM 3734 3592.00 1200.00 0.00 java
05:57:56 PM 19295 332.00 0.00 0.00 java
05:57:56 PM 19426 0.00 4.00 0.00 java
05:57:56 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:57:57 PM 552 0.00 36.00 0.00 jbd2/sda3-8
05:57:57 PM 2405 1068.00 32.00 0.00 java
05:57:57 PM 3734 2972.00 828.00 0.00 java
05:57:57 PM 5457 0.00 8.00 0.00 java
05:57:57 PM 9677 28424.00 144.00 20.00 java
05:57:57 PM 16698 0.00 8.00 0.00 java
05:57:57 PM 18074 0.00 4.00 0.00 java
05:57:57 PM 19295 0.00 4.00 0.00 java
05:57:57 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:57:58 PM 2183 0.00 4.00 0.00 java
05:57:58 PM 2222 0.00 4.00 0.00 pidstat
05:57:58 PM 2405 500.00 0.00 0.00 java
05:57:58 PM 3734 4016.00 720.00 0.00 java
05:57:58 PM 5457 0.00 8.00 0.00 java
05:57:58 PM 16698 0.00 4.00 0.00 java
05:57:58 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:57:59 PM 1112 0.00 8.00 0.00 jbd2/sda4-8
05:57:59 PM 3734 4572.00 372.00 0.00 java
05:57:59 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:58:00 PM 1083 0.00 32.00 0.00 flush-8:0
05:58:00 PM 2405 496.00 0.00 0.00 java
05:58:00 PM 3734 5412.00 4.00 0.00 java
05:58:00 PM 5457 0.00 16.00 0.00 java
05:58:00 PM 11681 0.00 8.00 0.00 java
05:58:00 PM 14824 0.00 8.00 0.00 java
05:58:00 PM 16698 0.00 12.00 0.00 java
05:58:00 PM 17694 0.00 8.00 0.00 java
05:58:00 PM 18074 0.00 12.00 0.00 java
05:58:00 PM 18129 0.00 8.00 0.00 java
05:58:00 PM 19542 0.00 8.00 0.00 java
05:58:00 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:58:01 PM 3734 3888.00 0.00 0.00 java
05:58:01 PM 3813 8.00 12.00 0.00 java
05:58:01 PM 13280 28.00 0.00 0.00 downloadm_new.s
05:58:01 PM 18074 0.00 8.00 0.00 java
05:58:01 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:58:02 PM 552 0.00 44.00 0.00 jbd2/sda3-8
05:58:02 PM 1129 0.00 16.00 0.00 jbd2/sdb1-8
05:58:02 PM 2405 256.00 0.00 0.00 java
05:58:02 PM 3734 1200.00 1128.00 0.00 java
05:58:02 PM 16698 0.00 4.00 0.00 java
05:58:02 PM 18074 0.00 8.00 0.00 java
05:58:02 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:58:03 PM 1570 1172.00 0.00 0.00 gmond
05:58:03 PM 2183 0.00 4.00 0.00 java
05:58:03 PM 2405 256.00 0.00 0.00 java
05:58:03 PM 2442 92.00 0.00 0.00 rsync
05:58:03 PM 2443 916.00 0.00 0.00 ssh
05:58:03 PM 3734 576.00 0.00 0.00 java
05:58:03 PM 9677 0.00 4.00 0.00 java
05:58:03 PM 13280 56.00 8996.00 0.00 downloadm_new.s
05:58:03 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:58:04 PM 2183 0.00 4.00 0.00 java
05:58:04 PM 2405 256.00 0.00 0.00 java
05:58:04 PM 2443 8.00 0.00 0.00 ssh
05:58:04 PM 3734 2032.00 16.00 0.00 java
05:58:04 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:58:05 PM 1083 0.00 4.00 0.00 flush-8:0
05:58:05 PM 2405 224.00 0.00 0.00 java
05:58:05 PM 2446 160.00 0.00 0.00 sleep
05:58:05 PM 3734 5344.00 648.00 0.00 java
05:58:05 PM 3813 0.00 8.00 0.00 java
05:58:05 PM 13280 1016.00 0.00 0.00 downloadm_new.s
05:58:05 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:58:06 PM 2405 16.00 0.00 0.00 java
05:58:06 PM 3734 6196.00 344.00 0.00 java
05:58:06 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:58:07 PM 552 0.00 8.00 0.00 jbd2/sda3-8
05:58:07 PM 2405 112.00 0.00 0.00 java
05:58:07 PM 3734 3532.00 0.00 0.00 java
05:58:07 PM 16698 0.00 4.00 0.00 java
05:58:07 PM 18074 0.00 4.00 0.00 java
05:58:07 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:58:08 PM 1570 1172.00 0.00 0.00 gmond
05:58:08 PM 2183 0.00 4.00 0.00 java
05:58:08 PM 2405 352.00 0.00 0.00 java
05:58:08 PM 3734 4588.00 0.00 0.00 java
05:58:08 PM 16698 0.00 8.00 0.00 java
05:58:08 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:58:09 PM 2222 0.00 4.00 0.00 pidstat
05:58:09 PM 2405 368.00 0.00 0.00 java
05:58:09 PM 3734 1720.00 0.00 0.00 java
05:58:09 PM 16698 0.00 4.00 0.00 java
05:58:09 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:58:10 PM 1083 0.00 8.00 0.00 flush-8:0
05:58:10 PM 2405 480.00 0.00 0.00 java
05:58:10 PM 3734 40.00 16.00 0.00 java
05:58:10 PM 17768 0.00 8.00 0.00 java
05:58:10 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:58:11 PM 2405 608.00 0.00 0.00 java
05:58:11 PM 3734 264.00 0.00 0.00 java
05:58:11 PM 19426 0.00 4.00 0.00 java
05:58:11 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:58:12 PM 1129 0.00 24.00 0.00 jbd2/sdb1-8
05:58:12 PM 2405 240.00 0.00 0.00 java
05:58:12 PM 18074 0.00 8.00 0.00 java
05:58:12 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
05:58:13 PM 1570 1172.00 0.00 0.00 gmond
05:58:13 PM 2183 0.00 4.00 0.00 java
05:58:13 PM 2405 128.00 0.00 0.00 java
05:58:13 PM 18074 0.00 4.00 0.00 java
Average: PID kB_rd/s kB_wr/s kB_ccwr/s Command
Average: 552 0.00 6.13 0.00 jbd2/sda3-8
Average: 1083 0.00 2.40 0.00 flush-8:0
Average: 1086 0.00 0.27 0.00 java
Average: 1112 0.00 0.40 0.00 jbd2/sda4-8
Average: 1129 0.00 1.33 0.00 jbd2/sdb1-8
Average: 1554 104.16 0.27 0.13 xinetd
Average: 1570 171.43 0.00 0.00 gmond
Average: 2183 0.00 1.60 0.00 java
Average: 2222 0.00 0.27 0.00 pidstat
Average: 2405 178.49 1.07 0.00 java
Average: 2446 5.33 0.00 0.00 sleep
Average: 3734 1778.49 220.45 0.00 java
Average: 3813 0.27 0.67 0.00 java
Average: 4327 0.00 0.27 0.00 java
Average: 5457 0.00 1.07 0.00 java
Average: 9677 946.52 5.19 0.67 java
Average: 11681 0.00 0.27 0.00 java
Average: 13280 50.48 299.57 0.00 downloadm_new.s
Average: 14824 0.00 0.27 0.00 java
Average: 16613 0.00 0.27 0.00 java
Average: 16698 0.00 2.13 0.00 java
Average: 17694 0.00 0.27 0.00 java
Average: 17768 0.00 0.27 0.00 java
Average: 18074 0.00 2.13 0.00 java
Average: 18129 0.00 0.27 0.00 java
Average: 19295 498.04 0.67 0.00 java
Average: 19426 0.00 0.53 0.00 java
Average: 19542 0.00 0.27 0.00 java
@kworr
Here are my mount options for /dev/sdb1.
% mount | grep sdb
/dev/sdb1 on /backup type ext4 (rw,noatime,commit=100)
Update 2: expected IOPS figures for hard drives.
Device                  Type  IOPS              Interface
7,200 rpm SATA drives   HDD   ~75-100 IOPS[2]   SATA 3 Gbit/s
10,000 rpm SATA drives  HDD   ~125-150 IOPS[2]  SATA 3 Gbit/s
10,000 rpm SAS drives   HDD   ~140 IOPS[2]      SAS
15,000 rpm SAS drives   HDD   ~175-210 IOPS[2]  SAS
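As a rough cross-check of the iostat samples against these figures, the sketch below takes the second sdb sample (r/s 186.20, w/s 1.40, rMB/s 3.75, avgqu-sz 98.21, await 519.80) and works out the average read size and the wait time that Little's law would predict from the queue depth. The function and variable names are illustrative only and do not come from any tool.

```python
# Values copied from the second iostat sample for sdb.
reads_per_s = 186.20
writes_per_s = 1.40
read_mb_per_s = 3.75
avg_queue_size = 98.21      # avgqu-sz
reported_await_ms = 519.80  # await


def avg_read_size_kb(mb_per_s, ops_per_s):
    """Average size of one read request, in KB."""
    return mb_per_s * 1024.0 / ops_per_s


def littles_law_wait_ms(queue_size, ops_per_s):
    """Little's law: time in system = items in system / throughput."""
    return queue_size / ops_per_s * 1000.0


iops = reads_per_s + writes_per_s
read_kb = avg_read_size_kb(read_mb_per_s, reads_per_s)
predicted_await_ms = littles_law_wait_ms(avg_queue_size, iops)

print(f"total IOPS: {iops:.1f}")
print(f"average read size: {read_kb:.1f} KB")
print(f"await predicted from queue depth: {predicted_await_ms:.0f} ms")
print(f"await reported by iostat: {reported_await_ms:.0f} ms")
```

Assuming sdb is a 7,200 rpm drive, this works out to roughly 188 small requests per second (about 20 KB each), well above the ~75-100 IOPS listed for that class, and the wait predicted from the queue depth is within a few milliseconds of the reported await. That pattern would be consistent with a healthy disk saturated by small reads rather than with a hardware fault, although the numbers alone cannot prove it.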
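There is a lot going on here, but PID 3734 (a java process) appears to be the main culprit. You should find out what it is doing, what arguments were passed to it, what its parent PID is, and what it is meant to do.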
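One possible way to collect those details is to read them from /proc. The following is a minimal sketch, assuming a Linux host with Python 3; the helper name describe_process is hypothetical, and reading another user's process may require root.

```python
import os


def describe_process(pid):
    """Return the arguments, parent PID and working directory of a process, read from /proc."""
    info = {"pid": pid}
    with open(f"/proc/{pid}/cmdline", "rb") as f:
        raw = f.read()
    # Arguments are NUL-separated in /proc/<pid>/cmdline.
    info["args"] = [a.decode(errors="replace") for a in raw.split(b"\0") if a]
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("PPid:"):
                info["ppid"] = int(line.split()[1])
                break
    try:
        info["cwd"] = os.readlink(f"/proc/{pid}/cwd")
    except OSError:
        info["cwd"] = None  # not readable without sufficient privileges
    return info


# Demonstrated on the current process; on the affected server, pass 3734 instead.
for key, value in describe_process(os.getpid()).items():
    print(f"{key}: {value}")
```

The argument list of a java process usually names the main class or jar, which is often enough to tell which application is issuing the reads.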
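Over the 30-second period of 1-second samples, that java process read an average of 1778.49 kB/s. Two other java processes, PIDs 9677 and 19295, read an average of 946.52 kB/s and 498.04 kB/s respectively.
I cannot tell you whether what they are doing is right or wrong, but your high I/O is mostly caused by those java processes.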
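The ranking above can be reproduced by sorting the "Average:" lines of the pidstat output by kB_rd/s. The sketch below assumes the output has been saved as text in the same six-column layout as in the question; SAMPLE holds a few of those lines, and top_readers is a hypothetical helper name.

```python
SAMPLE = """\
Average: PID kB_rd/s kB_wr/s kB_ccwr/s Command
Average: 2405 178.49 1.07 0.00 java
Average: 3734 1778.49 220.45 0.00 java
Average: 9677 946.52 5.19 0.67 java
Average: 13280 50.48 299.57 0.00 downloadm_new.s
Average: 19295 498.04 0.67 0.00 java
"""


def top_readers(pidstat_text, limit=3):
    """Return (pid, kB_rd/s, command) tuples from the Average: rows, busiest reader first."""
    rows = []
    for line in pidstat_text.splitlines():
        fields = line.split()
        # Skip everything that is not an "Average:" data row (headers, samples, blanks).
        if len(fields) < 6 or fields[0] != "Average:" or not fields[1].isdigit():
            continue
        rows.append((int(fields[1]), float(fields[2]), fields[5]))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


for pid, kb_read, command in top_readers(SAMPLE):
    print(f"{pid:>6} {kb_read:>9.2f} kB/s read  {command}")
```

Note that pidstat -d reports per-process totals without saying which device the reads hit, so it is still worth confirming that these processes are actually reading from sdb.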