Posts by Rom*_*aza

Flush processes consuming too much CPU

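The server is an EC2 instance that saves files to a NAS (NFS) on behalf of HTTPD.

Processes such as flush-0:32 consume more than 90% of the CPU, and the load average is 65.50, 64.02, 66.59.

According to the graphs, the load has been growing every day; the initial load average on 4 cores was about 1.01, 2.02, 1.80. I added another, similar instance behind the load balancer, and its CPU utilization is only 6% at the moment.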
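To put those load figures in perspective, it can help to divide them by the core count. The following is only a rough sketch in Python; the function name load_per_core is made up for illustration, and the inputs are simply the numbers quoted above.

```python
def load_per_core(load_averages, cores):
    """Divide each load average (1, 5, 15 min) by the number of cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return tuple(value / cores for value in load_averages)


# Figures quoted in the post: current load and initial load, both on 4 cores.
current = load_per_core((65.50, 64.02, 66.59), 4)
initial = load_per_core((1.01, 2.02, 1.80), 4)

print("current per-core load:", ["%.2f" % v for v in current])
print("initial per-core load:", ["%.2f" % v for v in initial])
```

A per-core value well above 1.0 usually means tasks are queueing. On Linux the load average also counts tasks blocked in uninterruptible sleep (typically I/O), so a high value does not by itself say that the CPU is the bottleneck.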

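What exactly do these flush processes do?

If the client only needs to write data, maybe we should turn off NFS attribute caching?

Could it be caused by packet fragmentation?

Here are some statistics from nfsstat -s -4: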

=================================================================
Server 0:

Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
715054137   0          0          0          0       

Server nfs v4:
null         compound     
993       0% 715053143 99% 

Server nfs v4 operations:
op0-unused   op1-unused   op2-future   access       close        commit       
0         0% 0         0% 0         0% 143229323  6% 78092765  3% 36693816  1% 
create       delegpurge   delegreturn …

linux amazon-ec2 linux-kernel

Rom*_*aza

2011-12-15

9
votes

1
answer

10k
views

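Strange process on the server consuming CPU

I noticed a 15% CPU load on a server that is currently offline. It has a GlusterFS volume mounted over TCP. top shows that the process using the CPU is glusterfs. I then tried to figure out what exactly was using the volume, and got this: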

# lsof /storage/
COMMAND   PID   USER   FD   TYPE DEVICE SIZE/OFF                NODE NAME
find    16433 nobody  cwd    DIR   0,19     8192 9259265867489333824 /storage/200000/200000/200700/200704/08

Then:

# ps uax | grep find
root     16415  0.0  0.0   4400   724 ?        SN   06:34   0:00 /bin/sh /usr/bin/updatedb.findutils
root     16423  0.0  0.0   4400   336 ?        SN   06:34   0:00 /bin/sh /usr/bin/updatedb.findutils
nobody   16431  0.0  0.0  39524  1376 ?        SN   06:34   0:00 su nobody -s /bin/sh -c /usr/bin/find / -ignore_readdir_race …
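The ps output above suggests that the find process comes from an updatedb run that walks the whole tree from /, including the network mount. As a rough sketch (not something from the original post), the Python below lists network filesystems from text in /proc/mounts format, which are the mount points such a crawl would reach. The NETWORK_FS set, the function name network_mounts and the sample lines are all assumptions made for illustration.

```python
# Filesystem types treated as "network" here; extend as needed (assumption).
NETWORK_FS = {"nfs", "nfs4", "fuse.glusterfs"}


def network_mounts(mounts_text):
    """Return (mount_point, fstype) pairs for network filesystems.

    mounts_text is expected in /proc/mounts format:
    device mount_point fstype options dump pass
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] in NETWORK_FS:
            result.append((fields[1], fields[2]))
    return result


# Hypothetical sample; on a real host, read the text from /proc/mounts instead.
sample = """\
/dev/xvda1 / ext4 rw,relatime 0 0
proc /proc proc rw 0 0
server1:/vol /storage fuse.glusterfs rw,relatime 0 0
"""

for mount_point, fstype in network_mounts(sample):
    print(mount_point, fstype)
```

Any mount point this reports is a candidate for exclusion from the nightly file index, if indexing the network volume is not wanted.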

linux glusterfs

Rom*_*aza

2012-10-11

5
votes

1
answer

2476
views

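Load average is 50 while CPU utilization is 60%

We use EC2 Auto Scaling and recently decided to change the instance type from m2.2xlarge to c1.xlarge (High-Memory to High-CPU). Each instance uses about 2 GB of RAM on average, so we do not need the 34 GB that m2.2xlarge provides, and getting more CPU power from c1.xlarge at the same price seemed like a good idea.

But after switching to c1.xlarge we ran into problems:

The load average rose to 50, while CPU utilization dropped from 70% to 60%.
Scaling from 6 instances to 4 instances does not affect the CPU utilization CloudWatch metric.
Response times seem slow, and instances keep being replaced by Auto Scaling because of the ELB health checks.
Because CPU utilization dropped, Auto Scaling reduced the number of instances from 8 to 4.

Can you explain what causes this behavior, and what can I do about it?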
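A load average far above the core count while CPU utilization stays moderate often means that many tasks are waiting rather than computing: the Linux load average counts tasks in uninterruptible sleep (state D, typically I/O) along with runnable ones (state R). Below is a minimal sketch for tallying those states, assuming the STAT column has already been collected from ps. The function name load_contributors and the sample list are hypothetical, not data from these instances.

```python
from collections import Counter


def load_contributors(states):
    """Count tasks in the states that Linux includes in the load average.

    states is an iterable of ps STAT strings such as "R+", "D", "SN";
    only the first character (the main state) is considered.
    """
    counts = Counter(s[0] for s in states if s)
    return {"runnable": counts["R"], "uninterruptible": counts["D"]}


# Hypothetical STAT values, for illustration only.
sample = ["R", "D", "D", "S", "SN", "D+", "R+", "Z"]
print(load_contributors(sample))
```

If the uninterruptible count dominates, the bottleneck is more likely I/O or another blocking resource than raw CPU, which would fit a high load average alongside falling CPU utilization.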

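EC2 instance type information:

High-Memory Double Extra Large Instance

34.2 GB of memory, 13 EC2 Compute Units (4 virtual cores with 3.25 EC2 Compute Units each), 850 GB of instance storage, 64-bit platform, I/O performance: High …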

linux amazon-ec2 amazon-web-services

Rom*_*aza

2012-02-17

2
votes

1
answer

5100
views

"No space left on device" on a mounted NFS device

On the NFS server:

Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/xvdf2           103212320  85090308  12879132  87% /export18

On the client:

ip-xxxxxxxx.ap-northeast-1.compute.internal:/export18
                 103212320  85090304  12879136  87% /export18

However, if I try to create a file, I get the following message:

touch: cannot touch `/export18/test': No space left on device
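When df still shows free blocks but file creation fails with this error, one common cause is that the filesystem has run out of inodes rather than blocks. The post does not confirm that this is the case here, so treat it as an assumption to check. A small Python sketch using os.statvfs to compare the two; the helper names space_report and out_of_inodes are made up, and "/" is only a stand-in for the real mount point.

```python
import os


def space_report(path):
    """Return free-block and inode figures for the filesystem holding path."""
    st = os.statvfs(path)
    blocks_free_pct = 100.0 * st.f_bavail / st.f_blocks if st.f_blocks else 0.0
    return {
        "blocks_free_pct": blocks_free_pct,
        "inodes_total": st.f_files,
        "inodes_free": st.f_ffree,
    }


def out_of_inodes(report):
    """True when the filesystem reports inodes but none are free."""
    return report["inodes_total"] > 0 and report["inodes_free"] == 0


# "/" is a placeholder; on the affected host this would be the export path.
report = space_report("/")
print(report)
print("out of inodes:", out_of_inodes(report))
```

Running df -i on the server gives the same inode figures from the command line.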

I unmounted the volume and ran fsck on it:

fsck -t ext3 /dev/xvdf2
fsck from util-linux-ng 2.17.2
e2fsck 1.41.14 (22-Dec-2010)
/dev/xvdf2 has gone 484 days without being checked, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: …

linux network-attached-storage nfs amazon-ec2

Rom*_*aza

2013-01-09

2
votes

1
answer

5242
views