Finding where used disk space went and resolving disk pressure

AVa*_*arf 9 hard-drive disk-space-utilization ubuntu-18.04

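I have a server running Ubuntu 18.04 that is also a K8s worker node. Sometimes I see K8s kill pods on this machine because of disk pressure, and when I run df -h --total I can see that 85% (1.5T) of the disk mounted at / is in use: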

~$ df -h --total
Filesystem      Size  Used Avail Use% Mounted on
udev            126G     0  126G   0% /dev
tmpfs            26G  5.3M   26G   1% /run
/dev/sda2       1.8T  1.5T  276G  85% /
tmpfs           126G     0  126G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           126G     0  126G   0% /sys/fs/cgroup
/dev/loop0       90M   90M     0 100% /snap/core/7917
/dev/loop1       90M   90M     0 100% /snap/core/8039
/dev/sdb1       9.8G  203M  9.1G   3% /boot
/dev/sdb2       511M  6.1M  505M   2% /boot/efi
/dev/sdb3       1.8T  100M  1.7T   1% /home
/dev/loop2      128K  128K     0 100% /snap/austin/42
/dev/loop3      3.0M  3.0M     0 100% /snap/micro/648
tmpfs            26G     0   26G   0% /run/user/1001
total           4.0T  1.5T  2.4T  38% -

The problem is that when I go to / and run sudo du -BG -s *, I can only account for 313G of used data and nothing more:

/$ sudo du -BG -s *
1G  bin
1G  boot
0G  dev
1G  etc
1G  home
0G  initrd.img
0G  initrd.img.old
1G  lib
1G  lib64
1G  lost+found
1G  media
1G  mnt
1G  opt
du: cannot access 'proc/22512/task/22580/fdinfo/20': No such file or directory
du: cannot access 'proc/45752/task/45752/fd/4': No such file or directory
du: cannot access 'proc/45752/task/45752/fdinfo/4': No such file or directory
du: cannot access 'proc/45752/fd/3': No such file or directory
du: cannot access 'proc/45752/fdinfo/3': No such file or directory
0G  proc
1G  root
1G  run
1G  sbin
1G  snap
1G  srv
9G  swap.img
0G  sys
1G  tmp
3G  usr
313G    var
0G  vmlinuz
0G  vmlinuz.old

How can I find the rest of the data and resolve the disk pressure problem?
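One way to put a number on the discrepancy is to compare what the filesystem reports as used (df) with what du can reach by walking the directory tree. The following is a minimal sketch, not part of the original post: the function name disk_gap is hypothetical, and it assumes GNU coreutils (df --output and du -x).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original post): compare filesystem-level
# usage with the usage that du can see when walking a directory tree.
disk_gap() {
    local path="${1:-/}"
    local used seen
    # Bytes in use on the filesystem that contains $path.
    used=$(df -B1 --output=used "$path" | tail -n 1 | tr -d ' ')
    # Bytes reachable by walking $path without crossing into other
    # filesystems (-x). Errors such as vanishing /proc entries are discarded.
    seen=$(du -s -x -B1 "$path" 2>/dev/null | cut -f1)
    echo "df_used=$used du_seen=$seen gap=$((used - seen))"
}
```

Running sudo disk_gap / would print both figures and their difference. A large gap usually means the space is held by something du cannot reach from that starting point, for example files that were deleted while still open or data hidden underneath a mount point.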

Update

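My problem is different from the one in the suggested duplicate. In that case the cause was deleted files, whereas in my case it was Docker. I have posted an answer so that this question can be closed.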
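For the Docker case, the container logs sit under /var/lib/docker/containers as files ending in -json.log (the paths appear in the answer's output). A more direct way to list them by size could look like the sketch below. It is an illustration only: the function name largest_container_logs is hypothetical, and it assumes GNU find, sort and numfmt.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original post): print the N largest
# *-json.log files under a directory, biggest first.
largest_container_logs() {
    local root="${1:-/var/lib/docker/containers}"
    local count="${2:-10}"
    # %s is the file size in bytes, %p the path (GNU find -printf).
    find "$root" -type f -name '*-json.log' -printf '%s\t%p\n' 2>/dev/null \
        | sort -rn \
        | head -n "$count" \
        | numfmt --field=1 --to=iec
}
```

Called as sudo largest_container_logs, it scans the default Docker directory. Docker's json-file logging driver also accepts max-size and max-file options to rotate these logs, which is one common way to keep them from growing without bound.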

AVa*_*arf 10

I found a way, at https://unix.stackexchange.com/a/382696/380398, to make lsof list the files that are in use and sort them by size:

sudo lsof \
| grep REG \
| grep -v "stat: No such file or directory" \
| grep -v DEL \
| awk '{if ($NF=="(deleted)") {x=3;y=1} else {x=2;y=0}; {print $(NF-x) "  " $(NF-y) } }'  \
| sort -n -u  \
| numfmt  --field=1 --to=iec

When I ran it, I got:

118M  /usr/bin/kubelet
168M  /var/lib/docker/containers/ce98aeb3e061c31e81d232933fa21f055169924cd0411ec276d51ae008dbb993/ce98aeb3e061c31e81d232933fa21f055169924cd0411ec276d51ae008dbb993-json.log
185M  /var/lib/docker/containers/933c29608da9d954dc941fc741ffe0b012e6ec55a8befa95b8487f2367596577/933c29608da9d954dc941fc741ffe0b012e6ec55a8befa95b8487f2367596577-json.log
207M  /var/lib/docker/containers/2d4c2967fe22b1eb79b234e465f36ad062c8f390659c2f2f42ad31636be8a1be/2d4c2967fe22b1eb79b234e465f36ad062c8f390659c2f2f42ad31636be8a1be-json.log
272M  /var/lib/docker/containers/4b8daa87cda051a3b2bfd1b89c70763dca990b65b0eb211260f0e6d92b972da9/4b8daa87cda051a3b2bfd1b89c70763dca990b65b0eb211260f0e6d92b972da9-json.log
343M  /var/lib/docker/containers/52cb2d7fceb6bef7a01f7e5c666cb05e0eb62537d54a9b8da8865eba9e51c728/52cb2d7fceb6bef7a01f7e5c666cb05e0eb62537d54a9b8da8865eba9e51c728-json.log
1.1G  /var/lib/docker/containers/fe2c73fd47b37a7a5e70bd1f07508bec7dad024c75b859d933b6fa5bba649f18/fe2c73fd47b37a7a5e70bd1f07508bec7dad024c75b859d933b6fa5bba649f18-json.log
1.1G  /var/lib/docker/containers/8887ea0b31603e0a5b21c934ce06bb4a35133df2367eccb5ad9e2a07eb884bd3/8887ea0b31603e0a5b21c934ce06bb4a35133df2367eccb5ad9e2a07eb884bd3-json.log
42G  /var/lib/docker/containers/1f7180db9e41b66f3646bdf021644b23c1a954830191807532af813f5aa5cde6/1f7180db9e41b66f3646bdf021644b23c1a954830191807532af813f5aa5cde6-json.log
83G  /var/lib/docker/containers/a456e37303998844207c79fc3cdb63878765d7a3151c35051cb071545c75cec7/a456e37303998844207c79fc3cdb63878765d7a3151c35051cb071545c75cec7-json.log
220G  /var/lib/docker/containers/60aad026e90035790ff5f6f1ad714e6187bec5dfeb5b1d3156b7cda1d00cc251/60aad026e90035790ff5f6f1ad714e6187bec5dfeb5b1d3156b7cda1d00cc251-json.log
260G  /var/lib/docker/containers/52c866da942a3228ba56265210ef4f13fbc96ebc1c0214501df189901a829414/52c866da942a3228ba56265210ef4f13fbc96ebc1c0214501df189901a829414-json.log
560G  /var/lib/docker/containers/f56a9853ef993ce3843a2d6acf5c9603a283e64fb4b81d6523342c6ad03243ad/f56a9853ef993ce3843a2d6acf5c9603a283e64fb4b81d6523342c6ad03243ad-json.log

This adds up correctly to about 1.5T (once I also include the usage I could already see before).
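As a rough check of that sum, the figures can be added up directly. The sketch below copies the container log sizes from the lsof listing and the three largest entries from the earlier du output (var, swap.img, usr). It takes the reading above that the log sizes come on top of the du figures, treats every G as GiB, and leaves out the directories that du rounded up to 1G.

```shell
#!/usr/bin/env bash
# Container log sizes in GiB, copied from the lsof listing.
logs_gib="0.168 0.185 0.207 0.272 0.343 1.1 1.1 42 83 220 260 560"
# Largest entries from the earlier du output: var, swap.img, usr.
du_gib="313 9 3"

# The unquoted expansions are intentional: each number becomes its own line.
total=$(printf '%s\n' $logs_gib $du_gib | awk '{ s += $1 } END { printf "%.0f", s }')
tib=$(awk -v g="$total" 'BEGIN { printf "%.2f", g / 1024 }')
echo "total: ${total}G, about ${tib}T"
# prints: total: 1493G, about 1.46T
```

That is close to the 1.5T that df -h reports as used on /, given that df rounds its human-readable figures.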