How can I use kdump/crash to investigate an OOM problem?

chu*_*utz 12 troubleshooting server-crashes memory-leak oom centos6

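Problem

The server crashed after a series of "Out of memory" messages, and I am trying to identify the culprit: which process if it is in user space, which kernel module if it is in the kernel.

Details

I am trying to work out how to use the crash utility to investigate what triggered the OOM on the server.

As part of installing a new pair of servers, I started the initialization of a 14 TB DRBD device. Around that time, while I was adjusting the DRBD syncer rate and bringing some bonded network interfaces up and down, one of the servers crashed. Within 30 seconds it produced 39 "Out of memory: Kill process ####" messages, and then it crashed: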

Kernel panic - not syncing: Out of memory and no killable processes...

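The crash triggered kdump. I now have a nice vmcore.flat file that should let me investigate the problem directly, but I am having a hard time working out where all the memory went.

The only resources I know of are Dedoimedo's website, which has good instructions, and the kernel crash book. These also happen to be the only resources suggested in the answers, so I assume crash is the only way to investigate.

If there is another way to do a post-mortem of the event, I am open to it; crash is simply the only utility I know of. All I have now is the vmcore.flat file, and what I need to know is which component used up all the memory. I suspect a kernel module, more specifically one of the bonding modules (since the crash was triggered while I was bringing interfaces down), the DRBD module (version 8.3.15, built out of tree on CentOS 6.3), or one of the 10G Ethernet modules (mlnx_en, built out of tree, which drives the interface I brought down, or the in-tree bnx2x, which drives the interface that stayed active). I just need to know whether there is a way to verify my suspicions.

So far I have only managed to extract the following information with the crash utility:

Checking how much memory is used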

$ crash /usr/lib/debug/lib/modules/2.6.32-279.5.2.el6.x86_64/vmlinux vmcore.flat
....
crash> kmem -i
              PAGES        TOTAL      PERCENTAGE
 TOTAL MEM  16482587      62.9 GB         ----
      FREE    54610     213.3 MB    0% of TOTAL MEM
      USED  16427977      62.7 GB   99% of TOTAL MEM
    SHARED     4683      18.3 MB    0% of TOTAL MEM
   BUFFERS      118       472 KB    0% of TOTAL MEM
    CACHED       82       328 KB    0% of TOTAL MEM
      SLAB    46635     182.2 MB    0% of TOTAL MEM

TOTAL SWAP        0            0         ----
 SWAP USED        0            0  100% of TOTAL SWAP
 SWAP FREE        0            0    0% of TOTAL SWAP

Clearly, it ran out of memory. All 64 GB is gone... but where did it go?
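
One way to frame the question is to subtract every category that kmem -i does itemize from the total and see what is left. The following is only a minimal Python sketch: it assumes 4 KiB pages and treats FREE, BUFFERS, CACHED and SLAB as non-overlapping, which is an approximation. The page counts are copied from the output above; the function names are made up for illustration.

```python
PAGE_SIZE = 4096  # bytes; assumes the usual 4 KiB x86_64 page size

# Page counts copied from the `kmem -i` output above.
KMEM_PAGES = {
    "total": 16482587,
    "free": 54610,
    "buffers": 118,
    "cached": 82,
    "slab": 46635,
}


def pages_to_gib(pages: int) -> float:
    """Convert a page count to GiB."""
    return pages * PAGE_SIZE / 2**30


def unaccounted_pages(kmem: dict) -> int:
    """Pages left after subtracting the categories that kmem -i itemizes."""
    itemized = kmem["free"] + kmem["buffers"] + kmem["cached"] + kmem["slab"]
    return kmem["total"] - itemized


if __name__ == "__main__":
    missing = unaccounted_pages(KMEM_PAGES)
    print(f"{missing} pages ({pages_to_gib(missing):.1f} GiB) not itemized by kmem -i")
```

On these numbers nearly all of the used memory falls outside the itemized categories, so neither page cache nor slab explains it.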

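Trying to see whether any process is leaking memory

The only command that seems relevant is ps (the crash subcommand, not the shell one). It shows nothing out of the ordinary, but it does not attribute any memory to kernel threads either.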

crash> ps
   PID    PPID  CPU       TASK        ST  %MEM     VSZ    RSS  COMM
      0      0   0  ffffffff81a8d020  RU   0.0       0      0  [swapper]
>     0      0   1  ffff88102c456040  RU   0.0       0      0  [swapper]
>     0      0   2  ffff88082c772aa0  RU   0.0       0      0  [swapper]
>     0      0   3  ffff88102c456aa0  RU   0.0       0      0  [swapper]
      0      0   4  ffff88082c7b8ae0  RU   0.0       0      0  [swapper]
>     0      0   5  ffff88102c457500  RU   0.0       0      0  [swapper]
>     0      0   6  ffff88082c7d6aa0  RU   0.0       0      0  [swapper]
>     0      0   7  ffff88102c506080  RU   0.0       0      0  [swapper]
>     0      0   8  ffff88082c016ae0  RU   0.0       0      0  [swapper]
>     0      0   9  ffff88102c506ae0  RU   0.0       0      0  [swapper]
>     0      0  10  ffff88082c05caa0  RU   0.0       0      0  [swapper]
>     0      0  11  ffff88102c507540  RU   0.0       0      0  [swapper]
>     0      0  12  ffff88082c09cae0  RU   0.0       0      0  [swapper]
.....
   4926      1   5  ffff880828a38ae0  ??   0.0       0      0  mingetty
   4928      1   1  ffff88102a4e8040  ??   0.0       0      0  mingetty
   4930      1  19  ffff880827af4080  ??   0.0       0      0  mingetty
   4932      1   2  ffff88100f122040  ??   0.0       0      0  mingetty
   4934      1  18  ffff8810296ea080  ??   0.0       0      0  mingetty
   4936   1047   4  ffff880ff342d540  IN   0.0   11184    948  udevd
   4937   1047   5  ffff88082a240080  IN   0.0   11184    948  udevd
   5060   3772   2  ffff88082881d540  ??   0.0       0      0  sshd
   5078      1   1  ffff88100f060ae0  ??   0.0       0      0  sshd
   5079      1   1  ffff88082b882ae0  ??   0.0       0      0  bash

If I filter out the kernel threads (which all show zero %MEM anyway), we can see that almost nothing was still running at the end:

crash> ps -u
   PID    PPID  CPU       TASK        ST  %MEM     VSZ    RSS  COMM
      1      0   1  ffff88082c41b500  ??   0.0   19348    348  init
   1047      1   2  ffff881029524040  IN   0.0   11188    948  udevd
   3171      1   3  ffff880826ccaaa0  IN   0.0   27636    240  auditd
   3172      1  17  ffff881029d1b500  IN   0.0   27636    240  auditd
>  3772      1   0  ffff88102b257500  RU   0.0   64072    668  sshd
   4800      1   0  ffff88100f061540  ??   0.0       0      0  dsm_om_shrsvcd
   4842      1  16  ffff88100f012ae0  ??   0.0       0      0  cmcld
   4854      1  17  ffff88082a241540  ??   0.0       0      0  cmlogd
   4855      1   3  ffff88082796cae0  ??   0.0       0      0  cmfileassistd
   4856      1  18  ffff88082809d500  ??   0.0       0      0  cmnetd
   4860      1   0  ffff88082705aae0  ??   0.0       0      0  cmresourced
   4924      1   9  ffff88102a4e8aa0  ??   0.0       0      0  mingetty
   4926      1   5  ffff880828a38ae0  ??   0.0       0      0  mingetty
   4928      1   1  ffff88102a4e8040  ??   0.0       0      0  mingetty
   4930      1  19  ffff880827af4080  ??   0.0       0      0  mingetty
   4932      1   2  ffff88100f122040  ??   0.0       0      0  mingetty
   4934      1  18  ffff8810296ea080  ??   0.0       0      0  mingetty
   4936   1047   4  ffff880ff342d540  IN   0.0   11184    948  udevd
   4937   1047   5  ffff88082a240080  IN   0.0   11184    948  udevd
   5060   3772   2  ffff88082881d540  ??   0.0       0      0  sshd
   5078      1   1  ffff88100f060ae0  ??   0.0       0      0  sshd
   5079      1   1  ffff88082b882ae0  ??   0.0       0      0  bash
   5257      1   1  ffff8808279e6aa0  ??   0.0       0      0  jnx_mlnxsnmp_da

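Update:

As Soham suggested, I am including more output. Unfortunately, I cannot draw any further conclusions from it. The best I can do is suspect a kernel memory leak, since the user-level processes were almost all dead anyway.

The (almost complete) output of log -m is here.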

crash> ps -G | tail -n +2 | cut -b2- | gawk '{mem += $8} END {print "total " mem/1048576 "GB"}'
total 0.00391006GB

Note that by this point almost all user-level processes had already been killed, so the low reported usage is expected.
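
For readers who prefer to do the same sum outside the crash prompt, here is a rough Python equivalent of the gawk one-liner. It is a sketch that assumes the ps output has been saved as text and that RSS is the eighth column, in KB, as in the listings above; `total_rss_kib` is a hypothetical helper name, and the sample rows are copied from the ps -u output.

```python
def total_rss_kib(ps_lines):
    """Sum the RSS column (8th field, in KB) of crash `ps` output lines."""
    total = 0
    for line in ps_lines:
        # Drop the leading '>' that marks tasks active on a CPU.
        fields = line.lstrip("> ").split()
        # Skip the header row and anything that is not a task row.
        if len(fields) < 9 or not fields[0].isdigit():
            continue
        total += int(fields[7])
    return total


SAMPLE = [
    "   PID    PPID  CPU       TASK        ST  %MEM     VSZ    RSS  COMM",
    "      1      0   1  ffff88082c41b500  ??   0.0   19348    348  init",
    "   1047      1   2  ffff881029524040  IN   0.0   11188    948  udevd",
    "   3171      1   3  ffff880826ccaaa0  IN   0.0   27636    240  auditd",
    ">  3772      1   0  ffff88102b257500  RU   0.0   64072    668  sshd",
]

if __name__ == "__main__":
    kib = total_rss_kib(SAMPLE)
    print(f"total {kib} KB = {kib / 1048576:.8f} GB")
```

The 0.00391006 GB reported above corresponds to roughly 4 MB of resident memory across all listed tasks.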

Out-of-memory messages:

As mentioned above, there were 39 "Out of memory:" messages. Here they are:

crash> log -m | grep Out
<3>[  223.556616] Out of memory: Kill process 3189 (portreserve) score 1 or sacrifice child
<3>[  223.787234] Out of memory: Kill process 3196 (rsyslogd) score 1 or sacrifice child
<3>[  224.237119] Out of memory: Kill process 3728 (dbus-daemon) score 1 or sacrifice child
<3>[  228.771770] Out of memory: Kill process 3758 (snmpd) score 1 or sacrifice child
<3>[  229.033466] Out of memory: Kill process 3782 (xinetd) score 1 or sacrifice child
<3>[  229.257710] Out of memory: Kill process 3782 (xinetd) score 1 or sacrifice child
<3>[  229.484321] Out of memory: Kill process 3782 (xinetd) score 1 or sacrifice child
<3>[  229.711169] Out of memory: Kill process 3782 (xinetd) score 1 or sacrifice child
<3>[  229.934955] Out of memory: Kill process 3801 (cmproxyd) score 1 or sacrifice child
<3>[  230.159542] Out of memory: Kill process 3812 (ntpd) score 1 or sacrifice child
<3>[  230.382083] Out of memory: Kill process 3953 (master) score 1 or sacrifice child
<3>[  230.606613] Out of memory: Kill process 3953 (master) score 1 or sacrifice child
<3>[  230.829515] Out of memory: Kill process 3953 (master) score 1 or sacrifice child
<3>[  230.832105] Out of memory: Kill process 3961 (crond) score 1 or sacrifice child
<3>[  236.749746] Out of memory: Kill process 3974 (atd) score 1 or sacrifice child
<3>[  236.969421] Out of memory: Kill process 4272 (dsm_sa_datamgrd) score 1 or sacrifice child
<3>[  237.192102] Out of memory: Kill process 4492 (dsm_sa_datamgrd) score 1 or sacrifice child
<3>[  237.746301] Out of memory: Kill process 4552 (dsm_sa_eventmgr) score 1 or sacrifice child
<3>[  237.968308] Out of memory: Kill process 4613 (dsm_sa_snmpd) score 1 or sacrifice child
<3>[  238.190550] Out of memory: Kill process 4614 (dsm_sa_snmpd) score 1 or sacrifice child
<3>[  238.644020] Out of memory: Kill process 4643 (dsm_om_connsvcd) score 1 or sacrifice child
<3>[  238.865658] Out of memory: Kill process 4643 (dsm_om_connsvcd) score 1 or sacrifice child
<3>[  251.285450] Out of memory: Kill process 4643 (dsm_om_connsvcd) score 1 or sacrifice child
<3>[  251.506601] Out of memory: Kill process 4800 (dsm_om_shrsvcd) score 1 or sacrifice child
<3>[  251.727570] Out of memory: Kill process 4842 (cmcld) score 1 or sacrifice child
<3>[  251.947085] Out of memory: Kill process 4842 (cmcld) score 1 or sacrifice child
<3>[  252.167096] Out of memory: Kill process 4854 (cmlogd) score 1 or sacrifice child
<3>[  252.384090] Out of memory: Kill process 4855 (cmfileassistd) score 1 or sacrifice child
<3>[  252.603324] Out of memory: Kill process 4924 (mingetty) score 1 or sacrifice child
<3>[  252.820757] Out of memory: Kill process 4926 (mingetty) score 1 or sacrifice child
<3>[  253.037558] Out of memory: Kill process 4928 (mingetty) score 1 or sacrifice child
<3>[  253.254908] Out of memory: Kill process 4930 (mingetty) score 1 or sacrifice child
<3>[  253.257391] Out of memory: Kill process 4932 (mingetty) score 1 or sacrifice child
<3>[  253.259357] Out of memory: Kill process 4934 (mingetty) score 1 or sacrifice child
<3>[  253.261353] Out of memory: Kill process 5060 (sshd) score 1 or sacrifice child
<3>[  253.263365] Out of memory: Kill process 5060 (sshd) score 1 or sacrifice child
<3>[  253.264392] Out of memory: Kill process 5079 (bash) score 1 or sacrifice child
<3>[  253.266352] Out of memory: Kill process 5257 (jnx_mlnxsnmp_da) score 1 or sacrifice child
<0>[  253.529344] Kernel panic - not syncing: Out of memory and no killable processes...
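
A log like this can be summarized mechanically. The sketch below is a hypothetical helper, not part of crash: it assumes the `log -m` output has been saved as text lines in the format shown above, and it counts the kills, measures the time between the first and last one, and tallies them per command name. The sample lines are copied from the listing.

```python
import re
from collections import Counter

# Matches lines such as:
# <3>[  223.556616] Out of memory: Kill process 3189 (portreserve) score 1 or ...
OOM_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\] Out of memory: Kill process "
    r"(?P<pid>\d+) \((?P<comm>[^)]*)\) score (?P<score>\d+)"
)


def summarize_oom(lines):
    """Return (kill count, seconds from first to last kill, kills per command)."""
    hits = [m for m in map(OOM_RE.search, lines) if m]
    if not hits:
        return 0, 0.0, Counter()
    stamps = [float(m["ts"]) for m in hits]
    return len(hits), max(stamps) - min(stamps), Counter(m["comm"] for m in hits)


SAMPLE = [
    "<3>[  223.556616] Out of memory: Kill process 3189 (portreserve) score 1 or sacrifice child",
    "<3>[  229.033466] Out of memory: Kill process 3782 (xinetd) score 1 or sacrifice child",
    "<3>[  229.257710] Out of memory: Kill process 3782 (xinetd) score 1 or sacrifice child",
    "<3>[  253.266352] Out of memory: Kill process 5257 (jnx_mlnxsnmp_da) score 1 or sacrifice child",
    "<0>[  253.529344] Kernel panic - not syncing: Out of memory and no killable processes...",
]

if __name__ == "__main__":
    count, span, per_comm = summarize_oom(SAMPLE)
    print(f"{count} kills over {span:.1f} s")
    for comm, n in per_comm.most_common():
        print(f"  {comm}: {n}")
```

Every victim in the log has score 1, which fits the picture that no user process held a meaningful share of memory.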

sys output:

crash> sys
      KERNEL: /usr/lib/debug/lib/modules/2.6.32-279.5.2.el6.x86_64/vmlinux
    DUMPFILE: pcdata03.vmcore.flat  [PARTIAL DUMP]
        CPUS: 32
        DATE: Wed Feb  6 02:11:52 2013
      UPTIME: 00:04:12
LOAD AVERAGE: 3.03, 0.95, 0.34
       TASKS: 578
    NODENAME: ....
     RELEASE: 2.6.32-279.5.2.el6.x86_64
     VERSION: #1 SMP Fri Aug 24 01:07:11 UTC 2012
     MACHINE: x86_64  (2700 Mhz)
      MEMORY: 64 GB
       PANIC: "[  253.529344] Kernel panic - not syncing: Out of memory and no killable processes..."

kmem -z

crash> kmem -z
NODE: 0  ZONE: 0  ADDR: ffff88000000a0c0  NAME: "DMA"
  SIZE: 4095  PRESENT: 3839  MIN/LOW/HIGH: 5/6/7
  VM_STAT:
          NR_FREE_PAGES: 3936
       NR_INACTIVE_ANON: 0
         NR_ACTIVE_ANON: 0
       NR_INACTIVE_FILE: 0
         NR_ACTIVE_FILE: 0
         NR_UNEVICTABLE: 0
               NR_MLOCK: 0
          NR_ANON_PAGES: 0
         NR_FILE_MAPPED: 0
          NR_FILE_PAGES: 0
          NR_FILE_DIRTY: 0
           NR_WRITEBACK: 0
    NR_SLAB_RECLAIMABLE: 0
  NR_SLAB_UNRECLAIMABLE: 0
           NR_PAGETABLE: 0
        NR_KERNEL_STACK: 0
        NR_UNSTABLE_NFS: 0
              NR_BOUNCE: 0
        NR_VMSCAN_WRITE: 0
    NR_VMSCAN_IMMEDIATE: 0
      NR_WRITEBACK_TEMP: 0
       NR_ISOLATED_ANON: 0
       NR_ISOLATED_FILE: 0
               NR_SHMEM: 0
               NUMA_HIT: 0
              NUMA_MISS: 0
           NUMA_FOREIGN: 0
    NUMA_INTERLEAVE_HIT: 0
             NUMA_LOCAL: 0
             NUMA_OTHER: 0
NR_ANON_TRANSPARENT_HUGEPAGES: 0

NODE: 0  ZONE: 1  ADDR: ffff880000012780  NAME: "DMA32"
  SIZE: 1044480  PRESENT: 756520  MIN/LOW/HIGH: 1030/1287/1545
  VM_STAT:
          NR_FREE_PAGES: 30117
       NR_INACTIVE_ANON: 0
         NR_ACTIVE_ANON: 0
       NR_INACTIVE_FILE: 1
         NR_ACTIVE_FILE: 0
         NR_UNEVICTABLE: 0
               NR_MLOCK: 0
          NR_ANON_PAGES: 0
         NR_FILE_MAPPED: 0
          NR_FILE_PAGES: 1
          NR_FILE_DIRTY: 0
           NR_WRITEBACK: 0
    NR_SLAB_RECLAIMABLE: 4
  NR_SLAB_UNRECLAIMABLE: 4150
           NR_PAGETABLE: 0
        NR_KERNEL_STACK: 0
        NR_UNSTABLE_NFS: 0
              NR_BOUNCE: 0
        NR_VMSCAN_WRITE: 0
    NR_VMSCAN_IMMEDIATE: 0
      NR_WRITEBACK_TEMP: 0
       NR_ISOLATED_ANON: 0
       NR_ISOLATED_FILE: 0
               NR_SHMEM: 0
               NUMA_HIT: 575606
              NUMA_MISS: 3
           NUMA_FOREIGN: 0
    NUMA_INTERLEAVE_HIT: 0
             NUMA_LOCAL: 575598
             NUMA_OTHER: 11
NR_ANON_TRANSPARENT_HUGEPAGES: 0

NODE: 0  ZONE: 2  ADDR: ffff88000001ae40  NAME: "Normal"
  SIZE: 7602176  PRESENT: 7498240  MIN/LOW/HIGH: 10217/12771/15325
  VM_STAT:
          NR_FREE_PAGES: 10443
       NR_INACTIVE_ANON: 134
         NR_ACTIVE_ANON: 197
       NR_INACTIVE_FILE: -47
         NR_ACTIVE_FILE: 42
         NR_UNEVICTABLE: 0
               NR_MLOCK: 0
          NR_ANON_PAGES: 219
         NR_FILE_MAPPED: 115
          NR_FILE_PAGES: 45
          NR_FILE_DIRTY: 0
           NR_WRITEBACK: 0
    NR_SLAB_RECLAIMABLE: 908
  NR_SLAB_UNRECLAIMABLE: 18771
           NR_PAGETABLE: 91
        NR_KERNEL_STACK: 556
        NR_UNSTABLE_NFS: 0
              NR_BOUNCE: 0
        NR_VMSCAN_WRITE: 0
    NR_VMSCAN_IMMEDIATE: 0
      NR_WRITEBACK_TEMP: 0
       NR_ISOLATED_ANON: 0
       NR_ISOLATED_FILE: 0
               NR_SHMEM: 34
               NUMA_HIT: 8243991
              NUMA_MISS: 648
           NUMA_FOREIGN: 4593726
    NUMA_INTERLEAVE_HIT: 20066
             NUMA_LOCAL: 8243829
             NUMA_OTHER: 810
NR_ANON_TRANSPARENT_HUGEPAGES: 0

NODE: 0  ZONE: 3  ADDR: ffff880000023500  NAME: "Movable"
  [unpopulated]

NODE: 1  ZONE: 0  ADDR: ffff880840000040  NAME: "DMA"
  [unpopulated]

NODE: 1  ZONE: 1  ADDR: ffff880840008700  NAME: "DMA32"
  [unpopulated]

NODE: 1  ZONE: 2  ADDR: ffff880840010dc0  NAME: "Normal"
  SIZE: 8388608  PRESENT: 8273920  MIN/LOW/HIGH: 11274/14092/16911
  VM_STAT:
          NR_FREE_PAGES: 10114
       NR_INACTIVE_ANON: 417
         NR_ACTIVE_ANON: 83
       NR_INACTIVE_FILE: 47
         NR_ACTIVE_FILE: 32
         NR_UNEVICTABLE: 0
               NR_MLOCK: 0
          NR_ANON_PAGES: 436
         NR_FILE_MAPPED: 22
          NR_FILE_PAGES: 154
          NR_FILE_DIRTY: 0
           NR_WRITEBACK: 0
    NR_SLAB_RECLAIMABLE: 863
  NR_SLAB_UNRECLAIMABLE: 21939
           NR_PAGETABLE: 134
        NR_KERNEL_STACK: 27
        NR_UNSTABLE_NFS: 0
              NR_BOUNCE: 0
        NR_VMSCAN_WRITE: 3
    NR_VMSCAN_IMMEDIATE: 5
      NR_WRITEBACK_TEMP: 0
       NR_ISOLATED_ANON: 0
       NR_ISOLATED_FILE: 23
               NR_SHMEM: 20
               NUMA_HIT: 4332488
              NUMA_MISS: 4593726
           NUMA_FOREIGN: 665
    NUMA_INTERLEAVE_HIT: 20007
             NUMA_LOCAL: 4309300
             NUMA_OTHER: 4616914
NR_ANON_TRANSPARENT_HUGEPAGES: 0

NODE: 1  ZONE: 3  ADDR: ffff880840019480  NAME: "Movable"
  [unpopulated]

kmem -f

crash> kmem -f
NODE
  0
ZONE  NAME        SIZE    FREE      MEM_MAP       START_PADDR  START_MAPNR
  0   DMA         4095    3936  ffffea0000000038      1000          0     
AREA    SIZE  FREE_AREA_STRUCT  BLOCKS  PAGES
  0       4k  ffff880000012128       2      2
  0       4k  ffff880000012138       0      0
  0       4k  ffff880000012148       0      0
  0       4k  ffff880000012158       0      0
  0       4k  ffff880000012168       0      0
  1       8k  ffff880000012180       1      2
  1       8k  ffff880000012190       0      0
  1       8k  ffff8800000121a0       0      0
  1       8k  ffff8800000121b0       0      0
  1       8k  ffff8800000121c0       0      0
  2      16k  ffff8800000121d8       1      4
  2      16k  ffff8800000121e8       0      0
  2      16k  ffff8800000121f8       0      0
  2      16k  ffff880000012208       0      0
  2      16k  ffff880000012218       0      0
  3      32k  ffff880000012230       1      8
  3      32k  ffff880000012240       0      0
  3      32k  ffff880000012250       0      0
  3      32k  ffff880000012260       0      0
  3      32k  ffff880000012270       0      0
  4      64k  ffff880000012288       1     16
  4      64k  ffff880000012298       0      0
  4      64k  ffff8800000122a8       0      0
  4      64k  ffff8800000122b8       0      0
  4      64k  ffff8800000122c8       0      0
  5     128k  ffff8800000122e0       0      0
  5     128k  ffff8800000122f0       0      0
  5     128k  ffff880000012300       0      0
  5     128k  ffff880000012310       0      0
  5     128k  ffff880000012320       0      0
  6     256k  ffff880000012338       1     64
  6     256k  ffff880000012348       0      0
  6     256k  ffff880000012358       0      0
  6     256k  ffff880000012368       0      0
  6     256k  ffff880000012378       0      0
  7     512k  ffff880000012390       0      0
  7     512k  ffff8800000123a0       0      0
  7     512k  ffff8800000123b0       0      0
  7     512k  ffff8800000123c0       0      0
  7     512k  ffff8800000123d0       0      0
  8    1024k  ffff8800000123e8       1    256
  8    1024k  ffff8800000123f8       0      0
  8    1024k  ffff880000012408       0      0
  8    1024k  ffff880000012418       0      0
  8    1024k  ffff880000012428       0      0
  9    2048k  ffff880000012440       0      0
  9    2048k  ffff880000012450       0      0
  9    2048k  ffff880000012460       0      0
  9    2048k  ffff880000012470       1    512
  9    2048k  ffff880000012480       0      0
 10    4096k  ffff880000012498       0      0
 10    4096k  ffff8800000124a8       0      0
 10    4096k  ffff8800000124b8       3   3072
 10    4096k  ffff8800000124c8       0      0
 10    4096k  ffff8800000124d8       0      0

ZONE  NAME        SIZE    FREE      MEM_MAP       START_PADDR  START_MAPNR
  1   DMA32     1044480   30117  ffffea0000038000    1000000        4095   
AREA    SIZE  FREE_AREA_STRUCT  BLOCKS  PAGES
  0       4k  ffff88000001a7e8      24     24
  0       4k  ffff88000001a7f8       4      4
  0       4k  ffff88000001a808      13     13
  0       4k  ffff88000001a818       0      0
  0       4k  ffff88000001a828       0      0
  1       8k  ffff88000001a840       2      4
  1       8k  ffff88000001a850       2      4
  1       8k  ffff88000001a860       4      8
  1       8k  ffff88000001a870       0      0
  1       8k  ffff88000001a880       0      0
  2      16k  ffff88000001a898       0      0
  2      16k  ffff88000001a8a8       3     12
  2      16k  ffff88000001a8b8       4     16
  2      16k  ffff88000001a8c8       0      0
  2      16k  ffff88000001a8d8       0      0
  3      32k  ffff88000001a8f0       0      0
  3      32k  ffff88000001a900       3     24
  3      32k  ffff88000001a910       3     24
  3      32k  ffff88000001a920       0      0
  3      32k  ffff88000001a930       0      0
  4      64k  ffff88000001a948       1     16
  4      64k  ffff88000001a958       3     48
  4      64k  ffff88000001a968       6     96
  4      64k  ffff88000001a978       0      0
  4      64k  ffff88000001a988       0      0
  5     128k  ffff88000001a9a0       0      0
  5     128k  ffff88000001a9b0       3     96
  5     128k  ffff88000001a9c0       7    224
  5     128k  ffff88000001a9d0       0      0
  5     128k  ffff88000001a9e0       0      0
  6     256k  ffff88000001a9f8       0      0
  6     256k  ffff88000001aa08       1     64
  6     256k  ffff88000001aa18       6    384
  6     256k  ffff88000001aa28       0      0
  6     256k  ffff88000001aa38       0      0
  7     512k  ffff88000001aa50       1    128
  7     512k  ffff88000001aa60       0      0
  7     512k  ffff88000001aa70       8   1024
  7     512k  ffff88000001aa80       0      0
  7     512k  ffff88000001aa9

小智 8

Check the 20 largest consumers of physical memory (resident set size).

crash> ps -G | sed 's/>//g' | sort -k 8,8 -n | awk '$8 ~ /[0-9]/{ $8 = $8/1024" MB"; print }' | tail -20

Check the number of huge pages.

crash> p -d nr_huge_pages

Update:

A) The crash dump was captured from the following kernel version.

$ crash --osrelease vmcore.flat 
2.6.32-279.5.2.el6.x86_64                    

B) Let's extract the vmlinux file from the kernel-debug-debuginfo package.

$ rpm2cpio kernel-debug-debuginfo-2.6.32-279.5.2.el6.x86_64.rpm | \
  cpio -idv ./usr/lib/debug/lib/modules/*/vmlinux

C) Open the vmcore file with the crash utility.

$ bunzip2 vmcore.flat.bz2 
$ crash vmcore.flat ./usr/lib/debug/lib/modules/2.6.32-279.5.2.el6.x86_64/vmlinux

D) System information.

crash> sys
      KERNEL: ./usr/lib/debug/lib/modules/2.6.32-279.5.2.el6.x86_64/vmlinux
    DUMPFILE: vmcore.flat  [PARTIAL DUMP]
        CPUS: 32
        DATE: Tue Feb  5 12:11:52 2013
      UPTIME: 00:04:12
LOAD AVERAGE: 3.03, 0.95, 0.34
       TASKS: 578
    NODENAME: ...
     RELEASE: 2.6.32-279.5.2.el6.x86_64
     VERSION: #1 SMP Fri Aug 24 01:07:11 UTC 2012
     MACHINE: x86_64  (2700 Mhz)
      MEMORY: 64 GB
       PANIC: "[  253.529344] Kernel panic - not syncing: Out of memory and no killable processes..."

a) The panic was caused by running out of memory, yet the panic_on_oom parameter was disabled on this system.

crash> p -d sysctl_panic_on_oom
sysctl_panic_on_oom = $6 = 0

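This parameter enables or disables panicking on out-of-memory. If it is set to 0, the kernel invokes the OOM killer (oom_killer), which kills a rogue process. Usually the OOM killer can kill the rogue process and the system survives. If it is set to 1, the kernel panics when an out-of-memory condition occurs.

b) So how did we end up capturing a vmcore on an OOM event?

Well, let's look at the mm/oom_kill.c source code. It says that if nothing on the system can be killed, the kernel either hangs or panics.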

++++++
        /* Found nothing?!?! Either we hang forever, or we panic. */
        if (!p) {
            read_unlock(&tasklist_lock);
            cpuset_unlock();
            panic("Out of memory and no killable processes...\n");  <<<------
        }
++++++

So we took the panic path, and because the kdump service was enabled on this system, a vmcore was captured.

E) Let's check the kernel ring buffer.

crash> log
[..]
[  253.351427] Node 0 DMA free:15744kB min:20kB low:24kB high:28kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15356kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[  253.352234] lowmem_reserve[]: 0 2955 32245 32245
[  253.352812] Node 0 DMA32 free:120436kB min:4120kB low:5148kB high:6180kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:32kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3026080kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:20kB slab_unreclaimable:16600kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:1 all_unreclaimable? no
[  253.353637] lowmem_reserve[]: 0 0 29290 29290
[  253.354216] Node 0 Normal free:40580kB min:40868kB low:51084kB high:61300kB active_anon:956kB inactive_anon:536kB active_file:260kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:29992960kB mlocked:0kB dirty:0kB writeback:0kB mapped:460kB shmem:136kB slab_reclaimable:3640kB slab_unreclaimable:75128kB kernel_stack:4448kB pagetables:428kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[  253.355047] lowmem_reserve[]: 0 0 0 0
[  253.355624] Node 1 Normal free:39896kB min:45096kB low:56368kB high:67644kB active_anon:412kB inactive_anon:1668kB active_file:288kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):220kB present:33095680kB mlocked:0kB dirty:0kB writeback:0kB mapped:92kB shmem:80kB slab_reclaimable:3496kB slab_unreclaimable:87864kB kernel_stack:216kB pagetables:564kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[  253.356457] lowmem_reserve[]: 0 0 0 0
[  253.357034] Node 0 DMA: 2*4kB 1*8kB 1*16kB 1*32kB 1*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15744kB
[  253.358351] Node 0 DMA32: 41*4kB 8*8kB 7*16kB 6*32kB 10*64kB 10*128kB 7*256kB 9*512kB 7*1024kB 5*2048kB 23*4096kB = 120468kB
[  253.359674] Node 0 Normal: 718*4kB 558*8kB 278*16kB 169*32kB 88*64kB 47*128kB 13*256kB 5*512kB 0*1024kB 1*2048kB 1*4096kB = 40872kB
[  253.360995] Node 1 Normal: 876*4kB 447*8kB 249*16kB 174*32kB 116*64kB 40*128kB 8*256kB 1*512kB 1*1024kB 2*2048kB 1*4096kB = 40952kB
[  253.362319] 154 total pagecache pages
[  253.362502] 0 pages in swap cache
[  253.362684] Swap cache stats: add 0, delete 0, find 0/0
[  253.362869] Free swap  = 0kB
[  253.363050] Total swap = 0kB
[  253.526814] 16777215 pages RAM
[  253.526999] 294628 pages reserved
[  253.527190] 114911 pages shared
[  253.527372] 16392561 pages non-shared
[..]
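
The per-zone lines in this log can be turned into a quick "where is it?" check by adding up the counters the kernel itemizes and comparing the sum with the zone's present size. This is a minimal Python sketch, not something crash provides. The choice of counters to add up is an assumption made here (they are treated as disjoint, and mapped and shmem are left out because they overlap with the file and anon counters). The sample string is the "Node 0 Normal" line from the log above, without its timestamp.

```python
import re

# key:valuekB pairs, e.g. "free:40580kB" or "isolated(anon):0kB"
FIELD_RE = re.compile(r"([\w()]+):(\d+)kB")

# Counters treated as disjoint for this estimate (an assumption).
ITEMIZED = (
    "free", "active_anon", "inactive_anon", "active_file", "inactive_file",
    "unevictable", "slab_reclaimable", "slab_unreclaimable",
    "kernel_stack", "pagetables",
)


def zone_gap_kb(line):
    """Return (itemized kB, present kB minus itemized kB) for one zone line."""
    fields = {key: int(value) for key, value in FIELD_RE.findall(line)}
    itemized = sum(fields[name] for name in ITEMIZED)
    return itemized, fields["present"] - itemized


NODE0_NORMAL = (
    "Node 0 Normal free:40580kB min:40868kB low:51084kB high:61300kB "
    "active_anon:956kB inactive_anon:536kB active_file:260kB inactive_file:0kB "
    "unevictable:0kB isolated(anon):0kB isolated(file):0kB present:29992960kB "
    "mlocked:0kB dirty:0kB writeback:0kB mapped:460kB shmem:136kB "
    "slab_reclaimable:3640kB slab_unreclaimable:75128kB kernel_stack:4448kB "
    "pagetables:428kB unstable:0kB bounce:0kB writeback_tmp:0kB "
    "pages_scanned:0 all_unreclaimable? no"
)

if __name__ == "__main__":
    itemized, gap = zone_gap_kb(NODE0_NORMAL)
    print(f"itemized: {itemized} kB, not itemized: {gap} kB ({gap / 2**20:.1f} GiB)")
```

For this zone only about 123 MB is itemized, which leaves roughly 28.5 GiB that none of the usual counters explain. Note also that free (40580kB) is already below the min watermark (40868kB).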

F) Let's check the memory state at the time of the crash.

crash> kmem -i
              PAGES        TOTAL      PERCENTAGE
 TOTAL MEM  16482587      62.9 GB         ----        -------------------------------+
      FREE    54610     213.3 MB    0% of TOTAL MEM                                  |
      USED  16427977      62.7 GB   99% of TOTAL MEM                                 |
    SHARED     4683      18.3 MB    0% of TOTAL MEM                                  |
   BUFFERS      118       472 KB    0% of TOTAL MEM                                  |
    CACHED       82       328 KB    0% of TOTAL MEM                                  |
      SLAB    46635     182.2 MB    0% of TOTAL MEM                                  |
                                                                                     |
TOTAL SWAP        0            0         ----         ----------------------+        |
 SWAP USED        0            0  100% of TOTAL SWAP                        |        |
 SWAP FREE        0            0    0% of TOTAL SWAP                        |        |
                                                                            |        | 
                                                                            |        |
crash> p -d totalram_pages                                                  |        |
totalram_pages = $5 = 16482587                                              |        |
                                                                            |        |
crash> !echo "scale=5;(16482587*4096)/2^30"|bc -q                           |        |
62.87607                  <<<-----[ Total physical memory is 62.9 GB ] <<<--|--------+
                                                                            |
crash> p -d total_swap_pages                                                |
total_swap_pages = $6 = 0 <<<------[ No Swap on the system ]  <<<-----------+
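  • We have roughly 63 GiB of physical memory in total.
  • No swap partition or swap file was created on the system, so this server has no swap.
  • Very little memory is used for cache (328 KB) or buffers (472 KB).
  • Slab usage is also very small, only 182.2 MB.

G) The total memory allocated to processes is 0.00391006 GiB.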

crash> ps -G | tail -n +2 | cut -b2- | gawk '{mem += $8} END {print "total " mem/1048576 "GB"}'
total 0.00391006GB

H) Application processes are not what is using the system's memory.

crash> ps -G | sed 's/>//g' | sort -k 8,8 -n | awk '$8 ~ /[0-9]/{ $8 = $8/1024" MB"; print }' | tail -20
965 2 21 ffff8808292f1500 IN 0.0 0 0 MB [ext4-dio-unwrit]
966 2 22 ffff8808292d4080 IN 0.0 0 0 MB [ext4-dio-unwrit]
967 2 23 ffff8808292ce040 IN 0.0 0 0 MB [ext4-dio-unwrit]
968 2 24 ffff8808299b5540 IN 0.0 0 0 MB [ext4-dio-unwrit]
969 2 25 ffff880829aa6040 IN 0.0 0 0 MB [ext4-dio-unwrit]
970 2 26 ffff880827367500 IN 0.0 0 0 MB [ext4-dio-unwrit]
971 2 27 ffff880827366aa0 IN 0.0 0 0 MB [ext4-dio-unwrit]
972 2 28 ffff880827366040 IN 0.0 0 0 MB [ext4-dio-unwrit]
97 2 23 ffff88082c1ac080 IN 0.0 0 0 MB [ksoftirqd/23]
973 2 29 ffff880827371540 IN 0.0 0 0 MB [ext4-dio-unwrit]
974 2 30 ffff880827370ae0 IN 0.0 0 0 MB [ext4-dio-unwrit]
975 2 31 ffff880827370080 IN 0.0 0 0 MB [ext4-dio-unwrit]
98 2 23 ffff88082c1bb500 IN 0.0 0 0 MB [watchdog/23]
99 2 24 ffff88082c1baaa0 IN 0.0 0 0 MB [migration/24]
3171 1 3 ffff880826ccaaa0 IN 0.0 27636 0.234375 MB auditd
1 0 1 ffff88082c41b500 UN 0.0 19348 0.339844 MB init
3772 1 0 ffff88102b257500 RU 0.0 64072 0.652344 MB sshd
1047 1 2 ffff881029524040 IN 0.0 11188 0.925781 MB udevd
4936 1047 4 ffff880ff342d540 IN 0.0 11184 0.925781 MB udevd
4937 1047 5 ffff88082a240080 IN 0.0 11184 0.925781 MB udevd

I) Let's verify the memory tuning parameters on the system.

crash> p -d sysctl_overcommit_memory
sysctl_overcommit_memory = $7 = 0

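This value holds the flag that controls memory overcommit. When the flag is 0, the kernel tries to estimate how much free memory is left whenever user space asks for more memory.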

crash> p -d sysctl_overcommit_ratio
sysctl_overcommit_ratio = $8 = 50

When overcommit_memory is set to 2, the committed address space is not allowed to exceed swap plus this percentage of physical RAM.
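
To make that rule concrete, the limit can be worked out from the values already printed in this session. This is a sketch based on my understanding of how the kernel derives CommitLimit; the helper name is invented, and `hugetlb_pages=0` is an assumption, since the huge page count could not be read from this dump.

```python
PAGE_SIZE = 4096  # bytes; assumes 4 KiB pages


def commit_limit_pages(total_ram_pages, total_swap_pages, overcommit_ratio,
                       hugetlb_pages=0):
    """Commit limit in pages: swap plus overcommit_ratio percent of RAM.

    hugetlb_pages defaults to 0, which is an assumption for this dump.
    """
    ram_share = (total_ram_pages - hugetlb_pages) * overcommit_ratio // 100
    return ram_share + total_swap_pages


if __name__ == "__main__":
    # totalram_pages, total_swap_pages and sysctl_overcommit_ratio as printed above.
    limit = commit_limit_pages(16482587, 0, 50)
    print(f"{limit} pages = {limit * PAGE_SIZE / 2**30:.1f} GiB")
```

With no swap and a ratio of 50 the limit would be about 31.4 GiB, but since sysctl_overcommit_memory is 0 on this system, that limit is not enforced here.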

crash> p -d zone_reclaim_mode 
zone_reclaim_mode = $4 = 0

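zone_reclaim_mode lets you choose a more or less aggressive approach to reclaiming memory when a zone runs out of memory. If it is set to zero, no zone reclaim occurs.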

crash> p -d min_free_kbytes
min_free_kbytes = $3 = 90112  <<<--------[ 88 MB ]

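This is the minimum number of kilobytes to keep free across the system. The value is used to compute a watermark for each low-memory zone, and each zone is then given a number of reserved free pages proportional to its size. Be careful when setting this parameter, because both too-low and too-high values are harmful.

In other words, setting min_free_kbytes too low prevents the system from reclaiming memory, which can lead to system hangs and to the OOM killer killing multiple processes. However, setting it too high (5-10% of total system memory) makes the system run out of memory immediately. Linux is designed to use all available RAM to cache file system data, and a high min_free_kbytes value makes the system spend too much time reclaiming memory.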
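
The proportional split can be checked against this dump. The sketch below reflects my reading of how 2.6.32-era kernels derive the per-zone watermarks (min is the zone's share of min_free_kbytes, low adds a quarter of min, high adds half); it is not taken from the crash output itself. It assumes 4 KiB pages, and the PRESENT page counts are copied from the kmem -z output in the question.

```python
PAGE_KB = 4  # assumes 4 KiB pages

# PRESENT page counts of the populated zones, from `kmem -z` in the question.
ZONES = {
    "Node 0 DMA": 3839,
    "Node 0 DMA32": 756520,
    "Node 0 Normal": 7498240,
    "Node 1 Normal": 8273920,
}


def zone_watermarks(min_free_kbytes, zones):
    """Return {zone: (min, low, high)} in pages, split by zone size."""
    pages_min = min_free_kbytes // PAGE_KB
    lowmem_pages = sum(zones.values())
    marks = {}
    for name, present in zones.items():
        wmark_min = pages_min * present // lowmem_pages
        marks[name] = (
            wmark_min,
            wmark_min + wmark_min // 4,  # low
            wmark_min + wmark_min // 2,  # high
        )
    return marks


if __name__ == "__main__":
    for zone, (lo_min, low, high) in zone_watermarks(90112, ZONES).items():
        print(f"{zone}: MIN/LOW/HIGH = {lo_min}/{low}/{high}")
```

The results can be compared with the MIN/LOW/HIGH column that kmem -z prints for each zone, which is a quick way to confirm that min_free_kbytes = 90112 is the value the running kernel was actually using.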

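The values of the parameters above look fine, so where did the memory go?

Hypotheses:

  1. The main culprit is not in user space. In my experience, unaccounted memory like this is caused by the Mellanox and DRBD modules, but I am not sure about your case.
  2. Most pages were discarded from the vmcore file to reduce its size (core_collector makedumpfile -d 31 -c), so I cannot check the huge page size.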