Postgres gets out of memory errors despite having plenty of free memory

Ale*_*scu 12 memory postgresql

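I have a server running Postgres 9.1.15. The server has 2GB of RAM and no swap. Intermittently, Postgres starts getting "out of memory" errors on some SELECTs, and it keeps doing so until I restart Postgres or some of the clients connected to it. The odd part is that when this happens, free still reports over 500MB of free memory.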

select version();:

PostgreSQL 9.1.15 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit

uname -a:

Linux db 3.2.0-23-virtual #36-Ubuntu SMP Tue Apr 10 22:29:03 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

postgresql.conf (everything else is commented out/default):

max_connections = 100
shared_buffers = 500MB
work_mem = 2000kB
maintenance_work_mem = 128MB
wal_buffers = 16MB
checkpoint_segments = 32
checkpoint_completion_target = 0.9
random_page_cost = 2.0
effective_cache_size = 1000MB
default_statistics_target = 100
log_temp_files = 0

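I got these values from pgtune (I chose "mixed type of applications") and have been fiddling with them based on what I've read, without making much real progress. At the moment there are 68 connections, which is a typical number (I'm not using pgbouncer or any other connection pooler yet).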
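As a rough sanity check on the settings above, here is a minimal Python sketch that adds up a floor for Postgres memory use. The helper name `estimate_budget_mb` is hypothetical, and the model is an assumption: one `work_mem` allocation per connection (a single query can use several, one per sort or hash node) and up to three autovacuum workers (the default `autovacuum_max_workers`) each using `maintenance_work_mem`.

```python
KB_PER_MB = 1024  # PostgreSQL treats kB and MB as powers of 1024


def estimate_budget_mb(shared_buffers_mb, work_mem_kb, connections,
                       maintenance_work_mem_mb, autovacuum_workers=3):
    """Return a rough memory estimate in MB (hypothetical helper).

    Assumption: each connection holds one work_mem allocation at a time.
    Real queries may use more, so treat the result as a lower bound.
    """
    backends_mb = connections * work_mem_kb / KB_PER_MB
    maintenance_mb = autovacuum_workers * maintenance_work_mem_mb
    return {
        "shared_buffers_mb": shared_buffers_mb,
        "backends_mb": backends_mb,
        "maintenance_mb": maintenance_mb,
        "total_mb": shared_buffers_mb + backends_mb + maintenance_mb,
    }


# Values taken from the postgresql.conf above, with the 68 current connections.
budget = estimate_budget_mb(shared_buffers_mb=500, work_mem_kb=2000,
                            connections=68, maintenance_work_mem_mb=128)
print(budget)
```

Under these assumptions the total is about 1017 MB, roughly half of the machine's 2GB before counting per-backend overhead or anything else running on the box.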

/etc/sysctl.conf:

kernel.shmmax=1050451968
kernel.shmall=256458

vm.overcommit_ratio=100
vm.overcommit_memory=2

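I first changed overcommit_memory to 2 about two weeks ago, after the OOM killer killed the Postgres server. Before that, the server had been running fine for a long time. The errors I get now are less catastrophic but much more annoying, because they are much more frequent.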
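For context on what these sysctl values imply: with vm.overcommit_memory=2 the kernel refuses any allocation that would push the total committed address space past a fixed limit, computed from swap plus a percentage of RAM. Memory counts as committed when it is allocated, not when it is touched, so allocations can fail while free still shows free memory. The sketch below only illustrates that formula; the function name is hypothetical, and the 2GB figure is nominal (the real MemTotal is a little lower). Whether this is what is happening on this server is not confirmed here.

```python
def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio, hugepages_kb=0):
    """Commit limit under vm.overcommit_memory=2 (hypothetical helper).

    Mirrors the kernel's documented formula:
    swap + (RAM - huge pages) * overcommit_ratio / 100
    """
    return swap_kb + (ram_kb - hugepages_kb) * overcommit_ratio // 100


NOMINAL_RAM_KB = 2 * 1024 * 1024  # assumption: exactly 2GB, no huge pages

# Settings from /etc/sysctl.conf above: ratio 100, and the box has no swap.
print(commit_limit_kb(NOMINAL_RAM_KB, swap_kb=0, overcommit_ratio=100))

# For comparison, the kernel default ratio of 50 with no swap.
print(commit_limit_kb(NOMINAL_RAM_KB, swap_kb=0, overcommit_ratio=50))
```

With no swap, the limit is just the RAM share, so every process's allocations (including Postgres shared memory) have to fit under it.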

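I haven't had much luck pinpointing the first event that causes postgres to run "out of memory"; it seems to be different each time. The most recent time it crashed, the first three entries logged were: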

2015-04-07 05:32:39 UTC ERROR:  out of memory
2015-04-07 05:32:39 UTC DETAIL:  Failed on request of size 125.
2015-04-07 05:32:39 UTC CONTEXT:  automatic analyze of table "xxx.public.delayed_jobs"
TopMemoryContext: 68688 total in 10 blocks; 4560 free (4 chunks); 64128 used
[... snipped heaps of lines which I can provide if they are useful ...]

---

2015-04-07 05:33:58 UTC ERROR:  out of memory
2015-04-07 05:33:58 UTC DETAIL:  Failed on request of size 16.
2015-04-07 05:33:58 UTC STATEMENT:  SELECT oid, typname, typelem, typdelim, typinput FROM pg_type
2015-04-07 05:33:59 UTC LOG:  could not fork new process for connection: Cannot allocate memory
2015-04-07 05:33:59 UTC LOG:  could not fork new process for connection: Cannot allocate memory
2015-04-07 05:33:59 UTC LOG:  could not fork new process for connection: Cannot allocate memory
TopMemoryContext: 396368 total in 50 blocks; 10160 free (28 chunks); 386208 used
[... snipped heaps of lines which I can provide if they are useful ...]

---

2015-04-07 05:33:59 UTC ERROR:  out of memory
2015-04-07 05:33:59 UTC DETAIL:  Failed on request of size 1840.
2015-04-07 05:33:59 UTC STATEMENT:  SELECT... [nested select with 4 joins, 19 ands, and 2 order bys]
TopMemoryContext: 388176 total in 49 blocks; 17264 free (55 chunks); 370912 used

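The crash before that, a few hours earlier, just had three instances of that last query as its first three entries. That query runs very often, so I'm not sure whether the problem is caused by this query, or whether it just shows up in the error log because it's a reasonably complex SELECT that runs all the time. That said, here's an EXPLAIN ANALYZE of it: http://explain.depesz.com/s/r00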
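To make it easier to see which statement or context hit each "out of memory" error first, the log entries can be grouped with a small script. This is a minimal sketch, assuming every log line starts with a timestamp and time zone as in the excerpts above; `oom_events` and `LINE_RE` are hypothetical names, and lines without that prefix (such as the TopMemoryContext dumps) are skipped.

```python
import re

# Assumed line shape: "2015-04-07 05:32:39 UTC ERROR:  out of memory"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \w+ ([A-Z]+):\s+(.*)$"
)
FOLLOW_UP_LEVELS = ("DETAIL", "STATEMENT", "CONTEXT")


def oom_events(lines):
    """Group each 'out of memory' ERROR with its DETAIL/STATEMENT/CONTEXT."""
    events = []
    current = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # memory context dumps, separators, blank lines
        timestamp, level, message = match.groups()
        if level == "ERROR":
            if message == "out of memory":
                current = {"time": timestamp, "message": message}
                events.append(current)
            else:
                current = None
        elif level in FOLLOW_UP_LEVELS:
            if current is not None:
                current[level.lower()] = message
        else:
            current = None  # LOG, WARNING, etc. end the current event
    return events
```

Calling `oom_events(open(path))` on a log file (the path is whatever your logging setup uses) returns the events in order, so the first element is the earliest failure in that file.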

This is what ulimit -a looks like for the postgres user:

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 15956
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 15956
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

I'll try to get the exact numbers from free the next time it crashes; in the meantime, this is a brain dump of all the information I have.
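Alongside free, it may be worth capturing /proc/meminfo at the moment of failure, since with vm.overcommit_memory=2 the fields that decide whether an allocation succeeds are CommitLimit and Committed_AS rather than MemFree. Below is a minimal sketch; the function names are hypothetical and nothing here is specific to Postgres.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Most fields are reported in kB; the unit suffix is dropped.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def commit_headroom_kb(meminfo):
    """kB still allocatable before the strict-overcommit limit is reached."""
    return meminfo["CommitLimit"] - meminfo["Committed_AS"]


def read_snapshot(path="/proc/meminfo"):
    """Read the fields worth logging next to the Postgres error (Linux only)."""
    with open(path) as handle:
        info = parse_meminfo(handle.read())
    wanted = ("MemFree", "SwapTotal", "SwapFree", "CommitLimit", "Committed_AS")
    snapshot = {key: info[key] for key in wanted if key in info}
    snapshot["headroom_kb"] = commit_headroom_kb(info)
    return snapshot
```

Running `read_snapshot()` from a cron job or a monitoring hook every minute or so would show whether the headroom is near zero when the errors begin, even while MemFree is still in the hundreds of MB.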

Any ideas on where to go from here?

Chr*_*ian 2

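Can you check whether any swap memory is available when the error occurs?

I completely removed the swap on my Linux desktop (just to test something else...), and I got exactly the same error! I'm pretty sure that is what's happening to you too.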