Dio*_*lis 4 mmap copy-on-write linux-kernel
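mmap fails with ENOMEM on a 2.6.26-2-amd64 Linux kernel when I try to map a 5 GB file with copy-on-write semantics (PROT_READ | PROT_WRITE and MAP_PRIVATE). Mapping files smaller than 4 GB, or mapping with PROT_READ only, works fine. This is not the soft resource limit problem reported in this question; the virtual memory size limit is unlimited.
Here is code that reproduces the problem (the actual code is part of Boost.Interprocess).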
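#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    struct stat b;
    void *base;
    int fd = open("foo.bin", O_RDWR);

    if (fd == -1) {
        perror("open");
        return 1;
    }
    if (fstat(fd, &b) == -1) {
        perror("fstat");
        return 1;
    }
    /* Private, writable (copy-on-write) mapping of the whole file. */
    base = mmap(0, b.st_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (base == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    return 0;
}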
Here is what happens:
dd if=/dev/zero of=foo.bin bs=1M seek=5000 count=1
./test-mmap
mmap: Cannot allocate memory
Here is the relevant strace output (from a freshly compiled 4.5.20), as requested by nos.
open("foo.bin", O_RDWR) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=5243928576, ...}) = 0
mmap(NULL, 5243928576, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0) = -1 ENOMEM (Cannot allocate memory)
dup(2) = 4
[...]
write(4, "mmap: Cannot allocate memory\n", 29mmap: Cannot allocate memory
) = 29
Try passing MAP_NORESERVE in the flags field, like this:
mmap(NULL, b.st_size, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_NORESERVE, fd, 0);
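Your swap and physical memory combined are probably smaller than the 5 GB requested.
Alternatively, you can do the following as a test; if it works, you can make the code change above: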
# echo 0 > /proc/sys/vm/overcommit_memory
Here are the relevant excerpts from the man pages.
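Before changing the setting, it may help to see the current mode and how RAM plus swap compares with the size of the mapping. The following is only a sketch: the function names are made up for illustration, and it assumes a Linux system with /proc mounted and sysinfo(2) available.

```c
#include <stdio.h>
#include <sys/sysinfo.h>

/* Hypothetical helper: current value of /proc/sys/vm/overcommit_memory
 * (0, 1 or 2), or -1 if the file cannot be read or parsed. */
int overcommit_mode(void)
{
    int mode = -1;
    FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &mode) != 1)
        mode = -1;
    fclose(f);
    return mode;
}

/* Hypothetical helper: total physical memory plus total swap in bytes,
 * or 0 if sysinfo(2) fails. */
unsigned long long ram_plus_swap_bytes(void)
{
    struct sysinfo si;

    if (sysinfo(&si) == -1)
        return 0;
    /* totalram and totalswap are counted in units of mem_unit bytes. */
    return ((unsigned long long)si.totalram + si.totalswap) * si.mem_unit;
}
```

Comparing ram_plus_swap_bytes() with the st_size value from fstat (5243928576 in the strace output above) shows whether the request is larger than RAM and swap together.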
MMAP(2):
MAP_NORESERVE
Do not reserve swap space for this mapping. When swap space is
reserved, one has the guarantee that it is possible to modify
the mapping. When swap space is not reserved one might get
SIGSEGV upon a write if no physical memory is available. See
also the discussion of the file /proc/sys/vm/overcommit_memory
in proc(5). In kernels before 2.6, this flag only had effect
for private writable mappings.
PROC(5):
/proc/sys/vm/overcommit_memory
This file contains the kernel virtual memory accounting mode.
Values are:
0: heuristic overcommit (this is the default)
1: always overcommit, never check
2: always check, never overcommit
In mode 0, calls of mmap(2) with MAP_NORESERVE are not checked,
and the default check is very weak, leading to the risk of getting
a process "OOM-killed". Under Linux 2.4 any non-zero value
implies mode 1. In mode 2 (available since Linux 2.6), the
total virtual address space on the system is limited to (SS +
RAM*(r/100)), where SS is the size of the swap space, and RAM is
the size of the physical memory, and r is the contents of the
file /proc/sys/vm/overcommit_ratio.
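The mode 2 limit quoted from proc(5) is simple arithmetic and can be sketched as follows. The function name and the figures in the note after the code are hypothetical, chosen only to illustrate the formula.

```c
/* Hypothetical helper: mode 2 limit from the proc(5) excerpt,
 * SS + RAM * (r / 100), with all sizes in bytes.
 * ratio_percent is the content of /proc/sys/vm/overcommit_ratio. */
unsigned long long commit_limit_bytes(unsigned long long swap_bytes,
                                      unsigned long long ram_bytes,
                                      unsigned int ratio_percent)
{
    /* Multiply before dividing so integer division does not drop the ratio. */
    return swap_bytes + ram_bytes * ratio_percent / 100;
}
```

For example, an assumed machine with 2 GiB of swap, 4 GiB of RAM and a ratio of 50 gets a limit of 4294967296 bytes, which is below the 5243928576 bytes requested in the question.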