aSt*_*eve 18 c++ boost mmap shared-memory boost-interprocess
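I am interested in the prospect of using memory-mapped I/O, preferably using the facilities in boost::interprocess for cross-platform support, to map non-contiguous system-page-size blocks of a file into a contiguous address space in memory.
A simplified concrete scenario:
I have a number of "plain old data" structures, each of a fixed length (less than the system page size). These structures are concatenated into a (very long) stream, with the type and location of each structure determined by the values of the structures that precede it in the stream. I am aiming to minimize latency and maximize throughput in a demanding concurrent environment.
I can read this data very effectively by memory-mapping it in blocks of at least twice the system page size... and establishing a new mapping as soon as I have read a structure that extends beyond the penultimate system page boundary. This lets the code that works with the plain-old-data structures remain blissfully unaware that they are memory mapped... it could, for example, compare two different structures directly with memcmp() without having to care about page boundaries.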
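The reading scheme described above can be sketched roughly as follows. This is a minimal illustration in C rather than code from the question: `struct window`, `window_at()` and `demo_window()` are hypothetical names, and the sketch assumes a POSIX system, records no larger than one page, and a file that extends at least to the end of the second mapped page.

```c
#define _DEFAULT_SOURCE   /* expose POSIX/BSD declarations under strict -std=c11 on glibc */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper type: a read-only, two-page window over a file. */
struct window {
    unsigned char *base;  /* address of the mapping, or NULL if nothing is mapped */
    off_t start;          /* file offset of the first mapped byte (page aligned) */
    size_t len;           /* mapped length in bytes */
};

/* Return a pointer to `size` bytes at file offset `off`, remapping the window
 * when the request is not fully covered by it. Returns NULL on failure. */
static const void *window_at(struct window *w, int fd, off_t off, size_t size)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    assert(size <= page);   /* precondition: a record never exceeds one page */
    if (w->base == NULL || off < w->start ||
        (size_t)(off - w->start) + size > w->len) {
        if (w->base != NULL)
            munmap(w->base, w->len);
        w->start = off - (off % (off_t)page);   /* round down to a page boundary */
        w->len = 2 * page;
        void *p = mmap(NULL, w->len, PROT_READ, MAP_SHARED, fd, w->start);
        if (p == MAP_FAILED) {
            w->base = NULL;
            return NULL;
        }
        w->base = p;
    }
    return w->base + (off - w->start);
}

/* Demo: a 16-byte record straddling the first page boundary of a temporary
 * file is read back through the window as one contiguous object.
 * Returns 0 on success, -1 on any failure. */
static int demo_window(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    const char rec[16] = "0123456789abcde";    /* 15 characters plus NUL */
    off_t off = (off_t)page - 8;                /* starts 8 bytes before the boundary */
    struct window w = { NULL, 0, 0 };
    int rc = -1;

    FILE *f = tmpfile();
    if (f == NULL)
        return -1;
    int fd = fileno(f);
    if (ftruncate(fd, (off_t)(3 * page)) == 0 &&
        pwrite(fd, rec, sizeof rec, off) == (ssize_t)sizeof rec) {
        const void *p = window_at(&w, fd, off, sizeof rec);
        if (p != NULL && memcmp(p, rec, sizeof rec) == 0)
            rc = 0;
    }
    if (w.base != NULL)
        munmap(w.base, w.len);
    fclose(f);
    return rc;
}
```

Because the window always starts on the page containing the requested offset and spans two pages, any record of at most one page is fully covered, so the caller can hand the returned pointer straight to memcmp().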
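Where things get interesting is with respect to updating these data streams... while they are being (concurrently) read. The strategy I would like to use is inspired by "copy on write" at system-page-size granularity... essentially writing "overlay pages", allowing one process to read the old data while another reads the updated data.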
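For comparison, the kernel's own page-granular copy-on-write can be observed with a MAP_PRIVATE mapping. The sketch below is not the overlay-page scheme the question proposes (a private copy is visible only to the process that made it and is never written back to the file); it only illustrates the idea of one view keeping the old page while another sees the update. `demo_private_cow()` is a hypothetical name, and a POSIX system is assumed.

```c
#define _DEFAULT_SOURCE   /* expose POSIX/BSD declarations under strict -std=c11 on glibc */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 0 if a write through a MAP_PRIVATE view leaves the MAP_SHARED view
 * of the same file page untouched, -1 on any failure. */
static int demo_private_cow(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    int rc = -1;

    FILE *f = tmpfile();
    if (f == NULL)
        return -1;
    int fd = fileno(f);
    if (ftruncate(fd, (off_t)page) != 0) {
        fclose(f);
        return -1;
    }

    /* The "reader" view is shared with the file. */
    char *reader = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (reader != MAP_FAILED) {
        strcpy(reader, "old");   /* initial data, stored in the file's page */

        /* The "writer" view is private: its first write makes the kernel copy the page. */
        char *writer = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
        if (writer != MAP_FAILED) {
            assert(reader != writer);   /* two distinct views of the same file page */
            strcpy(writer, "new");      /* lands in the private copy only */
            if (strcmp(reader, "old") == 0 && strcmp(writer, "new") == 0)
                rc = 0;
            munmap(writer, page);
        }
        munmap(reader, page);
    }
    fclose(f);
    return rc;
}
```

The design difference matters for the scenario above: with MAP_PRIVATE the updated page cannot be shared with other processes, which is why the question writes the new page to a separate location in the file instead.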
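Managing which overlay pages to use, and when, is not necessarily trivial... but that is not my main concern. My main concern is that I may have a structure spanning pages 4 and 5, then update a structure wholly contained in page 5... writing the new page at location 6... and leaving page 5 to be "garbage collected" once it is determined to be no longer reachable. This means that, if I map page 4 at location M, I need to map page 6 at memory location M + page_size... in order to reliably process structures that cross page boundaries using existing (non-memory-mapping-aware) functions.
I am trying to establish the best strategy, and I am hampered by documentation that I feel is incomplete. Essentially, I need to decouple the allocation of address space from the mapping of memory into that address space. With mmap(), I know I can use MAP_FIXED if I wish to control the mapping location explicitly... but I am unclear how I should reserve address space in order to do this safely. Can I map /dev/zero for two pages without MAP_FIXED, then use MAP_FIXED twice to map two pages into that allocated space at explicit VM addresses? If so, should I call munmap() three times as well? Will it leak resources and/or carry any other untoward overhead? To make the issue even more complex, I would like comparable behaviour on Windows... is there any way to do this? Are there neat solutions if I were to compromise my cross-platform ambitions?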
-
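Thanks for your answer, Mahmoud... I have read it, and I think I have understood the code... I have compiled it under Linux and it behaves as you suggest.
My main concern is with line 62 (the mmap() call that uses MAP_FIXED). It makes some assumptions about mmap that I have been unable to confirm from the documentation I can find. You are mapping the "update" page into the same address space that mmap() returned initially. I assume this is "correct", i.e. not something that just happens to work on Linux? I would also need to assume that it works cross-platform for file mappings as well as anonymous mappings.
The sample definitely moves me forward... it documents that what I ultimately need is probably achievable with mmap(), on Linux at least. What I would really like is a pointer to documentation showing that the MAP_FIXED line will work as the sample demonstrates... and, ideally, a translation from the Linux/Unix-specific mmap() to a platform-independent (Boost::interprocess) approach.
Your question is a little confusing. From what I understood, this code will do what you need: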
#define PAGESIZE 4096
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <errno.h>
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>
#include <assert.h>
struct StoredObject
{
    int IntVal;
    char StrVal[25];
};
int main(int argc, char **argv)
{
    int fd = open("mmapfile", O_RDWR | O_CREAT | O_TRUNC, (mode_t) 0600);
    //Set the file to the size of our data (2 pages)
    lseek(fd, PAGESIZE*2 - 1, SEEK_SET);
    write(fd, "", 1); //The final byte
    unsigned char *mapPtr = (unsigned char *) mmap(0, PAGESIZE * 2, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    struct StoredObject controlObject;
    controlObject.IntVal = 12;
    strcpy(controlObject.StrVal, "Mary had a little lamb.\n");
    struct StoredObject *mary1;
    mary1 = (struct StoredObject *)(mapPtr + PAGESIZE - 4); //Will fall on the boundary between first and second page
    memcpy(mary1, &controlObject, sizeof(struct StoredObject));
    printf("%d, %s", mary1->IntVal, mary1->StrVal);
    //Should print "12, Mary had a little lamb.\n"
    struct StoredObject *john1;
    john1 = mary1 + 1; //Comes immediately after mary1 in memory; will start and end in the second page
    memcpy(john1, &controlObject, sizeof(struct StoredObject));
    john1->IntVal = 42;
    strcpy(john1->StrVal, "John had a little lamb.\n");
    printf("%d, %s", john1->IntVal, john1->StrVal);
    //Should print "42, John had a little lamb.\n"
    //Make sure the data's on the disk, as this is the initial, "read-only" data
    msync(mapPtr, PAGESIZE * 2, MS_SYNC);
    //This is the initial data set, now in memory, loaded across two pages
    //At this point, someone could be reading from there. We don't know or care.
    //We want to modify john1, but don't want to write over the existing data
    //Easy as pie.
    //This is the shadow map. COW-like optimization will take place:
    //we'll map the entire address space from the shared source, then overlap with a new map to modify
    //This is mapped anywhere, letting the system decide what address we'll be using for the new data pointer
    unsigned char *mapPtr2 = (unsigned char *) mmap(0, PAGESIZE * 2, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    //Map an anonymous page on top of the second page of that mapping; this is the one that we're modifying. It is *not* backed by disk
    unsigned char *temp = (unsigned char *) mmap(mapPtr2 + PAGESIZE, PAGESIZE, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED | MAP_ANON, -1, 0); //fd is -1: portable code should not pass a real descriptor with MAP_ANON
    if (temp == MAP_FAILED)
    {
        printf("Fixed map failed. %s", strerror(errno));
        return 1;
    }
    assert(temp == mapPtr2 + PAGESIZE);
    //Make a copy of the old data that will later be changed
    memcpy(mapPtr2 + PAGESIZE, mapPtr + PAGESIZE, PAGESIZE);
    //The two address spaces should still be identical until this point
    assert(memcmp(mapPtr, mapPtr2, PAGESIZE * 2) == 0);
    //We can now make our changes to the second page as needed
    struct StoredObject *mary2 = (struct StoredObject *)(((unsigned char *)mary1 - mapPtr) + mapPtr2);
    struct StoredObject *john2 = (struct StoredObject *)(((unsigned char *)john1 - mapPtr) + mapPtr2);
    john2->IntVal = 52;
    strcpy(john2->StrVal, "Mike had a little lamb.\n");
    //Test that everything worked OK
    assert(memcmp(mary1, mary2, sizeof(struct StoredObject)) == 0);
    printf("%d, %s", john2->IntVal, john2->StrVal);
    //Should print "52, Mike had a little lamb.\n"
    //Now assume our garbage collection routine has detected that no one is using the original copy of the data
    munmap(mapPtr, PAGESIZE * 2);
    mapPtr = mapPtr2;
    //Now we're done with all our work and want to completely clean up
    munmap(mapPtr2, PAGESIZE * 2);
    close(fd);
    return 0;
}
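My edited answer should address your safety concerns. Only use MAP_FIXED on the second mmap call (as I do above). The nice thing about MAP_FIXED is that it lets you overwrite part of an existing mmap address range. It unmaps the range you overlap and replaces it with the newly mapped content: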
MAP_FIXED
[...] If the memory
region specified by addr and len overlaps pages of any existing
mapping(s), then the overlapped part of the existing mapping(s) will be
discarded. [...]
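This way, you let the OS take care of finding a contiguous block of hundreds of megabytes for you (never use MAP_FIXED on an address range you do not already own, because it silently replaces whatever is mapped there). Then you call mmap with MAP_FIXED on a subsection of that now-mapped huge space, with the data that you will be modifying. Ta-da.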
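Applied to the question's scenario (file pages 4 and 6 side by side in memory), the same idea could look like the sketch below. It is an illustration under assumptions rather than part of the original answer: `map_two_pages()` and `demo_noncontiguous()` are hypothetical names, the address space is reserved with an anonymous PROT_NONE mapping instead of /dev/zero, and a POSIX system is assumed.

```c
#define _DEFAULT_SOURCE   /* expose MAP_ANONYMOUS and POSIX declarations under strict -std=c11 on glibc */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map file pages `first` and `second` (zero-based page
 * indices) back to back in memory. Returns the base address, or NULL on failure. */
static unsigned char *map_two_pages(int fd, off_t first, off_t second, size_t page)
{
    /* 1. Let the OS pick and reserve two pages of address space. */
    unsigned char *base = mmap(NULL, 2 * page, PROT_NONE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* 2. Replace each reserved page with a file page. MAP_FIXED is safe here
     *    only because the range already belongs to the reservation above. */
    void *p1 = mmap(base, page, PROT_READ, MAP_SHARED | MAP_FIXED,
                    fd, first * (off_t)page);
    void *p2 = (p1 == MAP_FAILED) ? MAP_FAILED
             : mmap(base + page, page, PROT_READ, MAP_SHARED | MAP_FIXED,
                    fd, second * (off_t)page);
    if (p2 == MAP_FAILED) {
        munmap(base, 2 * page);
        return NULL;
    }
    assert(p1 == (void *)base && p2 == (void *)(base + page));
    return base;
}

/* Demo: a 16-byte record whose first half ends page 4 and whose second half
 * begins page 6 is read as one contiguous object; page 5 holds stale data.
 * Returns 0 on success, -1 on any failure. */
static int demo_noncontiguous(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    const char rec[16] = "0123456789abcde";   /* 15 characters plus NUL */
    int rc = -1;

    FILE *f = tmpfile();
    if (f == NULL)
        return -1;
    int fd = fileno(f);
    if (ftruncate(fd, (off_t)(7 * page)) == 0 &&
        pwrite(fd, rec, 8, (off_t)(5 * page) - 8) == 8 &&       /* tail of page 4 */
        pwrite(fd, "stale!!!", 8, (off_t)(5 * page)) == 8 &&    /* head of page 5 */
        pwrite(fd, rec + 8, 8, (off_t)(6 * page)) == 8) {       /* head of page 6 */
        unsigned char *base = map_two_pages(fd, 4, 6, page);
        if (base != NULL) {
            if (memcmp(base + page - 8, rec, sizeof rec) == 0)
                rc = 0;
            munmap(base, 2 * page);   /* one call covers both file pages */
        }
    }
    fclose(f);
    return rc;
}
```

munmap() works on address ranges rather than on individual mmap() calls, so a single call over the whole range releases both file pages; the reservation itself was already replaced by the two MAP_FIXED mappings.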
On Windows, something like this should work (I'm on a Mac at the moment, so it is untested):
#define _CRT_SECURE_NO_WARNINGS
#include <windows.h>
#include <stdio.h>
#include <string.h>
#include <assert.h>
struct StoredObject
{
    int IntVal;
    char StrVal[25];
};
int main(int argc, char **argv)
{
    //Views can only be placed at multiples of the allocation granularity (typically 64 KB),
    //so that, rather than the 4 KB page, is the smallest unit that can be overlaid on Windows
    SYSTEM_INFO si;
    GetSystemInfo(&si);
    DWORD BLOCKSIZE = si.dwAllocationGranularity;
    HANDLE hFile = CreateFileW(L"mmapfile", GENERIC_READ | GENERIC_WRITE, FILE_SHARE_READ, NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    //Set the file to the size of our data (2 blocks)
    SetFilePointer(hFile, BLOCKSIZE * 2 - 1, NULL, FILE_BEGIN);
    DWORD bytesWritten = 0;
    WriteFile(hFile, "", 1, &bytesWritten, NULL);
    HANDLE hMap = CreateFileMapping(hFile, NULL, PAGE_READWRITE, 0, BLOCKSIZE * 2, NULL);
    unsigned char *mapPtr = (unsigned char *) MapViewOfFile(hMap, FILE_MAP_READ | FILE_MAP_WRITE, 0, 0, BLOCKSIZE * 2);
    struct StoredObject controlObject;
    controlObject.IntVal = 12;
    strcpy(controlObject.StrVal, "Mary had a little lamb.\n");
    struct StoredObject *mary1;
    mary1 = (struct StoredObject *)(mapPtr + BLOCKSIZE - 4); //Will fall on the boundary between first and second block
    memcpy(mary1, &controlObject, sizeof(struct StoredObject));
    printf("%d, %s", mary1->IntVal, mary1->StrVal);
    //Should print "12, Mary had a little lamb.\n"
    struct StoredObject *john1;
    john1 = mary1 + 1; //Comes immediately after mary1 in memory; will start and end in the second block
    memcpy(john1, &controlObject, sizeof(struct StoredObject));
    john1->IntVal = 42;
    strcpy(john1->StrVal, "John had a little lamb.\n");
    printf("%d, %s", john1->IntVal, john1->StrVal);
    //Should print "42, John had a little lamb.\n"
    //Make sure the data's on the disk, as this is the initial, "read-only" data
    FlushViewOfFile(mapPtr, BLOCKSIZE * 2);
    FlushFileBuffers(hFile);
    //This is the initial data set, now in memory, loaded across two blocks
    //At this point, someone could be reading from there. We don't know or care.
    //We want to modify john1, but don't want to write over the existing data
    //Find a free contiguous range by reserving it, then release it again:
    //MapViewOfFileEx cannot map into a range that is still reserved.
    //Another thread could take the range in between, so real code should retry on failure
    unsigned char *reservedMem = (unsigned char *) VirtualAlloc(NULL, BLOCKSIZE * 2, MEM_RESERVE, PAGE_NOACCESS);
    VirtualFree(reservedMem, 0, MEM_RELEASE);
    //First block of the shadow map: the file's first block, unchanged
    unsigned char *mapPtr2 = (unsigned char *) MapViewOfFileEx(hMap, FILE_MAP_READ | FILE_MAP_WRITE, 0, 0, BLOCKSIZE, reservedMem);
    //Second block: pagefile-backed memory; this is the one that we're modifying. It is *not* backed by the file
    HANDLE hOverlay = CreateFileMapping(INVALID_HANDLE_VALUE, NULL, PAGE_READWRITE, 0, BLOCKSIZE, NULL);
    unsigned char *temp = (unsigned char *) MapViewOfFileEx(hOverlay, FILE_MAP_READ | FILE_MAP_WRITE, 0, 0, BLOCKSIZE, reservedMem + BLOCKSIZE);
    if (mapPtr2 == NULL || temp == NULL)
    {
        printf("Fixed map failed. 0x%lx\n", (unsigned long) GetLastError());
        return -1;
    }
    assert(temp == mapPtr2 + BLOCKSIZE);
    //Make a copy of the old data that will later be changed
    memcpy(mapPtr2 + BLOCKSIZE, mapPtr + BLOCKSIZE, BLOCKSIZE);
    //The two address spaces should still be identical until this point
    assert(memcmp(mapPtr, mapPtr2, BLOCKSIZE * 2) == 0);
    //We can now make our changes to the second block as needed
    struct StoredObject *mary2 = (struct StoredObject *)(((unsigned char *)mary1 - mapPtr) + mapPtr2);
    struct StoredObject *john2 = (struct StoredObject *)(((unsigned char *)john1 - mapPtr) + mapPtr2);
    john2->IntVal = 52;
    strcpy(john2->StrVal, "Mike had a little lamb.\n");
    //Test that everything worked OK
    assert(memcmp(mary1, mary2, sizeof(struct StoredObject)) == 0);
    printf("%d, %s", john2->IntVal, john2->StrVal);
    //Should print "52, Mike had a little lamb.\n"
    //Now assume our garbage collection routine has detected that no one is using the original copy of the data
    UnmapViewOfFile(mapPtr);
    mapPtr = mapPtr2;
    //Now we're done with all our work and want to completely clean up
    UnmapViewOfFile(mapPtr2);
    UnmapViewOfFile(temp);
    CloseHandle(hOverlay);
    CloseHandle(hMap);
    CloseHandle(hFile);
    return 0;
}