MPI: How to correctly use MPI_Win_allocate_shared

Red*_*a94 3 mpi mpi-rma

I want to use shared memory between processes. I tried MPI_Win_allocate_shared, but when I run the program it gives me a strange error:

Assertion failed in file ./src/mpid/ch3/include/mpid_rma_shm.h at line 592: local_target_rank >= 0 internal ABORT

Here is my source:

    # include <stdlib.h>
    # include <stdio.h>
    # include <time.h>
    
    # include "mpi.h"
    
    int main ( int argc, char *argv[] );
    void pt(int t[], int s);
    
    int main ( int argc, char *argv[] )
    {
        int rank, size, shared_elem = 0, i;
        MPI_Init ( &argc, &argv );
        MPI_Comm_rank ( MPI_COMM_WORLD, &rank );
        MPI_Comm_size ( MPI_COMM_WORLD, &size );
        MPI_Win win;
        int *shared;
        
        if (rank == 0) shared_elem = size;
        MPI_Win_allocate_shared(shared_elem*sizeof(int), sizeof(int), MPI_INFO_NULL, MPI_COMM_WORLD, &shared, &win);
        if(rank==0)
        {
            MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 0, MPI_MODE_NOCHECK, win);
            for(i = 0; i < size; i++)
            {
                shared[i] = -1;
            }
            MPI_Win_unlock(0,win);
        }
        MPI_Barrier(MPI_COMM_WORLD);
        int *local = (int *)malloc( size * sizeof(int) );
        MPI_Win_lock(MPI_LOCK_SHARED, 0, 0, win);
        for(i = 0; i < 10; i++)
        {
            MPI_Get(&(local[i]), 1, MPI_INT, 0, i,1, MPI_INT, win);
        }
        printf("processus %d (avant): ", rank);
        pt(local,size);
        MPI_Win_unlock(0,win);
        
        MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 0, 0, win);
        
        MPI_Put(&rank, 1, MPI_INT, 0, rank, 1, MPI_INT, win);
        
        MPI_Win_unlock(0,win);
        
        MPI_Win_lock(MPI_LOCK_SHARED, 0, 0, win);
        for(i = 0; i < 10; i++)
        {
            MPI_Get(&(local[i]), 1, MPI_INT, 0, i,1, MPI_INT, win);
        }
        printf("processus %d (apres): ", rank);
        pt(local,size);
        MPI_Win_unlock(0,win);
        
        
        MPI_Win_free(&win);
        MPI_Free_mem(shared);
        MPI_Free_mem(local);
        MPI_Finalize ( );
        
        return 0;
    }
    
    void pt(int t[],int s)
    {
        int i = 0;
        while(i < s)
        {
            printf("%d ",t[i]);
            i++;
        }
        printf("\n");
    }

I get the following result:

processus 0 (avant): -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 
processus 0 (apres): 0 -1 -1 -1 -1 -1 -1 -1 -1 -1 
processus 4 (avant): 0 -1 -1 -1 -1 -1 -1 -1 -1 -1 
processus 4 (apres): 0 -1 -1 -1 4 -1 -1 -1 -1 -1 
Assertion failed in file ./src/mpid/ch3/include/mpid_rma_shm.h at line 592: local_target_rank >= 0
internal ABORT - process 5
Assertion failed in file ./src/mpid/ch3/include/mpid_rma_shm.h at line 592: local_target_rank >= 0
internal ABORT - process 6
Assertion failed in file ./src/mpid/ch3/include/mpid_rma_shm.h at line 592: local_target_rank >= 0
internal ABORT - process 9

Can someone help me figure out what is going wrong and what this error means? Thanks a lot.

Hri*_*iev 5

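MPI_Win_allocate_shared is a departure from the very abstract nature of MPI. It exposes the underlying memory organisation and allows programs to bypass the expensive (and often confusing) MPI RMA operations and use the shared memory directly on systems that have it. While MPI typically deals with distributed-memory environments, where ranks do not share the physical memory address space, a typical HPC system nowadays consists of many interconnected shared-memory nodes. Ranks that execute on the same node can therefore attach to shared memory segments and communicate by sharing data instead of passing messages.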

MPI provides a communicator split operation that creates subgroups of ranks such that the ranks in each subgroup are able to share memory:

MPI_Comm_split_type(comm, MPI_COMM_TYPE_SHARED, key, info, &newcomm);

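On a typical cluster, this essentially groups the ranks by the node they execute on. Once the split is done, a shared-memory window allocation can be performed over the ranks in each newcomm. Note that for a multi-node cluster job this results in several independent newcomm communicators and therefore several shared memory windows. Ranks on one node won't (and shouldn't) be able to see the shared memory windows on other nodes.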

Seen this way, MPI_Win_allocate_shared is a platform-independent wrapper around the OS-specific mechanisms for shared memory allocation.