Trying out a simple MPI_Scatter

Jie*_*eng 2 c++ mpi

I am just learning OpenMPI and tried a simple MPI_Scatter example:

#include <mpi.h>
#include <iostream>

using namespace std;

int main() {
    int numProcs, rank;

    MPI_Init(NULL, NULL);
    MPI_Comm_size(MPI_COMM_WORLD, &numProcs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int* data;
    int num;

    data = new int[5];
    data[0] = 0;
    data[1] = 1;
    data[2] = 2;
    data[3] = 3;
    data[4] = 4;
    MPI_Scatter(data, 5, MPI_INT, &num, 5, MPI_INT, 0, MPI_COMM_WORLD);
    cout << rank << " received " << num << endl;

    MPI_Finalize();
    return 0;
}

But it did not work as expected...

I was expecting something like

0 received 0
1 received 1 
2 received 2 ... 

but what I got was

32609 received 
1761637486 received 
1 received 
33 received 
1601007716 received 

Where are these weird ranks coming from? They seem to be related to my scatter somehow? Also, why do sendcount and recvcount have to be the same? At first I thought that since I am scattering 5 elements to 5 processors, each one would get 1, so I should use:

MPI_Scatter(data, 5, MPI_INT, &num, 1, MPI_INT, 0, MPI_COMM_WORLD);

But this gives an error:

[JM:2861] *** An error occurred in MPI_Scatter
[JM:2861] *** on communicator MPI_COMM_WORLD
[JM:2861] *** MPI_ERR_TRUNCATE: message truncated
[JM:2861] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort

I am wondering though, why do I need to differentiate between root and child processes? It seems like in this case the source/root will also get a copy? Another thing: will the other processes run the scatter too? Probably not, but why? I thought all processes would run this code, since it is not inside the typical if that I see in MPI programs:

if (rank == xxx) {

UPDATE

I noticed when running it that the send and receive buffers must have the same length... and that the data should be declared like:

int data[5][5] = { {0}, {5}, {10}, {3}, {4} };

Notice that each row is declared with length 5, but I only initialized one value per row? What is actually happening here? Is this code correct, assuming I only want each process to receive a single value?

nha*_*tdh 5

sendcount is the number of elements to be sent to each process, not the total number of elements in the send buffer. MPI_Scatter will take sendcount * [number of processes in the communicator] elements from the send buffer on the root process and scatter them across all processes in the communicator.

So, to send 1 element to each process in the communicator (assuming there are 5 processes), set both sendcount and recvcount to 1:

MPI_Scatter(data, 1, MPI_INT, &num, 1, MPI_INT, 0, MPI_COMM_WORLD);

There are restrictions on the possible datatype pairs, and they are the same as for point-to-point operations: the type map of recvtype should be compatible with the type map of sendtype, i.e. they should have the same list of underlying basic datatypes. Also, the receive buffer should be large enough to hold the received message (it may be larger, but never smaller). In most simple cases the datatypes on the sending and receiving side are the same, so the sendcount-recvcount pair and the sendtype-recvtype pair usually end up identical. An example where they can differ is when a user-defined datatype is used on either side:

MPI_Datatype vec5int;

// build a derived datatype that covers 5 contiguous MPI_INTs
MPI_Type_contiguous(5, MPI_INT, &vec5int);
MPI_Type_commit(&vec5int);

// the root sends 5 MPI_INTs per process; each receiver reads them back
// as a single vec5int
MPI_Scatter(data, 5, MPI_INT, local_data, 1, vec5int, 0, MPI_COMM_WORLD);

This works because the sender constructs a message of 5 elements of type MPI_INT, while each receiver interprets the message as a single instance of a 5-element integer vector.

(Note that with MPI_Recv you specify the maximum number of elements to receive, and the number actually received may be smaller; it can be obtained with MPI_Get_count. In contrast, with MPI_Scatter you supply the exact number of elements expected in recvcount, so an error is thrown if the length of the received message is not exactly as promised.)
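As a small sketch of that difference (with hypothetical buffer size and source rank, inside an already initialized MPI program):

int buf[10];          // room for up to 10 ints
MPI_Status status;
int actual;

// the count here is only an upper bound; the sender may send fewer elements
MPI_Recv(buf, 10, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);

// find out how many elements actually arrived
MPI_Get_count(&status, MPI_INT, &actual);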

You probably realize by now that the weird ranks printed out are caused by stack corruption: num can only hold 1 int, but 5 ints were received by MPI_Scatter.
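Putting it together, a minimal corrected version of the original program might look like this (a sketch that assumes the job is started with exactly 5 processes, e.g. with mpirun -np 5):

#include <mpi.h>
#include <iostream>

using namespace std;

int main() {
    int numProcs, rank;

    MPI_Init(NULL, NULL);
    MPI_Comm_size(MPI_COMM_WORLD, &numProcs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    // one element per process; assumes the job runs with 5 processes
    int data[5] = { 0, 1, 2, 3, 4 };
    int num;

    // sendcount = recvcount = 1: each process receives exactly one int
    MPI_Scatter(data, 1, MPI_INT, &num, 1, MPI_INT, 0, MPI_COMM_WORLD);
    cout << rank << " received " << num << endl;

    MPI_Finalize();
    return 0;
}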

I am wondering though, why do I need to differentiate between root and child processes? Seems like in this case, the source/root will also get a copy? Another thing is, will other processes run scatter too? Probably not, but why? I thought all processes would run this code since it's not in the typical if I see in MPI programs?

It is necessary to differentiate between the root and the other processes in the communicator (they are not child processes of the root, since they may be running on separate computers) in some operations such as Scatter and Gather, since these are collective (group) communications but with a single source/destination. The single source/destination (the odd one out) is therefore called the root. All of the processes need to know the source/destination (the root process) in order to set up the sends and receives correctly.

The root process, in the case of Scatter, will also receive a piece of the data (from itself), and in the case of Gather, will also include its own data in the final result. There is no exception for the root process, unless "in place" operations are used. This applies to all collective communication functions.
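For instance, MPI_IN_PLACE can be passed as the receive buffer at the root of a Scatter, in which case the root's share simply stays where it is in the send buffer (a sketch, reusing the variables from the corrected example above):

if (rank == 0) {
    // at the root, the receive arguments are ignored and the root's
    // portion of data is left in place in the send buffer
    MPI_Scatter(data, 1, MPI_INT, MPI_IN_PLACE, 1, MPI_INT, 0, MPI_COMM_WORLD);
} else {
    // the send arguments are ignored on non-root processes
    MPI_Scatter(NULL, 1, MPI_INT, &num, 1, MPI_INT, 0, MPI_COMM_WORLD);
}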

There are also root-less collective communication operations like MPI_Allgather, where no root rank is provided; instead, all ranks receive the gathered data.
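A small sketch of such a root-less operation, where every rank contributes its own rank number and every rank ends up with the full gathered array (assuming numProcs and rank from the earlier example):

int mine = rank;
int* all = new int[numProcs];

// no root argument: every process both contributes and receives
MPI_Allgather(&mine, 1, MPI_INT, all, 1, MPI_INT, MPI_COMM_WORLD);

// now all[i] == i on every rank
delete[] all;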

All processes in the communicator will run the function (try to exclude one process in the communicator and you will get a deadlock). You can imagine processes on different computers running the same code blindly. However, since each of them may belong to a different communicator group and hold a different rank, the function will run differently in each. Every process knows whether it is a member of the communicator, and each knows its own rank and can compare it to the rank of the root process (if any), so they can set up the communication or take extra actions accordingly.
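For example, guarding a collective call by rank is the classic way to produce such a deadlock (a sketch of what not to do, reusing the variables from above):

// WRONG: only the root reaches the collective call; the other ranks
// in MPI_COMM_WORLD never make the matching call and the program hangs
if (rank == 0) {
    MPI_Scatter(data, 1, MPI_INT, &num, 1, MPI_INT, 0, MPI_COMM_WORLD);
}

// RIGHT: every process in the communicator makes the call
MPI_Scatter(data, 1, MPI_INT, &num, 1, MPI_INT, 0, MPI_COMM_WORLD);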