MPI_Waitall is failing

use*_*217 5 c parallel-processing mpi

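I wonder if anyone can shed some light on the MPI_Waitall function for me. I have a program that passes information using MPI_Isend and MPI_Irecv. After all the sends and receives are complete, one process in the program (in this case, process 0) prints a message. My Isend/Irecv calls are working, but the message prints at some random point in the program, so I am trying to use MPI_Waitall to wait until all the requests are done before printing it. I receive the following error message: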

Fatal error in PMPI_Waitall: Invalid MPI_Request, error stack:
PMPI_Waitall(311): MPI_Waitall(count=16, req_array=0x16f70d0, status_array=0x16f7260) failed
PMPI_Waitall(288): The supplied request in array element 1 was invalid (kind=0)

Here is some of the relevant code:

MPI_Status *status;
MPI_Request *request;

MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &taskid);
MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
status = (MPI_Status *) malloc(numtasks * sizeof(MPI_Status));
request = (MPI_Request *) malloc(numtasks * sizeof(MPI_Request));

/* Generate Data to send */
//Isend/Irecvs look like this:
MPI_Isend(&data, count, MPI_INT, dest, tag, MPI_COMM_WORLD, &request[taskid]);
MPI_Irecv(&data, count, MPI_INT, source, tag, MPI_COMM_WORLD, &request[taskid]);
MPI_Wait(&request[taskid], &status[taskid]);

/* Calculations and such */

if (taskid == 0) {
        MPI_Waitall (numtasks, request, status);
        printf ("All done!\n");
}
MPI_Finalize();

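Without the call to MPI_Waitall, the program runs cleanly, but the "All done" message prints as soon as process 0's Isend/Irecv messages complete, rather than after all of the Isend/Irecv operations complete.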

Thank you for any help you can provide.

Hri*_*iev 11

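You are setting only one element of the request array, namely request[taskid] (and in doing so you overwrite the send request handle with the receive one, irrevocably losing the former). Remember that MPI is used to program distributed-memory machines, and each MPI process has its own copy of the request array. Setting one element in rank taskid does not magically propagate the value to the other ranks, and even if it did, requests have only local validity. A proper implementation would be: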

MPI_Status status[2];
MPI_Request request[2];

MPI_Init(&argc, &argv);
MPI_Comm_rank (MPI_COMM_WORLD, &taskid);
MPI_Comm_size (MPI_COMM_WORLD, &numtasks);

/* Generate Data to send */
//Isend/Irecvs look like this:
MPI_Isend (&data, count, MPI_INT, dest, tag, MPI_COMM_WORLD, &request[0]);
//          ^^^^
//           ||
//      data race !!
//           ||
//          vvvv
MPI_Irecv (&data, count, MPI_INT, source, tag, MPI_COMM_WORLD, &request[1]);
// Wait for both operations to complete
MPI_Waitall(2, request, status);

/* Calculations and such */

// Wait for all processes to reach this line in the code
MPI_Barrier(MPI_COMM_WORLD);

if (taskid == 0) {
  printf ("All done!\n");
}
MPI_Finalize();

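By the way, there is a data race in your code. Both MPI_Isend and MPI_Irecv use the same data buffer, which is incorrect. If you simply want to send the content of data to dest and then receive into it from source, use MPI_Sendrecv_replace instead and forget about the non-blocking operations: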

MPI_Status status;

MPI_Init(&argc, &argv);
MPI_Comm_rank (MPI_COMM_WORLD, &taskid);
MPI_Comm_size (MPI_COMM_WORLD, &numtasks);

/* Generate Data to send */
MPI_Sendrecv_replace (&data, count, MPI_INT, dest, tag, source, tag,
                      MPI_COMM_WORLD, &status);

/* Calculations and such */

// Wait for all processes to reach this line in the code
MPI_Barrier(MPI_COMM_WORLD);

if (taskid == 0) {
  printf ("All done!\n");
}
MPI_Finalize();