I want to use MPI_Iprobe to test whether a message with a given tag is already pending.
However, MPI_Iprobe does not behave the way I expected. In the example below, I send messages from multiple tasks to a single task (rank 0). On rank 0, I then wait a few seconds so that the MPI_Isends have plenty of time to complete. Yet when I then run MPI_Iprobe, it returns with flag set to false. If I repeat it after a (blocking) MPI_Probe, it returns true.
#include "mpi.h"
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    int rank;
    int numprocs;
    int tag;
    int receive_tag;
    int flag = 0;
    int number;
    int recv_number = 0;
    MPI_Request request;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);

    // rank 0 receives messages, all others send messages
    if (rank > 0) {
        number = rank;
        tag = rank;
        MPI_Isend(&number, 1, MPI_INT, 0, tag, MPI_COMM_WORLD, &request); // send to rank 0
        printf("Sending tag : %d \n", tag);
    }
    else if (rank == 0) {
        sleep(5); // [seconds] allow plenty of time for all sends from other tasks to complete
        receive_tag = 3; // just try to probe for the single message with tag 3 (sent by task 3)
        MPI_Iprobe(MPI_ANY_SOURCE, receive_tag, MPI_COMM_WORLD, &flag, &status);
        printf("After MPI_Iprobe, flag = %d \n", flag);
        MPI_Probe(MPI_ANY_SOURCE, receive_tag, MPI_COMM_WORLD, &status);
        printf("After MPI_Probe, found message with tag : %d \n", receive_tag);
        MPI_Iprobe(MPI_ANY_SOURCE, receive_tag, MPI_COMM_WORLD, &flag, &status);
        printf("After second MPI_Iprobe, flag = %d \n", flag);
        // receive all the messages
        for (int i = 1; i < numprocs; i++) {
            MPI_Recv(&recv_number, 1, MPI_INT, MPI_ANY_SOURCE, i, MPI_COMM_WORLD, &status);
            printf("Received : %d \n", recv_number);
        }
    }

    MPI_Finalize();
}
This gives the following output:
Sending tag : 4
Sending tag : 3
Sending tag : 2
Sending tag : 5
Sending tag : 1
After MPI_Iprobe, flag = 0
After MPI_Probe, found message with tag : 3
After second MPI_Iprobe, flag = 1
Received : 1
Received : 2
Received : 3
Received : 4
Received : 5
Why does MPI_Iprobe return false the first time?
Any help would be greatly appreciated!
EDIT: Following Hristo Iliev's answer, I now have the following code:
#include "mpi.h"
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    int rank;
    int numprocs;
    int tag;
    int receive_tag;
    int flag = 0;
    int number;
    int recv_number = 0;
    MPI_Request request;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);

    // rank 0 receives messages, all others send messages
    if (rank > 0) {
        number = rank;
        tag = rank;
        MPI_Isend(&number, 1, MPI_INT, 0, tag, MPI_COMM_WORLD, &request); // send to rank 0
        printf("Sending tag : %d \n", tag);
        // do stuff
        MPI_Wait(&request, &status);
        printf("Sent tag : %d \n", tag);
    }
    else if (rank == 0) {
        sleep(5); // [seconds] allow plenty of time for all sends from other tasks to complete
        receive_tag = 3; // just try to probe for the single message with tag 3 (sent by task 3)
        MPI_Iprobe(MPI_ANY_SOURCE, receive_tag, MPI_COMM_WORLD, &flag, &status);
        printf("After MPI_Iprobe, flag = %d \n", flag);
        MPI_Probe(MPI_ANY_SOURCE, receive_tag, MPI_COMM_WORLD, &status);
        printf("After MPI_Probe, found message with tag : %d \n", receive_tag);
        MPI_Iprobe(MPI_ANY_SOURCE, receive_tag, MPI_COMM_WORLD, &flag, &status);
        printf("After second MPI_Iprobe, flag = %d \n", flag);
        // receive all the other messages
        for (int i = 1; i < numprocs; i++) {
            MPI_Recv(&recv_number, 1, MPI_INT, MPI_ANY_SOURCE, i, MPI_COMM_WORLD, &status);
        }
    }

    MPI_Finalize();
}
which gives the following output:
Sending tag : 5
Sending tag : 2
Sending tag : 1
Sending tag : 4
Sending tag : 3
Sent tag : 2
Sent tag : 1
Sent tag : 5
Sent tag : 4
Sent tag : 3
After MPI_Iprobe, flag = 0
After MPI_Probe, found message with tag : 3
After second MPI_Iprobe, flag = 1
Answer by Hristo Iliev (13 upvotes):
You are using MPI_Isend to send the messages. MPI_Isend initiates an asynchronous (background) data transfer. The actual data transfer might not occur unless one of the MPI_Wait* or MPI_Test* calls is made on the request. Some MPI implementations have (or can be configured with) a background progression thread that advances the send operation even if no wait/test is performed on the request, but one should not rely on that behaviour.

Simply replace MPI_Isend with MPI_Send, or add MPI_Wait(&request, &status) right after the send (keep in mind though that MPI_Isend immediately followed by MPI_Wait is equivalent to MPI_Send).
MPI_Iprobe is intended for use in busy waiting, i.e.:
while (condition)
{
    MPI_Iprobe(..., &flag, ...);
    if (flag)
    {
        MPI_Recv(...);
        ...
    }
    // Do something, e.g. background tasks
}
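The skeleton above can be fleshed out into a complete polling loop for the receiver side of the question's program. The following is only a sketch under the assumption that the senders complete their sends (MPI_Send, or MPI_Isend + MPI_Wait):

```c
/* Sketch: rank 0 polls with MPI_Iprobe while doing other work between
 * polls, and loops until one message from every other rank has arrived. */
int remaining = numprocs - 1;
while (remaining > 0) {
    int flag = 0;
    MPI_Status status;
    MPI_Iprobe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &flag, &status);
    if (flag) {
        int value;
        /* Receive exactly the message that was probed by reusing the
         * source and tag recorded in the status object. */
        MPI_Recv(&value, 1, MPI_INT, status.MPI_SOURCE, status.MPI_TAG,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        remaining--;
    }
    /* ... do background work here between polls ... */
}
```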
Actual message transfer in real MPI implementations is a rather complex affair. Operations are usually broken into multiple parts which are then queued. Carrying out those parts is called progression, and it happens at various points inside the MPI library, e.g. when a communication call is made, or in the background if the library implements a background progression thread. A call to MPI_Iprobe certainly makes progress, but there is no guarantee that a single call is enough. The MPI standard states:

    The MPI implementation of MPI_PROBE and MPI_IPROBE needs to guarantee progress: if a call to MPI_PROBE has been issued by a process, and a send that matches the probe has been initiated by some process, then the call to MPI_PROBE will return, unless the message is received by another concurrent receive operation (that is executed by another thread at the probing process). Similarly, if a process busy waits with MPI_IPROBE and a matching message has been issued, then the call to MPI_IPROBE will eventually return flag = true unless the message is received by another concurrent receive operation.

Note the use of eventually. How exactly progression is made is implementation-specific. Compare the following outputs from 5 consecutive calls to MPI_Iprobe (your original code + a tight loop):
Open MPI 1.6.5 without a progression thread:
# Run 1
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 1
After MPI_Iprobe, flag = 1
# Run 2
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 1
After MPI_Iprobe, flag = 1
After MPI_Iprobe, flag = 1
After MPI_Iprobe, flag = 1
# Run 3
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 0
Observe that there is no consistency between multiple executions of the same MPI program, and that in run 3 the flag is still false after 5 calls to MPI_Iprobe.
Intel MPI 4.1.2:
# Run 1
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 1
After MPI_Iprobe, flag = 1
After MPI_Iprobe, flag = 1
# Run 2
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 1
After MPI_Iprobe, flag = 1
After MPI_Iprobe, flag = 1
# Run 3
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 1
After MPI_Iprobe, flag = 1
After MPI_Iprobe, flag = 1
Obviously, Intel MPI progresses differently from Open MPI.

The difference between the two implementations can be explained by the fact that MPI_Iprobe is meant to be a tiny probe, and as such it should take as little time as possible. On the other hand, progression takes time, and in a single-threaded MPI implementation the only points in time where progression can occur are calls such as MPI_Iprobe (in this particular case). Therefore the MPI implementers have to decide how much actual progression each call to MPI_Iprobe makes, striking a balance between the amount of work done per call and the time the call takes.

With MPI_Probe things are different. It is a blocking call, so it can progress continuously until a matching message (more specifically, its envelope) shows up.