Question about parallelizing a loop with MPI

tim*_*tim 2 parallel-processing fortran loops mpi openmp

Hey there, I have a short question about OpenMPI with Fortran. I have code like this:

I) definitions of vars & linear code, setting up some vars for later usage
II) a while loop which works like that in pseudocode:

nr=1
while(true)
{
  filename='result'//nr//'.bin' (nr converted to string)
  if(!file_exists(filename))
    goto 100

  // file exists... so do something with it
  // calculations, read/write...
  nr=nr+1
}
100 continue
III) some more linear code...

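Now I want to parallelize this with OpenMPI. The linear code from I) and III) should be computed only once, while the while loop should run on several processors. What is the best way to implement that? My problem is how the while loop works: for example, while processor 1 is computing result1.bin, how do I tell processor 2 to go straight to result2.bin? If there are 30 files and I run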

mpirun -n 10 my_program

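how would that work? How does MPI "know" that, after it has finished one file, there are more files "waiting" to be processed? As soon as a processor finishes a file, it should immediately start on the next file in the queue.

Thanks so far!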

EDIT:

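Hey there, it's me again. I wanted to give OpenMP a try as well, so I used a chunk of your code to count the existing files and then loop over them (and process them):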

nfiles = 0
do
  write(filename,FMT='(A,I0,A)') prefix, nfiles+1, suffix
  inquire(file=trim(filename),exist=exists)
  if (.not. exists) exit
  nfiles = nfiles + 1
enddo

Then I tried the following code:

call omp_set_num_threads(2)
!$OMP PARALLEL
!$OMP DO 
do i=startnum, endnum
  write(filename,FMT='(A,I0,A)') prefix, i, suffix
  ...CODE DIRECTLY HERE TO PROCESS THE FILE...
enddo
!$OMP END DO
!$OMP END PARALLEL

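But it always gives me this error: "It is illegal to branch out of a DO loop associated with an Open MP DO or PARALLEL DO directive."

The error always points at lines of code like this one: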

read (F_RESULT,*,ERR=1) variable

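where F_RESULT is a file handle. What could be wrong with it? The variable is defined outside the loop block, and I have already tried adding this clause to the OpenMP directive: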

private(variable) 

so that each thread has its own copy, but that did not help! Thanks for your help!

Jon*_*rsi 5

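Probably the most sensible approach is to have one of the processes count the total number of files beforehand, broadcast that number, and then have every process handle "its" files: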

program processfiles
    use mpi
    implicit none

    integer :: rank, comsize, ierr
    integer :: nfiles
    character(len=6) :: prefix="result"
    character(len=4) :: suffix=".bin"
    character(len=50) :: filename
    integer :: i
    integer :: locnumfiles, startnum, endnum
    logical :: exists

    call MPI_Init(ierr)
    call MPI_Comm_size(MPI_COMM_WORLD, comsize, ierr)
    call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

    ! rank zero finds number of files
    if (rank == 0) then
       nfiles = 0
       do
           write(filename,FMT='(A,I0,A)') prefix, nfiles+1, suffix
           inquire(file=trim(filename),exist=exists)
           if (.not. exists) exit
           nfiles = nfiles + 1
       enddo
    endif
    ! make sure everyone knows
    call MPI_Bcast(nfiles, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)

    if (nfiles /= 0) then
        ! calculate who gets what file
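        ! block size per rank, rounded up
        locnumfiles = nfiles/comsize
        if (locnumfiles * comsize /= nfiles) locnumfiles = locnumfiles + 1
        startnum = locnumfiles * rank + 1
        ! clamp on every rank, not only the last one, so that no rank
        ! runs past nfiles when nfiles is not a multiple of comsize
        endnum = min(startnum + locnumfiles - 1, nfiles)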
        do i=startnum, endnum
           write(filename,FMT='(A,I0,A)') prefix, i, suffix
           call processfile(rank,filename)
        enddo
    else
        if (rank == 0) then
            print *,'No files found; exiting.'
        endif
    endif
    call MPI_Finalize(ierr)

    contains
        subroutine processfile(rank,filename)
            implicit none
            integer, intent(in) :: rank
            character(len=*), intent(in) :: filename
            integer :: unitno
            open(newunit=unitno, file=trim(filename))
            print '(I4,A,A)',rank,': Processing file ', filename
            close(unitno)
        end subroutine processfile
end program processfiles

Then a simple test:

$ seq 1 33 | xargs -I num touch "result"num".bin"
$ mpirun -np 2 ./processfiles

   0: Processing file result1.bin                                       
   0: Processing file result2.bin                                       
   0: Processing file result3.bin                                       
   0: Processing file result4.bin                                       
   0: Processing file result5.bin                                       
   0: Processing file result6.bin                                       
   1: Processing file result18.bin                                      
   0: Processing file result7.bin                                       
   0: Processing file result8.bin                                       
   1: Processing file result19.bin                                      
   0: Processing file result9.bin                                       
   1: Processing file result20.bin                                      
   0: Processing file result10.bin                                      
   1: Processing file result21.bin                                      
   1: Processing file result22.bin                                      
   0: Processing file result11.bin                                      
   1: Processing file result23.bin                                      
   0: Processing file result12.bin                                      
   1: Processing file result24.bin                                      
   1: Processing file result25.bin                                      
   0: Processing file result13.bin                                      
   0: Processing file result14.bin                                      
   1: Processing file result26.bin                                      
   1: Processing file result27.bin                                      
   0: Processing file result15.bin                                      
   0: Processing file result16.bin                                      
   1: Processing file result28.bin                                      
   1: Processing file result29.bin                                      
   1: Processing file result30.bin                                      
   0: Processing file result17.bin                                      
   1: Processing file result31.bin                                      
   1: Processing file result32.bin                                      
   1: Processing file result33.bin  
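The split in the test output (files 1 to 17 on rank 0, files 18 to 33 on rank 1) comes from the block arithmetic in the program. Here is a minimal sketch of that arithmetic on its own, without MPI, so the ranges can be checked for any combination. The program name `show_ranges` and the helper `print_ranges` are made up for this illustration, and the upper bound is clamped with `min` on every rank, which is an assumption about how uneven cases should be handled:

```fortran
program show_ranges
    implicit none

    call print_ranges(33, 2)   ! the case from the test run
    call print_ranges(5, 4)    ! an uneven case: fewer files than 2 per rank

contains

    ! hypothetical helper: prints the file range each rank would get
    subroutine print_ranges(nfiles, comsize)
        integer, intent(in) :: nfiles, comsize
        integer :: rank, locnumfiles, startnum, endnum

        ! block size per rank, rounded up
        locnumfiles = (nfiles + comsize - 1) / comsize

        print '(A,I0,A,I0,A)', 'nfiles=', nfiles, ', comsize=', comsize, ':'
        do rank = 0, comsize - 1
            startnum = locnumfiles * rank + 1
            ! clamp so that no rank runs past the last file
            endnum = min(startnum + locnumfiles - 1, nfiles)
            if (startnum > endnum) then
                print '(A,I0,A)', '  rank ', rank, ': no files'
            else
                print '(A,I0,A,I0,A,I0)', '  rank ', rank, ': files ', startnum, ' to ', endnum
            end if
        end do
    end subroutine print_ranges

end program show_ranges
```

With 5 files on 4 ranks the block size is 2, so ranks 0 and 1 get two files each, rank 2 gets only file 5, and rank 3 gets nothing. Without the clamp, rank 2 would try to open a sixth file that does not exist.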

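Updated to address the follow-up OpenMP question:

So that first loop is where the files are counted, before the parallel processing of the files begins. The files have to be counted first because otherwise the work cannot be divided between the processors; you need to know how many "units of work" there will be before you split the work up. (This is by no means the only way to do it, but it is the most straightforward.)

Similarly, OMP DO loops require a very structured loop: there has to be a simple loop such as do i=1,n, which can then easily be broken up between threads. n does not have to be a compile-time constant, and the increment does not even have to be 1, but the trip count has to be something that can be determined before the loop actually starts executing. So, for instance, you cannot leave the loop because of some external condition (such as a file not existing).

So what you want to do with OpenMP is the same file count, left exactly as it is, and then use a parallel do construct in the processing loop. After stripping out the MPI, you end up with something like this: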
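One of the other ways is the queue-like behaviour the question asks about: a free process immediately picks up the next file. MPI does not do this by itself, but it can be built with a master/worker pattern. The following is only a sketch of that pattern, not part of the program above. It assumes the file count is already known (hard-coded here as 30), that rank 0 does nothing but hand out file numbers, and that a file number of 0 means "no more work". The tag values and the `processfile` stub are made up for the illustration:

```fortran
program dynamic_files
    use mpi
    implicit none

    integer, parameter :: tag_request = 1, tag_work = 2   ! arbitrary message tags
    integer, parameter :: nfiles = 30                     ! assumed: already counted
    integer :: rank, comsize, ierr
    integer :: next, filenum, worker, nstopped, dummy
    integer :: status(MPI_STATUS_SIZE)

    call MPI_Init(ierr)
    call MPI_Comm_size(MPI_COMM_WORLD, comsize, ierr)
    call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

    if (comsize == 1) then
        ! no workers available: process everything serially
        do filenum = 1, nfiles
            call processfile(rank, filenum)
        enddo
    else if (rank == 0) then
        ! master: give the next file number to whichever worker asks first
        next = 1
        nstopped = 0
        do while (nstopped < comsize - 1)
            call MPI_Recv(dummy, 1, MPI_INTEGER, MPI_ANY_SOURCE, tag_request, &
                          MPI_COMM_WORLD, status, ierr)
            worker = status(MPI_SOURCE)
            if (next <= nfiles) then
                filenum = next
                next = next + 1
            else
                filenum = 0                 ! 0 means "no more work"
                nstopped = nstopped + 1
            endif
            call MPI_Send(filenum, 1, MPI_INTEGER, worker, tag_work, &
                          MPI_COMM_WORLD, ierr)
        enddo
    else
        ! worker: ask for work, process it, repeat until told to stop
        do
            call MPI_Send(rank, 1, MPI_INTEGER, 0, tag_request, MPI_COMM_WORLD, ierr)
            call MPI_Recv(filenum, 1, MPI_INTEGER, 0, tag_work, MPI_COMM_WORLD, &
                          MPI_STATUS_IGNORE, ierr)
            if (filenum == 0) exit
            call processfile(rank, filenum)
        enddo
    endif

    call MPI_Finalize(ierr)

contains

    ! stub standing in for the real per-file work
    subroutine processfile(rank, filenum)
        integer, intent(in) :: rank, filenum
        character(len=50) :: filename
        write(filename, '(A,I0,A)') 'result', filenum, '.bin'
        print '(I4,A,A)', rank, ': Processing file ', trim(filename)
    end subroutine processfile

end program dynamic_files
```

The trade-off is that rank 0 spends its time dispatching rather than processing, so with `mpirun -n 10` only nine processes work on files. That usually only pays off when the files take very different amounts of time; if they are all about the same size, the static split is simpler and uses every process.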
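The same rule would explain the compiler error in the question, assuming that label 1 in `read (F_RESULT,*,ERR=1)` sits outside the parallel loop: on a read error, `ERR=1` jumps to that label and so branches out of the loop. One possible way around it is to use `iostat=` and handle the error inside the iteration, where `cycle` is allowed because it stays within the loop. A minimal sketch follows; the file names, the fixed count of 4, and the variable names are assumptions for illustration only:

```fortran
program read_with_iostat
    implicit none

    integer :: i, unitno, ios
    real :: variable
    character(len=50) :: filename

    ! everything that is written inside the loop is private to each thread
    !$OMP PARALLEL DO PRIVATE(unitno, ios, variable, filename)
    do i = 1, 4
        write(filename, '(A,I0,A)') 'result', i, '.bin'

        open(newunit=unitno, file=trim(filename), status='old', &
             action='read', iostat=ios)
        if (ios /= 0) then
            print '(A,A)', 'Could not open ', trim(filename)
            cycle                       ! skips this iteration, stays in the loop
        end if

        ! iostat replaces ERR=label, so no branch leaves the loop
        read(unitno, *, iostat=ios) variable
        if (ios /= 0) then
            print '(A,A)', 'Read error in ', trim(filename)
        else
            print '(A,A,A,G0)', 'First value in ', trim(filename), ': ', variable
        end if

        close(unitno)
    end do
    !$OMP END PARALLEL DO

end program read_with_iostat
```

Each thread opens its own unit through `newunit=`, so the threads do not share a file handle. Compiled without OpenMP support, the directives are plain comments and the loop simply runs serially.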

    ! count the files serially, exactly as before
    nfiles = 0
    do
        write(filename,FMT='(A,I0,A)') prefix, nfiles+1, suffix
        inquire(file=trim(filename),exist=exists)
        if (.not.exists) exit
        nfiles = nfiles + 1
    enddo

    if (nfiles /= 0) then
        ! needs "use omp_lib" and "integer :: thread" among the declarations
        !$OMP PARALLEL SHARED(nfiles,prefix,suffix) PRIVATE(i,thread,filename)
        thread = omp_get_thread_num()
        !$OMP DO
        do i=1, nfiles
           write(filename,FMT='(A,I0,A)') prefix, i, suffix
           call processfile(thread,filename)
        enddo
        !$OMP END DO
        !$OMP END PARALLEL 
    else
        print *,'No files found; exiting.'
    endif

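But everything else is the same. And again, if you want to process the files "inline" (i.e., not in a subroutine), you put the file-processing code where the 'call processfile()' line is.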
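If the files take very different amounts of time, the OpenMP version can also get the "take the next file as soon as you are free" behaviour from the question by adding a `schedule` clause. This is a sketch of that variant rather than a change to the code above. The file count is hard-coded to 8 as an assumption, and the file is only named, not opened:

```fortran
program dynamic_schedule
    use omp_lib
    implicit none

    integer, parameter :: nfiles = 8    ! assumed: already counted
    integer :: i, thread
    character(len=50) :: filename

    ! SCHEDULE(DYNAMIC,1): each thread takes one iteration at a time and
    ! fetches the next unprocessed one when it finishes
    !$OMP PARALLEL DO SCHEDULE(DYNAMIC,1) PRIVATE(thread, filename)
    do i = 1, nfiles
        thread = omp_get_thread_num()
        write(filename, '(A,I0,A)') 'result', i, '.bin'
        print '(I4,A,A)', thread, ': Processing file ', trim(filename)
    end do
    !$OMP END PARALLEL DO

end program dynamic_schedule
```

Build it with OpenMP enabled (for example `gfortran -fopenmp`). Which thread ends up with which file can change from run to run, which is the point of the dynamic schedule.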