SLIME/Emacs comint hangs when launching an MPI process

ptb*_*ptb 5 emacs sbcl common-lisp slime mpi

I have a simple MPI program that demonstrates my problem:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, csize;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &csize);

    printf("Hello from rank[%d/%d]\n", rank, csize);

    MPI_Finalize();
}

After compiling, I can launch the executable successfully with mpirun from the SBCL REPL:

* (uiop:run-program '("mpirun" "-np" "10" "./hello_world") :output :string)

"Hello from rank[7/10]
Hello from rank[9/10]
Hello from rank[5/10]
Hello from rank[8/10]
Hello from rank[0/10]
Hello from rank[1/10]
Hello from rank[2/10]
Hello from rank[3/10]
Hello from rank[4/10]
Hello from rank[6/10]
"
NIL
0

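However, when I run the same form from SLIME, the SLIME REPL just hangs. If I run the executable directly instead of launching it through mpirun, everything works fine: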

CL-USER> (uiop:run-program '("./hello_world")
               :output :string)
"Hello from rank[0/1]
"
NIL
0

I am using sbcl-1.4.5 and SLIME 2.20 on a Linux workstation. Does anyone have a solution to this problem, or a suggestion for where to look?

Update:

The problem stems from Emacs comint mode, which SLIME is built on. I observe the same hanging behavior if I start sbcl via make-comint-in-buffer and then call uiop:run-program.

Update 2:

After reading up on comint mode, I was able to capture some output from the hanging process. This Emacs Lisp code:

(make-comint "foo" "mpirun" nil "-np" "1" "/home/ptb/programming/c/hello_world")

produces the following warnings from the hung process:

[warn] Epoll MOD(1) on fd 14 failed.  Old events were 6; read change was 0 (none); write change was 2 (del): Bad file descriptor
[warn] Epoll MOD(4) on fd 14 failed.  Old events were 6; read change was 2 (del); write change was 0 (none): Bad file descriptor

Any ideas about what this means?

Sva*_*nte 2

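My guess is that this is a problem in openmpi or libevent with redirected stdin/stdout (there have been such problems in the past, e.g. bugzilla.redhat.com/show_bug.cgi?id=1235044). Which versions of these are you using?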