How do I run perl scripts in parallel and capture their output in files?

dc_*_*erd · 2 · Tags: parallel-processing, perl, redirect, background, stdout

I need to run Perl tests in parallel and capture STDOUT and STDERR in a separate file for each test file. I haven't even managed to capture them in a single file. I've been at this for a while with no luck. This is where I started (I'll spare you all the variations). Any help is greatly appreciated. Thanks!

foreach my $file ( @files) {
    next unless $file =~ /\.t$/;
    print "\$file = $file\n";

    $file =~ /^(\w+)\.\w+/;
    
    my $file_pfx = $1;
    my $this_test_file_name = $file_pfx . '.txt';
    
    system("perl $test_dir\\$file > results\\$this_test_file_name &") && die "cmd failed: $!\n";

}

zdi*_*dim 5

Here is a simple demo of using Parallel::ForkManager to spawn separate processes.


In each process, both the STDOUT and STDERR streams are redirected, in two ways for demonstration: STDOUT is redirected to a variable, which can then be passed around as needed (here it is dumped into a file), while STDERR is redirected directly to a file. Alternatively, use a library; an example is given in a separate snippet.


The numbers 1..6 stand for batches of data that each child will pick up to process. Only three processes are started right away; then, as one finishes, another is started in its place.† (Here they exit practically right away, since the "work" is trivial.)

use warnings;
use strict;
use feature 'say';

use Carp qw(carp);
use Path::Tiny qw(path);
use Parallel::ForkManager;

my $pm = Parallel::ForkManager->new(3);

foreach my $data (1..6) {
    $pm->start and next;     # start a child process
    proc_in_child($data);    # code that runs in the child process
    $pm->finish;             # exit it
}
$pm->wait_all_children;      # reap all child processes

say "\nParent $$ done\n";

sub proc_in_child {
    my ($data) = @_;
    say "Process $$ with data $data";  # still shows on terminal

    # Will dump all that was printed to streams to these files
    my ($outfile, $errfile) =
        map { "proc_data-${data}_" . $_ . ".$$.out" } qw(stdout stderr);

    # Redirect streams
    # One way to do it, redirect to a variable (for STDOUT)...
    open my $fh_stdout, ">", \my $so or carp "Can't open handle to variable: $!";
    my $fh_STDOUT = select $fh_stdout;
    # ...another way to do it, directly to a file (for any stream)
    # (first 'dup' it so it can be restored if needed)
    open my $SAVEERR, ">&STDERR"  or carp "Can't dup STDERR: $!";
    open *STDERR, ">", $errfile or carp "Can't redirect STDERR to $errfile: $!";

    # Prints wind up in a variable (for STDOUT) and a file (for STDERR)
    say  "STDOUT: Child process with pid $$, processing data #$data";
    warn "STDERR: Child process with pid $$, processing data #$data";

    close $fh_stdout;
    # If needed to restore (not in this example which exits right away)
    select $fh_STDOUT;
    open STDERR, '>&', $SAVEERR  or carp "Can't reopen STDERR: $!";

    # Dump all collected STDOUT to a file (or pass it around, it's a variable)
    path( $outfile )->spew($so);

    return 1;
}

While STDOUT is redirected to a variable here, STDERR cannot be redirected that way, so it goes directly to a file. There are, however, ways to capture it in a variable as well.
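One such way, for Perl-level writes (warn, print STDERR), is to reopen STDERR to an in-memory filehandle backed by a scalar. A minimal sketch, with the caveat that this does not catch output from external programs, which write to the real file descriptor:

```perl
use warnings;
use strict;

# Save the real STDERR so it can be restored later
open my $SAVEERR, '>&', \*STDERR or die "Can't dup STDERR: $!";

# Reopen STDERR onto a scalar; warn/print STDERR now land in $captured
open STDERR, '>', \my $captured or die "Can't redirect STDERR: $!";

warn "this goes into the variable\n";

# Restore the original STDERR
open STDERR, '>&', $SAVEERR or die "Can't restore STDERR: $!";

print "captured: $captured";
```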


You can then use the module's facility for returning data from a child process to the parent, which can then work with those variables. For examples, see this post, this post, and this post. (There are more; these are just the ones I know of.) Or indeed just dump them to files, as is done here.
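That facility in Parallel::ForkManager works by having the child pass a reference to finish, which the parent then receives in a run_on_finish callback. A minimal sketch:

```perl
use warnings;
use strict;
use feature 'say';
use Parallel::ForkManager;

my $pm = Parallel::ForkManager->new(3);

# Collect what each child sends back, keyed by the child's pid
my %results;
$pm->run_on_finish( sub {
    my ($pid, $exit, $ident, $signal, $core, $data_ref) = @_;
    $results{$pid} = $data_ref if defined $data_ref;
});

foreach my $data (1..6) {
    $pm->start and next;
    my $output = "child $$ processed $data";
    # Pass a reference back to the parent (serialized behind the scenes)
    $pm->finish(0, { data => $data, output => $output });
}
$pm->wait_all_children;

say "$_ => $results{$_}{output}" for sort keys %results;
```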


Another way is to use a module that can run code and redirect its output, like Capture::Tiny.

use Capture::Tiny qw(capture);

sub proc_in_child {
    my ($data) = @_;
    say "Process $$ with data $data";  # on terminal

    # Run code and capture all output
    my ($stdout, $stderr, @results) = capture {
          say  "STDOUT: Child process $$, processing data #$data";
          warn "STDERR: Child process $$, processing data #$data";

          # return results perhaps...
          1 .. 4;
    };

    # Do as needed with variables with collected STDOUT and STDERR
    # Return to parent, or dump to file:
    my ($outfile, $errfile) =
        map { "proc_data-${data}_" . $_ . ".$$.out" } qw(stdout stderr);

    path($outfile)->spew($stdout);
    path($errfile)->spew($stderr);

    return 1;
}

† This keeps the same number of processes running at all times. Alternatively, it can be set up to wait for the whole batch to finish and then start another batch. For some operational details see this post.
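The batch-wise alternative can be sketched along these lines (a hypothetical sketch; the data and batch size are illustrative):

```perl
use warnings;
use strict;
use Parallel::ForkManager;

my @data = (1..10);
my $batch_size = 3;
my $pm = Parallel::ForkManager->new($batch_size);

while (@data) {
    # Take the next batch and start a child for each item in it
    my @batch = splice @data, 0, $batch_size;
    for my $item (@batch) {
        $pm->start and next;
        # ... do the work for $item in the child ...
        $pm->finish;
    }
    # Block until this whole batch is done before starting the next one
    $pm->wait_all_children;
}
```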
