How do I open an array of files in Perl?

Question

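In Perl, I read the files in a directory and want to open them all at the same time (but read them line by line), so that I can run a function that uses all the n-th lines together (for example, concatenation).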

my $text = `ls | grep ".txt"`;
my @temps = split(/\n/,$text);
my @files;
for my $i (0..$#temps) {
  my $file;
  open($file,"<",$temps[$i]);
  push(@files,$file);
}
my $concat;
for my $i (0..$#files) {
  my @blah = <$files[$i]>;
  $concat.=$blah;
}
print $concat;

All I get is a bunch of errors: "use of uninitialized value" warnings and GLOB(..) errors. So how can I do this?

Answer 1

小智 15

There are quite a few problems, starting with the call to "ls | grep" :)

Let's start with some code.

First, let's get the list of files:

my @files = glob( '*.txt' );

But it would be better to test whether each name refers to a plain file rather than a directory:

my @files = grep { -f } glob( '*.txt' );

Now, let's open these files for reading:

my @fhs = map { open my $fh, '<', $_; $fh } @files;

But we need a way to handle errors. In my opinion, the best way is to add:

use autodie;

at the beginning of the script (and install autodie, if you don't have it yet). Alternatively, you can use:

use Fatal qw( open );
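
If neither module is an option, the same check can be written by hand. The following is only a minimal sketch: the helper name open_all is hypothetical and not part of the original answer. It dies with the file name and the system error message as soon as one open fails.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): open every named
# file for reading and return the list of filehandles. Dies with the
# file name and the OS error message if any open fails.
sub open_all {
    my @names = @_;
    my @fhs;
    for my $name (@names) {
        open(my $fh, '<', $name)
            or die "Cannot open '$name' for reading: $!\n";
        push @fhs, $fh;
    }
    return @fhs;
}
```

With this helper, the map in the previous step could be replaced by my @fhs = open_all(@files);.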

Now that we have that, let's get the first line from each input (as you showed in your example) and concatenate them:

my $concatenated = '';

for my $fh ( @fhs ) {
    my $line = <$fh>;
    $concatenated .= $line;
}

This is perfectly fine and readable, but it can still be shortened while staying (in my opinion) readable:

my $concatenated = join '', map { scalar <$_> } @fhs;

The effect is the same: $concatenated contains the first line of every file.

So, the whole program would look like this:

#!/usr/bin/perl
use strict;
use warnings;
use autodie;
# use Fatal qw( open ); # uncomment if you don't have autodie

my @files        = grep { -f } glob( '*.txt' );
my @fhs          = map { open my $fh, '<', $_; $fh } @files;
my $concatenated = join '', map { scalar <$_> } @fhs;

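Now, you might want to concatenate not only the first lines, but all of them. In that case, instead of the $concatenated = ... code, you'll need something like this: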

my $concatenated = '';

while (my $fh = shift @fhs) {
    my $line = <$fh>;
    if ( defined $line ) {
        push @fhs, $fh;
        $concatenated .= $line;
    } else {
        close $fh;
    }
}
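
The loop above interleaves all lines into one string. Since the question asks for a function applied to all n-th lines together, a variant that hands over one "row" at a time may also be useful. This is only a rough sketch: the helper name each_row and its callback interface are assumptions, not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: in each round, read one line from every handle and
# pass the chomped lines of that round to $callback. Handles that reach
# end-of-file are dropped from later rounds; the loop ends once every
# handle is exhausted.
sub each_row {
    my ($callback, @fhs) = @_;
    while (@fhs) {
        my (@row, @still_open);
        for my $fh (@fhs) {
            my $line = <$fh>;
            next unless defined $line;    # this handle is exhausted
            chomp $line;
            push @row, $line;
            push @still_open, $fh;
        }
        $callback->(@row) if @row;
        @fhs = @still_open;
    }
    return;
}
```

For example, each_row(sub { print join(' ', @_), "\n" }, @fhs); would print the n-th lines of all files side by side, one output line per n.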

Answer 2

Chr*_*utz 8

Here's your problem:

for my $i (0..$#files) {
  my @blah = <$files[$i]>;
  $concat .= $blah;
}

First, <$files[$i]> is not a valid filehandle read. This is the source of your GLOB(...) errors. See mobrule's answer for why this is the case. So change it to this:

for my $file (@files) {
  my @blah = <$file>;
  $concat .= $blah;
}
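
Some background on why the original form fails: Perl treats <...> as a line read only when the brackets contain a bareword filehandle or a simple scalar variable. Anything more complex, such as an array element, is parsed as a call to glob instead. If you would rather keep the index-based loop, the built-in readline function accepts any expression that yields a filehandle. A small sketch, with a hypothetical helper name first_lines:

```perl
use strict;
use warnings;

# Hypothetical helper: return the first line of each handle, reading
# through an array index. readline() accepts any expression that yields
# a filehandle, so no temporary scalar variable is needed.
sub first_lines {
    my @files = @_;
    my @lines;
    for my $i (0 .. $#files) {
        my $line = readline($files[$i]);    # scalar context: one line
        push @lines, $line if defined $line;
    }
    return @lines;
}
```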

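Second problem: you're mixing up @blah (an array named blah) and $blah (a scalar named blah). This is the source of your "uninitialized value" errors: $blah (the scalar) hasn't been initialized, but you're using it. If you want the $n-th line from @blah, use this: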

for my $file (@files) {
  my @blah = <$file>;
  $concat .= $blah[$n];
}
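
Reading each file into @blah keeps every line in memory. If only one line per file is needed, a loop that stops at the wanted line avoids that. A minimal sketch, assuming hypothetical helpers nth_line and concat_nth and a zero-based $n to match $blah[$n]:

```perl
use strict;
use warnings;

# Hypothetical helper: return line $n (zero-based, like $blah[$n]) from
# the handle, or undef in scalar context if the input has fewer lines.
# Reading stops as soon as the wanted line is found.
sub nth_line {
    my ($fh, $n) = @_;
    my $index = 0;
    while (defined(my $line = <$fh>)) {
        return $line if $index++ == $n;
    }
    return;
}

# Concatenate line $n of every handle, skipping inputs that are too short.
sub concat_nth {
    my ($n, @files) = @_;
    my $concat = '';
    for my $file (@files) {
        my $line = nth_line($file, $n);
        $concat .= $line if defined $line;
    }
    return $concat;
}
```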

I don't want to keep beating a dead horse, but I do want to point out a better way to do one thing:

my $text = `ls | grep ".txt"`;
my @temps = split(/\n/,$text);

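This reads in a list of all files in the current directory whose names contain ".txt". It works, but it can be rather slow: we have to call out to the shell, which has to fork off processes to run ls and grep, and that incurs some overhead. Furthermore, ls and grep are simple and common programs, but they are not entirely portable. Surely there's a better way to do this: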

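my @temps;
# Use a lexical directory handle and check that opendir succeeded.
opendir(my $dh, ".") or die "Cannot open current directory: $!";
while (defined(my $file = readdir($dh))) {
  push @temps, $file if $file =~ /\.txt/;
}
closedir($dh);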

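Simple, short, pure Perl, no forking, no non-portable shell, and we don't have to read in a string and then split it: we store only the entries we really need. Plus, it becomes trivial to change the condition a file has to pass. Say we end up accidentally reading the file test.txt.gz because our regex matches it: we can easily change that line to: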

  push @temps, $file if $file =~ /\.txt$/;

We could do that with grep too (I believe), but why settle for grep's limited regular expressions when Perl has one of the most powerful regex engines around built in?
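
The same filter can also be written with Perl's own grep function, which is unrelated to the shell command of the same name. A short sketch, assuming a hypothetical helper txt_files that takes a directory name; it also applies a -f test so that directories whose names end in ".txt" are skipped:

```perl
use strict;
use warnings;

# Hypothetical helper: return the sorted names (not full paths) of the
# plain files in $dir whose names end in ".txt".
sub txt_files {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open directory '$dir': $!\n";
    # readdir returns bare names, so prefix $dir for the -f file test.
    my @names = grep { /\.txt$/ && -f "$dir/$_" } readdir($dh);
    closedir($dh);
    @names = sort @names;
    return @names;
}
```

Calling txt_files(".") would then give roughly the list the readdir loop builds, restricted to plain files.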
