How do I correctly compute field lengths in a CSV document with Perl?

mro*_*opa 2 perl

I have a data set and would like to perform a simple operation on it with a Perl script. Here is a small extract of the data set:

"number","code","country","gamma","X1","X2","X3","X4","X5","X6"
1,"DZA","Algeria","0.01",7.44,47.3,0.46,0,0,0.13
2,"AGO","Angola","0.00",6.79,"NULL",0.21,1,0,0.28
3,"BEN","Benin","-0.01",7.02,38.9,0.27,1,0,0.05
4,"BWA","Botswana","0.06",6.28,45.7,0.42,1,0,0.07
5,"HVO","Burkina Faso","0.00",6.15,36.3,0.08,1,0,0.05
6,"BDI","Burundi","0.00",6.38,41.8,0.18,1,0,0

The script should compute the length of each comma-separated field and store the maximum for each column in an array.

However, storing the maxima does not work properly. Here is the relevant part of the code:

@maxl = map length, @terms;

while(<INFILE>) {
$_ =~ s/[\"\n]//g ;
@terms = split/$sep/, $_;
@lengths = map length, @terms;
for($k = 0, $k <= $#terms, $k++) { 
    if($lengths[$k] > $maxl[$k]) {
    $maxl[$k] = $lenghts[$k];
    }
}
print "@lengths\n";
}

@maxl is initialized earlier in the code from the second line of the data set. When I print @maxl just to see its values, I get:

1 3 7 4 4 4 4 1 1 5

Inside the while loop I use another print statement to see the other values, and I get:

1 3 6 4 4 4 4 1 1 4
1 3 5 5 4 4 4 1 1 4
1 3 8 4 4 4 4 1 1 4
1 3 12 4 4 4 4 1 1 4
1 3 7 4 4 4 4 1 1 1
1 3 8 4 4 4 4 1 1 4
1 3 10 4 4 4 4 1 1 4
1 3 16 5 4 4 4 1 1 4
2 3 4 5 3 4 4 1 1 4
2 3 7 4 4 4 4 1 1 4
2 3 5 4 4 4 4 1 1 4
2 3 5 4 4 4 4 1 1 4
2 3 8 4 4 4 4 1 1 4
2 3 5 4 4 4 1 1 1 4

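The fourth column, for example, clearly has values larger than 3. The while loop should keep the maximum values and write them into @maxl.

What went wrong?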


...the commas in the for loop were wrong:

for($k = 0, $k <= $#terms, $k++)

However, after cleaning that up, there still seems to be a problem...

plu*_*lus 9

For starters, there is a typo here: $maxl[$k] = $lenghts[$k]; (the array is spelled @lengths everywhere else; 'use strict' would have caught it).
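To illustrate why strict helps here, the following is a minimal sketch rather than part of the original script. Without strict, Perl quietly treats @lenghts as a new, empty global array, so the assignment stores undef instead of a length. The snippet below compiles the misspelled line under strict through a string eval (used only so the demo can report the error rather than die); the sample values are made up for illustration.

```perl
use strict;
use warnings;

# Compile a snippet containing the misspelled array name under 'use strict'.
# String eval is used only so this demo can report the error instead of dying.
my $snippet = <<'END';
use strict;
my @lengths = (1, 3, 7);   # illustrative values
my @maxl    = (0, 0, 0);
$maxl[0] = $lenghts[0];    # typo: @lenghts was never declared
1;
END

my $ok    = eval $snippet;
my $error = $@;
print $ok ? "compiled without complaint\n" : "strict caught it: $error";
```

In a normal script the same mistake would simply stop compilation with a "Global symbol ... requires explicit package name" message pointing at the offending line.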

Consider using Text::CSV to parse comma-separated data more reliably (it can also handle other separators):

#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV;

# INFILE is assumed to be opened already, as in the question's script.
my $csv = Text::CSV->new();
my @max_lengths;    # longest field seen so far, per column

while ( my $line = <INFILE> ) {

    die "Unable to parse '$line'" unless $csv->parse($line);

    # Lengths of the fields in the current row
    my @column_lengths = map { length } $csv->fields();

    for my $i ( 0 .. $#column_lengths ) {
        if ( $column_lengths[$i] > ($max_lengths[$i] || 0) ) {
            $max_lengths[$i] = $column_lengths[$i];
        }
    }
}

print "MAX LENGTHS OF EACH FIELD: @max_lengths\n";
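For comparison, here is a rough sketch of the question's own loop with both fixes applied: semicolons in the for header (with commas, Perl reads the parentheses as a plain list to iterate over, not as a C-style loop) and $lengths spelled correctly. It uses only core Perl and reads a few rows adapted from the question through an in-memory filehandle; the variable names $data, $in and $len are illustrative. The naive split on commas is an assumption that no field contains an embedded comma, which is exactly the case Text::CSV handles properly.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample rows adapted from the question; a real script would read a file.
my $data = <<'END';
"number","code","country","gamma","X1","X2","X3","X4","X5","X6"
1,"DZA","Algeria","0.01",7.44,47.3,0.46,0,0,0.13
3,"BEN","Benin","-0.01",7.02,38.9,0.27,1,0,0.05
5,"HVO","Burkina Faso","0.00",6.15,36.3,0.08,1,0,0.05
6,"BDI","Burundi","0.00",6.38,41.8,0.18,1,0,0
END

open my $in, '<', \$data or die "Cannot open in-memory data: $!";
my $header = <$in>;    # skip the header row

my @maxl;              # longest field seen so far, per column
while ( my $line = <$in> ) {
    $line =~ s/["\r\n]//g;           # strip quotes and line endings
    my @terms = split /,/, $line;    # naive split: breaks on embedded commas

    for ( my $k = 0; $k <= $#terms; $k++ ) {    # semicolons, not commas
        my $len = length $terms[$k];
        $maxl[$k] = $len if !defined $maxl[$k] || $len > $maxl[$k];
    }
}
close $in;

print "@maxl\n";    # prints: 1 3 12 5 4 4 4 1 1 4
```

Starting @maxl empty and filling it on first sight of each column avoids having to seed it from a particular row, which is what the question's script did with the second line of the data set.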