CSV file with some rows split across two lines

0 perl awk sed

I have a large CSV file with the following columns:

id,name,sex,ethnicity,hometown,organization,id_card_num,address,mobile_num,phone_num,education

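The problem is that the organization column is enclosed in double quotes, and on some rows a \n newline inside it splits the row across two lines. I only need to join the rows that match this condition.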

gle*_*man 8

Since you tagged the question perl: the Text::CSV module handles newlines inside the data nicely.

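#! perl
use strict;
use warnings;
use autodie;
use Text::CSV;

my $file = shift @ARGV;
# binary => 1 lets quoted fields contain embedded newlines
my $csv = Text::CSV->new({binary => 1});
open my $fh, "<", $file;

while (my $row = $csv->getline($fh)) {
    # index 2 is the third column (DESCRIPTION in the sample below);
    # with the question's header, organization would be index 5
    $row->[2] =~ s/\n/ /g;
    $csv->say(*STDOUT, $row);
}
close $fh;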

With file.csv containing

ID,NAME,DESCRIPTION,AGE
hello,hello,"some ""text"" goes here",1
something,anything,"and now a long
text split over two lines",2
stuff,otherstuff,"something between quotes",3
something,anything,"and now another long text
split over two lines or
even three in this case",4
stuff,stuff,"now I'm done",5

perl joiner.pl file.csv produces

ID,NAME,DESCRIPTION,AGE
hello,hello,"some ""text"" goes here",1
something,anything,"and now a long text split over two lines",2
stuff,otherstuff,"something between quotes",3
something,anything,"and now another long text split over two lines or even three in this case",4
stuff,stuff,"now I'm done",5

If you want to remove newlines from all fields, not just the specified one, change the first line of the loop to:

    s/\n/ /g for @$row;
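
The script above addresses the column by position, which matches the sample file but not the question's header, where organization is the sixth column. As a rough sketch, the same idea can look the column up by name instead. The helper name `join_column_lines` and the script name `joiner_by_name.pl` are made up for illustration, the first record is assumed to be the header row, and `\R` is swapped in for `\n` so that `\r\n` line endings are covered as well.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Text::CSV;

# Hypothetical helper (not part of the answer above): copy CSV records
# from $in to $out, replacing embedded line breaks with a space in the
# column whose header is $column.
sub join_column_lines {
    my ($in, $out, $column) = @_;
    # auto_diag => 1 reports parse errors on STDERR instead of failing silently
    my $csv = Text::CSV->new({ binary => 1, auto_diag => 1 });

    # Assumption: the first record is the header row.
    my $header = $csv->getline($in) or die "no header row found\n";
    my ($idx) = grep { $header->[$_] eq $column } 0 .. $#$header;
    die "no column named '$column'\n" unless defined $idx;
    $csv->say($out, $header);

    while (my $row = $csv->getline($in)) {
        # \R matches \n as well as \r\n (needs Perl 5.10 or later)
        $row->[$idx] =~ s/\R/ /g if defined $row->[$idx];
        $csv->say($out, $row);
    }
    return;
}

# Only runs when a file name is given on the command line.
if (@ARGV) {
    my ($file, $column) = @ARGV;
    open my $fh, '<', $file or die "cannot open $file: $!\n";
    join_column_lines($fh, \*STDOUT, defined $column ? $column : 'organization');
    close $fh;
}
```

Saved as joiner_by_name.pl, this would be run as perl joiner_by_name.pl file.csv organization, and it should leave every other column untouched.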