Join multiple lines based on column 1

pvk*_*hat 10 command-line awk text-processing merge join

I have a file like the following:

abc, 12345
def, text and nos    
ghi, something else   
jkl, words and numbers

abc, 56345   
def, text and nos   
ghi, something else 
jkl, words and numbers

abc, 15475  
def, text and nos 
ghi, something else
jkl, words and numbers

abc, 123345
def, text and nos
ghi, something else  
jkl, words and numbers

I want to convert (join) it into:

abc, 12345, 56345, 15475, 123345
def, text and nos, text and nos, text and nos, text and nos
ghi, something else, something else, something else, something else   
jkl, words and numbers, words and numbers, words and numbers, words and numbers

cuo*_*glm 11

If you don't mind the output order:

$ awk -F',' 'NF>1{a[$1] = a[$1]","$2};END{for(i in a)print i""a[i]}' file 
jkl, words and numbers, words and numbers, words and numbers, words and numbers
abc, 12345, 56345, 15475, 123345
ghi, something else, something else, something else, something else
def, text and nos, text and nos, text and nos, text and nos

Explanation

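  • NF>1 means we only process lines that have more than one field, which skips the blank lines.
  • We collect the data in the associative array a: the key is the first field and the value is the second field (the rest of the line, as long as it contains no further commas). If the key already has a value, the new value is appended to it after a comma.
  • In the END block, we loop over the associative array a and print each key followed by its accumulated value. The order of a for (i in a) loop is unspecified in awk, which is why the output order differs from the input order.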
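If the input order matters, the same awk idea can be extended by recording each key the first time it is seen. The following is only a sketch, not part of the original answer: the function name join_by_key and the helper array order are assumptions made up for illustration, and the sample input is a shortened stand-in for the file in the question.

```shell
# join_by_key is a hypothetical helper name; it reads the named files, or stdin if none are given.
join_by_key() {
  awk -F',' '
    NF > 1 {
      if (!($1 in a)) order[++n] = $1   # remember each key the first time it appears
      a[$1] = a[$1] "," $2              # append the second field to the value for that key
    }
    END {
      # walk the keys in first-seen order instead of using for (i in a)
      for (i = 1; i <= n; i++) print order[i] a[order[i]]
    }
  ' "$@"
}

# Shortened sample input: two records separated by a blank line.
printf 'abc, 1\ndef, x\n\nabc, 2\ndef, y\n' | join_by_key
# prints:
# abc, 1, 2
# def, x, y
```

On the real data it would be called as join_by_key file. The extra array costs one entry per distinct key, but it avoids a separate sort step and also works when the keys are not in alphabetical order.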

Or use perl, which prints the keys in sorted order (for this input, that is the same as the input order):

$ perl -F',' -anle 'next if /^$/;$h{$F[0]} = $h{$F[0]}.",".$F[1];
    END{print $_,$h{$_} for sort keys %h}' file
abc, 12345, 56345, 15475, 123345
def, text and nos, text and nos, text and nos, text and nos
ghi, something else, something else, something else, something else
jkl, words and numbers, words and numbers, words and numbers, words and numbers