xpd*_*ude 2 csv sorting bash shell uniq
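I have a CSV file delimited by ;. I need to remove every line whose combination of column 2 and column 3 is not unique, and write the result to standard output.
Example input: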
irrelevant;data1;data2;irrelevant;irrelevant
irrelevant;data3;data4;irrelevant;irrelevant
irrelevant;data5;data6;irrelevant;irrelevant
irrelevant;data7;data8;irrelevant;irrelevant
irrelevant;data1;data2;irrelevant;irrelevant
irrelevant;data9;data0;irrelevant;irrelevant
irrelevant;data1;data2;irrelevant;irrelevant
irrelevant;data3;data4;irrelevant;irrelevant
Desired output:
irrelevant;data5;data6;irrelevant;irrelevant
irrelevant;data7;data8;irrelevant;irrelevant
irrelevant;data9;data0;irrelevant;irrelevant
I found a solution that prints only the first line of each duplicate group:
sort -u -t ";" -k2,1 file
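But that is not enough: lines that have duplicates should be dropped entirely.
I tried uniq -u, but I could not find a way to make it compare only certain columns.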
Using awk:
awk -F';' '!seen[$2,$3]++{data[$2,$3]=$0}
END{for (i in seen) if (seen[i]==1) print data[i]}' file
irrelevant;data5;data6;irrelevant;irrelevant
irrelevant;data7;data8;irrelevant;irrelevant
irrelevant;data9;data0;irrelevant;irrelevant
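Explanation: if the combination $2,$3 is not yet present in the seen array, the whole record is stored in the data array under the key $2,$3. Every time a $2,$3 combination is encountered, its counter in seen is incremented. In the END block, only the entries whose counter equals 1 are printed.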
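One caveat: for (i in seen) visits array keys in an unspecified order, so the lines may not come out in input order. If order matters, a two-pass variant is one possible sketch. It is not part of the answer above, and it assumes the input is a regular file that can be read twice. The temporary file and the variable names input, count, and result are only for illustration.

```shell
#!/bin/sh
# Write the sample input from the question to a temporary file (illustrative only).
input=$(mktemp)
cat > "$input" <<'EOF'
irrelevant;data1;data2;irrelevant;irrelevant
irrelevant;data3;data4;irrelevant;irrelevant
irrelevant;data5;data6;irrelevant;irrelevant
irrelevant;data7;data8;irrelevant;irrelevant
irrelevant;data1;data2;irrelevant;irrelevant
irrelevant;data9;data0;irrelevant;irrelevant
irrelevant;data1;data2;irrelevant;irrelevant
irrelevant;data3;data4;irrelevant;irrelevant
EOF

# The same file is passed twice.
# Pass 1 (NR==FNR is true only while reading the first copy): count each
# column-2/column-3 pair, then skip to the next record.
# Pass 2: print only the lines whose pair was counted exactly once,
# which keeps them in their original input order.
result=$(awk -F';' 'NR==FNR { count[$2,$3]++; next } count[$2,$3]==1' "$input" "$input")

printf '%s\n' "$result"
rm -f "$input"
```

The trade-off is that the file is read twice, but only the counters are held in memory, not the records themselves.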