Grouping duplicate fields with awk

Question

I have the following file:

ID|2018-04-29
ID|2018-04-29
ID|2018-04-29
ID1|2018-06-26
ID1|2018-06-26
ID1|2018-08-07
ID1|2018-08-22

Using awk, I want to add a $3 that groups duplicate IDs based on $1 and $2, so that the output is:

ID|2018-04-29|group1
ID|2018-04-29|group1
ID|2018-04-29|group1
ID1|2018-06-26|group2
ID1|2018-06-26|group2
ID1|2018-08-07|group3
ID1|2018-08-22|group4

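I tried the following code, but it does not give me the desired output. I am also not sure whether it can be applied to a column that contains dates.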

awk -F"|" '{print $0,"group"++seen[$1,$3]}' OFS="|"

Any hints on how to achieve this with awk (as a one-liner, if possible) would be greatly appreciated.

Answer 1

Rav*_*h13 5

With your shown samples, please try the following awk code.

awk -v OFS="|" '!arr[$0]++{count++} {print $0,"group"count}' Input_file

Explanation: a detailed, commented version of the command above.

awk '                     ##Starting awk program from here.
BEGIN{                    ##Starting BEGIN section of this program from here.
  OFS="|"                 ##Setting OFS to | here.
}
!arr[$0]++{               ##Checking if current line is NOT present in array then do following.
  count++                 ##Increasing count with 1 here.
}
{
  print $0,"group"count   ##Printing current line with group and count value here.
}
' Input_file              ##Mentioning Input_file name here.

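One thing to be aware of: the one-liner keys on the whole line ($0) and prints the current counter, so it relies on identical lines being adjacent, as they are in the sample. If the same line reappears later in the file, it gets the latest counter value instead of the group it was first given. As a rough sketch only, assuming you want the grouping keyed explicitly on $1 and $2 and stable for non-adjacent repeats, the group number can be stored per key. The function name group_pairs and the array name seen are made up for this illustration and are not part of the answer above.

```shell
# Sample input copied from the question.
input='ID|2018-04-29
ID|2018-04-29
ID|2018-04-29
ID1|2018-06-26
ID1|2018-06-26
ID1|2018-08-07
ID1|2018-08-22'

# group_pairs is a hypothetical helper name used only for this sketch.
# It reads pipe-delimited lines on stdin and appends "groupN" as a new field.
group_pairs() {
  awk -F'|' -v OFS='|' '
    # First time this ($1,$2) pair is seen: assign it the next group number.
    !(($1,$2) in seen) { seen[$1,$2] = ++count }
    # Every line: print it with the group number stored for its pair.
    { print $0, "group" seen[$1,$2] }
  '
}

out=$(printf '%s\n' "$input" | group_pairs)
printf '%s\n' "$out"
```

On the sample data this should produce the same output as the one-liner. The difference only shows up when repeated ID/date pairs are scattered through the file, or when the file has extra columns that should not influence the grouping.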
