awk with duplicate values

use*_*952 2 bash scripting awk

File:

22 Hello
22 Hi
1  What
34 Where
21 is
44 How
44 are
44 you

Desired output:

22 HelloHi
1  What
34 Where
21 is
44 Howareyou

If the first field ($1) contains duplicate values, the second fields of those lines should be concatenated.

How can I achieve this with awk?

Thanks

Ed *_*ton 10

$ awk '
!seen[$1]++ { keys[++numKeys] = $1 } 
{ str[$1] = str[$1] $2 }
END{
    for (keyNr=1; keyNr<=numKeys; keyNr++) {
        key = keys[keyNr]
        print key, str[key]
    }
}
' file
22 HelloHi
1 What
34 Where
21 is
44 Howareyou


anu*_*ava 6

Using awk:

awk '!($1 in a){a[$1]=$2;next} $1 in a{a[$1]=a[$1] $2} END{for (i in a) print i, a[i]}' file
22 HelloHi
44 Howareyou
34 Where
21 is
1 What

Edit: to preserve the original order:

awk '!($1 in a){b[++n]=$1; a[$1]=$2;next} $1 in a{a[$1] = a[$1] $2}
        END{for (i=1; i<=n; i++) print b[i], a[b[i]]}' file
22 HelloHi
1 What
34 Where
21 is
44 Howareyou


gle*_*man 5

To maintain the order, you need to keep track of it:

awk '
    ! seen[$1]++ {order[++n] = $1}
    {value[$1] = value[$1] $2}
    END {for (i=1; i<=n; i++) print order[i], value[order[i]]}
' <<END
22 Hello
22 Hi
1  What
34 Where
21 is
44 How
44 are
44 you
END
22 HelloHi
1 What
34 Where
21 is
44 Howareyou
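
The `!seen[$1]++` pattern used above is what records the order: it is true only the first time a given key appears. A minimal sketch of that idiom on its own (the function name `first_seen_keys` is hypothetical, chosen only for illustration):

```shell
# Hypothetical helper: print each distinct first-column value once,
# in the order it first appears in the input.
first_seen_keys() {
    # seen[$1]++ evaluates to 0 (false) on the first occurrence of a key,
    # so the negation is true exactly once per key.
    awk '!seen[$1]++ { print $1 }' "$@"
}

printf '22 Hello\n22 Hi\n1  What\n44 How\n44 are\n' | first_seen_keys
# prints:
# 22
# 1
# 44
```

Storing those keys in a numerically indexed array instead of printing them gives the `order[]` array that the END block walks through.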

If you know that equal values in column 1 are always contiguous (as in the sample text), then:

awk '
    prev != $1 {printf "%s%s ", sep, $1; sep=RS} 
    {printf "%s", $2; prev = $1} 
    END {print ""}
'
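
A quick sketch of why the contiguity assumption matters, using the same program wrapped in a hypothetical shell function `merge_contiguous`. The input here is made up for illustration: key 22 reappears after key 1, so it starts a new output line instead of being merged with the earlier one:

```shell
# Hypothetical wrapper around the streaming program above; reads stdin.
merge_contiguous() {
    awk '
        # A key change starts a new output line (no separator before the first).
        prev != $1 {printf "%s%s ", sep, $1; sep=RS}
        # Append the second field to the current output line.
        {printf "%s", $2; prev = $1}
        # Terminate the last line.
        END {print ""}
    '
}

printf '22 Hello\n22 Hi\n1  What\n22 again\n' | merge_contiguous
# prints:
# 22 HelloHi
# 1 What
# 22 again
```

The upside of this variant is that it holds only the previous key in memory rather than every key and value, which may matter for very large sorted or grouped inputs.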

Two other approaches:

perl -lane '
        push @keys, $F[0] unless grep {$_ eq $F[0]} @keys;
        $val{$F[0]} .= $F[1]
    } END {
        print "$_ $val{$_}" for @keys
' file

And, venturing into niche territory:

#!/usr/bin/env tclsh
while {[gets stdin line] != -1} {dict append val {*}$line}
dict for {k v} $val {puts "$k $v"}