Can AWK transpose this data faster than a bash for loop?

275*_*560 1 bash awk

Here is my sample data:

DATA='target1.domain,LAST_VULN_AGENT_SCAN,2022/12/07 03:14:49
target2.domain,LAST_VULN_AGENT_SCAN,2022/12/07 03:14:30
target3.domain,LAST_VULN_AGENT_SCAN,2022/12/07 03:14:49
target1.domain,LAST_VULN_NONCRED_SCAN,2022/12/07 00:08:43
target2.domain,LAST_VULN_NONCRED_SCAN,2022/12/07 00:08:43
target3.domain,LAST_VULN_NONCRED_SCAN,2022/12/07 00:08:43
target1.domain,LAST_VULN_CRED_SCAN,2022/12/07 04:59:06
target2.domain,LAST_VULN_CRED_SCAN,2022/12/07 04:59:06
target3.domain,LAST_VULN_CRED_SCAN,2022/12/07 03:03:52'

This is the output I want:

Name,LAST_VULN_AGENT_SCAN,LAST_VULN_NONCRED_SCAN,LAST_VULN_CRED_SCAN
target1.domain,2022/12/07 03:14:49,2022/12/07 00:08:43,2022/12/07 04:59:06
target2.domain,2022/12/07 03:14:30,2022/12/07 00:08:43,2022/12/07 04:59:06
target3.domain,2022/12/07 03:14:49,2022/12/07 00:08:43,2022/12/07 03:03:52

This is my current for loop:

UNIQUETARGETS=$(echo "${DATA}" | cut -d , -f 1 | sort | uniq)

echo 'Name,LAST_VULN_AGENT_SCAN,LAST_VULN_NONCRED_SCAN,LAST_VULN_CRED_SCAN'
for TARGET in $UNIQUETARGETS; do
    LAST_VULN_AGENT_SCAN=$(echo "${DATA}" | grep "${TARGET}," | grep 'LAST_VULN_AGENT_SCAN' | cut -d , -f 3)
    LAST_VULN_NONCRED_SCAN=$(echo "${DATA}" | grep "${TARGET}," | grep 'LAST_VULN_NONCRED_SCAN' | cut -d , -f 3)
    LAST_VULN_CRED_SCAN=$(echo "${DATA}" | grep "${TARGET}," | grep 'LAST_VULN_CRED_SCAN' | cut -d , -f 3)
    echo "${TARGET},${LAST_VULN_AGENT_SCAN},${LAST_VULN_NONCRED_SCAN},${LAST_VULN_CRED_SCAN}"
done

This works, but I believe AWK could do it faster. I looked at several "transpose" awk snippets, but none of them did quite what I need. Any help is greatly appreciated!

Ed *_*ton 6

Using any awk:

$ cat tst.awk
BEGIN { FS=OFS="," }
{
    names[$1]
    times[$1,$2] = $3
}
END {
    hdr = "Name,LAST_VULN_AGENT_SCAN,LAST_VULN_NONCRED_SCAN,LAST_VULN_CRED_SCAN"
    print hdr

    numCols = split(hdr,scans)

    for ( name in names ) {
        printf "%s%s", name, OFS
        for ( colNr=2; colNr<=numCols; colNr++ ) {
            scan = scans[colNr]
            time = times[name,scan]
            printf "%s%s", time, (colNr<numCols ? OFS : ORS)
        }
    }
}

$ awk -f tst.awk file
Name,LAST_VULN_AGENT_SCAN,LAST_VULN_NONCRED_SCAN,LAST_VULN_CRED_SCAN
target3.domain,2022/12/07 03:14:49,2022/12/07 00:08:43,2022/12/07 03:03:52
target2.domain,2022/12/07 03:14:30,2022/12/07 00:08:43,2022/12/07 04:59:06
target1.domain,2022/12/07 03:14:49,2022/12/07 00:08:43,2022/12/07 04:59:06

If you care about the order of the output lines, you can sort them in whatever order you like.
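The rows above come out in a different order from the input because awk does not specify the iteration order of `for ( name in names )`. One possible way to get the rows in first-seen input order, sketched here as a variant of the script above rather than as part of the original answer, is to record each new name in a numerically indexed array and loop over that instead. The names `seen`, `order`, `numNames`, the `transpose` shell function, and the `data` and `result` variables are all illustrative assumptions.

```shell
#!/bin/sh
# Sample input copied from the question (assumed here to arrive on stdin).
data='target1.domain,LAST_VULN_AGENT_SCAN,2022/12/07 03:14:49
target2.domain,LAST_VULN_AGENT_SCAN,2022/12/07 03:14:30
target3.domain,LAST_VULN_AGENT_SCAN,2022/12/07 03:14:49
target1.domain,LAST_VULN_NONCRED_SCAN,2022/12/07 00:08:43
target2.domain,LAST_VULN_NONCRED_SCAN,2022/12/07 00:08:43
target3.domain,LAST_VULN_NONCRED_SCAN,2022/12/07 00:08:43
target1.domain,LAST_VULN_CRED_SCAN,2022/12/07 04:59:06
target2.domain,LAST_VULN_CRED_SCAN,2022/12/07 04:59:06
target3.domain,LAST_VULN_CRED_SCAN,2022/12/07 03:03:52'

# Hypothetical helper: reads name,scan,time records on stdin and
# prints one row per name, in the order names first appear.
transpose() {
    awk '
    BEGIN { FS = OFS = "," }
    {
        # Remember each name the first time it is seen.
        if ( !($1 in seen) ) {
            seen[$1]
            order[++numNames] = $1
        }
        times[$1,$2] = $3
    }
    END {
        hdr = "Name,LAST_VULN_AGENT_SCAN,LAST_VULN_NONCRED_SCAN,LAST_VULN_CRED_SCAN"
        print hdr
        numCols = split(hdr, scans)

        # Walk the names by index so the output order is deterministic.
        for ( i = 1; i <= numNames; i++ ) {
            name = order[i]
            printf "%s", name
            for ( colNr = 2; colNr <= numCols; colNr++ ) {
                printf "%s%s", OFS, times[name, scans[colNr]]
            }
            print ""
        }
    }'
}

result=$(printf '%s\n' "$data" | transpose)
printf '%s\n' "$result"
```

With the sample data this should print the header followed by target1, target2, and target3 in that order, matching the output requested in the question. If alphabetical order is wanted instead, keeping the original script and piping everything after the header line through `sort` is another option.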