Comparing files with awk

Question


Tit*_*llo 8 comparison awk compare two-columns

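Hi, I have two similar files (both have 3 columns). I would like to check whether the two files contain the same elements (but listed in a different order). To start with, I would like to compare only the first column.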

FILE1.TXT

"aba" 0 0
"abc" 0 1
"abd" 1 1
"xxx" 0 0


FILE2.TXT

"xyz" 0 0
"aba" 0 0
"xxx" 0 0
"abc" 1 1


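How can I do this with awk? I tried looking around, but I only found complicated examples. What if I want to include the other two columns in the comparison as well? The output should give me the number of matching elements.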

Answer 1

Chr*_*our 27

To print the elements common to both files:

$ awk 'NR==FNR{a[$1];next}$1 in a{print $1}' file1 file2
"aba"
"xxx"
"abc"


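Explanation:

NR and FNR are awk variables that store the total number of records read so far and the number of records read from the current file, respectively (by default a record is a line).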

NR==FNR {       # True only while reading the first file
    a[$1]       # Store the first column as a key in an associative array
    next        # Skip the remaining rules and move on to the next line
}
$1 in a {       # Second file: is the value in column one a key in the array?
    print $1    # If so, print it
}
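
To make the NR==FNR test concrete, here is a small sketch that prints both counters for every line of two throwaway files. The file names a.txt and b.txt, the scratch directory, and the helper name show_counters are made up for this illustration.

```shell
# Create two small sample files in a scratch directory (hypothetical names).
tmp=$(mktemp -d)
printf 'x\ny\n' > "$tmp/a.txt"
printf 'p\nq\nr\n' > "$tmp/b.txt"

# For every input line, print the file name, NR (records read so far across
# all files), FNR (records read so far in the current file), and whether
# NR == FNR holds for that line.
show_counters() {
    awk '{ print FILENAME, "NR=" NR, "FNR=" FNR, (NR == FNR ? "first" : "later") }' "$@"
}

(cd "$tmp" && show_counters a.txt b.txt)
```

The two lines of a.txt are tagged "first" and the three lines of b.txt are tagged "later", because FNR restarts at 1 for each file while NR keeps counting. One caveat: if the first file is empty, NR==FNR also holds while the second file is read.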


If you want to match whole lines, use $0 instead:

$ awk 'NR==FNR{a[$0];next}$0 in a{print $0}' file1 file2
"aba" 0 0
"xxx" 0 0

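Comparing $0 is an exact string comparison, so a stray trailing space or a tab in place of a space makes two otherwise identical lines differ. A minimal sketch of one way around that, assuming the fields themselves contain no whitespace: assigning $1=$1 makes awk rebuild $0 from its fields, joined by single spaces, before the line is used as a key. The sample data is modelled on the question, and the helper name match_lines_normalized is made up for this illustration.

```shell
tmp=$(mktemp -d)
# Data modelled on the question, with stray whitespace in file1.
printf '"aba" 0 0 \n"abc" 0 1\n"abd"\t1 1\n"xxx"  0 0\n' > "$tmp/file1"
printf '"xyz" 0 0\n"aba" 0 0\n"xxx" 0 0\n"abc" 1 1\n' > "$tmp/file2"

match_lines_normalized() {
    awk '{ $1 = $1 }                   # rebuild $0 with single-space separators
         NR == FNR { seen[$0]; next }  # first file: remember each normalized line
         $0 in seen                    # second file: print lines also seen in file 1
        ' "$1" "$2"
}

match_lines_normalized "$tmp/file1" "$tmp/file2"
```

With this sample data the call prints the "aba" and "xxx" lines, even though neither matches byte for byte in file1.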

Or on a specific set of columns:

$ awk 'NR==FNR{a[$1,$2,$3];next}($1,$2,$3) in a{print $1,$2,$3}' file1 file2
"aba" 0 0
"xxx" 0 0

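The comma-separated subscript a[$1,$2,$3] is not a true multidimensional array: awk joins the values with the built-in SUBSEP string to form a single key. A small sketch of what that means in practice, using made-up sample values and a hypothetical helper name subsep_demo:

```shell
subsep_demo() {
    awk 'BEGIN {
        a["aba", 0, 0]                     # one key: "aba" SUBSEP 0 SUBSEP 0
        if (("aba", 0, 0) in a)
            print "tuple found"
        if (("aba" SUBSEP 0 SUBSEP 0) in a)
            print "same key spelled out with SUBSEP"
        for (key in a) {
            n = split(key, part, SUBSEP)   # recover the individual columns
            print n, part[1], part[2], part[3]
        }
    }'
}

subsep_demo
```

Because the key is built from the fields rather than from the raw line, this form ignores differences in the whitespace between columns, unlike the $0 version.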

Answer 2

Ste*_*eve 6

To print the number of matching elements, here is one way using awk:

awk 'FNR==NR { a[$1]; next } $1 in a { c++ } END { print c }' file1.txt file2.txt


Results using your input:

3

If you want to include additional columns (for example, columns one, two and three), use a pseudo-multidimensional array:

awk 'FNR==NR { a[$1,$2,$3]; next } ($1,$2,$3) in a { c++ } END { print c }' file1.txt file2.txt


Results using your input:

2
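A count of matches says how many keys of the second file also occur in the first, but not whether the two files hold exactly the same set of keys. A possible extension, which is not part of the original answer, collects the first column of both files and reports what is shared and what is unique to each side. The helper name compare_keys and the output labels are assumptions made for this sketch.

```shell
tmp=$(mktemp -d)
printf '"aba" 0 0\n"abc" 0 1\n"abd" 1 1\n"xxx" 0 0\n' > "$tmp/file1.txt"
printf '"xyz" 0 0\n"aba" 0 0\n"xxx" 0 0\n"abc" 1 1\n' > "$tmp/file2.txt"

compare_keys() {
    awk 'FNR == NR { first[$1]; next }   # first file: collect its keys
         { second[$1] }                  # second file: collect its keys
         END {
             for (k in first) {
                 if (k in second) both++
                 else only1++
             }
             for (k in second) {
                 if (!(k in first)) only2++
             }
             # Adding 0 prints a numeric 0 instead of an empty string.
             print "common=" (both + 0), "only_in_file1=" (only1 + 0), "only_in_file2=" (only2 + 0)
             exit (only1 + only2 > 0)    # exit status 0 means identical key sets
         }' "$1" "$2"
}

compare_keys "$tmp/file1.txt" "$tmp/file2.txt" || echo "key sets differ"
```

For the files from the question this reports three common keys and one key unique to each file, so the key sets differ. To compare all three columns, the subscripts could be changed to [$1,$2,$3] in both collecting rules.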

Archived:	12 years, 12 months ago
Viewed:	23804 times
Last active:	8 years ago