Join two files using awk

cha*_*has 6 awk join file

I have two tab-delimited files, as shown below:

File A:

chr1   123 aa b c d
chr1   234 a  b c d
chr1   345 aa b c d
chr1   456 a  b c d
....

File B:

xxxx  abcd    chr1   123    aa    c    d    e
yyyy  defg    chr1   345    aa    e    f    g
...

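I want to join these two files on three columns (for example "chr1", "123" and "aa") and append the first two columns of File B to the matching lines of File A, so that the output looks like this: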

chr1   123    aa    b    c    d    xxxx    abcd
chr1   234    a     b    c    d
chr1   345    aa    b    c    d    yyyy    defg
chr1   456    a    b    c    d

Can anyone help me do this in awk, if possible as an awk one-liner?

Chr*_*our 11

Here is one way using awk:

$ awk 'NR==FNR{a[$3,$4]=$1OFS$2;next}{$6=a[$1,$2];print}' OFS='\t' fileb filea
chr1    123     a    b    c     xxxx    abcd
chr1    234     a    b    c 
chr1    345     a    b    c     yyyy    defg
chr1    456     a    b    c 

Explanation:

NR==FNR             # Current record number equals the file's record number, i.e. we are in the first file (fileb)
a[$3,$4]=$1OFS$2    # Create an array entry keyed on fields 3 and 4, holding fields 1 and 2
next                # Skip to the next line (don't process the next block)
$6=a[$1,$2]         # Assign the looked-up value to field 6 (this also rebuilds the record with OFS)
print               # Print the current line plus the matching entry from fileb ($6)

OFS='\t'            # Separate each field with a single TAB on output
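To make the two-pass mechanics above more concrete, here is a minimal sketch that only builds the lookup table and prints what it holds. The temporary files and their single rows are hypothetical sample data modelled on the question, and the "," and "|" separators are chosen purely for readability. It shows that `NR==FNR` is true only for the first file named on the command line, and that a key written as `a[$3,$4]` is stored as the two values joined by `SUBSEP`.

```shell
# Hypothetical one-row sample files, tab-separated like the question's data.
tmp=$(mktemp -d)
printf 'xxxx\tabcd\tchr1\t123\taa\tc\td\te\n' > "$tmp/fileb"
printf 'chr1\t123\taa\tb\tc\td\n' > "$tmp/filea"

out=$(awk -F'\t' '
NR == FNR { a[$3, $4] = $1 "," $2; next }   # first file only: fill the lookup table
END {
    for (k in a) {
        split(k, part, SUBSEP)               # recover the two key components
        print part[1] "|" part[2] " -> " a[k]
    }
    print "records read: " NR                # NR counts lines across both files
}' "$tmp/fileb" "$tmp/filea")

printf '%s\n' "$out"
rm -rf "$tmp"
```

The second file is read but triggers no rule here, so only the `END` block produces output.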

Edit:

For the three-field problem, you just need to add the extra field:

$ awk 'NR==FNR{a[$3,$4,$5]=$1OFS$2;next}{$6=a[$1,$2,$3];print}' OFS='\t' fileb filea
chr1    123    aa     b      c     xxxx     abcd
chr1    234    a      b      c  
chr1    345    aa     b      c     yyyy     defg
chr1    456    a      b      c 
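Note that assigning to `$6` replaces whatever is already in the sixth column, which matters if filea has six columns as in the question's sample (the `d` column would be lost). A possible variant, sketched here under the assumption that both files are strictly tab-separated, appends the looked-up value to the end of the line instead and uses the `in` operator so unmatched lines are printed unchanged. The temporary files below simply recreate the rows shown in the question.

```shell
# Recreate the question's sample rows as tab-separated temporary files.
tmp=$(mktemp -d)
printf 'chr1\t123\taa\tb\tc\td\nchr1\t234\ta\tb\tc\td\nchr1\t345\taa\tb\tc\td\nchr1\t456\ta\tb\tc\td\n' > "$tmp/filea"
printf 'xxxx\tabcd\tchr1\t123\taa\tc\td\te\nyyyy\tdefg\tchr1\t345\taa\te\tf\tg\n' > "$tmp/fileb"

out=$(awk 'BEGIN { FS = OFS = "\t" }
# First file (fileb): store columns 1-2 under the three-column key.
NR == FNR { a[$3, $4, $5] = $1 OFS $2; next }
# Second file (filea): append the stored columns only when the key exists.
($1, $2, $3) in a { $0 = $0 OFS a[$1, $2, $3] }
{ print }' "$tmp/fileb" "$tmp/filea")

printf '%s\n' "$out"
rm -rf "$tmp"
```

With this sample data, the lines for 123 and 345 gain the two extra columns, while 234 and 456 pass through without a trailing tab.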