I have two tab-delimited files, as shown below:
File A:
chr1 123 aa b c d
chr1 234 a b c d
chr1 345 aa b c d
chr1 456 a b c d
....
File B:
xxxx abcd chr1 123 aa c d e
yyyy defg chr1 345 aa e f g
...
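I want to join these two files on three columns (for example "chr1", "123" and "aa") and append the first two columns of file B to the matching lines of file A, so that the output looks like this:
Output: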
chr1 123 aa b c d xxxx abcd
chr1 234 a b c d
chr1 345 aa b c d yyyy defg
chr1 456 a b c d
Can anyone help me do this in awk, ideally as a one-liner?
Chr*_*our 11
Here is one way using awk:
$ awk 'NR==FNR{a[$3,$4]=$1OFS$2;next}{$6=a[$1,$2];print}' OFS='\t' fileb filea
chr1 123 a b c xxxx abcd
chr1 234 a b c
chr1 345 a b c yyyy defg
chr1 456 a b c
Explanation:
NR==FNR # True only while the first file on the command line (fileb) is being read
a[$3,$4]=$1OFS$2 # Store fields 1 and 2 of fileb, joined by OFS, under the key built from fields 3 and 4
next # Skip to the next record (don't run the second block)
$6=a[$1,$2] # Assign the looked-up value to field 6 (rebuilds the record with OFS; an existing field 6 is replaced)
print # Print the current line, which now includes the matching entry from fileb ($6)
OFS='\t' # Separate fields with a single TAB on output
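To see why `NR==FNR` selects only the first file, the sketch below prints both counters for two small throwaway files. The file contents are made up for illustration, and the script assumes a POSIX shell with `mktemp` available.

```shell
#!/bin/sh
# Hypothetical demo files, created in a temporary directory.
tmp=$(mktemp -d)
printf 'b1\nb2\n' > "$tmp/fileb"
printf 'a1\na2\na3\n' > "$tmp/filea"

# NR counts records across all input files; FNR restarts at 1 for each file.
# The two are equal only while the first file (fileb) is being read.
nr_fnr=$(cd "$tmp" && awk '{ print FILENAME, NR, FNR }' fileb filea)
printf '%s\n' "$nr_fnr"

rm -rf "$tmp"
```

One caveat worth knowing: if the first file is empty, `NR==FNR` is also true for the second file, so the idiom relies on the lookup file having at least one line.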
Edit:
For the three-field problem, you just need to add the extra field:
$ awk 'NR==FNR{a[$3,$4,$5]=$1OFS$2;next}{$6=a[$1,$2,$3];print}' OFS='\t' fileb filea
chr1 123 aa b c xxxx abcd
chr1 234 a b c
chr1 345 aa b c yyyy defg
chr1 456 a b c
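Note that assigning to `$6` replaces an existing sixth column, so with the six-column file A from the question the `d` column is lost. A possible variant, sketched below under the assumption that both files are strictly tab-separated, appends the two columns only when the key is found, which also avoids a trailing tab on unmatched lines. The sample files are recreated from the question in a temporary directory, and the shell variable names are illustrative.

```shell
#!/bin/sh
tmp=$(mktemp -d)
# Sample data from the question, written tab-separated.
printf 'chr1\t123\taa\tb\tc\td\nchr1\t234\ta\tb\tc\td\nchr1\t345\taa\tb\tc\td\nchr1\t456\ta\tb\tc\td\n' > "$tmp/filea"
printf 'xxxx\tabcd\tchr1\t123\taa\tc\td\te\nyyyy\tdefg\tchr1\t345\taa\te\tf\tg\n' > "$tmp/fileb"

joined=$(awk 'BEGIN { FS = OFS = "\t" }
# First file (fileb): remember columns 1-2 under the three-column key.
NR == FNR { a[$3,$4,$5] = $1 OFS $2; next }
# Second file (filea): append the stored columns only when the key exists.
($1,$2,$3) in a { print $0, a[$1,$2,$3]; next }
# No match: print the line unchanged.
{ print }' "$tmp/fileb" "$tmp/filea")
printf '%s\n' "$joined"

rm -rf "$tmp"
```

Testing with `in` before printing also avoids creating empty array entries for keys that are absent from fileb.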