CEL*_*CEL 1 bash terminal powershell awk
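I have two .txt files of different lengths and would like to do the following:
If a value in column 1 of file 1 also appears in column 1 of file 2, print column 2 of file 2, followed by the entire matching line from file 1.
I have tried several permutations of awk, but without success so far!
Thanks!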
File 1:
MARKERNAME EA NEA BETA SE
10:1000706 T C -0.021786390809225 0.519667838651725
1:715265 G C 0.0310128798578049 0.0403763946716293
10:1002042 CCTT C 0.0337857775471699 0.0403300629299562
File 2:
CHR:BP SNP CHR BP GENPOS ALLELE1 ALLELE0 A1FREQ INFO
1:715265 rs12184267 1 715265 0.0039411 G C 0.964671
1:715367 rs12184277 1 715367 0.00394384 A G 0.964588
Desired file 3:
SNP MARKERNAME EA NEA BETA SE
rs12184267 1:715265 G C 0.0310128798578049 0.0403763946716293
Attempts:
awk -F'|' 'NR==FNR { a[$1]=1; next } ($1 in a) { print $3, $0 }' file1 file2
awk 'NR==FNR{A[$1]=$2;next}$0 in A{$0=A[$0]}1' file1 file2
With the samples you have shown, could you please try the following:
awk '
FNR==1{
if(++count==1){ col=$0 }
else{ print $2,col }
next
}
FNR==NR{
arr[$1]=$0
next
}
($1 in arr){
print $2,arr[$1]
}
' file1 file2
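To check the command against the samples from the question, the following sketch recreates both files in a scratch directory and runs the same awk program on them. It assumes a POSIX-style shell with `mktemp -d` available; the scratch directory and the `result` variable are only illustrative and are not part of the original answer.

```shell
#!/bin/sh
# Recreate the sample inputs from the question in a scratch directory.
# (mktemp -d is assumed to be available; it is not strictly POSIX.)
tmpdir=$(mktemp -d)

cat > "$tmpdir/file1" <<'EOF'
MARKERNAME EA NEA BETA SE
10:1000706 T C -0.021786390809225 0.519667838651725
1:715265 G C 0.0310128798578049 0.0403763946716293
10:1002042 CCTT C 0.0337857775471699 0.0403300629299562
EOF

cat > "$tmpdir/file2" <<'EOF'
CHR:BP SNP CHR BP GENPOS ALLELE1 ALLELE0 A1FREQ INFO
1:715265 rs12184267 1 715265 0.0039411 G C 0.964671
1:715367 rs12184277 1 715367 0.00394384 A G 0.964588
EOF

# Same program as above: remember file1's header and lines keyed by
# column 1, then print column 2 of file2 in front of each matching line.
result=$(awk '
FNR==1{
  if(++count==1){ col=$0 }
  else{ print $2,col }
  next
}
FNR==NR{
  arr[$1]=$0
  next
}
($1 in arr){
  print $2,arr[$1]
}
' "$tmpdir/file1" "$tmpdir/file2")

printf '%s\n' "$result"

# Remove the scratch files again.
rm -rf "$tmpdir"
```

On these samples it prints the header line `SNP MARKERNAME EA NEA BETA SE` followed by `rs12184267 1:715265 G C 0.0310128798578049 0.0403763946716293`, which matches the desired file 3. To write the result to a file instead, redirect the awk output, for example with `> file3`.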
Explanation: a detailed, line-by-line explanation of the above.
awk ' ##Starting awk program from here.
FNR==1{ ##Checking condition if this is first line of file(s).
if(++count==1){ col=$0 } ##Checking if count is 1 then set col as current line.
else{ print $2,col } ##Checking if above is not true then print 2nd field and col here.
next ##next will skip all further statements from here.
}
FNR==NR{ ##This will be TRUE when file1 is being read.
arr[$1]=$0 ##Creating arr with 1st field index and value is current line.
next ##next will skip all further statements from here.
}
($1 in arr){ ##Checking condition if 1st field present in arr then do following.
print $2,arr[$1] ##Printing 2nd field, arr value here.
}
' file1 file2 ##Mentioning Input_files name here.
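As a side note on the two attempts in the question: the first one sets the field separator to `|`, but the sample lines contain no `|`, so awk sees each whole line as a single field and `$1` never equals a key from the other file. The second one tests `$0 in A`, that is, the whole line rather than column 1, against the keys. The small check below illustrates the field-separator point; the variable names are only for illustration.

```shell
#!/bin/sh
# One data line taken from file 1 in the question.
line='1:715265 G C 0.0310128798578049 0.0403763946716293'

# With -F'|' there is nothing to split on, so the line is one field.
nf_pipe=$(printf '%s\n' "$line" | awk -F'|' '{ print NF }')

# With the default separator (runs of blanks) the line has five fields.
nf_default=$(printf '%s\n' "$line" | awk '{ print NF }')

echo "fields with -F'|':        $nf_pipe"
echo "fields with default FS:   $nf_default"
```

Leaving the separator at its default, as the answer does, is what makes `$1` and `$2` refer to the first and second whitespace-separated columns.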