bio*_*ech 0 regex perl awk parsing
I want to print only a '+' or '-' sign depending on whether a string is found or not. Basically, I have two files:
Input file 1 (tab-delimited):
HPNK_00457
HPNK_00458
HPNK_00459
Input file 2 (tab-delimited):
HPNK_00457 AAA50325 1e-43 437 28 43 83 ATP-binding protein.
HPNK_00458 P25256 8e-43 429 28 43 82 RecName: Full=Tylosin resistance ATP-binding protein tlrC.
HPNK_00458 CAM96590 1e-42 429 27 42 87 ABC transporter ATP-binding protein [Streptomyces ambofaciens].
Desired output (tab-delimited, preserving the order of the strings in file 1):
HPNK_00457 +
HPNK_00458 +
HPNK_00459 -
This is what I have been using, but it needs updating:
while read vl; do grep "^$vl " file2 || printf -- "- -\n" ; done < file1
Thanks. I try to learn something here every day.
Here is one way using awk:
awk 'FNR==NR { a[$1]; next } { print $1, ($1 in a ? "+" : "-" ) }' file2 file1
Results:
HPNK_00457 +
HPNK_00458 +
HPNK_00459 -
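The one-liner reads file2 first (while FNR==NR) and stores every first-column ID as an array key, then walks file1 in its original order and prints each ID with "+" or "-". As written it joins the two output fields with a space, because awk's default OFS is a space. Since the question asks for tab-delimited output, a variant along these lines may be closer to what is wanted. This is only a sketch: the temporary directory, the shortened sample rows, and the array name `seen` are assumptions made for illustration.

```shell
# Recreate small tab-delimited samples in a scratch directory
# (hypothetical paths; substitute your real file1 and file2).
tmp=$(mktemp -d)
printf 'HPNK_00457\nHPNK_00458\nHPNK_00459\n' > "$tmp/file1"
printf 'HPNK_00457\tAAA50325\t1e-43\nHPNK_00458\tP25256\t8e-43\nHPNK_00458\tCAM96590\t1e-42\n' > "$tmp/file2"

# -F'\t' splits input on tabs; -v OFS='\t' joins output fields with a tab.
# First file (file2, where FNR == NR): remember each ID from column 1.
# Second file (file1): print the ID plus "+" if it was seen, "-" otherwise.
result=$(awk -F'\t' -v OFS='\t' '
    FNR == NR { seen[$1]; next }
    { print $1, ($1 in seen ? "+" : "-") }
' "$tmp/file2" "$tmp/file1")

printf '%s\n' "$result"
rm -rf "$tmp"
```

One caveat with the FNR==NR idiom: if file2 is empty, the lines of file1 also satisfy FNR==NR, so nothing is printed at all instead of a "-" for every ID.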