Parsing a simple table

bio*_*ech 0 perl awk

For each line of the input file, I want to print the field that contains the string 'locus_tag=', or a dash if no field matches.

Input file (tab-separated):

GeneID_2=7277058    location=890211..892127 locus_tag=HAPS_0907 orientation=+
GeneID_2=7278144    gene=rlmL   location=complement(1992599..1994776)   locus_tag=HAPS_2029
GeneID_2=7278145    gene=rlmT   location=complement(1992599..1994776)   timetoparse

Desired output:

locus_tag=HAPS_0907
locus_tag=HAPS_2029
-

I tried this, but it did not work:

awk -F'\t' '{ for(i=1; i<=NF; i++) if($i ~/locus_tag=/) {print $i}; {for(i=1; i<=NF; i++) if($i !=/locus_tag=/) {print "-"}} }' SNP_annotations_ON_PROTEIN

Сух*_*й27 6

perl -lpe '($_)= (/(locus_tag=\S+)/, "-")' file

Output:

locus_tag=HAPS_0907
locus_tag=HAPS_2029
-
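
If you would rather stay with awk, here is a minimal sketch of the same idea. The attempt in the question appears to fail for two reasons: its second loop prints a dash for every field that does not match rather than once per line, and `$i != /locus_tag=/` compares the field against the 0/1 result of matching the whole line, not against the pattern (a non-match test is written `!~`). The file name `sample.tsv` and the variable names `hit` and `out` are only placeholders for this illustration, and I am assuming the field should start with `locus_tag=`.

```shell
# Recreate the sample input from the question (fields separated by tabs).
printf 'GeneID_2=7277058\tlocation=890211..892127\tlocus_tag=HAPS_0907\torientation=+\n' > sample.tsv
printf 'GeneID_2=7278144\tgene=rlmL\tlocation=complement(1992599..1994776)\tlocus_tag=HAPS_2029\n' >> sample.tsv
printf 'GeneID_2=7278145\tgene=rlmT\tlocation=complement(1992599..1994776)\ttimetoparse\n' >> sample.tsv

# For each line, remember the first field that starts with "locus_tag=";
# print that field, or "-" when no field matched.
out=$(awk -F'\t' '{
    hit = "-"
    for (i = 1; i <= NF; i++)
        if ($i ~ /^locus_tag=/) { hit = $i; break }
    print hit
}' sample.tsv)
printf '%s\n' "$out"
```

Starting each line with `hit = "-"` and overwriting it on a match guarantees exactly one output line per input line, which is what the desired output requires.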