我有一个 4 列制表符分隔的文件,最后一列有时有重复项。这是该文件的摘录:
chr7 116038644 116039744 GeneA
chr7 116030947 116032047 GeneA
chr7 115846040 115847140 GeneA
chr7 115824610 115825710 GeneA
chr7 115801509 115802609 GeneA
chr7 115994986 115996086 GeneA
chrX 143933024 143934124 GeneB
chrX 143933119 143934219 GeneB
chrY 143933129 143933229 GeneC
Run Code Online (Sandbox Code Playgroud)
对于该列中的每一组重复项,我想将它们转换为这样的(不真正触及该列中的非重复值):
chr7 116038644 116039744 GeneA-1
chr7 116030947 116032047 GeneA-2
chr7 115846040 115847140 GeneA-3
chr7 115824610 115825710 GeneA-4
chr7 115801509 115802609 GeneA-5
chr7 115994986 115996086 GeneA-6
chrX 143933024 143934124 GeneB-1
chrX 143933119 143934219 GeneB-2
chrY 143933129 143933229 GeneC
Run Code Online (Sandbox Code Playgroud)
我怎样才能用awk
orsed …