I have a file containing the following information:
gene 3025..3855
/gene="Sp34_10000100"
/ID="Sp34_10000100"
CDS join(3025..3106,3722..3855)
/gene="Sp34_10000100"
/codon_start=1
/ID="Sp34_10000100.t1.cds1,Sp34_10000100.t1.cds2"
mRNA 3025..3855
/ID="Sp34_10000100.t1"
/gene="Sp34_10000100"
gene 12640..13470
/gene="Sp34_10000200"
/ID="Sp34_10000200"
CDS join(12640..12721,13337..13470)
/gene="Sp34_10000200"
/codon_start=1
/ID="Sp34_10000200.t1.cds1,Sp34_10000200.t1.cds2"
mRNA 12640..13470
/ID="Sp34_10000200.t1"
/gene="Sp34_10000200"
gene 15959..20678
/gene="Sp34_10000300"
/ID="Sp34_10000300"
CDS join(15959..16080,16268..16367,18913..19116,20469..20524,20582..20678)
/gene="Sp34_10000300"
/codon_start=1
/ID="Sp34_10000300.t1.cds1,Sp34_10000300.t1.cds2,Sp34_10000300.t1.cds3,Sp34_10000300.t1.cds4,Sp34_10000300.t1.cds5"
mRNA 15959..20678
/ID="Sp34_10000300.t1"
/gene="Sp34_10000300"
gene 22255..23085
/gene="Sp34_10000400"
/ID="Sp34_10000400"
I want to remove all of the gene sections, but the CDS and mRNA information should stay. The output should look like this:
CDS join(3025..3106,3722..3855)
/gene="Sp34_10000100"
/codon_start=1
/ID="Sp34_10000100.t1.cds1,Sp34_10000100.t1.cds2"
mRNA 3025..3855
/ID="Sp34_10000100.t1"
/gene="Sp34_10000100"
CDS join(12640..12721,13337..13470)
/gene="Sp34_10000200"
/codon_start=1
/ID="Sp34_10000200.t1.cds1,Sp34_10000200.t1.cds2"
mRNA 12640..13470
/ID="Sp34_10000200.t1"
/gene="Sp34_10000200"
CDS join(15959..16080,16268..16367,18913..19116,20469..20524,20582..20678)
/gene="Sp34_10000300"
/codon_start=1
/ID="Sp34_10000300.t1.cds1,Sp34_10000300.t1.cds2,Sp34_10000300.t1.cds3,Sp34_10000300.t1.cds4,Sp34_10000300.t1.cds5"
mRNA 15959..20678
/ID="Sp34_10000300.t1"
/gene="Sp34_10000300"
Please give me any suggestions on how to do this.
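One possible approach, sketched with awk: treat every line whose first field starts with `/` as a qualifier of the most recent feature line, and drop a feature together with its qualifiers when its key is `gene`. The function name `drop_gene_features` is made up for illustration, and the sketch assumes the layout shown in the sample (one feature key per line, qualifiers on the lines that follow).

```shell
# Hypothetical helper: drop every "gene" feature and its qualifier lines.
drop_gene_features() {
  awk '
    # A qualifier line (first field starts with "/") belongs to the most
    # recent feature line, so it is printed only if that feature was kept.
    $1 ~ /^\// { if (keep) print; next }
    # Any other line starts a new feature; keep it unless its key is "gene".
    { keep = ($1 != "gene"); if (keep) print }
  ' "$@"
}
```

It could be run as `drop_gene_features input.txt > output.txt`, where both file names are placeholders. Because the test is on the first field, leading spaces or tabs before the feature key or the qualifier should not matter.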
I have the following table of information:
ko:K00624
ko:K20215
1.5.3.5
ko:K01106
2.3.41.5
I want output like this:
ko:K00624
ko:K20215
-
ko:K01106
-
I used the following command, but it doesn't work. Please advise.
cat filename | awk '{if($1!~"ko"); print "-") print }' | less
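The command above has a stray `;` right after the `if` condition and an unbalanced `)` after `print "-"`, so awk rejects it with a syntax error. A minimal corrected sketch, assuming that every line to keep starts with `ko:` (the function name `dash_non_ko` is hypothetical):

```shell
# Hypothetical helper: keep lines whose first field starts with "ko:",
# and print "-" in place of every other line.
dash_non_ko() {
  awk '{ if ($1 ~ /^ko:/) print; else print "-" }' "$@"
}
```

Used as `dash_non_ko filename | less`. Note that awk can read the file directly, so the `cat` is not needed, and that empty lines would also be replaced by `-` with this version.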
Can you suggest how to sort uniquely (remove duplicates) within a row or line? I have information like this:
Special c1,c2,c5,c7,c1,c2
Special2 C6
(There is a tab character between Special and c1...).
I want output like this:
Special c1,c2,c5,c7
Special2 C6
How can I do this?
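A sketch of one way to do this in awk, assuming the file is tab-separated with the comma-separated list in the second column, as described. It keeps the first occurrence of each value and preserves the original order, which matches the desired output shown; it does not sort. The function name `uniq_list_field` is made up for illustration.

```shell
# Hypothetical helper: remove duplicate entries from the comma-separated
# list in column 2 of a tab-separated file, keeping first occurrences.
uniq_list_field() {
  awk 'BEGIN { FS = OFS = "\t" }
  NF < 2 { print; next }            # no second column: pass through unchanged
  {
    n = split($2, items, ",")
    split("", seen)                 # portable way to clear the array
    out = ""; k = 0
    for (i = 1; i <= n; i++) {
      if (!(items[i] in seen)) {
        seen[items[i]] = 1
        out = (k++ ? out "," : "") items[i]
      }
    }
    $2 = out
    print
  }' "$@"
}
```

Used as `uniq_list_field file.txt`, where the file name is a placeholder. The comparison is case-sensitive, so `c6` and `C6` would count as different values.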
">16RI1_0 M01230:42:000000000-AWMRD:1:1101:15012:1778 1:N:0:0
TATCCGGATTTACTGGGTGTAAAGGGAGCGTAGGCGGCCATGCAAGTCAGAAGTGAAAAC
">16RA2_1 M01230:42:000000000-AWMRD:1:1101:15923:1780 1:N:0:0
TTGTCCGGATTTATTGGGCGTAAAGCGAGCGCAGGCGGTTTCTTAAGTCTGATGTGAAAGC
">0VC3_7 M01230:42:000000000-AWMRD:1:1101:15805:1805 1:N:0:0
TCATGAAGAACTCCGATCGCGAAGGCAAGTGTCCGGGGTGCAACTGACGCTGAGGCTCGAA
">11VI2_15 M01230:42:000000000-AWMRD:1:1101:17657:1817 1:N:0:0
GCGGCTTACTGGACTGTAACTGACGTTGAGGCTCGAAAGCGTGGGGAGCAAACAGGGCTC
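Hello, I have a file containing this kind of information. I want to print every line that starts with ">" together with the line that follows it, but with one condition: the line starting with ">" must contain the letter V. Please help me.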
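A possible awk sketch for this, assuming each header line is followed by exactly one sequence line. The pattern `/^"?>/` also accepts an optional double quote before the `>`, in case the file really contains one as the sample appears to show. The function name `headers_with_v` is hypothetical.

```shell
# Hypothetical helper: print each header line that contains "V",
# plus the single line that follows it.
headers_with_v() {
  awk '
    # Header line: remember whether it contains V, and print it if so.
    /^"?>/ { show = ($0 ~ /V/); if (show) print; next }
    # First non-header line after a selected header: print it once.
    show   { print; show = 0 }
  ' "$@"
}
```

Used as `headers_with_v reads.fasta`, where the file name is a placeholder. The check looks for an uppercase `V` anywhere on the header line; testing `$1 ~ /V/` instead would restrict it to the sequence ID.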