Man*_*uel 2 sed text-processing bioinformatics
I want to extract every line of a file that matches either of these patterns: "#1:" or "tree length for".
Input:
#1: nexus0002_Pseudomonas_10M
branch t N S dN/dS dN dS N*dN S*dS
6..5 0.000 390.0 195.0 0.0668 0.0000 0.0000 0.0 0.0
6..7 0.013 390.0 195.0 0.0668 0.0008 0.0114 0.3 2.2
7..1 0.000 390.0 195.0 0.0668 0.0000 0.0000 0.0 0.0
7..4 0.000 390.0 195.0 0.0668 0.0000 0.0000 0.0 0.0
6..8 0.000 390.0 195.0 0.0668 0.0000 0.0000 0.0 0.0
8..2 0.013 390.0 195.0 0.0668 0.0008 0.0114 0.3 2.2
8..3 0.013 390.0 195.0 0.0668 0.0008 0.0114 0.3 2.2
tree length for dN: 0.0023
tree length for dS: 0.0341
#1: nexus0003_Pseudomonas_10M
branch t N S dN/dS dN dS N*dN S*dS
6..5 0.000 390.0 195.0 0.0668 0.0000 0.0000 0.0 0.0
6..7 0.013 390.0 195.0 0.0668 0.0008 0.0114 0.3 2.2
7..1 0.000 390.0 195.0 0.0668 0.0000 0.0000 0.0 0.0
7..4 0.000 390.0 195.0 0.0668 0.0000 0.0000 0.0 0.0
6..8 0.000 390.0 195.0 0.0668 0.0000 0.0000 0.0 0.0
8..2 0.013 390.0 195.0 0.0668 0.0008 0.0114 0.3 2.2
8..3 0.013 390.0 195.0 0.0668 0.0008 0.0114 0.3 2.2
tree length for dN: 0.0111
tree length for dS: 0.0444
Output:
#1: nexus0002_Pseudomonas_10M
tree length for dN: 0.0023
tree length for dS: 0.0341
#1: nexus0003_Pseudomonas_10M
tree length for dN: 0.0111
tree length for dS: 0.0444
Is there a simple sed solution?
With grep:
grep -E "^#1:|tree length for" infile.txt
Or with sed:
sed -n '/^#1:/p;/^tree length for/p' infile.txt
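If you prefer a single expression, the two sed patterns can be merged with alternation, and awk can do the same job. This is only a sketch: the function names `extract_sed` and `extract_awk` and the file `sample.txt` are made up for illustration, and `-E` assumes a sed that supports extended regular expressions (GNU and BSD sed both do).

```shell
# Hypothetical helper: print lines starting with "#1:" or "tree length for".
# -n suppresses default output; -E makes | mean alternation.
extract_sed() {
    sed -n -E '/^#1:|^tree length for/p' "$1"
}

# Hypothetical awk equivalent: a pattern with no action prints matching lines.
extract_awk() {
    awk '/^#1:/ || /^tree length for/' "$1"
}

# Small sample file (illustrative name), trimmed from the input in the question.
cat > sample.txt <<'EOF'
#1: nexus0002_Pseudomonas_10M
branch t N S dN/dS dN dS N*dN S*dS
6..5 0.000 390.0 195.0 0.0668 0.0000 0.0000 0.0 0.0
tree length for dN: 0.0023
tree length for dS: 0.0341
EOF

extract_sed sample.txt
extract_awk sample.txt
```

Both patterns are anchored with `^` here, as in the sed command above. The grep command leaves "tree length for" unanchored, so it would also match that phrase in the middle of a line; for the input shown, the results are the same.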