grep with the -A and -B flags prints unexpected characters during fastq analysis

The*_*man 1 bash sequencing fastq

I have a file that looks like this:

@HISEQ:331:C85AMANXX:8:1101:16636:1980 1:N:0:ATCACGAC
NTCTATAAACTCTTCATGCCAGTTCCCTGTCTCATCAGATAGATTCTGAGGCCTCTAGGCATCAGCCGGATATCCCTAAGGACAGTGTTGGAGGAACTGCTGAGTGGATTCATGGTCAACTACCAA
+
#<<BBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFF
@HISEQ:331:C85AMANXX:8:1101:2337:2047 1:N:0:ATCACGAC
CTGTGAAAACTCTTCATGCCAGTTCCCTGTCTCATCAGATAGATTCTGAGGCCTCTAGGCATCAGCCGGATATCCCTAAGGACAGTGTTGGAGGAACTGCTGAGTGGATTCATGGTCAACTACCAA
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFF<FFF<FF<BFFFF<FFFFBFFFBFFFFF<B

I am using the following grep command:

grep -B 1 -A 2 'AGGCATCAGCCGGA' file.fastq | head > out.fastq

The output looks like this. As you can see, lines 5 and 10 of the output contain two dashes, which I do not want:

@HISEQ:331:C85AMANXX:8:1101:16636:1980 1:N:0:ATCACGAC
NTCTATAAACTCTTCATGCCAGTTCCCTGTCTCATCAGATAGATTCTGAGGCCTCTAGGCATCAGCCGGATATCCCTAAGGACAGTGTTGGAGGAACTGCTGAGTGGATTCATGGTCAACTACCAA
+
#<<BBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFF
--
@HISEQ:331:C85AMANXX:8:1101:2337:2047 1:N:0:ATCACGAC
CTGTGAAAACTCTTCATGCCAGTTCCCTGTCTCATCAGATAGATTCTGAGGCCTCTAGGCATCAGCCGGATATCCCTAAGGACAGTGTTGGAGGAACTGCTGAGTGGATTCATGGTCAACTACCAA
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFF<FFF<FF<BFFFF<FFFFBFFFBFFFFF<B
--

Is there a way to get the output without the dashes on lines 5 and 10?

Sam*_*nen 5

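By default, grep separates groups of context lines with the delimiter --. A single group can contain more than one match, so the number of lines per group is not constant; the separator marks where one group ends and the next begins.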

You can add the --no-group-separator option to suppress this (if it is available in your version of grep).
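As a minimal sketch of the above, the script below builds a small toy file and runs the same search with the separator suppressed. The file name demo.fastq and the three made-up reads are assumptions for illustration only. --no-group-separator is a GNU grep option, so the script first checks whether the local grep accepts it and otherwise falls back to filtering out the -- lines afterwards.

```shell
#!/bin/sh
# Toy FASTQ file with made-up reads (illustration only). The middle record
# does not match, so the two hits end up in separate context groups and
# plain grep would print "--" between them.
cat > demo.fastq <<'EOF'
@read1
AAAGGCATCAGCCGGATTT
+
FFFFFFFFFFFFFFFFFFF
@read2
CCCCCCCCCCCCCCCCCCC
+
FFFFFFFFFFFFFFFFFFF
@read3
TTAGGCATCAGCCGGAAAC
+
FFFFFFFFFFFFFFFFFFF
EOF

if grep --no-group-separator -q . demo.fastq 2>/dev/null; then
    # GNU grep: do not print the "--" line between context groups.
    grep --no-group-separator -B 1 -A 2 'AGGCATCAGCCGGA' demo.fastq > out.fastq
else
    # Fallback for a grep without that option: drop lines that are exactly "--".
    # Caveat: this would also drop a genuine data line consisting only of "--".
    grep -B 1 -A 2 'AGGCATCAGCCGGA' demo.fastq | grep -v '^--$' > out.fastq
fi
```

With either branch, out.fastq should contain only the two matching four-line records. Note that the fallback filter is a workaround rather than an equivalent: it cannot tell a separator from a real line that happens to be exactly --.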