How do I join lines that don't start with a specific pattern to the previous line in UNIX?

ins*_*246 4 unix bash shell awk sed

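Please see the sample file and the desired output below to understand what I'm looking for.

It can be done with a loop in a shell script, but I'm struggling to get an awk/sed one-liner.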

SampleFile.txt

These are leaves.
These are branches.
These are greenery which gives
oxygen, provides control over temperature
and maintains cleans the air.
These are tigers
These are bears
and deer and squirrels and other animals.
These are something you want to kill
Which will see you killed in the end.
These are things you must to think to save your tomorrow.

Desired output

These are leaves.
These are branches.
These are greenery which gives oxygen, provides control over temperature and maintains cleans the air.
These are tigers
These are bears and deer and squirrels and other animals.
These are something you want to kill Which will see you killed in the end.
These are things you must to think to save your tomorrow.

GMi*_*ael 6

Please try the following:

awk 'BEGIN {accum_line = "";} /^These/{if(length(accum_line)){print accum_line; accum_line = "";}} {accum_line = (length(accum_line) ? accum_line " " : "") $0;} END {if(length(accum_line)){print accum_line; }}' < data.txt

The code consists of three parts:

  1. The block marked BEGIN is executed before anything else. It is useful for global initialization.
  2. The block marked END is executed once regular processing has finished. It is good for wrapping up: here, it prints the last accumulated line, which no later These line is left to flush.
  3. The rest is the code performed for each line. First, the pattern is tested, and if the line starts with These, the accumulated line is printed and reset. Second, the current line is appended to the accumulator regardless of its contents.
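
The three parts above can also be laid out as a multi-line awk program. The following is only a sketch: the function name join_continuations, the awk variables pat, acc and have, and the choice to pass the start pattern as a parameter are assumptions made for illustration and are not part of the original answer.

```shell
# Hypothetical helper: join every line that does NOT match PATTERN
# onto the previous line, separated by a single space.
# Usage: join_continuations PATTERN [FILE...]
join_continuations() {
    pattern=$1
    shift
    awk -v pat="$pattern" '
        # A line matching the start pattern begins a new record:
        # flush what has been accumulated so far, then start over.
        $0 ~ pat {
            if (have) print acc
            acc = $0
            have = 1
            next
        }
        # Any other line is a continuation: append it after one space.
        {
            acc = have ? acc " " $0 : $0
            have = 1
        }
        # Flush the final record, which nothing else would print.
        END { if (have) print acc }
    ' "$@"
}

# Small demonstration on inline input.
printf '%s\n' 'These are bears' 'and deer' 'These are tigers' |
    join_continuations '^These'
```

With the question's file, this would be called as join_continuations '^These' SampleFile.txt. Note that awk interprets backslash escapes in values passed with -v, so a pattern containing backslashes may need extra escaping.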


Ben*_* W. 5

With sed:

sed ':a;N;/\nThese/!s/\n/ /;ta;P;D' infile

With GNU sed, this produces the desired output shown in the question.

Here is how it works:

sed '
:a                   # Label to jump to
N                    # Append next line to pattern space
/\nThese/!s/\n/ /    # If the newline is NOT followed by "These", append
                     # the line by replacing the newline with a space
ta                   # If we changed something, jump to label
P                    # Print part until newline
D                    # Delete part until newline
' infile

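N;P;D is the idiomatic way of keeping multiple lines in the pattern space. The conditional branch takes care of the case where more than one line has to be appended.

This works with GNU sed; for other seds, such as the one on macOS, the one-liner has to be split up so that the branch and the label are in separate commands, the newlines may have to be escaped, and we need an extra semicolon: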
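
To see the conditional branch at work, the one-liner can be run on a record with two continuation lines. This is a small sketch that assumes GNU sed; the wrapper name join_with_sed is hypothetical and only used for this illustration.

```shell
# Assumes GNU sed; other implementations may need the label and the
# branch split into separate -e expressions.
# Hypothetical wrapper around the one-liner from this answer.
join_with_sed() {
    sed ':a;N;/\nThese/!s/\n/ /;ta;P;D' "$@"
}

# "These are bears" is followed by two continuation lines, so the
# substitute-and-branch loop (s/\n/ / then ta) runs twice before
# P and D are reached.
printf '%s\n' \
    'These are bears' \
    'and deer' \
    'and squirrels.' \
    'These are tigers' |
    join_with_sed
```

With GNU sed this should print two lines: "These are bears and deer and squirrels." followed by "These are tigers". The last line is printed because GNU sed outputs the pattern space when N finds no further input.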

sed -e ':a' -e 'N;/'$'\n''These/!s/'$'\n''/ /;ta' -e 'P;D;' infile

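The last command is untested; see this answer for the differences between sed implementations and how to handle them.

Another option is to enter the newlines literally: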

sed -e ':a' -e 'N;/\
These/!s/\
/ /;ta' -e 'P;D;' infile

But then it is, by definition, no longer a one-liner.