Print only the first word of each paragraph using sed

use*_*099 5 sed

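I would like to know how to print the first word of each paragraph with a sed one-liner. In this case, a paragraph is defined as text that follows 2 newlines.

For example: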

This is a paragraph with some text. Some random text that is not really important.

This is another paragraph with some text.
However this sentence is still in the same paragraph.

This should be transformed into:

This

This

Ste*_*eve 7

Think "paragraph mode":

By a special dispensation, an empty string as the value of RS indicates that 
records are separated by one or more blank lines. 

awk and perl both support "paragraph mode", which makes them better choices than sed here:

awk '{ print $1 }' RS= ORS="\n\n" file

or

perl -00 -lane 'print $F[0]' file

Results:

This

This
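To see the "one or more blank lines" rule from the quote in action, here is a small sketch. The function name `first_words_awk` and the sample input are made up for illustration. It sets `RS` in a `BEGIN` block instead of on the command line and keeps the default `ORS`, so it prints one word per line rather than leaving a blank line between them.

```shell
# Hypothetical helper: print the first word of every paragraph.
# RS = "" switches awk into paragraph mode, so any run of empty
# lines counts as a single record separator.
first_words_awk() {
  awk 'BEGIN { RS = "" } { print $1 }' "$@"
}

# Two paragraphs separated by three empty lines.
printf 'one two\n\n\n\nthree four\nfive\n' | first_words_awk
# prints:
# one
# three
```

In paragraph mode a newline also acts as a field separator, so `$1` is still the first word even when the paragraph spans several lines.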


Lev*_*sky 1

A possible GNU sed solution is:

sed -rn ':a;/^ *$/{n;ba};s/( |$).*//p;:b;n;/^ *$/ba;bb'

Output:

This
This

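It treats lines containing only spaces as empty lines, and handles any number of empty lines between paragraphs. It also handles single-word paragraphs correctly.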
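For readers who find the one-liner hard to follow, the same script can be spread over several lines with comments. This is only a sketch of how I read it, assuming GNU sed. The wrapper name `first_words_sed` and the sample input are made up for illustration.

```shell
# Hypothetical wrapper around the one-liner above (GNU sed assumed).
first_words_sed() {
  sed -rn '
    :a
    # Skip blank (space-only) lines until a paragraph starts.
    /^ *$/ {
      n
      ba
    }
    # First line of a paragraph: delete from the first space (or from
    # the end of the line, for a single word) onward, then print.
    s/( |$).*//p
    :b
    # Read the following lines of the paragraph without printing them.
    n
    # A blank line ends the paragraph: go back and look for the next one.
    /^ *$/ ba
    bb
  ' "$@"
}

# The paragraphs are separated by a space-only line and an empty line.
printf 'This is one.\n  \n\nSolo\n' | first_words_sed
# prints:
# This
# Solo
```

The `$` alternative in `s/( |$).*//p` is what makes the substitution succeed, and therefore print, when the first line of a paragraph contains no space at all.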