Thi*_*men 2 grep text-processing
How can I print every bigram of a line from the Unix terminal? Punctuation marks count as "words".
For example, given the following input:
This is ! line .
This is ! second line .
If I search for every bigram, my output should look like this:
This is
is !
! line
line .
This is
is !
! second
second line
line .
If I search for every trigram, my output should look like this:
This is !
is ! line
! line .
This is !
is ! second
! second line
second line .
The command
grep -Eio '[a-z!.]+ [a-z!.]+'
returns
This is
! line
This is
! second
line .
This is close, but not quite what I need.
You can use perl like this.
Bigrams:
perl -lne 'while(/(\S+\s+\S*){1}/){print $&;s/\S+\s+//}' file
This is
is !
! line
line .
This is
is !
! second
second line
line .
Trigrams:
perl -lne 'while(/(\S+\s+\S*){2}/){print $&;s/\S+\s+//}' file
This is !
is ! line
! line .
This is !
is ! second
! second line
second line .
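Change the number in the braces to the number of words you want per line, minus one. For example, {3} prints four-word sequences.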
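
As an alternative sketch, not part of the perl answer above, the same sliding window can be written with awk, which splits each line into whitespace-separated fields. The function name `ngrams` is a hypothetical helper chosen for illustration, and the sketch assumes words are separated by spaces or tabs:

```shell
# ngrams N: read lines on stdin and print every run of N consecutive words.
# "ngrams" is a hypothetical helper name, not a standard utility.
ngrams() {
  awk -v n="$1" '{
    # i is the index of the first word of the current window
    for (i = 1; i + n - 1 <= NF; i++) {
      out = $i
      # append the remaining n-1 words of the window
      for (j = 1; j < n; j++) out = out " " $(i + j)
      print out
    }
  }'
}

printf '%s\n' 'This is ! line .' 'This is ! second line .' | ngrams 2
```

Here the window size is passed directly, so `ngrams 3` would print trigrams. Lines with fewer than N words produce no output.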