Mar*_*nes 2 regex bash awk grep sed
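I am trying to write a script to help me with a linguistics experiment. The experiment shows text phrases to subjects, who need to read each phrase word by word. For example, suppose I have the following phrase: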
The girl was upset with her boyfriend.
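I need to split this phrase into small parts so that only those parts are shown to the subjects taking part in the experiment. The software that displays the phrases to the subjects takes the following input: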
The ---- --- ----- ---- --- ----------
--- girl --- ----- ---- --- ----------
--- ---- was ----- ---- --- ----------
--- ---- --- upset ---- --- ----------
--- ---- --- ----- with --- ----------
--- ---- --- ----- ---- her ----------
--- ---- --- ----- ---- --- boyfriend.
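Note that the complete phrase is never part of the input. I need to feed the small parts to the software so that it can display the phrase on the computer screen. In addition, every word that does not appear on screen must be replaced with dashes, with as many dashes as the original word has characters.
I was thinking of using one of the bash tools, such as sed, grep, or awk, to solve my problem. For example, I could write the original phrase as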
The | girl | was | upset | with | her | boyfriend.
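and then copy it seven times, using dashes in each copy for the words I don't need. Note that the words always sit between two "|" characters, which makes them easy to identify.
(In fact, sometimes I need to replace more than a single word. For example, I might replace "The girl" in one go.)
Any ideas on how to achieve this?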
See if this awk one-liner helps:
awk '{for(i=1;i<=NF;i++){t=$0;w=$i;gsub(/\S/,"-");$i=w;print;$0=t}}' file
Tested with your example:
kent$ cat f
The girl was upset with her boyfriend.
Yes @Kent, you are right. – grandeabobora 6 mins ago
kent$ awk '{for(i=1;i<=NF;i++){t=$0;w=$i;gsub(/\S/,"-");$i=w;print;$0=t}}' f
The ---- --- ----- ---- --- ----------
--- girl --- ----- ---- --- ----------
--- ---- was ----- ---- --- ----------
--- ---- --- upset ---- --- ----------
--- ---- --- ----- with --- ----------
--- ---- --- ----- ---- her ----------
--- ---- --- ----- ---- --- boyfriend.
Yes ------ --- --- ------ - ------------- - ---- ---
--- @Kent, --- --- ------ - ------------- - ---- ---
--- ------ you --- ------ - ------------- - ---- ---
--- ------ --- are ------ - ------------- - ---- ---
--- ------ --- --- right. - ------------- - ---- ---
--- ------ --- --- ------ – ------------- - ---- ---
--- ------ --- --- ------ - grandeabobora - ---- ---
--- ------ --- --- ------ - ------------- 6 ---- ---
--- ------ --- --- ------ - ------------- - mins ---
--- ------ --- --- ------ - ------------- - ---- ago
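The question also mentions units longer than one word, such as "The girl". What follows is a minimal sketch of one way to handle that, not part of the answer above. It assumes the units are separated by "|" as in the question's pipe-delimited example, so a multi-word unit is simply written without a "|" inside it. The function name mask_chunks is hypothetical. The sketch uses the POSIX class [^[:space:]] instead of \S, which is a GNU awk extension, so it should also run under other awk implementations.

```shell
#!/bin/sh
# mask_chunks (hypothetical name): read "|"-delimited phrases and print one
# line per chunk, with every other chunk replaced by dashes of equal length.
mask_chunks() {
  # The field separator is a "|" with any surrounding spaces, so a chunk
  # such as "The girl" stays together as a single field.
  awk -F' *[|] *' '
  {
    for (i = 1; i <= NF; i++) {        # i = the chunk left visible
      line = ""
      for (j = 1; j <= NF; j++) {
        chunk = $j                     # work on a copy, $0 stays intact
        if (j != i)
          gsub(/[^[:space:]]/, "-", chunk)   # dash out non-space characters
        line = line (j > 1 ? " " : "") chunk
      }
      print line
    }
  }' "$@"
}

# Example: "The girl" is treated as one unit.
printf '%s\n' 'The girl | was | upset' | mask_chunks
# The girl --- -----
# --- ---- was -----
# --- ---- --- upset
```

Because the masking is done on a copy of each field, spaces inside a multi-word chunk are preserved and the record never has to be saved and restored. Whether one dash is produced per character or per byte for non-ASCII text depends on the awk implementation and the locale.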