Chr*_*ris 2 shell text-processing regular-expression
Our input looks like this:
2012-04-17 [GBPGBP]
2012-04-13 [GBP GBP]
2012-04-13 [GBP]
2012-04-11 [GBPGBP]
2012-04-11 [GBP GBP]
2012-04-10 [GBPGBP]
2012-04-06 [GBP GBP GBP]
2012-04-17 [GBPGBP]
2012-04-13 [GBP CDN]
2012-04-13 [GBP]
2012-04-11 [GBPCDN]
2012-04-11 [GBP DL DL]
2012-04-10 [PSGBP]
2012-04-06 [PS PS]
We would like to get output like this:
2012-04-17 [GBP]
2012-04-13 [GBP]
2012-04-13 [GBP]
2012-04-11 [GBP]
2012-04-11 [GBP]
2012-04-10 [GBP]
2012-04-06 [GBP]
2012-04-17 [GBP]
2012-04-13 [GBP CDN]
2012-04-13 [GBP]
2012-04-11 [GBPCDN]
2012-04-11 [GBP DL]
2012-04-10 [PSGBP]
2012-04-06 [PS]
Basically, remove any duplicate strings inside the brackets. Any suggestions?
sed -e ': a' -e 's/\(\[[^][]*\)\([A-Z][A-Z][A-Z]*\)\([^][]*\)\2/\1\2\3/' -e 't a'
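: a
sets a label named a at the start of the script.
s/\(wibble\)\(foo\)\(bar\)\2/\1\2\3/
replaces wibblefoobarfoo with wibblefoobar, because the backreference \2 must match the same text that the second group captured.
[A-Z][A-Z][A-Z]*
matches two or more uppercase letters.
t a
branches back to the label a if the preceding s command made a substitution, so the substitution is retried until it no longer matches.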
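The pieces described above can be tried in isolation on small inputs. The following is only an illustrative sketch: the inputs (wibblefoobarfoo, foofoofoo) and the variable names single, once and looped are made up for the demonstration and are not part of the original data. It shows the backreference substitution on its own, then the difference between running a substitution once and retrying it through the label and t.

```shell
# A backreference (\2) only matches a repeat of what group 2 captured.
single=$(printf '%s\n' 'wibblefoobarfoo' | sed 's/\(wibble\)\(foo\)\(bar\)\2/\1\2\3/')
printf '%s\n' "$single"   # prints wibblefoobar

# Without a loop, a single s command collapses only one repeat.
once=$(printf '%s\n' 'foofoofoo' | sed 's/\(foo\)\1/\1/')
printf '%s\n' "$once"     # prints foofoo

# With the label and t, sed retries the substitution until it stops matching.
looped=$(printf '%s\n' 'foofoofoo' | sed -e ': a' -e 's/\(foo\)\1/\1/' -e 't a')
printf '%s\n' "$looped"   # prints foo
```

The loop is what lets a line with several repeats be cleaned in one sed invocation; each pass removes one repeat and t a sends control back for another attempt.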