Remove duplicate words between brackets inline

Question

Chr*_*ris 2 shell text-processing regular-expression

Our input looks like this:

2012-04-17  [GBPGBP]
2012-04-13  [GBP GBP]
2012-04-13  [GBP]
2012-04-11  [GBPGBP]
2012-04-11  [GBP GBP]
2012-04-10  [GBPGBP]
2012-04-06  [GBP GBP GBP]
2012-04-17  [GBPGBP]
2012-04-13  [GBP CDN]
2012-04-13  [GBP]
2012-04-11  [GBPCDN]
2012-04-11  [GBP DL DL]
2012-04-10  [PSGBP]
2012-04-06  [PS PS]

We would like to get output like this:

2012-04-17  [GBP]
2012-04-13  [GBP]
2012-04-13  [GBP]
2012-04-11  [GBP]
2012-04-11  [GBP]
2012-04-10  [GBP]
2012-04-06  [GBP]
2012-04-17  [GBP]
2012-04-13  [GBP CDN]
2012-04-13  [GBP]
2012-04-11  [GBPCDN]
2012-04-11  [GBP DL]
2012-04-10  [PSGBP]
2012-04-06  [PS]

Basically, remove any string that is repeated inside the brackets. Any suggestions?

Answer 1

Gil*_*il' 5

sed -e ': a' -e 's/\(\[[^][]*\)\([A-Z][A-Z][A-Z]*\)\([^][]*\)\2/\1\2\3/' -e 't a'

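: a sets a label named a at the start of the script.
s/\(wibble\)\(foo\)\(bar\)\2/\1\2\3/ replaces wibblefoobarfoo with wibblefoobar: \2 matches a second copy of whatever the second group captured, and the replacement leaves that copy out.
[A-Z][A-Z][A-Z]* matches two or more uppercase letters.
t a loops back to the label a if the preceding s command made a substitution, so the substitution is repeated until no duplicate is left.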
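The command above lets the repeated string start at any position inside the brackets and keeps whatever sits between the two copies, so depending on how the regex engine splits the match it may not reduce [GBP GBP] to exactly [GBP]. As a rough sketch of a stricter variant (not the answer's own command), the same label-and-loop idea can be limited to a token that starts right after "[" or a space and is immediately followed by a copy of itself, with any spaces between the two copies dropped. The function name dedupe_brackets is hypothetical, and the variant assumes duplicates are always adjacent, as they are in the sample input; it would leave [GBP CDN GBP] unchanged.

```shell
# Hypothetical helper (not from the answer): collapse adjacent repeats of a
# token of two or more capital letters inside [...].
#   \1 = "[" plus an optional prefix ending in a space (\2 is the inner group)
#   \3 = the token, which must be followed by optional spaces and a copy of itself
#   \4 = the space or "]" that ends the second copy
dedupe_brackets() {
  sed -e ':a' \
      -e 's/\(\[\([^][]* \)\{0,1\}\)\([A-Z][A-Z][A-Z]*\) *\3\([] ]\)/\1\3\4/' \
      -e 'ta'
}

# Try it on a few lines from the question.
printf '%s\n' \
  '2012-04-17  [GBPGBP]' \
  '2012-04-06  [GBP GBP GBP]' \
  '2012-04-11  [GBP DL DL]' \
  '2012-04-11  [GBPCDN]' | dedupe_brackets
```

Each pass of the loop removes one duplicate, so [GBP GBP GBP] takes two passes before t a stops branching. Anchoring the token to "[" or a space is what keeps the expression from pairing up fragments such as the BP in GBP.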
