How can I remove repeated letters with sed?

Question

Using sed, how can I remove the doubled letters from the HEADERS in a text file?

NNAAMMEE
       nice - run a program with modified scheduling priority

SSYYNNOOPPSSIISS
       nice     [-n    adjustment]    [-adjustment]    [--adjustment=adjustment] [command [a$

The above is an example. I would like the output after processing with sed to be:

NAME
       nice - run a program with modified scheduling priority

SYNOPSIS
       nice     [-n    adjustment]    [-adjustment]    [--adjustment=adjustment] [command [a$

Answer 1

slm*_*slm 12

Method #1

You can do this with the following sed command:

$ sed 's/\([A-Za-z]\)\1\+/\1/g' file.txt

Example

Using your sample input above, I created a file named sample.txt.

$ sed 's/\([A-Za-z]\)\1\+/\1/g' sample.txt 
NAME
       nice - run a program with modified scheduling priority

       SYNOPSIS
              nice     [-n    adjustment]    [-adjustment] [--adjustment=adjustment] [command [a$
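
The \+ quantifier in the command above is a GNU extension to basic regular expressions, so it may not work in every sed. As a hedged sketch (not part of the original answer), the same "one or more repeats" can be written with the POSIX interval \{1,\}. The function name dedupe_letters is made up for illustration, and the input is piped in with printf instead of being read from sample.txt:

```shell
# Collapse a run of two or more identical ASCII letters into one,
# using the POSIX interval \{1,\} in place of the GNU-only \+.
dedupe_letters() {
    sed 's/\([A-Za-z]\)\1\{1,\}/\1/g'
}

printf '%s\n' 'NNAAMMEE' 'SSYYNNOOPPSSIISS' | dedupe_letters
```

Note that any pattern based on [A-Za-z] also collapses letters that are legitimately doubled in ordinary words, so it is safest on text where only the headers contain doubled letters.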

Method #2

There is also this variant, which collapses every doubled character, not just letters:

$ sed 's/\(.\)\1/\1/g' file.txt

Example

$ sed 's/\(.\)\1/\1/g' sample.txt 
NAME
    nice - run a program with modified scheduling priority

    SYNOPSIS
       nice   [-n  adjustment]  [-adjustment] [-adjustment=adjustment] [command [a$
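
To make the behaviour of this pattern easier to see, here is a small sketch on inline input. The function name dedupe_pairs is hypothetical. Because the pattern has no repetition operator, it works on adjacent pairs: each pair of identical characters becomes one character, whatever the character is, including spaces and hyphens.

```shell
# Collapse each adjacent pair of identical characters (any character)
# into a single character.
dedupe_pairs() {
    sed 's/\(.\)\1/\1/g'
}

printf '%s\n' 'NNAAMMEE' | dedupe_pairs                  # NAME
printf '%s\n' '--adjustment  [command]' | dedupe_pairs   # -adjustment [comand]
printf '%s\n' 'AAAA' | dedupe_pairs                      # AA (two pairs, each halved)
```

This is why the whitespace and the leading "--" in the output above shrink as well, and why a doubled letter inside a normal word is affected too.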

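Method #3 (uppercase only)

The OP asked whether this could be modified so that only doubled uppercase characters are removed. Here is how to do it with a modified version of Method #1.

Example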

$ sed 's/\([A-Z]\)\1\+/\1/g' sample.txt 
NAME
       nice - run a program with modified scheduling priority

       SYNOPSIS
              nice     [-n    adjustment]    [-adjustment] [--adjustment=adjustment] [command [a$
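
Since the question is about the headers specifically, one possible refinement (an assumption on my part, not something from the original answer) is to put a sed address in front of the substitution so that it only runs on lines that look like headers. The sketch below assumes a header is any line that starts with an uppercase letter in the first column, as in the question's sample; dedupe_headers is a made-up name.

```shell
# Apply the uppercase-only substitution just to lines that begin
# with an uppercase letter; all other lines pass through untouched.
dedupe_headers() {
    sed '/^[A-Z]/ s/\([A-Z]\)\1/\1/g'
}

printf '%s\n' 'NNAAMMEE' '       nice - run AA program' | dedupe_headers
```

With this address, doubled uppercase letters in indented body lines are left alone. If the headers in your file are indented, the address would need to be adjusted.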

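Details of the methods above

All of the examples rely on the same technique: the first time a character from the character set (A-Z or a-z in Method #1) is encountered, its value is saved. Wrapping part of the pattern in escaped parentheses, \( and \), tells sed to save whatever that part matched for later use. The saved value can then be referenced immediately, in the pattern itself, or later, in the replacement. These references are named \1, \2, and so on.

So the trick is to match the first letter.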

\([A-Za-z]\)

We then turn around and use the value we just saved as the second character, which must come immediately after the first one:

\([A-Za-z]\)\1

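We also make use of sed's search-and-replace facility, s/../../g. The g flag means the substitution is applied globally, that is, to every match on a line instead of only the first.

So whenever we encounter a letter immediately followed by a copy of itself, we replace the pair (or, with \+, the whole run) with a single instance of that letter.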
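
To see what the capture group and the back-reference actually match, it can help to mark each match instead of deleting it. The following is only an illustrative sketch: the function name mark_pairs and the angle-bracket markers are made up for the demonstration.

```shell
# Wrap the captured letter of every doubled pair in <...> so that the
# matches of \([A-Za-z]\)\1 are visible in the output.
mark_pairs() {
    sed 's/\([A-Za-z]\)\1/<\1>/g'
}

printf '%s\n' 'aab abb' | mark_pairs    # <a>b a<b>
```

Each <x> in the output stands for one place where a letter was followed by a copy of itself. Replacing with plain \1 instead of <\1> gives the de-duplicating commands shown above.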

Archived:	12 years, 2 months ago
Views:	19612
Last recorded:	7 years, 8 months ago