Remove duplicate adjacent lines from a file

Question


Suppose we have a file like this:

foo1
bar
foo2
foo2
bar
bar
bar
foo3


I would like it reduced to:

foo1
bar
foo2
bar
foo3


Basically, duplicates should be removed only when they are adjacent. I started writing a bash function, but realized I don't know how to do this:

remove_duplicate_adjacent_lines(){
   prev=''
   # IFS= and -r keep leading whitespace and backslashes intact
   while IFS= read -r line; do
     if test "$line" != "$prev"; then
        prev="$line"
        printf '%s\n' "$line"
     fi
   done
}


But the problem is that prev is not in scope inside the while loop. Is there some way to do this with bash?

Answer 1

Joh*_*024 6

This is exactly what the uniq utility is for:

$ uniq <File
foo1
bar
foo2
bar
foo3


A good example might be the bash history:

history | uniq


The above will not work, because history prefixes every line with a different number, but this will:

cat ~/.bash_history | uniq


This removes duplicate adjacent commands.
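If you would rather keep the numbers, uniq can be told to skip leading fields when comparing lines. A minimal sketch, assuming each line looks like `  NUMBER  COMMAND`; `numbered_history` is a hypothetical stand-in for `history`, whose real output may carry extra fields (for example when HISTTIMEFORMAT is set):

```shell
# Hypothetical stand-in for `history`: a line number followed by the command.
numbered_history() {
  printf '%s\n' '  101  ls' '  102  ls' '  103  cd /tmp' '  104  ls'
}

# -f 1 makes uniq skip the first field (the number) when comparing lines,
# so lines 101 and 102 count as adjacent duplicates and only 101 is kept.
numbered_history | uniq -f 1
```

With real history output the same idea would be `history | uniq -f 1`, provided the number is the only field in front of the command.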

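From man uniq:

Filter adjacent matching lines from INPUT (or standard input), writing to OUTPUT (or standard output). With no options, matching lines are merged to the first occurrence. [Emphasis added]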
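As for the scope question: a variable assigned inside a while loop is still visible in the rest of the function when the loop reads the function's own standard input. It only seems to vanish when the loop itself sits on the right-hand side of a pipeline, because bash then runs it in a subshell. A minimal pure-bash sketch of the function, assuming input arrives on standard input; the `first` flag is an addition so that an empty first line is not swallowed:

```shell
remove_duplicate_adjacent_lines() {
  local prev line first=1
  # IFS= and -r keep leading whitespace and backslashes intact;
  # the || test also handles a final line that lacks a trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    # Print the very first line, and any line that differs from the previous one.
    if [ "$first" -eq 1 ] || [ "$line" != "$prev" ]; then
      printf '%s\n' "$line"
    fi
    prev=$line
    first=0
  done
}

# Usage with the sample data from the question:
printf '%s\n' foo1 bar foo2 foo2 bar bar bar foo3 | remove_duplicate_adjacent_lines
```

For anything beyond an exercise, uniq is still the simpler tool.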
