How do I filter lines that have 2 similar values?

vv0*_*dyk 5 command-line text-processing

I want to filter the lines that have the form same number -> same number

from this text:

    [325194/777232]/var/cache/apt/srcpkgcache.bin:  100%  extents: 5 -> 1   [ OK ]
    [325195/777232]/var/cache/apt/pkgcache.bin: 100%  extents: 4 -> 1   [ OK ]
    [325255/777232]/var/cache/man/de/index.db:  100%  extents: 2 -> 2   [ OK ]
    [325521/777232]/var/log/syslog: 100%  extents: 7 -> 1   [ OK ]
    [325525/777232]/var/log/lastlog:    100%  extents: 2 -> 2   [ OK ]
    [325531/777232]/var/log/syslog.1:   100%  extents: 5 -> 1   [ OK ]
    [325572/777232]/var/log/kern.log:   100%  extents: 6 -> 1   [ OK ]
    [325589/777232]/var/log/auth.log:   100%  extents: 3 -> 1   [ OK ]
    [325621/777232]/var/log/faillog:    100%  extents: 2 -> 2   [ OK ]
    [325625/777232]/var/log/wtmp:   100%  extents: 3 -> 1   [ OK ]
    [325627/777232]/var/log/kern.log.1: 100%  extents: 2 -> 1   [ OK ]
    [325644/777232]/var/log/cups/access_log.1:  100%  extents: 2 -> 1   [ OK ]
    [325810/777232]/var/log/auth.log.1: 100%  extents: 2 -> 1   [ OK ]

hee*_*ayl 10

To get the lines that contain the same_number -> same_number pattern:

grep -E '([[:digit:]]+)[[:blank:]]+->[[:blank:]]+\1[[:blank:]]'
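  • -E enables ERE (Extended Regular Expressions)

  • ([[:digit:]]+) matches one or more digits and stores them in capture group 1

  • [[:blank:]]+ matches one or more horizontal whitespace characters (spaces or tabs)

  • -> matches literally

  • \1 refers back to whatever the first capture group matched

  • the final [[:blank:]] matches a single space or tab after the second number, so 2 -> 21 is not treated as a match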
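
The pieces above can be tried on a few hand-made lines. This is only a sketch: the three sample lines and the `strict` variant are assumptions for illustration, not part of the original answer, and it assumes GNU grep (backreferences in ERE are an extension). It shows that the trailing [[:blank:]] rejects 2 -> 21, and that nothing anchors the left side of the first number, so 12 -> 2 still matches unless a non-digit is required before it.

```shell
# Pattern from the answer.
original='([[:digit:]]+)[[:blank:]]+->[[:blank:]]+\1[[:blank:]]'

# Hypothetical stricter variant: require start of line or a non-digit
# before the first number. The number is now capture group 2.
strict='(^|[^[:digit:]])([[:digit:]]+)[[:blank:]]+->[[:blank:]]+\2[[:blank:]]'

# Made-up test lines (a, b, c are labels, not real file names).
sample() {
  printf '%s\n' \
    'a: 100%  extents: 2 -> 2   [ OK ]' \
    'b: 100%  extents: 2 -> 21  [ OK ]' \
    'c: 100%  extents: 12 -> 2  [ OK ]'
}

sample | grep -E "$original"   # prints lines a and c
sample | grep -E "$strict"     # prints line a only
```

With the data in the question both patterns give the same result, because every count there is a single digit.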

You can use similar logic with other popular text-processing tools/languages (such as sed, awk, perl).
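
As a rough sketch of what that could look like (these commands are not from the original answer; the function names and the two-line `input` are made up for illustration, and GNU sed is assumed for `-E` with a backreference): sed and perl can reuse the backreference directly, while awk regular expressions have no backreferences, so the awk version compares the fields on either side of `->` instead.

```shell
# Two lines taken from the question, used as sample input.
input='[325255/777232]/var/cache/man/de/index.db:  100%  extents: 2 -> 2   [ OK ]
[325521/777232]/var/log/syslog: 100%  extents: 7 -> 1   [ OK ]'

# sed: -n suppresses automatic printing, p prints only matching lines.
match_sed() {
  sed -nE '/([[:digit:]]+)[[:blank:]]+->[[:blank:]]+\1[[:blank:]]/p'
}

# perl: same idea, \1 refers to the first capture group.
match_perl() {
  perl -ne 'print if /(\d+)[ \t]+->[ \t]+\1[ \t]/'
}

# awk: find the "->" field and compare its neighbours as strings.
match_awk() {
  awk '{
    for (i = 2; i < NF; i++)
      if ($i == "->" && $(i-1) ~ /^[0-9]+$/ && ($(i-1) "") == ($(i+1) "")) {
        print
        break
      }
  }'
}

printf '%s\n' "$input" | match_sed    # prints the index.db line
printf '%s\n' "$input" | match_perl   # prints the index.db line
printf '%s\n' "$input" | match_awk    # prints the index.db line
```

The awk version relies on whitespace-separated fields, so it only works when the numbers and the arrow are separated by spaces or tabs, as they are in the question.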

To get the lines that do not contain the pattern, just add the -v option:

grep -vE '([[:digit:]]+)[[:blank:]]+->[[:blank:]]+\1[[:blank:]]'

Example:

% cat file.txt
[325194/777232]/var/cache/apt/srcpkgcache.bin:  100%  extents: 5 -> 1   [ OK ]
[325195/777232]/var/cache/apt/pkgcache.bin: 100%  extents: 4 -> 1   [ OK ]
[325255/777232]/var/cache/man/de/index.db:  100%  extents: 2 -> 2   [ OK ]
[325521/777232]/var/log/syslog: 100%  extents: 7 -> 1   [ OK ]
[325525/777232]/var/log/lastlog:    100%  extents: 2 -> 2   [ OK ]
[325531/777232]/var/log/syslog.1:   100%  extents: 5 -> 1   [ OK ]
[325572/777232]/var/log/kern.log:   100%  extents: 6 -> 1   [ OK ]
[325589/777232]/var/log/auth.log:   100%  extents: 3 -> 1   [ OK ]
[325621/777232]/var/log/faillog:    100%  extents: 2 -> 2   [ OK ]
[325625/777232]/var/log/wtmp:   100%  extents: 3 -> 1   [ OK ]
[325627/777232]/var/log/kern.log.1: 100%  extents: 2 -> 1   [ OK ]
[325644/777232]/var/log/cups/access_log.1:  100%  extents: 2 -> 1   [ OK ]
[325810/777232]/var/log/auth.log.1: 100%  extents: 2 -> 1   [ OK ]

% grep -E '([[:digit:]]+)[[:blank:]]+->[[:blank:]]+\1[[:blank:]]' file.txt
[325255/777232]/var/cache/man/de/index.db:  100%  extents: 2 -> 2   [ OK ]
[325525/777232]/var/log/lastlog:    100%  extents: 2 -> 2   [ OK ]
[325621/777232]/var/log/faillog:    100%  extents: 2 -> 2   [ OK ]

% grep -vE '([[:digit:]]+)[[:blank:]]+->[[:blank:]]+\1[[:blank:]]' file.txt
[325194/777232]/var/cache/apt/srcpkgcache.bin:  100%  extents: 5 -> 1   [ OK ]
[325195/777232]/var/cache/apt/pkgcache.bin: 100%  extents: 4 -> 1   [ OK ]
[325521/777232]/var/log/syslog: 100%  extents: 7 -> 1   [ OK ]
[325531/777232]/var/log/syslog.1:   100%  extents: 5 -> 1   [ OK ]
[325572/777232]/var/log/kern.log:   100%  extents: 6 -> 1   [ OK ]
[325589/777232]/var/log/auth.log:   100%  extents: 3 -> 1   [ OK ]
[325625/777232]/var/log/wtmp:   100%  extents: 3 -> 1   [ OK ]
[325627/777232]/var/log/kern.log.1: 100%  extents: 2 -> 1   [ OK ]
[325644/777232]/var/log/cups/access_log.1:  100%  extents: 2 -> 1   [ OK ]
[325810/777232]/var/log/auth.log.1: 100%  extents: 2 -> 1   [ OK ]