Find the second-to-last occurrence of a string, searching from the end of a file

Pi4*_*All 7 sed awk text-processing

I have a log file that looks like this:

Another thousand lines above this
I 10/03/15 12:04AM 42 [Important] 4th to last
I 10/03/15 04:31AM 42 (534642712) [1974,2106,258605080,0,0,32817,30711]
I 10/03/15 04:33AM 42 (2966927) [91,0,2966927,0,0,291,291]
I 10/03/15 04:52AM 42 (3026559) [93,0,3026559,0,0,314,314]
I 10/03/15 04:55AM 42 (3065494) [94,0,3065494,0,0,301,301]
I 10/03/15 05:04AM 42 [Important] 3rd to last
I 10/04/15 12:04AM 42 [Important] 2nd to last occurence
I 10/04/15 04:31AM 42  (7,30711]55
I 10/04/15 04:33AM 42 dfsadfs,0,0,291,291]
I 10/04/15 04:52AM 42 (30,0,314,314]
I 10/04/15 04:55AM 42 (30,301]
I 10/04/15 05:04AM 42 [Important] - last occurence

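The only pattern that stays constant throughout the file is [Important]. Everything else changes, including the number of lines between occurrences of [Important].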

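I am trying to start from the end of the file, skip the last occurrence, find the second-to-last one, and then extract the rest of the file from that point into another file.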

This is what I have been trying with tac, but I cannot get it to pick out the second-to-last occurrence. My attempt:

<logfile tac | sed '/Important/q' | tac >  output_file

The output should look like this:

I 10/04/15 12:04AM 42 [Important] 2nd to last occurence
I 10/04/15 04:31AM 42  (7,30711]55
I 10/04/15 04:33AM 42 dfsadfs,0,0,291,291]
I 10/04/15 04:52AM 42 (30,0,314,314]
I 10/04/15 04:55AM 42 (30,301]
I 10/04/15 05:04AM 42 [Important] - last occurence

小智 6

Find all lines containing "Important", take the last two, extract their line numbers, and print that range:

sed -n `grep -n Important log | tail -n 2 | cut -d : -f 1 | tr '\n' ',' | sed -e 's#,$#p#'` log

Output, as expected:

I 10/04/15 12:04AM 42 [Important] 2nd to last occurence
I 10/04/15 04:31AM 42  (7,30711]55
I 10/04/15 04:33AM 42 dfsadfs,0,0,291,291]
I 10/04/15 04:52AM 42 (30,0,314,314]
I 10/04/15 04:55AM 42 (30,301]
I 10/04/15 05:04AM 42 [Important] - last occurence
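
To see how the range expression is assembled, the pipeline can be run one stage at a time. This is a minimal sketch that assumes a small stand-in file named sample.log (hypothetical, not the log from the question):

```shell
# Hypothetical sample file standing in for the real log.
printf '%s\n' \
  'a [Important] first' \
  'b' \
  'c [Important] second' \
  'd' \
  'e [Important] third' > sample.log

# Stage 1: line numbers of the last two matching lines, one per line.
nums=$(grep -n Important sample.log | tail -n 2 | cut -d : -f 1)

# Stage 2: join them with commas, then turn the trailing comma into sed's p command.
range=$(echo "$nums" | tr '\n' ',' | sed -e 's#,$#p#')
echo "$range"    # prints 3,5p

# Stage 3: print only that line range.
result=$(sed -n "$range" sample.log)
echo "$result"
```

The last stage prints lines 3 through 5 of the sample, that is, from the second-to-last match through the last match.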

As a script:

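#!/bin/bash
# Line numbers of the last two lines containing "Important", one per line
lines=$(grep -n Important log | tail -n 2 | cut -d : -f 1)
# Join with commas and turn the trailing comma into sed's p command, e.g. "14,19p"
range=$(echo "${lines}" | tr '\n' ',' | sed -e 's#,$#p#')
sed -n "${range}" log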
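
One edge case is worth guarding against. With a single match the range becomes a single address, so only that one line is printed. With no match at all it collapses to a bare p, and sed prints the whole file. The following is a sketch of the same approach wrapped in a function with such a guard. The function name last_two_sections and the demo file names are hypothetical, introduced only for illustration:

```shell
#!/bin/bash
# Sketch only: last_two_sections is a hypothetical helper, not part of the answer above.
last_two_sections() {
  local file="$1" pattern="$2" count lines
  # Refuse to continue unless the pattern occurs on at least two lines.
  count=$(grep -c -- "$pattern" "$file")
  if [ "$count" -lt 2 ]; then
    echo "need at least two lines matching '$pattern' in $file" >&2
    return 1
  fi
  # Same pipeline as before: last two match positions become "START,ENDp".
  lines=$(grep -n -- "$pattern" "$file" | tail -n 2 | cut -d : -f 1)
  sed -n "$(echo "$lines" | tr '\n' ',' | sed -e 's#,$#p#')" "$file"
}

# Hypothetical demo input with a trailing line after the last match.
printf '%s\n' 'x' 'a [Important] 1' 'b' 'c [Important] 2' 'd' > demo.log
out=$(last_two_sections demo.log Important)
echo "$out"
```

Because the range ends at the last matching line, anything after the last match (the line d in the demo) is not printed. That makes no difference for the log in the question, where the last match is also the last line.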


Joh*_*024 5

$ awk '/Important/{pen=s; s=$0;next} s{s=s"\n"$0} END{print pen "\n" s}' logfile
I 10/04/15 12:04AM 42 [Important] 2nd to last occurence
I 10/04/15 04:31AM 42  (7,30711]55
I 10/04/15 04:33AM 42 dfsadfs,0,0,291,291]
I 10/04/15 04:52AM 42 (30,0,314,314]
I 10/04/15 04:55AM 42 (30,301]
I 10/04/15 05:04AM 42 [Important] - last occurence

How it works

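awk implicitly loops over every line of the input file. After each occurrence of Important, we save the lines in the variable s. When we reach a new line with Important in it, the old group of lines is moved to the variable pen, and we start saving the new lines in s.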

pen holds the second-to-last (penultimate) Important section. s holds the final (last) Important section. At the end, we print pen and s.

In more detail:

  • /Important/{pen=s; s=$0;next}

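    If this line contains Important, move the contents of the variable s to pen and save the current line in s. Then skip the remaining commands and jump to the next line.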

  • s{s=s"\n"$0}

    If we get here, the current line does not contain Important.

    If s has been set to a value, append the current line to it.

  • END{print pen "\n" s}

    After reaching the end of the file, print pen and s.
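
The three rules can be tried on a small input. This is a minimal sketch that assumes a made-up file named sample2.log (hypothetical, not the log from the question). The awk program is the same one, only spread over several lines with comments:

```shell
# Hypothetical sample input with lines before the first and after the last match.
printf '%s\n' \
  'noise' \
  'I [Important] first' \
  'x' \
  'I [Important] second' \
  'y' \
  'I [Important] third' \
  'z' > sample2.log

result=$(awk '
  /Important/ { pen = s; s = $0; next }   # new section starts: previous one moves to pen
  s           { s = s "\n" $0 }           # inside a section: append the current line
  END         { print pen "\n" s }        # second-to-last section, then the last one
' sample2.log)
echo "$result"
```

Here the output starts at the line containing second and runs to the end of the file, so the trailing line z is included. The line noise before the first match is never stored because s is still empty at that point. Only the two most recent sections are held in memory at any time.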