Pi4*_*All 7 sed awk text-processing
I have a log file that looks like this:
Another thousand lines above this
I 10/03/15 12:04AM 42 [Important] 4th to last
I 10/03/15 04:31AM 42 (534642712) [1974,2106,258605080,0,0,32817,30711]
I 10/03/15 04:33AM 42 (2966927) [91,0,2966927,0,0,291,291]
I 10/03/15 04:52AM 42 (3026559) [93,0,3026559,0,0,314,314]
I 10/03/15 04:55AM 42 (3065494) [94,0,3065494,0,0,301,301]
I 10/03/15 05:04AM 42 [Important] 3rd to last
I 10/04/15 12:04AM 42 [Important] 2nd to last occurence
I 10/04/15 04:31AM 42 (7,30711]55
I 10/04/15 04:33AM 42 dfsadfs,0,0,291,291]
I 10/04/15 04:52AM 42 (30,0,314,314]
I 10/04/15 04:55AM 42 (30,301]
I 10/04/15 05:04AM 42 [Important] - last occurence
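The only pattern that stays constant throughout the file is `[Important]`; everything else changes, including the number of lines between occurrences of `[Important]`.
I am trying to take the end of this file: skip past the last occurrence, find the second-to-last one, and extract everything from there to the end of the file into another file.
This is what I have been trying, but I cannot get tac to pick out the second-to-last occurrence. My attempt: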
<logfile tac | sed '/Important/q' | tac > output_file
The output should look like this:
I 10/04/15 12:04AM 42 [Important] 2nd to last occurence
I 10/04/15 04:31AM 42 (7,30711]55
I 10/04/15 04:33AM 42 dfsadfs,0,0,291,291]
I 10/04/15 04:52AM 42 (30,0,314,314]
I 10/04/15 04:55AM 42 (30,301]
I 10/04/15 05:04AM 42 [Important] - last occurence
小智 6
Find all lines containing "Important", keep the last two, extract their line numbers, and print that range:
sed -n `grep -n Important log | tail -n 2 | cut -d : -f 1 | tr '\n' ',' | sed -e 's#,$#p#'` log
Output, as expected:
I 10/04/15 12:04AM 42 [Important] 2nd to last occurence
I 10/04/15 04:31AM 42 (7,30711]55
I 10/04/15 04:33AM 42 dfsadfs,0,0,291,291]
I 10/04/15 04:52AM 42 (30,0,314,314]
I 10/04/15 04:55AM 42 (30,301]
I 10/04/15 05:04AM 42 [Important] - last occurence
As a script:
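#!/bin/bash
# Line numbers of the last two lines containing "Important"
lines=$(grep -n Important log | tail -n 2 | cut -d : -f 1)
# Join them into a sed print command of the form "N,Mp"
range=$(echo "${lines}" | tr '\n' ',' | sed -e 's#,$#p#')
# Print only that range of lines
sed -n "${range}" log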
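The same idea can be made a little more defensive. What follows is only a minimal sketch, not part of the original answer: the sample file is made up and created with mktemp, and the names `first`, `last`, and `result` are illustrative. It reads the two line numbers separately and runs sed only when both exist. As far as I can tell, with fewer than two matches the one-liner above would print a single line (one match) or the entire file (no match), which this version avoids.

```shell
#!/bin/bash
# Hypothetical sample data, for illustration only.
log=$(mktemp)
cat > "$log" <<'EOF'
I 10/03/15 05:04AM 42 [Important] 3rd to last
I 10/04/15 12:04AM 42 [Important] 2nd to last
I 10/04/15 04:31AM 42 (7,30711]55
I 10/04/15 04:55AM 42 (30,301]
I 10/04/15 05:04AM 42 [Important] - last
EOF

# Line numbers of the last two lines containing "Important", one per line.
lines=$(grep -n 'Important' "$log" | tail -n 2 | cut -d : -f 1)
first=$(printf '%s\n' "$lines" | sed -n '1p')
last=$(printf '%s\n' "$lines" | sed -n '2p')

# Only build the sed range when both line numbers were found.
result=""
if [ -n "$first" ] && [ -n "$last" ]; then
    result=$(sed -n "${first},${last}p" "$log")
    printf '%s\n' "$result"
fi
rm -f "$log"
```

On this sample the range is lines 2 through 5, so the first `[Important]` line is dropped and the last four lines are printed.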
$ awk '/Important/{pen=s; s=$0;next} s{s=s"\n"$0} END{print pen "\n" s}' logfile
I 10/04/15 12:04AM 42 [Important] 2nd to last occurence
I 10/04/15 04:31AM 42 (7,30711]55
I 10/04/15 04:33AM 42 dfsadfs,0,0,291,291]
I 10/04/15 04:52AM 42 (30,0,314,314]
I 10/04/15 04:55AM 42 (30,301]
I 10/04/15 05:04AM 42 [Important] - last occurence
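awk implicitly loops over every line of the input file. After each occurrence of `Important`, we save the lines in the variable `s`. When we reach a new line containing `Important`, the previous group of lines is moved to the variable `pen` and we start saving the new lines in `s`.
At the end of the file, `pen` holds the penultimate (second-to-last) `Important` section and `s` holds the final (last) `Important` section. Finally, we print `pen` and `s`.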
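To see what the two variables hold at the end, here is a small sketch on made-up input with three `Important` sections. The input lines and the `pen:` / `s:` labels are assumptions added purely for illustration; the awk rules themselves are the ones from the command above.

```shell
#!/bin/bash
# Tiny made-up input: three "Important" sections.
out=$(printf '%s\n' \
    'a [Important] 1' 'x' \
    'b [Important] 2' 'y' \
    'c [Important] 3' |
  awk '/Important/{pen=s; s=$0; next}
       s{s=s"\n"$0}
       END{print "pen:"; print pen; print "s:"; print s}')
printf '%s\n' "$out"
```

Here `pen` ends up holding the second section (`b [Important] 2` followed by `y`) and `s` holds only the last line, `c [Important] 3`; the first section has been discarded.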
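In more detail:
/Important/{pen=s; s=$0;next}
If this line contains `Important`, move the contents of the variable `s` to `pen` and save the current line in `s`. Then skip the remaining commands and jump to the next line.
s{s=s"\n"$0}
If we get here, the current line does not contain `Important`.
If `s` has been set to a value, append the current line to it.
END{print pen "\n" s}
After we reach the end of the file, print `pen` and `s`.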
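Because this approach keeps whole sections in awk variables, a very long section is held in memory. One possible alternative, sketched here as an assumption rather than as anything from the answer above, is to read the file twice: the first pass records only the line numbers of the last two matches, and the second pass prints the lines between them. The sample file and the names `prev`, `last`, and `result` are made up for illustration.

```shell
#!/bin/bash
# Hypothetical sample data, for illustration only.
log=$(mktemp)
cat > "$log" <<'EOF'
I 10/03/15 05:04AM 42 [Important] 3rd to last
I 10/04/15 12:04AM 42 [Important] 2nd to last
I 10/04/15 04:31AM 42 (7,30711]55
I 10/04/15 04:55AM 42 (30,301]
I 10/04/15 05:04AM 42 [Important] - last
EOF

# The file is passed twice. While NR == FNR awk is still reading the first
# copy: remember the line numbers of the last two matching lines.
# On the second copy: print every line from prev to last, inclusive.
# If there were fewer than two matches, prev stays 0 and nothing is printed.
result=$(awk '
    NR == FNR {
        if (/Important/) { prev = last; last = NR }
        next
    }
    prev && FNR >= prev && FNR <= last
' "$log" "$log")
printf '%s\n' "$result"
rm -f "$log"
```

The trade-off is that the input must be a regular file that can be read twice, so this variant does not work on a pipe, whereas the single-pass version does.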