Parsing a log file for value changes

dot*_*hen 1 logs text-processing

Using this answer, I have reduced a log file to the following:

Timestamp:1359021601 2013-01-17 15:00:01
size: 10G   /mnt/SolrFiles/solr/api/
Timestamp:1359025201 2013-01-17 16:00:01
size: 11G   /mnt/SolrFiles/solr/api/
...snip hundreds of lines...
Timestamp:1359021601 2013-01-24 10:00:01
size: 11G   /mnt/SolrFiles/solr/api/
Timestamp:1359025201 2013-01-24 11:00:01
size: 11G   /mnt/SolrFiles/solr/api/
Timestamp:1359028801 2013-01-24 12:00:01
size: 11G   /mnt/SolrFiles/solr/api/
Timestamp:1359032401 2013-01-24 13:00:01
size: 12G   /mnt/SolrFiles/solr/api/

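This pattern continues for hundreds of lines. I want to reduce the file so that it shows the timestamp and size only when the size changes, like this: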

Timestamp:1359021601 2013-01-17 15:00:01
size: 10G   /mnt/SolrFiles/solr/api/
Timestamp:1359025201 2013-01-17 16:00:01
size: 11G   /mnt/SolrFiles/solr/api/
Timestamp:1359032401 2013-01-24 13:00:01
size: 12G   /mnt/SolrFiles/solr/api/

Can this be done with common Linux CLI tools such as grep and sed?

Sté*_*las 7

This is a typical job for awk:

awk '/^Timestamp/{t=$0; next}       # remember the most recent Timestamp line
     /^size/ && $2 != last_size {   # size differs from the last one printed
        print t
        print
        last_size = $2
     }'
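To try the script above without touching a real log, here is a minimal sketch that wraps it in a shell function and feeds it a shortened copy of the sample from the question. The function name size_changes is made up for illustration, and the snipped part of the log is simply left out.

```shell
#!/bin/sh
# Hypothetical wrapper name; the awk program is the one from the answer.
# With no arguments awk reads standard input, otherwise the named files.
size_changes() {
    awk '/^Timestamp/ { t = $0; next }
         /^size/ && $2 != last_size {
             print t
             print
             last_size = $2
         }' "$@"
}

# Shortened sample taken from the question.
size_changes <<'EOF'
Timestamp:1359021601 2013-01-17 15:00:01
size: 10G   /mnt/SolrFiles/solr/api/
Timestamp:1359025201 2013-01-17 16:00:01
size: 11G   /mnt/SolrFiles/solr/api/
Timestamp:1359028801 2013-01-24 12:00:01
size: 11G   /mnt/SolrFiles/solr/api/
Timestamp:1359032401 2013-01-24 13:00:01
size: 12G   /mnt/SolrFiles/solr/api/
EOF
```

This should print only the three Timestamp/size pairs where the size first becomes 10G, 11G and 12G. Note that the comparison is on the second field as text, so it reports any change in the printed size, including a decrease.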

If you want to make it obscure and terse, you can do:

awk '!(/^T/&&t=$0)&&$2!=l&&(l=$2)&&$0=t RS$0'
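For anyone puzzling over the terse version, the following is one way to spell out what it does, as a rough sketch rather than an exact equivalent. The function name terse_expanded is hypothetical. The one-liner also relies on the assignments themselves being "true", so it would skip a line whose second field is empty or zero; this expansion ignores that corner case.

```shell
#!/bin/sh
# Hypothetical name; an unrolled reading of the terse awk one-liner.
terse_expanded() {
    awk '{
        if (/^T/) {
            # Line starts with T (a Timestamp line): remember it, print nothing.
            t = $0
        } else if ($2 != l) {
            # Any other line whose second field differs from the last one kept:
            # remember the new value, then print the saved line and this one.
            l = $2
            print t RS $0    # RS is the record separator, a newline by default
        }
    }' "$@"
}
```

Like awk itself, the function reads standard input when called without arguments, or the files named on its command line.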