dot*_*hen 1 logs text-processing
Starting from this answer, I have reduced a log file down to:
Timestamp:1359021601 2013-01-17 15:00:01
size: 10G /mnt/SolrFiles/solr/api/
Timestamp:1359025201 2013-01-17 16:00:01
size: 11G /mnt/SolrFiles/solr/api/
...snip hundreds of lines...
Timestamp:1359021601 2013-01-24 10:00:01
size: 11G /mnt/SolrFiles/solr/api/
Timestamp:1359025201 2013-01-24 11:00:01
size: 11G /mnt/SolrFiles/solr/api/
Timestamp:1359028801 2013-01-24 12:00:01
size: 11G /mnt/SolrFiles/solr/api/
Timestamp:1359032401 2013-01-24 13:00:01
size: 12G /mnt/SolrFiles/solr/api/
This pattern continues for hundreds of lines. I would like to reduce the file so that it shows the timestamp and size only when the size changes, like this:
Timestamp:1359021601 2013-01-17 15:00:01
size: 10G /mnt/SolrFiles/solr/api/
Timestamp:1359025201 2013-01-17 16:00:01
size: 11G /mnt/SolrFiles/solr/api/
Timestamp:1359032401 2013-01-24 13:00:01
size: 12G /mnt/SolrFiles/solr/api/
Can this be done using common Linux CLI tools such as grep and sed?
This is a typical job for awk:
awk '/^Timestamp/ { t = $0; next }     # remember the latest timestamp line
     /^size/ && $2 != last_size {      # size differs from the last one printed
         print t
         print
         last_size = $2
     }'
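To try the awk approach without touching the real log, here is a minimal sketch that wraps it in a shell function and feeds it a few lines copied from the question. The function name `size_changes` is hypothetical, and the script is assumed to read the log on standard input.

```shell
#!/bin/sh
# size_changes is a hypothetical helper name; it reads the log on stdin.
size_changes() {
    awk '
        /^Timestamp/ { t = $0; next }   # remember the most recent timestamp line
        /^size/ && $2 != last_size {    # size value differs from the last printed one
            print t                     # the timestamp that goes with this size
            print                       # the size line itself
            last_size = $2
        }'
}

# Sample input taken from the question (abbreviated).
size_changes <<'EOF'
Timestamp:1359021601 2013-01-17 15:00:01
size: 10G /mnt/SolrFiles/solr/api/
Timestamp:1359025201 2013-01-17 16:00:01
size: 11G /mnt/SolrFiles/solr/api/
Timestamp:1359028801 2013-01-24 12:00:01
size: 11G /mnt/SolrFiles/solr/api/
Timestamp:1359032401 2013-01-24 13:00:01
size: 12G /mnt/SolrFiles/solr/api/
EOF
```

With this sample, only the 10G, the first 11G and the 12G entries should be printed, each preceded by its timestamp line. The comparison is on the second field as text, so a change of unit (for example from M to G) also counts as a change.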
If you want to make it obscure and terse, you can do:
awk '!(/^T/&&t=$0)&&$2!=l&&(l=$2)&&$0=t RS$0'
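The terse version is hard to read, so here is a sketch of how I understand it, with spacing and explicit parentheses added and each step commented. The wrapper name `terse_size_changes` is hypothetical, and the parenthesised form is my reading of the one-liner rather than the answerer's own wording.

```shell
#!/bin/sh
# terse_size_changes is a hypothetical wrapper name; it reads the log on stdin.
terse_size_changes() {
    awk '
        # Timestamp line: /^T/ matches, so the line is stored in t. The
        # assignment yields a non-empty string (true), the ! turns the
        # pattern false, and nothing is printed.
        # size line: /^T/ fails, so the remaining tests run left to right:
        #   $2 != l          the size differs from the last printed one
        #   (l = $2)         remember the new size
        #   ($0 = t RS $0)   prepend the saved timestamp and a newline
        # A true pattern with no action prints $0.
        !(/^T/ && (t = $0)) && $2 != l && (l = $2) && ($0 = t RS $0)
    '
}

printf '%s\n' \
    'Timestamp:1359021601 2013-01-17 15:00:01' \
    'size: 10G /mnt/SolrFiles/solr/api/' \
    'Timestamp:1359025201 2013-01-17 16:00:01' \
    'size: 10G /mnt/SolrFiles/solr/api/' |
    terse_size_changes
```

The whole program is a single pattern with no action block, which is why it relies on assignments being usable as truth values and on awk's default action of printing `$0`.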