Linux tools - How to count and list the occurrences of a regular expression in a file

Question

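I have a file containing a large number of similar strings. I want to count the unique occurrences of a regular expression and show what they are. For example, for the pattern Profile: (\w*) applied to a file containing: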

Profile: blah
Profile: another
Profile: trees
Profile: blah

I would like to find that there are 3 unique occurrences and get back the result:

blah, another, trees

Answer 1

Try this:

egrep -o "Profile: (\w*)" test.text | sed 's/Profile: \(\w*\)/\1/g' | sort | uniq

Output:

another
blah
trees

Explanation

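egrep (equivalent to grep -E) with the -o option prints only the parts of each line that match the pattern, one match per line.

sed then keeps only the captured group, dropping the "Profile: " prefix.

sort followed by uniq produces the list of unique elements. The sort is needed first because uniq only collapses duplicate lines that are adjacent.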
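The steps above can be tried end to end with a small sketch. It assumes GNU grep and awk are available (\w and -o are not guaranteed by POSIX), recreates the sample input from the question in a file named test.text, and uses the variable names unique_profiles and first_seen purely for illustration. It swaps in sort -u for sort | uniq and a simpler sed expression, which should give the same result here. The last command is an optional variation that keeps the order of first appearance, as in the question's expected result, instead of sorting.

```shell
# Recreate the sample input from the question (file name is illustrative).
printf 'Profile: blah\nProfile: another\nProfile: trees\nProfile: blah\n' > test.text

# grep -oE : print only the matched text, one match per line
# sed      : strip the fixed "Profile: " prefix, leaving the captured word
# sort -u  : sort and drop duplicates (same effect as sort | uniq)
unique_profiles=$(grep -oE 'Profile: \w*' test.text | sed 's/^Profile: //' | sort -u)

# One unique value per line, in sorted order.
echo "$unique_profiles"

# Join the lines with commas for a single-line summary.
echo "$unique_profiles" | paste -sd, -

# Variation: keep first-seen order by letting awk print a line
# only the first time it appears, so no sort is needed.
first_seen=$(grep -oE 'Profile: \w*' test.text | sed 's/^Profile: //' | awk '!seen[$0]++' | paste -sd, -)
echo "$first_seen"
```

With the sample input, the sorted form prints another, blah, trees and the awk variation prints blah,another,trees.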

To get the number of elements in the resulting list, append the command wc -l:

egrep -o "Profile: (\w*)" test.text | sed 's/Profile: \(\w*\)/\1/g' | sort | uniq | wc -l

Output: