Bash: counting occurrences of multiple strings in a large file

Question

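I am trying to use bash commands to get the counts of several strings in a large txt file.

That is, I want to use bash to count the strings 'pig', 'horse', and 'cat' and get output like 'pig:7,horse:3,cat:5'. I want a method that reads the txt file only once, because it is very large (so I don't want to search the whole file for 'pig', then go back and search it again for 'horse', and so on).

Any help with the commands would be much appreciated. Thanks!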

Answer 1

ric*_*ici 5

grep -Eo 'pig|horse|cat' txt.file | sort | uniq -c | awk '{print $2": "$1}'

Breaking it into pieces:

grep -Eo 'pig|horse|cat'  Print each match (-o) of the extended (-E)
                          regex on its own line
sort                      Sort the resulting words
uniq -c                   Output unique values (of sorted input)
                          with the count (-c) of each value
awk '{print $2": "$1}'    For each line, print the second field (the word),
                          then a colon and a space, and then the first
                          field (the count).
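
To get the single comma-separated line asked for in the question, the same pipeline can probably be extended with paste. The following is only a sketch: the file name sample.txt and its contents are made up for illustration, and -o is a grep extension (available in GNU and BSD grep) rather than part of POSIX.

```shell
# Hypothetical sample data; sample.txt and its contents are assumptions.
printf '%s\n' 'pig horse pig' 'cat pig' 'horse cat cat' > sample.txt

# The file is read once by grep; sort and uniq only see the matched words.
# awk prints "word:count" (no space), and paste joins the lines with commas.
result=$(grep -Eo 'pig|horse|cat' sample.txt \
  | sort \
  | uniq -c \
  | awk '{print $2":"$1}' \
  | paste -sd, -)

echo "$result"   # cat:3,horse:2,pig:3
```

The entries come out in sorted order, not in the order the words were listed in the pattern. The pattern also matches inside longer words (for example 'cat' in 'category'); adding -w to grep should restrict it to whole words if that matters.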
