Determine the number of unique values, then count how many times each value occurs in a file

Question

Dan*_*Max 4 shell-script text-processing

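I have a data file with 15,000 lines but only 400 unique values. I'm looking for a way to identify the number of unique values and then determine how many times each of those values occurs in the file. I came up with the following, but it is very slow. Any ideas?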

for value in `cat mylist.txt | uniq`
do
    counter=`grep $value mylist.txt |wc -l`
    echo $value $counter
done


Answer 1

ter*_*don 6

Just use sort and uniq:

sort mylist.txt | uniq | wc -l


This gives you the number of unique values. To get the number of occurrences of each unique value, use uniq's -c option:

sort mylist.txt | uniq -c


From the uniq man page:

   -c, --count
               prefix lines by the number of occurrences
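
As a rough illustration of both pipelines, here is a small sketch run against made-up data. The temporary file and its contents are assumptions standing in for mylist.txt; `sort -u` is used as a shorthand for `sort | uniq`, and the trailing `sort -rn` is an optional extra that lists the most frequent values first.

```shell
# Hypothetical sample data standing in for mylist.txt.
tmpfile=$(mktemp)
printf '%s\n' apple banana apple cherry banana apple > "$tmpfile"

# Number of distinct values (sort -u behaves like sort | uniq).
unique_count=$(sort -u "$tmpfile" | wc -l)
echo "unique values: $unique_count"

# Occurrences of each value; sort -rn puts the most frequent first.
counts=$(sort "$tmpfile" | uniq -c | sort -rn)
echo "$counts"

rm -f "$tmpfile"
```

With this sample, apple is counted three times, banana twice, and cherry once. The file is sorted once and read once, rather than being rescanned for every value as in the loop from the question.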


Also, for future reference, grep's -c option is often useful:

 -c, --count
              Suppress  normal  output;  instead  print  a  count  of
              matching  lines  for  each  input  file.   With the -v,
              --invert-match option (see below),  count  non-matching
              lines.  (-c is specified by POSIX.)
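
If a single value does need to be counted with grep, the following sketch may help. It assumes each line of the file holds exactly one value; the sample data is made up. By default grep treats the value as a pattern and matches it anywhere in the line, so the POSIX options -F (fixed string) and -x (whole line) are added here to count exact matches only.

```shell
# Hypothetical sample data: "cat" also appears inside longer values.
tmpfile=$(mktemp)
printf '%s\n' cat catalog cat concat > "$tmpfile"

# Plain grep -c counts every line that contains "cat" as a substring.
loose=$(grep -c 'cat' "$tmpfile")

# -F treats the pattern as a fixed string, -x requires a whole-line match,
# and -- guards against values that begin with a dash.
exact=$(grep -c -F -x -- 'cat' "$tmpfile")

echo "substring matches: $loose, exact matches: $exact"
rm -f "$tmpfile"
```

Here the loose count is 4 and the exact count is 2. Calling grep once per value still rereads the file each time, so for counting every value the sort and uniq -c pipeline is usually the simpler choice.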


Archived:	12 years, 11 months ago
Views:	17190
Last activity:	9 years ago