Dyn*_*mic 1 sorting perl hash file cpu-word
How can I use Perl to find the top 100 most frequently used strings (words) in a .txt file? So far I have the following:
use 5.012;
use warnings;

open(my $file, "<", "file.txt") or die "Cannot open file.txt: $!";

my %word_count;
while (my $line = <$file>) {
    foreach my $word (split ' ', $line) {
        $word_count{$word}++;
    }
}
close($file);

for my $word (sort keys %word_count) {
    print "'$word': $word_count{$word}\n";
}
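But this only counts each word and lists the words in alphabetical order. I want the top 100 most frequently used words in the file, sorted by number of occurrences. Any ideas?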
Reading the fine perlfaq4(1) manpage will show you how to sort a hash by value. So try this. It is also rather more idiomatically "perlian" than your approach.
#!/usr/bin/env perl
use v5.12;
use strict;
use warnings;
use warnings FATAL => "utf8";
use open qw(:utf8 :std);

my %seen;
while (<>) {
    $seen{$_}++ for split /\W+/; # or just split;
}

my $count = 0;
for (sort {
        $seen{$b} <=> $seen{$a}
                  ||
        lc($a) cmp lc($b) # XXX: should be v5.16's fc() instead
                  ||
        $a cmp $b
    } keys %seen)
{
    next unless /\w/;
    printf "%-20s %5d\n", $_, $seen{$_};
    last if ++$count >= 100;
}
When run on itself, the first 10 lines of output are:
seen                     6
use                      5
_                        3
a                        3
b                        3
cmp                      2
count                    2
for                      2
lc                       2
my                       2
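The XXX comment in the script above suggests v5.16's fc() in place of lc() for the case-insensitive tie-break. Here is a minimal sketch of that variant, assuming Perl 5.16 or later. The helper name top_words and the sample text are hypothetical, introduced only for illustration, and the sort-by-value comparator is the same idea as in the script above.

```perl
#!/usr/bin/env perl
use v5.16;      # enables strict and the fc() feature
use warnings;

# Hypothetical helper (not part of the original answer): return up to
# $limit keys of %$counts, highest count first. Ties are broken by
# case-folded comparison, then by plain string comparison.
sub top_words {
    my ($counts, $limit) = @_;
    my @sorted = sort {
        $counts->{$b} <=> $counts->{$a}
            || fc($a) cmp fc($b)
            || $a cmp $b
    } keys %$counts;
    $#sorted = $limit - 1 if @sorted > $limit;   # keep only the first $limit
    return @sorted;
}

# Sample input, made up for this example.
my $text = "the cat and The dog and the bird";

my %seen;
$seen{$_}++ for grep { length } split /\W+/, $text;   # skip empty fields

printf "%-20s %5d\n", $_, $seen{$_} for top_words(\%seen, 3);
```

With this sample text the three lines printed are and (2), the (2), and bird (1). Truncating the sorted list inside the helper avoids the separate counter, so the limit cannot be off by one.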