How do I remove duplicate items from an array in Perl?

Dav*_*vid 153 arrays perl unique duplicates

I have an array in Perl:

my @my_array = ("one","two","three","two","three");

How do I remove the duplicates from the array?

Gre*_*ill 161

You can do something like this, as demonstrated in perlfaq4:

sub uniq {
    my %seen;
    grep !$seen{$_}++, @_;
}

my @array = qw(one two three two three);
my @filtered = uniq(@array);

print "@filtered\n";

Output:

one two three

If you want to use a module, try the uniq function from List::MoreUtils.

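  • Please don't use $a or $b in examples, as they are the magic globals of sort(). (27 upvotes)
  • `sub uniq { my %seen; grep !$seen{$_}++, @_ }` is a better implementation since it preserves order at no cost. Or even better, use the one from List::MoreUtils. (18 upvotes)
  • @BrianVandenberg Welcome to the world of 1987 - when this was created - and almost 100% backwards compatibility for perl - so it cannot be eliminated. (5 upvotes)
  • It's a `my` lexical in this scope, so it's fine. That said, a more descriptive variable name could be chosen. (2 upvotes)
  • @ephemient Yes, but if you were to add sorting in this function then it would trump `$::a` and `$::b`, wouldn't it? (2 upvotes)
  • @szabgab, if that is the case, then it was a really bad design decision for `sort` to use non-local variables. (2 upvotes)
  • As of Perl v5.26.0, `List::Util` has `uniq`, so MoreUtils is no longer needed. (2 upvotes)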
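
Following up on the last comment, here is a minimal sketch of the same example using the core module instead of List::MoreUtils. It assumes List::Util version 1.45 or later, which is where `uniq` was introduced; the version check on the `use` line makes the script fail early on older installations.

```perl
use strict;
use warnings;
use List::Util 1.45 qw(uniq);   # assumption: List::Util >= 1.45 is installed

my @array    = qw(one two three two three);
my @filtered = uniq(@array);    # keeps the first occurrence of each value, in order

print "@filtered\n";            # one two three
```

Like the hand-written `uniq` above, this compares elements as strings and preserves the order of first appearance.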

Joh*_*usa 119

The Perl documentation comes with a nice collection of FAQs. Your question is frequently asked:

% perldoc -q duplicate

The answer, copied and pasted from the output of the command above, appears below:

Found in /usr/local/lib/perl5/5.10.0/pods/perlfaq4.pod
 How can I remove duplicate elements from a list or array?
   (contributed by brian d foy)

   Use a hash. When you think the words "unique" or "duplicated", think
   "hash keys".

   If you don't care about the order of the elements, you could just
   create the hash then extract the keys. It's not important how you
   create that hash: just that you use "keys" to get the unique elements.

       my %hash   = map { $_, 1 } @array;
       # or a hash slice: @hash{ @array } = ();
       # or a foreach: $hash{$_} = 1 foreach ( @array );

       my @unique = keys %hash;

   If you want to use a module, try the "uniq" function from
   "List::MoreUtils". In list context it returns the unique elements,
   preserving their order in the list. In scalar context, it returns the
   number of unique elements.

       use List::MoreUtils qw(uniq);

       my @unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 1,2,3,4,5,6,7
       my $unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 7

   You can also go through each element and skip the ones you've seen
   before. Use a hash to keep track. The first time the loop sees an
   element, that element has no key in %Seen. The "next" statement creates
   the key and immediately uses its value, which is "undef", so the loop
   continues to the "push" and increments the value for that key. The next
   time the loop sees that same element, its key exists in the hash and
   the value for that key is true (since it's not 0 or "undef"), so the
   next skips that iteration and the loop goes to the next element.

       my @unique = ();
       my %seen   = ();

       foreach my $elem ( @array )
       {
         next if $seen{ $elem }++;
         push @unique, $elem;
       }

   You can write this more briefly using a grep, which does the same
   thing.

       my %seen = ();
       my @unique = grep { ! $seen{ $_ }++ } @array;

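  • John iz in mah anzers stealing mah rep! (16 upvotes)
  • I think you should get bonus points for actually looking the question up. (5 upvotes)
  • I like that the best answer is 95% copy-paste and 3 sentences of OC. To be perfectly clear, this **is** the best answer; I just find that fact amusing. (2 upvotes)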

小智 68

Install List::MoreUtils from CPAN.

Then in your code:

use strict;
use warnings;
use List::MoreUtils qw(uniq);

my @dup_list = qw(1 1 1 2 3 4 4);

my @uniq_list = uniq(@dup_list);

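  • The fact that List::MoreUtils is not bundled with perl kind of damages the portability of projects that use it :( (I, for one, won't use it) (4 upvotes)
  • @Ranguard: `@dup_list` should be in the `uniq` call, not `@dups`. (3 upvotes)
  • That's the answer! But I can only vote you up once. (2 upvotes)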
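
One possible way to address the portability concern raised in the comments is to prefer a bundled `uniq` when it is available and fall back to a pure-Perl version otherwise. The sketch below is only an illustration of that idea, assuming that List::Util 1.45 or later provides `uniq`; the fallback sub is the grep idiom from perlfaq4, not the module's own implementation.

```perl
use strict;
use warnings;

BEGIN {
    # Try the core module first; VERSION() dies if it is older than 1.45.
    if ( eval { require List::Util; List::Util->VERSION(1.45); 1 } ) {
        List::Util->import('uniq');
    }
    else {
        # Fallback (assumption: string comparison is good enough here).
        *uniq = sub { my %seen; grep { !$seen{$_}++ } @_ };
    }
}

my @dup_list  = qw(1 1 1 2 3 4 4);
my @uniq_list = uniq(@dup_list);

print "@uniq_list\n";   # 1 2 3 4
```

Either branch keeps the first occurrence of each element in its original order, so the calling code does not need to know which one was chosen.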

Xet*_*ius 23

The way I usually do this is:

my %unique = ();
foreach my $item (@myarray)
{
    $unique{$item} ++;
}
my @myuniquearray = keys %unique;

If you use a hash and add the items to it, you also get the bonus of knowing how many times each item appears in the list.
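
As a small illustration of that bonus, the sketch below builds the same kind of counting hash and then reports each item with its count. The sample data and the name `%count` are assumptions made for the example.

```perl
use strict;
use warnings;

my @myarray = qw(one two three two three three);

# Each element becomes a key; the value counts its occurrences.
my %count;
$count{$_}++ for @myarray;

# Sort the keys so the report order is repeatable.
for my $item ( sort keys %count ) {
    print "$item appears $count{$item} time(s)\n";
}
```

With this input the hash ends up holding one => 1, two => 2 and three => 3, and `keys %count` is still the list of unique items.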

  • This has the downside of not preserving the original order, if you need it. (2 upvotes)

Kam*_*yan 9

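Method 1: Use a hash

Logic: A hash can only have unique keys, so iterate over the array and make each element a key of the hash, assigning it any value. The keys of that hash are your unique array.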

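my @unique = keys %{ +{ map { $_ => 1 } @array } };  # keys() needs a real hash; calling it on a bare hash reference no longer works in current perls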

Method 2: Extending method 1 for reusability

If we need this functionality several times in our code, it is better to make it a subroutine.

sub get_unique {
    my %seen;
    grep !$seen{$_}++, @_;
}
my @unique = get_unique(@array);

Method 3: Use the module List::MoreUtils

use List::MoreUtils qw(uniq);
my @unique = uniq(@array);


小智 8

The variable @array is the list with duplicate elements:

my %seen   = ();   # declared with my so the snippet also runs under "use strict"
my @unique = grep { !$seen{$_}++ } @array;


Haw*_*awk 7

This can be done with a simple Perl one-liner.

my @in=qw(1 3 4  6 2 4  3 2 6  3 2 3 4 4 3 2 5 5 32 3); #Sample data 
my @out=keys %{{ map{$_=>1}@in}}; # Perform PFM
print join ' ', sort{$a<=>$b} @out;# Print data back out sorted and in order.

The PFM section does this:

The data in @in is fed into map. map builds an anonymous hash. The keys are extracted from that hash and fed into @out.
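
The three steps described above can also be written out one at a time. This is only an unrolled sketch of the same one-liner; the intermediate variable `$anon` is introduced here for illustration.

```perl
use strict;
use warnings;

my @in = qw(1 3 4 6 2 4 3 2 6 3 2 3 4 4 3 2 5 5 32 3);

# Step 1 and 2: map turns every element into a key => 1 pair,
# and the surrounding braces collect the pairs into an anonymous hash.
my $anon = { map { $_ => 1 } @in };

# Step 3: dereference the hash and take its keys.
# Duplicate elements collapsed into a single key each.
my @out = keys %{$anon};

# keys() returns an unspecified order, so sort numerically for display.
print join( ' ', sort { $a <=> $b } @out ), "\n";   # 1 2 3 4 5 6 32
```

The numeric sort in the last line is what produces an ordered result; without it, the order of `@out` is not guaranteed.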