Efficient way to count unique elements in an array with numpy/scipy in Python

Question

I have a scipy array, e.g.

a = array([[0, 0, 1], [1, 1, 1], [1, 1, 1], [1, 0, 1]])

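I'd like to count the number of occurrences of each unique element (here, each row) in the array. For example, for the array a above, I want to find that [0, 0, 1] occurs once, [1, 1, 1] occurs twice, and [1, 0, 1] occurs once.

One way I thought of doing it is: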

from collections import defaultdict
d = defaultdict(int)

# Rows of a numpy array are not hashable, so key on a tuple of each row
for elt in a:
    d[tuple(elt)] += 1

Is there a better/more efficient way?

Thanks.

Answer 1

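If sticking with Python 2.7 (or 3.1) is not a problem and one of those versions is available to you, then the new collections.Counter may suit you, provided you stick to hashable elements such as tuples: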

>>> from collections import Counter
>>> c = Counter([(0,0,1), (1,1,1), (1,1,1), (1,0,1)])
>>> c
Counter({(1, 1, 1): 2, (0, 0, 1): 1, (1, 0, 1): 1})
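To apply this to the array from the question, each row has to be turned into a tuple first, because array rows themselves are not hashable. A minimal sketch, assuming `a` is the 2-D integer array from the question (the name `row_counts` is only illustrative):

```python
from collections import Counter

import numpy as np

a = np.array([[0, 0, 1], [1, 1, 1], [1, 1, 1], [1, 0, 1]])

# Rows of a NumPy array are not hashable, so convert each row to a tuple
# of plain Python ints before handing it to Counter.
row_counts = Counter(tuple(row) for row in a.tolist())

print(row_counts[(1, 1, 1)])  # 2
```

Going through `a.tolist()` keeps the keys as ordinary Python ints, so they compare equal to hand-written tuples such as `(1, 1, 1)`.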

I have not done any performance testing of these two approaches, though.

defaultdict would be faster. John Machin gave timings in an answer earlier today (http://stackoverflow.com/questions/4036474/add-new-keys-to-a-dictionary-while-incrementing-existing-values).
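The speed claim is easy to check locally. The sketch below is one way to compare the two approaches with `timeit`; the test data (`rows`), the helper names, and the repeat count are all assumptions made for illustration, and no timings are quoted because they depend on the Python version and the machine.

```python
import timeit
from collections import Counter, defaultdict

# Hypothetical test data: the four rows from the question, repeated.
rows = [(0, 0, 1), (1, 1, 1), (1, 1, 1), (1, 0, 1)] * 1000


def count_with_defaultdict(items):
    # Explicit loop, as in the question.
    counts = defaultdict(int)
    for item in items:
        counts[item] += 1
    return counts


def count_with_counter(items):
    # Counter does the counting internally.
    return Counter(items)


# Both approaches must agree before their timings are worth comparing.
assert dict(count_with_defaultdict(rows)) == dict(count_with_counter(rows))

for func in (count_with_defaultdict, count_with_counter):
    seconds = timeit.timeit(lambda: func(rows), number=20)
    print(f"{func.__name__}: {seconds:.4f} s")
```

Which one wins may differ between Python versions, so it is worth re-measuring on the interpreter you actually use.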