Sue*_*Sue 4 python arrays comparison numpy rank
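Let me illustrate my problem with a simple example. I have a = [a1, a2, a3, a4], where every ai is a numerical value.
What I want are the pairwise comparisons within 'a', i.e. I(a1 >= a2), I(a1 >= a3), I(a1 >= a4), ..., I(a4 >= a1), I(a4 >= a2), I(a4 >= a3), where I is the indicator function. So I used the following code.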
res=[x>=y for x in a for y in a]
But this also gives comparisons such as I(a1 >= a1), ..., I(a4 >= a4), which are always one. To get rid of these nuisances, I converted res to a NumPy array and picked out the off-diagonal elements.
res1=numpy.array(res)
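This gives the result I want, but I think there should be a more efficient or simpler way to do the pairwise comparison and extract the off-diagonal elements. Do you have any ideas? Thanks in advance.
You can use NumPy broadcasting: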
import numpy as np
a = np.asarray(a)  # broadcasting needs an ndarray, not a plain list
# Get the mask of comparisons in a vectorized manner using broadcasting
mask = a[:, None] >= a
# Select the elements other than diagonal ones
out = mask[~np.eye(a.size, dtype=bool)]
If you would rather set the diagonal elements to False in mask itself, so that mask becomes the output, do this:
mask[np.eye(a.size,dtype=bool)] = 0
Sample run:
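As a side note, the diagonal can also be cleared without building an identity mask. The sketch below uses np.fill_diagonal, which is not part of the original answer; it is shown only as one possible alternative, reusing the sample values from this answer.

```python
import numpy as np

a = np.array([3, 7, 5, 8])   # sample values reused from this answer
mask = a[:, None] >= a       # shape (4, 1) against (4,) broadcasts to (4, 4)

# Overwrite the main diagonal in place; fill_diagonal returns None
np.fill_diagonal(mask, False)
print(mask)
```

Because np.fill_diagonal modifies mask in place, keep a copy first if the full comparison matrix is still needed.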
In [56]: a
Out[56]: array([3, 7, 5, 8])
In [57]: mask = a[:,None] >= a
In [58]: mask
Out[58]:
array([[ True, False, False, False],
[ True, True, True, False],
[ True, False, True, False],
[ True, True, True, True]], dtype=bool)
In [59]: mask[~np.eye(a.size,dtype=bool)] # Selecting non-diag elems
Out[59]:
array([False, False, False, True, True, False, True, False, False,
True, True, True], dtype=bool)
In [60]: mask[np.eye(a.size,dtype=bool)] = 0 # Setting diag elems as False
In [61]: mask
Out[61]:
array([[False, False, False, False],
[ True, False, True, False],
[ True, False, False, False],
[ True, True, True, False]], dtype=bool)
Runtime test
Why use NumPy broadcasting? Performance! Let's see how it fares on a large dataset:
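To check that the broadcast result lines up with a plain double loop, here is a minimal self-contained sketch. The names values, off_diag and loop_result are illustrative only; the loop simply skips the pairs where both indices are equal.

```python
import numpy as np

values = [3, 7, 5, 8]        # same sample values as in the run above
a = np.asarray(values)

# a[:, None] has shape (n, 1); comparing it with a (shape (n,)) yields (n, n)
mask = a[:, None] >= a

# Boolean indexing returns the selected entries in row-major order
off_diag = mask[~np.eye(a.size, dtype=bool)]

# Reference: explicit double loop that skips comparing an item with itself
loop_result = [x >= y
               for i, x in enumerate(values)
               for j, y in enumerate(values)
               if i != j]

print(off_diag.tolist() == loop_result)
print(off_diag.size)         # n * (n - 1) comparisons
```

Both approaches walk the pairs in the same row-by-row order, which is why the flattened results can be compared element by element.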
In [34]: def pairwise_comp(A): # Using NumPy broadcasting
...: a = np.asarray(A) # Convert to array if not already so
...: mask = a[:,None] >= a
...: out = mask[~np.eye(a.size,dtype=bool)]
...: return out
...:
In [35]: a = np.random.randint(0,9,(1000)).tolist() # Input list
In [36]: %timeit [x >= y for i,x in enumerate(a) for j,y in enumerate(a) if i != j]
1 loop, best of 3: 185 ms per loop # @Sixhobbits's loopy soln
In [37]: %timeit pairwise_comp(a)
100 loops, best of 3: 5.76 ms per loop
Maybe you want:
[x >= y for i,x in enumerate(a) for j,y in enumerate(a) if i != j]
This does not compare any item with itself, but compares each item with every other item.
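The same ordered pairs can also be produced with itertools.permutations, which pairs elements by position and therefore never pairs an item with itself. This is a sketch of an equivalent formulation, not something proposed in the answer above.

```python
from itertools import permutations

a = [3, 7, 5, 8]   # sample values reused from the other answer

# permutations(a, 2) yields every ordered pair of distinct positions,
# in the same order as the nested enumerate loop
res = [x >= y for x, y in permutations(a, 2)]
print(res)
```

Duplicate values in a are still compared with each other, since permutations distinguishes elements by position rather than by value.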