Python group by array a and summarize array b - Performance

Given two unordered arrays a and b of the same length:

a = [7,3,5,7,5,7]
b = [0.2,0.1,0.3,0.1,0.1,0.2]

I want to group by the elements of a:

aResult = [7,3,5]

and sum the corresponding elements of b (an example use is summarizing a probability density function):

bResult = [0.2 + 0.1 + 0.2, 0.1, 0.3 + 0.1] = [0.5, 0.1, 0.4]
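To make the target output concrete, here is a minimal pure-Python sketch of the grouping described above. The function name `group_sum_first_seen` is made up for illustration and is not part of the question; the sketch assumes the result keys should appear in order of first appearance in a, as in the example.

```python
def group_sum_first_seen(a, b):
    """Sum the values of b per distinct key in a, keeping first-seen key order."""
    totals = {}  # dicts preserve insertion order in Python 3.7+
    for key, value in zip(a, b):
        totals[key] = totals.get(key, 0.0) + value
    return list(totals.keys()), list(totals.values())


a = [7, 3, 5, 7, 5, 7]
b = [0.2, 0.1, 0.3, 0.1, 0.1, 0.2]
a_result, b_result = group_sum_first_seen(a, b)
# a_result holds the keys 7, 3, 5; b_result holds their sums
# (approximately 0.5, 0.1, 0.4, up to floating-point rounding)
```

This is a single pass over the data, but it runs in the Python interpreter, so it serves as a readable reference rather than a fast implementation.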

Alternatively, with random a and b in Python:

import numpy as np
a = np.random.randint(1,10,10000)
b = np.array([1./len(a)]*len(a))

I have two approaches, both surely far from the best achievable performance. Approach 1 (at least nice and short), time: 0.769315958023

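def approach_1(a, b):
    # a and b are NumPy arrays; each unique value triggers a full boolean-mask pass over a
    aResult = np.unique(a)
    bResult = [sum(b[i == a]) for i in aResult]
    return aResult, bResult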

Approach 2 (numpy.groupby, very slow), time: 4.65299129486

def approach_2(a,b): 
    tmp = [(a[i],b[i]) for i in range(len(a))]
    tmp2 = np.array(tmp, dtype = [('a', float),('b', float)])
    tmp2 = np.sort(tmp2, order='a') 

    bResult = []
    aResult = …
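For comparison with the two approaches above, one commonly used vectorized pattern combines `np.unique(..., return_inverse=True)` with `np.bincount(..., weights=...)`. This is a sketch under assumptions, not a solution taken from the question or its answers: the function name `group_sum_bincount` is hypothetical, a and b are assumed to be one-dimensional, and the keys come back in sorted order rather than the first-appearance order shown in the example.

```python
import numpy as np


def group_sum_bincount(a, b):
    """Group by the values of a and sum the matching entries of b (sorted keys)."""
    a = np.asarray(a)
    b = np.asarray(b, dtype=float)
    # keys: sorted unique values of a; inverse[i] is the index into keys for a[i]
    keys, inverse = np.unique(a, return_inverse=True)
    # bincount adds b[i] into the bin selected by inverse[i]
    sums = np.bincount(inverse, weights=b, minlength=len(keys))
    return keys, sums


keys, sums = group_sum_bincount([7, 3, 5, 7, 5, 7],
                                [0.2, 0.1, 0.3, 0.1, 0.1, 0.2])
# keys are 3, 5, 7 and sums are approximately 0.1, 0.4, 0.5
```

The sort inside `np.unique` dominates the cost, and the summation is a single pass, so no Python-level loop over the unique values is needed. Actual timings depend on the data and the machine and are not claimed here.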

python sorting performance group-by numpy

6 votes, 2 answers, 2839 views
