Quantile/median 2D binning in Python

Mar*_*ark 7 python statistics numpy scipy

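Do you know a fast/elegant Python/SciPy/NumPy solution to the following problem? You have a set of x, y coordinates with associated values w (all 1D arrays). Now bin x and y onto a 2D grid (size BINSxBINS) and compute a quantile (such as the median) of the w values in each bin, which should ultimately yield a BINSxBINS 2D array containing the desired quantiles.

This is easy to do with some nested loops, but I am sure there is a more elegant solution.

Thanks, Mark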

Bi *_*ico 5

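This is what I came up with; I hope it's useful. It isn't necessarily cleaner or better than using a loop, but it may give you a starting point toward something better.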

import numpy as np
bins_x, bins_y = 1., 1.
x = np.array([1,1,2,2,3,3,3])
y = np.array([1,1,2,2,3,3,3])
w = np.array([1,2,3,4,5,6,7], 'float')

# You can get a bin number for each point like this
x = (x // bins_x).astype('int')
y = (y // bins_y).astype('int')
shape = [x.max()+1, y.max()+1]
bin = np.ravel_multi_index([x, y], shape)

# You could get the mean by doing something like:
mean = np.bincount(bin, w) / np.bincount(bin)

# Median is a bit harder
order = bin.argsort()
bin = bin[order]
w = w[order]
edges = (bin[1:] != bin[:-1]).nonzero()[0] + 1
med_index = (np.r_[0, edges] + np.r_[edges, len(w)]) // 2
median = w[med_index]

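# But that's not quite right: for a bin with an even number of points this
# picks the upper of the two middle values instead of averaging them, so maybe
# (one entry per non-empty bin, in order of increasing bin number)
median2 = [np.median(i) for i in np.split(w, edges)]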

Also take a look at numpy.histogram2d.
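For comparison, here is a minimal sketch of a different route, not the approach in the answer above: SciPy's `scipy.stats.binned_statistic_2d` accepts a callable as its `statistic`, so an arbitrary quantile can be computed per bin. The helper name `binned_quantile` and its `value_range` parameter are hypothetical, introduced only for illustration. The sample data is the same as in the answer, and empty bins are assumed to come back as NaN.

```python
import numpy as np
from scipy.stats import binned_statistic_2d


def binned_quantile(x, y, w, q, bins, value_range=None):
    """Return a 2D array holding the q-quantile of w in each (x, y) bin.

    Hypothetical helper: a thin wrapper around binned_statistic_2d.
    """
    result = binned_statistic_2d(
        x, y, w,
        statistic=lambda values: np.quantile(values, q),  # called once per non-empty bin
        bins=bins,
        range=value_range,
    )
    return result.statistic


# Same sample data as in the answer above
x = np.array([1, 1, 2, 2, 3, 3, 3])
y = np.array([1, 1, 2, 2, 3, 3, 3])
w = np.array([1, 2, 3, 4, 5, 6, 7], dtype=float)

# 3x3 grid with bin edges at 0.5, 1.5, 2.5, 3.5 along each axis
median_grid = binned_quantile(
    x, y, w, q=0.5, bins=3, value_range=[[0.5, 3.5], [0.5, 3.5]]
)
print(median_grid)
```

Because the callable is invoked separately for every occupied bin, this is likely to be slower than the sort-and-split approach on large grids, but it returns the full BINSxBINS array directly, including the empty bins.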