Resize a 2D numpy array by averaging or rebinning

And*_*nca 26 python numpy slice binning

I am trying to reimplement an IDL function in Python:

http://star.pst.qub.ac.uk/idl/REBIN.html

It downsizes a 2D array by an integer factor by averaging.

For example:

>>> a=np.arange(24).reshape((4,6))
>>> a
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])

I want to resize it to (2,3) by taking the mean of the relevant samples. The expected output is:

>>> b = rebin(a, (2, 3))
>>> b
array([[  3.5,   5.5,  7.5],
       [ 15.5, 17.5,  19.5]])

That is, b[0,0] = np.mean(a[:2,:2]), b[0,1] = np.mean(a[:2,2:4]), and so on.
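To pin down the intended semantics, here is a slow reference version written as an explicit loop. It is only a sketch: the name `rebin_loop` is made up for illustration, and it assumes each new dimension divides the old one evenly.

```python
import numpy as np

def rebin_loop(a, shape):
    """Reference (slow) block averaging; assumes each new dimension divides the old one."""
    rows, cols = shape
    fr, fc = a.shape[0] // rows, a.shape[1] // cols  # block height and width
    out = np.empty(shape, dtype=float)
    for i in range(rows):
        for j in range(cols):
            # Average the fr x fc tile that maps to output element (i, j)
            out[i, j] = a[i * fr:(i + 1) * fr, j * fc:(j + 1) * fc].mean()
    return out

a = np.arange(24).reshape((4, 6))
b = rebin_loop(a, (2, 3))
```

Any faster solution should reproduce what this loop returns.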

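I believe I should reshape to a 4-dimensional array and then take the mean over the right slices, but I could not figure out the algorithm. Do you have any hints?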

jfs*_*jfs 34

Here is an example based on the answer you linked (for clarity):

>>> import numpy as np
>>> a = np.arange(24).reshape((4,6))
>>> a
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])
>>> a.reshape((2,a.shape[0]//2,3,-1)).mean(axis=3).mean(1)
array([[  3.5,   5.5,   7.5],
       [ 15.5,  17.5,  19.5]])
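To see why the reshape works, it may help to look at the intermediate 4D array. In this sketch (the variable names are only illustrative), axes 0 and 2 index the output element, while axes 1 and 3 index the position inside each 2x2 tile:

```python
import numpy as np

a = np.arange(24).reshape((4, 6))

# Axes: (row block, row within block, column block, column within block)
blocks = a.reshape(2, 2, 3, 2)

# blocks[i, :, j, :] is the 2x2 tile that feeds output element (i, j)
tile_00 = blocks[0, :, 0, :]   # same values as a[:2, :2]
tile_01 = blocks[0, :, 1, :]   # same values as a[:2, 2:4]

# Averaging over the two "within block" axes collapses each tile to one number
b = blocks.mean(axis=3).mean(axis=1)
```

Because reshape keeps the row-major element order, splitting an axis of length 4 into (2, 2) groups consecutive rows together, which is exactly the grouping that block averaging needs.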

As a function:

def rebin(a, shape):
    sh = shape[0],a.shape[0]//shape[0],shape[1],a.shape[1]//shape[1]
    return a.reshape(sh).mean(-1).mean(1)
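Note that the function above relies on the target shape dividing the input shape. If it does not, the reshape fails with a less helpful error. A possible variant with an explicit check might look like this (a sketch only; `rebin_checked` is a hypothetical name, and it takes the mean over both bin axes in a single call by passing a tuple to `axis`):

```python
import numpy as np

def rebin_checked(a, shape):
    """Block-average a 2D array down to `shape`; raises if the shapes are incompatible."""
    a = np.asarray(a)
    rows, cols = shape
    # A nonzero remainder means the blocks would not tile the array exactly
    if a.ndim != 2 or a.shape[0] % rows or a.shape[1] % cols:
        raise ValueError(f"cannot rebin {a.shape} to {shape}")
    blocks = a.reshape(rows, a.shape[0] // rows, cols, a.shape[1] // cols)
    return blocks.mean(axis=(1, 3))

a = np.arange(24).reshape((4, 6))
b = rebin_checked(a, (2, 3))
```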


der*_*icw 12

J.F. Sebastian has a great answer for 2D binning. Here is a version of his "rebin" function that works for N dimensions:

def bin_ndarray(ndarray, new_shape, operation='sum'):
    """
    Bins an ndarray in all axes based on the target shape, by summing or
        averaging.

    Number of output dimensions must match number of input dimensions and 
        new axes must divide old ones.

    Example
    -------
    >>> m = np.arange(0,100,1).reshape((10,10))
    >>> n = bin_ndarray(m, new_shape=(5,5), operation='sum')
    >>> print(n)

    [[ 22  30  38  46  54]
     [102 110 118 126 134]
     [182 190 198 206 214]
     [262 270 278 286 294]
     [342 350 358 366 374]]

    """
    operation = operation.lower()
    if operation not in ('sum', 'mean'):
        raise ValueError("Operation not supported.")
    if ndarray.ndim != len(new_shape):
        raise ValueError("Shape mismatch: {} -> {}".format(ndarray.shape,
                                                           new_shape))
    compression_pairs = [(d, c//d) for d,c in zip(new_shape,
                                                  ndarray.shape)]
    flattened = [l for p in compression_pairs for l in p]
    ndarray = ndarray.reshape(flattened)
    for i in range(len(new_shape)):
        op = getattr(ndarray, operation)
        ndarray = op(-1*(i+1))
    return ndarray
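The loop over axes can also be replaced by a single reduction over all the bin axes at once. The following is a rough sketch of that idea, restricted to the mean; `bin_mean_nd` is a made-up name, and it assumes every new dimension divides the corresponding old one:

```python
import numpy as np

def bin_mean_nd(arr, new_shape):
    """Average an N-d array over equal-sized bins; each new dimension must divide the old one."""
    arr = np.asarray(arr)
    if arr.ndim != len(new_shape):
        raise ValueError(f"Shape mismatch: {arr.shape} -> {new_shape}")
    if any(old % new for old, new in zip(arr.shape, new_shape)):
        raise ValueError(f"{new_shape} does not evenly divide {arr.shape}")
    # Interleave (new size, bin size) pairs: (d0, c0//d0, d1, c1//d1, ...)
    split = [n for new, old in zip(new_shape, arr.shape) for n in (new, old // new)]
    bin_axes = tuple(range(1, 2 * arr.ndim, 2))  # the odd axes hold the bin contents
    return arr.reshape(split).mean(axis=bin_axes)

cube = np.arange(64).reshape((4, 4, 4))
small = bin_mean_nd(cube, (2, 2, 2))
```

Here `small[0, 0, 0]` is the mean of `cube[:2, :2, :2]`, and so on for the other seven blocks.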


Mar*_*ark 5

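Here is a way to do what you asked using matrix multiplication. It does not require the new array dimensions to divide the old ones.

First, we generate a row compressor matrix and a column compressor matrix (I'm sure there is a cleaner way to do this, maybe even using numpy operations alone):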

import numpy as np


def get_row_compressor(old_dimension, new_dimension):
    dim_compressor = np.zeros((new_dimension, old_dimension))
    bin_size = float(old_dimension) / new_dimension
    next_bin_break = bin_size
    which_row = 0
    which_column = 0
    while which_row < dim_compressor.shape[0] and which_column < dim_compressor.shape[1]:
        if round(next_bin_break - which_column, 10) >= 1:
            dim_compressor[which_row, which_column] = 1
            which_column += 1
        elif next_bin_break == which_column:
            which_row += 1
            next_bin_break += bin_size
        else:
            partial_credit = next_bin_break - which_column
            dim_compressor[which_row, which_column] = partial_credit
            which_row += 1
            dim_compressor[which_row, which_column] = 1 - partial_credit
            which_column += 1
            next_bin_break += bin_size
    dim_compressor /= bin_size
    return dim_compressor


def get_column_compressor(old_dimension, new_dimension):
    return get_row_compressor(old_dimension, new_dimension).transpose()

... so, for example, get_row_compressor(5, 3) gives you:

[[ 0.6  0.4  0.   0.   0. ]
 [ 0.   0.2  0.6  0.2  0. ]
 [ 0.   0.   0.   0.4  0.6]]

and get_column_compressor(3, 2) gives you:

[[ 0.66666667  0.        ]
 [ 0.33333333  0.33333333]
 [ 0.          0.66666667]]
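Each weight in these matrices is the length of the overlap between an input cell and an output bin, divided by the bin size. Based on that reading, one possible way to build the same matrix with numpy operations alone is sketched below. `overlap_compressor` is a hypothetical name, not part of the code above, and the sketch assumes the new dimension is no larger than the old one:

```python
import numpy as np

def overlap_compressor(old_dimension, new_dimension):
    """Hypothetical vectorized counterpart of get_row_compressor (assumes new <= old)."""
    bin_size = old_dimension / new_dimension
    edges = np.arange(new_dimension + 1) * bin_size   # bin boundaries in old-index units
    lo, hi = edges[:-1, None], edges[1:, None]        # output bin i covers [lo[i], hi[i])
    cells = np.arange(old_dimension)                  # input cell k covers [k, k + 1)
    # Length of the intersection of each bin with each cell (negative means no overlap)
    overlap = np.minimum(hi, cells + 1) - np.maximum(lo, cells)
    return np.clip(overlap, 0, None) / bin_size       # each row sums to 1

weights = overlap_compressor(5, 3)
```

For `(5, 3)` this gives the same 3x5 matrix as shown above, up to floating-point rounding.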

Then simply premultiply by the row compressor and postmultiply by the column compressor to get the compressed matrix:

def compress_and_average(array, new_shape):
    # Note: new shape should be smaller in both dimensions than old shape
    # np.mat is unavailable in NumPy 2.0 and later, so use plain arrays with the @ operator
    row_compressor = get_row_compressor(array.shape[0], new_shape[0])
    column_compressor = get_column_compressor(array.shape[1], new_shape[1])
    return row_compressor @ np.asarray(array) @ column_compressor

Using this technique,

compress_and_average(np.array([[50, 7, 2, 0, 1],
                               [0, 0, 2, 8, 4],
                               [4, 1, 1, 0, 0]]), (2, 3))

yields:

[[ 21.86666667   2.66666667   2.26666667]
 [  1.86666667   1.46666667   1.86666667]]
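As a sanity check, when the new dimensions do divide the old ones, the matrix approach should agree with the reshape approach from the other answers. In that case each compressor row simply averages a run of adjacent cells, so for this sketch the compressors are built directly with `np.kron` (an illustrative shortcut, not the general construction above):

```python
import numpy as np

a = np.arange(24).reshape((4, 6))

# With an integer factor of 2 per axis, each compressor row averages two adjacent cells
row_compressor = np.kron(np.eye(2), np.ones((1, 2))) / 2          # shape (2, 4)
column_compressor = (np.kron(np.eye(3), np.ones((1, 2))) / 2).T   # shape (6, 3)

via_matmul = row_compressor @ a @ column_compressor
via_reshape = a.reshape(2, 2, 3, 2).mean(axis=(1, 3))
```

Both results are the 2x3 array of block means from the question.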