How to perform max/mean pooling on a 2D array using numpy

rap*_*ock 26 python arrays numpy matrix max-pooling

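Given a 2D (M x N) matrix and a 2D kernel (K x L), how do I return a matrix that is the result of max or mean pooling the image with the given kernel?

I'd like to use numpy if possible.

Note: M, N, K, L can each be even or odd, and they need not divide each other evenly, e.g. a 7x5 matrix and a 2x2 kernel.

For example, max pooling: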

matrix:
array([[  20,  200,   -5,   23],
       [ -13,  134,  119,  100],
       [ 120,   32,   49,   25],
       [-120,   12,    9,   23]])
kernel: 2 x 2
soln:
array([[  200,  119],
       [  120,   49]])

mdh*_*mdh 48

You can use scikit-image's block_reduce:

import numpy as np
import skimage.measure

a = np.array([
      [  20,  200,   -5,   23],
      [ -13,  134,  119,  100],
      [ 120,   32,   49,   25],
      [-120,   12,    9,   23]
])
skimage.measure.block_reduce(a, (2,2), np.max)

This gives:

array([[200, 119],
       [120,  49]])

  • This is a great answer if you only need the stride to equal your downscale size. The API does not (yet) let you change the stride, sadly. Upvoted :-) (4 upvotes)

Ell*_*iot 13

If the image size is evenly divisible by the kernel size, you can reshape the array and apply max or mean as needed:

import numpy as np

mat = np.array([[  20,  200,   -5,   23],
       [ -13,  134,  119,  100],
       [ 120,   32,   49,   25],
       [-120,   12,   9,   23]])

M, N = mat.shape
K = 2
L = 2

MK = M // K
NL = N // L
print(mat[:MK*K, :NL*L].reshape(MK, K, NL, L).max(axis=(1, 3)))
# [[200, 119], [120, 49]] 
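The same reshape trick gives mean pooling by swapping the reduction. Below is a minimal sketch, assuming the array is trimmed to a multiple of the kernel size as above; the helper name `reshape_pool` and its `reduce` parameter are made up for illustration.

```python
import numpy as np

def reshape_pool(mat, K, L, reduce=np.max):
    """Non-overlapping K x L pooling; rows/columns that don't fill a block are dropped."""
    M, N = mat.shape
    MK, NL = M // K, N // L
    # After the reshape, axes 1 and 3 index the positions inside each block.
    blocks = mat[:MK * K, :NL * L].reshape(MK, K, NL, L)
    return reduce(blocks, axis=(1, 3))

mat = np.array([[  20, 200,  -5,  23],
                [ -13, 134, 119, 100],
                [ 120,  32,  49,  25],
                [-120,  12,   9,  23]])

max_pooled = reshape_pool(mat, 2, 2, np.max)
mean_pooled = reshape_pool(mat, 2, 2, np.mean)
print(max_pooled)   # values: [[200, 119], [120, 49]]
print(mean_pooled)  # values: [[85.25, 59.25], [11.0, 26.5]]
```

Any reduction that accepts a tuple for `axis` (for example `np.min` or `np.sum`) can be passed the same way.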

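If the kernel size does not divide the image size evenly, you have to handle the boundaries separately. (As pointed out in the comments, this copies the matrix, which affects performance.)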

mat = np.array([[20,  200,   -5,   23, 7],
                [-13,  134,  119,  100, 8],
                [120,   32,   49,   25, 12],
                [-120,   12,   9,   23, 15],
                [-57,   84,   19,   17, 82],
                ])
# soln
# [200, 119, 8]
# [120, 49, 15]
# [84, 19, 82]
M, N = mat.shape
K = 2
L = 2

MK = M // K
NL = N // L

# split the matrix into 'quadrants'
Q1 = mat[:MK * K, :NL * L].reshape(MK, K, NL, L).max(axis=(1, 3))
Q2 = mat[MK * K:, :NL * L].reshape(-1, NL, L).max(axis=2)
Q3 = mat[:MK * K, NL * L:].reshape(MK, K, -1).max(axis=1)
Q4 = mat[MK * K:, NL * L:].max()

# compose the individual quadrants into one new matrix
soln = np.vstack([np.c_[Q1, Q3], np.c_[Q2, Q4]])
print(soln)
# [[200 119   8]
#  [120  49  15]
#  [ 84  19  82]]


Jas*_*son 7

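Instead of building "quadrants" as in Elliot's answer, you can pad the array so that it is evenly divisible, then perform max or mean pooling.

Since pooling is often used in CNNs, the input array is usually 3D. So I wrote a function that works on either 2D or 3D arrays.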

import numpy

def pooling(mat,ksize,method='max',pad=False):
    '''Non-overlapping pooling on 2D or 3D data.

    <mat>: ndarray, input array to pool.
    <ksize>: tuple of 2, kernel size in (ky, kx).
    <method>: str, 'max' for max-pooling,
                   'mean' for mean-pooling.
    <pad>: bool, pad <mat> or not. If no pad, output has size
           n//f, n being <mat> size, f being kernel size.
           if pad, output has size ceil(n/f).

    Return <result>: pooled matrix.
    '''

    m, n = mat.shape[:2]
    ky,kx=ksize

    _ceil=lambda x,y: int(numpy.ceil(x/float(y)))

    if pad:
        ny=_ceil(m,ky)
        nx=_ceil(n,kx)
        size=(ny*ky, nx*kx)+mat.shape[2:]
        mat_pad=numpy.full(size,numpy.nan)
        mat_pad[:m,:n,...]=mat
    else:
        ny=m//ky
        nx=n//kx
        mat_pad=mat[:ny*ky, :nx*kx, ...]

    new_shape=(ny,ky,nx,kx)+mat.shape[2:]

    if method=='max':
        result=numpy.nanmax(mat_pad.reshape(new_shape),axis=(1,3))
    else:
        result=numpy.nanmean(mat_pad.reshape(new_shape),axis=(1,3))

    return result
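To see what the NaN padding does on the 5x5 matrix from Elliot's answer, here is a compact 2D-only sketch of the same idea. It is not the function above: it swaps in `np.pad` for the manual `numpy.full` fill, and the name `pad_pool` is hypothetical.

```python
import numpy as np

def pad_pool(mat, ksize, method="max"):
    """Pad a 2D array with NaN up to a multiple of ksize, then pool (illustrative helper)."""
    ky, kx = ksize
    m, n = mat.shape
    pad_y = -m % ky  # rows needed to reach the next multiple of ky
    pad_x = -n % kx  # columns needed to reach the next multiple of kx
    # NaN needs a float dtype, so the result is float even for integer input.
    padded = np.pad(mat.astype(float), ((0, pad_y), (0, pad_x)),
                    mode="constant", constant_values=np.nan)
    blocks = padded.reshape(padded.shape[0] // ky, ky, padded.shape[1] // kx, kx)
    # nanmax / nanmean ignore the padding cells instead of letting them win.
    reduce = np.nanmax if method == "max" else np.nanmean
    return reduce(blocks, axis=(1, 3))

mat = np.array([[  20, 200,  -5,  23,  7],
                [ -13, 134, 119, 100,  8],
                [ 120,  32,  49,  25, 12],
                [-120,  12,   9,  23, 15],
                [ -57,  84,  19,  17, 82]])

padded_max = pad_pool(mat, (2, 2), "max")
padded_mean = pad_pool(mat, (2, 2), "mean")
print(padded_max)   # values: [[200, 119, 8], [120, 49, 15], [84, 19, 82]]
print(padded_mean)  # values: [[85.25, 59.25, 7.5], [11, 26.5, 13.5], [13.5, 18, 82]]
```

Because the padding is NaN rather than 0, the border blocks are averaged only over the cells that actually exist, and an all-negative border block still returns its true maximum.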

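Sometimes you may want overlapping pooling, with a stride that differs from the kernel size. Here is a function that does this, with or without padding: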

import numpy

def asStride(arr,sub_shape,stride):
    '''Get a strided sub-matrices view of an ndarray.
    See also skimage.util.shape.view_as_windows()
    '''
    s0,s1=arr.strides[:2]
    m1,n1=arr.shape[:2]
    m2,n2=sub_shape
    view_shape=(1+(m1-m2)//stride[0],1+(n1-n2)//stride[1],m2,n2)+arr.shape[2:]
    strides=(stride[0]*s0,stride[1]*s1,s0,s1)+arr.strides[2:]
    subs=numpy.lib.stride_tricks.as_strided(arr,view_shape,strides=strides)
    return subs

def poolingOverlap(mat,ksize,stride=None,method='max',pad=False):
    '''Overlapping pooling on 2D or 3D data.

    <mat>: ndarray, input array to pool.
    <ksize>: tuple of 2, kernel size in (ky, kx).
    <stride>: tuple of 2 or None, stride of pooling window.
              If None, same as <ksize> (non-overlapping pooling).
    <method>: str, 'max' for max-pooling,
                   'mean' for mean-pooling.
    <pad>: bool, pad <mat> or not. If no pad, output has size
           (n-f)//s+1, n being <mat> size, f being kernel size, s stride.
           if pad, output has size ceil(n/s).

    Return <result>: pooled matrix.
    '''

    m, n = mat.shape[:2]
    ky,kx=ksize
    if stride is None:
        stride=(ky,kx)
    sy,sx=stride

    _ceil=lambda x,y: int(numpy.ceil(x/float(y)))

    if pad:
        ny=_ceil(m,sy)
        nx=_ceil(n,sx)
        size=((ny-1)*sy+ky, (nx-1)*sx+kx) + mat.shape[2:]
        mat_pad=numpy.full(size,numpy.nan)
        mat_pad[:m,:n,...]=mat
    else:
        mat_pad=mat[:(m-ky)//sy*sy+ky, :(n-kx)//sx*sx+kx, ...]

    view=asStride(mat_pad,ksize,stride)

    if method=='max':
        result=numpy.nanmax(view,axis=(2,3))
    else:
        result=numpy.nanmean(view,axis=(2,3))

    return result
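For the unpadded 2D case, newer NumPy versions offer a safer route to the same strided view. The sketch below assumes NumPy 1.20 or later, where `numpy.lib.stride_tricks.sliding_window_view` is available; the helper name `window_pool` is made up for illustration and is not part of the answer above.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def window_pool(mat, ksize, stride, reduce=np.max):
    """Overlapping pooling on a 2D array, no padding (illustrative helper)."""
    sy, sx = stride
    # windows[i, j] is the ksize block whose top-left corner is mat[i, j].
    windows = sliding_window_view(mat, ksize)
    # Keep every sy-th / sx-th window, then reduce over the two window axes.
    return reduce(windows[::sy, ::sx], axis=(2, 3))

mat = np.array([[  20, 200,  -5,  23],
                [ -13, 134, 119, 100],
                [ 120,  32,  49,  25],
                [-120,  12,   9,  23]])

overlap_max = window_pool(mat, (2, 2), (1, 1))
block_max = window_pool(mat, (2, 2), (2, 2))
print(overlap_max)  # values: [[200, 200, 119], [134, 134, 119], [120, 49, 49]]
print(block_max)    # values: [[200, 119], [120, 49]]
```

The output size along each axis is (n-f)//s+1, matching the no-pad case described in the docstring above. The view is read-only by default, which avoids the accidental writes that a raw `as_strided` view permits.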

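  • This runs about 30 times faster than scikit's block_reduce. `block_reduce`: `9093 function calls in 0.035 seconds`; `pooling`: `10 function calls in 0.001 seconds` (2 upvotes)
  • @Tyathalae, I may be missing some context about the profiling in your comment, but it seems to me that if there were 9093 function calls to Scikit's `block_reduce` in 0.035 seconds (one per ~3.85 μs), while the 10 calls to the pooling function above took 0.001 seconds (one per ~0.1 ms), wouldn't that mean Scikit's `block_reduce` is actually about 26 times faster than the implementation above? Also, if I'm reading this correctly, the difference in sample size (i.e. function calls) is very large. Could you clarify? Thanks! (2 upvotes)
  • Hi @Greenstick, you're right, my comment was a bit ambiguous. It shows the number of functions (sub-calls) invoked to complete the operation. So scikit made a total of `9093` sub-function calls for a single `block_reduce()` call. Side note: that output format comes from `cProfile`. (2 upvotes)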
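To make the point in the last comment concrete, here is a small sketch of how a `cProfile` header of that form is produced. The "function calls" figure counts every sub-call made during a single top-level call, not repeated runs. The pooling helper and the 6x6 test array are assumptions for illustration, not the code that was profiled in the comments.

```python
import cProfile
import io
import pstats

import numpy as np

def reshape_max_pool(mat, K, L):
    """Reshape-based max pooling, used here only as something to profile."""
    M, N = mat.shape
    MK, NL = M // K, N // L
    return mat[:MK * K, :NL * L].reshape(MK, K, NL, L).max(axis=(1, 3))

mat = np.arange(36).reshape(6, 6)

profiler = cProfile.Profile()
profiler.enable()
pooled = reshape_max_pool(mat, 2, 2)  # exactly one top-level call
profiler.disable()

# Render the report into a string and pick out the summary header.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats()
report = stream.getvalue()
header = next(line for line in report.splitlines() if "function calls" in line)
print(header.strip())  # "<count> function calls in <time> seconds"
```

The count depends on the NumPy version and varies between implementations, so it says little about speed on its own; for wall-clock comparisons, repeated runs with `timeit` are usually the better tool.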