2D convolution in Python with missing data

Art*_*hur 4 python numpy convolution scipy

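I know there is the scipy.signal.convolve2d function for 2D convolution of 2D numpy arrays, and there is the numpy.ma module for handling missing data, but the two do not seem to be compatible (meaning that even if you mask a 2D array in numpy, the computation in convolve2d is not affected). Is there a way to handle missing values in a convolution using only the numpy and scipy packages?

For example: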

            1 - 3 4 5
            1 2 - 4 5
   Array =  1 2 3 - 5
            - 2 3 4 5
            1 2 3 4 -

  Kernel =  1  0
            0 -1

Desired result of convolve(Array, Kernel, boundary='wrap'):

               -1  - -1 -1 4
               -1 -1  - -1 4
    Result =   -1 -1 -1  - 5
                - -1 -1  4 4
                1 -1 -1 -1 -

Thanks to Aguy for the suggestion; it is a really good way to help compute the result after the convolution. Now suppose we can get the mask of Array from Array.mask, which gives us

                   False True  False False False                       
                   False False True  False False
    Array.mask ==  False False False True  False
                   True  False False False False
                   False False False False True

How do I use this mask to convert the result after convolution into a masked array?
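
The incompatibility described above, and the most literal way of reattaching the mask, can be sketched as follows. This is only an illustration using the 5x5 example from the question; `mode="same"` is an assumption made here so that the output has the same shape as the input mask.

```python
import numpy as np
from scipy.signal import convolve2d

# The 5x5 example from the question, with the same masked positions.
data = np.tile(np.arange(1.0, 6.0), (5, 1))
mask = np.zeros((5, 5), dtype=bool)
for row, col in [(0, 1), (1, 2), (2, 3), (3, 0), (4, 4)]:
    mask[row, col] = True
arr = np.ma.masked_array(data, mask=mask)
kernel = np.array([[1, 0], [0, -1]])

# convolve2d converts its input to a plain ndarray, so the mask is dropped
# and the values stored underneath the mask take part in the sums.
raw = convolve2d(arr, kernel, mode="same", boundary="wrap")
same_as_unmasked = np.array_equal(
    raw, convolve2d(arr.data, kernel, mode="same", boundary="wrap"))
print(type(raw), same_as_unmasked)

# Reattach the input mask so the originally missing cells stay missing.
result = np.ma.masked_array(raw, mask=arr.mask)
print(result)
```

Note that this only hides the cells that were masked in the input. Neighbouring output cells are still computed from the values stored under the mask, which is the issue the answers below deal with.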

Jas*_*son 7

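I don't think replacing with 0s is the right way to do this: you are pushing the convolved values towards 0. These missing values should be treated literally as "missing". Because they represent missing information, there is no reason to assume they could be 0, and they should not take part in any computation at all.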

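I tried setting the missing values to numpy.nan and then convolving. The result suggests that any overlap between the kernel and a missing value gives a nan in the result, even if the overlap is with a 0 in the kernel, so you get an enlarged hole of missing values in the result. Depending on your application, this may be the desired result.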
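
To illustrate the enlarged hole, here is a minimal sketch with a single nan and a 3x3 averaging kernel (both chosen here purely for illustration). The last line shows why a zero weight does not rescue a cell whenever the product is actually evaluated: in floating point, 0 times nan is still nan.

```python
import numpy as np
from scipy.signal import convolve2d

arr = np.ones((5, 5))
arr[2, 2] = np.nan                      # a single missing value
kernel = np.ones((3, 3)) / 9.0          # illustrative 3x3 averaging kernel

out = convolve2d(arr, kernel, mode="same")

# Every output cell whose 3x3 window touches the nan becomes nan.
print(np.isnan(out).astype(int))
print(np.isnan(out).sum())

# A zero weight multiplied by nan is still nan.
print(0.0 * np.nan)
```

One missing input cell turns into a 3x3 block of missing output cells, and the block grows with the kernel size.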

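But in some cases you don't want to discard so much information because of just 1 missing value (perhaps <= 50% missing is still tolerable). In that case, I found another module, astropy, that has a better implementation: numpy.nans are ignored (or replaced with interpolated values?).

So with astropy, you would do the following: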

import numpy
from astropy.convolution import convolve

# masking still doesn't work here, so masked values have to be set to numpy.nan
inarray=numpy.where(inarray.mask,numpy.nan,inarray)
result=convolve(inarray,kernel)
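
For readers who want to stay with numpy and scipy only, the idea of ignoring nans by renormalising over the valid cells can be sketched as below. This is not astropy's implementation; the function name `nan_normalized_convolve` is hypothetical, and the sketch assumes a kernel with non-negative weights so that the result is a weighted mean of the valid neighbours.

```python
import numpy as np
from scipy.ndimage import convolve

def nan_normalized_convolve(arr, kernel):
    """Convolve arr with kernel, renormalising over the non-nan cells only."""
    valid = ~np.isnan(arr)
    filled = np.where(valid, arr, 0.0)           # missing cells contribute nothing
    num = convolve(filled, kernel, mode="constant", cval=0.0)
    den = convolve(valid.astype(float), kernel, mode="constant", cval=0.0)
    with np.errstate(invalid="ignore", divide="ignore"):
        out = num / den                          # weighted mean of the valid cells
    out[den == 0] = np.nan                       # no valid data under the kernel
    return out

arr = np.full((5, 5), 2.0)
arr[2, 2] = np.nan
out = nan_normalized_convolve(arr, np.ones((3, 3)))
print(out)
```

Because the weights are renormalised at every position, a constant field with a hole comes back as the same constant, with the hole filled from its neighbours.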

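However, you still have no control over how much missing data is tolerated. To achieve that, I wrote a function that uses scipy.ndimage.convolve() for the initial convolution, but manually recomputes the values wherever missing values (numpy.nan) are involved: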

import numpy

def convolve2d(slab,kernel,max_missing=0.5,verbose=True):
    '''2D convolution with missings ignored

    <slab>: 2d array. Input array to convolve. Can have numpy.nan or masked values.
    <kernel>: 2d array, convolution kernel, must have sizes as odd numbers.
    <max_missing>: float in (0,1), max percentage of missing in each convolution
                   window is tolerated before a missing is placed in the result.

    Return <result>: 2d array, convolution result. Missings are represented as
                     numpy.nans if they are in <slab>, or masked if they are masked
                     in <slab>.

    '''

    from scipy.ndimage import convolve as sciconvolve

    assert numpy.ndim(slab)==2, "<slab> needs to be 2D."
    assert numpy.ndim(kernel)==2, "<kernel> needs to be 2D."
    assert kernel.shape[0]%2==1 and kernel.shape[1]%2==1, "<kernel> shape needs to be an odd number."
    assert max_missing > 0 and max_missing < 1, "<max_missing> needs to be a float in (0,1)."

    #--------------Get mask for missings--------------
    if not hasattr(slab,'mask') and numpy.any(numpy.isnan(slab))==False:
        has_missing=False
        slab2=slab.copy()

    elif not hasattr(slab,'mask') and numpy.any(numpy.isnan(slab)):
        has_missing=True
        slabmask=numpy.where(numpy.isnan(slab),1,0)
        slab2=slab.copy()
        missing_as='nan'

    elif (slab.mask.size==1 and slab.mask==False) or numpy.any(slab.mask)==False:
        has_missing=False
        slab2=slab.copy()

    elif not (slab.mask.size==1 and slab.mask==False) and numpy.any(slab.mask):
        has_missing=True
        slabmask=numpy.where(slab.mask,1,0)
        slab2=numpy.where(slabmask==1,numpy.nan,slab)
        missing_as='mask'

    else:
        has_missing=False
        slab2=slab.copy()

    #--------------------No missing--------------------
    if not has_missing:
        result=sciconvolve(slab2,kernel,mode='constant',cval=0.)
    else:
        H,W=slab.shape
        hh=int((kernel.shape[0]-1)/2)  # half height
        hw=int((kernel.shape[1]-1)/2)  # half width
        min_valid=(1-max_missing)*kernel.shape[0]*kernel.shape[1]

        # don't forget to flip the kernel
        kernel_flip=kernel[::-1,::-1]

        result=sciconvolve(slab2,kernel,mode='constant',cval=0.)
        slab2=numpy.where(slabmask==1,0,slab2)

        #------------------Get nan holes------------------
        miss_idx=zip(*numpy.where(slabmask==1))

        if missing_as=='mask':
            mask=numpy.zeros([H,W])

        for yii,xii in miss_idx:

            #-------Recompute at each new nan in result-------
            hole_ys=range(max(0,yii-hh),min(H,yii+hh+1))
            hole_xs=range(max(0,xii-hw),min(W,xii+hw+1))

            for hi in hole_ys:
                for hj in hole_xs:
                    hi1=max(0,hi-hh)
                    hi2=min(H,hi+hh+1)
                    hj1=max(0,hj-hw)
                    hj2=min(W,hj+hw+1)

                    slab_window=slab2[hi1:hi2,hj1:hj2]
                    mask_window=slabmask[hi1:hi2,hj1:hj2]
                    kernel_ij=kernel_flip[max(0,hh-hi):min(hh*2+1,hh+H-hi), 
                                     max(0,hw-hj):min(hw*2+1,hw+W-hj)]
                    kernel_ij=numpy.where(mask_window==1,0,kernel_ij)

                    #----Fill with missing if not enough valid data----
                    # count valid cells (summing kernel weights only works for an all-ones kernel)
                    nvalid=numpy.sum(mask_window==0)
                    if nvalid<min_valid:
                        if missing_as=='nan':
                            result[hi,hj]=numpy.nan
                        elif missing_as=='mask':
                            result[hi,hj]=0.
                            mask[hi,hj]=True
                    else:
                        result[hi,hj]=numpy.sum(slab_window*kernel_ij)

        if missing_as=='mask':
            result=numpy.ma.array(result)
            result.mask=mask

    return result

The figure below shows the output. On the left is a 30x30 random map with 3 numpy.nan holes, of sizes:

  1. 1x1
  2. 3x3
  3. 5x5

[Figure: left, the 30x30 input map with the three numpy.nan holes; right, the convolution output]

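On the right is the convolution output, produced with a 5x5 kernel (all 1s) and a tolerance level of 50% (max_missing=0.5).

So the first 2 smaller holes are filled using nearby values, while in the last one, because the number of missing values > 0.5x5x5 = 12.5, numpy.nans are placed to represent the missing information.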
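
The experiment can be approximated with a short vectorized sketch. This is not the function above: `convolve_with_tolerance` is a hypothetical name, the hole positions are made up, and cells outside the array are simply not counted as missing, which is an assumption that may differ from the loop-based version near the edges.

```python
import numpy as np
from scipy.ndimage import convolve

def convolve_with_tolerance(arr, kernel, max_missing=0.5):
    """Sum-convolve arr with nan treated as 0; blank cells with too many nans."""
    missing = np.isnan(arr)
    filled = np.where(missing, 0.0, arr)
    result = convolve(filled, kernel, mode="constant", cval=0.0)
    # Number of missing cells under the kernel footprint at each position.
    n_missing = convolve(missing.astype(float), np.ones(kernel.shape),
                         mode="constant", cval=0.0)
    result[n_missing > max_missing * kernel.size] = np.nan
    return result

rng = np.random.default_rng(0)
field = rng.random((30, 30))
field[3, 3] = np.nan             # 1x1 hole
field[10:13, 10:13] = np.nan     # 3x3 hole
field[20:25, 20:25] = np.nan     # 5x5 hole

out = convolve_with_tolerance(field, np.ones((5, 5)), max_missing=0.5)
print(np.isnan(out[3, 3]), np.isnan(out[10:13, 10:13]).any(), np.isnan(out[22, 22]))
```

With a 5x5 kernel, a window can contain at most 9 missing cells around the 3x3 hole, which stays below 12.5, so that hole should be filled; the centre of the 5x5 hole sees 25 missing cells and should remain nan.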


小智 5

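I found a hack. Instead of nan, use imaginary numbers (change each nan to 1i), run the convolution, and wherever the imaginary value is above a threshold, set the result to nan. Wherever it is below, just take the real value. Here is a code snippet: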

import numpy as np
from scipy import signal

# frames_ (input containing nans) and gaussian_window (kernel) are defined elsewhere
frames_complex = np.zeros_like(frames_, dtype=np.complex64)
frames_complex[np.isnan(frames_)] = 1j  # mark missing values on the imaginary axis
frames_complex[~np.isnan(frames_)] = frames_[~np.isnan(frames_)]
convolution = signal.convolve(frames_complex, gaussian_window, 'valid')
convolution[np.imag(convolution)>0.2] = np.nan
convolution = np.real(convolution).astype(np.float32)
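
To see what the two parts of the complex result mean, here is a small 1D sketch. The input, the window, and the 0.3 threshold are all made up for illustration. The imaginary part is the total kernel weight that landed on missing cells, and the real part is the convolution with the missing cells treated as 0.

```python
import numpy as np
from scipy import signal

x = np.array([1.0, 2.0, np.nan, 4.0, 5.0, 6.0])
window = np.array([0.25, 0.5, 0.25])    # illustrative weights that sum to 1

missing = np.isnan(x)
z = np.where(missing, 1j, x)            # nan -> 1j, real values unchanged
conv = signal.convolve(z, window, mode="valid")

# Real part: convolution with the missing cells counted as 0.
# Imaginary part: kernel weight that overlapped missing cells.
print(conv.real)
print(conv.imag)

# Keep the real part only where the missing weight is small enough.
out = np.where(conv.imag > 0.3, np.nan, conv.real)
print(out)
```

Since the real part counts missing cells as 0, the kept values are biased towards 0 wherever some weight fell on a nan. If that matters, one option is to divide the real part by `window.sum() - conv.imag`, which renormalises over the valid weight.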

  • Doesn't this zero out the real value at the nans? (2 upvotes)