numpy数组的边界框

a.s*_*iet 24 python arrays numpy transformation

假设您有一个2D numpy数组,其中包含一些随机值和周围的零.

示例"倾斜矩形":

import numpy as np
from skimage import transform

img1 = np.zeros((100,100))
img1[25:75,25:75] = 1.
img2 = transform.rotate(img1, 45)
Run Code Online (Sandbox Code Playgroud)

现在我想找到所有非零数据的最小边界矩形.例如:

a = np.where(img2 != 0)
bbox = img2[np.min(a[0]):np.max(a[0])+1, np.min(a[1]):np.max(a[1])+1]
Run Code Online (Sandbox Code Playgroud)

实现这一结果的最快方法是什么?我确信有更好的方法,因为如果我使用1000x1000数据集,np.where函数需要相当长的时间.

编辑:还应该在3D中工作......

ali*_*i_m 53

您可以通过使用np.any将包含非零值的行和列减少到1D向量来大致减半执行时间,而不是使用以下方法查找所有非零值的索引np.where:

def bbox1(img):
    a = np.where(img != 0)
    bbox = np.min(a[0]), np.max(a[0]), np.min(a[1]), np.max(a[1])
    return bbox

def bbox2(img):
    rows = np.any(img, axis=1)
    cols = np.any(img, axis=0)
    rmin, rmax = np.where(rows)[0][[0, -1]]
    cmin, cmax = np.where(cols)[0][[0, -1]]

    return rmin, rmax, cmin, cmax
Run Code Online (Sandbox Code Playgroud)

一些基准:

%timeit bbox1(img2)
10000 loops, best of 3: 63.5 µs per loop

%timeit bbox2(img2)
10000 loops, best of 3: 37.1 µs per loop
Run Code Online (Sandbox Code Playgroud)

将此方法扩展到3D案例只需沿每对轴执行缩减:

def bbox2_3D(img):

    r = np.any(img, axis=(1, 2))
    c = np.any(img, axis=(0, 2))
    z = np.any(img, axis=(0, 1))

    rmin, rmax = np.where(r)[0][[0, -1]]
    cmin, cmax = np.where(c)[0][[0, -1]]
    zmin, zmax = np.where(z)[0][[0, -1]]

    return rmin, rmax, cmin, cmax, zmin, zmax
Run Code Online (Sandbox Code Playgroud)

通过迭代每个独特的轴组合来执行减少,可以很容易地将其推广到Nitertools.combinations:

import itertools

def bbox2_ND(img):
    N = img.ndim
    out = []
    for ax in itertools.combinations(reversed(range(N)), N - 1):
        nonzero = np.any(img, axis=ax)
        out.extend(np.where(nonzero)[0][[0, -1]])
    return tuple(out)
Run Code Online (Sandbox Code Playgroud)

如果您知道原始边界框的角坐标,旋转角度和旋转中心,您可以通过计算相应的仿射变换矩阵并将其与输入点缀来直接获得变换的边界框角的坐标.坐标:

def bbox_rotate(bbox_in, angle, centre):

    rmin, rmax, cmin, cmax = bbox_in

    # bounding box corners in homogeneous coordinates
    xyz_in = np.array(([[cmin, cmin, cmax, cmax],
                        [rmin, rmax, rmin, rmax],
                        [   1,    1,    1,    1]]))

    # translate centre to origin
    cr, cc = centre
    cent2ori = np.eye(3)
    cent2ori[:2, 2] = -cr, -cc

    # rotate about the origin
    theta = np.deg2rad(angle)
    rmat = np.eye(3)
    rmat[:2, :2] = np.array([[ np.cos(theta),-np.sin(theta)],
                             [ np.sin(theta), np.cos(theta)]])

    # translate from origin back to centre
    ori2cent = np.eye(3)
    ori2cent[:2, 2] = cr, cc

    # combine transformations (rightmost matrix is applied first)
    xyz_out = ori2cent.dot(rmat).dot(cent2ori).dot(xyz_in)

    r, c = xyz_out[:2]

    rmin = int(r.min())
    rmax = int(r.max())
    cmin = int(c.min())
    cmax = int(c.max())

    return rmin, rmax, cmin, cmax
Run Code Online (Sandbox Code Playgroud)

这比使用np.any小示例数组要快得多:

%timeit bbox_rotate([25, 75, 25, 75], 45, (50, 50))
10000 loops, best of 3: 33 µs per loop
Run Code Online (Sandbox Code Playgroud)

但是,由于此方法的速度与输入数组的大小无关,因此对于较大的数组,它可以快得多.

将转换方法扩展到3D稍微复杂一点,因为旋转现在有三个不同的组件(一个围绕x轴,一个围绕y轴,一个围绕z轴),但基本方法是相同的:

def bbox_rotate_3d(bbox_in, angle_x, angle_y, angle_z, centre):

    rmin, rmax, cmin, cmax, zmin, zmax = bbox_in

    # bounding box corners in homogeneous coordinates
    xyzu_in = np.array(([[cmin, cmin, cmin, cmin, cmax, cmax, cmax, cmax],
                         [rmin, rmin, rmax, rmax, rmin, rmin, rmax, rmax],
                         [zmin, zmax, zmin, zmax, zmin, zmax, zmin, zmax],
                         [   1,    1,    1,    1,    1,    1,    1,    1]]))

    # translate centre to origin
    cr, cc, cz = centre
    cent2ori = np.eye(4)
    cent2ori[:3, 3] = -cr, -cc -cz

    # rotation about the x-axis
    theta = np.deg2rad(angle_x)
    rmat_x = np.eye(4)
    rmat_x[1:3, 1:3] = np.array([[ np.cos(theta),-np.sin(theta)],
                                 [ np.sin(theta), np.cos(theta)]])

    # rotation about the y-axis
    theta = np.deg2rad(angle_y)
    rmat_y = np.eye(4)
    rmat_y[[0, 0, 2, 2], [0, 2, 0, 2]] = (
        np.cos(theta), np.sin(theta), -np.sin(theta), np.cos(theta))

    # rotation about the z-axis
    theta = np.deg2rad(angle_z)
    rmat_z = np.eye(4)
    rmat_z[:2, :2] = np.array([[ np.cos(theta),-np.sin(theta)],
                               [ np.sin(theta), np.cos(theta)]])

    # translate from origin back to centre
    ori2cent = np.eye(4)
    ori2cent[:3, 3] = cr, cc, cz

    # combine transformations (rightmost matrix is applied first)
    tform = ori2cent.dot(rmat_z).dot(rmat_y).dot(rmat_x).dot(cent2ori)
    xyzu_out = tform.dot(xyzu_in)

    r, c, z = xyzu_out[:3]

    rmin = int(r.min())
    rmax = int(r.max())
    cmin = int(c.min())
    cmax = int(c.max())
    zmin = int(z.min())
    zmax = int(z.max())

    return rmin, rmax, cmin, cmax, zmin, zmax
Run Code Online (Sandbox Code Playgroud)

我已经基本上只是修改上面的函数使用从旋转矩阵表达在这里 -我还没来得及写测试用例还,所以请谨慎使用.


rth*_*rth 5

这是一种计算N维数组的边界框的算法,

def get_bounding_box(x):
    """ Calculates the bounding box of a ndarray"""
    mask = x == 0
    bbox = []
    all_axis = np.arange(x.ndim)
    for kdim in all_axis:
        nk_dim = np.delete(all_axis, kdim)
        mask_i = mask.all(axis=tuple(nk_dim))
        dmask_i = np.diff(mask_i)
        idx_i = np.nonzero(dmask_i)[0]
        if len(idx_i) != 2:
            raise ValueError('Algorithm failed, {} does not have 2 elements!'.format(idx_i))
        bbox.append(slice(idx_i[0]+1, idx_i[1]+1))
    return bbox
Run Code Online (Sandbox Code Playgroud)

可以与2D,3D等数组一起使用,如下所示,

In [1]: print((img2!=0).astype(int))
   ...: bbox = get_bounding_box(img2)
   ...: print((img2[bbox]!=0).astype(int))
   ...: 
[[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0]
 [0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0]
 [0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0]
 [0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0]
 [0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0]
 [0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0]
 [0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0]
 [0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]]
[[0 0 0 0 0 0 1 1 0 0 0 0 0 0]
 [0 0 0 0 0 1 1 1 1 0 0 0 0 0]
 [0 0 0 0 1 1 1 1 1 1 0 0 0 0]
 [0 0 0 1 1 1 1 1 1 1 1 0 0 0]
 [0 0 1 1 1 1 1 1 1 1 1 1 0 0]
 [0 1 1 1 1 1 1 1 1 1 1 1 1 0]
 [1 1 1 1 1 1 1 1 1 1 1 1 1 1]
 [1 1 1 1 1 1 1 1 1 1 1 1 1 1]
 [0 1 1 1 1 1 1 1 1 1 1 1 1 0]
 [0 0 1 1 1 1 1 1 1 1 1 1 0 0]
 [0 0 0 1 1 1 1 1 1 1 1 0 0 0]
 [0 0 0 0 1 1 1 1 1 1 0 0 0 0]
 [0 0 0 0 0 1 1 1 1 0 0 0 0 0]
 [0 0 0 0 0 0 1 1 0 0 0 0 0 0]]
Run Code Online (Sandbox Code Playgroud)

虽然用一个替换np.diffand np.nonzero调用np.where可能会更好。


All*_*ner 5

np.where通过替换np.argmax和处理布尔掩码,我能够挤出更多的性能。

\n\n
\ndef bbox(img):\n img = (img > 0)\n rows = np.any(img, axis=1)\n cols = np.any(img, axis=0)\n rmin, rmax = np.argmax(行), img.shape[0] - 1 - np.argmax(np.flipud(行))\n cmin, cmax = np.argmax(列), img.shape[1] - 1 - np .argmax(np.flipud(cols))\n 返回 rmin、rmax、cmin、cmax
\n\n

对于我来说,在相同的基准测试中,这比上面的 bbox2 解决方案快了大约 10\xc2\xb5s。还应该有一种方法只使用 argmax 的结果来查找非零行和列,避免使用 完成额外的搜索np.any,但这可能需要一些棘手的索引,我无法使用简单的方法有效地工作矢量化代码。

\n