在NumPy数组中概括切片操作

cs9*_*s95 8 python arrays indexing numpy

这个问题基于这个较老的问题:

给定一个数组:

In [122]: arr = np.array([[1, 3, 7], [4, 9, 8]]); arr
Out[122]: 
array([[1, 3, 7],
       [4, 9, 8]])
Run Code Online (Sandbox Code Playgroud)

鉴于其指数:

In [127]: np.indices(arr.shape)
Out[127]: 
array([[[0, 0, 0],
        [1, 1, 1]],

       [[0, 1, 2],
        [0, 1, 2]]])
Run Code Online (Sandbox Code Playgroud)

我怎样才能将它们整齐地叠在一起形成一个新的2D阵列?这就是我想要的:

array([[0, 0, 1],
       [0, 1, 3],
       [0, 2, 7],
       [1, 0, 4],
       [1, 1, 9],
       [1, 2, 8]])
Run Code Online (Sandbox Code Playgroud)

Divakar的这个解决方案是我目前用于2D阵列的解决方案:

def indices_merged_arr(arr):
    m,n = arr.shape
    I,J = np.ogrid[:m,:n]
    out = np.empty((m,n,3), dtype=arr.dtype)
    out[...,0] = I
    out[...,1] = J
    out[...,2] = arr
    out.shape = (-1,3)
    return out
Run Code Online (Sandbox Code Playgroud)

现在,如果我想传递一个3D数组,我需要修改这个函数:

def indices_merged_arr(arr):
    m,n,k = arr.shape   # here
    I,J,K = np.ogrid[:m,:n,:k]   # here
    out = np.empty((m,n,k,4), dtype=arr.dtype)   # here
    out[...,0] = I
    out[...,1] = J
    out[...,2] = K     # here
    out[...,3] = arr
    out.shape = (-1,4)   # here
    return out
Run Code Online (Sandbox Code Playgroud)

但此功能现在仅适用于3D阵列 - 我无法将2D数组传递给它.

有什么方法可以概括为什么适用于任何维度?这是我的尝试:

def indices_merged_arr_general(arr):
    tup = arr.shape   
    idx = np.ogrid[????]   # not sure what to do here....
    out = np.empty(tup + (len(tup) + 1, ), dtype=arr.dtype) 
    for i, j in enumerate(idx):
        out[...,i] = j
    out[...,len(tup) - 1] = arr
    out.shape = (-1, len(tup)
    return out
Run Code Online (Sandbox Code Playgroud)

我遇到这条线路有问题:

idx = np.ogrid[????]   
Run Code Online (Sandbox Code Playgroud)

我怎样才能使这个工作?

Div*_*kar 10

这是处理通用ndarrays的扩展 -

def indices_merged_arr_generic(arr, arr_pos="last"):
    n = arr.ndim
    grid = np.ogrid[tuple(map(slice, arr.shape))]
    out = np.empty(arr.shape + (n+1,), dtype=np.result_type(arr.dtype, int))

    if arr_pos=="first":
        offset = 1
    elif arr_pos=="last":
        offset = 0
    else:
        raise Exception("Invalid arr_pos")        

    for i in range(n):
        out[...,i+offset] = grid[i]
    out[...,-1+offset] = arr
    out.shape = (-1,n+1)

    return out
Run Code Online (Sandbox Code Playgroud)

样品运行

2D案例:

In [252]: arr
Out[252]: 
array([[37, 32, 73],
       [95, 80, 97]])

In [253]: indices_merged_arr_generic(arr)
Out[253]: 
array([[ 0,  0, 37],
       [ 0,  1, 32],
       [ 0,  2, 73],
       [ 1,  0, 95],
       [ 1,  1, 80],
       [ 1,  2, 97]])

In [254]: indices_merged_arr_generic(arr, arr_pos='first')
Out[254]: 
array([[37,  0,  0],
       [32,  0,  1],
       [73,  0,  2],
       [95,  1,  0],
       [80,  1,  1],
       [97,  1,  2]])
Run Code Online (Sandbox Code Playgroud)

3D案例​​:

In [226]: arr
Out[226]: 
array([[[35, 45, 33],
        [48, 38, 20],
        [69, 31, 90]],

       [[73, 65, 73],
        [27, 51, 45],
        [89, 50, 74]]])

In [227]: indices_merged_arr_generic(arr)
Out[227]: 
array([[ 0,  0,  0, 35],
       [ 0,  0,  1, 45],
       [ 0,  0,  2, 33],
       [ 0,  1,  0, 48],
       [ 0,  1,  1, 38],
       [ 0,  1,  2, 20],
       [ 0,  2,  0, 69],
       [ 0,  2,  1, 31],
       [ 0,  2,  2, 90],
       [ 1,  0,  0, 73],
       [ 1,  0,  1, 65],
       [ 1,  0,  2, 73],
       [ 1,  1,  0, 27],
       [ 1,  1,  1, 51],
       [ 1,  1,  2, 45],
       [ 1,  2,  0, 89],
       [ 1,  2,  1, 50],
       [ 1,  2,  2, 74]])
Run Code Online (Sandbox Code Playgroud)


unu*_*tbu 6

对于大型阵列,AFAIK,senderle的cartesian_product是使用NumPy生成笛卡尔积的最快方法1:


In [372]: A = np.random.random((100,100,100))

In [373]: %timeit indices_merged_arr_generic_using_cp(A)
100 loops, best of 3: 16.8 ms per loop

In [374]: %timeit indices_merged_arr_generic(A)
10 loops, best of 3: 28.9 ms per loop
Run Code Online (Sandbox Code Playgroud)

这是我用来进行基准测试的设置.下面indices_merged_arr_generic_using_cp是senderle的修改,cartesian_product包括扁平化阵列旁边的笛卡尔积:

import numpy as np
import functools

def indices_merged_arr_generic_using_cp(arr):
    """
    Based on cartesian_product
    http://stackoverflow.com/a/11146645/190597 (senderle)
    """
    shape = arr.shape
    arrays = [np.arange(s, dtype='int') for s in shape]
    broadcastable = np.ix_(*arrays)
    broadcasted = np.broadcast_arrays(*broadcastable)
    rows, cols = functools.reduce(np.multiply, broadcasted[0].shape), len(broadcasted)+1
    out = np.empty(rows * cols, dtype=arr.dtype)
    start, end = 0, rows
    for a in broadcasted:
        out[start:end] = a.reshape(-1)
        start, end = end, end + rows
    out[start:] = arr.flatten()
    return out.reshape(cols, rows).T

def indices_merged_arr_generic(arr):
    """
    https://stackoverflow.com/a/46135084/190597 (Divakar)
    """
    n = arr.ndim
    grid = np.ogrid[tuple(map(slice, arr.shape))]
    out = np.empty(arr.shape + (n+1,), dtype=arr.dtype)
    for i in range(n):
        out[...,i] = grid[i]
    out[...,-1] = arr
    out.shape = (-1,n+1)
    return out
Run Code Online (Sandbox Code Playgroud)

1请注意,上面我实际上使用了senderle的cartesian_product_transpose.对我来说,这是最快的版本.对于其他人,包括发送者,cartesian_product更快.