cs9*_*s95 8 python arrays indexing numpy
这个问题基于这个较老的问题:
给定一个数组:
Run Code Online (Sandbox Code Playgroud)In [122]: arr = np.array([[1, 3, 7], [4, 9, 8]]); arr Out[122]: array([[1, 3, 7], [4, 9, 8]])鉴于其指数:
Run Code Online (Sandbox Code Playgroud)In [127]: np.indices(arr.shape) Out[127]: array([[[0, 0, 0], [1, 1, 1]], [[0, 1, 2], [0, 1, 2]]])我怎样才能将它们整齐地叠在一起形成一个新的2D阵列?这就是我想要的:
Run Code Online (Sandbox Code Playgroud)array([[0, 0, 1], [0, 1, 3], [0, 2, 7], [1, 0, 4], [1, 1, 9], [1, 2, 8]])
Divakar的这个解决方案是我目前用于2D阵列的解决方案:
def indices_merged_arr(arr):
    m,n = arr.shape
    I,J = np.ogrid[:m,:n]
    out = np.empty((m,n,3), dtype=arr.dtype)
    out[...,0] = I
    out[...,1] = J
    out[...,2] = arr
    out.shape = (-1,3)
    return out
现在,如果我想传递一个3D数组,我需要修改这个函数:
def indices_merged_arr(arr):
    m,n,k = arr.shape   # here
    I,J,K = np.ogrid[:m,:n,:k]   # here
    out = np.empty((m,n,k,4), dtype=arr.dtype)   # here
    out[...,0] = I
    out[...,1] = J
    out[...,2] = K     # here
    out[...,3] = arr
    out.shape = (-1,4)   # here
    return out
但此功能现在仅适用于3D阵列 - 我无法将2D数组传递给它.
有什么方法可以概括为什么适用于任何维度?这是我的尝试:
def indices_merged_arr_general(arr):
    tup = arr.shape   
    idx = np.ogrid[????]   # not sure what to do here....
    out = np.empty(tup + (len(tup) + 1, ), dtype=arr.dtype) 
    for i, j in enumerate(idx):
        out[...,i] = j
    out[...,len(tup) - 1] = arr
    out.shape = (-1, len(tup)
    return out
我遇到这条线路有问题:
idx = np.ogrid[????]   
我怎样才能使这个工作?
Div*_*kar 10
这是处理通用ndarrays的扩展 -
def indices_merged_arr_generic(arr, arr_pos="last"):
    n = arr.ndim
    grid = np.ogrid[tuple(map(slice, arr.shape))]
    out = np.empty(arr.shape + (n+1,), dtype=np.result_type(arr.dtype, int))
    if arr_pos=="first":
        offset = 1
    elif arr_pos=="last":
        offset = 0
    else:
        raise Exception("Invalid arr_pos")        
    for i in range(n):
        out[...,i+offset] = grid[i]
    out[...,-1+offset] = arr
    out.shape = (-1,n+1)
    return out
样品运行
2D案例:
In [252]: arr
Out[252]: 
array([[37, 32, 73],
       [95, 80, 97]])
In [253]: indices_merged_arr_generic(arr)
Out[253]: 
array([[ 0,  0, 37],
       [ 0,  1, 32],
       [ 0,  2, 73],
       [ 1,  0, 95],
       [ 1,  1, 80],
       [ 1,  2, 97]])
In [254]: indices_merged_arr_generic(arr, arr_pos='first')
Out[254]: 
array([[37,  0,  0],
       [32,  0,  1],
       [73,  0,  2],
       [95,  1,  0],
       [80,  1,  1],
       [97,  1,  2]])
3D案例:
In [226]: arr
Out[226]: 
array([[[35, 45, 33],
        [48, 38, 20],
        [69, 31, 90]],
       [[73, 65, 73],
        [27, 51, 45],
        [89, 50, 74]]])
In [227]: indices_merged_arr_generic(arr)
Out[227]: 
array([[ 0,  0,  0, 35],
       [ 0,  0,  1, 45],
       [ 0,  0,  2, 33],
       [ 0,  1,  0, 48],
       [ 0,  1,  1, 38],
       [ 0,  1,  2, 20],
       [ 0,  2,  0, 69],
       [ 0,  2,  1, 31],
       [ 0,  2,  2, 90],
       [ 1,  0,  0, 73],
       [ 1,  0,  1, 65],
       [ 1,  0,  2, 73],
       [ 1,  1,  0, 27],
       [ 1,  1,  1, 51],
       [ 1,  1,  2, 45],
       [ 1,  2,  0, 89],
       [ 1,  2,  1, 50],
       [ 1,  2,  2, 74]])
对于大型阵列,AFAIK,senderle的cartesian_product是使用NumPy生成笛卡尔积的最快方法1:
In [372]: A = np.random.random((100,100,100))
In [373]: %timeit indices_merged_arr_generic_using_cp(A)
100 loops, best of 3: 16.8 ms per loop
In [374]: %timeit indices_merged_arr_generic(A)
10 loops, best of 3: 28.9 ms per loop
这是我用来进行基准测试的设置.下面indices_merged_arr_generic_using_cp是senderle的修改,cartesian_product包括扁平化阵列旁边的笛卡尔积:
import numpy as np
import functools
def indices_merged_arr_generic_using_cp(arr):
    """
    Based on cartesian_product
    http://stackoverflow.com/a/11146645/190597 (senderle)
    """
    shape = arr.shape
    arrays = [np.arange(s, dtype='int') for s in shape]
    broadcastable = np.ix_(*arrays)
    broadcasted = np.broadcast_arrays(*broadcastable)
    rows, cols = functools.reduce(np.multiply, broadcasted[0].shape), len(broadcasted)+1
    out = np.empty(rows * cols, dtype=arr.dtype)
    start, end = 0, rows
    for a in broadcasted:
        out[start:end] = a.reshape(-1)
        start, end = end, end + rows
    out[start:] = arr.flatten()
    return out.reshape(cols, rows).T
def indices_merged_arr_generic(arr):
    """
    https://stackoverflow.com/a/46135084/190597 (Divakar)
    """
    n = arr.ndim
    grid = np.ogrid[tuple(map(slice, arr.shape))]
    out = np.empty(arr.shape + (n+1,), dtype=arr.dtype)
    for i in range(n):
        out[...,i] = grid[i]
    out[...,-1] = arr
    out.shape = (-1,n+1)
    return out
1请注意,上面我实际上使用了senderle的cartesian_product_transpose.对我来说,这是最快的版本.对于其他人,包括发送者,cartesian_product更快.
| 归档时间: | 
 | 
| 查看次数: | 456 次 | 
| 最近记录: |