How to use numpy.argsort() as indices in more than 2 dimensions?

131*_*13e 8 python arrays sorting numpy

I know something similar to this question has been asked many times over already, but all answers given to similar questions only seem to work for arrays with 2 dimensions.

My understanding of np.argsort() is that np.sort(array) == array[np.argsort(array)] should be True. I have found out that this is indeed correct if np.ndim(array) == 1, but it gives different results if np.ndim(array) > 1.

Example:

>>> array = np.array([[[ 0.81774634,  0.62078744],
                       [ 0.43912609,  0.29718462]],
                      [[ 0.1266578 ,  0.82282054],
                       [ 0.98180375,  0.79134389]]])
>>> np.sort(array)
array([[[ 0.62078744,  0.81774634],
        [ 0.29718462,  0.43912609]],

       [[ 0.1266578 ,  0.82282054],
        [ 0.79134389,  0.98180375]]])
>>> array.argsort()
array([[[1, 0],
        [1, 0]],

       [[0, 1],
        [1, 0]]])
>>> array[array.argsort()]
array([[[[[ 0.1266578 ,  0.82282054],
          [ 0.98180375,  0.79134389]],

         [[ 0.81774634,  0.62078744],
          [ 0.43912609,  0.29718462]]],


        [[[ 0.1266578 ,  0.82282054],
          [ 0.98180375,  0.79134389]],

         [[ 0.81774634,  0.62078744],
          [ 0.43912609,  0.29718462]]]],



       [[[[ 0.81774634,  0.62078744],
          [ 0.43912609,  0.29718462]],

         [[ 0.1266578 ,  0.82282054],
          [ 0.98180375,  0.79134389]]],


        [[[ 0.1266578 ,  0.82282054],
          [ 0.98180375,  0.79134389]],

         [[ 0.81774634,  0.62078744],
          [ 0.43912609,  0.29718462]]]]])

So, can anybody explain to me how exactly np.argsort() can be used as the indices to obtain the sorted array? The only way I can come up with is:

args = np.argsort(array)
array_sort = np.zeros_like(array)
for i in range(array.shape[0]):
    for j in range(array.shape[1]):
        array_sort[i, j] = array[i, j, args[i, j]]

which is extremely tedious and cannot be generalized for any given number of dimensions.

Pau*_*zer 10

Here is a general method:

import numpy as np

array = np.array([[[ 0.81774634,  0.62078744],
                   [ 0.43912609,  0.29718462]],
                  [[ 0.1266578 ,  0.82282054],
                   [ 0.98180375,  0.79134389]]])

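a = 1  # axis to sort along; 0 or 2 work as well

order = array.argsort(axis=a)

# one open (broadcastable) index range per dimension;
# list() so that a single entry can be replaced below
idx = list(np.ogrid[tuple(map(slice, array.shape))])
# if you don't need full ND generality: in 3D this can be written
# much more readably as
# m, n, k = array.shape
# idx = list(np.ogrid[:m, :n, :k])

idx[a] = order

# index with a tuple: NumPy deprecated indexing with a list of arrays
print(np.all(array[tuple(idx)] == np.sort(array, axis=a)))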

Output:

True

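Explanation: For each element of the output array we must specify the full index of the corresponding element of the input array. Each index into the input array therefore has the same shape as the output array, or must be broadcastable to that shape.

The indices for the axes along which we do not sort/argsort stay in place. We therefore need to pass a broadcastable range(array.shape[i]) for each of those axes. The easiest way is to use ogrid to create such a range for all dimensions (if we used these directly, the array would be returned unchanged) and then replace the index corresponding to the sort axis with the output of argsort.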
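
To make the broadcasting argument concrete, here is a minimal sketch. The small integer array `x` and the variable names are illustrative assumptions, not part of the original answer. It shows the shapes of the open grids that np.ogrid produces, checks that indexing with them unmodified gives back the array, and then swaps in the argsort result for one axis:

```python
import numpy as np

# Illustrative data (an assumption for this sketch, not from the question)
x = np.array([[[3, 1], [2, 0]],
              [[5, 7], [6, 4]]])

# One open index array per axis; list() so a single entry can be replaced
idx = list(np.ogrid[tuple(map(slice, x.shape))])
shapes = [i.shape for i in idx]  # (2, 1, 1), (1, 2, 1), (1, 1, 2)

# The three grids broadcast to x.shape, so indexing with them is the identity
unchanged = x[tuple(idx)]

# Replace the entry for the sort axis with the argsort result
axis = 2
idx[axis] = x.argsort(axis=axis)
sorted_x = x[tuple(idx)]

print(shapes)
print(np.array_equal(unchanged, x))
print(np.array_equal(sorted_x, np.sort(x, axis=axis)))
```

Each grid has length 1 along every axis except its own, which is why the grids broadcast against the full-shape argsort output.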

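UPDATE March 2019:

NumPy is becoming stricter about requiring multi-axis indices to be passed as tuples. Currently, array[idx] with idx being a list triggers a deprecation warning. To be future-proof, use array[tuple(idx)] instead. (Thanks @Nathan)

Or use NumPy's new function (added in version 1.15.0) take_along_axis: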

np.take_along_axis(array, order, a)
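
As a quick check of the one-liner above, the following sketch loops over every axis of the array from the question and compares take_along_axis against np.sort. It assumes NumPy 1.15.0 or later; the `results` dictionary is only an illustrative name:

```python
import numpy as np

array = np.array([[[0.81774634, 0.62078744],
                   [0.43912609, 0.29718462]],
                  [[0.1266578, 0.82282054],
                   [0.98180375, 0.79134389]]])

# For each axis, gather elements in argsort order and compare with np.sort
results = {}
for a in range(array.ndim):
    order = array.argsort(axis=a)
    gathered = np.take_along_axis(array, order, axis=a)
    results[a] = np.array_equal(gathered, np.sort(array, axis=a))

print(results)
```

Because take_along_axis builds the broadcastable indices internally, no ogrid bookkeeping is needed and the same call works for any number of dimensions.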

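  • You now need to add `idx = tuple(idx)` to avoid a `FutureWarning` (and, later on, an error or wrong results). Something like this really belongs in the argsort documentation. There should be a simple, standard way to get from the sorting indices to the sorted array. (2 upvotes)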