NumPy array: get the subset/slice of an array that is not NaN

Question

She*_*284 5 python arrays numpy slice python-2.7

I have an array of shape (50, 50). Inside this array there is a slice of shape (20, 10). Only this slice contains data; everything else is set to NaN.

How do I cut this slice out of my large array?

Answer 1

You can use fancy (boolean mask) indexing to collect the items that are not NaN:

a = a[ np.logical_not( np.isnan(a) ) ].reshape(20,10)

Or, as suggested by Joe Kington:

a = a[ ~np.isnan(a) ]

That said, you have to know the shape of the non-NaN region in advance. OP, do you know it beforehand? (2 upvotes)
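On the point about knowing the shape in advance: boolean indexing always returns a flat 1-D array, but the shape can be recovered from the mask itself. The following is only a sketch, not part of the original answer. It assumes the non-NaN values form one solid rectangle, and the test array and the names `mask`, `flat`, `n_rows`, `n_cols` are made up for illustration.

```python
import numpy as np

# Hypothetical test data: a 50x50 array of NaN holding one 20x10 block of values.
a = np.full((50, 50), np.nan)
a[5:25, 30:40] = np.arange(200.0).reshape(20, 10)

mask = ~np.isnan(a)   # True where a holds real data
flat = a[mask]        # boolean indexing returns a 1-D array, in row-major order

# Assumption: the non-NaN entries form a single solid rectangle.
n_rows = int(mask.any(axis=1).sum())  # rows containing at least one value
n_cols = int(mask.any(axis=0).sum())  # columns containing at least one value
block = flat.reshape(n_rows, n_cols)

print(flat.shape, block.shape)  # (200,) (20, 10)
```

If the data region has holes or is not rectangular, the reshape will either fail or silently misalign values, so this only fits the situation described in the question.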

Answer 2

Jam*_*ter 1

Do you know where the NaNs are? If so, something like this should work:

newarray = np.copy(oldarray[xstart:xend,ystart:yend])

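where xstart and xend are the start and end of the slice you want along the x dimension, and similarly for y. If you no longer need the old array, you can delete it to free memory.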
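To show why the `np.copy` matters when the old array is going to be deleted, here is a small sketch. The bounds and array contents are made-up values for illustration. A plain slice is a view that shares memory with the big array, so only an explicit copy lets the big array actually be released.

```python
import numpy as np

# Hypothetical data and bounds, assumed to be known ahead of time.
oldarray = np.full((50, 50), np.nan)
oldarray[5:25, 30:40] = 1.0
xstart, xend, ystart, yend = 5, 25, 30, 40

view = oldarray[xstart:xend, ystart:yend]  # basic slice: a view into oldarray
newarray = np.copy(view)                   # independent copy with its own memory

view_shares = np.shares_memory(view, oldarray)      # True
copy_shares = np.shares_memory(newarray, oldarray)  # False
print(view_shares, copy_shares)

# Dropping every reference to the big array (including the view) lets it be freed.
del view, oldarray
```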

If you don't know where the NaNs are, this should do the trick:

# In this example the starting array is A.
import numpy as np
nan_mask = np.isnan(A)  # boolean array, True where A is NaN
# (row, col) index of every non-NaN entry, in row-major order
non_nan_idxs = np.argwhere(~nan_mask)
# Slice out the NaNs: for a solid rectangular block, the first and last
# indices are its top-left and bottom-right corners.
corner1 = non_nan_idxs[0]
corner2 = non_nan_idxs[-1]
xdist = corner2[0] - corner1[0] + 1
ydist = corner2[1] - corner1[1] + 1
B = np.copy(A[corner1[0]:corner1[0] + xdist, corner1[1]:corner1[1] + ydist])
# B is now the array you want

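Note that this will be very slow for large arrays, because np.where has to traverse the whole array. There is an open issue on the NumPy bug tracker for a method that finds the first index equal to some value and then stops. There may be a more elegant way to do this; this is just the first thing that came to mind.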
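One possibly tidier variant, offered as a sketch rather than as the approach of any answer here: reduce the mask along each axis first, then take the first and last occupied row and column. The helper name `crop_to_data` and the test array are hypothetical. This still scans the whole array once, so it does not solve the early-stopping issue mentioned above, but it avoids building an index entry for every non-NaN element.

```python
import numpy as np

def crop_to_data(arr):
    """Return a copy of the bounding box of the non-NaN entries of a 2-D array."""
    mask = ~np.isnan(arr)
    rows = np.flatnonzero(mask.any(axis=1))  # indices of rows holding any data
    cols = np.flatnonzero(mask.any(axis=0))  # indices of columns holding any data
    if rows.size == 0:
        # Everything is NaN: return an empty 0x0 array.
        return arr[:0, :0].copy()
    return arr[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1].copy()

# Hypothetical test data matching the question: 50x50 NaN with a 20x10 block.
A = np.full((50, 50), np.nan)
A[5:25, 30:40] = np.arange(200.0).reshape(20, 10)
B = crop_to_data(A)
print(B.shape)  # (20, 10)
```

If the data is not a solid rectangle, the result is the bounding box of the data and may still contain NaNs.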

Edit: ignore this, sgpc's answer is much better.
