How to exclude rows/columns from numpy.ndarray data

Question

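Suppose we have a numpy.ndarray of data, say with shape (100,200), and you also have a list of indices that you want to exclude from the data. How would you do that? Something like this: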

import numpy
a = numpy.random.rand(100,200)
indices = numpy.random.randint(100,size=20)
b = a[-indices,:] # imaginary code, what to replace here?

Thanks.

Answer 1

小智 12

You can use b = numpy.delete(a, indices, axis=0)

Source: the NumPy documentation.
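Since the question title mentions both rows and columns, here is a minimal sketch of `numpy.delete` on each axis. The small array and the names `rows_to_drop` and `cols_to_drop` are made up for illustration and are not from the question.

```python
import numpy as np

a = np.arange(12).reshape(4, 3)   # 4 rows, 3 columns
rows_to_drop = [0, 2]             # hypothetical row indices to exclude
cols_to_drop = [1]                # hypothetical column indices to exclude

# axis=0 removes rows, axis=1 removes columns.
# np.delete returns a new array; `a` itself is left unchanged.
without_rows = np.delete(a, rows_to_drop, axis=0)
without_cols = np.delete(a, cols_to_drop, axis=1)

print(without_rows.shape)  # (2, 3)
print(without_cols.shape)  # (4, 2)
```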

For a numeric list of indices, `np.delete` uses the `mask` solution that you previously rejected as taking too much memory. (3 upvotes)
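For reference, a mask-based version might look like the sketch below. It is only an illustration of the idea the comment refers to, not a copy of NumPy's internals. The mask here is one-dimensional, with one boolean per row rather than one per element, and the name `keep` is my own.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((100, 200))
indices = rng.integers(0, 100, size=20)   # may contain duplicates

# One boolean per row: True means "keep this row".
keep = np.ones(a.shape[0], dtype=bool)
keep[indices] = False

b = a[keep, :]

# Same result as np.delete along axis 0.
assert np.array_equal(b, np.delete(a, indices, axis=0))
```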

Answer 2

Tho*_*sen 6

You can try:

import numpy as np

a = np.random.rand(100, 200)
indices = np.random.randint(100, size=20)
# keep only the row indices that are not in `indices`
b = a[np.setdiff1d(np.arange(100), indices), :]

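This avoids creating a mask array of the same size as your data, as is done in /sf/answers/1471592741/. Note that this example produces a 2D array b rather than the flattened array from that answer.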
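A small sketch of what `np.setdiff1d` does here, using a made-up 5x4 array. It returns the sorted, unique values of the first argument that are not in the second, so duplicate or unsorted entries in `indices` are harmless. The name `kept_rows` is only for illustration.

```python
import numpy as np

a = np.arange(20).reshape(5, 4)
indices = np.array([3, 1, 3])   # unsorted, with a duplicate

# Row numbers 0..4 minus the excluded ones, returned sorted and unique.
kept_rows = np.setdiff1d(np.arange(a.shape[0]), indices)
b = a[kept_rows, :]

print(kept_rows)  # [0 2 4]
print(b.shape)    # (3, 4), still two-dimensional
```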

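A cursory look at the runtime and memory cost of this approach versus /sf/answers/2119141251/ seems to suggest that delete is faster, while indexing with setdiff1d is lighter on memory: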

In [75]: %timeit b = np.delete(a, indices, axis=0)
The slowest run took 7.47 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 24.7 µs per loop

In [76]: %timeit c = a[np.setdiff1d(np.arange(100),indices),:]
10000 loops, best of 3: 48.4 µs per loop

In [77]: %memit b = np.delete(a, indices, axis=0)
peak memory: 52.27 MiB, increment: 0.85 MiB

In [78]: %memit c = a[np.setdiff1d(np.arange(100),indices),:]
peak memory: 52.39 MiB, increment: 0.12 MiB
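If you want to repeat the timing comparison outside IPython, something like the following sketch with the standard `timeit` module should work. The function names and the loop count are my own choices, and the numbers will differ by machine and NumPy version, so treat any result as a rough indication only.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((100, 200))
indices = rng.integers(0, 100, size=20)

def with_delete():
    return np.delete(a, indices, axis=0)

def with_setdiff1d():
    return a[np.setdiff1d(np.arange(a.shape[0]), indices), :]

# Both approaches should produce the same array before we compare speed.
assert np.array_equal(with_delete(), with_setdiff1d())

n = 1000  # calls per timing run
t_delete = min(timeit.repeat(with_delete, number=n, repeat=3)) / n
t_setdiff = min(timeit.repeat(with_setdiff1d, number=n, repeat=3)) / n

print(f"np.delete:    {t_delete * 1e6:.1f} µs per call")
print(f"np.setdiff1d: {t_setdiff * 1e6:.1f} µs per call")
```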

Archived:	12 years, 5 months ago
Views:	12323
Last activity:	8 years, 5 months ago