numpy: conditional sum

Question


I have the following numpy array:

import numpy as np
arr = np.array([[1,2,3,4,2000],
                [5,6,7,8,2000],
                [9,0,1,2,2001],
                [3,4,5,6,2001],
                [7,8,9,0,2002],
                [1,2,3,4,2002],
                [5,6,7,8,2003],
                [9,0,1,2,2003]
              ])


I know that np.sum(arr, axis=0) gives the following result:

array([   40,    28,    36,    34, 16012])


What I would like to do (without a for loop) is sum the columns grouped by the value in the last column, so that the result is:

array([[   6,    8,   10,   12, 4000],
       [  12,    4,    6,    8, 4002],
       [   8,   10,   12,    4, 4004],
       [  14,    6,    8,   10, 4006]])


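I realise that doing this without a loop may be a stretch, but I am hoping for the best...

If a for loop must be used, how would that work?

I tried np.sum(arr[:, 4]==2000, axis=0) (where I would replace 2000 with the variable from the for loop), but it gave a result of 2.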

Answer 1

Mad*_*ist 4

You can do this in pure numpy with a clever application of np.diff and np.add.reduceat. np.diff is nonzero wherever the rightmost column changes:

d = np.diff(arr[:, -1])


np.where converts d, which is nonzero exactly where the last column changes, into the integer indices that np.add.reduceat expects:

d = np.where(d)[0]


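reduceat also expects a leading zero index, and every index needs to be shifted by one, because np.diff marks the last row of each group rather than the first row of the next: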

indices = np.r_[0, d + 1]


np.r_ is a little more convenient here than np.concatenate because it accepts scalars. The sum then becomes:

result = np.add.reduceat(arr, indices, axis=0)

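For reference, here is a small self-contained sketch of the same steps run on the array from the question, with the intermediate values shown in comments. The variable names steps, last_rows and starts are my own and only there for illustration:

```python
import numpy as np

arr = np.array([[1, 2, 3, 4, 2000],
                [5, 6, 7, 8, 2000],
                [9, 0, 1, 2, 2001],
                [3, 4, 5, 6, 2001],
                [7, 8, 9, 0, 2002],
                [1, 2, 3, 4, 2002],
                [5, 6, 7, 8, 2003],
                [9, 0, 1, 2, 2003]])

# Difference between consecutive values of the last column:
# nonzero exactly where the value changes from one row to the next.
steps = np.diff(arr[:, -1])        # [0 1 0 1 0 1 0]

# Positions of the nonzero entries, i.e. the last row of every group
# except the final one.
last_rows = np.where(steps)[0]     # [1 3 5]

# Shift by one to get the first row of each following group and
# prepend 0 for the first group.
starts = np.r_[0, last_rows + 1]   # [0 2 4 6]

# Sum the row blocks arr[0:2], arr[2:4], arr[4:6] and arr[6:].
result = np.add.reduceat(arr, starts, axis=0)
print(result)
```

Each entry of starts is the first row of a block, and reduceat sums every block up to the next start (the last block runs to the end of the array).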

Of course, this can be combined into a one-liner:

>>> result = np.add.reduceat(arr, np.r_[0, np.where(np.diff(arr[:, -1]))[0] + 1], axis=0)
>>> result
array([[   6,    8,   10,   12, 4000],
       [  12,    4,    6,    8, 4002],
       [   8,   10,   12,    4, 4004],
       [  14,    6,    8,   10, 4006]])

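Note that the reduceat approach relies on rows with the same value in the last column being adjacent, as they are in the question. As for the loop the question asks about, a minimal sketch might look like the following. It is a different technique from the one above (np.unique plus a boolean mask), the helper name sum_by_last_column is made up for illustration, and the sample array is deliberately shuffled to show that this version does not need the rows to be grouped:

```python
import numpy as np

def sum_by_last_column(arr):
    """Sum the rows of arr grouped by the last column (illustrative helper)."""
    keys = arr[:, -1]
    # np.unique returns the distinct key values in sorted order.
    groups = np.unique(keys)
    # arr[keys == k] selects the rows of one group, which are then summed.
    # Summing the mask itself, as in np.sum(arr[:, 4] == 2000), only counts
    # the matching rows, which is why the attempt in the question gave 2.
    return np.array([arr[keys == k].sum(axis=0) for k in groups])

# Same kind of data as in the question, but with the groups interleaved.
arr = np.array([[1, 2, 3, 4, 2000],
                [9, 0, 1, 2, 2001],
                [5, 6, 7, 8, 2000],
                [3, 4, 5, 6, 2001]])

print(sum_by_last_column(arr))
# [[   6    8   10   12 4000]
#  [  12    4    6    8 4002]]
```

This scans the whole array once per distinct key, so for large arrays whose rows are already grouped the reduceat version is likely the better choice.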
