Row sum in a pandas DataFrame returns NaN

Question


Jus*_*tin 0 numpy nan dataframe python-2.7 pandas

I am trying to get the sum of each row in my pandas DataFrame:

new_df['cash_change'] = new_df.sum(axis=0)

But my result keeps coming back as NaN.

I think it may have something to do with converting the positions to Decimal for the multiplication:

pos_to_dec = np.array([Decimal(d) for d in security.signals['positions'].values])

I have to do this in order to multiply the columns. However, I convert the result back to float afterwards:

cash_change[security.symbol] = cash_change[security.symbol].astype(float)

Here is the full method. Its goal is to perform some column multiplication for each security and then sum the totals:

def get_cash_change(self):
    """
    Calculate daily cash to be transacted every day. Cash change depends on
    the position (either buy or sell) multiplied by the adjusted closing price
    of the equity multiplied by the trade amount.
    :return:
    """
    cash_change = pd.DataFrame(index=self.positions.index)
    try:

        for security in self.market_on_close_securities:
            # First convert all the positions from floating-point to decimals
            pos_to_dec = np.array([Decimal(d) for d in security.signals['positions'].values])

            cash_change['positions'] = pos_to_dec
            cash_change['bars'] = security.bars['adj_close_price'].values

            # Perform calculation for cash change
            cash_change[security.symbol] = cash_change['positions'] * cash_change['bars'] * self.trade_amount

            cash_change[security.symbol] = cash_change[security.symbol].astype(float)

            # Clean up for next security
            cash_change.drop('positions', axis=1, inplace=True)
            cash_change.drop('bars', axis=1, inplace=True)

    except InvalidOperation as e :
        print("Invalid input : " + str(e))

    # Sum each equities change in cash
    new_df = cash_change.dropna()

    new_df['cash_change'] = new_df.sum(axis=0)

    return cash_change


My new_df DataFrame ends up looking like this:

                MTD       ESS      SIG       SNA  cash_change
price_date                                                   
2000-01-04      0.0      0.00     0.00      0.00          NaN
2000-01-05      0.0      0.00     0.00      0.00          NaN
2000-01-06      0.0      0.00     0.00      0.00          NaN
2000-01-07      0.0      0.00     0.00      0.00          NaN
2000-01-10      0.0      0.00     0.00      0.00          NaN
2000-01-11      0.0      0.00     0.00      0.00          NaN
2000-01-12      0.0      0.00     0.00      0.00          NaN
2000-01-13      0.0      0.00     0.00      0.00          NaN
2000-01-14      0.0      0.00     0.00      0.00          NaN
2000-01-18      0.0      0.00     0.00      0.00          NaN
2000-01-19      0.0      0.00     0.00      0.00          NaN
2000-01-20      0.0      0.00     0.00      0.00          NaN
2000-01-21      0.0      0.00     0.00      0.00          NaN
2000-01-24      0.0   1747.83  1446.71      0.00          NaN
2000-01-25   3419.0      0.00     0.00      0.00          NaN
2000-01-26      0.0      0.00     0.00   1660.38          NaN
2000-01-27      0.0      0.00 -1293.27      0.00          NaN
2000-01-28      0.0      0.00     0.00      0.00          NaN


Any suggestions as to what I am doing wrong? Or is there another way to total the columns of each row?

Answer 1

Nic*_*eli 5

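When you pass axis=0 to the DF.sum method, it sums along the index (vertically, if that is easier to picture). As a result, you get only 4 computed values, one for each of the DataFrame's 4 columns. You then assign this result to a new column of the DataFrame. The result is labeled by column name while the DataFrame is indexed by date, so no labels match when pandas aligns the two on assignment, and you get a Series of NaN elements.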
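To make the alignment issue concrete, here is a minimal sketch using a two-row, two-column frame. The values are borrowed from the table in the question, but the frame itself and the names `df`, `col_totals`, and `misaligned` are illustrative assumptions, not the asker's actual data.

```python
import pandas as pd

# Hypothetical miniature of the asker's frame: dates as the index, tickers as columns.
df = pd.DataFrame(
    {"MTD": [0.0, 3419.0], "ESS": [1747.83, 0.0]},
    index=pd.Index(["2000-01-24", "2000-01-25"], name="price_date"),
)

# axis=0 collapses the rows: one total per column, labeled by column name.
col_totals = df.sum(axis=0)
print(col_totals.index.tolist())  # ['MTD', 'ESS']

# Assigning that Series as a column aligns it on the row index (the dates).
# None of the labels 'MTD'/'ESS' match a date, so every row gets NaN.
misaligned = df.copy()
misaligned["cash_change"] = col_totals
print(misaligned["cash_change"].isna().all())  # True
```

The NaN values come from label alignment on assignment, not from the Decimal-to-float conversion.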

What you actually want is to sum across the columns (horizontally).

Change that line to:

new_df['cash_change'] = new_df.sum(axis=1)  # sum row-wise across each column


You will now get finite computed sums.
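As a rough worked check, the sketch below applies the row-wise sum to four rows copied from the table in the question. The `.copy()` after `dropna()` and the explicit `ticker_cols` list are assumptions added for illustration; they are not part of the original code.

```python
import pandas as pd

# Four rows taken from the table in the question (the non-zero ones).
cash_change = pd.DataFrame(
    {
        "MTD": [0.0, 3419.0, 0.0, 0.0],
        "ESS": [1747.83, 0.0, 0.0, 0.0],
        "SIG": [1446.71, 0.0, 0.0, -1293.27],
        "SNA": [0.0, 0.0, 1660.38, 0.0],
    },
    index=pd.Index(
        ["2000-01-24", "2000-01-25", "2000-01-26", "2000-01-27"],
        name="price_date",
    ),
)

# Assumption: take an explicit copy so the new column is added to an
# independent frame rather than to a possible view of cash_change.
new_df = cash_change.dropna().copy()

# Assumption: sum only the ticker columns, so re-running the line does not
# fold an existing 'cash_change' column into the total.
ticker_cols = ["MTD", "ESS", "SIG", "SNA"]
new_df["cash_change"] = new_df[ticker_cols].sum(axis=1)

print(new_df["cash_change"].round(2).tolist())
# [3194.54, 3419.0, 1660.38, -1293.27]
```

Each row now carries the total of its own columns, for example 1747.83 + 1446.71 = 3194.54 on 2000-01-24.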
