Geo*_*nty 5 python add rows dataframe pandas
I have a pandas DataFrame like this:
AAPL IBM GOOG XOM
2011-01-10 16:00:00 1500 0 0 0
2011-01-11 16:00:00 0 0 0 0
2011-01-12 16:00:00 0 0 0 0
2011-01-13 16:00:00 -1500 4000 0 0
2011-01-14 16:00:00 0 0 0 0
2011-01-18 16:00:00 0 0 0 0
My goal is to fill each row by adding the values of the preceding row. The result should look like this:
AAPL IBM GOOG XOM
2011-01-10 16:00:00 1500 0 0 0
2011-01-11 16:00:00 1500 0 0 0
2011-01-12 16:00:00 1500 0 0 0
2011-01-13 16:00:00 0 4000 0 0
2011-01-14 16:00:00 0 4000 0 0
2011-01-18 16:00:00 0 4000 0 0
I tried iterating over the DataFrame index
for date in df.index:
and advancing the date by one day with
dt_nextDate = date + dt.timedelta(days=1)
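But the DataFrame index has gaps where the weekends fall, so the next calendar day is not always in the index.
Can I iterate over the index from the second row to the end, look back at the previous row, and add its values?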
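The loop described above can be sketched by walking the rows by position instead of by date, so gaps in the index do not matter. This is only a minimal sketch: the frame is rebuilt from the values in the post, and the name `filled` is a hypothetical one chosen for illustration.

```python
import pandas as pd

# Rebuild the frame from the question (values copied from the post).
index = pd.to_datetime([
    "2011-01-10 16:00", "2011-01-11 16:00", "2011-01-12 16:00",
    "2011-01-13 16:00", "2011-01-14 16:00", "2011-01-18 16:00",
])
df = pd.DataFrame(
    {
        "AAPL": [1500, 0, 0, -1500, 0, 0],
        "IBM": [0, 0, 0, 4000, 0, 0],
        "GOOG": [0, 0, 0, 0, 0, 0],
        "XOM": [0, 0, 0, 0, 0, 0],
    },
    index=index,
)

# `filled` is a hypothetical name for the running result.
filled = df.copy()

# Start at the second row and add the already-updated previous row.
# Rows are addressed by position, so missing dates are irrelevant.
for i in range(1, len(filled)):
    filled.iloc[i] = filled.iloc[i] + filled.iloc[i - 1]

print(filled)
```

A Python-level loop like this is slow on large frames; it is shown only to make the row-by-row logic explicit.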
Your example result is not the output of your example algorithm, so I'm not sure what you are asking for.
The desired result you show is a cumulative sum, which you can get with:
>>> df.cumsum()
                     AAPL   IBM  GOOG  XOM
index
2011-01-10 16:00:00  1500     0     0    0
2011-01-11 16:00:00  1500     0     0    0
2011-01-12 16:00:00  1500     0     0    0
2011-01-13 16:00:00     0  4000     0    0
2011-01-14 16:00:00     0  4000     0    0
2011-01-18 16:00:00     0  4000     0    0
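As a self-contained check, the sketch below rebuilds the frame from the question and compares `df.cumsum()` with the desired result from the post. The names `expected` and `running` are hypothetical, introduced only for this example.

```python
import pandas as pd

index = pd.to_datetime([
    "2011-01-10 16:00", "2011-01-11 16:00", "2011-01-12 16:00",
    "2011-01-13 16:00", "2011-01-14 16:00", "2011-01-18 16:00",
])
columns = ["AAPL", "IBM", "GOOG", "XOM"]

# Input values copied from the question.
df = pd.DataFrame(
    [[1500, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0],
     [-1500, 4000, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
    index=index, columns=columns,
)

# Desired result copied from the question.
expected = pd.DataFrame(
    [[1500, 0, 0, 0], [1500, 0, 0, 0], [1500, 0, 0, 0],
     [0, 4000, 0, 0], [0, 4000, 0, 0], [0, 4000, 0, 0]],
    index=index, columns=columns,
)

# cumsum accumulates down each column, row by row; it only depends on
# row order, so the weekend gap in the index has no effect.
running = df.cumsum()
print(running.equals(expected))
```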
However, what you describe in words, and the algorithm you show, look more like a rolling sum with a window size of 2:
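>>> # pd.rolling_sum(df, 2) was removed from later pandas releases;
>>> # DataFrame.rolling is the current equivalent.
>>> result = df.rolling(2).sum()
>>> result
                       AAPL     IBM  GOOG  XOM
index
2011-01-10 16:00:00     NaN     NaN   NaN  NaN
2011-01-11 16:00:00  1500.0     0.0   0.0  0.0
2011-01-12 16:00:00     0.0     0.0   0.0  0.0
2011-01-13 16:00:00 -1500.0  4000.0   0.0  0.0
2011-01-14 16:00:00 -1500.0  4000.0   0.0  0.0
2011-01-18 16:00:00     0.0     0.0   0.0  0.0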
To fix the NaNs in the first row, copy that row over from the original frame:
>>> result.iloc[0, :] = df.iloc[0, :]
>>> result
                       AAPL     IBM  GOOG  XOM
index
2011-01-10 16:00:00  1500.0     0.0   0.0  0.0
2011-01-11 16:00:00  1500.0     0.0   0.0  0.0
2011-01-12 16:00:00     0.0     0.0   0.0  0.0
2011-01-13 16:00:00 -1500.0  4000.0   0.0  0.0
2011-01-14 16:00:00 -1500.0  4000.0   0.0  0.0
2011-01-18 16:00:00     0.0     0.0   0.0  0.0
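As an alternative to patching the first row by hand, `DataFrame.rolling` accepts a `min_periods` argument. The sketch below is not part of the original answer: it assumes a recent pandas release and rebuilds the frame from the question. With `min_periods=1` the first window holds a single row, so no NaN is produced.

```python
import pandas as pd

index = pd.to_datetime([
    "2011-01-10 16:00", "2011-01-11 16:00", "2011-01-12 16:00",
    "2011-01-13 16:00", "2011-01-14 16:00", "2011-01-18 16:00",
])
df = pd.DataFrame(
    {
        "AAPL": [1500, 0, 0, -1500, 0, 0],
        "IBM": [0, 0, 0, 4000, 0, 0],
        "GOOG": [0, 0, 0, 0, 0, 0],
        "XOM": [0, 0, 0, 0, 0, 0],
    },
    index=index,
)

# Window of 2 rows; min_periods=1 lets the first row be summed alone
# instead of yielding NaN.
result = df.rolling(2, min_periods=1).sum()

# Rolling sums come back as floats; casting to int is safe here only
# because no NaN remains in the result.
result = result.astype(int)
print(result)
```

The window counts rows, not calendar days, so the jump from 2011-01-14 to 2011-01-18 is treated like any other pair of consecutive rows.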