If I have a DataFrame like this:
Date value
04 May 2015 1
06 May 2015 1
07 May 2015 1
11 May 2015 1
11 May 2015 1
How can I get the difference between consecutive dates in the index? That is, the third column below:
Date value Diff
04 May 2015 1 NA
06 May 2015 1 2
07 May 2015 1 1
11 May 2015 1 4
11 May 2015 1 0
You can use pandas.Series.diff:
>>> df['Diff'] = df.index.to_series().diff()
>>> df
value Diff
Date
2015-05-04 1 NaT
2015-05-06 1 2 days
2015-05-07 1 1 days
2015-05-11 1 4 days
2015-05-11 1 0 days
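For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the sample frame from the question and applies the same call. The explicit date format string and the variable name `dates` are my own choices, not part of the original answer.

```python
import pandas as pd

# Rebuild the sample data from the question and parse the dates
# into a DatetimeIndex named "Date".
dates = pd.to_datetime(
    ["04 May 2015", "06 May 2015", "07 May 2015", "11 May 2015", "11 May 2015"],
    format="%d %b %Y",
)
df = pd.DataFrame({"value": [1, 1, 1, 1, 1]}, index=dates.rename("Date"))

# to_series() turns the index into a Series so that Series.diff applies;
# the first row has no predecessor, so its difference is NaT.
df["Diff"] = df.index.to_series().diff()
print(df)
```

The duplicated 11 May row is handled like any other: its difference to the previous row is simply 0 days.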
An elegant way to convert to float is:
df['Diff'] = df.index.to_series().diff().dt.days
>>> df
value Diff
Date
2015-05-04 1 NaN
2015-05-06 1 2.0
2015-05-07 1 1.0
2015-05-11 1 4.0
2015-05-11 1 0.0
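The floats in the output above come from the missing first value. As a rough sketch, if treating the first row as a gap of 0 days is acceptable for your data (that is an assumption, not something the question states), the column can be turned into plain integers. The names `gaps`, `days` and `days_int` are only for illustration.

```python
import pandas as pd

dates = pd.to_datetime(
    ["2015-05-04", "2015-05-06", "2015-05-07", "2015-05-11", "2015-05-11"]
)
df = pd.DataFrame({"value": [1, 1, 1, 1, 1]}, index=dates.rename("Date"))

gaps = df.index.to_series().diff()  # timedelta values, NaT in the first row
days = gaps.dt.days                 # whole-day component, missing in the first row

# Assumption: a missing first gap may be reported as 0.
days_int = days.fillna(0).astype(int)
print(days_int.tolist())
```

Note that `.dt.days` returns only the whole-day component, so any hours or minutes in the timestamps are dropped.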
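A faster way is to cast the dtype to days. Note that this relies on older pandas behavior: as far as I know, from pandas 2.0 on astype only accepts the 's', 'ms', 'us' and 'ns' timedelta resolutions, so this cast raises an error there.
df.index.to_series().diff().astype('timedelta64[D]')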
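If the cast above is not available in your pandas version, one alternative that I would expect to work across versions is dividing by a one-day `pd.Timedelta`. This is a different technique from the cast in the answer, shown here only as a sketch; the second series with clock times is made-up data to illustrate fractional days.

```python
import pandas as pd

dates = pd.to_datetime(
    ["2015-05-04", "2015-05-06", "2015-05-07", "2015-05-11", "2015-05-11"]
)
gaps = dates.to_series().diff()

# Dividing a timedelta Series by a Timedelta gives plain floats (NaN for NaT).
day_float = gaps / pd.Timedelta(days=1)
print(day_float.tolist())

# Hypothetical timestamps with a time of day: division keeps the fraction,
# whereas .dt.days would keep only the whole-day part.
stamps = pd.Series(pd.to_datetime(["2015-05-04 00:00", "2015-05-05 12:00"]))
frac = stamps.diff() / pd.Timedelta(days=1)
print(frac.tolist())
```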
To convert to integers (pandas >= 0.24):
df['Diff'] = df.index.to_series().diff().astype('timedelta64[D]').astype('Int64')
>>> df
value Diff
Date
2015-05-04 1 NaN
2015-05-06 1 2
2015-05-07 1 1
2015-05-11 1 4
2015-05-11 1 0
Note: Int64 is the pandas nullable integer data type (not int64).
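As a sketch of the same idea without the intermediate `timedelta64[D]` cast, `.dt.days` can be combined with the nullable `Int64` dtype. This combination is my own variation on the answer, not what it literally shows; the name `diff_int` is illustrative.

```python
import pandas as pd

dates = pd.to_datetime(
    ["2015-05-04", "2015-05-06", "2015-05-07", "2015-05-11", "2015-05-11"]
)
df = pd.DataFrame({"value": [1, 1, 1, 1, 1]}, index=dates.rename("Date"))

# .dt.days yields whole days with a missing first value; the capital-I
# "Int64" dtype can hold that missing value alongside real integers.
diff_int = df.index.to_series().diff().dt.days.astype("Int64")
df["Diff"] = diff_int
print(df)
```

With lowercase `int64` the cast would fail here, because a NumPy integer column has no way to represent the missing first row.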
Do you mean something like this:
df["Diff"] = df.index
df["Diff"] = (df['Diff'] - df['Diff'].shift())
print(df)
value Diff
Date
2015-05-04 1 NaT
2015-05-06 1 2 days
2015-05-07 1 1 days
2015-05-11 1 4 days
2015-05-11 1 0 days
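Subtracting the shifted column is the same computation that `diff()` performs with its default period of 1. A small check, using illustrative names `manual` and `builtin`, might look like this:

```python
import pandas as pd

dates = pd.to_datetime(
    ["2015-05-04", "2015-05-06", "2015-05-07", "2015-05-11", "2015-05-11"]
)
s = dates.to_series()

manual = s - s.shift()   # each value minus the previous one
builtin = s.diff()       # Series.diff with the default periods=1

# Series.equals treats missing values in the same position as equal.
print(manual.equals(builtin))
```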