Pandas: difference between consecutive date values in the index

dra*_*ram 9 python pandas

Suppose I have a DataFrame like this:

Date           value
04 May 2015     1
06 May 2015     1
07 May 2015     1
11 May 2015     1
11 May 2015     1

How can I get the difference between consecutive dates in the index, i.e. the third column below?

Date           value   Diff
04 May 2015     1      NA
06 May 2015     1       2
07 May 2015     1       1
11 May 2015     1       4
11 May 2015     1       0

Shi*_*ith 8

You can use pandas.Series.diff:

>>> df['Diff'] = df.index.to_series().diff()

            value     Diff
Date                    
2015-05-04      1      NaT
2015-05-06      1   2 days
2015-05-07      1   1 days
2015-05-11      1   4 days
2015-05-11      1   0 days

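For anyone who wants to reproduce this end to end, here is a minimal sketch. The way the DataFrame is built (ISO date strings passed to `pd.to_datetime`) is my assumption; the question only shows the printed frame.

```python
import pandas as pd

# Sample data from the question, written as ISO dates (an assumption about
# how the frame was built; the question only shows its printed form).
dates = pd.to_datetime(
    ["2015-05-04", "2015-05-06", "2015-05-07", "2015-05-11", "2015-05-11"]
)
df = pd.DataFrame({"value": [1, 1, 1, 1, 1]}, index=dates)
df.index.name = "Date"

# Convert the index to a Series so that Series.diff is available,
# then take the difference between each date and the previous one.
df["Diff"] = df.index.to_series().diff()
print(df)
```

The first row has no previous date, so its difference is a missing value (NaT).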
An elegant way to convert to float is:

df['Diff'] = df.index.to_series().diff().dt.days
>>> df
            value  Diff
Date                   
2015-05-04      1   NaN
2015-05-06      1   2.0
2015-05-07      1   1.0
2015-05-11      1   4.0
2015-05-11      1   0.0

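The result above is float rather than integer because of the missing first value. A small sketch of that behaviour, using a shortened, made-up Series in place of the index:

```python
import pandas as pd

# Three of the dates from the question, held in a plain Series for brevity.
diff = pd.Series(pd.to_datetime(["2015-05-04", "2015-05-06", "2015-05-07"])).diff()

with_gap = diff.dt.days              # first entry is NaN, so the dtype is float
without_gap = diff.dropna().dt.days  # no missing values, so the dtype is integer

print(with_gap.dtype, without_gap.dtype)
```

So if a plain integer column is needed, either drop or fill the first row before calling `.dt.days`.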
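A faster way is to cast the type to days (this worked on older pandas; pandas 2.0 and later reject the timedelta64[D] unit):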

df.index.to_series().diff().astype('timedelta64[D]')

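As far as I know, recent pandas releases only accept the 's', 'ms', 'us' and 'ns' resolutions in a timedelta cast, so the day cast may raise there. A sketch of an alternative that should not depend on the version is to divide by a one-day `pd.Timedelta`; note this is a swapped-in technique, not the cast shown above.

```python
import pandas as pd

idx = pd.to_datetime(
    ["2015-05-04", "2015-05-06", "2015-05-07", "2015-05-11", "2015-05-11"]
)
diff = idx.to_series().diff()

# Dividing a timedelta Series by a one-day Timedelta yields float days;
# the missing first entry stays NaN.
days = diff / pd.Timedelta(days=1)
print(days)
```

Unlike `.dt.days`, the division keeps fractional days if the timestamps carry a time of day.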
To convert to integer (pandas version >= 0.24):

df['Diff'] = df.index.to_series().diff().astype('timedelta64[D]').astype('Int64')
>>> df
            value  Diff
Date                   
2015-05-04      1   NaN
2015-05-06      1     2
2015-05-07      1     1
2015-05-11      1     4
2015-05-11      1     0

Note: Int64 is the pandas nullable integer data type (not int64).

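A minimal sketch of the nullable-integer route that avoids the day cast by going through `.dt.days` instead (my substitution, assuming a pandas version that has `pd.NA`, i.e. 1.0 or later):

```python
import pandas as pd

idx = pd.to_datetime(
    ["2015-05-04", "2015-05-06", "2015-05-07", "2015-05-11", "2015-05-11"]
)
diff = idx.to_series().diff()

# .dt.days gives float days (NaN in the first row); casting to "Int64"
# stores whole numbers and keeps the missing entry as a missing value.
days_int = diff.dt.days.astype("Int64")
print(days_int.dtype)
```

With the lowercase `int64` the same cast would fail, because a plain NumPy integer column cannot hold a missing value.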

Pad*_*ham 5

Do you mean something like this?

df["Diff"] = df.index
df["Diff"] = (df['Diff'] - df['Diff'].shift())

print(df)
            value   Diff
Date                    
2015-05-04      1    NaT
2015-05-06      1 2 days
2015-05-07      1 1 days
2015-05-11      1 4 days
2015-05-11      1 0 days
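Subtracting the shifted column is the same operation that `Series.diff` performs. A quick sketch that compares the two on the question's dates (the variable names are mine):

```python
import pandas as pd

s = pd.to_datetime(
    ["2015-05-04", "2015-05-06", "2015-05-07", "2015-05-11", "2015-05-11"]
).to_series()

via_shift = s - s.shift()  # subtract the previous row explicitly
via_diff = s.diff()        # built-in shorthand for the same subtraction

# Series.equals treats missing values in matching positions as equal.
print(via_shift.equals(via_diff))
```

The explicit `shift` form is handy when the offset or the arithmetic needs to differ from a plain first difference.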