Bri*_*gan 16 python python-2.7 pandas
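I have a dictionary named date_dict, keyed by datetime dates, whose values are integer counts of observations. I convert this to a sparse Series/DataFrame with censored observations, which I would like to join to, or convert into, a Series/DataFrame with continuous dates. The nasty list comprehension is my hack around the fact that pandas apparently won't automatically convert datetime date objects to an appropriate DateTime index.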
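import datetime
import pandas as pd

# list(...) makes the values a plain sequence on Python 3 as well as Python 2
df1 = pd.DataFrame(data=list(date_dict.values()),
                   index=[datetime.datetime.combine(i, datetime.time())
                          for i in date_dict.keys()],
                   columns=['Name'])
# DataFrame.sort is gone from current pandas; sort_index orders the rows by date
df1 = df1.sort_index()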
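As an aside on the list comprehension above: in current pandas the conversion can likely be done without it. The following is a minimal sketch, assuming a small made-up `date_dict` (the real one is not shown in the question) and relying on `pd.to_datetime` to turn the `datetime.date` labels into a `DatetimeIndex`. The variable name `counts` is introduced here for illustration only.

```python
import datetime
import pandas as pd

# Hypothetical stand-in for the question's date_dict (dates -> counts).
date_dict = {
    datetime.date(2003, 8, 13): 1,
    datetime.date(2003, 6, 24): 2,
    datetime.date(2003, 8, 19): 2,
}

# Building a Series straight from the dict keeps keys and values paired.
counts = pd.Series(date_dict, name="Name")

# pd.to_datetime converts the datetime.date labels into a DatetimeIndex.
counts.index = pd.to_datetime(counts.index)

# Sort by date and turn the Series into a one-column DataFrame.
df1 = counts.sort_index().to_frame()
print(df1)
```

Building the Series from the dict directly also avoids relying on `keys()` and `values()` being iterated in matching order.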
This example has 1258 observations, and the DateTime index runs from 2003-06-24 to 2012-11-07.
df1.head()
Name
Date
2003-06-24 2
2003-08-13 1
2003-08-19 2
2003-08-22 1
2003-08-24 5
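I can create an empty dataframe with a continuous DateTime index, but this introduces an unneeded column and seems clunky. I feel as though I'm missing a more elegant solution involving a join.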
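# pd.DateRange no longer exists in pandas; date_range with freq='B'
# matches the business-day spacing visible in the output below
df2 = pd.DataFrame(data=None, columns=['Empty'],
                   index=pd.date_range(min(date_dict.keys()),
                                       max(date_dict.keys()),
                                       freq='B'))
df3 = df1.join(df2, how='right')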
df3.head()
Name Empty
2003-06-24 2 NaN
2003-06-25 NaN NaN
2003-06-26 NaN NaN
2003-06-27 NaN NaN
2003-06-30 NaN NaN
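Is there a simpler or more elegant way to fill a continuous dataframe from a sparse dataframe so that (1) the index is continuous, (2) the NaNs are 0s, and (3) no left-over empty column remains in the dataframe?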
Name
2003-06-24 2
2003-06-25 0
2003-06-26 0
2003-06-27 0
2003-06-30 0
Mat*_*ohn 22
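You can use reindex on a time series with a date range. Also, it looks like you would be better off using a TimeSeries instead of a DataFrame (see the documentation), although reindexing is also the correct way to add missing index values to DataFrames.
For example, starting with: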
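import pandas as pd
from datetime import datetime  # pd.datetime is not available in recent pandas versions
date_index = pd.DatetimeIndex([datetime(2003, 6, 24), datetime(2003, 8, 13),
                               datetime(2003, 8, 19), datetime(2003, 8, 22), datetime(2003, 8, 24)])
ts = pd.Series([2, 1, 2, 1, 5], index=date_index)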
gives you a time series similar to the head of your example dataframe:
2003-06-24 2
2003-08-13 1
2003-08-19 2
2003-08-22 1
2003-08-24 5
Simply doing
ts.reindex(pd.date_range(min(date_index), max(date_index)))
then gives you a complete index, with NaNs for your missing values (you can use fillna if you want to fill the missing values with some other value):
2003-06-24 2
2003-06-25 NaN
2003-06-26 NaN
2003-06-27 NaN
2003-06-28 NaN
2003-06-29 NaN
2003-06-30 NaN
2003-07-01 NaN
2003-07-02 NaN
2003-07-03 NaN
2003-07-04 NaN
2003-07-05 NaN
2003-07-06 NaN
2003-07-07 NaN
2003-07-08 NaN
2003-07-09 NaN
2003-07-10 NaN
2003-07-11 NaN
2003-07-12 NaN
2003-07-13 NaN
2003-07-14 NaN
2003-07-15 NaN
2003-07-16 NaN
2003-07-17 NaN
2003-07-18 NaN
2003-07-19 NaN
2003-07-20 NaN
2003-07-21 NaN
2003-07-22 NaN
2003-07-23 NaN
2003-07-24 NaN
2003-07-25 NaN
2003-07-26 NaN
2003-07-27 NaN
2003-07-28 NaN
2003-07-29 NaN
2003-07-30 NaN
2003-07-31 NaN
2003-08-01 NaN
2003-08-02 NaN
2003-08-03 NaN
2003-08-04 NaN
2003-08-05 NaN
2003-08-06 NaN
2003-08-07 NaN
2003-08-08 NaN
2003-08-09 NaN
2003-08-10 NaN
2003-08-11 NaN
2003-08-12 NaN
2003-08-13 1
2003-08-14 NaN
2003-08-15 NaN
2003-08-16 NaN
2003-08-17 NaN
2003-08-18 NaN
2003-08-19 2
2003-08-20 NaN
2003-08-21 NaN
2003-08-22 1
2003-08-23 NaN
2003-08-24 5
Freq: D, Length: 62
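To tie this back to the three requirements in the question (continuous index, 0 instead of NaN, no extra column), here is a minimal sketch applying the same reindex idea to a DataFrame. It assumes the five sample observations above; the names `df`, `full_index` and `filled` are introduced here for illustration, and `fill_value` is the `reindex` parameter that supplies a value for newly created rows.

```python
import pandas as pd

# Same five observations as the example series above.
date_index = pd.to_datetime(
    ["2003-06-24", "2003-08-13", "2003-08-19", "2003-08-22", "2003-08-24"]
)
df = pd.DataFrame({"Name": [2, 1, 2, 1, 5]}, index=date_index)

# Continuous daily index spanning the observed dates.
full_index = pd.date_range(df.index.min(), df.index.max(), freq="D")

# fill_value=0 puts 0 in the newly created rows instead of NaN,
# so the counts stay integers and no helper column is needed.
filled = df.reindex(full_index, fill_value=0)
print(filled.head())
```

If only business days are wanted, `freq="B"` can be passed instead, but reindexing then drops any observation that falls on a weekend (2003-08-24 in this sample is a Sunday), so that choice deserves a check against the data first.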