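I want to add zero values to a pandas DataFrame, on an hourly timestamp index, for the hours in which no data has been recorded yet.
That is, I want the output to be: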
DataFrame: quantity
created_at
2018-01-21 14:00:00 0
...
2018-01-22 12:00:00 0
2018-01-22 13:00:00 0
2018-01-22 14:00:00 31
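In the code below, when I reindex, the values in the quantity column are set to NaN.
How can I keep the existing values but add an hourly time index with zero values wherever data is missing?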
import pandas as pd
from datetime import datetime, timedelta

data = {'date_time': ['2018-01-22 14:47:05.486877'],
        'quantity': [31]}
df = pd.DataFrame(data, columns=['date_time', 'quantity'])
df.index = df['date_time']
del df['date_time']
df.index = pd.to_datetime(df.index)
# want to sum data by hour ('h' is the hourly alias; 'H' is deprecated since pandas 2.2)
df = df.resample('h').sum()
# set minutes etc. to zero for indexing
current_date = datetime.now().replace(microsecond=0, second=0, minute=0)
d2 = current_date - timedelta(hours=24)
all_times = pd.date_range(d2, current_date, freq="h")
# ensure index format is exactly the same as df (may be unnecessary?)
df.index = df.index.map(lambda t: t.strftime('%Y-%m-%d %H:%M:%S'))
# this sets everything to NaN and wipes existing quantity data
df = df.reindex(all_times)
df = df.fillna(0)
Any ideas?
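I think you need to floor the datetimes to whole hours and change the range used for reindexing, for example to +-24 hours around the current datetime if needed. It mainly depends on current_date and the DatetimeIndex: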
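import pandas as pd

data = {'date_time': ['2018-01-22 14:47:05.486877'],
        'quantity': [31]}
df = pd.DataFrame(data, columns=['date_time', 'quantity'])
#print (df)
df.date_time = pd.to_datetime(df.date_time)
df = df.set_index('date_time')
# sum per hour; the resulting index labels are whole hours
df = df.resample('h').sum()
# pd.datetime was removed in pandas 2.0, so use pd.Timestamp.now()
current_date = pd.Timestamp.now()
print (current_date)
# 2018-01-22 10:31:37.663110
# hourly range from 24 hours before to 24 hours after now, floored to whole hours
all_times = pd.date_range(current_date - pd.Timedelta(hours=24),
                          current_date + pd.Timedelta(hours=24), freq="h").floor('h')
#print (all_times)
# fill_value=0 inserts zeros for missing hours and keeps existing values
df = df.reindex(all_times, fill_value=0)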
print (df)
quantity
2018-01-21 10:00:00 0
2018-01-21 11:00:00 0
2018-01-21 12:00:00 0
2018-01-21 13:00:00 0
2018-01-21 14:00:00 0
2018-01-21 15:00:00 0
2018-01-21 16:00:00 0
2018-01-21 17:00:00 0
2018-01-21 18:00:00 0
2018-01-21 19:00:00 0
2018-01-21 20:00:00 0
2018-01-21 21:00:00 0
2018-01-21 22:00:00 0
2018-01-21 23:00:00 0
2018-01-22 00:00:00 0
2018-01-22 01:00:00 0
2018-01-22 02:00:00 0
2018-01-22 03:00:00 0
2018-01-22 04:00:00 0
2018-01-22 05:00:00 0
2018-01-22 06:00:00 0
2018-01-22 07:00:00 0
2018-01-22 08:00:00 0
2018-01-22 09:00:00 0
2018-01-22 10:00:00 0
2018-01-22 11:00:00 0
2018-01-22 12:00:00 0
2018-01-22 13:00:00 0
2018-01-22 14:00:00 31
2018-01-22 15:00:00 0
2018-01-22 16:00:00 0
2018-01-22 17:00:00 0
2018-01-22 18:00:00 0
2018-01-22 19:00:00 0
2018-01-22 20:00:00 0
2018-01-22 21:00:00 0
2018-01-22 22:00:00 0
2018-01-22 23:00:00 0
2018-01-23 00:00:00 0
2018-01-23 01:00:00 0
2018-01-23 02:00:00 0
2018-01-23 03:00:00 0
2018-01-23 04:00:00 0
2018-01-23 05:00:00 0
2018-01-23 06:00:00 0
2018-01-23 07:00:00 0
2018-01-23 08:00:00 0
2018-01-23 09:00:00 0
2018-01-23 10:00:00 0
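
As for why the code in the question ends up with NaN everywhere: the most likely cause is the strftime step, which turns the index labels into plain strings, so none of them match the Timestamp labels produced by date_range. The following is a minimal sketch of that mismatch. It assumes a fixed 24-hour window instead of "now" so that it is reproducible, and the names as_strings, lost and kept are made up for illustration.

```python
import pandas as pd

# One hourly bucket holding the value from the question.
df = pd.DataFrame(
    {"quantity": [31]},
    index=pd.to_datetime(["2018-01-22 14:00:00"]),
)

# Assumption: a fixed window replaces datetime.now() to keep the result stable.
all_times = pd.date_range("2018-01-21 14:00:00", "2018-01-22 14:00:00", freq="h")

# Converting the index to strings, as in the question, makes every label a str.
as_strings = df.copy()
as_strings.index = as_strings.index.map(lambda t: t.strftime("%Y-%m-%d %H:%M:%S"))

# A str label never equals a Timestamp label, so reindex finds no match
# and every row of the result is NaN.
lost = as_strings.reindex(all_times)

# Keeping the DatetimeIndex lets reindex align on real timestamps;
# fill_value=0 only applies to hours that are missing from df.
kept = df.reindex(all_times, fill_value=0)
```

In other words, dropping the strftime line and passing fill_value=0 to reindex should be enough to preserve the existing quantity.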