Pandas TimeGrouper on multiindex

Question

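I have a MultiIndex pandas DataFrame in which the first index level is a group and the second level is a time. Within each group, I want to resample to daily frequency by taking the mean of the intraday observations.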

import pandas as pd
import numpy as np

data = pd.concat([pd.DataFrame([['A']*72, list(pd.date_range('1/1/2011', periods=72, freq='H')), list(np.random.rand(72))], index = ['Group', 'Time', 'Value']).T,
                  pd.DataFrame([['B']*72, list(pd.date_range('1/1/2011', periods=72, freq='H')), list(np.random.rand(72))], index = ['Group', 'Time', 'Value']).T,
                  pd.DataFrame([['C']*72, list(pd.date_range('1/1/2011', periods=72, freq='H')), list(np.random.rand(72))], index = ['Group', 'Time', 'Value']).T],
                  axis = 0).set_index(['Group', 'Time'])

Here is what I have tried so far:

daily_counts = data.groupby(pd.TimeGrouper('D'), level = ['Time']).mean()

But I get the following error:

TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'MultiIndex'

Any idea how to fix this?

Answer 1

jez*_*ael 9

You first need to convert the column to float, and then use Grouper:

data['Value'] = data['Value'].astype(float)
daily_counts = data.groupby([pd.Grouper(freq='D', level='Time'), 
                             pd.Grouper(level='Group')])['Value'].mean()

print (daily_counts) 
Time        Group
2011-01-01  A        0.548358
            B        0.612878
            C        0.544822
2011-01-02  A        0.529880
            B        0.437062
            C        0.388626
2011-01-03  A        0.563854
            B        0.479299
            C        0.557190
Name: Value, dtype: float64
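
On why the cast is needed: a frame built by transposing rows of mixed types, as in the question, ends up with object dtype in every column, so Value is not numeric until it is converted. Below is a minimal, self-contained sketch of that check, assuming a reasonably recent pandas. The frame is rebuilt with MultiIndex.from_product and a seeded random generator purely for illustration; those choices and the names dtype_before and daily are assumptions, not part of the original post.

```python
import numpy as np
import pandas as pd

# Illustrative frame with the same (Group, Time) layout as the question:
# 3 groups x 72 hourly observations starting at midnight.
times = pd.date_range("2011-01-01", periods=72, freq=pd.Timedelta(hours=1))
index = pd.MultiIndex.from_product([["A", "B", "C"], times], names=["Group", "Time"])
rng = np.random.default_rng(0)

# astype(object) imitates the object-dtype column the question's constructor produces.
data = pd.DataFrame({"Value": rng.random(len(index))}, index=index).astype(object)
dtype_before = data["Value"].dtype  # object

# Convert to a numeric dtype before aggregating.
data["Value"] = data["Value"].astype(float)

# Group by calendar day on the 'Time' level and by the 'Group' level.
daily = data.groupby(
    [pd.Grouper(freq="D", level="Time"), pd.Grouper(level="Group")]
)["Value"].mean()

print(dtype_before, "->", data["Value"].dtype)
print(daily)
```

With 72 hourly rows per group starting at midnight, each group falls into exactly three daily bins, so the result has nine rows indexed by (Time, Group).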

Another solution:

data = data.reset_index(level='Group')
print (data.groupby('Group').resample('D')['Value'].mean())
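
One thing to keep in mind with the resample route is that the result is indexed by (Group, Time) rather than (Time, Group). The sketch below, on an illustrative frame built the same way as the question's data but with float values from the start, suggests the two approaches give the same numbers once the levels are swapped. The variable names are assumptions, and the column is selected before calling resample (a small variation on the line above) so that only Value is aggregated.

```python
import numpy as np
import pandas as pd

# Illustrative data: 3 groups x 72 hourly observations, already float.
times = pd.date_range("2011-01-01", periods=72, freq=pd.Timedelta(hours=1))
index = pd.MultiIndex.from_product([["A", "B", "C"], times], names=["Group", "Time"])
data = pd.DataFrame(
    {"Value": np.random.default_rng(0).random(len(index))}, index=index
)

# Approach 1: Grouper on both index levels, result indexed by (Time, Group).
via_grouper = data.groupby(
    [pd.Grouper(freq="D", level="Time"), pd.Grouper(level="Group")]
)["Value"].mean()

# Approach 2: move 'Group' to a column, then resample within each group.
flat = data.reset_index(level="Group")
via_resample = flat.groupby("Group")["Value"].resample("D").mean()

# Reorder the resample result to (Time, Group) so the two line up.
aligned = via_resample.swaplevel().sort_index()

print(np.allclose(via_grouper.to_numpy(), aligned.to_numpy()))
```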
