Thu*_*tne 2 python dataframe pandas
I have a dataset that looks like this:
userid time val1 val2 val3 val4
1 2010-6-1 0:15 12 16 17 11
1 2010-6-1 0:30 11.5 14 15.2 10
1 2010-6-1 0:45 12 14 15 10
1 2010-6-1 1:00 8 11 13 0
.................................
.................................
2 2010-6-1 0:15 14 16 17 11
2 2010-6-1 0:30 11 14 15.2 10
2 2010-6-1 0:45 11 14 15 10
2 2010-6-1 1:00 9 11 13 0
.................................
.................................
3 ...................................
.................................
.................................
I want to get the mean of every two rows. The expected result would be:
userid time val1 val2 val3 val4
1 2010-6-1 0:30 11.75 15 16.1 10.5
1 2010-6-1 1:00 10 12.5 14 5
..............................
..............................
2 2010-6-1 0:30 12.5 15 16.1 10.5
2 2010-6-1 1:00 10 12.5 14 5
.................................
.................................
3 ...................................
.................................
.................................
My current approach is:
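import pandas as pd

data = pd.read_csv("sample_dataset.csv")
rows = []
i = 0
while i < len(data) - 1:
    # mean of the numeric columns over rows i and i+1
    x = data.iloc[i:i+2].mean(numeric_only=True)
    # keep the timestamp of the second row of the pair
    x['time'] = data.iloc[i+1]['time']
    rows.append(x)
    i += 2
# build the result from the averaged rows, in the original column order
data = pd.DataFrame(rows)[data.columns]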
But this is very inefficient. Could someone point me to a better way to get the expected result? The dataset has more than 1,000,000 rows.
I am using resample:
df.set_index('time').resample('30min', closed='right', label='right').mean()
Out[293]:
val1 val2 val3 val4
time
2010-06-01 00:30:00 11.75 15.0 16.1 10.5
2010-06-01 01:00:00 10.00 12.5 14.0 5.0
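The resample call above needs time to hold real datetimes, and its output suggests it was run on a single user's rows. A minimal sketch of one way to keep users separate is to group on userid together with a 30-minute pd.Grouper. This is a swapped-in variant, not the exact call from the answer. The frame is rebuilt from the question's sample rows (timestamps written zero-padded), and value_cols and result are illustrative names only.

```python
import pandas as pd

# Sample rows taken from the question: two users, four readings each.
df = pd.DataFrame({
    "userid": [1, 1, 1, 1, 2, 2, 2, 2],
    "time": ["2010-06-01 00:15", "2010-06-01 00:30",
             "2010-06-01 00:45", "2010-06-01 01:00"] * 2,
    "val1": [12, 11.5, 12, 8, 14, 11, 11, 9],
    "val2": [16, 14, 14, 11, 16, 14, 14, 11],
    "val3": [17, 15.2, 15, 13, 17, 15.2, 15, 13],
    "val4": [11, 10, 10, 0, 11, 10, 10, 0],
})
df["time"] = pd.to_datetime(df["time"])  # time-based binning needs datetimes

value_cols = ["val1", "val2", "val3", "val4"]  # illustrative name

# Bin each user's rows into right-closed, right-labelled 30-minute windows,
# then average the value columns inside every (userid, window) group.
half_hour = pd.Grouper(key="time", freq="30min", closed="right", label="right")
result = df.groupby(["userid", half_hour])[value_cols].mean().reset_index()
print(result)
```

Unlike pairing by position, this bins by clock time, so a missing reading leaves a window averaged over one row rather than shifting every later pair.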
Method 2
df.groupby(np.arange(len(df)) // 2).agg(lambda x: x.iloc[-1] if x.dtype == 'datetime64[ns]' else x.mean())
Out[308]:
time val1 val2 val3 val4
0 2010-06-01 00:30:00 11.75 15.0 16.1 10.5
1 2010-06-01 01:00:00 10.00 12.5 14.0 5.0
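To see why Method 2 pairs rows, note that integer-dividing the row positions by 2 gives every two consecutive rows the same group key. The sketch below shows that key on four rows from the question. As an assumption on my side, it swaps the lambda for a dict of built-in aggregation names, which pandas can usually run faster than a Python lambda on large frames. The names pair_id and paired are illustrative.

```python
import numpy as np
import pandas as pd

# Four rows of user 1 from the question (a subset of the value columns).
df = pd.DataFrame({
    "time": pd.to_datetime(["2010-06-01 00:15", "2010-06-01 00:30",
                            "2010-06-01 00:45", "2010-06-01 01:00"]),
    "val1": [12, 11.5, 12, 8],
    "val4": [11, 10, 10, 0],
})

# Positions 0,1,2,3 become keys 0,0,1,1: one key per pair of rows.
pair_id = np.arange(len(df)) // 2   # array([0, 0, 1, 1])

# Keep the last timestamp of each pair and average the value columns.
paired = df.groupby(pair_id).agg({"time": "last", "val1": "mean", "val4": "mean"})
print(paired)
```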
Updated solution
df.groupby([df.userid, np.arange(len(df)) // 2]).agg(lambda x: x.iloc[-1] if x.dtype == 'datetime64[ns]' else x.mean()).reset_index(drop=True)
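One caveat with the updated solution: np.arange(len(df)) // 2 counts positions across the whole frame, so if a user has an odd number of rows, the pairs of every later user shift by one. A hedged sketch of an alternative is to number rows within each user with groupby(...).cumcount() and pair on that counter. It assumes the rows are already sorted by time within each user. The names pair, spec and result are illustrative and do not come from the original answer.

```python
import pandas as pd

# Sample rows taken from the question: two users, four readings each.
df = pd.DataFrame({
    "userid": [1, 1, 1, 1, 2, 2, 2, 2],
    "time": pd.to_datetime(["2010-06-01 00:15", "2010-06-01 00:30",
                            "2010-06-01 00:45", "2010-06-01 01:00"] * 2),
    "val1": [12, 11.5, 12, 8, 14, 11, 11, 9],
    "val2": [16, 14, 14, 11, 16, 14, 14, 11],
    "val3": [17, 15.2, 15, 13, 17, 15.2, 15, 13],
    "val4": [11, 10, 10, 0, 11, 10, 10, 0],
})

# Row counter that restarts at 0 for every user, halved to get a pair number.
pair = (df.groupby("userid").cumcount() // 2).rename("pair")

# Last timestamp of each pair, mean of each value column.
spec = {"time": "last", "val1": "mean", "val2": "mean", "val3": "mean", "val4": "mean"}

result = (
    df.groupby(["userid", pair])
      .agg(spec)
      .reset_index()
      .drop(columns="pair")   # the helper key is not needed in the output
)
print(result)
```

On the question's sample this gives the same four rows as the expected result, with userid kept as a regular column.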