ada*_*.ra 3 python indexing aggregate pandas
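I am grouping a DataFrame by multiple columns and aggregating to get several statistics. How can I get a completely flat structure, with every possible combination of group keys enumerated as a row and every statistic present as a column?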
import numpy as np
import pandas as pd
cities = ['Berlin', 'Oslo']
days = ['Monday', 'Friday']
data = pd.DataFrame({
    'city': np.random.choice(cities, 12),
    'day': np.random.choice(days, 12),
    'people': np.random.normal(loc=10, size=12),
    'cats': np.random.normal(loc=6, size=12)})
# Aggregate every remaining column by mean and standard deviation.
grouped = data.groupby(['city', 'day']).agg(['mean', 'std'])
This gives me:
                   cats               people
                   mean       std       mean       std
city   day
Berlin Friday  6.146924  0.721263  10.445606  0.730992
       Monday  5.239267       NaN   9.022811       NaN
Oslo   Friday  6.322276  0.866899  11.579813  0.114341
       Monday  5.028919  0.815674  10.458439  1.182689
I need it flattened:
city    day     cats_mean  cats_std  people_mean  people_std
Berlin  Friday   6.146924  0.721263    10.445606    0.730992
Berlin  Monday   5.239267       NaN     9.022811         NaN
Oslo    Friday   6.322276  0.866899    11.579813    0.114341
Oslo    Monday   5.028919  0.815674    10.458439    1.182689
In [36]: grouped.columns = grouped.columns.map('_'.join)
In [37]: grouped = grouped.reset_index()
In [38]: grouped
Out[38]:
     city     day  cats_mean  cats_std  people_mean  people_std
0  Berlin  Friday   5.852991  1.085163    11.078541    0.839688
1  Berlin  Monday   6.978343  0.630983     9.876106    1.846204
2    Oslo  Friday   6.096773  1.278176     9.710216    0.691672
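The output above has only three rows, presumably because the random sample in that run contained no Oslo/Monday records: groupby only emits key combinations that actually occur. If every possible combination must appear, one option is to reindex against the full product of keys before resetting the index. The following is a minimal sketch with made-up, deterministic data (the values and the `all_keys` name are illustrative assumptions, not part of the original post):

```python
import pandas as pd

# Illustrative data; the ('Oslo', 'Monday') combination is deliberately absent.
data = pd.DataFrame({
    'city': ['Berlin', 'Berlin', 'Berlin', 'Oslo', 'Oslo'],
    'day': ['Friday', 'Friday', 'Monday', 'Friday', 'Friday'],
    'people': [10.0, 12.0, 9.0, 11.0, 13.0],
    'cats': [6.0, 8.0, 5.0, 7.0, 5.0],
})

grouped = data.groupby(['city', 'day']).agg(['mean', 'std'])
# Join the two column levels, e.g. ('cats', 'mean') becomes 'cats_mean'.
grouped.columns = grouped.columns.map('_'.join)

# Build every (city, day) pair; pairs without data become all-NaN rows.
all_keys = pd.MultiIndex.from_product(
    [sorted(data['city'].unique()), sorted(data['day'].unique())],
    names=['city', 'day'],
)
flat = grouped.reindex(all_keys).reset_index()
print(flat)
```

Combinations with no observations show up with NaN in every statistic column, which keeps the row count equal to the number of cities times the number of days.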
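As an alternative to joining the column levels afterwards, pandas 0.25 and later support named aggregation, which produces flat column names directly. This is a sketch under the assumption that the set of statistics is known up front; the sample values are made up for illustration:

```python
import pandas as pd

# Illustrative data, not taken from the original post.
data = pd.DataFrame({
    'city': ['Berlin', 'Berlin', 'Berlin', 'Oslo', 'Oslo'],
    'day': ['Friday', 'Friday', 'Monday', 'Friday', 'Friday'],
    'people': [10.0, 12.0, 9.0, 11.0, 13.0],
    'cats': [6.0, 8.0, 5.0, 7.0, 5.0],
})

# Each keyword is an output column name mapped to (input column, aggregation).
flat = (
    data.groupby(['city', 'day'])
    .agg(
        cats_mean=('cats', 'mean'),
        cats_std=('cats', 'std'),
        people_mean=('people', 'mean'),
        people_std=('people', 'std'),
    )
    .reset_index()
)
print(flat)
```

The trade-off is verbosity: every output column is spelled out, whereas `columns.map('_'.join)` handles any number of columns and statistics without listing them.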