Tom*_*mas 9 python sorting grouping pandas
I am trying to sort my data (Pandas) after grouping and aggregating, and I am stuck. My data:
import numpy as np
import pandas as pd

data = {'from_year': [2010, 2011, 2012, 2011, 2012, 2010, 2011, 2012],
        'name': ['John', 'John1', 'John', 'John', 'John4', 'John', 'John1', 'John6'],
        'out_days': [11, 8, 10, 15, 11, 6, 10, 4]}
persons = pd.DataFrame(data, columns=["from_year", "name", "out_days"])
# Sum out_days per (from_year, name); the result has MultiIndex columns.
days_off_yearly = persons.groupby(["from_year", "name"]).agg({"out_days": [np.sum]})
print(days_off_yearly)
After that, my data looks like this:
                out_days
                     sum
from_year name
2010      John        17
2011      John        15
          John1       18
2012      John        10
          John4       11
          John6        4
I want to sort the data by from_year and by the out_days sum, and I expect the result to be:
                out_days
                     sum
from_year name
2012      John4       11
          John        10
          John6        4
2011      John1       18
          John        15
2010      John        17
I am trying:
print(days_off_yearly.sort_values(["from_year", ("out_days", "sum")], ascending=False).head(10))
but I get KeyError: 'from_year'.
Any help is appreciated.
You can use sort_values, but first call reset_index to turn the index levels into columns, and afterwards call set_index to restore them:
# simpler aggregation
days_off_yearly = persons.groupby(["from_year", "name"])['out_days'].sum()
print(days_off_yearly)
from_year  name
2010       John     17
2011       John     15
           John1    18
2012       John     10
           John4    11
           John6     4
Name: out_days, dtype: int64
print(days_off_yearly.reset_index()
      .sort_values(['from_year', 'out_days'], ascending=False)
      .set_index(['from_year', 'name']))
                 out_days
from_year name
2012      John4        11
          John         10
          John6         4
2011      John1        18
          John         15
2010      John         17
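
For reference, here is a self-contained sketch of the same idea. It first checks that from_year is an index level rather than a column after the groupby, which is the likely reason for the KeyError, and then runs the reset_index / sort_values / set_index round trip. The last step is an assumption that does not come from the answer above: on pandas 0.23 or later, sort_values should also accept index level names in `by`, so the round trip may not be needed. The variable names totals, round_trip and by_level are made up for illustration.

```python
import pandas as pd

data = {
    "from_year": [2010, 2011, 2012, 2011, 2012, 2010, 2011, 2012],
    "name": ["John", "John1", "John", "John", "John4", "John", "John1", "John6"],
    "out_days": [11, 8, 10, 15, 11, 6, 10, 4],
}
persons = pd.DataFrame(data)

# Aggregate to a one-column DataFrame indexed by (from_year, name).
totals = persons.groupby(["from_year", "name"])[["out_days"]].sum()

# The grouping keys live in the index, not in the columns.
index_levels = list(totals.index.names)
column_labels = list(totals.columns)

# Round trip from the answer: move the levels to columns, sort, move them back.
round_trip = (
    totals.reset_index()
    .sort_values(["from_year", "out_days"], ascending=False)
    .set_index(["from_year", "name"])
)

# Assumption (pandas >= 0.23): `by` may mix index level names and column labels.
by_level = totals.sort_values(["from_year", "out_days"], ascending=False)

print(round_trip)
print(by_level)
```

If both calls behave as assumed, the two printed frames should list the rows in the same order. The round trip is the safer choice on older pandas versions.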