Kha*_*oti 6 python dataframe pandas
I have the following dataset in pandas:
import pandas as pd
seq = [1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2]
event_no = [5, 5, 5, 6, 6, 6, 4, 4, 4, 3, 3, 3, 1, 1, 1, 2, 2, 2]
points_no = [1, 1, 1, None, None, None, 1, 1, 1, 1, 1, 1, None, None, None, 1, 1, 1]
df = pd.DataFrame({"seq" : seq, "event_no": event_no, "points_no": points_no})
    seq  event_no  points_no
0     1         5        1.0
1     1         5        1.0
2     1         5        1.0
3     1         6        NaN
4     1         6        NaN
5     1         6        NaN
6     1         4        1.0
7     1         4        1.0
8     1         4        1.0
9     2         3        1.0
10    2         3        1.0
11    2         3        1.0
12    2         1        NaN
13    2         1        NaN
14    2         1        NaN
15    2         2        1.0
16    2         2        1.0
17    2         2        1.0
I then group it by seq and event_no and sum points_no:
df2 = df.groupby(['seq', 'event_no']).points_no.sum().reset_index()
The output below does not preserve the original order of the values in the event_no column; instead, it is sorted in ascending order:
   seq  event_no  points_no
0    1         4        3.0
1    1         5        3.0
2    1         6        0.0
3    2         1        0.0
4    2         2        3.0
5    2         3        3.0
What I actually want is this output:
   seq  event_no  points_no
0    1         5        3.0
1    1         6        0.0
2    1         4        3.0
3    2         3        3.0
4    2         1        0.0
5    2         2        3.0
Is there a way to get this result while preserving the original order?
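Use the parameter sort=False. By default, groupby sorts the group keys; with sort=False the groups are returned in the order in which they first appear: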
df.groupby(['seq', 'event_no'], sort=False).points_no.sum().reset_index()
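As a quick check, here is a minimal, self-contained sketch that rebuilds the DataFrame from the question and applies sort=False. The variable name result is only illustrative. Note that groups consisting entirely of NaN sum to 0.0, because sum skips missing values by default.

```python
import pandas as pd

# Same data as in the question
seq = [1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2]
event_no = [5, 5, 5, 6, 6, 6, 4, 4, 4, 3, 3, 3, 1, 1, 1, 2, 2, 2]
points_no = [1, 1, 1, None, None, None, 1, 1, 1, 1, 1, 1, None, None, None, 1, 1, 1]
df = pd.DataFrame({"seq": seq, "event_no": event_no, "points_no": points_no})

# sort=False keeps the groups in order of first appearance
# instead of sorting them by the group keys
result = (
    df.groupby(["seq", "event_no"], sort=False)["points_no"]
    .sum()
    .reset_index()
)

print(result)
```

The event_no column of result should then follow the order of the original rows (5, 6, 4 for seq 1 and 3, 1, 2 for seq 2) rather than ascending order.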