Sorting within a pandas groupby (MultiIndex)

Question

Edit: added the sample data df and the expected output. Edit 2: I changed the data slightly so that "cc" is not the row with the maximum in every group.

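My problem is:

I have a DataFrame with two index columns, (Index1, Index2), which I group by, and three value columns (X, Y, Z).
I create a groupby and apply a function to it that scales every column so that its maximum within each Index1 group is 1.
I sum the resulting DataFrame across columns to get a total for each row.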

df is:

import pandas as pd

df = pd.DataFrame({'Index1': ['A', 'A', 'A', 'B', 'B', 'B'],
                   'Index2': ['aa', 'bb', 'cc', 'aa', 'bb', 'cc'],
                   'X': [1, 2, 7, 3, 6, 1],
                   'Y': [2, 3, 6, 2, 4, 1],
                   'Z': [3, 5, 9, 1, 2, 1]})

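The code is then:

df_scored = pd.DataFrame()   # new DataFrame to hold the results
cats = ['X', 'Y', 'Z']       # columns of df to be scaled
grouped = df.groupby(['Index1', 'Index2']).sum()
for cat in cats:
    # Divide each value by the maximum of its Index1 group.
    # group_keys=False keeps the original (Index1, Index2) index on newer pandas.
    df_scored[cat] = grouped.groupby(level=0, group_keys=False)[cat].apply(lambda x: x / x.max())
df_scored['Score'] = df_scored.sum(axis=1)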

This produces:

                      X         Y         Z     Score
Index1 Index2                                        
A      aa      0.142857  0.333333  0.333333  0.809524
       bb      0.285714  0.500000  0.555556  1.341270
       cc      1.000000  1.000000  1.000000  3.000000
B      aa      0.500000  0.500000  0.500000  1.500000
       bb      1.000000  1.000000  1.000000  3.000000
       cc      0.166667  0.250000  0.500000  0.916667
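
The scaling step can also be written without looping over the columns. The following is a minimal sketch, assuming the same df as in the question; group_max is a name introduced here for illustration. transform('max') broadcasts each Index1 group's column maxima back to the shape of grouped, so a single division scales every column at once.

```python
import pandas as pd

df = pd.DataFrame({'Index1': ['A', 'A', 'A', 'B', 'B', 'B'],
                   'Index2': ['aa', 'bb', 'cc', 'aa', 'bb', 'cc'],
                   'X': [1, 2, 7, 3, 6, 1],
                   'Y': [2, 3, 6, 2, 4, 1],
                   'Z': [3, 5, 9, 1, 2, 1]})

grouped = df.groupby(['Index1', 'Index2']).sum()

# Per-group column maxima, repeated for every row of the group
# (group_max is a hypothetical helper name, not from the question).
group_max = grouped.groupby(level='Index1').transform('max')

# Element-wise division scales each column to a maximum of 1 per Index1 group.
df_scored = grouped / group_max
df_scored['Score'] = df_scored.sum(axis=1)

print(df_scored.round(6))
```

This keeps the (Index1, Index2) MultiIndex of grouped, so the result can be sorted in the same way as the loop-based version.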

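Now I want to sort the resulting df_scored within each Index1 group (so that Index2 is ordered by "Score", in descending order, inside each Index1 group). This is the desired result: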

                      X         Y         Z     Score
Index1 Index2                                        
A      cc      1.000000  1.000000  1.000000  3.000000
       bb      0.285714  0.500000  0.555556  1.341270
       aa      0.142857  0.333333  0.333333  0.809524
B      bb      1.000000  1.000000  1.000000  3.000000
       aa      0.500000  0.500000  0.500000  1.500000
       cc      0.166667  0.250000  0.500000  0.916667

How can I do this?

I have seen a few other questions about this here and here, but I could not get them to work in this case.

Answer 1

Aks*_*kar 6

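Add this to the end of your code. Both methods return a new DataFrame, so assign the result if you want to keep it:

df_scored.sort_values('Score', ascending=False).sort_index(level='Index1', sort_remaining=False)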
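
To see why the two-step sort works, here is a small self-contained sketch. For brevity it rebuilds only the Score column from the question's output rather than rerunning the scaling. The first sort orders all rows by Score; the second sort then regroups the rows by Index1 and, with sort_remaining=False, leaves Index2 alone, so the Score order survives inside each group. This depends on the second sort preserving the existing row order within each Index1 value, which is worth checking on your pandas version. The one_step variant is an alternative added here, not part of the original answer; it assumes pandas 0.23 or later, where sort_values accepts index level names in by.

```python
import pandas as pd

# Only the Score column from the question's output, for brevity.
index = pd.MultiIndex.from_product([['A', 'B'], ['aa', 'bb', 'cc']],
                                   names=['Index1', 'Index2'])
df_scored = pd.DataFrame(
    {'Score': [0.809524, 1.341270, 3.0, 1.5, 3.0, 0.916667]}, index=index)

# Two-step sort from the answer: order by Score first, then regroup by
# Index1 without re-sorting the remaining level (Index2).
two_step = (df_scored
            .sort_values('Score', ascending=False)
            .sort_index(level='Index1', sort_remaining=False))

# Alternative (an assumption of this sketch, not the answer's code):
# name both sort keys explicitly in a single call.
one_step = df_scored.sort_values(['Index1', 'Score'], ascending=[True, False])

print(two_step)
```

Both variants give the Index2 order cc, bb, aa for group A and bb, aa, cc for group B, matching the desired result in the question.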
