Pandas: GroupBy and order groups based on the max value in each group

Question

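I have a Pandas DataFrame containing tracks, scores, and a few other columns.

I want to group by "tracks" and then sort the groups by their maximum "score" value.

Example DataFrame: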

tracks       score
20            2.2
20            1.5
25            3.5
24            1.2
24            5.5

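Expected output (I want to compare the highest value of each group and sort all groups from highest to lowest, but I don't want to lose any other data, meaning I want to show all rows):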

tracks       score
24            5.5
              1.2
25            3.5
20            2.2
              1.5

Currently I get the following output (my scores are sorted, but after grouping, the tracks are ordered by track number):

tracks       score
20            2.2
              1.5
24            5.5
              4.2
25            3.5

My approach so far: first, I sorted all the values by score:

sub_df = sub_df.sort_values("score")

Then I do the following to get the output (I need it in dictionary format):

url_dict = sub_df.groupby('track')['url'].apply(list).to_dict()

I also tried using OrderedDict, but it did not help (at least for now), because the groupby call returns the data in the wrong order.

Pandas = 0.23, Python = 3.6.4

Answer 1

jez*_*ael 5

Create a helper column with GroupBy.transform, sort by multiple columns with DataFrame.sort_values, and finally remove the helper column:

sub_df['max'] = sub_df.groupby('tracks')['score'].transform('max')

sub_df = sub_df.sort_values(["max","score"], ascending=False).drop('max', axis=1)
# If necessary, also sort by the tracks column:
# sub_df = sub_df.sort_values(["max","tracks","score"], ascending=False).drop('max', axis=1)
print(sub_df)
   tracks  score
4      24    5.5
3      24    1.2
2      25    3.5
0      20    2.2
1      20    1.5
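
Since the question ultimately needs a dictionary, here is a minimal self-contained sketch of the same approach carried through to that step. The `url` values are hypothetical placeholders (the question does not show that column), and the code groups on `tracks`, as in the sample data. It relies on `sort=False` in `groupby`, which keeps groups in order of first appearance rather than sorting them by key, so the order produced by `sort_values` should carry over:

```python
import pandas as pd

# Sample data from the question; the url values are hypothetical placeholders.
sub_df = pd.DataFrame({
    "tracks": [20, 20, 25, 24, 24],
    "score": [2.2, 1.5, 3.5, 1.2, 5.5],
    "url": ["a", "b", "c", "d", "e"],
})

# Helper column: each group's maximum score, broadcast to every row of the group.
sub_df["max"] = sub_df.groupby("tracks")["score"].transform("max")

# Order groups by their maximum, then rows within each group by score.
sub_df = sub_df.sort_values(["max", "score"], ascending=False).drop("max", axis=1)

# sort=False keeps groups in order of first appearance instead of sorting by key.
url_dict = sub_df.groupby("tracks", sort=False)["url"].apply(list).to_dict()
print(url_dict)
```

A plain dict only guarantees insertion order from Python 3.7 onward (in CPython 3.6 it is an implementation detail). If two groups can share the same maximum, adding `tracks` to the sort keys, as in the commented-out line above, keeps their rows from interleaving.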
