Pandas divide two multi-index Series

Question

I have a multi-index Series that looks like this:

            value
foo bar baz     
1   A    C    6
         D    2
    B    D    6
         F    4
2   B    C    5
         F    7

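I want to aggregate over foo and bar, getting the sum of value for each (foo, bar) pair regardless of baz, which I can do with df.groupby(level=[0, 1]).sum(). The result looks like this: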

        sum_value
foo bar      
1   A       8
    B      10
2   B      12
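
For reference, the data and the grouped sum above can be reconstructed roughly as follows. This is a minimal sketch: the variable names `df` and `sums` are assumptions, and the summed column keeps the name value (the sum_value label above is only illustrative).

```python
import pandas as pd

# Rebuild the three-level index (foo, bar, baz) from the question.
index = pd.MultiIndex.from_tuples(
    [(1, "A", "C"), (1, "A", "D"), (1, "B", "D"),
     (1, "B", "F"), (2, "B", "C"), (2, "B", "F")],
    names=["foo", "bar", "baz"],
)
df = pd.DataFrame({"value": [6, 2, 6, 4, 5, 7]}, index=index)

# Sum over baz by grouping on the first two index levels (foo, bar).
sums = df.groupby(level=[0, 1]).sum()
print(sums)
```

Note that `sums` has a two-level index while `df` has three levels, so the two objects do not share the same index.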

However, I want to divide the original value by the new sum_value to get the share of each baz within a given foo and bar:

            value
foo bar baz     
1   A    C    6/8=.75
         D    2/8=.25
    B    D    6/10=.6
         F    4/10=.4
2   B    C    5/12=.42
         F    7/12=.58

I tried df.div(df.groupby(level=[0, 1]).sum()), but got a Not Implemented error. Thanks!

Answer 1

Sco*_*ton 5

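You can do this by using transform to compute the group sums on an index that matches the original DataFrame, and then using div, which relies on pandas' built-in data alignment: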

df.div(df.groupby(['foo','bar']).transform('sum'))

Output:

                value
foo bar baz          
1   A   C    0.750000
        D    0.250000
    B   D    0.600000
        F    0.400000
2   B   C    0.416667
        F    0.583333
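
A self-contained sketch of this approach on the question's data, with a quick check that the shares within each (foo, bar) group add up to 1. The names `df`, `totals`, `shares`, and `check` are assumptions for illustration.

```python
import pandas as pd

# Rebuild the question's data with a three-level index (foo, bar, baz).
index = pd.MultiIndex.from_tuples(
    [(1, "A", "C"), (1, "A", "D"), (1, "B", "D"),
     (1, "B", "F"), (2, "B", "C"), (2, "B", "F")],
    names=["foo", "bar", "baz"],
)
df = pd.DataFrame({"value": [6, 2, 6, 4, 5, 7]}, index=index)

# transform("sum") returns the group totals repeated on the original
# three-level index, so div can align the two frames row by row.
totals = df.groupby(level=["foo", "bar"]).transform("sum")
shares = df.div(totals)
print(shares)

# Sanity check: the shares in each (foo, bar) group should sum to 1.
check = shares.groupby(level=["foo", "bar"]).sum()
print(check)
```

The difference from the attempt in the question is that `totals` keeps the original index, whereas a plain `.sum()` collapses it to two levels.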
