Summing column values with a multi-index pivot table in pandas (Python)

mua*_*aiz 0 python pivot-table dataframe pandas

I have data like this:

Employed    Coverage    Education   Amount
No          Basic       Bachelor    541.8029122
No          Extended    Bachelor    312.6400955
No          Premium     Bachelor    427.9560121
No          Basic       Bachelor    91.17931022
No          Basic       Bachelor    533.6890081
Yes         Basic       Bachelor    683.484326
Yes         Basic       College     586.2670885
No          Premium     Master      725.0412884
Yes         Basic       Bachelor    948.3628611

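I want to sum Amount using a multi-index pivot table so that the result looks like the image below. This is the link I followed, but I could not get the correct result.

[image: desired output table]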

I need your help.

jpp*_*jpp 6

Here is one way.

import io
import pandas as pd

s = '''\
Employed    Coverage    Education   Amount
No          Basic       Bachelor    541.8029122
No          Extended    Bachelor    312.6400955
No          Premium     Bachelor    427.9560121
No          Basic       Bachelor    91.17931022
No          Basic       Bachelor    533.6890081
Yes         Basic       Bachelor    683.484326
Yes         Basic       College     586.2670885
No          Premium     Master      725.0412884
Yes         Basic       Bachelor    948.3628611'''

# Recreate the dataframe from the whitespace-separated text
df = pd.read_csv(io.StringIO(s), sep=r'\s+')

The actual code:

df['Coverage'] = df['Coverage'].astype('category')

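# observed=False keeps unobserved categories as columns; recent pandas
# versions warn about, or change, the default for categorical groupers.
pd.pivot_table(df, index='Education', columns=['Employed', 'Coverage'],
               values='Amount', aggfunc='sum', fill_value=0, observed=False)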

# Employed            No                                  Yes                 
# Coverage         Basic    Extended     Premium        Basic Extended Premium
# Education                                                                   
# Bachelor   1166.671231  312.640096  427.956012  1631.847187      0.0     0.0
# College       0.000000    0.000000    0.000000   586.267088      0.0     0.0
# Master        0.000000    0.000000  725.041288     0.000000      0.0     0.0
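As a cross-check, roughly the same sums can be produced with groupby followed by unstack. This is only a sketch of an alternative, not the approach used above; the name `summed` is made up for illustration. Coverage is left as a plain string column here, so only the (Employed, Coverage) pairs that actually occur in the data become columns.

```python
import io
import pandas as pd

s = '''\
Employed    Coverage    Education   Amount
No          Basic       Bachelor    541.8029122
No          Extended    Bachelor    312.6400955
No          Premium     Bachelor    427.9560121
No          Basic       Bachelor    91.17931022
No          Basic       Bachelor    533.6890081
Yes         Basic       Bachelor    683.484326
Yes         Basic       College     586.2670885
No          Premium     Master      725.0412884
Yes         Basic       Bachelor    948.3628611'''

df = pd.read_csv(io.StringIO(s), sep=r'\s+')

# Sum Amount per (Education, Employed, Coverage) group, then move
# Employed and Coverage from the row index into two column levels.
summed = (
    df.groupby(['Education', 'Employed', 'Coverage'])['Amount']
      .sum()
      .unstack(['Employed', 'Coverage'], fill_value=0)
)

print(summed)
```

The sums for the combinations that exist should match the pivot table; the all-zero Yes/Extended and Yes/Premium columns are simply absent because nothing here asks for unobserved categories.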

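Notes:

  • Converting Coverage to a category ensures that every category of that series is reported, including combinations that never occur in the data.
  • The default aggregation for a pivot table is mean, so sum must be specified explicitly.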
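Both notes can be checked on the sample data. The following is a minimal sketch; the names `by_mean`, `by_sum`, `plain`, `full` and `df_cat` are illustrative only, and `observed=False` is passed explicitly on the assumption that the installed pandas accepts that keyword for pivot_table.

```python
import io
import pandas as pd

s = '''\
Employed    Coverage    Education   Amount
No          Basic       Bachelor    541.8029122
No          Extended    Bachelor    312.6400955
No          Premium     Bachelor    427.9560121
No          Basic       Bachelor    91.17931022
No          Basic       Bachelor    533.6890081
Yes         Basic       Bachelor    683.484326
Yes         Basic       College     586.2670885
No          Premium     Master      725.0412884
Yes         Basic       Bachelor    948.3628611'''

df = pd.read_csv(io.StringIO(s), sep=r'\s+')

# Note 2: with no aggfunc the cells hold means, not sums.
by_mean = pd.pivot_table(df, index='Education', columns='Employed',
                         values='Amount')
by_sum = pd.pivot_table(df, index='Education', columns='Employed',
                        values='Amount', aggfunc='sum')
print(by_mean.loc['Bachelor', 'Yes'], by_sum.loc['Bachelor', 'Yes'])

# Note 1: with a plain string column, only the (Employed, Coverage)
# pairs present in the data become columns.
plain = pd.pivot_table(df, index='Education',
                       columns=['Employed', 'Coverage'],
                       values='Amount', aggfunc='sum', fill_value=0)

# With a categorical column and observed=False, every Coverage category
# appears under every Employed value.
df_cat = df.assign(Coverage=df['Coverage'].astype('category'))
full = pd.pivot_table(df_cat, index='Education',
                      columns=['Employed', 'Coverage'],
                      values='Amount', aggfunc='sum', fill_value=0,
                      observed=False)

print(plain.shape, full.shape)
```

The categorical version is wider because Yes/Extended and Yes/Premium are filled in with zeros, which is what produces the all-zero columns in the output above.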