Pandas: Remove float values from the output of a pivot table used for counting

equ*_*ity 3 python pandas

I have the following (toy) dataset:

import pandas as pd
import numpy as np

df = pd.DataFrame({'System_Key':['MER-002', 'MER-003', 'MER-004', 'MER-005', 'BAV-378', 'BAV-379', 'BAV-380', 'BAV-381', 'AUD-220', 'AUD-221', 'AUD-222', 'AUD-223'],
                   'Manufacturer':['Mercedes', 'Mercedes', 'Mercedes', 'Mercedes', 'BMW', 'BMW', 'BMW', 'BMW', 'Audi', 'Audi', 'Audi', 'Audi'],
                   'Region':['Americas', 'Europe', 'Americas', 'Asia', 'Asia', 'Europe', 'Europe', 'Europe', 'Americas', 'Asia', 'Americas', 'Americas'],
                   'Department':[np.nan, 'Sales', np.nan, 'Operations', np.nan, np.nan, 'Accounting', np.nan, 'Finance', 'Finance', 'Finance', np.nan]
                  })

    System_Key  Manufacturer    Region       Department
0   MER-002     Mercedes        Americas     NaN
1   MER-003     Mercedes        Europe       Sales
2   MER-004     Mercedes        Americas     NaN
3   MER-005     Mercedes        Asia         Operations
4   BAV-378     BMW             Asia         NaN
5   BAV-379     BMW             Europe       NaN
6   BAV-380     BMW             Europe       Accounting
7   BAV-381     BMW             Europe       NaN
8   AUD-220     Audi            Americas     Finance
9   AUD-221     Audi            Asia         Finance
10  AUD-222     Audi            Americas     Finance
11  AUD-223     Audi            Americas     NaN

First, I replace the NaN values in the DataFrame with empty strings:

df = df.fillna('')

Then I pivot the DataFrame as follows:

pivot = pd.pivot_table(df, index='Manufacturer', columns='Region', values='System_Key', aggfunc='size').applymap(str)

Note that I am using aggfunc='size' to count.

This produces the following pivot table:

Region           Americas   Asia    Europe
Manufacturer            
Audi             3.0        1.0     NaN
BMW              NaN        1.0     3.0
Mercedes         2.0        1.0     1.0

How can I convert the float values in this pivot table to integers?

Thanks in advance!

WeN*_*Ben 5

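Try fill_value. The counts come out as floats because Manufacturer/Region combinations with no rows become NaN, and NaN forces a float dtype. Once those cells are filled, the table can be cast to int: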

pivot = pd.pivot_table(df, index='Manufacturer', columns='Region', values='System_Key', aggfunc='size',fill_value=-1).astype(int)
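For comparison, here is a minimal sketch of two other ways to get integer counts, assuming only the Manufacturer and Region columns from the question. It is not part of the answer above. The first counts with groupby and fills empty combinations with 0 (arguably a more natural placeholder for a count than -1). The second keeps the gaps as missing values by casting to pandas' nullable Int64 dtype. The names counts and nullable are just illustrative.

```python
import pandas as pd

# Only the two columns needed for counting, copied from the question's toy data.
df = pd.DataFrame({
    'Manufacturer': ['Mercedes'] * 4 + ['BMW'] * 4 + ['Audi'] * 4,
    'Region': ['Americas', 'Europe', 'Americas', 'Asia',
               'Asia', 'Europe', 'Europe', 'Europe',
               'Americas', 'Asia', 'Americas', 'Americas'],
})

# Option 1: count rows per (Manufacturer, Region) pair and fill absent pairs
# with 0 while unstacking, so NaN never appears and the dtype stays integer.
counts = df.groupby(['Manufacturer', 'Region']).size().unstack(fill_value=0)

# Option 2: unstack without a fill value (absent pairs become NaN, dtype float64),
# then cast to the nullable integer dtype so absent pairs show as <NA>.
nullable = df.groupby(['Manufacturer', 'Region']).size().unstack().astype('Int64')

print(counts)
print(nullable)
```

If the zeros are acceptable, pd.crosstab(df['Manufacturer'], df['Region']) should give the same integer table as option 1 in a single call.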