Pandas: fill NA with group values

Dan*_*rty 0 python-3.x pandas

Given the following DataFrame:

import pandas as pd
import numpy as np
df = pd.DataFrame({'Site':['A','A','A','B','B','B','C','C','C'],
                   'Value':[np.nan,1,np.nan,np.nan,2,2,3,np.nan,3]})

df

    Site    Value
0   A       NaN
1   A       1.0
2   A       NaN
3   B       NaN
4   B       2.0
5   B       2.0
6   C       3.0
7   C       NaN
8   C       3.0

I want to fill the NaN values with each site's most common (or median, or mean) value. The desired result is:

    Site    Value
0   A       1.0
1   A       1.0
2   A       1.0
3   B       2.0
4   B       2.0
5   B       2.0
6   C       3.0
7   C       3.0
8   C       3.0

Thanks in advance!

Update: this is close, but no cigar:

df['Value']=df.groupby(['Site'])['Value'].fillna(min)

which results in...

    Site    Value
0   A   <function amax at 0x108cf9048>
1   A   1
2   A   <function amax at 0x108cf9048>
3   B   <function amax at 0x108cf9048>
4   B   2
5   B   2
6   C   3
7   C   <function amax at 0x108cf9048>
8   C   3

dmb*_*dmb 5

You can use transform, as in the answer here:

df['Value'] = df.groupby('Site').transform(lambda x: x.fillna(x.mean()))


  Site  Value
0    A    1.0
1    A    1.0
2    A    1.0
3    B    2.0
4    B    2.0
5    B    2.0
6    C    3.0
7    C    3.0
8    C    3.0
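
For reference, here is a minimal self-contained sketch of the same idea that selects the 'Value' column explicitly and passes the per-site mean to fillna as a Series. The name site_mean is only illustrative, and 'mean' can be swapped for 'median' if that is the statistic you want.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Site': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
                   'Value': [np.nan, 1, np.nan, np.nan, 2, 2, 3, np.nan, 3]})

# Per-site mean, broadcast back onto the original row index.
# NaN values are skipped when the mean is computed.
site_mean = df.groupby('Site')['Value'].transform('mean')

# fillna with a Series aligns on the index, so only the NaN rows change.
df['Value'] = df['Value'].fillna(site_mean)
print(df)
```

Because fillna only touches missing entries, rows that already had a value are left exactly as they were.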

  • With more than one column: df['Value'] = df.groupby('Site')['Value'].transform(lambda x: x.fillna(x.max())) (2 upvotes)
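
Since the question asks for the "most common" value, here is a hedged sketch that fills with the per-site mode instead, on a frame that has an extra column. The helper fill_with_mode, the frame df2, and its data are hypothetical and made up for illustration. The sketch assumes that ties should resolve to the smallest mode and that a site with no observed values should be left as NaN.

```python
import numpy as np
import pandas as pd

def fill_with_mode(values: pd.Series) -> pd.Series:
    """Fill NaN with the group's most frequent value (hypothetical helper)."""
    modes = values.mode()  # NaN is ignored by default; result is sorted
    if modes.empty:        # group has no observed values: leave it unchanged
        return values
    return values.fillna(modes.iloc[0])  # on ties, the smallest mode is used

# Made-up example data with an extra column that must not be touched.
df2 = pd.DataFrame({'Site': ['A', 'A', 'A', 'A', 'B', 'B'],
                    'Value': [1.0, 1.0, 5.0, np.nan, np.nan, np.nan],
                    'Other': list('uvwxyz')})

# Selecting ['Value'] keeps the transform restricted to that one column.
df2['Value'] = df2.groupby('Site')['Value'].transform(fill_with_mode)
print(df2)
```

Site A's missing entry becomes 1.0 (its most frequent value), while site B stays NaN because it has nothing to infer from.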