How do I slice a DataFrame and apply "fillna" only within that slice using Python Pandas?

Geo*_*nko 5 python data-analysis dataframe pandas

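Problem: let's take the Titanic dataset from Kaggle. I have a DataFrame with the columns "Pclass", "Sex" and "Age". I need to fill the NaNs in the "Age" column with the median of a particular group. For example, if a passenger is a 1st class woman, I want to fill her age with the median age of 1st class women, not with the median of the whole Age column.

The question is: how do I make this change within a specific slice?

I tried: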

data['Age'][(data['Sex'] == 'female')&(data['Pclass'] == 1)&(data['Age'].isnull())].fillna(median)

"median" is my value, but nothing changes. "inplace=True" doesn't help either.

Many thanks!

jez*_*ael 5

I believe you need to filter by a boolean mask and assign the result back:

import numpy as np
import pandas as pd

data = pd.DataFrame({'a':list('aaaddd'),
                     'Sex':['female','female','male','female','female','male'],
                     'Pclass':[1,2,1,2,1,1],
                     'Age':[40,20,30,20,np.nan,np.nan]})

print (data)
    Age  Pclass     Sex  a
0  40.0       1  female  a
1  20.0       2  female  a
2  30.0       1    male  a
3  20.0       2  female  d
4   NaN       1  female  d
5   NaN       1    male  d

#boolean mask
mask1 = (data['Sex'] == 'female')&(data['Pclass'] == 1)

#get median by mask without NaNs
med = data.loc[mask1, 'Age'].median()
print (med)
40.0

#replace NaNs
data.loc[mask1, 'Age'] = data.loc[mask1, 'Age'].fillna(med)
print (data)
    Age  Pclass     Sex  a
0  40.0       1  female  a
1  20.0       2  female  a
2  30.0       1    male  a
3  20.0       2  female  d
4  40.0       1  female  d
5   NaN       1    male  d

Which is the same as:

mask2 = mask1 &(data['Age'].isnull())

data.loc[mask2, 'Age'] = med
print (data)
    Age  Pclass     Sex  a
0  40.0       1  female  a
1  20.0       2  female  a
2  30.0       1    male  a
3  20.0       2  female  d
4  40.0       1  female  d
5   NaN       1    male  d
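For context on why the attempt in the question leaves the DataFrame untouched, here is a minimal sketch on a smaller, made-up frame (the three rows and the names `filled_copy`, `nans_after_attempt` and `nans_after_loc` are illustrative assumptions, not part of the original post). Boolean indexing on `data['Age'][...]` produces a new Series, and `fillna` returns yet another one, so nothing is written back unless the result is assigned through `.loc`:

```python
import numpy as np
import pandas as pd

# Small illustrative frame (hypothetical data, not the Titanic dataset)
data = pd.DataFrame({'Sex': ['female', 'female', 'male'],
                     'Pclass': [1, 1, 1],
                     'Age': [40, np.nan, np.nan]})

mask = (data['Sex'] == 'female') & (data['Pclass'] == 1)
med = data.loc[mask, 'Age'].median()

# The pattern from the question: the filled values live only in a
# temporary Series, so `data` itself is not modified.
filled_copy = data['Age'][mask & data['Age'].isnull()].fillna(med)
nans_after_attempt = int(data['Age'].isnull().sum())

# Assigning through .loc writes into `data` directly.
data.loc[mask & data['Age'].isnull(), 'Age'] = med
nans_after_loc = int(data['Age'].isnull().sum())

print(nans_after_attempt, nans_after_loc)
```

The first count still includes both missing ages, while the second drops by one because only the 1st class woman's age was filled. Calling `fillna(..., inplace=True)` on the chained selection would likewise only modify the temporary Series.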

EDIT:

If you need to replace the NaNs in every group with that group's median:

#transform keeps the original index, so the result aligns back to data
data['Age'] = data.groupby(["Sex","Pclass"])["Age"].transform(lambda x: x.fillna(x.median()))
print (data)

    Age  Pclass     Sex  a
0  40.0       1  female  a
1  20.0       2  female  a
2  30.0       1    male  a
3  20.0       2  female  d
4  40.0       1  female  d
5  30.0       1    male  d
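As a possible alternative to the lambda, the per-group medians can be computed once with `transform('median')` and passed to `fillna` as a Series aligned on the index. This is only a sketch using the same sample frame as above; `group_median` is a name introduced here for illustration:

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'a': list('aaaddd'),
                     'Sex': ['female', 'female', 'male', 'female', 'female', 'male'],
                     'Pclass': [1, 2, 1, 2, 1, 1],
                     'Age': [40, 20, 30, 20, np.nan, np.nan]})

# One median per row, taken from that row's (Sex, Pclass) group;
# the result has the same index as `data`.
group_median = data.groupby(['Sex', 'Pclass'])['Age'].transform('median')

# fillna with a Series fills each NaN with the value at the same index label.
data['Age'] = data['Age'].fillna(group_median)

print(data['Age'].tolist())
```

Rows that already have an age are left as they are; only the two missing values are taken from `group_median`.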