Wou*_*nes 14 python dataframe pandas data-cleaning
Below are the attributes of people loaded into the pandas DataFrame 'df2'. As part of cleaning, I want to replace zero values (0 or '0') with np.nan.
df2.dtypes
ID object
Name object
Weight float64
Height float64
BootSize object
SuitSize object
Type object
dtype: object
Working code that sets the value 0 to np.nan:
df2.loc[df2['Weight'] == 0,'Weight'] = np.nan
df2.loc[df2['Height'] == 0,'Height'] = np.nan
df2.loc[df2['BootSize'] == '0','BootSize'] = np.nan
df2.loc[df2['SuitSize'] == '0','SuitSize'] = np.nan
I believe this can be done in a similar but shorter way:
df2[["Weight","Height","BootSize","SuitSize"]].astype(str).replace('0',np.nan)
But the above does not work: the zeros remain in df2. How can I fix this?
jez*_*ael 32
I think you need replace with a dict:
cols = ["Weight","Height","BootSize","SuitSize","Type"]
df2[cols] = df2[cols].replace({'0':np.nan, 0:np.nan})
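For what it's worth, the attempt in the question seems to fail for two separate reasons: replace returns a new DataFrame that is never assigned back to df2, and astype(str) turns the float 0.0 into the string '0.0', which does not match '0'. A minimal sketch on made-up data (the rows are hypothetical, only the column names come from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical rows; only the column names mirror the question.
df2 = pd.DataFrame({
    "Weight": [70.5, 0.0, 82.0],
    "Height": [0.0, 180.0, 175.5],
    "BootSize": ["42", "0", "44"],
    "SuitSize": ["0", "L", "M"],
})
cols = ["Weight", "Height", "BootSize", "SuitSize"]

# The question's attempt: the result is a new object, so df2 is untouched,
# and the float zero has been stringified to '0.0', which is not '0'.
attempt = df2[cols].astype(str).replace("0", np.nan)
print(attempt.loc[1, "Weight"])
zero_still_there = df2.loc[1, "Weight"] == 0.0
print(zero_still_there)

# Replacing on the original dtypes and assigning back updates df2.
df2[cols] = df2[cols].replace({"0": np.nan, 0: np.nan})
print(df2.isna())
```

Keeping the numeric columns numeric also means Weight and Height stay float64 after the cleanup.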
You can use the 'replace' method, passing the values to be replaced as a list in the first argument and the desired value as the second argument:
cols = ["Weight","Height","BootSize","SuitSize","Type"]
df2[cols] = df2[cols].replace(['0', 0], np.nan)
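As a quick check, the list form and the dict form should mark the same cells as missing when every target maps to the same replacement. A small sketch, assuming a hypothetical two-column frame named sample:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one numeric zero and one string zero.
sample = pd.DataFrame({
    "Weight": [0.0, 61.2],
    "BootSize": ["0", "39"],
})

# List of targets with a single replacement value.
via_list = sample.replace(["0", 0], np.nan)
# Dict mapping each target to its replacement.
via_dict = sample.replace({"0": np.nan, 0: np.nan})

# Compare which cells ended up missing in each result.
same_mask = via_list.isna().equals(via_dict.isna())
print(same_mask)
```

The dict form is the one to reach for when different targets need different replacements.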
Try:
# replace returns a new DataFrame, so assign the result back
df2 = df2.replace(to_replace={
    'Weight': {0: np.nan},
    'Height': {0: np.nan},
    'BootSize': {'0': np.nan},
    'SuitSize': {'0': np.nan},
})
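The nested dict restricts each replacement to the named column, so a '0' in any other column is left alone. Note that replace returns a new DataFrame rather than modifying the original, so the result has to be kept. A minimal sketch, assuming a hypothetical frame named people with made-up values:

```python
import numpy as np
import pandas as pd

# Hypothetical data; "Type" deliberately holds a '0' that should survive.
people = pd.DataFrame({
    "Weight": [0.0, 70.5],
    "BootSize": ["0", "42"],
    "Type": ["0", "A"],
})

# Outer keys are column names, inner dicts map old value -> new value.
cleaned = people.replace(to_replace={
    "Weight": {0: np.nan},
    "BootSize": {"0": np.nan},
})

print(cleaned)
print(people)  # the original frame still contains its zeros
```

This form is handy when a zero is a legitimate value in some columns but a placeholder for missing data in others.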