Kuz*_*nbo 5 python merge dataframe pandas fillna
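I have a column with NA values that I want to fill with values from another DataFrame, matched on a key. I am wondering whether there is a simple way to do this.
Example: I have a DataFrame of objects and their colors, like this: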
object color
0 chair black
1 ball yellow
2 door brown
3 ball **NaN**
4 chair white
5 chair **NaN**
6 ball grey
I want to fill the NA values in the color column with the default colors from the following DataFrame:
object default_color
0 chair brown
1 ball blue
2 door grey
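For anyone who wants to run the answers, the two frames shown above can be rebuilt roughly like this. The variable names df and df2 are an assumption taken from the first answer (another answer calls the first frame df1), and the bold markers around NaN are only emphasis, not part of the data.

```python
import numpy as np
import pandas as pd

# Frame with some missing colors (NaN marks the values to be filled).
df = pd.DataFrame({
    "object": ["chair", "ball", "door", "ball", "chair", "chair", "ball"],
    "color": ["black", "yellow", "brown", np.nan, "white", np.nan, "grey"],
})

# Lookup frame: one default color per object.
df2 = pd.DataFrame({
    "object": ["chair", "ball", "door"],
    "default_color": ["brown", "blue", "grey"],
})

print(df)
print(df2)
```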
The result would look like this:
object color
0 chair black
1 ball yellow
2 door brown
3 ball **blue**
4 chair white
5 chair **brown**
6 ball grey
Is there an "easy" way to do this?
Thanks :)
Use np.where together with map, setting the object column as the index of df2, i.e.:
df['color']= np.where(df['color'].isnull(),df['object'].map(df2.set_index('object')['default_color']),df['color'])
Or use where on the column:
df['color'] = df['color'].where(df['color'].notnull(), df['object'].map(df2.set_index('object')['default_color']))
object color
0 chair black
1 ball yellow
2 door brown
3 ball blue
4 chair white
5 chair brown
6 ball grey
First create a Series, then replace the NaNs:
s = df1['object'].map(df2.set_index('object')['default_color'])
print (s)
0 brown
1 blue
2 grey
3 blue
4 brown
5 brown
6 blue
Name: object, dtype: object
df1['color']= df1['color'].mask(df1['color'].isnull(), s)
Or:
df1.loc[df1['color'].isnull(), 'color'] = s
Or:
df1['color'] = df1['color'].combine_first(s)
Or:
df1['color'] = df1['color'].fillna(s)
print (df1)
object color
0 chair black
1 ball yellow
2 door brown
3 ball blue
4 chair white
5 chair brown
6 ball grey
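One behavior worth knowing about the map-based variants above: map returns NaN for a key it cannot find, so a row whose object has no entry in df2 simply stays NaN. A minimal sketch, assuming a hypothetical extra object "lamp" that is not in the original example:

```python
import numpy as np
import pandas as pd

# Hypothetical data: "lamp" has no default color in df2.
df1 = pd.DataFrame({
    "object": ["chair", "lamp", "ball"],
    "color": [np.nan, np.nan, "grey"],
})
df2 = pd.DataFrame({
    "object": ["chair", "ball"],
    "default_color": ["brown", "blue"],
})

# Per-row default color; NaN where the object is missing from df2.
s = df1["object"].map(df2.set_index("object")["default_color"])

# Only the NaN entries of color are replaced; existing colors are kept.
df1["color"] = df1["color"].fillna(s)
print(df1)
```

If such rows need a fallback, a second fillna with a constant could be chained afterwards.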
If the values in object are unique:
df = (df1.set_index('object')['color']
         .combine_first(df2.set_index('object')['default_color'])
         .reset_index())
Or:
df = (df1.set_index('object')['color']
         .fillna(df2.set_index('object')['default_color'])
         .reset_index())
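The index-based variants above line the two Series up by their object labels. Here is a minimal sketch on hypothetical frames in which every object occurs exactly once in df1 (unlike the example in the question, where chair and ball repeat):

```python
import numpy as np
import pandas as pd

# Hypothetical data with unique object values.
df1 = pd.DataFrame({
    "object": ["chair", "ball", "door"],
    "color": ["black", np.nan, np.nan],
})
df2 = pd.DataFrame({
    "object": ["chair", "ball", "door"],
    "default_color": ["brown", "blue", "grey"],
})

# Index both Series by object, fill the gaps by label, then restore the column.
df = (df1.set_index("object")["color"]
         .fillna(df2.set_index("object")["default_color"])
         .reset_index())
print(df)
```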
Using loc + map:
m = df.color.isnull()
df.loc[m, 'color'] = df.loc[m, 'object'].map(df2.set_index('object').default_color)
df
object color
0 chair black
1 ball yellow
2 door brown
3 ball blue
4 chair white
5 chair brown
6 ball grey
If you plan to do many of these replacements, you should call set_index on df2 only once and save the result.
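A sketch of that advice: build the lookup Series once and reuse it for each frame. The helper name fill_default_colors and the frames a and b are hypothetical, made up for this illustration.

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame({
    "object": ["chair", "ball", "door"],
    "default_color": ["brown", "blue", "grey"],
})

# Call set_index only once and keep the resulting lookup Series.
defaults = df2.set_index("object")["default_color"]

def fill_default_colors(frame, lookup):
    """Return a copy of frame with missing colors filled from lookup."""
    out = frame.copy()
    m = out["color"].isnull()
    out.loc[m, "color"] = out.loc[m, "object"].map(lookup)
    return out

# Two hypothetical frames that share the same lookup.
a = pd.DataFrame({"object": ["ball", "chair"], "color": [np.nan, "white"]})
b = pd.DataFrame({"object": ["door", "ball", "chair"],
                  "color": [np.nan, np.nan, "red"]})

a_filled = fill_default_colors(a, defaults)
b_filled = fill_default_colors(b, defaults)
print(a_filled)
print(b_filled)
```

Working on a copy keeps the input frames untouched; dropping the copy would give the in-place behavior of the loc answer.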