Fill NaN values using a merge with another DataFrame

Kuz*_*nbo 5 python merge dataframe pandas fillna

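I have a column with NaN values that I want to fill, based on a key, with values from another DataFrame. Is there a simple way to do this?

Example: I have a DataFrame of objects and their colors: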

  object   color
0  chair   black
1   ball  yellow
2   door   brown
3   ball     **NaN**
4  chair   white
5  chair     **NaN**
6   ball    grey

I want to fill the NaN values in the color column with the default colors from the following DataFrame:

  object default_color
0  chair         brown
1   ball          blue
2   door          grey

The result would look like this:

  object   color
0  chair   black
1   ball  yellow
2   door   brown
3   ball     **blue**
4  chair   white
5  chair     **brown**
6   ball    grey

Is there an "easy" way to do this?

Thanks :)

Flo*_*oor 8

Use np.where together with map, after setting the key column of df2 as its index, i.e.

df['color']= np.where(df['color'].isnull(),df['object'].map(df2.set_index('object')['default_color']),df['color'])

Or use Series.where:

df['color'] = df['color'].where(df['color'].notnull(), df['object'].map(df2.set_index('object')['default_color'])) 

Output:

  object   color
0  chair   black
1   ball  yellow
2   door   brown
3   ball    blue
4  chair   white
5  chair   brown
6   ball    grey
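
For anyone who wants to run this end to end, here is a minimal sketch of the np.where variant. The two frames are rebuilt from the tables in the question; the intermediate name `defaults` is only an illustrative assumption and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Frames rebuilt from the question's tables.
df = pd.DataFrame({
    "object": ["chair", "ball", "door", "ball", "chair", "chair", "ball"],
    "color": ["black", "yellow", "brown", np.nan, "white", np.nan, "grey"],
})
df2 = pd.DataFrame({
    "object": ["chair", "ball", "door"],
    "default_color": ["brown", "blue", "grey"],
})

# Look up the default color of every row's object
# ("defaults" is an illustrative name).
defaults = df["object"].map(df2.set_index("object")["default_color"])

# Take the default only where color is missing, otherwise keep the color.
df["color"] = np.where(df["color"].isnull(), defaults, df["color"])
print(df)
```

The map step relies on df2 having one row per object, so that the indexed Series can act as a lookup table.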

  • Yeah, just giving you guys some room. ;-) (2 upvotes)
  • Congrats on 10k :) (2 upvotes)

jez*_*ael 5

First create a Series of the default colors, then replace the NaNs:

s = df1['object'].map(df2.set_index('object')['default_color'])
print(s)
0    brown
1     blue
2     grey
3     blue
4    brown
5    brown
6     blue
Name: object, dtype: object

Then replace the NaNs, for example with mask:

df1['color'] = df1['color'].mask(df1['color'].isnull(), s)

Or:

df1.loc[df1['color'].isnull(), 'color'] = s

Or:

df1['color'] = df1['color'].combine_first(s)

Or:

df1['color'] = df1['color'].fillna(s)
print(df1)
  object   color
0  chair   black
1   ball  yellow
2   door   brown
3   ball    blue
4  chair   white
5  chair   brown
6   ball    grey
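
A quick way to see that these alternatives agree on the sample data is to run each one against a fresh copy and compare. This is only a sketch: the frames are rebuilt from the question, and the names `via_mask`, `via_combine`, `via_fillna`, and `via_loc` are assumptions introduced for illustration.

```python
import numpy as np
import pandas as pd

# Frames rebuilt from the question's tables.
df1 = pd.DataFrame({
    "object": ["chair", "ball", "door", "ball", "chair", "chair", "ball"],
    "color": ["black", "yellow", "brown", np.nan, "white", np.nan, "grey"],
})
df2 = pd.DataFrame({
    "object": ["chair", "ball", "door"],
    "default_color": ["brown", "blue", "grey"],
})

# Default color for every row, aligned with df1's index.
s = df1["object"].map(df2.set_index("object")["default_color"])

# Each variant fills the missing colors from s.
via_mask = df1["color"].mask(df1["color"].isnull(), s)
via_combine = df1["color"].combine_first(s)
via_fillna = df1["color"].fillna(s)

# The loc variant assigns in place, so apply it to a copy.
via_loc = df1.copy()
via_loc.loc[via_loc["color"].isnull(), "color"] = s

print(via_fillna.tolist())
print(via_mask.tolist() == via_combine.tolist() == via_fillna.tolist())
```

Of the four, fillna probably states the intent most directly; the others are handy when the condition is something other than "is missing".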

If the values in object are unique:

# Parentheses let the method chain span several lines.
df = (df1.set_index('object')['color']
         .combine_first(df2.set_index('object')['default_color'])
         .reset_index())

Or:

# Parentheses let the method chain span several lines.
df = (df1.set_index('object')['color']
         .fillna(df2.set_index('object')['default_color'])
         .reset_index())
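
Note that object is not unique in the question's sample data, so the index-based variants are meant for a frame with one row per object. A minimal sketch of the fillna form under that assumption, using a hypothetical frame `df1_unique` that is not taken from the question:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with exactly one row per object (an assumption).
df1_unique = pd.DataFrame({
    "object": ["chair", "ball", "door"],
    "color": ["black", np.nan, np.nan],
})
df2 = pd.DataFrame({
    "object": ["chair", "ball", "door"],
    "default_color": ["brown", "blue", "grey"],
})

# Both Series are indexed by object, so fillna aligns on that key.
result = (
    df1_unique.set_index("object")["color"]
    .fillna(df2.set_index("object")["default_color"])
    .reset_index()
)
print(result)
```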


cs9*_*s95 5

Use loc + map:

m = df.color.isnull()
df.loc[m, 'color'] = df.loc[m, 'object'].map(df2.set_index('object').default_color)

df

  object   color
0  chair   black
1   ball  yellow
2   door   brown
3   ball    blue
4  chair   white
5  chair   brown
6   ball    grey

If you plan to do a lot of these replacements, you should call set_index on df2 only once and save the result.
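
One way to follow that advice is to build the lookup Series once and pass it to a small helper. This is a sketch under assumptions: the helper name `fill_default_color`, the variable `lookup`, and the two small frames are hypothetical and only serve as illustration.

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame({
    "object": ["chair", "ball", "door"],
    "default_color": ["brown", "blue", "grey"],
})

# Build the lookup once and reuse it for every frame.
lookup = df2.set_index("object")["default_color"]


def fill_default_color(frame, lookup):
    """Return a copy of frame with missing colors filled from lookup."""
    out = frame.copy()
    m = out["color"].isnull()
    # Map only the rows that need a value, as in the loc + map approach.
    out.loc[m, "color"] = out.loc[m, "object"].map(lookup)
    return out


# Two hypothetical frames that share the same defaults.
first = pd.DataFrame({"object": ["chair", "ball"], "color": [np.nan, "red"]})
second = pd.DataFrame({"object": ["door", "ball"], "color": ["green", np.nan]})

filled_first = fill_default_color(first, lookup)
filled_second = fill_default_color(second, lookup)
print(filled_first)
print(filled_second)
```

Returning a copy keeps the input frames untouched; drop the copy if modifying in place is what you want.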