Update pandas column values from another DataFrame's values

Chr*_*Joo 2 python pandas

I have the following two DataFrames:

df_a = 
   id    val 
0  A100  11
1  A101  12
2  A102  13
3  A103  14
4  A104  15


df_b = 
   id    loc  val 
0  A100  12
1  A100  23
2  A100  32
3  A102  21
4  A102  38
5  A102  12
6  A102  18
7  A102  19
..... 

Desired result:

df_b = 
   id    loc  val 
0  A100  12   11
1  A100  23   11
2  A100  32   11
3  A102  21   13
4  A102  38   13
5  A102  12   13
6  A102  18   13
7  A102  19   13
.....

I tried to update the 'val' column of df_b from the 'val' column of df_a like this:

for index, row in df_a.iterrows():
    v = row['val']
    seq = df_a.loc[df_a['val'] == v] 
    df_b.loc[df_b['val'] == v, 'val'] = seq['val'] 

or

df_x = df_b.join(df_a, on=['id'], how='inner', lsuffix='_left', rsuffix='_right') 

Neither attempt gives the result I want. How can I solve this?

Thanks

jez*_*ael 5

You can use map with a Series created by set_index:

df_b['val'] = df_b['id'].map(df_a.set_index('id')['val'])
print (df_b)
     id  loc  val
0  A100   12   11
1  A100   23   11
2  A100   32   11
3  A102   21   13
4  A102   38   13
5  A102   12   13
6  A102   18   13
7  A102   19   13
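For anyone who wants to run the map approach end to end, here is a minimal sketch. It rebuilds the two frames from the question, assuming df_b starts with only the id and loc columns and that each id appears only once in df_a (the name lookup is just for illustration).

```python
import pandas as pd

# Sample data reconstructed from the question
df_a = pd.DataFrame({
    "id": ["A100", "A101", "A102", "A103", "A104"],
    "val": [11, 12, 13, 14, 15],
})
df_b = pd.DataFrame({
    "id": ["A100", "A100", "A100", "A102", "A102", "A102", "A102", "A102"],
    "loc": [12, 23, 32, 21, 38, 12, 18, 19],
})

# Series indexed by id, so each id looks up its val
lookup = df_a.set_index("id")["val"]

# Replace each id in df_b with the matching val from df_a
df_b["val"] = df_b["id"].map(lookup)
print(df_b)
```

Any id in df_b that is missing from df_a would end up as NaN in the new column.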

Or use merge with a left join:

import pandas as pd

df = pd.merge(df_b, df_a, on='id', how='left')

print (df)
     id  loc  val
0  A100   12   11
1  A100   23   11
2  A100   32   11
3  A102   21   13
4  A102   38   13
5  A102   12   13
6  A102   18   13
7  A102   19   13
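One thing to watch for: the output above assumes df_b does not already have a val column. If it does (the question's header suggests an empty one), merging on id keeps both columns and renames them with the default suffixes. A small sketch of that case, using a hypothetical shortened df_b with a placeholder val column:

```python
import pandas as pd

df_a = pd.DataFrame({
    "id": ["A100", "A101", "A102", "A103", "A104"],
    "val": [11, 12, 13, 14, 15],
})
# Hypothetical df_b that already carries an empty 'val' column
df_b = pd.DataFrame({
    "id": ["A100", "A100", "A102", "A102", "A102"],
    "loc": [12, 23, 21, 38, 12],
    "val": [None] * 5,
})

# Merging as-is keeps both 'val' columns, renamed val_x and val_y
naive = pd.merge(df_b, df_a, on="id", how="left")
print(naive.columns.tolist())

# Dropping the placeholder first leaves a single 'val' taken from df_a
merged = pd.merge(df_b.drop(columns="val"), df_a, on="id", how="left")
print(merged)
```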

If id is the only column the two DataFrames have in common, the on parameter can be omitted:

df = pd.merge(df_b,df_a, how='left')
print (df)
     id  loc  val
0  A100   12   11
1  A100   23   11
2  A100   32   11
3  A102   21   13
4  A102   38   13
5  A102   12   13
6  A102   18   13
7  A102   19   13
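As for the join attempt in the question: DataFrame.join with on= matches the caller's column against the index of the other frame, and df_a's index is the default integer index, not id. A hedged sketch of how that attempt could be adapted, assuming df_b has only id and loc (shortened sample data):

```python
import pandas as pd

df_a = pd.DataFrame({
    "id": ["A100", "A101", "A102", "A103", "A104"],
    "val": [11, 12, 13, 14, 15],
})
df_b = pd.DataFrame({
    "id": ["A100", "A100", "A102", "A102"],
    "loc": [12, 23, 21, 38],
})

# Move id into df_a's index so join can match df_b['id'] against it
joined = df_b.join(df_a.set_index("id"), on="id", how="left")
print(joined)
```

No suffixes are needed here because the two frames no longer share a column name once id becomes the index of df_a.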