Merge DataFrames on a column, keeping only the first match

A11*_*122 2 python pandas

I have two DataFrames, shown below.

df_1    
Index   Fruit
1       Apple
2       Banana
3       Peach

df_2    
Fruit   Taste
Apple   Tasty
Banana  Tasty
Banana  Rotten
Peach   Rotten
Peach   Tasty
Peach   Tasty

I want to merge the two DataFrames on Fruit, but keep only the first occurrence of Apple, Banana, and Peach from the second DataFrame. The final result should be:

df_output       
Index   Fruit   Taste
1       Apple   Tasty
2       Banana  Tasty
3       Peach   Rotten

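Here Fruit, Index, and Taste are the column headers. I tried something like df1.merge(df2, how='left', on='Fruit'), but the result is based on df_2, with one row for every matching row there.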

Thanks.

jez*_*ael 5

Use drop_duplicates, which by default keeps the first row for each Fruit:

# Keep only the first row per Fruit in df_2, then left-merge onto df_1
df = df_1.merge(df_2.drop_duplicates('Fruit'), how='left', on='Fruit')
print(df)
   Index   Fruit   Taste
0      1   Apple   Tasty
1      2  Banana   Tasty
2      3   Peach  Rotten
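As a self-contained sketch of the approach above, the snippet below rebuilds the sample data from the question. The names `plain`, `first_match`, and `df_output` are illustrative and not from the original post. It first shows why a plain left merge grows to one row per match, then applies `drop_duplicates` before merging. The `validate='many_to_one'` argument is an optional extra, assumed here as a safety check that the right-hand keys are unique.

```python
import pandas as pd

# Sample data reconstructed from the question
df_1 = pd.DataFrame({"Index": [1, 2, 3],
                     "Fruit": ["Apple", "Banana", "Peach"]})
df_2 = pd.DataFrame({
    "Fruit": ["Apple", "Banana", "Banana", "Peach", "Peach", "Peach"],
    "Taste": ["Tasty", "Tasty", "Rotten", "Rotten", "Tasty", "Tasty"],
})

# A plain left merge repeats each df_1 row once per matching df_2 row
plain = df_1.merge(df_2, how="left", on="Fruit")
print(len(plain))  # 6 rows: 1 Apple + 2 Banana + 3 Peach

# Keep only the first df_2 row per Fruit, then merge
first_match = df_2.drop_duplicates(subset="Fruit")
df_output = df_1.merge(first_match, how="left", on="Fruit",
                       validate="many_to_one")  # raises if right keys repeat
print(df_output)
```

With `validate` in place, forgetting the `drop_duplicates` step raises a `MergeError` instead of silently producing extra rows.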

If you only need to add a single column, map is faster:

# Build a Fruit -> Taste lookup Series from the first row per Fruit
s = df_2.drop_duplicates('Fruit').set_index('Fruit')['Taste']
df_1['Taste'] = df_1['Fruit'].map(s)
print(df_1)
   Index   Fruit   Taste
0      1   Apple   Tasty
1      2  Banana   Tasty
2      3   Peach  Rotten
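The `map` variant can be sketched end to end as follows. To show what happens when a key has no match, this example adds a hypothetical fourth fruit, `Cherry`, that is not in the question's data; `lookup` and `result` are likewise illustrative names. It also uses `assign` so that `df_1` itself is left unmodified, which is a design choice of this sketch rather than part of the answer.

```python
import pandas as pd

# Question's data, plus a hypothetical "Cherry" row with no match in df_2
df_1 = pd.DataFrame({"Index": [1, 2, 3, 4],
                     "Fruit": ["Apple", "Banana", "Peach", "Cherry"]})
df_2 = pd.DataFrame({
    "Fruit": ["Apple", "Banana", "Banana", "Peach", "Peach", "Peach"],
    "Taste": ["Tasty", "Tasty", "Rotten", "Rotten", "Tasty", "Tasty"],
})

# Series indexed by Fruit; the index must be unique for map to work
lookup = df_2.drop_duplicates(subset="Fruit").set_index("Fruit")["Taste"]

# map looks up each Fruit in the Series; missing keys become NaN
result = df_1.assign(Taste=df_1["Fruit"].map(lookup))
print(result)
```

Like the left merge, `map` leaves unmatched rows in place with a missing `Taste`, so the row count of `df_1` never changes.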

  • @BarathVutukuri - Change `.drop_duplicates('Fruit')` to `.drop_duplicates('Fruit', keep='last')` (3 upvotes)
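For completeness, here is a minimal sketch of the `keep='last'` variant from the comment, again on the question's sample data. The name `df_last` is illustrative.

```python
import pandas as pd

df_1 = pd.DataFrame({"Index": [1, 2, 3],
                     "Fruit": ["Apple", "Banana", "Peach"]})
df_2 = pd.DataFrame({
    "Fruit": ["Apple", "Banana", "Banana", "Peach", "Peach", "Peach"],
    "Taste": ["Tasty", "Tasty", "Rotten", "Rotten", "Tasty", "Tasty"],
})

# keep='last' retains the final row per Fruit instead of the first
df_last = df_1.merge(df_2.drop_duplicates(subset="Fruit", keep="last"),
                     how="left", on="Fruit")
print(df_last)
```

On this data the last matches are Tasty for Apple, Rotten for Banana, and Tasty for Peach.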