I have two DataFrames:
df
city mail
a satya
b def
c akash
d satya
e abc
f xyz
# Another DataFrame, d:
city mail
x satya
y def
z akash
u ash
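Now I need to update city in df with the values from d, matching rows on mail. If a mail ID is not found in d, the city should stay unchanged. The result should look like this:
df ### output should look like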
city mail
x satya
y def
z akash
x satya # repeated, so the same value should be placed here
e abc # not found, so left as it was
f xyz
I tried:
import pandas as pd

s = {'mail': ['satya', 'def', 'akash', 'satya', 'abc', 'xyz'], 'city': ['a', 'b', 'c', 'd', 'e', 'f']}
s1 = {'mail': ['satya', 'def', 'akash', 'ash'], 'city': ['x', 'y', 'z', 'u']}
df = pd.DataFrame(s)
d = pd.DataFrame(s1)
# From a web search, I tried:
df.loc[df.mail.isin(d.mail), ['city']] = d['city']
# This gives the wrong result:
city mail
x satya
y def
z akash
u satya ### this value should be city 'x'
e abc
f xyz
I can't just merge with on='mail', how='left', because one DataFrame has fewer customers. After merging, how do I fill in the city for the mails that did not match?
Please suggest.
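It looks like you want to update the city values in df from the city values in d. The update function is index-based, so you first need to set mail as the index.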
# Add extra columns to dataframe.
df['mobile_no'] = ['212-555-1111'] * len(df)
df['age'] = [20] * len(df)
# Update city values keyed on `mail`.
new_city = df[['mail', 'city']].set_index('mail')
new_city.update(d.set_index('mail'))
df['city'] = new_city.values
>>> df
city mail mobile_no age
0 x satya 212-555-1111 20
1 y def 212-555-1111 20
2 z akash 212-555-1111 20
3 x satya 212-555-1111 20
4 e abc 212-555-1111 20
5 f xyz 212-555-1111 20
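As an alternative sketch (not the method above), the same result can probably be reached with Series.map plus fillna, which avoids the intermediate new_city frame. This assumes each mail appears only once in d; the name lookup is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "mail": ["satya", "def", "akash", "satya", "abc", "xyz"],
    "city": ["a", "b", "c", "d", "e", "f"],
})
d = pd.DataFrame({
    "mail": ["satya", "def", "akash", "ash"],
    "city": ["x", "y", "z", "u"],
})

# Lookup Series keyed on mail (assumption: mails in d are unique).
lookup = d.set_index("mail")["city"]

# map() yields NaN for mails missing from the lookup;
# fillna() then restores the original city for those rows.
df["city"] = df["mail"].map(lookup).fillna(df["city"])

print(df)
```

Because the lookup is keyed on mail rather than on the row index, the repeated satya row gets the same city as the first one, and any other columns in df are left alone.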
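On the merge concern in the question: a left merge keeps every row of df, and the unmatched mails simply come back with NaN in the new column, which can then be filled from the old city. A minimal sketch, where the "_new" suffix and the name merged are arbitrary choices for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "mail": ["satya", "def", "akash", "satya", "abc", "xyz"],
    "city": ["a", "b", "c", "d", "e", "f"],
})
d = pd.DataFrame({
    "mail": ["satya", "def", "akash", "ash"],
    "city": ["x", "y", "z", "u"],
})

# Left merge: all rows of df survive; d's city arrives as "city_new".
merged = df.merge(d, on="mail", how="left", suffixes=("", "_new"))

# Where no match was found, city_new is NaN, so fall back to the old city.
merged["city"] = merged["city_new"].fillna(merged["city"])
merged = merged.drop(columns="city_new")

print(merged)
```

The extra customer in d (ash) is dropped by how='left', so the result has the same six rows as df.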