Nek*_*eko 5 numpy pandas python-3.6
I have two data frames:

    df1 <-
            A  C
        7.629  1
        5.227  2
        5.472  3
        5.386  4
        5.445  5

    df2 <-
            A     B
        7.634  10.0
        7.732  30.0
        5.223  33.0
        5.479  22.0
        5.390  49.0
        5.439  53.0

I want to perform an inner merge on column A with a tolerance of ±0.01 to get this result:

    df3 <-
            A     B  C
        7.634  10.0  1
        5.223  33.0  2
        5.479  22.0  3
        5.390  49.0  4
        5.439  53.0  5

Is this possible?

(Note that column A of df3 holds the values copied from df2.)
merge_asof seems to solve your problem (PS: I recommend the second method; I learned it from Zero~)
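# tolerance=0.01 enforces the ±0.01 window; df2 rows with no df1 match get NaN in C and are dropped
pd.merge_asof(df2.sort_values('A'), df1.sort_values('A'), on='A',
              direction='nearest', tolerance=0.01).dropna(subset=['C']).sort_values('C')
Out[415]:
       A     B    C
4  7.634  10.0  1.0
0  5.223  33.0  2.0
3  5.479  22.0  3.0
1  5.390  49.0  4.0
2  5.439  53.0  5.0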
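For anyone who wants to run the merge_asof idea end to end, here is a minimal sketch that builds both frames from the question and wraps the tolerance merge in a small helper. The name `merge_within_tolerance` is made up for illustration and is not a pandas function. It assumes the key column is numeric and that the right-hand frame has no NaN of its own in the columns being brought over.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [7.629, 5.227, 5.472, 5.386, 5.445],
                    "C": [1, 2, 3, 4, 5]})
df2 = pd.DataFrame({"A": [7.634, 7.732, 5.223, 5.479, 5.390, 5.439],
                    "B": [10.0, 30.0, 33.0, 22.0, 49.0, 53.0]})


def merge_within_tolerance(left, right, on, tol):
    """Hypothetical helper: nearest-key merge that keeps only rows of
    `left` with a partner in `right` within +/- tol.
    Key values in the result come from `left`."""
    right_cols = [c for c in right.columns if c != on]
    merged = pd.merge_asof(
        left.sort_values(on),   # merge_asof requires both sides sorted by the key
        right.sort_values(on),
        on=on,
        direction="nearest",
        tolerance=tol,
    )
    # Rows of `left` without a partner inside the tolerance have NaN
    # in the right-hand columns; dropping them gives inner-join behaviour.
    return merged.dropna(subset=right_cols).reset_index(drop=True)


df3 = merge_within_tolerance(df2, df1, on="A", tol=0.01)
df3["C"] = df3["C"].astype(int)  # the NaN step made C float; restore integers
print(df3.sort_values("C"))
```

Passing df2 as the left frame is what makes column A carry df2's values. One caveat: merge_asof will happily pair the same df1 row with several df2 rows if they all fall inside the tolerance, so check for duplicates if your keys are closer together than in this example.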
Or use IntervalIndex
# Index df2 by the closed interval [A - 0.01, A + 0.01] around each of its A values
df2.index = pd.IntervalIndex.from_arrays(df2['A']-0.01,df2['A']+0.01,closed='both')
# Look up each df1.A in those intervals; fill B first, because the next line overwrites df1.A
df1['B']=df2.loc[df1.A].B.values
df1['A']=df2.loc[df1.A].A.values
df1
Out[450]:
       A  C     B
0  7.634  1  10.0
1  5.223  2  33.0
2  5.479  3  22.0
3  5.390  4  49.0
4  5.439  5  53.0
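A variation on the IntervalIndex approach, sketched here under a couple of assumptions: it leaves both frames unmodified, and it copes with keys that fall in no interval by using `IntervalIndex.get_indexer`, which returns -1 for those. The intervals are built around df1's keys this time so that df2's unmatched 7.732 can be filtered out. It assumes the intervals do not overlap (the df1 keys are more than 0.02 apart); with overlapping intervals `get_indexer` raises an error.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [7.629, 5.227, 5.472, 5.386, 5.445],
                    "C": [1, 2, 3, 4, 5]})
df2 = pd.DataFrame({"A": [7.634, 7.732, 5.223, 5.479, 5.390, 5.439],
                    "B": [10.0, 30.0, 33.0, 22.0, 49.0, 53.0]})

tol = 0.01
# One closed interval [A - tol, A + tol] per df1 row
intervals = pd.IntervalIndex.from_arrays(df1["A"] - tol, df1["A"] + tol,
                                         closed="both")

# For each df2 key, the position of the df1 interval containing it;
# -1 means the key lies in no interval.
pos = intervals.get_indexer(df2["A"].to_numpy())
matched = pos >= 0

# Keep only the matched df2 rows (so A comes from df2) and pull C from df1
df3 = df2.loc[matched].copy()
df3["C"] = df1["C"].to_numpy()[pos[matched]]
print(df3)
```

Because the lookup is positional, nothing depends on either frame's index, and an unmatched key is simply filtered out instead of raising a KeyError.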