Merge two Pandas DataFrames based on an approximate or exact match

mon*_*top 2 python merge dataframe pandas

Below are examples of the DataFrames I want to merge.

#!/usr/bin/env python

import pandas as pd

countries   = ['Germany', 'France', 'Indonesia']
rank_one    = [1, 5, 7]
capitals    = ['Berlin', 'Paris', 'Jakarta']
df1         = pd.DataFrame({'country': countries,
                            'rank_one': rank_one,
                            'capital': capitals})

df1 = df1[['country', 'capital', 'rank_one']]    

population = ['8M', '82M', '66M', '255M']
rank_two   = [0, 1, 6, 9]
df2        = pd.DataFrame({'population': population,
                           'rank_two': rank_two})

df2        = df2[['rank_two', 'population']]

I want to merge the two DataFrames based on an exact or approximate match. A row of df2 should be matched to a row of df1 if

rank_two is equal to rank_one

or

rank_two is the closest number larger than rank_one.

Example:

df1 :

     country  capital  rank_one
0    Germany   Berlin         1
1     France    Paris         5
2  Indonesia  Jakarta         7

df2 :

   rank_two population
0         0         8M
1         1        82M
2         6        66M
3         9       255M

df3_result :

     country  capital  rank_one  rank_two population
0    Germany   Berlin         1         1        82M
1     France    Paris         5         6        66M
2  Indonesia  Jakarta         7         9       255M

WeN*_*Ben 6

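Use merge_asof with direction='forward', which matches each rank_one to the nearest rank_two that is equal to or larger than it. Both DataFrames must already be sorted by their key columns, as they are here: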

pd.merge_asof(df1, df2, left_on='rank_one', right_on='rank_two', direction='forward')
Out[1206]: 
     country  capital  rank_one  rank_two population
0    Germany   Berlin         1         1        82M
1     France    Paris         5         6        66M
2  Indonesia  Jakarta         7         9       255M
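For completeness, here is a self-contained sketch of the same approach using the data from the question. The explicit sort_values calls and the names left and right are my additions, not part of the original answer; they are there because merge_asof raises an error if the key columns are not sorted.

```python
import pandas as pd

# Data from the question
df1 = pd.DataFrame({
    'country': ['Germany', 'France', 'Indonesia'],
    'capital': ['Berlin', 'Paris', 'Jakarta'],
    'rank_one': [1, 5, 7],
})
df2 = pd.DataFrame({
    'rank_two': [0, 1, 6, 9],
    'population': ['8M', '82M', '66M', '255M'],
})

# merge_asof requires both frames to be sorted by their join keys.
# The sample data is already sorted; sorting again is a safeguard (assumption:
# real data may not arrive in order).
left = df1.sort_values('rank_one')
right = df2.sort_values('rank_two')

# direction='forward' picks, for each rank_one, the first rank_two >= rank_one.
# Exact matches are allowed by default (allow_exact_matches=True).
df3_result = pd.merge_asof(
    left, right,
    left_on='rank_one', right_on='rank_two',
    direction='forward',
)
print(df3_result)
```

If a rank_one value is larger than every rank_two, that row gets NaN in the columns coming from df2, so it may be worth checking for missing values after the merge.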