Keeping both merge keys after pandas.merge_asof

00_*_*_00 6 python merge pandas

I found this nice function, pandas.merge_asof. From the documentation:

pandas.merge_asof(left, right, on=None, left_on=None, right_on=None)

Parameters: 

left : DataFrame
right : DataFrame
on : label

Field name to join on. Must be found in both DataFrames.
The data MUST be ordered. 
Furthermore this must be a numeric column, such as datetimelike, integer, or float.
On or left_on/right_on must be given.

And it works as expected.

However, the merged DataFrame keeps only one `on` column, the one that comes from left. I need to keep both of them:

   mydf=pandas.merge_asof(left, right, on='Time')

I would like mydf to contain the Time column from both left and right.

Sample data:

import pandas as pd

# left frame: 100 timestamps spaced 6 h 3 min apart
a=pd.DataFrame(data=pd.date_range('20100201', periods=100, freq='6h3min'),columns=['Time'])
# right frame: 24 hourly timestamps with a running counter in 'val'
b=pd.DataFrame(data=
                  pd.date_range('20100201', periods=24, freq='1h'),columns=['Time'])
b['val']=range(b.shape[0])
out=pd.merge_asof(a,b,on='Time',direction='forward',tolerance=pd.Timedelta('30min'))
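To make the problem concrete, here is a minimal, self-contained check of the sample data. It only rebuilds the two frames from the question and lists the columns of the merged result; nothing in it goes beyond the calls already used in the question.

```python
import pandas as pd

# Same frames as in the sample data
a = pd.DataFrame({'Time': pd.date_range('20100201', periods=100, freq='6h3min')})
b = pd.DataFrame({'Time': pd.date_range('20100201', periods=24, freq='1h')})
b['val'] = range(len(b))

out = pd.merge_asof(a, b, on='Time', direction='forward',
                    tolerance=pd.Timedelta('30min'))

# Only one 'Time' column survives, and it holds the values from the left frame
print(out.columns.tolist())  # ['Time', 'val']
```

The matched timestamp from b is not visible anywhere in the result, which is what the question is about.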

jez*_*ael 7

I think one possible solution is to rename the columns:

out = pd.merge_asof(a.rename(columns={'Time':'Time1'}), 
                    b.rename(columns={'Time':'Time2'}), 
                    left_on='Time1',
                    right_on='Time2',
                    direction='forward',
                    tolerance=pd.Timedelta('30min'))

print (out.head())
                Time1      Time2  val
0 2010-02-01 00:00:00 2010-02-01  0.0
1 2010-02-01 06:03:00        NaT  NaN
2 2010-02-01 12:06:00        NaT  NaN
3 2010-02-01 18:09:00        NaT  NaN
4 2010-02-02 00:12:00        NaT  NaN
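A variation on the same idea, offered only as a sketch and not as part of the original answer: instead of renaming both keys, copy the right-hand key into an extra column before merging, so that `on='Time'` can stay as it is. The column name `Time_right` is a hypothetical name picked for this example.

```python
import pandas as pd

# Same frames as in the question
a = pd.DataFrame({'Time': pd.date_range('20100201', periods=100, freq='6h3min')})
b = pd.DataFrame({'Time': pd.date_range('20100201', periods=24, freq='1h')})
b['val'] = range(len(b))

# Duplicate the right key under a new (hypothetical) name so it is carried
# through the merge as an ordinary data column
b_with_copy = b.assign(Time_right=b['Time'])

out = pd.merge_asof(a, b_with_copy, on='Time',
                    direction='forward',
                    tolerance=pd.Timedelta('30min'))

print(out.head())
```

Here `Time` keeps the left timestamps and `Time_right` holds the matched right timestamp, or NaT where no row of b falls within the tolerance. Whether this reads better than renaming both columns is a matter of taste; the result carries the same information.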

  • Too bad there is still no better way than renaming (2 upvotes)