How do I fill a column in a DataFrame under a given datetime value restriction?

Bgh*_*aak 2 python fuzzy-comparison dataframe python-datetime pandas

Given the pandas DataFrames df1 and df2:

df1:

                           d  v
0 2018-02-16 13:39:55.562506  1
1 2018-02-16 10:18:56.768246  4

and df2:

                           d   vx
0 2018-02-16 13:39:56.668377  100
1 2018-02-16 14:01:05.766319  200

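How can I extend df1 with the vx values from df2 wherever the timestamps are nearly the same, i.e. they differ by no more than 2 seconds (with NaN where there is no match)?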

Example:

                           d  v     vx
0 2018-02-16 10:18:56.768246  4    NaN
1 2018-02-16 13:39:55.562506  1  100.0

Here is the code:

import pandas as pd
import datetime as dt

dt1 = dt.datetime(2018, 2, 16, 13, 39, 55, 562506)
dt2 = dt.datetime(2018, 2, 16, 10, 18, 56, 768246)
df1 = pd.DataFrame({'v': [1, 4], 'd': [dt1, dt2]})

dt3 = dt.datetime(2018, 2, 16, 13, 39, 56, 668377)
dt4 = dt.datetime(2018, 2, 16, 14, 1, 5, 766319)
df2 = pd.DataFrame({'vx': [100, 200], 'd': [dt3, dt4]})

Max*_*axU 5

Use pd.merge_asof():

In [232]: pd.merge_asof(df1.sort_values('d'), df2, on='d',
                        tolerance=pd.to_timedelta('2s'),
                        direction='nearest')
Out[232]:
                           d  v     vx
0 2018-02-16 10:18:56.768246  4    NaN
1 2018-02-16 13:39:55.562506  1  100.0

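Note: both DataFrames must be sorted by the join key (d in your case). Here df2 is already in ascending order of d, so only df1 needs sort_values('d').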
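For reference, here is a self-contained sketch of the same approach that rebuilds the question's two frames and sorts both sides explicitly, so it does not rely on df2 happening to be ordered already. The name `result` and the use of `pd.Timedelta(seconds=2)` in place of a frequency string are choices made for this illustration, not part of the original answer.

```python
import datetime as dt

import pandas as pd

# Sample frames from the question.
df1 = pd.DataFrame({
    "d": [dt.datetime(2018, 2, 16, 13, 39, 55, 562506),
          dt.datetime(2018, 2, 16, 10, 18, 56, 768246)],
    "v": [1, 4],
})
df2 = pd.DataFrame({
    "d": [dt.datetime(2018, 2, 16, 13, 39, 56, 668377),
          dt.datetime(2018, 2, 16, 14, 1, 5, 766319)],
    "vx": [100, 200],
})

# merge_asof requires both inputs to be sorted by the "on" key,
# so sort each side explicitly before merging.
result = pd.merge_asof(
    df1.sort_values("d"),
    df2.sort_values("d"),
    on="d",
    tolerance=pd.Timedelta(seconds=2),  # accept matches at most 2 seconds apart
    direction="nearest",                # look both backward and forward in time
)
print(result)
```

With direction='nearest' each row of df1 is paired with the closest df2 timestamp on either side, and the tolerance turns any pairing further than 2 seconds apart into NaN. That is why the vx column comes back as float rather than int.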