Post by Edo*_*ard

Finding date range overlaps in Python

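I am trying to find a more efficient way to detect overlapping date ranges in a dataframe (each row provides a start and an end date), grouped by a specific column (id).

The dataframe is sorted on the "from" column.

I think there should be a way to avoid the "double" apply that I am currently using...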

import pandas as pd
from datetime import datetime

df = pd.DataFrame(columns=['id','from','to'], index=range(5),
                  data=[[878,'2006-01-01','2007-10-01'],
                        [878,'2007-10-02','2008-12-01'],
                        [878,'2008-12-02','2010-04-03'],
                        [879,'2010-04-04','2199-05-11'],
                        [879,'2016-05-12','2199-12-31']])

df['from'] = pd.to_datetime(df['from'])
df['to'] = pd.to_datetime(df['to'])


    id  from        to
0   878 2006-01-01  2007-10-01
1   878 2007-10-02  2008-12-01
2   878 2008-12-02  2010-04-03
3   879 2010-04-04  2199-05-11
4   879 2016-05-12  2199-12-31

I use the "apply" function to loop over all groups, and within each group I use "apply" again on every row:

def check_date_by_id(df):

    df['prevFrom'] = df['from'].shift()
    df['prevTo'] = df['to'].shift()

    def check_date_by_row(x):

        if pd.isnull(x.prevFrom) or pd.isnull(x.prevTo):
            x['overlap'] = False
            return x

        latest_start = max(x['from'], x.prevFrom)
        earliest_end = min(x['to'], x.prevTo)
        x['overlap'] = …
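One possible way to avoid the nested apply is to shift the end date within each id group and compare whole columns at once. The sketch below is not the original poster's method. It assumes the ranges are inclusive (touching endpoints count as an overlap), that "from" is never later than "to" in a row, and that rows are sorted by "from" within each id. The helper name `flag_overlaps` is hypothetical.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame(
    columns=['id', 'from', 'to'],
    data=[[878, '2006-01-01', '2007-10-01'],
          [878, '2007-10-02', '2008-12-01'],
          [878, '2008-12-02', '2010-04-03'],
          [879, '2010-04-04', '2199-05-11'],
          [879, '2016-05-12', '2199-12-31']],
)
df['from'] = pd.to_datetime(df['from'])
df['to'] = pd.to_datetime(df['to'])


def flag_overlaps(frame):
    """Hypothetical helper: flag rows that overlap the previous row of the same id."""
    # Sort so that, inside each id, the previous row always starts no later.
    out = frame.sort_values(['id', 'from']).copy()
    # End date of the previous row in the same id group (NaT for the first row).
    prev_to = out.groupby('id')['to'].shift()
    # Assumption: inclusive ranges. Because the previous row starts no later,
    # the two ranges overlap exactly when the current start is on or before
    # the previous end. Comparisons against NaT evaluate to False.
    out['overlap'] = out['from'] <= prev_to
    return out


result = flag_overlaps(df)
print(result)
```

With the sample data, only the last row (id 879, starting 2016-05-12) is flagged, because it starts before the previous 879 range ends. Like the row-wise version, this only compares each row with its immediate predecessor. If an earlier, longer range can span several later rows, the comparison would need a running maximum of the earlier end dates instead of just the previous one.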

python pandas

Score: 6, Answers: 1, Views: 4685
