Pandas: extract the date from a datetime, and roll over to the next day if the time is past a certain hour

ben*_*nsw 2 python datetime pandas

Suppose I have this DataFrame.

import pandas as pd
data = {"Date": ["2018-08-05", "2018-08-05", "2018-08-05", "2018-08-05", "2018-08-06"],  
        "Time_End":["2018-08-05 13:50:00", "2018-08-05 14:26:00", "2018-08-05 17:30:00", "2018-08-05 17:10:00", "2018-08-06 11:23:00"],
        "Reason":["blah1", "blah2", "blah3", "blah4", "blah5"]
       }
df = pd.DataFrame.from_dict(data)
df

        Date             Time_End          Reason
0   2018-08-05      2018-08-05 13:50:00     blah1
1   2018-08-05      2018-08-05 14:26:00     blah2
2   2018-08-05      2018-08-05 17:30:00     blah3
3   2018-08-05      2018-08-05 17:10:00     blah4
4   2018-08-06      2018-08-06 11:23:00     blah5

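I want to extract just the date from "Time_End" into a new column called "Birth_date". However, I also want to check whether the time is past 17:00. If it is, the extracted date should be incremented by one so that it becomes the next day. The desired output is shown below.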

    Date        Birth_date      Time_End            Reason
0   2018-08-05  2018-08-05  2018-08-05 13:50:00     blah1
1   2018-08-05  2018-08-05  2018-08-05 14:26:00     blah2
2   2018-08-05  2018-08-06  2018-08-05 17:30:00     blah3
3   2018-08-05  2018-08-06  2018-08-05 17:10:00     blah4
4   2018-08-06  2018-08-06  2018-08-06 11:23:00     blah5 

I came up with this, but it does not do what I expected.

df["after_17"] = df["Time_End"].dt.hour > 17
df["birth_date"] = df["after_17"].map(lambda x: df["Time_End"].dt.date if x  else df["Time_End"].dt.date + pd.DateOffset(1))

It concatenates the output and puts it all into a single row. How can I make it work properly? Other solutions are welcome too.

muz*_*zyq 5

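Use timedelta from the datetime library to add 7 hours to Time_End, then extract only the date part with dt.date. Any time at or after 17:00 is pushed past midnight (17:10 + 7 hours = 00:10 on the next day), while earlier times stay on the same date.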

import pandas as pd
from datetime import timedelta

data = {"Date": ["2018-08-05", "2018-08-05", "2018-08-05", "2018-08-05", "2018-08-06"],  
        "Time_End":["2018-08-05 13:50:00", "2018-08-05 14:26:00", "2018-08-05 17:30:00", "2018-08-05 17:10:00", "2018-08-06 11:23:00"],
        "Reason":["blah1", "blah2", "blah3", "blah4", "blah5"]
       }

# Use an explicit unit: recent pandas versions reject the unit-less 'datetime64'
df = pd.DataFrame.from_dict(data).astype({'Time_End': 'datetime64[ns]'})

td = timedelta(hours=7)

df['Birth_Date'] = (df.Time_End + td).dt.date

Output

    Date        Time_End            Reason  Birth_Date
0   2018-08-05  2018-08-05 13:50:00 blah1   2018-08-05
1   2018-08-05  2018-08-05 14:26:00 blah2   2018-08-05
2   2018-08-05  2018-08-05 17:30:00 blah3   2018-08-06
3   2018-08-05  2018-08-05 17:10:00 blah4   2018-08-06
4   2018-08-06  2018-08-06 11:23:00 blah5   2018-08-06
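For comparison, here is a minimal sketch of a variant that stays closer to the attempt in the question, using an explicit boolean mask instead of shifting the timestamps. It assumes Time_End is first converted with pd.to_datetime (the .dt accessor is not available on string columns) and that times from 17:00:00 onward should roll over, which matches the behaviour of the 7-hour shift. The name CUTOFF_HOUR is only illustrative and does not appear in the original posts.

```python
import pandas as pd

df = pd.DataFrame({
    "Time_End": ["2018-08-05 13:50:00", "2018-08-05 14:26:00", "2018-08-05 17:30:00",
                 "2018-08-05 17:10:00", "2018-08-06 11:23:00"],
})
# Convert the strings to datetimes so that the .dt accessor is available.
df["Time_End"] = pd.to_datetime(df["Time_End"])

CUTOFF_HOUR = 17  # illustrative name; rows at or after this hour roll over

# Use >= rather than >, otherwise times such as 17:10 (hour == 17) are missed.
after_cutoff = df["Time_End"].dt.hour >= CUTOFF_HOUR

# Midnight of each row's own day, computed column-wise rather than per element.
day = df["Time_End"].dt.normalize()

# Keep the same day where the mask is False, otherwise take the following day.
df["Birth_date"] = day.where(~after_cutoff, day + pd.Timedelta(days=1)).dt.date
```

The lambda in the question returns the whole df["Time_End"].dt.date Series for every element instead of a single value, which is probably why the result looked concatenated. Working on whole columns with a mask avoids that. If the cutoff changes, the shift approach above can likely be adapted the same way by adding 24 minus the cutoff hour.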