Check whether a date falls in the first or second half of the month in pandas

Rah*_*rma 5 python pandas

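I have a DataFrame with a date column, but the dates are stored as strings. How can I check whether each date falls in the first or second half of the month, and add another column containing the billing date?

For example:

If the date is 08-10-2020 (format dd-mm-yyyy), the billing date column should contain the 16th of the same month; if the date falls between the 17th and the 31st, the billing date should be the 1st of the following month.

Data: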

print(df['dispatch_date'].head())

0    01-10-2020
1    07-10-2020
2    17-10-2020
3    16-10-2020
4    09-10-2020
Name: dispatch_date, dtype: object

Sample output:

  dispatch_date billing date
0    01-10-2020   16-10-2020
1    07-10-2020   16-10-2020
2    17-10-2020   01-11-2020
3    16-10-2020   01-11-2020
4    09-10-2020   16-10-2020

sai*_*sai 2

You can do this with apply, as follows:

import pandas as pd
import datetime as dt

dates = ['01-10-2020', '07-10-2020', '17-10-2020', '15-12-2020', '19-12-2020']
df = pd.DataFrame(data=dates, columns=['dates'])

# if the billing date can stay a string going forward
# (zfill(2) keeps the incremented month zero-padded, e.g. '02' rather than '2')
print(df.dates.apply(lambda x: '16'+x[2:] if int(x[:2]) < 16 else '01-'+str(int(x[3:5])+1).zfill(2)+x[5:] if int(x[3:5]) != 12 else '01-'+'01-'+str(int(x[6:])+1)))
df['billing_date'] = df.dates.apply(lambda x: '16'+x[2:] if int(x[:2]) < 16 else '01-'+str(int(x[3:5])+1).zfill(2)+x[5:] if int(x[3:5]) != 12 else '01-'+'01-'+str(int(x[6:])+1))

# if the billing date series is needed as datetime.date objects
print(df.dates.apply(lambda x: dt.date(int(x[-4:]), int(x[3:5]), 16) if int(x[:2]) < 16 else dt.date(int(x[-4:]), int(x[3:5])+1, 1) if int(x[3:5]) != 12 else dt.date(int(x[-4:])+1, 1, 1)))
df['billing_date'] = df.dates.apply(lambda x: dt.date(int(x[-4:]), int(x[3:5]), 16) if int(x[:2]) < 16 else dt.date(int(x[-4:]), int(x[3:5])+1, 1) if int(x[3:5]) != 12 else dt.date(int(x[-4:])+1, 1, 1))
Output:
0    16-10-2020
1    16-10-2020
2    01-11-2020
3    16-12-2020
4    01-01-2021
Name: dates, dtype: object

0    2020-10-16
1    2020-10-16
2    2020-11-01
3    2020-12-16
4    2021-01-01
Name: dates, dtype: object

Edit: the code now handles the edge case that can occur in December.
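
As a possible alternative to slicing the strings by hand, here is a minimal sketch that parses the column with pd.to_datetime and lets pandas do the month arithmetic, so the December rollover and zero-padding need no special handling. It assumes the same cutoff as the code above (days 1-15 bill on the 16th, day 16 onward bills on the 1st of the next month); the intermediate names parsed, month, same_month_16th and next_month_1st are illustrative only.

```python
import pandas as pd

dates = ['01-10-2020', '07-10-2020', '17-10-2020', '15-12-2020', '19-12-2020']
df = pd.DataFrame(data=dates, columns=['dates'])

# Parse the dd-mm-yyyy strings into real timestamps.
parsed = pd.to_datetime(df['dates'], format='%d-%m-%Y')

# Monthly period of each date; period arithmetic rolls the year over for us.
month = parsed.dt.to_period('M')

# Candidate billing dates: the 16th of the same month, or the 1st of the next month.
same_month_16th = month.dt.to_timestamp() + pd.Timedelta(days=15)
next_month_1st = (month + 1).dt.to_timestamp()

# Assumed cutoff: days 1-15 keep the 16th, everything later moves to next month.
billing = same_month_16th.where(parsed.dt.day < 16, next_month_1st)

# Format back to dd-mm-yyyy strings; skip strftime to keep datetime64 values instead.
df['billing_date'] = billing.dt.strftime('%d-%m-%Y')
print(df)
```

If the cutoff should instead be the 17th, as the wording of the question suggests, only the comparison in the where call needs to change.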