Rod*_*phe 2 python regex datetime python-3.x
I am trying to extract date information from a string. The string may look like this:
I want to extract:
I started with something like this:
publishedWhen = '1 year 1 month and 1 days and 1 hour'
y,m,d,h = 0,0,0,0
if 'day ' in publishedWhen:
    d = int(publishedWhen.split(' day ')[0])
if 'days ' in publishedWhen:
    d = int(publishedWhen.split(' days ')[0])
if 'days ' not in publishedWhen and 'day ' not in publishedWhen:
    d = 0
if 'month ' in publishedWhen:
    m = int(publishedWhen.split(' month ')[0])
    d = int(publishedWhen.replace(publishedWhen.split(' month ')[0] + ' month ','').replace('and','').replace('days','').replace('day',''))
if 'months ' in publishedWhen:
    m = int(publishedWhen.split(' months ')[0])
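However, I know this code is buggy (some cases are probably not handled), and a regular expression might produce something cleaner and more efficient. Is that true? Which regular expression would help me extract all of this information?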
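You don't have to use re\gular expres{2}ions? and can instead look into the very rich library of third-party packages on the Python Package Index.
For example, you could combine dateparser, for parsing human-readable dates, with dateutil, for its relativedelta objects: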
from datetime import datetime

import dateparser
from dateutil.relativedelta import relativedelta


# Fixed reference point so that every delta is computed against the same date
BASE_DATE = datetime(2018, 1, 1)


def get_relative_date(date_string):
    # Parse the human-readable string relative to BASE_DATE
    parsed_date = dateparser.parse(date_string, settings={"RELATIVE_BASE": BASE_DATE})
    return relativedelta(parsed_date, BASE_DATE)


date_strings = [
    "5 months and 17 hours",
    "1 month and 19 days",
    "3 months and 1 day",
    "2 years 1 month and 2 days",
    "1 year 1 month and 1 days and 1 hour"
]

for date_string in date_strings:
    delta = get_relative_date(date_string)
    # abs() drops the sign of each component, whichever side of BASE_DATE the parsed date falls on
    print(f"y={abs(delta.years)} m={abs(delta.months)} d={abs(delta.days)} h={abs(delta.hours)}")
Prints:
y=0 m=5 d=0 h=17
y=0 m=1 d=19 h=0
y=0 m=3 d=1 h=0
y=2 m=1 d=2 h=0
y=1 m=1 d=1 h=1
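I don't particularly like having to use a base date to compute the delta, and I'm pretty sure there is a package that can parse directly into a delta object. Open to any suggestions.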
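If the base date is the sticking point, the plain regular-expression route the question asks about can be sketched with the standard library alone. This is only a minimal sketch, not part of the approach above: the names `UNIT_PATTERN` and `parse_duration` are hypothetical, and it assumes the input only ever uses whole numbers followed by the English words year, month, day or hour (singular or plural).

```python
import re

# Hypothetical pattern: a whole number, optional whitespace, then a unit word
# with an optional plural "s". Assumes only these four units ever appear.
UNIT_PATTERN = re.compile(r"(\d+)\s*(year|month|day|hour)s?\b", re.IGNORECASE)


def parse_duration(text):
    """Return a dict with keys y, m, d, h; units missing from the text stay 0."""
    result = {"y": 0, "m": 0, "d": 0, "h": 0}
    for amount, unit in UNIT_PATTERN.findall(text):
        # The first letter of the unit word (y, m, d, h) is the result key.
        result[unit[0].lower()] += int(amount)
    return result


print(parse_duration("1 year 1 month and 1 days and 1 hour"))
print(parse_duration("5 months and 17 hours"))
```

Because no base date is involved, the counts come straight from the text. If a delta object is still wanted, the four values can be passed to `relativedelta(years=..., months=..., days=..., hours=...)` from dateutil.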