Mad*_*dan 10 python date pandas
One of my columns contains the following data:
df['DOB']
0 01-01-84
1 31-07-85
2 24-08-85
3 30-12-93
4 09-12-77
5 08-09-90
6 01-06-88
7 04-10-89
8 15-11-91
9 01-06-68
Name: DOB, dtype: object
I want to convert it to a datetime column. I tried the following:
print(pd.to_datetime(df['DOB']))
0 1984-01-01
1 1985-07-31
2 1985-08-24
3 1993-12-30
4 1977-09-12
5 1990-08-09
6 1988-01-06
7 1989-04-10
8 1991-11-15
9 2068-01-06
Name: DOB, dtype: datetime64[ns]
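How can I get the date as 1968-01-06 instead of 2068-01-06?
You can convert to datetime first and then, where the year is greater than or equal to 2020, subtract 100 years using a DateOffset: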
df['DOB'] = pd.to_datetime(df['DOB'], format='%d-%m-%y')
# %y maps 68 to 2068, so shift any year >= 2020 back by one century
df.loc[df['DOB'].dt.year >= 2020, 'DOB'] -= pd.DateOffset(years=100)
# equivalent to:
# mask = df['DOB'].dt.year >= 2020
# df.loc[mask, 'DOB'] = df.loc[mask, 'DOB'] - pd.DateOffset(years=100)
print(df)
DOB
0 1984-01-01
1 1985-07-31
2 1985-08-24
3 1993-12-30
4 1977-12-09
5 1990-09-08
6 1988-06-01
7 1989-10-04
8 1991-11-15
9 1968-06-01
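As background on why 68 comes back as 2068 in the first place: Python's `%y` directive follows the POSIX convention for two-digit years, and the output in the question suggests pandas behaves the same way. A minimal sketch using only the standard library (the two sample strings are chosen for illustration):

```python
from datetime import datetime

# Python's %y follows the POSIX convention for two-digit years:
# 00-68 are read as 2000-2068, 69-99 as 1969-1999.
parsed = {text: datetime.strptime(text, "%d-%m-%y").date()
          for text in ("01-06-68", "01-06-69")}
for text, value in parsed.items():
    print(text, "->", value)
```

This is why only the last row of the sample data needs correcting: it is the only one whose two-digit year falls below 69.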
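Alternatively, you can prepend 19 or 20 to the year with Series.str.replace and pick between the two results with numpy.where based on a condition.
Note: this solution also works for two-digit years from 00 (2000) up to 20 (2020).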
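import numpy as np
# regex=True is required on pandas >= 2.0, where str.replace matches literally by default
s1 = df['DOB'].str.replace(r'-(\d+)$', r'-19\1', regex=True)
s2 = df['DOB'].str.replace(r'-(\d+)$', r'-20\1', regex=True)
# two-digit years 00-20 belong to the 2000s, everything else to the 1900s
mask = df['DOB'].str[-2:].astype(int) <= 20
# explicit day-first format so that 09-12-77 is read as 9 December, not 12 September
df['DOB'] = pd.to_datetime(np.where(mask, s2, s1), format='%d-%m-%Y')
print(df)
DOB
0 1984-01-01
1 1985-07-31
2 1985-08-24
3 1993-12-30
4 1977-12-09
5 1990-09-08
6 1988-06-01
7 1989-10-04
8 1991-11-15
9 1968-06-01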
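Both approaches above hard-code the cutoff (2020, or 20 for the two-digit year). Since a date of birth cannot lie in the future, one possible variation is to derive the cutoff from a reference date instead. The following is only a sketch under that assumption; the function name `parse_two_digit_year_dates`, its `latest` parameter, and the sample values are hypothetical and not part of the original answers:

```python
import pandas as pd

def parse_two_digit_year_dates(values, fmt="%d-%m-%y", latest=None):
    """Parse dates with two-digit years; move any date after `latest` back 100 years."""
    # Assumption: `latest` defaults to today, since a date of birth cannot be in the future.
    latest = pd.Timestamp.today().normalize() if latest is None else pd.Timestamp(latest)
    parsed = pd.to_datetime(values, format=fmt)
    # Series.where keeps entries where the condition holds and replaces the rest.
    return parsed.where(parsed <= latest, parsed - pd.DateOffset(years=100))

dob = pd.Series(["01-01-84", "09-12-77", "01-06-68", "15-03-05"], name="DOB")
fixed = parse_two_digit_year_dates(dob, latest="2020-12-31")
print(fixed)
```

With this, a value such as 15-03-05 stays in 2005, while 01-06-68 is moved from 2068 to 1968. Anyone older than 100 at the reference date would still be misdated, which no two-digit year scheme can avoid.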
If all years are before 2000:
s1 = df['DOB'].str.replace(r'-(\d+)$', r'-19\1', regex=True)
df['DOB'] = pd.to_datetime(s1, format='%d-%m-%Y')
print(df)
DOB
0 1984-01-01
1 1985-07-31
2 1985-08-24
3 1993-12-30
4 1977-12-09
5 1990-09-08
6 1988-06-01
7 1989-10-04
8 1991-11-15
9 1968-06-01
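To make the pattern in the `str.replace` calls easier to follow: `-(\d+)$` matches the final hyphen plus the trailing digits, and `\1` in the replacement puts those digits back after the inserted century. A small illustration with the standard `re` module on a single string (the sample value is taken from the data above):

```python
import re

# "-(\d+)$" captures the digits after the last hyphen (the two-digit year);
# the replacement re-emits the hyphen, the century "19", then the captured digits.
pattern = r"-(\d+)$"
expanded = re.sub(pattern, r"-19\1", "09-12-77")
print(expanded)
```

Because the pattern is anchored with `$`, only the year at the end of the string is touched, never the day or month.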
In this particular case, I would use something like this:
pd.to_datetime(df['DOB'].str[:-2] + '19' + df['DOB'].str[-2:], format='%d-%m-%Y')
Note that this will break if you have any DOB after 1999!
Output:
0 1984-01-01
1 1985-07-31
2 1985-08-24
3 1993-12-30
4 1977-12-09
5 1990-09-08
6 1988-06-01
7 1989-10-04
8 1991-11-15
9 1968-06-01
dtype: datetime64[ns]