Convert year and month name to a datetime column in a pandas DataFrame

use*_*827 4 python pandas

How can I convert the year and month name into a datetime column for this DataFrame:

    region  year    Months
0  alabama  2018   January
1  alabama  2018  February
2  alabama  2018     March
3  alabama  2018     April
4  alabama  2018       May

When I do this:

pd.to_datetime(df_sub['year'] * 10000 + df_sub['Months'] * 100, format='%Y%m')

I get this error:

*** TypeError: unsupported operand type(s) for +: 'int' and 'str'

jez*_*ael 11

You can cast the year column to strings, append Months, and pass the format parameter to to_datetime (format codes are listed at http://strftime.org/):

print(pd.to_datetime(df_sub['year'].astype(str) + df_sub['Months'], format='%Y%B'))
0   2018-01-01
1   2018-02-01
2   2018-03-01
3   2018-04-01
4   2018-05-01
dtype: datetime64[ns]
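For reference, the original attempt fails because Months holds strings, so the arithmetic ends up adding an int to a str. Below is a minimal, self-contained sketch of the string-concatenation approach that also stores the result as a new column. The frame is rebuilt by hand from the question's sample, the column name date is my own choice, and %B matching the month names assumes an English locale.

```python
import pandas as pd

# Hypothetical reconstruction of the question's frame (named df_sub as in the question).
df_sub = pd.DataFrame({
    "region": ["alabama"] * 5,
    "year": [2018] * 5,
    "Months": ["January", "February", "March", "April", "May"],
})

# Cast year to str so "+" concatenates strings, giving e.g. "2018January".
# %Y consumes the four-digit year and %B the full month name.
# "date" is an assumed column name, not taken from the question.
df_sub["date"] = pd.to_datetime(
    df_sub["year"].astype(str) + df_sub["Months"], format="%Y%B"
)
print(df_sub)
```

Since no day is given, each parsed value falls on the first day of its month.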


piR*_*red 5

f-strings in a comprehension (Python 3.6+)

pd.to_datetime([f'{y}-{m}-01' for y, m in zip(df.year, df.Months)])

DatetimeIndex(['2018-01-01', '2018-02-01', '2018-03-01', '2018-04-01',
               '2018-05-01'],
              dtype='datetime64[ns]', freq=None)

str.format

pd.to_datetime(['{}-{}-01'.format(y, m) for y, m in zip(df.year, df.Months)])

DatetimeIndex(['2018-01-01', '2018-02-01', '2018-03-01', '2018-04-01',
               '2018-05-01'],
              dtype='datetime64[ns]', freq=None)
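Both variants return a DatetimeIndex rather than a column. A rough sketch of attaching that result to the frame follows. The sample frame and the date column name are assumptions for illustration, and the explicit format argument is an addition of mine (the calls above leave format inference to pandas); %B again assumes English month names.

```python
import pandas as pd

# Hypothetical frame mirroring the question's data (named df as in this answer).
df = pd.DataFrame({
    "year": [2018, 2018, 2018],
    "Months": ["January", "February", "March"],
})

# Build strings such as "2018-January-01" and parse them with an explicit format.
dates = pd.to_datetime(
    [f"{y}-{m}-01" for y, m in zip(df["year"], df["Months"])],
    format="%Y-%B-%d",
)

# to_datetime on a list returns a DatetimeIndex; assigning it to a column
# stores the values row by row. "date" is an assumed column name.
df["date"] = dates
print(df)
```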