Convert string dates to a different format in a pandas DataFrame

rac*_*ler 2 python dataframe pandas

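I have searched the community for an answer to this, but have not found one.

I have a dataframe in Python 3.5.1 that contains a column of dates stored as strings, imported from a CSV file.

The dataframe looks like this: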

                  TimeStamp  TBD  TBD     Value  TBD
0       2016/06/08 17:19:53  NaN  NaN  0.062942  NaN
1       2016/06/08 17:19:54  NaN  NaN  0.062942  NaN
2       2016/06/08 17:19:54  NaN  NaN  0.062942  NaN

What I need is to change the format of the TimeStamp column to %m/%d/%Y %H:%M:%S

                  TimeStamp  TBD  TBD     Value  TBD
0       06/08/2016 17:19:53  NaN  NaN  0.062942  NaN

So far I have found some solutions, but they work on single strings, not on a Series.

Any help would be greatly appreciated.

Thanks

unu*_*tbu 5

If you convert the column of strings to a time series, you can then use the dt.strftime method:

import numpy as np
import pandas as pd
nan = np.nan
df = pd.DataFrame({'TBD': [nan, nan, nan], 'TBD.1': [nan, nan, nan], 'TBD.2': [nan, nan, nan], 'TimeStamp': ['2016/06/08 17:19:53', '2016/06/08 17:19:54', '2016/06/08 17:19:54'], 'Value': [0.062941999999999998, 0.062941999999999998, 0.062941999999999998]})
df['TimeStamp'] = pd.to_datetime(df['TimeStamp']).dt.strftime('%m/%d/%Y %H:%M:%S')
print(df)

yields

   TBD  TBD.1  TBD.2            TimeStamp     Value
0  NaN    NaN    NaN  06/08/2016 17:19:53  0.062942
1  NaN    NaN    NaN  06/08/2016 17:19:54  0.062942
2  NaN    NaN    NaN  06/08/2016 17:19:54  0.062942
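As a variation on the above, here is a minimal sketch that passes an explicit `format` to `pd.to_datetime` instead of relying on format inference. It assumes every input string follows `%Y/%m/%d %H:%M:%S`, as in the sample data; the variable name `parsed` is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'TimeStamp': ['2016/06/08 17:19:53',
                                 '2016/06/08 17:19:54']})

# Parse with an explicit input format (assumed to match every row),
# then render each timestamp in the desired output format.
parsed = pd.to_datetime(df['TimeStamp'], format='%Y/%m/%d %H:%M:%S')
df['TimeStamp'] = parsed.dt.strftime('%m/%d/%Y %H:%M:%S')
print(df)
```

Stating the format removes any ambiguity about which field is the month and which is the day, and a string that does not match raises an error rather than being guessed at.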

Since you want to convert a column of strings into another column of (different) strings, you could also use the vectorized str.replace method:

import numpy as np
import pandas as pd
nan = np.nan
df = pd.DataFrame({'TBD': [nan, nan, nan], 'TBD.1': [nan, nan, nan], 'TBD.2': [nan, nan, nan], 'TimeStamp': ['2016/06/08 17:19:53', '2016/06/08 17:19:54', '2016/06/08 17:19:54'], 'Value': [0.062941999999999998, 0.062941999999999998, 0.062941999999999998]})
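# regex=True is required in pandas 2.0+, where str.replace defaults to literal matching
df['TimeStamp'] = df['TimeStamp'].str.replace(r'(\d+)/(\d+)/(\d+)(.*)', r'\2/\3/\1\4', regex=True)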
print(df)

since

In [32]: df['TimeStamp'].str.replace(r'(\d+)/(\d+)/(\d+)(.*)', r'\2/\3/\1\4', regex=True)
Out[32]: 
0    06/08/2016 17:19:53
1    06/08/2016 17:19:54
2    06/08/2016 17:19:54
Name: TimeStamp, dtype: object

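This uses a regular expression to rearrange the pieces of the string without first parsing the string as a date. It is faster than the first method (mainly because it skips the parsing step), but it has the disadvantage of not checking that the date strings are valid dates.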
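To make the validation trade-off concrete, here is a small sketch comparing the two approaches on a deliberately invalid date. The second input string (month 13, day 45) is made up for illustration, and `errors='coerce'` is just one possible way to surface bad rows.

```python
import pandas as pd

# The second entry is not a real date (month 13, day 45).
s = pd.Series(['2016/06/08 17:19:53', '2016/13/45 17:19:54'])

# Regex rearrangement: the invalid date passes through unnoticed.
rearranged = s.str.replace(r'(\d+)/(\d+)/(\d+)(.*)', r'\2/\3/\1\4', regex=True)

# Parsing: errors='coerce' turns unparseable strings into NaT,
# which strftime then leaves as a missing value.
parsed = pd.to_datetime(s, format='%Y/%m/%d %H:%M:%S', errors='coerce')
formatted = parsed.dt.strftime('%m/%d/%Y %H:%M:%S')

print(rearranged)
print(formatted)
```

The regex version silently produces `13/45/2016 17:19:54`, while the parsed version leaves a missing value in that row, so `parsed.isna()` can be used to find the offending inputs.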