How do I properly set a DatetimeIndex for a pandas datetime object in a DataFrame?

use*_*387 26 python datetime pandas

I have a pandas DataFrame:

         lat         lng  alt          days        date      time
0  40.003834  116.321462  211  39745.175405  2008-10-24  04:12:35
1  40.003783  116.321431  201  39745.175463  2008-10-24  04:12:40
2  40.003690  116.321429  203  39745.175521  2008-10-24  04:12:45
3  40.003589  116.321427  194  39745.175579  2008-10-24  04:12:50
4  40.003522  116.321412  190  39745.175637  2008-10-24  04:12:55
5  40.003509  116.321484  188  39745.175694  2008-10-24  04:13:00

I am trying to convert the df['date'] and df['time'] columns into a single datetime. I can do:

df['Datetime'] = pd.to_datetime(df['date']+df['time'])
df = df.set_index(['Datetime'])
del df['date']
del df['time']

I get:

                          lat         lng  alt          days
Datetime
2008-10-2404:12:35  40.003834  116.321462  211  39745.175405
2008-10-2404:12:40  40.003783  116.321431  201  39745.175463
2008-10-2404:12:45  40.003690  116.321429  203  39745.175521
2008-10-2404:12:50  40.003589  116.321427  194  39745.175579
2008-10-2404:12:55  40.003522  116.321412  190  39745.175637

However, if I try:

df.between_time(time(1),time(22,59,59))['lng'].std()

I get an error: 'TypeError: Index must be DatetimeIndex'

So I also tried setting a DatetimeIndex explicitly:

df['Datetime'] = pd.to_datetime(df['date']+df['time'])
#df = df.set_index(['Datetime'])
df = df.set_index(pd.DatetimeIndex(df['Datetime']))
del df['date']
del df['time']

This also raises an error: 'DateParseError: unknown string format'

How do I correctly create the datetime column and the DatetimeIndex so that df.between_time() works?

Kra*_*cit 40

To simplify Kirubaharan's answer:

df['Datetime'] = pd.to_datetime(df['date'] + ' ' + df['time'])
df = df.set_index('Datetime')

And to get rid of the unwanted columns (as the OP did, though the question does not ask for it explicitly):

df = df.drop(['date','time'], axis=1)
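Putting the steps above together, here is a minimal end-to-end sketch. The sample rows are copied from the question; the assumption that `date` and `time` hold plain strings is mine, as is the helper variable `lng_std`.

```python
from datetime import time

import pandas as pd

# Sample rows copied from the question; date and time are assumed to be strings.
df = pd.DataFrame({
    "lat": [40.003834, 40.003783, 40.003690],
    "lng": [116.321462, 116.321431, 116.321429],
    "alt": [211, 201, 203],
    "date": ["2008-10-24"] * 3,
    "time": ["04:12:35", "04:12:40", "04:12:45"],
})

# Join the two columns with a space so the result is a parseable timestamp.
df["Datetime"] = pd.to_datetime(df["date"] + " " + df["time"])
df = df.set_index("Datetime").drop(["date", "time"], axis=1)

# The index is now a DatetimeIndex, so between_time() can be called.
print(type(df.index).__name__)
lng_std = df.between_time(time(1), time(22, 59, 59))["lng"].std()
print(lng_std)
```

Checking `type(df.index)` before calling `between_time()` is a quick way to confirm that the parsing step produced real timestamps and not strings.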

  • So the trick here is to add a space between the date and the time and then call `pd.to_datetime()` on the resulting string, correct? (2 upvotes)
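As a small illustration of the point raised in this comment, the sketch below compares the two concatenations on a single value taken from the question. The variable names (`date`, `time_of_day`, `no_space`, `with_space`, `parsed`) are made up for the example.

```python
import pandas as pd

# One date and one time value from the question, held as strings.
date = pd.Series(["2008-10-24"])
time_of_day = pd.Series(["04:12:35"])

# Plain concatenation fuses the day and the hour into one token.
no_space = date + time_of_day
# Adding a separator yields a conventional "date time" string.
with_space = date + " " + time_of_day

print(no_space[0])    # 2008-10-2404:12:35
print(with_space[0])  # 2008-10-24 04:12:35

# The spaced string parses into a proper timestamp.
parsed = pd.to_datetime(with_space)
print(parsed[0])      # 2008-10-24 04:12:35
```

The fused string matches the index values shown in the question's output, which suggests that the missing separator is what kept the index from becoming a DatetimeIndex.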

Kir*_*n J 29

You are not creating the datetime index correctly. Try:

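# Spell out the layout of the combined "date time" string.
format = '%Y-%m-%d %H:%M:%S'
# Note the ' ' separator between the two columns.
df['Datetime'] = pd.to_datetime(df['date'] + ' ' + df['time'], format=format)
df = df.set_index(pd.DatetimeIndex(df['Datetime']))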
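For comparison, here is a hedged sketch of how the two `set_index` forms used in the answers behave once the column has been parsed with an explicit format. The names `fmt`, `by_label`, and `by_index` are illustrative, and the two-row frame is a cut-down version of the question's data.

```python
import pandas as pd

# Cut-down version of the question's data; date and time assumed to be strings.
df = pd.DataFrame({
    "lng": [116.321462, 116.321431],
    "date": ["2008-10-24", "2008-10-24"],
    "time": ["04:12:35", "04:12:40"],
})

fmt = "%Y-%m-%d %H:%M:%S"
df["Datetime"] = pd.to_datetime(df["date"] + " " + df["time"], format=fmt)

# Passing the column label moves the column into the index.
by_label = df.set_index("Datetime")
# Passing an index object sets the index but leaves the column in place.
by_index = df.set_index(pd.DatetimeIndex(df["Datetime"]))

print(by_label.index.equals(by_index.index))
print("Datetime" in by_label.columns, "Datetime" in by_index.columns)
```

As far as I can tell, both forms produce the same DatetimeIndex. The practical difference is that the second keeps a `Datetime` column alongside the index, so the label-based form is usually the tidier choice when the column is no longer needed.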