Convert a pandas DatetimeIndex to Unix time?

Chr*_*ier 49 python pandas

What is the idiomatic way to convert a pandas DatetimeIndex to (an iterable of) Unix time? This is probably not the way to go:

[time.mktime(t.timetuple()) for t in my_data_frame.index.to_pydatetime()]

roo*_*oot 94

Since a `DatetimeIndex` is an `ndarray` under the hood, you can do the conversion without a list comprehension, which is much faster.

In [1]: import numpy as np

In [2]: import pandas as pd

In [3]: from datetime import datetime

In [4]: dates = [datetime(2012, 5, 1), datetime(2012, 5, 2), datetime(2012, 5, 3)]
   ...: index = pd.DatetimeIndex(dates)
   ...: 
In [5]: index.astype(np.int64)
Out[5]: array([1335830400000000000, 1335916800000000000, 1336003200000000000], 
        dtype=int64)

In [6]: index.astype(np.int64) // 10**9
Out[6]: array([1335830400, 1335916800, 1336003200], dtype=int64)

%timeit [t.value // 10 ** 9 for t in index]
10000 loops, best of 3: 119 us per loop

%timeit index.astype(np.int64) // 10**9
100000 loops, best of 3: 18.4 us per loop

  • In case it isn't clear, `index.astype(np.int64)` returns the time in nanoseconds, not seconds. (2 upvotes)
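The nanosecond point in the comment above can be sidestepped by dividing a time difference by one second instead of dividing raw integers by `10**9`. The following is only a sketch, not the answerer's method: the helper name `to_unix_seconds` is hypothetical, and it assumes a tz-naive index whose values are meant as UTC.

```python
import numpy as np
import pandas as pd


def to_unix_seconds(index: pd.DatetimeIndex) -> np.ndarray:
    """Whole seconds since 1970-01-01 for a tz-naive DatetimeIndex.

    Hypothetical helper: assumes the naive values represent UTC.
    """
    epoch = pd.Timestamp("1970-01-01")
    # Subtracting the epoch gives a TimedeltaIndex; floor-dividing by a
    # one-second Timedelta yields integer seconds without having to know
    # which unit the index stores internally.
    return np.asarray((index - epoch) // pd.Timedelta("1s"), dtype=np.int64)


index = pd.DatetimeIndex(["2012-05-01", "2012-05-02", "2012-05-03"])
unix_seconds = to_unix_seconds(index)
print(unix_seconds.tolist())
```

For the three dates used in the answer this should give the same values as `index.astype(np.int64) // 10**9`.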

And*_*den 42

Note: a Timestamp's `value` is just Unix time in nanoseconds (so divide it by 10**9):

[t.value // 10 ** 9 for t in tsframe.index]

For example:

In [1]: t = pd.Timestamp('2000-02-11 00:00:00')

In [2]: t
Out[2]: <Timestamp: 2000-02-11 00:00:00>

In [3]: t.value
Out[3]: 950227200000000000L

In [4]: time.mktime(t.timetuple())
Out[4]: 950227200.0
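One caveat about the session above: `time.mktime` interprets its argument as local time, so it agrees with `t.value` only when the machine's time zone is UTC. A minimal sketch of a time-zone-independent cross-check, assuming the naive timestamp is meant as UTC, uses `calendar.timegm` from the standard library instead (this substitution is not part of the original answer):

```python
import calendar

import pandas as pd

t = pd.Timestamp("2000-02-11 00:00:00")

# Timestamp.value is nanoseconds since the Unix epoch.
from_value = t.value // 10**9

# calendar.timegm treats the time tuple as UTC, unlike time.mktime,
# which would apply the local time zone.
from_timegm = calendar.timegm(t.timetuple())

print(from_value, from_timegm)
```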

As @root points out, it is faster to extract the array of values directly:

tsframe.index.astype(np.int64) // 10 ** 9


Ran*_*ani 8

Summary of the other answers:

df['<time_col>'].astype(np.int64) // 10**9

If you want to keep the milliseconds, divide by 10**6 instead.
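A small sketch of the millisecond variant follows. It divides by a one-millisecond `Timedelta` rather than by `10**6`, which avoids depending on the values being stored as nanoseconds; the helper name `to_unix_millis` is hypothetical and the index is assumed to be tz-naive UTC.

```python
import numpy as np
import pandas as pd


def to_unix_millis(index: pd.DatetimeIndex) -> np.ndarray:
    """Whole milliseconds since 1970-01-01 for a tz-naive DatetimeIndex.

    Hypothetical helper: assumes the naive values represent UTC.
    """
    epoch = pd.Timestamp("1970-01-01")
    # Floor division by one millisecond keeps sub-second precision
    # down to the millisecond and drops anything finer.
    return np.asarray((index - epoch) // pd.Timedelta("1ms"), dtype=np.int64)


index = pd.DatetimeIndex(["2012-05-01 00:00:00.250", "2012-05-02 12:30:00.005"])
unix_millis = to_unix_millis(index)
print(unix_millis.tolist())
```

The same pattern applies to a DataFrame column of datetimes, for example `(df['<time_col>'] - pd.Timestamp("1970-01-01")) // pd.Timedelta("1ms")`.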