Chr*_*ris 9 python datetime pandas
In the spirit of this answer, I tried the following to convert a DataFrame column of datetimes into a column of seconds since the epoch.
df['date'] = (df['date']+datetime.timedelta(hours=2)-datetime.datetime(1970,1,1))
df['date'].map(lambda td:td.total_seconds())
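The second command raises the following error, which I don't understand. Any ideas about what might be going on here? I replaced map with apply, and that didn't help.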
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-99-7123e823f995> in <module>()
----> 1 df['date'].map(lambda td:td.total_seconds())
/Users/cpd/.virtualenvs/py27-ipython+pandas/lib/python2.7/site-packages/pandas-0.12.0_937_gb55c790-py2.7-macosx-10.8-x86_64.egg/pandas/core/series.pyc in map(self, arg, na_action)
1932 return self._constructor(new_values, index=self.index).__finalize__(self)
1933 else:
-> 1934 mapped = map_f(values, arg)
1935 return self._constructor(mapped, index=self.index).__finalize__(self)
1936
/Users/cpd/.virtualenvs/py27-ipython+pandas/lib/python2.7/site-packages/pandas-0.12.0_937_gb55c790-py2.7-macosx-10.8-x86_64.egg/pandas/lib.so in pandas.lib.map_infer (pandas/lib.c:43628)()
<ipython-input-99-7123e823f995> in <lambda>(td)
----> 1 df['date'].map(lambda td:td.total_seconds())
AttributeError: 'float' object has no attribute 'total_seconds'
Jef*_*eff 15
Update:
As of 0.15.0, Timedeltas became a full-fledged dtype.
So this becomes possible (as well as the methods below):
In [45]: s = Series(pd.timedelta_range('1 day',freq='1S',periods=5))
In [46]: s.dt.components
Out[46]:
   days  hours  minutes  seconds  milliseconds  microseconds  nanoseconds
0     1      0        0        0             0             0            0
1     1      0        0        1             0             0            0
2     1      0        0        2             0             0            0
3     1      0        0        3             0             0            0
4     1      0        0        4             0             0            0
In [47]: s.astype('timedelta64[s]')
Out[47]:
0 86400
1 86401
2 86402
3 86403
4 86404
dtype: float64
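For comparison, later pandas versions also expose `total_seconds()` through the `.dt` accessor on a timedelta Series. This was not part of the session above, so treat the following as a minimal sketch; the sample values are rebuilt with `pd.to_timedelta` purely for illustration.

```python
import pandas as pd

# Five timedeltas, one day plus 0..4 seconds (same values as the session above,
# built here from raw second counts; this construction is an assumption).
s = pd.Series(pd.to_timedelta([86400, 86401, 86402, 86403, 86404], unit="s"))

# .dt.total_seconds() returns the length of each timedelta as float seconds.
seconds = s.dt.total_seconds()
print(seconds.tolist())
```

This gives float seconds directly, without dividing by a unit.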
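Original answer:
I see that you are on master (and 0.13 is coming out very shortly), so assuming you have numpy >= 1.7, do this. See here for the docs (this is frequency conversion).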
In [5]: df = DataFrame(dict(date = date_range('20130101',periods=10)))
In [6]: df
Out[6]:
date
0 2013-01-01 00:00:00
1 2013-01-02 00:00:00
2 2013-01-03 00:00:00
3 2013-01-04 00:00:00
4 2013-01-05 00:00:00
5 2013-01-06 00:00:00
6 2013-01-07 00:00:00
7 2013-01-08 00:00:00
8 2013-01-09 00:00:00
9 2013-01-10 00:00:00
In [7]: df['date']+timedelta(hours=2)-datetime.datetime(1970,1,1)
Out[7]:
0 15706 days, 02:00:00
1 15707 days, 02:00:00
2 15708 days, 02:00:00
3 15709 days, 02:00:00
4 15710 days, 02:00:00
5 15711 days, 02:00:00
6 15712 days, 02:00:00
7 15713 days, 02:00:00
8 15714 days, 02:00:00
9 15715 days, 02:00:00
Name: date, dtype: timedelta64[ns]
In [9]: (df['date']+timedelta(hours=2)-datetime.datetime(1970,1,1)) / np.timedelta64(1,'s')
Out[9]:
0 1357005600
1 1357092000
2 1357178400
3 1357264800
4 1357351200
5 1357437600
6 1357524000
7 1357610400
8 1357696800
9 1357783200
Name: date, dtype: float64
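The session above can be condensed into a standalone script. This is a sketch with explicit imports; `pd.Timedelta` and `pd.Timestamp` stand in for the `timedelta` and `datetime.datetime` objects used in the session, and the frame is shortened to three rows.

```python
import numpy as np
import pandas as pd

# Small sample frame: three consecutive days starting 2013-01-01.
df = pd.DataFrame({"date": pd.date_range("20130101", periods=3)})

# Add the 2-hour offset from the question, then subtract the epoch.
# The result is a timedelta64[ns] Series.
delta = df["date"] + pd.Timedelta(hours=2) - pd.Timestamp("1970-01-01")

# Dividing by a one-second timedelta converts each element to float seconds.
epoch_seconds = delta / np.timedelta64(1, "s")
print(epoch_seconds.tolist())
```

The division works on the whole Series at once, so no `map` or `apply` is needed.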
The values are held as np.timedelta64[ns] objects, which don't have the same methods as timedelta objects, so there is no total_seconds().
In [10]: s = (df['date']+timedelta(hours=2)-datetime.datetime(1970,1,1))
In [11]: s[0]
Out[11]: numpy.timedelta64(1357005600000000000,'ns')
You can astype them to int, and you get back the value in ns units.
In [12]: s[0].astype(int)
Out[12]: 1357005600000000000
You can do this as well (but only on an individual element).
In [18]: s[0].astype('timedelta64[s]')
Out[18]: numpy.timedelta64(1357005600,'s')
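The scalar conversions above can be reproduced with NumPy alone. This is a minimal sketch that builds the scalar directly instead of pulling it from a Series, and uses an explicit `int64` rather than `int` so the integer width does not depend on the platform (an assumption added here, not part of the original session).

```python
import numpy as np

# The same scalar value as s[0] in the session above.
td = np.timedelta64(1357005600000000000, "ns")

# NumPy timedelta scalars do not provide datetime.timedelta's total_seconds().
has_method = hasattr(td, "total_seconds")

# Integer count of nanoseconds.
as_ns = td.astype("int64")

# Float seconds, obtained by dividing by a one-second unit.
as_seconds = td / np.timedelta64(1, "s")

print(has_method, as_ns, as_seconds)
```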