Suppose I have loaded time series data from SQL or CSV (not created in Python). The index would be:
DatetimeIndex(['2015-03-02 00:00:00', '2015-03-02 01:00:00',
'2015-03-02 02:00:00', '2015-03-02 03:00:00',
'2015-03-02 04:00:00', '2015-03-02 05:00:00',
'2015-03-02 06:00:00', '2015-03-02 07:00:00',
'2015-03-02 08:00:00', '2015-03-02 09:00:00',
...
'2015-07-19 14:00:00', '2015-07-19 15:00:00',
'2015-07-19 16:00:00', '2015-07-19 17:00:00',
'2015-07-19 18:00:00', '2015-07-19 19:00:00',
'2015-07-19 20:00:00', '2015-07-19 21:00:00',
'2015-07-19 22:00:00', '2015-07-19 23:00:00'],
dtype='datetime64[ns]', name=u'hour', length=3360, freq=None, tz=None)
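As you can see, 'freq' is None. I would like to know how to detect the frequency of this series and set 'freq' to that frequency.
If possible, I would like this to work even when the data is not continuous (there are many gaps in the series).
I tried to find the mode of all the differences between consecutive timestamps, but I don't know how to convert it into a format the series can read.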
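The mode-of-differences idea can be carried through as sketched below. This is only one possible approach, under the assumption that every timestamp sits on a regular grid and the gaps are whole multiples of the step. The helper name `infer_step` and the sample data are hypothetical. For an index with no gaps, `pd.infer_freq(index)` is the more direct route, but it returns None when the spacing is irregular.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset


def infer_step(index):
    """Return the most common gap between consecutive timestamps as an offset."""
    # Differences between neighbouring timestamps (a TimedeltaIndex).
    deltas = pd.Series(index[1:] - index[:-1])
    # The mode is the step that occurs most often; gaps show up as rarer, larger deltas.
    step = deltas.mode().iloc[0]
    # Convert the Timedelta into a DateOffset that pandas accepts as a frequency.
    return to_offset(step)


# Hypothetical data: 48 hourly stamps with a few removed to simulate breaks.
full = pd.date_range("2015-03-02", periods=48, freq=pd.Timedelta(hours=1))
index = full.delete([5, 6, 7, 20])
series = pd.Series(range(len(index)), index=index, dtype="float64")

offset = infer_step(index)

# asfreq rebuilds a regular index at that step; missing stamps become NaN rows,
# and the resulting index carries the frequency.
regular = series.asfreq(offset)
print(regular.index.freq)
```

Note that `asfreq` keeps only values whose timestamps fall exactly on the regular grid, so off-grid observations would be dropped. If that is a concern, resampling may be a better fit.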
I tried to set some values on a Series, but they are automatically rounded to integers. What should I do to prevent this?
from __future__ import division
import pandas as pd
In [100]: series = pd.Series(range(20))
In [101]: series[10]
Out[101]: 10
In [102]: series[10] = 0.05
In [103]: series[10]
Out[103]: 0
In [104]: series[10] = 2.5
In [105]: series[10]
Out[105]: 2
In [106]: series[10] = float(2.5)
In [107]: series[10]
Out[107]: 2
In [108]: float(2/3)
Out[108]: 0.6666666666666666
In [109]: series[10] = float(2/3)
In [110]: series[10]
Out[110]: 0
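The truncation seen above is consistent with the Series having an integer dtype, since `pd.Series(range(20))` holds integers. What happens when a float is assigned into an integer Series has varied across pandas versions, so a minimal sketch that avoids relying on it is to convert the Series to a float dtype before assigning:

```python
import pandas as pd

series = pd.Series(range(20))
print(series.dtype)  # an integer dtype, which cannot hold fractional values

# Convert explicitly so that fractional values can be stored.
series = series.astype("float64")

series[10] = 0.05
series[11] = 2 / 3

print(series[10], series[11])
```

Alternatively, the dtype can be given at construction time, for example `pd.Series(range(20), dtype="float64")`.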
I learned about inf from: How can I represent an infinite number in Python?
I would like to find the minimum of an array that contains 'inf'.
In [330]: residuals
Out[330]:
array([[ 2272.35651718, 1387.71126686, 1115.48728484],
[ 695.08009848, inf, inf],
[ 601.44997093, inf, inf]])
In [331]: min(residuals)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-331-69b09e4201cf> in <module>()
----> 1 min(residuals)
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
However, it seems that 'inf' is ambiguous? Is there a clever way to find the minimum without running a loop over every value in this array?
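The error in the traceback does not come from 'inf' itself. The built-in `min` iterates over the rows of the 2-D array and compares whole rows with each other, which yields an array of booleans rather than a single truth value. A minimal sketch using NumPy's own reductions, with the values copied from the array above:

```python
import numpy as np

residuals = np.array([
    [2272.35651718, 1387.71126686, 1115.48728484],
    [695.08009848, np.inf, np.inf],
    [601.44997093, np.inf, np.inf],
])

# Reduce over all elements at once; inf compares larger than any finite value.
smallest = residuals.min()

# Position of the minimum as a (row, column) pair.
row, col = np.unravel_index(residuals.argmin(), residuals.shape)

# If inf entries should be excluded from a reduction (for example a maximum),
# mask them out with np.isfinite first.
finite_max = residuals[np.isfinite(residuals)].max()

print(smallest, (row, col), finite_max)
```

`np.min(residuals)` is equivalent to `residuals.min()`, and passing `axis=0` or `axis=1` gives per-column or per-row minima instead.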