Ame*_*ina 5 python time-series pandas
Consider a Series holding timedelta64[ns] values that measure the time difference between two events, A and B:
> time_deltas
499900 -1 days +23:45:13
499916 -1 days +23:50:57
499917 00:03:27
499919 00:17:45
499920 00:16:56
499921 -1 days +23:59:26
499922 00:16:34
499923 00:15:46
499928 00:12:56
499929 00:05:54
...
499970 00:00:48
499971 -1 days +23:58:32
dtype: timedelta64[ns]
How can I identify the negative deltas (i.e. the cases where A occurred before B)?
This does not work:
> time_deltas[time_deltas<0]
TypeError: invalid type comparison
Also consider the following:
# Negative time delta example:
> time_deltas.iloc[-1]
Timedelta('-1 days +23:58:32')
# Values seem to have integer representation in ns
> time_deltas.iloc[-1].value
-88000000000
# Positive time delta example:
> time_deltas.iloc[-2]
Timedelta('0 days 00:00:48')
# Again, values seem to have integer representation in ns
> time_deltas.iloc[-2].value
48000000000
But then:
# Trying to use the internal representation fails
> time_deltas.apply(lambda x: x.value>0)
AttributeError: 'numpy.timedelta64' object has no attribute 'value'
# Same with
> time_deltas.apply(lambda x: x['value']>0)
IndexError: invalid index to scalar variable.
Compare it with pd.Timedelta(0):
In [60]: time_deltas = pd.to_timedelta(np.random.randint(-10**6, 10**6, size=10))
In [61]: time_deltas
Out[61]:
TimedeltaIndex([ '00:00:00.000809', '-1 days +23:59:59.999034',
'-1 days +23:59:59.999456', '-1 days +23:59:59.999156',
'-1 days +23:59:59.999053', '-1 days +23:59:59.999723',
'-1 days +23:59:59.999523', '00:00:00.000349',
'00:00:00.000051', '00:00:00.000774'],
dtype='timedelta64[ns]', freq=None)
In [62]: time_deltas < pd.Timedelta(0)
Out[62]: array([False, True, True, True, True, True, True, False, False, False], dtype=bool)
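The same comparison should also work on a Series like the one in the question, where the resulting boolean mask can be used to select the negative rows. The following is a minimal sketch, assuming a reasonably recent pandas; the sample values (in seconds) are made up for illustration, and the `.dt.total_seconds()` variant is an alternative not mentioned above.

```python
import pandas as pd

# Illustrative data only: a few deltas expressed in seconds.
time_deltas = pd.Series(pd.to_timedelta([-887, 207, -34, 48, -88], unit="s"))

# Compare against a zero Timedelta rather than the integer 0.
negative_mask = time_deltas < pd.Timedelta(0)

# Boolean indexing keeps only the rows where A occurred before B.
negative_deltas = time_deltas[negative_mask]
print(negative_deltas)

# Alternative: convert to float seconds first, then compare with a plain number.
negative_mask_alt = time_deltas.dt.total_seconds() < 0
```

Comparing against `pd.Timedelta(0)` keeps both sides of the comparison in the same type, which avoids the `TypeError` raised by `time_deltas < 0`.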