Using the R package TSdist from Python

Jin*_*ang 7 python r time-series rpy2 jupyter-notebook

I am trying to use the R package TSdist from a Python Jupyter notebook.

import rpy2.robjects.numpy2ri
from rpy2.robjects.packages import importr
rpy2.robjects.numpy2ri.activate()

R = rpy2.robjects.r
## load in package 
TSdist = importr('TSdist')
## t,c are two series 
dist = TSdist.ERPDistance(t.values,c.values,g=0,sigma =30)
## dist is a R Boolean vector with one value
dist[0]

This gives me an NA, and I get a warning:

/usr/lib64/python3.4/site-packages/rpy2/rinterface/__init__.py:186: RRuntimeWarning: Error : The series must be univariate vectors

warnings.warn(x,RRuntimeWarning)

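Any ideas on how to call it correctly? Alternatively, how can I measure time series similarity with Python packages using the discrete Fourier transform (DFT), autoregressive coefficients, or edit distance on real sequences (EDR)? These are the methods mentioned in this paper.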

Par*_*ait 2

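The likely cause is the two series objects being passed to the function. Assuming t and c are pandas Series, calling values returns a numpy array. According to the documentation, ERPDistance requires numeric vectors, not arrays.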

import numpy as np
import pandas as pd

print(type(pd.Series(np.random.randn(5))))
# <class 'pandas.core.series.Series'>

print(type(pd.Series(np.random.randn(5)).values))
# <class 'numpy.ndarray'>

Consider simply converting each series to a numeric vector with base R's as.numeric, or use rpy2's FloatVector:

import rpy2.robjects
from rpy2.robjects.packages import importr

R = rpy2.robjects.r
## load in package 
base = importr('base')
TSdist = importr('TSdist')

new_t = base.as_numeric(t.tolist())
print(type(new_t))
# <class 'rpy2.robjects.vectors.FloatVector'>

new_c = rpy2.robjects.FloatVector(c.tolist())
print(type(new_c))
# <class 'rpy2.robjects.vectors.FloatVector'>

## new_t, new_c are now numeric vectors
dist = TSdist.ERPDistance(new_t, new_c, g=0, sigma =30)
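As for the second part of the question, a DFT-based distance can be sketched in plain numpy without going through R. The function below is a minimal sketch, not TSdist's implementation: the name fourier_distance and its parameter n are made up for illustration, and it assumes the distance is the Euclidean distance between the first n Fourier coefficients of two equal-length series, with n defaulting to half the series length plus one.

```python
import numpy as np


def fourier_distance(x, y, n=None):
    """Euclidean distance between the first n DFT coefficients of x and y.

    Hypothetical helper for illustration; results are not guaranteed to
    match any particular R package.
    """
    # np.asarray also accepts pandas Series and plain Python lists.
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or y.ndim != 1:
        raise ValueError("both series must be one-dimensional")
    if x.shape != y.shape:
        raise ValueError("both series must have the same length")
    if n is None:
        # Assumed default: keep the non-redundant half of the spectrum.
        n = len(x) // 2 + 1
    fx = np.fft.fft(x)[:n]
    fy = np.fft.fft(y)[:n]
    # Coefficients are complex, so take the modulus of each difference.
    return float(np.sqrt(np.sum(np.abs(fx - fy) ** 2)))


if __name__ == "__main__":
    grid = np.linspace(0.0, 2.0 * np.pi, 64)
    t = np.sin(grid)
    c = np.cos(grid)
    print(fourier_distance(t, c))
```

With pandas Series, fourier_distance(t, c) can be called directly, since np.asarray extracts the underlying values. Passing a smaller n keeps only the low-frequency coefficients, which compares the overall shape of the series and ignores high-frequency noise.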