jay*_*020 6 loops r xts data.table
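I have time series data in 1-minute increments. I have written code that works, but I have a large amount of data (over 1M rows) and looping through each row takes far too long. The data looks like this: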
t0 = as.POSIXlt("2018-12-23 00:01:00")
t0 = t0+seq(60,60*10,60)
p1 = seq(5,5*10,5)
p2 = seq(7,7*10,7)
m0 = cbind(p1,p2)
rownames(m0) = as.character(t0)
It looks like this:
> head(m0)
p1 p2
2018-12-23 00:02:00 5 7
2018-12-23 00:03:00 10 14
2018-12-23 00:04:00 15 21
2018-12-23 00:05:00 20 28
2018-12-23 00:06:00 25 35
2018-12-23 00:07:00 30 42
I want to convert this data to 5-second increments by adding 11 rows (55 seconds) before each minute, each carrying that minute's values. The result would look like this:
> new0
p1 p2
2018-12-23 00:01:05 5 7
2018-12-23 00:01:10 5 7
2018-12-23 00:01:15 5 7
2018-12-23 00:01:20 5 7
2018-12-23 00:01:25 5 7
2018-12-23 00:01:30 5 7
2018-12-23 00:01:35 5 7
2018-12-23 00:01:40 5 7
2018-12-23 00:01:45 5 7
2018-12-23 00:01:50 5 7
2018-12-23 00:01:55 5 7
2018-12-23 00:02:00 5 7
2018-12-23 00:02:05 10 14
2018-12-23 00:02:10 10 14
2018-12-23 00:02:15 10 14
2018-12-23 00:02:20 10 14
2018-12-23 00:02:25 10 14
2018-12-23 00:02:30 10 14
2018-12-23 00:02:35 10 14
2018-12-23 00:02:40 10 14
2018-12-23 00:02:45 10 14
2018-12-23 00:02:50 10 14
2018-12-23 00:02:55 10 14
2018-12-23 00:03:00 10 14
I am hoping to find a way to do this without a loop, using efficient code from xts and/or data.table, neither of which I am very familiar with.
I tried the ave function in base R, but it was not fast enough.
Since you tagged this with data.table:
library(data.table)
dt = as.data.table(m0, keep.rownames = TRUE)[, rn := as.POSIXct(rn)]
dt[.(rep(rn, each = 12) - seq(0, 55, 5)), on = 'rn', roll = -Inf][order(rn)]
# rn p1 p2
# 1: 2018-12-23 00:01:05 5 7
# 2: 2018-12-23 00:01:10 5 7
# 3: 2018-12-23 00:01:15 5 7
# 4: 2018-12-23 00:01:20 5 7
# 5: 2018-12-23 00:01:25 5 7
# ---
#116: 2018-12-23 00:10:40 50 70
#117: 2018-12-23 00:10:45 50 70
#118: 2018-12-23 00:10:50 50 70
#119: 2018-12-23 00:10:55 50 70
#120: 2018-12-23 00:11:00 50 70
Here is one way to do this in base R. First, convert the data to a data frame with an explicit column for the timestamps:
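The one-liner above packs several steps together. The sketch below spells out the same rolling-join idea, assuming the data.table package is installed. The name targets is hypothetical, and the table is built directly with a UTC time zone (an assumption made here for reproducibility) rather than converted from m0.

```r
library(data.table)

# Rebuild the example data; tz = "UTC" is an assumption for reproducibility
t0 <- as.POSIXct("2018-12-23 00:01:00", tz = "UTC") + seq(60, 60 * 10, 60)
dt <- data.table(rn = t0, p1 = seq(5, 50, 5), p2 = seq(7, 70, 7))

# Every 5-second timestamp in the minute ending at each observation:
# each observation time is repeated 12 times and offset by 0, 5, ..., 55 seconds
targets <- data.table(rn = rep(dt$rn, each = 12) - seq(0, 55, 5))

# roll = -Inf: a target with no exact match takes the values of the next
# later observation (next observation carried backward)
res <- dt[targets, on = "rn", roll = -Inf][order(rn)]
head(res, 3)
```

The final order(rn) is needed because the targets are generated counting backward within each minute.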
m0 <- as.data.frame(m0)
m0$t <- t0
p1 p2 t
1 5 7 2018-12-23 00:02:00
2 10 14 2018-12-23 00:03:00
3 15 21 2018-12-23 00:04:00
4 20 28 2018-12-23 00:05:00
5 25 35 2018-12-23 00:06:00
6 30 42 2018-12-23 00:07:00
7 35 49 2018-12-23 00:08:00
8 40 56 2018-12-23 00:09:00
9 45 63 2018-12-23 00:10:00
10 50 70 2018-12-23 00:11:00
Then merge this data frame with a one-column data frame of time differences (0 to 55):
m1 <- merge(m0, data.frame(diff = seq(0, 55, 5)))
Finally, subtract the difference column from the timestamp column to create the new values:
m1$t2 <- with(m1, t - diff)
> m1[c(1, 20, 40), ]
p1 p2 t diff t2
1 5 7 2018-12-23 00:02:00 0 2018-12-23 00:02:00
20 50 70 2018-12-23 00:11:00 5 2018-12-23 00:10:55
40 50 70 2018-12-23 00:11:00 15 2018-12-23 00:10:45
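The merge result is not yet in time order and still carries the helper columns. A minimal sketch of the remaining tidy-up follows, assuming the goal is the new0 layout from the question. It uses as.POSIXct with a UTC time zone instead of the question's as.POSIXlt, an assumption made here so the timestamps store as a plain data frame column and the output is reproducible.

```r
# Rebuild the example data (POSIXct and tz = "UTC" are assumptions of this sketch)
t0 <- as.POSIXct("2018-12-23 00:01:00", tz = "UTC") + seq(60, 60 * 10, 60)
m0 <- data.frame(p1 = seq(5, 50, 5), p2 = seq(7, 70, 7), t = t0)

# With no shared column names, merge() returns every combination of
# observation row and offset (a cross join)
m1 <- merge(m0, data.frame(diff = seq(0, 55, 5)))
m1$t2 <- m1$t - m1$diff

# Sort by the new timestamp and keep only the columns of interest
new0 <- m1[order(m1$t2), c("t2", "p1", "p2")]
rownames(new0) <- NULL
head(new0, 3)
```

The cross join creates 12 rows per input row, so memory use grows accordingly on a 1M-row input.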