I'm trying to find a vectorized replacement for the following code, which takes a very long time to run:
for (i in 2:nrow(z)) {
  if (z$customerID[i] == z$customerID[i - 1]) {
    z$timeDelta[i] <- z$time[i] - z$time[i - 1]
  } else {
    z$timeDelta[i] <- NA
  }
}
I tried looking at various apply snippets, but didn't find anything useful.
Here is some sample data:
customerID time
1 2013-04-17 15:30:00 IDT
1 2013-05-19 11:32:00 IDT
1 2013-05-20 10:14:00 IDT
2 2013-03-14 18:41:00 IST
2 2013-04-24 09:52:00 IDT
2 2013-04-24 17:08:00 IDT
I would like to get the following output:
customerID time timeDelta*
1 2013-04-17 15:30:00 IDT NA
1 2013-05-19 11:32:00 IDT 31.83
1 2013-05-20 10:14:00 IDT 0.94
2 2013-03-14 18:41:00 IST NA
2 2013-04-24 09:52:00 IDT 40.59
2 2013-04-24 17:08:00 IDT 0.3
*I would prefer the time difference to be in days.
jdh*_*son 10
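z$timeDelta <- NA
# diff() on POSIXct picks its units automatically, so request days explicitly instead of dividing by 24
z$timeDelta[-1] <- ifelse(tail(z$customerID, -1) == head(z$customerID, -1), as.numeric(diff(z$time), units = "days"), NA)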
Or a shorter version:
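z$timeDelta <- NA
# !diff(...) is TRUE where consecutive IDs are equal; this requires a numeric customerID
z$timeDelta[-1] <- ifelse(!diff(z$customerID), as.numeric(diff(z$time), units = "days"), NA)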
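For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the same idea. The `Asia/Jerusalem` time zone is my assumption based on the IST/IDT labels, and `same_customer` and `delta_days` are just illustrative names. It also assumes the rows are already sorted by customer and then by time, as in the question.

```r
# Sample data from the question. The time zone is an assumption
# inferred from the IST/IDT labels in the original post.
z <- data.frame(
  customerID = c(1, 1, 1, 2, 2, 2),
  time = as.POSIXct(
    c("2013-04-17 15:30:00", "2013-05-19 11:32:00", "2013-05-20 10:14:00",
      "2013-03-14 18:41:00", "2013-04-24 09:52:00", "2013-04-24 17:08:00"),
    tz = "Asia/Jerusalem"
  )
)

# TRUE where a row belongs to the same customer as the row before it
same_customer <- diff(z$customerID) == 0

# Consecutive time differences, converted to days explicitly
delta_days <- as.numeric(diff(z$time), units = "days")

# The first row has no predecessor, so it is NA as well
z$timeDelta <- c(NA, ifelse(same_customer, delta_days, NA))

print(z)
```

Asking for `units = "days"` up front avoids depending on whichever unit `diff()` happens to choose for a given data set.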