Cumulative product of (1 - previous_record) * current_record

use*_*292 5 iteration r accumulate dplyr rolling-computation

The data frame contains two variables (time and rate) and 10 observations.

time <- 1:10
rate <- 1-(0.99^time)
dat <- data.frame(time, rate)

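I need to add a new column (called new_rate).

new_rate is defined as follows.

Note: new_rate_1 is the first observation of the new column new_rate, new_rate_2 the second, and so on.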

new_rate_1 = rate_1
new_rate_2 = (1-rate_1)*rate_2
new_rate_3 = (1-rate_1)*(1-rate_2)*rate_3
new_rate_4 = (1-rate_1)*(1-rate_2)*(1-rate_3)*rate_4
...
new_rate_10 = (1-rate_1)*(1-rate_2)*(1-rate_3)*(1-rate_4)*(1-rate_5)*(1-rate_6)*(1-rate_7)*(1-rate_8)*(1-rate_9)*rate_10

How can I do this in base R or dplyr?

the*_*ail 8

cumprod to the rescue (hat tip to @Cole for simplifying the code):

dat$rate * c(1, cumprod(1 - head(dat$rate, -1)))

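The logic is that you are essentially taking the cumulative product of 1 - dat$rate and multiplying it by the rate at the current step.
For the first step you can just keep the existing value, but after that the two vectors need to be offset by one position so that each rate is multiplied by the product over the preceding rows only.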
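To make the offset concrete, here is a minimal sketch that splits the one-liner into its two pieces and stores the result in the new_rate column the question asks for. The intermediate names surv and surv_prev are only illustrative and do not appear in the original answer.

```r
time <- 1:10
rate <- 1 - (0.99^time)
dat <- data.frame(time, rate)

# Running product of (1 - rate) up to and including each row
surv <- cumprod(1 - dat$rate)

# Shift it down by one row so that row i only sees rows 1..(i - 1);
# the first row has no predecessors, so its multiplier is 1
surv_prev <- c(1, head(surv, -1))

# Each new_rate is the current rate times the product over earlier rows
dat$new_rate <- dat$rate * surv_prev
```

Shifting the cumulative product after computing it gives the same vector as dropping the last rate before calling cumprod, which is what the one-liner does.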

Proof:

out <- c(
dat$rate[1],
(1-dat$rate[1])*dat$rate[2],
(1-dat$rate[1])*(1-dat$rate[2])*dat$rate[3],
(1-dat$rate[1])*(1-dat$rate[2])*(1-dat$rate[3])*dat$rate[4],
(1-dat$rate[1])*(1-dat$rate[2])*(1-dat$rate[3])*(1-dat$rate[4])*dat$rate[5],
(1-dat$rate[1])*(1-dat$rate[2])*(1-dat$rate[3])*(1-dat$rate[4])*(1-dat$rate[5])*dat$rate[6],
(1-dat$rate[1])*(1-dat$rate[2])*(1-dat$rate[3])*(1-dat$rate[4])*(1-dat$rate[5])*(1-dat$rate[6])*dat$rate[7],
(1-dat$rate[1])*(1-dat$rate[2])*(1-dat$rate[3])*(1-dat$rate[4])*(1-dat$rate[5])*(1-dat$rate[6])*(1-dat$rate[7])*dat$rate[8],
(1-dat$rate[1])*(1-dat$rate[2])*(1-dat$rate[3])*(1-dat$rate[4])*(1-dat$rate[5])*(1-dat$rate[6])*(1-dat$rate[7])*(1-dat$rate[8])*dat$rate[9],
(1-dat$rate[1])*(1-dat$rate[2])*(1-dat$rate[3])*(1-dat$rate[4])*(1-dat$rate[5])*(1-dat$rate[6])*(1-dat$rate[7])*(1-dat$rate[8])*(1-dat$rate[9])*dat$rate[10]
)

all.equal(
  dat$rate * c(1, cumprod(1 - head(dat$rate, -1))),
  out
)
#[1] TRUE
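The same check can be written without typing out every term by hand. This is a sketch that builds each element straight from the definition in the question; the names by_definition and vectorised are illustrative only.

```r
time <- 1:10
rate <- 1 - (0.99^time)
dat <- data.frame(time, rate)

# Build each term directly from the definition:
# new_rate_i = rate_i * (1 - rate_1) * ... * (1 - rate_{i-1})
# For i = 1 the index vector is empty and prod() of nothing is 1.
by_definition <- vapply(
  seq_along(dat$rate),
  function(i) dat$rate[i] * prod(1 - dat$rate[seq_len(i - 1)]),
  numeric(1)
)

vectorised <- dat$rate * c(1, cumprod(1 - head(dat$rate, -1)))

all.equal(vectorised, by_definition)
#[1] TRUE
```

The loop version recomputes every product from scratch, so it is only meant as a check; the cumprod version is the one to use on real data.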
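Since the question also asks about dplyr, the same idea can probably be expressed inside mutate(). This is a sketch rather than part of the original answer, and it assumes the dplyr package is installed; dplyr::lag() with default = 1 does the one-row shift.

```r
library(dplyr)

time <- 1:10
rate <- 1 - (0.99^time)
dat <- data.frame(time, rate)

dat <- dat %>%
  # lag() shifts the running product down one row; default = 1 fills row 1
  mutate(new_rate = rate * dplyr::lag(cumprod(1 - rate), default = 1))
```

Calling dplyr::lag() with the namespace prefix avoids any confusion with stats::lag(), which works on time series and does not shift a plain vector.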