Subtract the previous row's value from the current row, by group

hai*_*ham 25 r lag dataframe

In R, suppose I have this data frame:

Data
id      date        value
2380    10/30/12    21.01
2380    10/31/12    22.04
2380    11/1/12     22.65
2380    11/2/12     23.11
20100   10/30/12    35.21
20100   10/31/12    37.07
20100   11/1/12     38.17
20100   11/2/12     38.97
20103   10/30/12    57.98
20103   10/31/12    60.83 

Within each id, with rows ordered by date, I want to subtract the previous value from the current value to create:

id      date        value   diff
2380    10/30/12    21.01   0
2380    10/31/12    22.04   1.03
2380    11/1/12     22.65   0.61
2380    11/2/12     23.11   0.46
20100   10/30/12    35.21   0
20100   10/31/12    37.07   1.86
20100   11/1/12     38.17   1.1
20100   11/2/12     38.97   0.8
20103   10/30/12    57.98   0
20103   10/31/12    60.83   2.85

zer*_*323 59

dplyr:

library(dplyr)

data %>%
    group_by(id) %>%
    arrange(date) %>%
    mutate(diff = value - lag(value, default = first(value)))

For clarity, you can arrange by date and by the grouping column (as suggested in a comment by lawyer):

data %>%
    group_by(id) %>%
    arrange(date, .by_group = TRUE) %>%
    mutate(diff = value - lag(value, default = first(value)))

Or use lag with order_by:

data %>%
    group_by(id) %>%
    mutate(diff = value - lag(value, default = first(value), order_by = date))

data.table:

library(data.table)

dt <- as.data.table(data)
setkey(dt, id, date)
dt[, diff := value - shift(value, fill = first(value)), by = id]
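One caveat worth checking, assuming date is stored as text in the m/d/yy form shown in the question rather than as a Date: text is sorted character by character, so the order can differ from chronological order. A minimal base R sketch with two made-up dates (they are not from the question's data):

```r
# Hypothetical dates, not from the question: Sept 30 comes before Oct 1
raw <- c("9/30/12", "10/1/12")

# As text, "10/1/12" sorts before "9/30/12" because "1" < "9"
text_order <- order(raw)

# Parsed as Date (month/day/two-digit year), the order is chronological
parsed <- as.Date(raw, format = "%m/%d/%y")
date_order <- order(parsed)
```

If that applies to your data, converting the column first, for example with data$date <- as.Date(data$date, format = "%m/%d/%y"), should keep the lag within each id chronological in the snippets above.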


jos*_*ber 18

You can do this with the ave function:

data$diff <- ave(data$value, data$id, FUN=function(x) c(0, diff(x)))
data
#       id                date value diff
# 1   2380 2012-10-30 00:15:51 21.01 0.00
# 2   2380 2012-10-31 00:31:03 22.04 1.03
# 3   2380 2012-11-01 00:16:02 22.65 0.61
# 4   2380 2012-11-02 00:15:32 23.11 0.46
# 5  20100 2012-10-30 00:15:38 35.21 0.00
# 6  20100 2012-10-31 00:15:48 37.07 1.86
# 7  20100 2012-11-01 00:15:49 38.17 1.10
# 8  20100 2012-11-02 00:15:19 38.97 0.80
# 9  20103 2012-10-30 10:27:34 57.98 0.00
# 10 20103 2012-10-31 12:24:42 60.83 2.85

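The first argument is the data to operate on, the second is the grouping variable, and the last is the function to apply to each group's data.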
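To see why the function passed as FUN returns a vector of the right length, here is a small sketch on one group's values (x holds the four values for id 2380 from the question; the names d and padded are only for illustration):

```r
# Values for a single group (id 2380 in the question)
x <- c(21.01, 22.04, 22.65, 23.11)

# diff() gives successive differences, so it is one element shorter than x
d <- diff(x)

# Prepending 0 makes the result line up with x, with 0 for the first row
padded <- c(0, d)

round(padded, 2)
# 0.00 1.03 0.61 0.46
```

ave then runs this same step separately on each id and puts the pieces back in the original row order.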

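  • Thanks! `diff(x)` basically returns a vector one element shorter than its input, and prepending 0 as the leading element makes it the same size again. Got it. (2 upvotes)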