In R, suppose I have this data frame:
Data
id date value
2380 10/30/12 21.01
2380 10/31/12 22.04
2380 11/1/12 22.65
2380 11/2/12 23.11
20100 10/30/12 35.21
20100 10/31/12 37.07
20100 11/1/12 38.17
20100 11/2/12 38.97
20103 10/30/12 57.98
20103 10/31/12 60.83
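Within each id group, ordered by date, I want to subtract the previous value from the current value to create a diff column: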
id date value diff
2380 10/30/12 21.01 0
2380 10/31/12 22.04 1.03
2380 11/1/12 22.65 0.61
2380 11/2/12 23.11 0.46
20100 10/30/12 35.21 0
20100 10/31/12 37.07 1.86
20100 11/1/12 38.17 1.1
20100 11/2/12 38.97 0.8
20103 10/30/12 57.98 0
20103 10/31/12 60.83 2.85
zer*_*323 59
With dplyr:
library(dplyr)
data %>%
  group_by(id) %>%
  arrange(date) %>%
  mutate(diff = value - lag(value, default = first(value)))
For clarity, you can arrange by date and by the grouping column (as suggested in a comment by lawyer):
data %>%
  group_by(id) %>%
  arrange(date, .by_group = TRUE) %>%
  mutate(diff = value - lag(value, default = first(value)))
Or use lag with order_by:
data %>%
  group_by(id) %>%
  mutate(diff = value - lag(value, default = first(value), order_by = date))
With data.table:
library(data.table)
dt <- as.data.table(data)
setkey(dt, id, date)
dt[, diff := value - shift(value, fill = first(value)), by = id]
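One caveat worth sketching: the approaches above sort by date, and if date is stored as text in month/day/year form (as the question's printout suggests, which is an assumption about the real data), text ordering can differ from chronological ordering (for example, "9/30/12" sorts after "10/30/12"). A minimal sketch that parses the column with as.Date before sorting, using a data frame rebuilt by hand from the question's sample values:

```r
library(dplyr)

# Sample data rebuilt from the question; date is assumed to be character
data <- data.frame(
  id    = c(2380, 2380, 2380, 2380, 20100, 20100, 20100, 20100, 20103, 20103),
  date  = c("10/30/12", "10/31/12", "11/1/12", "11/2/12",
            "10/30/12", "10/31/12", "11/1/12", "11/2/12",
            "10/30/12", "10/31/12"),
  value = c(21.01, 22.04, 22.65, 23.11, 35.21, 37.07, 38.17, 38.97, 57.98, 60.83),
  stringsAsFactors = FALSE
)

result <- data %>%
  # Parse month/day/two-digit-year text into Date so sorting is chronological
  mutate(date = as.Date(date, format = "%m/%d/%y")) %>%
  group_by(id) %>%
  arrange(date, .by_group = TRUE) %>%
  # The first row of each group has no predecessor, so its diff is 0
  mutate(diff = value - lag(value, default = first(value))) %>%
  ungroup()

print(result)
```

If the column is already a Date or POSIXct, the parsing step is unnecessary and can be dropped.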
jos*_*ber 18
You can do this with the ave function:
data$diff <- ave(data$value, data$id, FUN=function(x) c(0, diff(x)))
data
# id date value diff
# 1 2380 2012-10-30 00:15:51 21.01 0.00
# 2 2380 2012-10-31 00:31:03 22.04 1.03
# 3 2380 2012-11-01 00:16:02 22.65 0.61
# 4 2380 2012-11-02 00:15:32 23.11 0.46
# 5 20100 2012-10-30 00:15:38 35.21 0.00
# 6 20100 2012-10-31 00:15:48 37.07 1.86
# 7 20100 2012-11-01 00:15:49 38.17 1.10
# 8 20100 2012-11-02 00:15:19 38.97 0.80
# 9 20103 2012-10-30 10:27:34 57.98 0.00
# 10 20103 2012-10-31 12:24:42 60.83 2.85
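The first argument is the data to operate on, the second argument is the grouping, and the last argument is the function to apply to each group's data.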
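Note that ave works on rows in their existing order, so the differences are only meaningful if each group is already sorted by date. A small base-R sketch, assuming date is stored as month/day/year text and using a deliberately unsorted subset of the question's values, that orders the rows first:

```r
# Subset of the question's values, with the first two rows out of order
data <- data.frame(
  id    = c(2380, 2380, 20100, 20100),
  date  = c("10/31/12", "10/30/12", "10/30/12", "10/31/12"),
  value = c(22.04, 21.01, 35.21, 37.07),
  stringsAsFactors = FALSE
)

# Parse the text dates, then sort by id and date before differencing
data$date <- as.Date(data$date, format = "%m/%d/%y")
data <- data[order(data$id, data$date), ]

# Within each id: 0 for the first row, then current minus previous value
data$diff <- ave(data$value, data$id, FUN = function(x) c(0, diff(x)))
print(data)
```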