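I'm looking for a way to add the "desired_result" column, preferably using dplyr and/or ave(). In the data frame below, the grouping variable is "section", and within each group the "desired_result" column should number the unique values of "exhibit" sequentially: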
df <- structure(list(section = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), exhibit = structure(c(1L,
2L, 3L, 3L, 1L, 2L, 2L, 3L), .Label = c("a", "b", "c"), class = "factor"),
desired_result = c(1L, 2L, 3L, 3L, 1L, 2L, 2L, 3L)), .Names = c("section",
"exhibit", "desired_result"), class = "data.frame", row.names = c(NA,
-8L))
dense_rank does this:
library(dplyr)
df %>%
group_by(section) %>%
mutate(desire=dense_rank(exhibit))
# section exhibit desired_result desire
#1 1 a 1 1
#2 1 b 2 2
#3 1 c 3 3
#4 1 c 3 3
#5 2 a 1 1
#6 2 b 2 2
#7 2 b 2 2
#8 2 c 3 3
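Since the question also mentions ave(), here is a minimal base R sketch of the same idea. It assumes the data frame from the question (rebuilt here without the desired_result column) and uses a hypothetical column name, desire_ave. Note that it numbers values by order of first appearance within each section rather than by factor level; for this data the two agree.

```r
# Rebuild the example data from the question (without the target column).
df <- data.frame(
  section = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
  exhibit = factor(c("a", "b", "c", "c", "a", "b", "b", "c"))
)

# Within each section, map every exhibit value to the position of its
# first appearance among that section's unique values.
df$desire_ave <- ave(
  as.integer(df$exhibit),   # ave() works on the underlying integer codes
  df$section,               # grouping variable
  FUN = function(x) match(x, unique(x))
)

df$desire_ave
# 1 2 3 3 1 2 2 3
```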
I recently pushed the function rleid() to data.table (currently available in the development version, 1.9.5), and it does exactly this. If you're interested, you can install the development version to try it.
require(data.table) # 1.9.5, for `rleid()`
require(dplyr)
DF %>%
group_by(section) %>%
mutate(desired_results=rleid(exhibit))
# section exhibit desired_result desired_results
# 1 1 a 1 1
# 2 1 b 2 2
# 3 1 c 3 3
# 4 1 c 3 3
# 5 2 a 1 1
# 6 2 b 2 2
# 7 2 b 2 2
# 8 2 c 3 3
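The two answers agree here because identical exhibit values sit next to each other within each section. A rough base R sketch of the difference follows. The helpers run_id and dense_id are hypothetical stand-ins written for illustration, not the actual rleid() or dense_rank() implementations: one numbers consecutive runs, the other ranks distinct values.

```r
# Hypothetical stand-in for run-length numbering: each new run of
# identical consecutive values gets the next id.
run_id <- function(x) {
  r <- rle(as.character(x))
  rep(seq_along(r$lengths), r$lengths)
}

# Hypothetical stand-in for dense ranking: equal values share a rank,
# and ranks follow the sorted order of the distinct values.
dense_id <- function(x) match(x, sort(unique(x)))

exhibit <- c("a", "b", "b", "a")  # "a" reappears after "b"

run_id(exhibit)    # 1 2 2 3
dense_id(exhibit)  # 1 2 2 1
```

If a value can reappear later within a group, the run-based approach assigns it a new number, while the rank-based approach reuses the earlier one.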