I have a data frame similar to this example:
example <- data.frame(
  id = c("1", "1", "1", "1", "2", "2", "2"),
  amount = c(2300, 1765, 2300, 1500, 35, 180, 180),
  date = c("2010-11-01", "2010-11-02", "2010-11-03", "2010-11-04",
           "2010-11-01", "2010-11-02", "2010-11-03")
)
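I want to add a column that holds a 1 when the amount is a recurring amount and a 0 otherwise. An amount should count as recurring only if it is repeated within the same id. So the result would look like this: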
desiredResult <- data.frame(
  id = c("1", "1", "1", "1", "2", "2", "2"),
  amount = c(2300, 1765, 2300, 1500, 2300, 180, 180),
  date = c("2010-11-01", "2010-11-02", "2010-11-03", "2010-11-04",
           "2010-11-01", "2010-11-02", "2010-11-03"),
  probableRecurringAmount = c(1, 0, 1, 0, 0, 1, 1)
)
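The dataset is very large, and I'm having trouble coming up with an efficient solution. I was thinking of adding a key column built from a combination of these other columns, but all I really want is a binary flag.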
You can do it like this:
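library(dplyr)

example %>%
  # one group per distinct (id, amount) pair
  group_by(id, amount) %>%
  # n() is the group size: flag the row when the pair occurs more than once
  mutate(probableRecurringAmount = ifelse(n() > 1, 1, 0))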
# A tibble: 7 x 4
# Groups:   id, amount [5]
#   id    amount date       probableRecurringAmount
#   <fct>  <dbl> <fct>                        <dbl>
# 1 1       2300 2010-11-01                       1
# 2 1       1765 2010-11-02                       0
# 3 1       2300 2010-11-03                       1
# 4 1       1500 2010-11-04                       0
# 5 2         35 2010-11-01                       0
# 6 2        180 2010-11-02                       1
# 7 2        180 2010-11-03                       1
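Since the question mentions a very large dataset, here is a rough base R sketch of the same idea that avoids grouping altogether. It is not part of the answer above, just an alternative under the assumption that "recurring" means the (id, amount) pair occurs more than once. It relies on `duplicated()` scanning from both ends; the helper name `key_cols` is made up for illustration.

```r
example <- data.frame(
  id = c("1", "1", "1", "1", "2", "2", "2"),
  amount = c(2300, 1765, 2300, 1500, 35, 180, 180),
  date = c("2010-11-01", "2010-11-02", "2010-11-03", "2010-11-04",
           "2010-11-01", "2010-11-02", "2010-11-03")
)

# Hypothetical helper name: the columns that define a "repeat"
key_cols <- example[c("id", "amount")]

# duplicated() marks the 2nd, 3rd, ... occurrence of each pair;
# scanning again with fromLast = TRUE marks all but the last occurrence,
# so the union covers every row whose pair occurs more than once
is_repeated <- duplicated(key_cols) | duplicated(key_cols, fromLast = TRUE)

# Store the flag as 0/1 integers
example$probableRecurringAmount <- as.integer(is_repeated)
```

Whether this is actually faster than the `group_by()` version depends on your data, so it is worth timing both on a realistic sample before committing to one.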