What is the most efficient way to add a column that acts as a binary indicator of recurring amounts in a time-series data frame?

giz*_*aom 1 r dplyr

I have a data frame similar to this example data frame:

example <- data.frame(id = c("1","1","1", "1", "2", "2", "2"),
                      amount = c(2300, 1765, 2300, 1500, 35, 180, 180),
                      date = c("2010-11-01", "2010-11-02", "2010-11-03", "2010-11-04", "2010-11-01", "2010-11-02", "2010-11-03"))

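I want to add a column that holds a 1 when the amount is a recurring amount. An amount counts as recurring only if it repeats within the same ID. So the result would look like this: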

desiredResult <- data.frame(id = c("1","1","1", "1", "2", "2", "2"),
                      amount = c(2300, 1765, 2300, 1500, 2300, 180, 180),
                      date = c("2010-11-01", "2010-11-02", "2010-11-03", "2010-11-04", "2010-11-01", "2010-11-02", "2010-11-03"),
                      probableRecurringAmount = c(1,0,1,0,0,1,1)) 

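The dataset is very large, and I am struggling to come up with an efficient solution. I was considering adding a key column built from a combination of these other columns, but all I really want is a binary flag.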

sam*_*dhi 5

You can do this:

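library(dplyr)
example %>%
  # one group per (id, amount) pair
  group_by(id, amount) %>%
  # flag the row with 1 when its pair occurs more than once, else 0
  mutate(probableRecurringAmount = ifelse(n() > 1, 1, 0))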

# A tibble: 7 x 4
# Groups:   id, amount [5]
# id      amount date       probableRecurringAmount
#<fct>  <dbl> <fct>                        <dbl>
#1 1       2300 2010-11-01                       1
#2 1       1765 2010-11-02                       0
#3 1       2300 2010-11-03                       1
#4 1       1500 2010-11-04                       0
#5 2         35 2010-11-01                       0
#6 2        180 2010-11-02                       1
#7 2        180 2010-11-03                       1
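Since the question mentions a very large dataset, here is a hedged base R sketch of the same idea that skips grouping altogether. It relies on `duplicated()` called twice on the `id` and `amount` columns, once forward and once with `fromLast = TRUE`, so that every member of a repeated pair is flagged, not just the later ones. The helper name `keys` is only illustrative, and this has not been benchmarked against the `dplyr` version, so it is worth timing on your own data.

```r
# Same example data as in the question
example <- data.frame(
  id     = c("1", "1", "1", "1", "2", "2", "2"),
  amount = c(2300, 1765, 2300, 1500, 35, 180, 180),
  date   = c("2010-11-01", "2010-11-02", "2010-11-03", "2010-11-04",
             "2010-11-01", "2010-11-02", "2010-11-03")
)

# `keys` is a hypothetical helper name: only the columns that define a repeat
keys <- example[c("id", "amount")]

# duplicated() marks every occurrence after the first; running it again with
# fromLast = TRUE marks every occurrence before the last. The union therefore
# covers all rows whose (id, amount) pair appears more than once.
is_recurring <- duplicated(keys) | duplicated(keys, fromLast = TRUE)

# Store the flag as an integer 0/1 column
example$probableRecurringAmount <- as.integer(is_recurring)

print(example$probableRecurringAmount)
# [1] 1 0 1 0 0 1 1
```

The result is a plain, ungrouped data frame with an integer flag, which may be more convenient downstream than the grouped tibble with a double column that the `dplyr` pipeline returns.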