Handling NA in if_else in R

afl*_*man 4 if-statement r dplyr

I have the following dataset, in which three columns contain dates.

library(dplyr)

set.seed(45)

df1 <- data.frame(hire_date = sample(seq(as.Date('1999/01/01'),    as.Date('2000/01/01'), by="week"), 10),
              t1 = sample(seq(as.Date('2000/01/01'), as.Date('2001/01/01'), by="week"), 10),
              t2 = sample(seq(as.Date('2000/01/01'), as.Date('2001/01/01'), by="day"), 10))

#this value is actually unknown
df1[10,2] <- NA

    hire_date         t1         t2
1  1999-08-20 2000-05-13 2000-02-17   
2  1999-04-23 2000-11-11 2000-04-27   
3  1999-03-26 2000-04-15 2000-08-01   
4  1999-05-07 2000-06-03 2000-08-29   
5  1999-04-30 2000-05-27 2000-11-19   
6  1999-04-09 2000-12-30 2000-01-26   
7  1999-03-12 2000-12-23 2000-12-07  
8  1999-06-25 2000-02-12 2000-09-26  
9  1999-02-26 2000-05-06 2000-08-23 
10 1999-01-01       <NA> 2000-03-18 

I want to run an if-else statement so that df1$com is 1 whenever the difference between t1 OR t2 and hire_date (in days) falls within [395, 500].

The if_else statement below almost gets me there, but the NA messes it up. Any ideas?

df1$com <- if_else((df1$t1 - df1$hire_date) >= 395 &
               (df1$t1 - df1$hire_date) <= 500, 1,
       if_else((df1$t2 - df1$hire_date) >= 395 &
                (df1$t2 - df1$hire_date) <= 500, 1, 0))

Sam*_*rke 5

You can use dplyr::case_when instead of nested if_else statements. It gives you easy control over how NA is treated, and dplyr::between also tidies up your date comparisons.

df1 %>%
  mutate(com = case_when(
    is.na(t1) | is.na(t2) ~ 999, # or however you want to treat NA cases
    between(t1 - hire_date, 395, 500) ~ 1,
    between(t2 - hire_date, 395, 500) ~ 1,
    TRUE ~ 0 # neither range is between 395 and 500
  ))

#>     hire_date         t1         t2 com
#> 1  1999-08-20 2000-05-13 2000-02-17   0
#> 2  1999-04-23 2000-11-11 2000-04-27   0
#> 3  1999-03-26 2000-04-15 2000-08-01   1
#> 4  1999-05-07 2000-06-03 2000-08-29   1
#> 5  1999-04-30 2000-05-27 2000-11-19   0
#> 6  1999-04-09 2000-12-30 2000-01-26   0
#> 7  1999-03-12 2000-12-23 2000-12-07   0
#> 8  1999-06-25 2000-02-12 2000-09-26   1
#> 9  1999-02-26 2000-05-06 2000-08-23   1
#> 10 1999-01-01       <NA> 2000-03-18 999
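One thing to note about the output above: because the NA check comes first, row 10 is tagged 999 even though t2 - hire_date is 442 days, which lies inside [395, 500]. If a row should count as 1 whenever either known date qualifies, one possible variant is to move the NA check below the range checks, since case_when treats an NA condition as no match. The following is only a sketch under that assumption. The `toy` data frame reuses rows 1, 3 and 10 of df1 plus one made-up row, `in_window` is a hypothetical helper (not part of dplyr), and returning NA for the undecidable case is an arbitrary choice that could just as well be 999.

```r
library(dplyr)

# Rows 1, 3 and 10 of df1 from the question, plus one hypothetical row
# (the last one) where t1 is unknown and t2 is out of range.
toy <- data.frame(
  hire_date = as.Date(c("1999-08-20", "1999-03-26", "1999-01-01", "1999-01-01")),
  t1        = as.Date(c("2000-05-13", "2000-04-15", NA, NA)),
  t2        = as.Date(c("2000-02-17", "2000-08-01", "2000-03-18", "2000-01-10"))
)

# Hypothetical helper: TRUE / FALSE / NA depending on whether the gap
# in days lies in [lo, hi]. A missing date yields NA.
in_window <- function(later, earlier, lo = 395, hi = 500) {
  days <- as.numeric(later - earlier)  # Date - Date is a difftime in days
  days >= lo & days <= hi
}

res <- toy %>%
  mutate(com = case_when(
    in_window(t1, hire_date) ~ 1,         # an NA condition is skipped
    in_window(t2, hire_date) ~ 1,
    is.na(t1) | is.na(t2)    ~ NA_real_,  # a date is missing and nothing qualified
    TRUE                     ~ 0          # both dates known, neither in range
  ))

print(res$com)  # values: 0, 1, 1, NA
```

With this ordering, the third row (row 10 of df1) becomes 1 because its known t2 qualifies, while the made-up last row stays NA because the missing t1 might or might not have qualified.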