dplyr `case_when()` problem with NA

Jas*_*ter 3 if-statement r dplyr

library(tidyverse)
df <- tibble(ID = c("ABC", "EFG", "HIJ", "KLM", "NOP", "QRS"),
             Date = as.Date(c("2019-01-03", "2019-01-08", 
                              "2019-06-09", "2019-06-11",
                              "2019-08-12", "2019-08-21")))
#> # A tibble: 6 x 2
#>   ID    Date
#>   <chr> <date>        
#> 1 ABC   2019-01-03    
#> 2 EFG   2019-01-08    
#> 3 HIJ   2019-06-09    
#> 4 KLM   2019-06-11    
#> 5 NOP   2019-08-12    
#> 6 QRS   2019-08-21 

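Let's start with the data frame above. What I want is shown directly below. The first two rows of the data frame meet the conditions in my case_when() statement and are filled with "fizz" and "buzz". The remaining rows are filled with NA.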

df %>% 
  mutate(col3 = case_when(ID == "ABC" & Date == as.Date("2019-01-03") ~ "fizz",
                          ID == "EFG" & Date == as.Date("2019-01-08") ~ "buzz"))
#> # A tibble: 6 x 3
#>   ID    Date       col3 
#>   <chr> <date>     <chr>
#> 1 ABC   2019-01-03 fizz 
#> 2 EFG   2019-01-08 buzz 
#> 3 HIJ   2019-06-09 NA   
#> 4 KLM   2019-06-11 NA   
#> 5 NOP   2019-08-12 NA   
#> 6 QRS   2019-08-21 NA 

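However, when I try to explicitly tell case_when() to fill the rest of the data frame with NA, I get the error shown below. Am I not using TRUE ~ NA the right way?

Doesn't the TRUE ~ _XYZ_ argument tell the function to fill in _XYZ_ for every row that meets none of the conditions above?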

df %>% 
  mutate(col3 = case_when(ID == "ABC" & Date == as.Date("2019-01-03") ~ "fizz",
                          ID == "EFG" & Date == as.Date("2019-01-08") ~ "buzz",
                          TRUE ~ NA)
#> Error: unexpected ',' in " 
#> ID == "EFG" & Date == as.Date("2019-01-08") ~ "buzz","

Nak*_*akx 12

In case_when(), NA needs to be of the correct class. Here the other right-hand-side values are character:

class("fizz")
[1] "character"

From the documentation:

All RHS values need to be of the same type. Inconsistent types will throw an error.
This applies also to NA values used in RHS: NA is logical, use
typed values like NA_real_, NA_complex, NA_character_, NA_integer_ as appropriate.

https://www.rdocumentation.org/packages/dplyr/versions/0.7.8/topics/case_when
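
To see why a bare NA clashes with the character values, it can help to check the types directly. The following is a small base R sketch added for illustration (it is not part of the original answer):

```r
# A bare NA is logical; each typed constant carries the type named in its suffix.
class(NA)             # "logical"
class(NA_character_)  # "character"
class(NA_real_)       # "numeric"
class(NA_integer_)    # "integer"
class(NA_complex_)    # "complex"
```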

Here you can use NA_character_, a handy shortcut for as.character(NA):

df %>% 
  mutate(col3 = case_when(ID == "ABC" & Date == as.Date("2019-01-03") ~ "fizz",
                          ID == "EFG" & Date == as.Date("2019-01-08") ~ "buzz",
                          TRUE ~ NA_character_))

As the documentation says, typed NA constants exist for the other data classes as well: NA_real_, NA_complex_, and NA_integer_.
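
The same idea applies when the right-hand sides are numeric. The sketch below uses made-up data (the `scores` tibble and the `bonus` column are hypothetical and do not come from the question) to show NA_real_ as the fallback:

```r
library(dplyr)

# Hypothetical data for illustration only; not from the question.
scores <- tibble(ID = c("ABC", "EFG", "HIJ"),
                 score = c(10, 55, 90))

result <- scores %>%
  mutate(bonus = case_when(score >= 80 ~ 1.5,
                           score >= 50 ~ 1.0,
                           TRUE ~ NA_real_))  # RHS values are doubles, so the NA is NA_real_

result
```

Rows that match neither condition (here the first row) get a missing value, and the column stays a double.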


Nov*_*ova 6

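Try the code below. It tells case_when that you want the NA to be a character, like the rest of your column. I think you are also missing a closing parenthesis in the code above.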

df %>% 
  mutate(col3 = case_when(ID == "ABC" & Date == as.Date("2019-01-03") ~ "fizz",
                          ID == "EFG" & Date == as.Date("2019-01-08") ~ "buzz",
                          TRUE ~ as.character(NA)))

# A tibble: 6 x 3
  ID    Date       col3 
  <chr> <date>     <chr>
1 ABC   2019-01-03 fizz 
2 EFG   2019-01-08 buzz 
3 HIJ   2019-06-09 NA   
4 KLM   2019-06-11 NA   
5 NOP   2019-08-12 NA   
6 QRS   2019-08-21 NA 
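
As a quick sanity check (a base R sketch added for illustration, with `na_cast` as a hypothetical variable name), casting NA with as.character() yields the same value as the NA_character_ constant, so either spelling should work in the TRUE branch:

```r
# Cast a logical NA to character and compare it with the typed constant.
na_cast <- as.character(NA)

is.character(na_cast)              # TRUE
identical(na_cast, NA_character_)  # TRUE
```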

  • Or use "NA_character_". (4 upvotes)