How do I use mutate_all with dplyr's recode correctly?

Tam*_*agy 1 r dplyr recode

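I have been trying to use dplyr's variant of recode together with mutate_all on every variable in a data set, but it does not produce the expected output. The other answers I found do not address this problem (e.g., "Recode and Mutate_all in dplyr").

Here is what I tried: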

library(tidyverse)
library(car)

# Create sample data
df <- data_frame(a = c("Yes","Maybe","No","Yes"), b = c("No","Maybe","Yes","Yes"))

# Using dplyr::recode
df %>% mutate_all(funs(recode(., `1` = "Yes", `0` = "No", `NA` = "Maybe")))

This has no effect on the values:

# A tibble: 4 × 2
      a     b
  <chr> <chr>
1   Yes    No
2 Maybe Maybe
3    No   Yes
4   Yes   Yes

What I want can be reproduced with car::Recode:

# Using car::Recode
df %>% mutate_all(funs(Recode(., "'Yes' = 1; 'No' = 0; 'Maybe' = NA")))

This is the desired result:

# A tibble: 4 × 2
      a     b
  <dbl> <dbl>
1     1     0
2    NA    NA
3     0     1
4     1     1

GGa*_*mba 5

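You reversed the key/value pairs in dplyr::recode: it expects old_value = new_value, so the existing values go on the left. This works for me: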

df %>% mutate_all(funs(recode(., Yes = 1L, No = 0L, Maybe = NA_integer_)))

# A tibble: 4 × 2
      a     b
  <int> <int>
1     1     0
2    NA    NA
3     0     1
4     1     1
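
In newer dplyr releases, where mutate_all() and funs() are superseded or deprecated, the same recoding can probably be written with across(). This is only a minimal sketch, assuming dplyr 1.0.0 or later; the name `recoded` is illustrative and not part of the original answer:

```r
library(dplyr)

# Same sample data as in the question, built with tibble() instead of data_frame()
df <- tibble(a = c("Yes", "Maybe", "No", "Yes"),
             b = c("No", "Maybe", "Yes", "Yes"))

# recode() takes old = new pairs; across(everything(), ...) applies it to every column
recoded <- df %>%
  mutate(across(everything(),
                ~ recode(.x, Yes = 1L, No = 0L, Maybe = NA_integer_)))
```

Because all replacements are integers, both columns of `recoded` come back as integer columns.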

Note that an error is raised if the type of NA is not specified (here NA_integer_, to match the integer replacements).
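
To illustrate the point about typed NA values, here is a small sketch on a plain character vector. The idea is that the NA constant should have the same type as the other replacements; the variable names `x`, `as_int` and `as_dbl` are made up for this example:

```r
library(dplyr)

x <- c("Yes", "Maybe", "No", "Yes")

# Integer replacements are paired with NA_integer_
as_int <- recode(x, Yes = 1L, No = 0L, Maybe = NA_integer_)

# Double replacements are paired with NA_real_
as_dbl <- recode(x, Yes = 1, No = 0, Maybe = NA_real_)

typeof(as_int)
typeof(as_dbl)
```

A bare NA is a logical value in R, which is why it does not match integer or double replacements.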

You can also write the names quoted or unquoted (e.g., both Yes and 'Yes' work).
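
As a quick check of the quoting remark, the sketch below compares the two spellings on a small vector. The names `unquoted` and `quoted` are assumptions introduced only for this illustration:

```r
library(dplyr)

x <- c("Yes", "Maybe", "No", "Yes")

# Names written as bare symbols
unquoted <- recode(x, Yes = 1L, No = 0L, Maybe = NA_integer_)

# The same names written as quoted strings
quoted <- recode(x, "Yes" = 1L, 'No' = 0L, "Maybe" = NA_integer_)

# Both calls should give the same result
identical(unquoted, quoted)
```

Quoting (or backticks) becomes necessary only when the old value is not a syntactically valid R name, such as a value that starts with a digit or contains a space.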