Mos*_*ses 1 r tidyverse mutate across
I am trying to pass two functions inside mutate(across(where(is.factor))) to order the factor levels by frequency and drop unused levels. The code does not work as expected. Where might I have gone wrong?
#---- Libraries ----
library(tidyverse)
#---- Data ----
set.seed(2021)
df <- tibble(
  a1 = factor(ifelse(sign(rnorm(30)) == -1, 0, 1), labels = c("No", "Yes")),
  a2 = factor(ifelse(sign(rnorm(30)) == -1, 0, 1), labels = c("No", "Yes")),
  gender = gl(2, 15, labels = c("Males", "Females")),
  b2 = gl(3, 10, labels = c("Primary", "Secondary", "Tertiary", "Unknown")),
  c1 = gl(3, 10, labels = c("15-19", "20-24", "25-30", "30-35")),
  outcome = factor(ifelse(sign(rnorm(30)) == -1, 0, 1), labels = c("No", "Yes")),
  weight = runif(30, 1, 12)
)
#---- Problem ----
df <- df %>%
  mutate(across(where(is.factor), list(fct_infreq, fct_drop)))
levels(df$b2)
# The unused levels are not dropped
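The problem is that passing a list of functions to across() creates new columns instead of overwriting the originals. The resulting data frame therefore contains two extra columns, b2_1 and b2_2, one for each of the two functions, while b2 itself is left unchanged.
If you run levels(df$b2_2) you will see that the unused level has been dropped there.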
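To see the naming behaviour in isolation, here is a minimal sketch on a small made-up tibble (toy and res are hypothetical names, not part of the question's data). It assumes the default .names behaviour of across(), where unnamed functions in a list are numbered by position.

```r
library(tidyverse)

# Hypothetical toy data: one factor with an unused level ("Unknown")
toy <- tibble(
  b2 = gl(3, 2, labels = c("Primary", "Secondary", "Tertiary", "Unknown"))
)

res <- toy %>%
  mutate(across(where(is.factor), list(fct_infreq, fct_drop)))

names(res)        # the original column plus one new column per function
levels(res$b2)    # unchanged, still contains "Unknown"
levels(res$b2_2)  # output of fct_drop(), "Unknown" removed
```

Neither function's result is written back to b2, which is why levels(df$b2) in the question still shows the unused level.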
If your goal is to drop the unused levels first and then reorder, you need to run two consecutive mutates:
df <- df %>%
  mutate(across(where(is.factor), fct_drop)) %>%
  mutate(across(where(is.factor), fct_infreq))
Or nest the two functions in a single call inside your mutate:
df <- df %>%
  mutate(across(where(is.factor), ~ fct_infreq(fct_drop(.x))))
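As a quick check of the nested form, the sketch below applies it to a small made-up factor with unequal level counts (toy and fixed are hypothetical names used only for illustration). Because a single function is supplied, across() overwrites the column in place rather than adding suffixed copies.

```r
library(tidyverse)

# Hypothetical toy data: "Tertiary" x3, "Secondary" x2, "Primary" x1,
# plus an unused level "Unknown"
toy <- tibble(
  b2 = factor(
    c("Primary", "Tertiary", "Tertiary", "Tertiary", "Secondary", "Secondary"),
    levels = c("Primary", "Secondary", "Tertiary", "Unknown")
  )
)

fixed <- toy %>%
  mutate(across(where(is.factor), ~ fct_infreq(fct_drop(.x))))

names(fixed)      # "b2" only, no b2_1 / b2_2
levels(fixed$b2)  # "Tertiary" "Secondary" "Primary"
```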