Mapping over column names with tidy evaluation

Tar*_*haw 2 r purrr tidyverse tidyeval

I wrote the following function:

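    # Requires dplyr, lubridate (for year()) and janitor (for clean_names())
    emp_term_var <- function(data, colName, year = "2015") {
      # Terminations by year and variable in df
      colName <- enquo(colName)  # capture the unquoted column name
      term_test <- data %>%
        filter(year(DateofTermination) == year) %>%
        group_by(!!colName) %>%  # !! is the current spelling of UQ()
        count(!!colName) %>%
        clean_names()
      return(term_test)
    }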

I have a df with several columns, such as Department, State, Position, and so on. When I use the function I wrote, I pass the column name unquoted, like this:

    emp_term_var(data = df, colName = Department, year = "2015")

This returns:

    # A tibble: 5 x 2
    # Groups:   department [5]
      department               n
      <chr>                <int>
    1 Admin Offices            1
    2 IT/IS                    4
    3 Production              15
    4 Sales                    1
    5 Software Engineering     2

How can I map over multiple columns? If I try

    columns <- c(Department, State)

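R won't let me, because it treats Department and State as objects rather than column names. How can I tell R that these are column names to be stored in the object columns, so that I can pass it to a map call of the following form?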

    map(columns, ~ emp_term_var(df, colName = .x, year = "2015"))

Lio*_*nry 6

Another solution is to keep the function unchanged, but change how you call it inside map():

    columns <- c("Department", "State")

    # Iterate over the character vector columns (not the base function colnames)
    map(columns, ~ emp_term_var(df, colName = .data[[.x]], year = "2015"))

Note how we pass colName = .data[[.x]] instead of colName = .x. The .data pronoun looks up the column whose name is stored as a string in .x.
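To try the pattern without the original data, here is a minimal sketch under stated assumptions: toy is a made-up data frame, and count_by() is a hypothetical, simplified stand-in for emp_term_var() that keeps the enquo()/!! pattern but drops the date filter and clean_names(). Neither name comes from the question.

```r
library(dplyr)
library(purrr)

# Hypothetical toy data standing in for the asker's df (values are made up).
toy <- tibble(
  Department = c("IT/IS", "IT/IS", "Sales", "Production"),
  State      = c("MA", "MA", "CT", "MA")
)

# Hypothetical simplified stand-in for emp_term_var(): same enquo()/!!
# pattern, without the date filter or clean_names().
count_by <- function(data, colName) {
  colName <- enquo(colName)
  data %>% count(!!colName)
}

columns <- c("Department", "State")

# .data[[.x]] refers to the column whose name is the string held in .x,
# so the function still receives a column expression rather than a string.
results <- map(columns, ~ count_by(toy, colName = .data[[.x]]))
```

results is a list with one count table per entry in columns, in the same order.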

You can also do this in other contexts, for example in a for loop:

    for (col in columns) {
      print(
        emp_term_var(df, colName = .data[[col]], year = "2015")
      )
    }
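For comparison, a different approach that also leaves the function unchanged is to turn each string into a symbol with rlang::sym() and unquote it with !! at the call site. This is only a sketch and not part of the answer above: toy and count_by() are hypothetical stand-ins (made-up data and a simplified version of emp_term_var()), used so the example runs on its own.

```r
library(dplyr)
library(purrr)

# Hypothetical toy data standing in for the asker's df (values are made up).
toy <- tibble(
  Department = c("IT/IS", "IT/IS", "Sales", "Production"),
  State      = c("MA", "MA", "CT", "MA")
)

# Hypothetical simplified stand-in for emp_term_var().
count_by <- function(data, colName) {
  colName <- enquo(colName)
  data %>% count(!!colName)
}

columns <- c("Department", "State")

# rlang::sym() converts the string to a symbol; !! unquotes it, so enquo()
# inside count_by() captures a bare column name such as Department.
results_sym <- map(columns, ~ count_by(toy, colName = !!rlang::sym(.x)))
```

Because the function receives a plain symbol here, each result column carries the original column name, which can be convenient when the tables are combined later.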