Dynamically referencing columns inside mutate() with dplyr

deg*_*eso 5 r dplyr

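I am trying to use summarise() to create columns whose names are set dynamically. I found that I can easily create dynamic names with the glue syntax "{}" and :=, but I can't figure out how to reference those columns in a subsequent mutate() call.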


From the tips I have read online, the common solution is to use {{varname}}, or to use varname_enq <- enquo(varname) together with !!varname_enq. Unfortunately, neither approach works for me.


So far I have looked at other SO posts as well as the Programming with dplyr guide. I would greatly appreciate any suggestions!


Below is a small example that highlights the problem.

This is the goal:

# A tibble: 3 × 3
  Species    species_sum cumulative_sum
  <fct>            <dbl>          <dbl>
1 setosa            250.           250.
2 versicolor        297.           547.
3 virginica         329.           876.
mycol <- "species_sum"
mycol_enquo <- enquo(mycol)
myothercol <- "cumulative_sum"

# this works, but the cumulative sum isn't dynamic
iris %>%
  group_by(Species) %>%
  summarise("{mycol}" := sum(Sepal.Length)) %>%
  ungroup() %>%
  mutate(cumulative_sum = cumsum(species_sum))

# this works, but the cumsum function still uses a fixed variable name
iris %>%
  group_by(Species) %>%
  summarise("{mycol}" := sum(Sepal.Length)) %>%
  ungroup() %>%
  mutate("{myothercol}" := cumsum(species_sum))

# doesn't work, the new column is all NA
iris %>%
  group_by(Species) %>%
  summarise("{mycol}" := sum(Sepal.Length)) %>%
  ungroup() %>%
  mutate("{myothercol}" := cumsum( "{mycol}" ))

# doesn't work, the new column is all NA
iris %>%
  group_by(Species) %>%
  summarise("{mycol}" := sum(Sepal.Length)) %>%
  ungroup() %>%
  mutate("{myothercol}" := cumsum( {{mycol}} ))

# doesn't work, the new column is all NA
iris %>%
  group_by(Species) %>%
  summarise("{mycol}" := sum(Sepal.Length)) %>%
  ungroup() %>%
  mutate("{myothercol}" := cumsum( !!mycol_enquo ))

小智 3

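Does this work?

When you specify variables with character vectors, the .data pronoun is your friend: .data[[mycol]] looks up the column whose name is stored in the string mycol.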

iris %>% 
  group_by(Species) %>% 
  summarise(!!mycol := sum(Sepal.Length)) %>% 
  ungroup() %>%
  mutate(!!myothercol := cumsum(.data[[mycol]]))
# A tibble: 3 x 3
  Species    species_sum cumulative_sum
  <fct>            <dbl>          <dbl>
1 setosa            250.           250.
2 versicolor        297.           547.
3 virginica         329.           876.
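As a hedged aside on why the last three attempts in the question come back as NA: in each of them cumsum() appears to receive a string ("species_sum", or the literal "{mycol}") rather than the column itself, and a string cannot be summed. Besides .data[[mycol]], another option is to turn the string into a symbol with rlang::sym() and unquote it with !!. A minimal sketch, assuming dplyr is attached and rlang is installed (it is a dplyr dependency); the name result is only for illustration:

```r
library(dplyr)

mycol <- "species_sum"
myothercol <- "cumulative_sum"

result <- iris %>%
  group_by(Species) %>%
  # glue syntax on the left-hand side sets the new column's name
  summarise("{mycol}" := sum(Sepal.Length)) %>%
  # rlang::sym() converts the string to a symbol; !! splices it into the call
  mutate("{myothercol}" := cumsum(!!rlang::sym(mycol)))

result
```

Both forms refer to the column by name. The .data[[mycol]] form only ever matches a column of the data frame, whereas a bare symbol can also resolve to a variable in the calling environment, so .data is arguably the safer default.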

Alternatively, you can make use of across():

iris %>% 
  group_by(Species) %>% 
  summarise(across(Sepal.Length, sum, .names = mycol)) %>%
  ungroup() %>%
  mutate(across(all_of(mycol), cumsum, .names = myothercol))
# A tibble: 3 x 3
  Species    species_sum cumulative_sum
  <fct>            <dbl>          <dbl>
1 setosa            250.           250.
2 versicolor        297.           547.
3 virginica         329.           876.
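If the column names need to be passed around as strings, the same ideas can be wrapped in a function. The sketch below is only an illustration: summarise_with_cumsum() and its argument names are hypothetical and do not come from dplyr. It combines the glue naming from the question with .data[[ ]] for lookups and across(all_of()) for grouping by a string.

```r
library(dplyr)

# Hypothetical helper: every column name arrives as a string.
summarise_with_cumsum <- function(data, group_col, value_col, sum_name, cum_name) {
  data %>%
    # all_of() selects the grouping column named by the string
    group_by(across(all_of(group_col))) %>%
    # "{sum_name}" names the new column; .data[[value_col]] reads the input column
    summarise("{sum_name}" := sum(.data[[value_col]])) %>%
    # the freshly created column is referenced by its string name as well
    mutate("{cum_name}" := cumsum(.data[[sum_name]]))
}

res <- summarise_with_cumsum(
  iris,
  group_col = "Species",
  value_col = "Sepal.Length",
  sum_name  = "species_sum",
  cum_name  = "cumulative_sum"
)
res
```

With a single grouping column, summarise() drops the grouping, so no explicit ungroup() is needed before the cumulative sum here. With several grouping columns an ungroup() would be needed to accumulate across all rows.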