wns*_*mth 18 r dynamic names dataset dplyr
I have a dataset with the following structure:
Classes ‘tbl_df’ and 'data.frame': 10 obs. of 7 variables:
$ GdeName : chr "Aeugst am Albis" "Aeugst am Albis" "Aeugst am Albis" "Aeugst am Albis" ...
$ Partei : chr "BDP" "CSP" "CVP" "EDU" ...
$ Stand1971: num NA NA 4.91 NA 3.21 ...
$ Stand1975: num NA NA 5.389 0.438 4.536 ...
$ Stand1979: num NA NA 6.2774 0.0195 3.4355 ...
$ Stand1983: num NA NA 4.66 1.41 3.76 ...
$ Stand1987: num NA NA 3.48 1.65 5.75 ...
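I would like to provide a function that computes the difference between any two of these value columns, and I would like to do it with dplyr's mutate function, like this (assuming from and to are passed in as arguments):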
from <- "Stand1971"
to <- "Stand1987"
data %>%
mutate(diff = from - to)
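Of course, this doesn't work, because dplyr uses non-standard evaluation: from and to are looked up as the character strings themselves, not as the columns they name. I know there is now an elegant solution to this problem using mutate_, and I have read the vignette, but I still can't get my head around it.
What should I do?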
Here are the first few rows of the dataset, for a reproducible example:
structure(list(GdeName = c("Aeugst am Albis", "Aeugst am Albis",
"Aeugst am Albis", "Aeugst am Albis", "Aeugst am Albis", "Aeugst am Albis",
"Aeugst am Albis", "Aeugst am Albis", "Aeugst am Albis", "Aeugst am Albis"
), Partei = c("BDP", "CSP", "CVP", "EDU", "EVP", "FDP", "FGA",
"FPS", "GLP", "GPS"), Stand1971 = c(NA, NA, 4.907306434, NA,
3.2109535926, 18.272143463, NA, NA, NA, NA), Stand1975 = c(NA,
NA, 5.389079711, 0.4382328556, 4.5363022622, 18.749259742, NA,
NA, NA, NA), Stand1979 = c(NA, NA, 6.2773722628, 0.0194647202,
3.4355231144, 25.294403893, NA, NA, NA, 2.7055961071), Stand1983 = c(NA,
NA, 4.6609804428, 1.412940467, 3.7563539244, 26.277246489, 0.8529335746,
NA, NA, 2.601878177), Stand1987 = c(NA, NA, 3.4767860929, 1.6535933856,
5.7451770193, 22.146844746, NA, 3.7453183521, NA, 13.702211858
)), .Names = c("GdeName", "Partei", "Stand1971", "Stand1975",
"Stand1979", "Stand1983", "Stand1987"), class = c("tbl_df", "data.frame"
), row.names = c(NA, -10L))
MrF*_*ick 22
With recent versions of dplyr (>= 0.7), you can use rlang's !! (bang-bang) operator.
library(tidyverse)
from <- "Stand1971"
to <- "Stand1987"
data %>%
mutate(diff=(!!as.name(from))-(!!as.name(to)))
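You just need to convert the strings to names with as.name and then unquote them into the expression. Unfortunately, I seem to need more parentheses than I would like, but the !! operator seems to fall at an odd place in the order of operations.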
Original answer, dplyr (0.3 to < 0.7):
From that vignette (vignette("nse","dplyr")), use lazyeval's interp() function:
library(lazyeval)
from <- "Stand1971"
to <- "Stand1987"
data %>%
mutate_(diff=interp(~from - to, from=as.name(from), to=as.name(to)))
You can now use the .data pronoun inside a dplyr chain.
library(dplyr)
from <- "Stand1971"
to <- "Stand1987"
data %>% mutate(diff = .data[[from]] - .data[[to]])
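Since the question asks for a function, here is a minimal sketch of how the .data approach could be wrapped in one. The helper name add_diff, its argument names, and the small toy data frame are made up for illustration and are not part of the original question or answers.

```r
library(dplyr)

# Hypothetical helper (name and signature are illustrative assumptions):
# `from` and `to` are column names passed as character strings.
add_diff <- function(df, from, to) {
  df %>% mutate(diff = .data[[from]] - .data[[to]])
}

# Made-up toy data that follows the same column naming scheme
toy <- data.frame(
  Partei    = c("CVP", "EVP"),
  Stand1971 = c(4.9, 3.2),
  Stand1987 = c(3.5, 5.7)
)

result <- add_diff(toy, "Stand1971", "Stand1987")
print(result)
```

Because .data[[from]] takes a string, the caller does not need to know anything about quoting or unquoting.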
Another option is to use sym together with bang-bang (!!):
data %>% mutate(diff = !!sym(from) - !!sym(to))
In base R, we can use:
data$diff <- data[[from]] - data[[to]]
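The same base R idea can be put into a function, which is what the question was after. This is only a sketch: add_diff_base and the toy data frame are hypothetical names and values chosen for illustration.

```r
# Hypothetical base R helper (illustrative name): `from` and `to` are
# column names given as character strings.
add_diff_base <- function(df, from, to) {
  df$diff <- df[[from]] - df[[to]]
  df
}

# Made-up toy data; the NA shows that missing values propagate to diff
toy <- data.frame(
  Partei    = c("BDP", "CVP"),
  Stand1971 = c(NA, 4.9),
  Stand1987 = c(NA, 3.5)
)

result_base <- add_diff_base(toy, "Stand1971", "Stand1987")
print(result_base)
```

The function modifies only its local copy of df, so the caller's data frame is left unchanged unless the result is assigned back.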