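I recently stumbled upon some strange behaviour in dplyr, and I would be glad if someone could provide some insight.
Assume I have data in which some columns contain numeric values. In a simple scenario, I want to compute `rowSums` over them. There are many ways to do this, but here are two examples: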
```r
library(dplyr)  # attaches %>% as well as select() and mutate()

# ids needs 10 entries to match the 10 rows of the matrix
df <- data.frame(matrix(rnorm(20), 10, 2),
                 ids = paste("i", 1:10, sep = ""),
                 stringsAsFactors = FALSE)

# Example 1: piping into rowSums()
# works
dplyr::select(df, -ids) %>% {rowSums(.)}

# does not work
# Error: invalid argument to unary operator
df %>%
  dplyr::mutate(blubb = dplyr::select(df, -ids) %>% {rowSums(.)})

# does not work
# Error: invalid argument to unary operator
df %>%
  dplyr::mutate(blubb = dplyr::select(., -ids) %>% {rowSums(.)})

# workaround: compute the sums outside of mutate()
tmp <- dplyr::select(df, -ids) %>% {rowSums(.)}
df %>%
  dplyr::mutate(blubb = tmp)

# Example 2: calling rowSums() directly
# works
rowSums(dplyr::select(df, -ids))

# does not work
# Error: invalid argument to unary operator
df %>%
  dplyr::mutate(blubb = rowSums(dplyr::select(df, -ids)))

# workaround: compute the sums outside of mutate()
tmp <- rowSums(dplyr::select(df, -ids))
df %>%
  dplyr::mutate(blubb = tmp)
```
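First, I do not really understand what causes the error. Second, I would like to know how to compute something over a subset of columns in a tidy way.
Edit
The question "mutate and rowSums exclude columns" is related, but it focuses on how to do the calculation with `rowSums`. Here I want to understand why the examples above do not work. This is not about how to solve the problem (see the workarounds) but about understanding what happens when the naive approach is applied.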
Wei*_*ong 16
Your examples do not work because you are nesting `select` inside `mutate` and using bare variable names. In this case, `select` is trying to do something like
```
> -df$ids
Error in -df$ids : invalid argument to unary operator
```
which fails because you cannot negate a character string (that is, `-"i1"` or `-"i2"` makes no sense). Any of the formulations below works:
```r
df %>% mutate(blubb = rowSums(select_(., "X1", "X2")))
df %>% mutate(blubb = rowSums(select(., -3)))
```
or
```r
df %>% mutate(blubb = rowSums(select_(., "-ids")))
```
as suggested by @Haboryme.
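The error message quoted above comes from base R rather than from dplyr itself, so it can be reproduced in isolation. The following is a minimal sketch on a small hypothetical data frame (the values are made up for illustration and are not the question's data). It shows that negating the character column fails, while negating its position, which is what `select(., -3)` relies on, is fine.

```r
# Hypothetical toy data with the same column layout as in the question.
df <- data.frame(X1 = c(1, 2, 3),
                 X2 = c(4, 5, 6),
                 ids = c("i1", "i2", "i3"),
                 stringsAsFactors = FALSE)

# Unary minus on a character vector raises an error in base R.
# tryCatch() captures the message instead of stopping the script.
msg <- tryCatch(-df$ids, error = function(e) conditionMessage(e))
print(msg)

# Negating the column *position* is valid: it drops that column.
keep <- -match("ids", names(df))
row_totals <- rowSums(df[, keep])
print(row_totals)
```

Base R indexing is used here only to keep the sketch free of package dependencies; the same distinction between a column's values and its position is what separates the failing and the working `select` calls.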
`select_` is deprecated. You can use:
```r
library(dplyr)

# ids needs 10 entries to match the 10 rows of the matrix
df <- data.frame(matrix(rnorm(20), 10, 2),
                 ids = paste("i", 1:10, sep = ""),
                 stringsAsFactors = FALSE)

# select() has no .dots argument; pass the column names through all_of()
df %>%
  mutate(blubb = rowSums(select(., all_of(c("X1", "X2")))))

# Or more generally, with the names stored in a character vector:
desired_columns <- c("X1", "X2")
df %>%
  mutate(blubb = rowSums(select(., all_of(desired_columns))))
```
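For the second part of the question (a tidy way to compute over a subset of columns), another option is to let dplyr resolve the selection against the data being mutated. The sketch below assumes dplyr 1.1.0 or later, where `pick()` is available, and uses a small hypothetical data frame rather than the question's random data. It is one possible formulation, not the only one.

```r
library(dplyr)

# Hypothetical toy data with the same column layout as in the question.
df <- data.frame(X1 = c(1, 2, 3),
                 X2 = c(4, 5, 6),
                 ids = c("i1", "i2", "i3"),
                 stringsAsFactors = FALSE)

# pick() applies tidyselect rules to the current data, so -ids is read
# as "every column except ids" and never evaluated as arithmetic.
res <- df %>%
  mutate(blubb = rowSums(pick(-ids)))

print(res)
```

Because the selection happens inside `pick()`, no `.` placeholder or nested `select()` call is needed, and the `ids` column stays in the result.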