Error in an R function based on dataset information

Cur*_* G. 0 r function dataframe dplyr

I have this dataset:

    df <- data.frame(raca = c("Nel", "Nel", "Nel", "Nel", "Angus", "Angus", "Angus", "Angus"),
                     marmo = c(350, 320, 330, 400, 800, 820, 450, NA))

I want to compute descriptive statistics. I used this code:

    df %>%
      group_by(raca) %>%
      dplyr::summarise(across(1, ~data.frame(Média = round(mean(., na.rm=TRUE, digits=2), digits = 2),
                                             N = length(.),
                                             DP = round(sd(., na.rm=TRUE), digits = 2),
                                             Min = min(., na.rm=TRUE),
                                             Max = max(., na.rm=TRUE),
                                             `Coef Variação` = round(sd(., na.rm=TRUE)/mean(., na.rm=TRUE)*100, digits=2)))) %>%
      pivot_longer(-raca) %>% arrange(name, raca)

This works well. But I want to wrap it in a function, so I tried this code:

    desc_function <- function(a, b, c) { a %>%
        group_by(a[,b]) %>%
        dplyr::summarise(across(a[,c], ~data.frame(Média = round(mean(., na.rm=TRUE, digits=2), digits = 2),
                                                   N = length(.),
                                                   DP = round(sd(., na.rm=TRUE), digits = 2),
                                                   Min = min(., na.rm=TRUE),
                                                   Max = max(., na.rm=TRUE),
                                                   `Coef Variação` = round(sd(., na.rm=TRUE)/mean(., na.rm=TRUE)*100, digits=2)))) %>%
        pivot_longer(a[,b]) %>% arrange(name, a[,b])}


    desc_function(df, "raca", "marmo")

But I get this error:

    Error: Problem with summarise() input ..1.
    i ..1 = across(...).
    x Selections can't have missing values.
    i The error occurred in group 1: a[, b] = "Angus".
    Run rlang::last_error() to see where the error occurred.

Qui*_*c22 5

I agree with shafee: programming with dplyr works slightly differently from interactive use, so it is worth reading up on.
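One plausible reading of the error, offered as an assumption rather than something stated in the question: inside the function, `a[, c]` is ordinary base R subsetting, so it evaluates to the values of the column instead of naming it. `across()` expects a column selection, and a numeric vector is presumably interpreted as column positions, where the `NA` in `marmo` would trigger "Selections can't have missing values". The base R part can be checked directly:

```r
# Same data as in the question
df <- data.frame(
  raca  = c("Nel", "Nel", "Nel", "Nel", "Angus", "Angus", "Angus", "Angus"),
  marmo = c(350, 320, 330, 400, 800, 820, 450, NA)
)

# Base R subsetting with a column name returns the column's values,
# not a reference to the column
x <- df[, "marmo"]

is.numeric(x)  # the result is a plain numeric vector
anyNA(x)       # and it contains the missing value from the data
```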


Here is how you would do it (adapting your code directly):

    desc_function <- function(a, b, c) { a %>%
        group_by(.data[[b]]) %>%
        dplyr::summarise(across(.data[[c]], ~data.frame(Média = round(mean(., na.rm=TRUE, digits=2), digits = 2),
                                                        N = length(.),
                                                        DP = round(sd(., na.rm=TRUE), digits = 2),
                                                        Min = min(., na.rm=TRUE),
                                                        Max = max(., na.rm=TRUE),
                                                        `Coef Variação` = round(sd(., na.rm=TRUE)/mean(., na.rm=TRUE)*100, digits=2)))) %>%
        pivot_longer(-.data[[b]]) %>% arrange(name, .data[[b]])}


    desc_function(df, "raca", "marmo")

Note the use of .data[[b]] to refer to a column whose name is passed into the function as a string.
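As a hedged side note on the string-based approach: `across()` and `pivot_longer()` take tidyselect selections, and recent tidyselect releases (1.2.0 onward, as far as I know) deprecate `.data` inside selections in favour of `all_of()`. The sketch below is a simplified variant, not the exact code above; the helper name `desc_by_name`, its argument names, and the reduced set of statistics are assumptions made for illustration.

```r
library(dplyr)

df <- data.frame(
  raca  = c("Nel", "Nel", "Nel", "Nel", "Angus", "Angus", "Angus", "Angus"),
  marmo = c(350, 320, 330, 400, 800, 820, 450, NA)
)

# Hypothetical helper: column names arrive as strings
desc_by_name <- function(data, group_col, value_col) {
  data %>%
    # all_of() turns a character vector into a column selection
    group_by(across(all_of(group_col))) %>%
    summarise(
      across(
        all_of(value_col),
        list(
          mean = ~ round(mean(.x, na.rm = TRUE), 2),
          n    = ~ length(.x),                      # counts rows, NA included
          sd   = ~ round(sd(.x, na.rm = TRUE), 2)
        )
      ),
      .groups = "drop"
    )
}

res_by_name <- desc_by_name(df, "raca", "marmo")
res_by_name
```

With a named list of functions, `across()` names the output columns `{column}_{function}`, so the result here has `marmo_mean`, `marmo_n` and `marmo_sd` next to `raca`.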


Alternatively, pass the variables unquoted (not as strings), like this:

    desc_function <- function(a, b, c) { a %>%
        group_by({{b}}) %>%
        dplyr::summarise(across({{c}}, ~data.frame(Média = round(mean(., na.rm=TRUE, digits=2), digits = 2),
                                                   N = length(.),
                                                   DP = round(sd(., na.rm=TRUE), digits = 2),
                                                   Min = min(., na.rm=TRUE),
                                                   Max = max(., na.rm=TRUE),
                                                   `Coef Variação` = round(sd(., na.rm=TRUE)/mean(., na.rm=TRUE)*100, digits=2)))) %>%
        pivot_longer(-{{b}}) %>% arrange(name, {{b}})}


    desc_function(df, raca, marmo)

This time the function uses {{b}} and {{c}} (embracing), so the caller passes bare column names.
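To isolate what the embrace operator does, here is a minimal sketch that uses `{{ }}` only in data-masking verbs (`group_by()` and `summarise()`). The function name `desc_embrace`, its argument names, and the English column names are hypothetical and chosen for this example only.

```r
library(dplyr)

df <- data.frame(
  raca  = c("Nel", "Nel", "Nel", "Nel", "Angus", "Angus", "Angus", "Angus"),
  marmo = c(350, 320, 330, 400, 800, 820, 450, NA)
)

# Hypothetical helper: columns are passed unquoted and forwarded with {{ }}
desc_embrace <- function(data, group_col, value_col) {
  data %>%
    group_by({{ group_col }}) %>%
    summarise(
      mean = round(mean({{ value_col }}, na.rm = TRUE), 2),
      n    = n(),                                   # rows per group, NA included
      sd   = round(sd({{ value_col }}, na.rm = TRUE), 2),
      cv   = round(sd({{ value_col }}, na.rm = TRUE) /
                     mean({{ value_col }}, na.rm = TRUE) * 100, 2),
      .groups = "drop"
    )
}

res_embrace <- desc_embrace(df, raca, marmo)
res_embrace
```

The call site reads like ordinary dplyr code, which is the main reason to prefer this form when the column names are known while writing the code rather than supplied as strings at run time.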


As mentioned above, all of this is documented at https://dplyr.tidyverse.org/articles/programming.html
