Cur*_* G. 0 r function dataframe dplyr
I have this dataset:

    df <- data.frame(
      raca  = c("Nel", "Nel", "Nel", "Nel", "Angus", "Angus", "Angus", "Angus"),
      marmo = c(350, 320, 330, 400, 800, 820, 450, NA)
    )

I want to compute descriptive statistics. I used this code:

    library(dplyr)
    library(tidyr)

    df %>%
      group_by(raca) %>%
      dplyr::summarise(across(1, ~data.frame(
        Média = round(mean(., na.rm = TRUE, digits = 2), digits = 2),
        N = length(.),
        DP = round(sd(., na.rm = TRUE), digits = 2),
        Min = min(., na.rm = TRUE),
        Max = max(., na.rm = TRUE),
        `Coef Variação` = round(sd(., na.rm = TRUE) / mean(., na.rm = TRUE) * 100, digits = 2)
      ))) %>%
      pivot_longer(-raca) %>%
      arrange(name, raca)

This works well. But I want to wrap it in a function, so I tried this code:

    desc_function <- function(a, b, c) {
      a %>%
        group_by(a[, b]) %>%
        dplyr::summarise(across(a[, c], ~data.frame(
          Média = round(mean(., na.rm = TRUE, digits = 2), digits = 2),
          N = length(.),
          DP = round(sd(., na.rm = TRUE), digits = 2),
          Min = min(., na.rm = TRUE),
          Max = max(., na.rm = TRUE),
          `Coef Variação` = round(sd(., na.rm = TRUE) / mean(., na.rm = TRUE) * 100, digits = 2)
        ))) %>%
        pivot_longer(a[, b]) %>%
        arrange(name, a[, b])
    }

    desc_function(df, "raca", "marmo")

But this error occurred:

    Error: Problem with summarise() input ..1.
    i ..1 = across(...).
    x Selections can't have missing values.
    i The error occurred in group 1: a[, b] = "Angus".
    Run rlang::last_error() to see where the error occurred.

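I agree with shafee: it is worth reading up on programming with dplyr, because it works a little differently from ordinary R code.
Here is how you can do it (a direct adaptation of your code):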

    desc_function <- function(a, b, c) {
      a %>%
        group_by(.data[[b]]) %>%
        # all_of() selects the column named by the string; recent tidyselect
        # versions deprecate .data inside selections such as across()
        dplyr::summarise(across(all_of(c), ~data.frame(
          Média = round(mean(., na.rm = TRUE), digits = 2),
          N = length(.),
          DP = round(sd(., na.rm = TRUE), digits = 2),
          Min = min(., na.rm = TRUE),
          Max = max(., na.rm = TRUE),
          `Coef Variação` = round(sd(., na.rm = TRUE) / mean(., na.rm = TRUE) * 100, digits = 2)
        ))) %>%
        pivot_longer(-all_of(b)) %>%
        arrange(name, .data[[b]])
    }

    desc_function(df, "raca", "marmo")

Note the use of .data[[b]] to refer, inside the function, to a column whose name is passed in as a string.
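To show the string-based pattern in isolation, here is a minimal sketch that is not part of the original answer. The helper name mean_by_chr and the output column avg are hypothetical, and the example assumes dplyr is installed and reuses the dataset from the question.

```r
library(dplyr)

df <- data.frame(
  raca  = c("Nel", "Nel", "Nel", "Nel", "Angus", "Angus", "Angus", "Angus"),
  marmo = c(350, 320, 330, 400, 800, 820, 450, NA)
)

# Hypothetical helper: group_col and value_col are column names given as strings.
mean_by_chr <- function(data, group_col, value_col) {
  data %>%
    # .data[[...]] looks the string up as a column of the current data frame
    group_by(.data[[group_col]]) %>%
    summarise(
      avg = mean(.data[[value_col]], na.rm = TRUE),
      .groups = "drop"
    )
}

res_chr <- mean_by_chr(df, "raca", "marmo")
print(res_chr)
```

With this data the result should have one row per breed, with an average of 690 for Angus (the NA is dropped) and 350 for Nel.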
Alternatively, pass the variables unquoted (not as strings), like this:

    desc_function <- function(a, b, c) {
      a %>%
        group_by({{ b }}) %>%
        dplyr::summarise(across({{ c }}, ~data.frame(
          Média = round(mean(., na.rm = TRUE), digits = 2),
          N = length(.),
          DP = round(sd(., na.rm = TRUE), digits = 2),
          Min = min(., na.rm = TRUE),
          Max = max(., na.rm = TRUE),
          `Coef Variação` = round(sd(., na.rm = TRUE) / mean(., na.rm = TRUE) * 100, digits = 2)
        ))) %>%
        pivot_longer(-{{ b }}) %>%
        arrange(name, {{ b }})
    }

    desc_function(df, raca, marmo)

This time the arguments are wrapped in {{ }}, as in {{b}}, wherever they are used.
As mentioned above, all of this is documented at https://dplyr.tidyverse.org/articles/programming.html
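As a smaller illustration of the embracing approach, the sketch below reduces the idea to a single statistic. It is not taken from the answer: mean_by and the output column avg are hypothetical names, and dplyr is assumed to be installed.

```r
library(dplyr)

df <- data.frame(
  raca  = c("Nel", "Nel", "Nel", "Nel", "Angus", "Angus", "Angus", "Angus"),
  marmo = c(350, 320, 330, 400, 800, 820, 450, NA)
)

# Hypothetical helper: group_col and value_col are passed as bare column names.
mean_by <- function(data, group_col, value_col) {
  data %>%
    # {{ }} forwards the caller's unquoted column name into the dplyr verb
    group_by({{ group_col }}) %>%
    summarise(
      avg = mean({{ value_col }}, na.rm = TRUE),
      .groups = "drop"
    )
}

res_embrace <- mean_by(df, raca, marmo)
print(res_embrace)
```

The call site then reads like ordinary dplyr code, mean_by(df, raca, marmo), at the cost that the helper can no longer take column names stored in character variables without extra work.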