Man*_*asi 5 r function plyr lapply dplyr
Below is a sample data frame I created, along with the expected output.
library(dplyr)

df = data.frame(color = c("Yellow", "Blue", "Green", "Red", "Magenta"),
                values = c(24, 24, 34, 45, 49),
                Quarter = c("Period1", "Period2", "Period3", "Period3", "Period1"),
                Market = c("Camden", "StreetA", "DansFireplace", "StreetA", "DansFireplace"))

# Sum of values per Quarter
dfXQuarter = df %>% group_by(Quarter) %>% summarise(values = sum(values)) %>%
  mutate(cut = "Quarter") %>% data.frame()
colnames(dfXQuarter)[1] = "Grouping"

# Sum of values per Market
dfXMarket = df %>% group_by(Market) %>% summarise(values = sum(values)) %>%
  mutate(cut = "Market") %>% data.frame()
colnames(dfXMarket)[1] = "Grouping"

# Stack both summaries into the expected output
df_all = rbind(dfXQuarter, dfXMarket)
Now, for brevity, I would like to wrap this in a function and use lapply. Here is my attempt:
list = c("Market", "Quarter")
df_all <- do.call(rbind, lapply(list, function(x){
df_l= df %>% group_by(x) %>%
summarise(values = sum(values)) %>%
mutate(cut= x) %>%
data.frame()
colnames(df_l)[df_l$x] = "Grouping"
df_l
}))
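This code gives me an error.
I need the output to be an exact copy of the "df_all" output above so that I can do further operations on it.
How do I write this function correctly?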
We can use purrr::map_dfr:
library(dplyr)
library(purrr)

# Don't use the names of built-in R types/functions (e.g. list) as variable names
lst <- c("Market", "Quarter")

# Use map() instead if you need the output as a list
map_dfr(lst, ~ df %>%
  group_by("Grouping" = !!sym(.x)) %>%
  summarise(values = sum(values)) %>%
  mutate(cut = .x) %>%
  # Convert factors to character to avoid the warning message from bind_rows
  mutate_if(is.factor, as.character))
# A tibble: 6 x 3
Grouping values cut
<chr> <dbl> <chr>
1 Camden 24 Market
2 DansFireplace 83 Market
3 StreetA 69 Market
4 Period1 73 Quarter
5 Period2 24 Quarter
6 Period3 79 Quarter
We can also fix the first solution as follows:
Change group_by(x) to group_by_at(x), because x is a string here. Then use colnames(df_l)[colnames(df_l)==x] <- "Grouping" to rename the grouping variable.
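Putting those two fixes together, the lapply attempt might look like the sketch below. The name group_cols is only an illustrative replacement for list, and "Quarter" is listed first on the assumption that the rows should come out in the same order as in the original df_all.

```r
library(dplyr)

df <- data.frame(
  color   = c("Yellow", "Blue", "Green", "Red", "Magenta"),
  values  = c(24, 24, 34, 45, 49),
  Quarter = c("Period1", "Period2", "Period3", "Period3", "Period1"),
  Market  = c("Camden", "StreetA", "DansFireplace", "StreetA", "DansFireplace")
)

# Hypothetical name; "Quarter" comes first to mirror rbind(dfXQuarter, dfXMarket)
group_cols <- c("Quarter", "Market")

df_all <- do.call(rbind, lapply(group_cols, function(x) {
  df_l <- df %>%
    group_by_at(x) %>%                  # x is a string, so group_by_at() instead of group_by()
    summarise(values = sum(values)) %>%
    mutate(cut = x) %>%
    data.frame()
  # Rename the grouping column by matching its name, not by indexing with df_l$x
  colnames(df_l)[colnames(df_l) == x] <- "Grouping"
  df_l
}))

print(df_all)
```

In recent dplyr releases group_by_at() is marked as superseded; group_by(across(all_of(x))) should behave the same way here if you prefer the newer idiom.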