R: Summarise rows into a single row (continuous and factor variables)

yok*_*ota 7 r dplyr

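I am trying to collapse a bunch of rows into a single row per day. I would prefer to do this in dplyr if possible. I know my code is far from correct, but this is how far I have got: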

data %>%
  group_by(DAY) %>%
  summarise_each(funs(Sum = n()), SEX, GROUP, TOTAL)

Original:

DAY SEX GROUP   TOTAL       
7/1/14  FEMALE  A   1       
7/1/14  FEMALE  B   1       
7/1/14  FEMALE  B   1       
7/1/14  FEMALE  A   1       
7/1/14  MALE    A   1       
7/1/14  MALE    B   2       

New (desired result):

DAY     FEMALE  MALE    GROUP_A GROUP_B TOTAL
7/1/14  4       2       3       3       7  

Cat*_*ath 8

Another way, using data.table, tested on a data.frame that contains more than one day.

require(data.table)
setDT(data)[, as.list(c(table(SEX), table(GROUP), TOTAL=sum(TOTAL))), by=DAY]

#      DAY FEMALE MALE A B TOTAL
#1: 7/1/14      3    0 1 2     3
#2: 8/1/14      1    2 2 1     4

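Edit: another, less manual option (you don't need to know which variables are factors and which are numeric), thanks to @jangorecki and @DavidArenburg for their help: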

# Logical masks over the non-grouping columns (the first column, DAY, is dropped)
wh_num  <- sapply(data, is.numeric)[-1]
wh_fact <- sapply(data, is.factor)[-1]
setDT(data)[, as.list(c(lapply(.SD[, wh_fact, with = FALSE], table), 
                        lapply(.SD[, wh_num, with = FALSE], sum), 
                        recursive = TRUE)), by = DAY]

#      DAY SEX.FEMALE SEX.MALE GROUP.A GROUP.B TOTAL
#1: 7/1/14          3        0       1       2     3
#2: 8/1/14          1        2       2       1     4

Data

data <- structure(list(DAY = c("7/1/14", "7/1/14", "7/1/14", "8/1/14", 
"8/1/14", "8/1/14"), SEX = structure(c(1L, 1L, 1L, 1L, 2L, 2L
), .Label = c("FEMALE", "MALE"), class = "factor"), GROUP = structure(c(1L, 
2L, 2L, 1L, 1L, 2L), .Label = c("A", "B"), class = "factor"), 
    TOTAL = c(1L, 1L, 1L, 1L, 1L, 2L)), .Names = c("DAY", "SEX", 
"GROUP", "TOTAL"), row.names = c(NA, -6L), class = "data.frame")


pic*_*ick 5

It may look a bit cryptic, but here is a short incantation:

dat %>% group_by(DAY) %>%
  summarise_each(funs(ifelse(is.numeric(.), sum(.), list(table(.))))) -> res

data.frame(DAY=res$DAY, t(unlist(res[, 2:ncol(res)])))
#      DAY SEX.FEMALE SEX.MALE GROUP.A GROUP.B TOTAL
# 1 7/1/14          4        2       3       3     7

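Here, each column is tabulated with table() if it is not numeric, or summed if it is numeric (the TOTAL column). The table has to be wrapped in a list because summarise_each expects a single value per group. The result is then expanded into a regular data.frame.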
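
The same idea (sum the numeric columns, tabulate the rest, then flatten) can also be sketched in base R, which avoids summarise_each() and funs(), both since deprecated in dplyr. This is only an illustration under assumptions: `dat` is rebuilt here from the question's table, `collapse_group` is a hypothetical helper name that appears in none of the answers, and the non-numeric columns are assumed to be factors so that every day yields the same set of count columns.

```r
# Example data rebuilt from the question's table (assumption: a plain data.frame)
dat <- data.frame(
  DAY   = rep("7/1/14", 6),
  SEX   = factor(c("FEMALE", "FEMALE", "FEMALE", "FEMALE", "MALE", "MALE")),
  GROUP = factor(c("A", "B", "B", "A", "A", "B")),
  TOTAL = c(1L, 1L, 1L, 1L, 1L, 2L)
)

# Hypothetical helper: collapse the rows of one day into a named vector.
# Numeric columns are summed; every other column is tabulated by level.
collapse_group <- function(d) {
  parts <- lapply(d, function(col) if (is.numeric(col)) sum(col) else table(col))
  unlist(parts)  # flattens to names such as SEX.FEMALE, GROUP.A, TOTAL
}

# Split the non-grouping columns by DAY, collapse each piece, bind the rows
pieces <- lapply(split(dat[-1], dat$DAY), collapse_group)
res <- data.frame(DAY = names(pieces), do.call(rbind, pieces), row.names = NULL)
print(res)
```

With the question's data this should give a single row for 7/1/14 with counts of 4 and 2 for SEX, 3 and 3 for GROUP, and a TOTAL of 7. Because split() keeps the factor levels in each piece, a level that is absent on a given day still appears as a zero count; with character columns the pieces could have different lengths and the rbind step would not line up.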