How do I flatten an R data frame that contains lists?

Tim*_*Tim 14 r

I'd like to find the best "R way" to flatten a data frame that looks like this:

  CAT    COUNT     TREAT
   A     1,2,3     Treat-a, Treat-b
   B     4,5       Treat-c,Treat-d,Treat-e

so that it is structured like this:

   CAT   COUNT1  COUNT2 COUNT3  TREAT1   TREAT2   TREAT3
    A    1       2      3       Treat-a  Treat-b  NA 
    B    4       5      NA      Treat-c  Treat-d  Treat-e 

Sample code to generate the source data frame:

df <- data.frame(CAT = c("A", "B"))
df$COUNT <- list(1:3, 4:5)
df$TREAT <- list(paste("Treat-", letters[1:2], sep = ""), paste("Treat-", letters[3:5], sep = ""))

I believe I need some combination of rbind and unlist? Any help would be greatly appreciated. - Tim

Her*_*oka 10

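Here is a solution using base R. It accepts vectors of arbitrary length inside the lists and does not require specifying which columns of the data frame to collapse. Part of the solution was built using another answer.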

df2 <- do.call(cbind, lapply(df, function(x) {
  # check if it is a list, otherwise just return as is
  if (is.list(x)) {
    return(data.frame(t(sapply(x, '[', seq(max(sapply(x, length)))))))
  } else {
    return(x)
  }
}))
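
The padding in the code above relies on a base R indexing rule: asking for positions past the end of a vector returns NA. A minimal sketch of just that step, using the sample values from the question (the names `counts`, `n`, and `padded` are illustrative and not part of the answer):

```r
# Sample list column from the question
counts <- list(1:3, 4:5)

# Length of the longest element; every element is padded up to this
n <- max(sapply(counts, length))

# Indexing beyond the end of a vector yields NA, so `[` pads short elements.
# sapply() returns one column per list element, so t() turns them into rows.
padded <- t(sapply(counts, `[`, seq_len(n)))
padded
```

This sketch assumes at least one element is longer than 1; if every element has length 1, `sapply()` simplifies to a plain vector and the transpose no longer gives one row per element.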

As of R 3.2 there is also lengths(), which can replace sapply(x, length):

df3 <- do.call(cbind.data.frame, lapply(df, function(x) {
  # check if it is a list, otherwise just return as is
  if (is.list(x)) {
    data.frame(t(sapply(x, '[', seq(max(lengths(x))))))
  } else {
    x
  }
}))
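
The question asks for column names such as COUNT1 and TREAT1, whereas the approach above should produce names of the form COUNT.X1, because `data.frame()` labels unnamed matrix columns X1, X2, and so on. A hedged sketch of one way to rename them afterwards (the `flat` variable and the `sub()` step are additions for illustration, not part of the answer):

```r
# Rebuild the sample data from the question
df <- data.frame(CAT = c("A", "B"))
df$COUNT <- list(1:3, 4:5)
df$TREAT <- list(paste0("Treat-", letters[1:2]), paste0("Treat-", letters[3:5]))

# Same flattening step as in the answer
flat <- do.call(cbind.data.frame, lapply(df, function(x) {
  if (is.list(x)) {
    data.frame(t(sapply(x, `[`, seq_len(max(lengths(x))))))
  } else {
    x
  }
}))

# Drop the ".X" separator so COUNT.X1 becomes COUNT1, and so on
names(flat) <- sub(".X", "", names(flat), fixed = TRUE)
names(flat)
```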

Data used:

df <- structure(list(CAT = structure(1:2, .Label = c("A", "B"), class = "factor"), 
    COUNT = list(1:3, 4:5), TREAT = list(c("Treat-a", "Treat-b"
    ), c("Treat-c", "Treat-d", "Treat-e"))), .Names = c("CAT", 
"COUNT", "TREAT"), row.names = c(NA, -2L), class = "data.frame")


raw*_*awr 10

Here is another way in base R

df <- data.frame(CAT = c("A", "B"))
df$COUNT <- list(1:3, 4:5)
df$TREAT <- list(paste("Treat-", letters[1:2], sep = ""), paste("Treat-", letters[3:5], sep = ""))

Create a helper function to do the work

f <- function(l) {
  if (!is.list(l)) return(l)
  do.call('rbind', lapply(l, function(x) `length<-`(x, max(lengths(l)))))
}
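
The helper leans on `length<-`, the replacement form of `length()`. Called as an ordinary function, it returns a copy of the vector extended with NA (or truncated) to the requested length. A small sketch of that behavior in isolation, with illustrative variable names:

```r
x <- c("Treat-a", "Treat-b")

# Extending past the current length pads with NA
longer <- `length<-`(x, 3)
longer

# Requesting a shorter length truncates
shorter <- `length<-`(x, 1)
shorter
```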

Always test your code

f(df$TREAT)

#           [,1]      [,2]      [,3]     
# [1,] "Treat-a" "Treat-b" NA       
# [2,] "Treat-c" "Treat-d" "Treat-e"

Apply it

df[] <- lapply(df, f)
df

#     CAT COUNT.1 COUNT.2 COUNT.3 TREAT.1 TREAT.2 TREAT.3
#   1   A       1       2       3 Treat-a Treat-b    <NA>
#   2   B       4       5      NA Treat-c Treat-d Treat-e
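
One thing to be aware of: after `df[] <- lapply(df, f)` the data frame should still have three columns, two of which are matrices. The COUNT.1-style names come from how matrix columns are printed. If seven separate columns are needed, one option (an assumption on my part, not part of the answer above) is to pass the padded pieces to `data.frame()` instead:

```r
# Sample data and helper, as in the answer
df <- data.frame(CAT = c("A", "B"))
df$COUNT <- list(1:3, 4:5)
df$TREAT <- list(paste0("Treat-", letters[1:2]), paste0("Treat-", letters[3:5]))

f <- function(l) {
  if (!is.list(l)) return(l)
  do.call("rbind", lapply(l, function(x) `length<-`(x, max(lengths(l)))))
}

# Matrix-column version from the answer: still three columns
nested <- df
nested[] <- lapply(nested, f)
ncol(nested)

# data.frame() splits each matrix into one column per matrix column
flat <- do.call(data.frame, c(lapply(df, f), stringsAsFactors = FALSE))
ncol(flat)
names(flat)
```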