Transposing a data.table

Abh*_*bhi 6 r data.table

What is a good way to efficiently transpose a data.table once the computations on it are finished?

library(data.table)

n_row <- 500e3
n_col <- 2000
# 1e9 doubles, roughly 8 GB for the matrix alone
m <- matrix(rnorm(n_row * n_col), nrow = n_row)
colnames(m) <- c('foo', seq(n_col - 1))
dt <- data.table(m)
df <- as.data.frame(m)
dt <- t(dt)  # takes a long time and converts the data.table to a matrix

Timings

# 1. Transpose the matrix
system.time(mt <- t(m))
 #   user  system elapsed
 # 20.005   0.016  20.024

# 2. Transpose the data.table
system.time(dt <- t(dt))
 #   user  system elapsed
 # 32.722  15.129  47.855

# 3. Transpose the data.frame
system.time(df <- t(df))
 #   user  system elapsed
 # 32.414  15.357  47.775

Mic*_*ico 0

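This is a fairly old question, and since then data.table has added and exported transpose, which transposes lists. In terms of performance it beats t in every case except the plain matrix (which I think is to be expected):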

system.time(t(m))
 #   user  system elapsed 
 # 23.990  23.416  85.722 
system.time(t(dt))
 #   user  system elapsed 
 # 31.223  53.197 195.221 
system.time(t(df))
 #   user  system elapsed 
 # 30.609  45.404 148.323 
system.time(setDT(transpose(dt)))
 #   user  system elapsed 
 # 42.135  38.478 116.599
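For reference, here is a minimal sketch of what transpose does on a small table. The data and the column names (id, x, y, variable) are made up for illustration and are not part of the benchmark above. It assumes a reasonably recent data.table version in which transpose accepts the keep.names and make.names arguments.

```r
library(data.table)

# Small illustrative table (hypothetical data, not the benchmark above)
dt_small <- data.table(id = c("a", "b", "c"), x = 1:3, y = 4:6)

# make.names = "id" turns the id column into the new column names;
# keep.names = "variable" stores the old column names in a new first column.
tdt <- transpose(dt_small, keep.names = "variable", make.names = "id")

print(tdt)
```

Unlike t, which returns a matrix, this keeps the result as a data.table. Note that transpose coerces columns of mixed types to a common type, so it works best when the columns being transposed already share one type, as in the all-numeric benchmark above.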