What is the idiomatic way to do the following in data.table?
library(dplyr)
df %>%
  group_by(b) %>%
  slice(1:10)
I can do
library(data.table)
df[, .SD[1:10], by = b]
but this seems to be much slower. Is there a better way?
set.seed(0)
df <- rep(1:500, sample(500:1000, 500, TRUE)) %>%
  data.table(a = runif(length(.)),
             b = .)
f1 <- function(df){
  df %>%
    group_by(b) %>%
    slice(1:10)
}
f2 <- function(df){
  df[, .SD[1:10], by = b]
}
library(microbenchmark)
microbenchmark(f1(df), f2(df))
#Unit: milliseconds
#    expr      min       lq      mean   median       uq     max neval
#  f1(df) 17.67435 19.50381  22.06026 20.50166 21.42668 78.3318   100
#  f2(df) 69.69554 79.43387 119.67845 88.25585 …