Assign multiple columns using := in data.table, by group

Ale*_*lex 115 r variable-assignment dataframe data.table colon-equals

What is the best way to assign to multiple columns using data.table? For example:

f <- function(x) {c("hi", "hello")}
x <- data.table(id = 1:10)

I would like to do something like this (this syntax is of course incorrect):

x[ , (col1, col2) := f(), by = "id"]

To extend this, I may have many column names stored in a variable (say col_names), and I would like to do:

x[ , col_names := another_f(), by = "id", with = FALSE]

What is the correct way to do something like this?

Mat*_*wle 144

This now works in v1.8.3 on R-Forge. Thanks for highlighting it!

x <- data.table(a = 1:3, b = 1:6) 
f <- function(x) {list("hi", "hello")} 
x[ , c("col1", "col2") := f(), by = a][]
#    a b col1  col2
# 1: 1 1   hi hello
# 2: 2 2   hi hello
# 3: 3 3   hi hello
# 4: 1 4   hi hello
# 5: 2 5   hi hello
# 6: 3 6   hi hello

x[ , c("mean", "sum") := list(mean(b), sum(b)), by = a][]
#    a b col1  col2 mean sum
# 1: 1 1   hi hello  2.5   5
# 2: 2 2   hi hello  3.5   7
# 3: 3 3   hi hello  4.5   9
# 4: 1 4   hi hello  2.5   5
# 5: 2 5   hi hello  3.5   7
# 6: 3 6   hi hello  4.5   9 

mynames = c("Name1", "Longer%")
x[ , (mynames) := list(mean(b) * 4, sum(b) * 3), by = a][]
#    a b col1  col2 mean sum Name1 Longer%
# 1: 1 1   hi hello  2.5   5    10      15
# 2: 2 2   hi hello  3.5   7    14      21
# 3: 3 3   hi hello  4.5   9    18      27
# 4: 1 4   hi hello  2.5   5    10      15
# 5: 2 5   hi hello  3.5   7    14      21
# 6: 3 6   hi hello  4.5   9    18      27


x[ , get("mynames") := list(mean(b) * 4, sum(b) * 3), by = a][]  # same
#    a b col1  col2 mean sum Name1 Longer%
# 1: 1 1   hi hello  2.5   5    10      15
# 2: 2 2   hi hello  3.5   7    14      21
# 3: 3 3   hi hello  4.5   9    18      27
# 4: 1 4   hi hello  2.5   5    10      15
# 5: 2 5   hi hello  3.5   7    14      21
# 6: 3 6   hi hello  4.5   9    18      27

x[ , eval(mynames) := list(mean(b) * 4, sum(b) * 3), by = a][]   # same
#    a b col1  col2 mean sum Name1 Longer%
# 1: 1 1   hi hello  2.5   5    10      15
# 2: 2 2   hi hello  3.5   7    14      21
# 3: 3 3   hi hello  4.5   9    18      27
# 4: 1 4   hi hello  2.5   5    10      15
# 5: 2 5   hi hello  3.5   7    14      21
# 6: 3 6   hi hello  4.5   9    18      27
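Note that the f in the question returns a character vector, while the f in this answer returns a list with one element per new column. A minimal sketch of adapting the question's version, assuming it is acceptable to wrap the result in as.list() (the wrapper is an assumption for illustration, not part of the answer):

```r
library(data.table)

# The question's function: returns a character vector of length 2
f <- function(x) c("hi", "hello")
x <- data.table(id = 1:10)

# as.list() turns the vector into list("hi", "hello"), one element per
# target column, which is the shape used in the examples above
x[, c("col1", "col2") := as.list(f()), by = id]

# Print the result
x[]
```

Each group here has a single row, so every row should end up with one value in col1 and one in col2.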

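  • @MattDowle What if my function already returns a named list? Can I add the columns to dt without having to name them again? For example, f <- function(x) {list("c"="hi", "d"="hello")} prints a result with named columns via x[ , f(), by = a][]. I don't know how to append that result to dt. (2 upvotes)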
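One possible way to handle the named-list case raised in this comment is sketched below. It is not from the answer itself. It assumes the function can be called once outside the table to read its names (new_cols is a hypothetical helper variable), and it uses a simplified f with no argument:

```r
library(data.table)

x <- data.table(a = rep(1:3, 2), b = 1:6)

# Hypothetical variant of f that returns a named list
f <- function() list(c = "hi", d = "hello")

# Call f() once outside the table to read the names, then reuse them
# as the left-hand side of :=
new_cols <- names(f())
x[, (new_cols) := f(), by = a]

# Print the result
x[]
```

The column names still come from a character vector on the left-hand side, but they are derived from the function's own return value, so they are not typed out a second time.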

Ger*_*rry 37

The following shorthand notation may be useful. All credit goes to Andrew Brooks, specifically this article.

dt[,`:=`(avg=mean(mpg), med=median(mpg), min=min(mpg)), by=cyl]
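The snippet above does not define dt. A self-contained sketch, assuming dt is built from R's built-in mtcars dataset (which has the mpg and cyl columns used here):

```r
library(data.table)

# Assumption: dt is derived from the built-in mtcars dataset
dt <- as.data.table(mtcars)

# Functional form of := : each name = expression pair adds one column,
# computed within each cyl group
dt[, `:=`(avg = mean(mpg), med = median(mpg), min = min(mpg)), by = cyl]

# Show one row per group with the new columns
unique(dt[, .(cyl, avg, med, min)])
```

In this form each new column name sits next to the expression that computes it, so the names and values cannot drift out of order.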

  • This is better and more readable than `c() := list()`. (2 upvotes)