How to group by multiple columns in data.table?

Question

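I'm trying to do some aggregation in a data.table, but I've run into a problem I can't find a solution for. The problem is really simple: I want to summarize some values in a data.table along more than one dimension.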

I have no trouble getting the following code to work:

Export4R[,sum(units),by=Type]

This gives the following:

Type    Value
foobar  45
barfoo  25

But now I would like to break it down further and get a table like this:

Type    Month    Value
foobar  Mar      12
foobar  Apr      7
....

I tried to do this with a single line of code, but unfortunately it doesn't seem to work:

Export4R[,sum(units),by=Type,Month]

This is most likely a very simple question, but I'm having a hard time finding the answer.

Thanks for your help!

Answer 1

Export4R[,sum(units),by="Type,Month"]

or

Export4R[,sum(units),by=list(Type,Month)]

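The latter syntax, with list(), allows expressions of column names and lets you name the resulting grouping columns; for example,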

Export4R[,sum(units),by=list(Grp1=substring(Type,1,2), Grp2=Month)]
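To make the forms above concrete, here is a minimal sketch on a small made-up table. The name dt and all of its values are assumptions standing in for Export4R, whose contents are not shown in the question; it also assumes the data.table package is installed.

```r
library(data.table)

# Hypothetical stand-in for Export4R; the values are invented for illustration.
dt <- data.table(
  Type  = c("foobar", "foobar", "foobar", "barfoo", "barfoo"),
  Month = c("Mar", "Mar", "Apr", "Mar", "Apr"),
  units = c(5, 7, 7, 10, 15)
)

# Equivalent ways to group by both columns.
res_string <- dt[, sum(units), by = "Type,Month"]
res_list   <- dt[, sum(units), by = list(Type, Month)]
res_dot    <- dt[, sum(units), by = .(Type, Month)]  # .() is an alias for list()

# Naming the aggregate in j replaces the default result column name V1.
res_named <- dt[, .(Value = sum(units)), by = .(Type, Month)]

# Grouping by an expression, with names given to the grouping columns.
res_expr <- dt[, sum(units), by = list(Grp1 = substring(Type, 1, 2), Grp2 = Month)]

print(res_named)
print(res_expr)
```

With this sample data, the groups come out in order of first appearance: foobar/Mar, foobar/Apr, barfoo/Mar, barfoo/Apr. The unnamed aggregate appears in a column called V1, which is why naming it in j is usually worthwhile.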

By the way, you can format long queries over multiple lines:

Export4R[,list(
    s = sum(units)
    ,m = mean(units)
),by=list(
    Grp1=substring(Type,1,2)
    ,Grp2=Month
)]
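A runnable sketch of the multi-line query above, again on a made-up table (dt and its values are assumptions, not the asker's data), shows what the result looks like when several aggregates are computed per group:

```r
library(data.table)

# Hypothetical sample data; not the asker's Export4R.
dt <- data.table(
  Type  = c("foobar", "foobar", "foobar", "barfoo", "barfoo"),
  Month = c("Mar", "Mar", "Apr", "Mar", "Apr"),
  units = c(5, 7, 7, 10, 15)
)

# Two named aggregates per group, grouped by a derived column and Month.
res <- dt[, list(
    s = sum(units)    # total units per group
    , m = mean(units) # average units per group
), by = list(
    Grp1 = substring(Type, 1, 2)
    , Grp2 = Month
)]

print(res)
```

The result has one row per (Grp1, Grp2) pair and the columns Grp1, Grp2, s, and m, in that order.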

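The reason for putting the commas at the start of each line is that you can easily add or comment out a column without messing up the closing bracket after the last item; for example,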

Export4R[,list(
    s = sum(units)
    # ,m = mean(units)
),by=list(
    Grp1=substring(Type,1,2)
    # ,Grp2=Month
)]

The idea comes from SQL.