R data.table: inconsistent results when aggregating the grouping column

Question

I am using the data.table package to aggregate a column that is also the grouping column, but the result is not what I expected.

library(data.table)
my_data <- data.table(contnt = c("america", "asia", "asia", "europe", "europe", "europe"), num = 1:6)

#my_data
#contnt  num
#america  1
#asia     2
#asia     3
#europe   4
#europe   5
#europe   6

my_data[, length(contnt),by=contnt]
#contnt  V1
#america  1
#asia     1
#europe   1

When I aggregate a column other than the grouping column, it behaves differently:

my_data[, length(num),by=contnt]
#contnt  V1
#america  1
#asia     2
#europe   3

What is the reason for this difference?

Answer 1

shr*_*sgm 6

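This is a good example for showing how data.table passes the grouping variable, as opposed to the other variables, to the expression evaluated for each group: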

my_data[,print(contnt),by=contnt]
# [1] "america"
# [1] "asia"
# [1] "europe"

my_data[,print(num),by=contnt]
# [1] 1
# [1] 2 3
# [1] 4 5 6

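In essence, for each group the grouping variable is passed as a vector of length 1, whereas for every other variable the group's entire vector is passed.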
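If the goal is simply to count rows per group, a minimal sketch along these lines may help. It rebuilds the question's `my_data` and uses data.table's `.N` symbol, which holds the number of rows in the current group and so does not depend on which column is inspected. The names `counts`, `lens`, `len_contnt`, and `len_num` are illustrative only and do not appear in the original post.

```r
library(data.table)

# Same data as in the question
my_data <- data.table(
  contnt = c("america", "asia", "asia", "europe", "europe", "europe"),
  num = 1:6
)

# .N is the row count of the current group, regardless of column
counts <- my_data[, .N, by = contnt]
print(counts)

# Side by side: the grouping column has length 1 inside each group,
# while a non-grouping column has the group's full length
lens <- my_data[, .(len_contnt = length(contnt), len_num = length(num)), by = contnt]
print(lens)
```

Here `counts$N` should come out as 1, 2, 3 for america, asia, and europe, matching the `length(num)` result from the question, while `lens$len_contnt` stays at 1 for every group.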
