When grouping by multiple columns, I want to keep the empty groups (filled with a default value such as NA or 0).
library(data.table)
dt <- data.table(user = c("A", "A", "B"), date = c("t1", "t2", "t1"), duration = c(1, 2, 1))
dt[, .(total = sum(duration)), by = .(date, user)]
Result:
   date user total
1:   t1    A     1
2:   t2    A     2
3:   t1    B     1
Desired result:
   date user total
1:   t1    A     1
2:   t2    A     2
3:   t1    B     1
4:   t2    B    NA
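One solution might be to add rows with a value of 0 before grouping, but that requires building the Cartesian product of many columns and manually checking whether each combination already has a value. I would prefer something built-in or simpler.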
You can try:
dt[CJ(user = user, date = date, unique = TRUE), on = .(user, date)]
   user date duration
1:    A   t1        1
2:    A   t2        2
3:    B   t1        1
4:    B   t2       NA
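The join above returns the raw `duration` rows rather than the `total` column from the question. One way to get there, as a minimal sketch and not necessarily the only or best approach, is to aggregate first and then join the totals onto the `CJ` grid. The names `grid`, `totals`, `res` and `res0` are illustrative only and do not come from the original post.

```r
library(data.table)

dt <- data.table(user = c("A", "A", "B"),
                 date = c("t1", "t2", "t1"),
                 duration = c(1, 2, 1))

# Every user/date combination that could occur (illustrative name: grid)
grid <- CJ(user = dt$user, date = dt$date, unique = TRUE)

# Aggregate only the groups that actually exist in the data
totals <- dt[, .(total = sum(duration)), by = .(date, user)]

# Join the totals onto the grid; combinations with no data get total = NA
res <- totals[grid, on = .(user, date)]

# Optional: work on a copy and use 0 instead of NA as the default value
res0 <- copy(res)
res0[is.na(total), total := 0]
```

Here `res` has one row per user/date combination, with `NA` in `total` for the missing `B`/`t2` pair, and `res0` holds the same rows with 0 as the default instead.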