Set the display order of a data.table to `(group, -value.1)` while keeping the key `id`

Sco*_*hie 5 printing r key data.table

Is it possible to store the rows of a data.table in a particular order while retaining its key?

Say I have the following dummy table:

library(data.table)
dt <- data.table(id=letters[1:6], 
                   group=sample(c("red", "blue"), replace=TRUE), 
                   value.1=rnorm(6), 
                   value.2=runif(6))
setkey(dt, id)
dt
   id group    value.1    value.2
1:  a  blue  1.4557851 0.73249612
2:  b   red -0.6443284 0.49924102
3:  c  blue -1.5531374 0.72977197
4:  d   red -1.5977095 0.08033604
5:  e  blue  1.8050975 0.43553048
6:  f   red -0.4816474 0.23658045

I would like to store this table so that its rows are sorted by group and then by value.1, both in decreasing order, i.e.:

> dt[order(group, value.1, decreasing=TRUE),]
   id group    value.1    value.2
1:  f   red -0.4816474 0.23658045
2:  b   red -0.6443284 0.49924102
3:  d   red -1.5977095 0.08033604
4:  e  blue  1.8050975 0.43553048
5:  a  blue  1.4557851 0.73249612
6:  c  blue -1.5531374 0.72977197

Obviously I can save this as a new variable, but I also want to keep the id column as the primary key.
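A minimal sketch of why the two goals conflict, reusing the values from the table above (`group` is written out with `rep()` here so the example is repeatable). `setorder()` sorts by reference and accepts a `-` prefix for descending columns, but a key requires the rows to be physically sorted by the key columns, so the `id` key cannot survive the reordering:

```r
library(data.table)

dt <- data.table(id = letters[1:6],
                 group = rep(c("blue", "red"), 3),
                 value.1 = c(1.4557851, -0.6443284, -1.5531374,
                             -1.5977095, 1.8050975, -0.4816474),
                 value.2 = c(0.73249612, 0.49924102, 0.72977197,
                             0.08033604, 0.43553048, 0.23658045))
setkey(dt, id)

# Reorder in place: group descending, then value.1 descending
setorder(dt, -group, -value.1)

ids_after <- dt$id    # rows now follow the desired display order
key_after <- key(dt)  # inspect whether a key is still set
```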

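Arun's answer to "What is the purpose of setting a key in data.table?" suggests this can be achieved through clever use of setkey, since it sorts the data.table in the order of its key (although there is no option to set the key in decreasing order):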

> setkey(dt, group, value.1, id)
> dt
   id group    value.1    value.2
1:  c  blue -1.5531374 0.72977197
2:  a  blue  1.4557851 0.73249612
3:  e  blue  1.8050975 0.43553048
4:  d   red -1.5977095 0.08033604
5:  b   red -0.6443284 0.49924102
6:  f   red -0.4816474 0.23658045

However, I lose the ability to use id as my primary key, because group is now the first key column:

> dt["a"]
   group id value.1 value.2
1:     a NA      NA      NA
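One way to still look rows up by `id` in this situation, sketched under the assumption of data.table 1.9.6 or later (where the `on=` argument is available), is an ad hoc join that names the column explicitly instead of relying on the key. The data reuses the values shown in the question:

```r
library(data.table)

dt <- data.table(id = letters[1:6],
                 group = rep(c("blue", "red"), 3),
                 value.1 = c(1.4557851, -0.6443284, -1.5531374,
                             -1.5977095, 1.8050975, -0.4816474),
                 value.2 = c(0.73249612, 0.49924102, 0.72977197,
                             0.08033604, 0.43553048, 0.23658045))

# Key on the sort columns, as in the attempt above
setkey(dt, group, value.1, id)

# Join on id explicitly; the key (group, value.1, id) is not consulted
by_on <- dt[.("a"), on = "id"]

# Plain logical subsetting also works, independent of any key
by_scan <- dt[id == "a"]
```

This does not restore `id` as the key; it only sidesteps the need for it in a single lookup.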

Sco*_*hie 0

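Building on @eddi's answer, I created a hacky solution in which I store an unevaluated call to order (as a string) in an attribute of the data.table, which print.data.table then respects: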

set_order <- function(dt, cols, decreasing=FALSE) {
  # Store a call to order as an additional attribute
  attr(dt, "order") <- paste0("order(", paste(cols, collapse=", "), 
                              ", decreasing=", decreasing, ")")
  invisible(dt)
}

print.data.table = function(x, ...) {
  if (!is.null(attr(x, "order"))) {
    # Use the stored ordering to print the data.table
    data.table:::print.data.table(x[eval(parse(text=attr(x, "order")))], ...)
  } else {
    data.table:::print.data.table(x, ...)
  }
}

This gives me the behaviour I want:

dt <- set_order(dt, c("group", "value.1"), decreasing=TRUE)
dt
#    id group    value.1    value.2
# 1:  f   red -0.4816474 0.23658045
# 2:  b   red -0.6443284 0.49924102
# 3:  d   red -1.5977095 0.08033604
# 4:  e  blue  1.8050975 0.43553048
# 5:  a  blue  1.4557851 0.73249612
# 6:  c  blue -1.5531374 0.72977197

tables()
#      NAME NROW MB COLS                     KEY
# [1,] dt      6 1  id,group,value.1,value.2 id 
# Total: 1MB
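A variant of the same idea, offered as a sketch rather than a drop-in replacement: the hypothetical helpers `set_display_order()` and `display()` below store the column names and sort directions with `setattr()` and apply them to a copy with `setorderv()`, so no string has to be parsed and `print.data.table` is not masked. The helper names and attribute names are assumptions made for illustration; they are not part of data.table.

```r
library(data.table)

# Hypothetical helper: remember the display ordering by reference
set_display_order <- function(dt, cols, order = 1L) {
  setattr(dt, "display_cols", cols)
  setattr(dt, "display_order", order)  # 1L ascending, -1L descending
  invisible(dt)
}

# Hypothetical helper: return a reordered copy, leaving dt and its key alone
display <- function(dt) {
  cols <- attr(dt, "display_cols")
  if (is.null(cols)) return(dt)
  out <- copy(dt)
  setorderv(out, cols, order = attr(dt, "display_order"))
  out
}

dt <- data.table(id = letters[1:6],
                 group = rep(c("blue", "red"), 3),
                 value.1 = c(1.4557851, -0.6443284, -1.5531374,
                             -1.5977095, 1.8050975, -0.4816474),
                 value.2 = c(0.73249612, 0.49924102, 0.72977197,
                             0.08033604, 0.43553048, 0.23658045))
setkey(dt, id)

set_display_order(dt, c("group", "value.1"), order = -1L)
view <- display(dt)  # ordered copy for printing
```

The trade-off is that the ordering is only applied when `display(dt)` is called explicitly, whereas the `print.data.table` override above applies it on every print.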