Sco*_*hie 5 printing r key data.table
Is it possible to store the rows of a data.table in a particular order while preserving its key?
Suppose I have the following dummy table:
library(data.table)
dt <- data.table(id=letters[1:6],
                 group=sample(c("red", "blue"), replace=TRUE),
                 value.1=rnorm(6),
                 value.2=runif(6))
setkey(dt, id)
dt
id group value.1 value.2
1: a blue 1.4557851 0.73249612
2: b red -0.6443284 0.49924102
3: c blue -1.5531374 0.72977197
4: d red -1.5977095 0.08033604
5: e blue 1.8050975 0.43553048
6: f red -0.4816474 0.23658045
I would like to store this table so that the rows are sorted by group and then by value.1, both in descending order, i.e.:
> dt[order(group, value.1, decreasing=T),]
id group value.1 value.2
1: f red -0.4816474 0.23658045
2: b red -0.6443284 0.49924102
3: d red -1.5977095 0.08033604
4: e blue 1.8050975 0.43553048
5: a blue 1.4557851 0.73249612
6: c blue -1.5531374 0.72977197
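Obviously I could save this as a new variable, but I also want to keep the id column as the primary key.
Arun's answer to "What is the purpose of setting a key in data.table?" suggests that this can be achieved through clever use of setkey, since it sorts the data.table in the order of its key (although there is no option to set the key in decreasing order):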
> setkey(dt, group, value.1, id)
> dt
id group value.1 value.2
1: c blue -1.5531374 0.72977197
2: a blue 1.4557851 0.73249612
3: e blue 1.8050975 0.43553048
4: d red -1.5977095 0.08033604
5: b red -0.6443284 0.49924102
6: f red -0.4816474 0.23658045
However, I lose the ability to use id as my primary key, because group is now the first key column:
> dt["a"]
group id value.1 value.2
1: a NA NA NA
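One possible way around this, sketched here under the assumption of a data.table version that supports the `on=` argument and secondary indices (`setindex`), is to sort the rows physically with `setorder`, which accepts a `-` prefix for descending columns, and to look rows up by `id` with `on=` instead of relying on a key. This is not taken from the answers referenced above, and the toy table uses fixed, made-up values rather than the random data shown earlier.

```r
library(data.table)

# Toy data with fixed values (an assumption for reproducibility, not the table above)
dt <- data.table(id = letters[1:6],
                 group = rep(c("blue", "red"), 3),
                 value.1 = c(1.5, -0.6, -1.6, -1.7, 1.8, -0.5),
                 value.2 = c(0.7, 0.5, 0.7, 0.1, 0.4, 0.2))

# Physically reorder the rows by reference:
# group descending, then value.1 descending within each group
setorder(dt, -group, -value.1)

# Optional secondary index on id; unlike setkey(), it does not reorder the rows
setindex(dt, id)

# Look up a row by id without a key
dt["a", on = "id"]
```

The trade-off in this sketch is that the table has no key at all, so every lookup by `id` has to spell out `on = "id"`.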
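Based on @eddi's answer, I created a hack solution in which I store an unevaluated call to order (as a string) in an attribute of the data.table, which print.data.table then respects: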
set_order <- function(dt, cols, decreasing=FALSE) {
  # Store a call to order as an additional attribute
  attr(dt, "order") <- paste0("order(", paste(cols, collapse=", "),
                              ", decreasing=", decreasing, ")")
  invisible(dt)
}
print.data.table = function(x, ...) {
  if (!is.null(attr(x, "order"))) {
    # Use the stored ordering to print the data.table
    data.table:::print.data.table(x[eval(parse(text=attr(x, "order")))], ...)
  } else {
    data.table:::print.data.table(x, ...)
  }
}
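The same idea can probably be sketched without `eval(parse(...))` by storing the column names and sort direction as attributes and applying them with `setorderv` on a copy. The helper names `set_print_order` and `ordered_view`, and the attribute names `print_cols` and `print_order`, are hypothetical and invented for this illustration. Unlike the hack above, this variant does not override `print.data.table`, and it uses fixed toy values instead of the random data shown earlier.

```r
library(data.table)

set_print_order <- function(dt, cols, order = 1L) {
  # setattr() modifies dt by reference, so no reassignment is needed
  setattr(dt, "print_cols", cols)
  setattr(dt, "print_order", order)   # 1L = ascending, -1L = descending
  invisible(dt)
}

ordered_view <- function(dt) {
  cols <- attr(dt, "print_cols")
  if (is.null(cols)) return(dt)
  out <- copy(dt)                     # leave the keyed original untouched
  setorderv(out, cols, order = attr(dt, "print_order"))
  out
}

# Toy data with fixed values (an assumption for reproducibility)
dt <- data.table(id = letters[1:6],
                 group = rep(c("blue", "red"), 3),
                 value.1 = c(1.5, -0.6, -1.6, -1.7, 1.8, -0.5),
                 value.2 = c(0.7, 0.5, 0.7, 0.1, 0.4, 0.2),
                 key = "id")

set_print_order(dt, c("group", "value.1"), order = -1L)
ordered_view(dt)   # rows sorted by group and value.1, both descending
key(dt)            # the original table is still keyed by id
```

Custom attributes may not survive every data.table operation, so this is best treated as a display convenience rather than a stored property of the table.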
This gives me the behaviour I want:
dt <- set_order(dt, c("group", "value.1"), decreasing=T)
dt
# id group value.1 value.2
# 1: f red -0.4816474 0.23658045
# 2: b red -0.6443284 0.49924102
# 3: d red -1.5977095 0.08033604
# 4: e blue 1.8050975 0.43553048
# 5: a blue 1.4557851 0.73249612
# 6: c blue -1.5531374 0.72977197
tables()
# NAME NROW MB COLS KEY
# [1,] dt 6 1 id,group,value.1,value.2 id
# Total: 1MB