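I want to change the factor levels of a column using setattr. However, when the column is selected the standard data.table way (dt[ , col]), the levels are not updated. On the other hand, when the column is selected in a way that is unorthodox in a data.table setting, namely with $, it works.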
library(data.table)
# Some data
d <- data.table(x = factor(c("b", "a", "a", "b")), y = 1:4)
d
# x y
# 1: b 1
# 2: a 2
# 3: a 3
# 4: b 4
# We want to change levels of 'x' using setattr
# New desired levels
lev <- c("a_new", "b_new")
# Select column in the standard data.table way
setattr(x = d[ , x], name = "levels", value = lev)
# Levels are not updated
d
# x y
# 1: b 1
# 2: a 2
# 3: a 3
# 4: b 4
# Select column in a non-standard data.table way using $
setattr(x = d$x, name = "levels", value = lev)
# Levels are updated
d
# x y
# 1: b_new 1
# 2: a_new 2
# 3: a_new 3
# 4: b_new 4
# Just check if d[ , x] really is the same as d$x
d <- data.table(x = factor(c("b", "a", "a", "b")), y = 1:4)
identical(d[ , x], d$x)
# [1] TRUE
# Yes, it seems so
It feels like I am missing some data.table (or R?) basics here. Can anyone explain what is going on?
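I have found two other posts on setattr and levels:
setattr on levels preserving unwanted duplicates (R data.table)
Both of them use $ to select the column. Neither of them mentions the [ , col] way.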
It may help to look at the addresses returned by the two expressions:
address(d$x)
# [1] "0x10e4ac4d8"
address(d$x)
# [1] "0x10e4ac4d8"
address(d[,x])
# [1] "0x105e0b520"
address(d[,x])
# [1] "0x105e0a600"
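Note that the address of the first expression does not change when you call it multiple times, whereas the address of the second expression is different on every call. That indicates d[ , x] returns a copy of the column, so setattr modifies the copy and has no effect on the original data.table.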
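To make the difference concrete, here is a minimal sketch that reuses the question's data. The variable names copy_x and same_address are only illustrative, and the behaviour shown assumes a data.table version in which d[ , x] copies the column, as the addresses above suggest.

```r
library(data.table)

d <- data.table(x = factor(c("b", "a", "a", "b")), y = 1:4)
lev <- c("a_new", "b_new")

# d$x hands back the column vector itself, so its address is stable
same_address <- address(d$x) == address(d$x)

# d[ , x] hands back a copy; changing the copy leaves d untouched
copy_x <- d[ , x]
setattr(copy_x, "levels", lev)
levels(copy_x)   # the copy now carries the new levels
levels(d$x)      # the column inside d still has "a" "b"

# Passing d$x to setattr changes the column inside d by reference
setattr(d$x, "levels", lev)
levels(d$x)
```

If you would rather avoid $ altogether, assigning with := is another way to update a column by reference, although it replaces the column vector rather than only its levels attribute.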