Ofe*_*lon 15 r data.table
(This is a follow-up to this question.)
Consider this toy code:
> x <- data.frame(a = 1:2)
> foo <- function(z) { setDT(z) ; z[, b:=3:4] ; z }
> y <- foo(x)
>
> class(x)
[1] "data.table" "data.frame"
> x
   a
1: 1
2: 2
It looks like setDT did change the class of x, but the added column was not applied to x.
What is going on here?
In your function, z is a reference to x until setDT is called.
library(data.table)
foo <- function(z) {print(address(z)); setDT(z); print(address(z))}
x <- data.frame(a = 1:2)
address(x)
#[1] "0x555ec9a471e8"
foo(x)
#[1] "0x555ec9a471e8"
#[1] "0x555ec9ede300"
Inside setDT, up to and including the following line, z still points to the same address as x:
setattr(z, "class", data.table:::.resetclass(z, "data.frame"))
setattr does not make a copy. So x and z still point to the same address, and both now have class c("data.table", "data.frame"):
x <- data.frame(a = 1:2)
z <- x
class(x)
#[1] "data.frame"
address(x)
#[1] "0x555ec95de600"
address(z)
#[1] "0x555ec95de600"

setattr(z, "class", data.table:::.resetclass(z, "data.frame"))

class(x)
#[1] "data.table" "data.frame"
address(x)
#[1] "0x555ec95de600"
address(z)
#[1] "0x555ec95de600"
Then setalloccol is called, which in this case calls:
assign("z", .Call(data.table:::Calloccolwrapper, z, 1024, FALSE))
This now makes x and z point to different addresses:
address(x)
#[1] "0x555ecaa09c00"
address(z)
#[1] "0x555ec95de600"
Both still have class c("data.table", "data.frame"):
class(x)
#[1] "data.table" "data.frame"
class(z)
#[1] "data.table" "data.frame"
I think that if they used
class(z) <- data.table:::.resetclass(z, "data.frame")
instead of
setattr(z, "class", data.table:::.resetclass(z, "data.frame"))
the problem would not occur:
x <- data.frame(a = 1:2)
z <- x
address(x)
#[1] "0x555ec9cd2228"
class(z) <- data.table:::.resetclass(z, "data.frame")
class(x)
#[1] "data.frame"
class(z)
#[1] "data.table" "data.frame"
address(x)
#[1] "0x555ec9cd2228"
address(z)
#[1] "0x555ec9cd65a8"
But after class(z) <- value, z no longer points to the address it pointed to before (although the column z$a stays where it was):
z <- data.frame(a = 1:2)
address(z)
#[1] "0x5653dbe72b68"
address(z$a)
#[1] "0x5653db82e140"
class(z) <- c("data.table", "data.frame")
address(z)
#[1] "0x5653dbe82d98"
address(z$a)
#[1] "0x5653db82e140"
But after setDT, z also no longer points to the address it pointed to before:
z <- data.frame(a = 1:2)
address(z)
#[1] "0x55b6f04d0db8"
setDT(z)
address(z)
#[1] "0x55b6efe1e0e0"
As @Matt-dowle pointed out, it is also possible to change the data in x through z:
x <- data.frame(a = c(1,3))
z <- x
setDT(z)
z[, b:=3:4]
z[2, a:=7]
z
#   a b
#1: 1 3
#2: 7 4
x
#   a
#1: 1
#2: 7
R.version.string
#[1] "R version 4.0.2 (2020-06-22)"
packageVersion("data.table")
#[1] ‘1.12.8’
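Given the behaviour described above, here is a minimal sketch of two ways to avoid a half-converted x. It is not taken from the original answer, and the helper names add_b_by_ref and add_b_copy are made up for illustration. Either call setDT in the calling scope before handing the object to a function that uses :=, or copy inside the function with as.data.table so the caller's data.frame is left alone.

```r
library(data.table)

# Option 1: convert in the calling scope, then modify by reference.
# add_b_by_ref is a hypothetical helper name.
add_b_by_ref <- function(z) {
  z[, b := 3:4]   # adds column b in place; z must already be a data.table
  invisible(z)
}

x <- data.frame(a = 1:2)
setDT(x)          # x itself is rebound to the data.table in this scope
add_b_by_ref(x)
names(x)          # column b is visible through x

# Option 2: copy inside the function so the caller's object is untouched.
# add_b_copy is a hypothetical helper name.
add_b_copy <- function(z) {
  z <- as.data.table(z)  # new data.table; the caller's data.frame is not altered
  z[, b := 3:4]
  z
}

x2 <- data.frame(a = 1:2)
y2 <- add_b_copy(x2)
class(x2)         # still a plain data.frame
names(y2)
```

Option 1 fits when by-reference updates are wanted throughout; option 2 trades an extra copy for leaving the input unmodified.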