Related questions and solutions (0)

Slow memory leak in data.table when returning named lists in j (trying to reshape a data.table)

Edit 3:

I created a much shorter example of the memory leak. I hope it makes it much easier to reason about what's going on. As the iterations proceed, you see steadily increasing gc() VCell memory use, while memory use reported by tables() stays the same. Somehow, the unlist(.SD) call seems to be responsible. Here it is:

DT = data.table(k = 1:100, g = 1:20, val = rnorm(2e6))
for (i in 1:100){
  tmp = DT[ , unlist(.SD), by = 'k'] …
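One way to watch for the pattern described above is to record the Vcells figure reported by gc() after each iteration. The sketch below is illustrative only: `track_vcells` is a hypothetical helper, not part of data.table, and the table is built smaller than the 2e6 rows in the question so that it runs quickly. It makes no claim to reproduce the leak on any particular data.table version.

```r
# Hypothetical helper: call `f` n times and record the Vcells in use (in Mb),
# as reported by gc(), after each call.
track_vcells <- function(f, n = 10L) {
  used_mb <- numeric(n)
  for (i in seq_len(n)) {
    f()
    # Row "Vcells", column 2 of the gc() matrix is the "(Mb)" figure for "used".
    used_mb[i] <- gc()["Vcells", 2L]
  }
  used_mb
}

# Usage with data.table, skipped when the package is not installed.
if (requireNamespace("data.table", quietly = TRUE)) {
  library(data.table)
  n_rows <- 2e5  # assumption: reduced from the question's 2e6 for a short run
  DT <- data.table(k = rep_len(1:100, n_rows),
                   g = rep_len(1:20, n_rows),
                   val = rnorm(n_rows))
  usage <- track_vcells(function() DT[, unlist(.SD), by = "k"], n = 10L)
  print(round(usage, 1))
}
```

A steadily rising sequence would match the symptom in the question, while a flat one would suggest the memory is being reclaimed between iterations.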

r data.table

Score: 12 | Answers: 1 | Views: 572

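Getting random internal selfref error in data.table for R

I love data.table: it is fast and intuitive, what could be better? Alas, here is my problem: when referencing a data.table inside a foreach() loop (using the doMC implementation), I occasionally get the following error (example in the appendix):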

Error in { : 
  Internal error: .internal.selfref prot is not itself an extptr
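
One of the annoying things here is that I cannot get it to reproduce with any consistency, but it does happen during some long (multi-hour) tasks, so I would like to make sure it never happens, if possible.

Since I reference the same data.table, DT, in each loop, I tried running the following at the beginning of each loop: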

setattr(DT,".internal.selfref",NULL)   

...to remove the invalid/corrupt self ref attribute. This works, and the internal selfref error no longer appears. It is still only a workaround, though.
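For context, the workaround described above might sit in a foreach() loop roughly as follows. This is a minimal sketch under stated assumptions, not the original code from the question: the data, the column names `id` and `val`, the worker count, and the per-iteration computation are all hypothetical.

```r
library(data.table)
library(foreach)
library(doMC)  # fork-based backend, so not available on Windows

registerDoMC(2)  # assumption: two workers, chosen only for illustration

# Hypothetical data standing in for the DT of the question.
DT <- data.table(id = rep_len(1:10, 1000), val = rnorm(1000))

res <- foreach(i = 1:10, .combine = c) %dopar% {
  # Workaround from the question: drop the self-reference attribute
  # at the start of every iteration.
  setattr(DT, ".internal.selfref", NULL)
  DT[id == i, sum(val)]  # read-only use of DT inside the worker
}
```

The sketch only reads from DT inside the loop; whether removing the attribute is safe when DT is also modified by reference in the loop body is not something the question establishes.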

Any ideas for fixing the underlying problem?

Many thanks for any help!

Eric

Appendix: abbreviated R session info to confirm the latest versions:

R version 2.15.3 (2013-03-01)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
other attached packages:
 [1] data.table_1.8.8  doMC_1.3.0

Example using simulated data. You may have to run the history() function many times (hundreds, say) before the error appears:

##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Load packages and Prepare Data
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
require(data.table)
##this is the package we use for multicore
require(doMC)
##register n-2 of your machine's cores
registerDoMC(multicore:::detectCores()-2) 

## Build simulated …

r setattribute data.table

Score: 10 | Answers: 1 | Views: 1167

Tag statistics

data.table ×2

r ×2

setattribute ×1