Nic*_*ans 10 memory compression r
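Is there a way to 'compress' an object of class lm so that I can save it to disk and load it later for use with predict.lm?
I have an lm object that ends up being about 142 MB when saved, and I find it hard to believe that predict.lm needs all of the original observations, fitted values, residuals, and so on to make a linear prediction. Can I remove information so that the saved model is smaller?
I have tried setting some of the components (fitted.values, residuals, etc.) to NA, but it seems to have no effect on the size of the saved file.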
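You can fit your model with biglm; a biglm model object is smaller than an lm model object. You can use predict.biglm to create a function that takes a newdata design matrix and returns the predicted values.
Another option is to save the files with saveRDS. These appear to be slightly smaller because they have less overhead: each file holds a single object, whereas save can store multiple objects.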
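library(biglm)
# Standard fit: keeps the model frame.
m <- lm(log(Volume) ~ log(Girth) + log(Height), trees)
# Same fit without the model frame (x and y already default to FALSE in lm).
mm <- lm(log(Volume) ~ log(Girth) + log(Height), trees, model = FALSE, x = FALSE, y = FALSE)
# biglm fit of the same model.
bm <- biglm(log(Volume) ~ log(Girth) + log(Height), trees)
# With make.function = TRUE, predict() returns a prediction function rather than predictions.
pred <- predict(bm, make.function = TRUE)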
save(m, file = 'm.rdata')
save(mm, file = 'mm.rdata')
save(bm, file = 'bm.rdata')
save(pred, file = 'pred.rdata')
saveRDS(m, file = 'm.rds')
saveRDS(mm, file = 'mm.rds')
saveRDS(bm, file = 'bm.rds')
saveRDS(pred, file = 'pred.rds')
file.info(paste(rep(c('m','mm','bm','pred'),each=2) ,c('.rdata','.rds'),sep=''))
# size isdir mode mtime ctime atime exe
# m.rdata 2806 FALSE 666 2013-03-07 11:29:30 2013-03-07 11:24:23 2013-03-07 11:29:30 no
# m.rds 2798 FALSE 666 2013-03-07 11:29:30 2013-03-07 11:29:30 2013-03-07 11:29:30 no
# mm.rdata 2113 FALSE 666 2013-03-07 11:29:30 2013-03-07 11:24:28 2013-03-07 11:29:30 no
# mm.rds 2102 FALSE 666 2013-03-07 11:29:30 2013-03-07 11:29:30 2013-03-07 11:29:30 no
# bm.rdata 592 FALSE 666 2013-03-07 11:29:30 2013-03-07 11:24:34 2013-03-07 11:29:30 no
# bm.rds 583 FALSE 666 2013-03-07 11:29:30 2013-03-07 11:29:30 2013-03-07 11:29:30 no
# pred.rdata 1007 FALSE 666 2013-03-07 11:29:30 2013-03-07 11:24:40 2013-03-07 11:29:30 no
# pred.rds 995 FALSE 666 2013-03-07 11:29:30 2013-03-07 11:27:30 2013-03-07 11:29:30 no
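As a rough check that a slimmed-down lm object still works after a round trip to disk, the sketch below saves a default fit and a model = FALSE fit with saveRDS, reloads the smaller one with readRDS, and compares predictions. The temporary file paths and the new_obs data frame are made up for illustration; the exact byte counts will vary by R version, so none are shown.

```r
# Fit the same model twice: once with defaults, once without the model frame.
full <- lm(log(Volume) ~ log(Girth) + log(Height), data = trees)
slim <- lm(log(Volume) ~ log(Girth) + log(Height), data = trees,
           model = FALSE, x = FALSE, y = FALSE)

# Save both to temporary files (illustrative paths) and compare file sizes.
full_path <- tempfile(fileext = ".rds")
slim_path <- tempfile(fileext = ".rds")
saveRDS(full, full_path)
saveRDS(slim, slim_path)
sizes <- file.size(c(full_path, slim_path))
names(sizes) <- c("full", "slim")
print(sizes)

# Reload the slim model and predict on new data (hypothetical observations).
restored <- readRDS(slim_path)
new_obs <- data.frame(Girth = c(10, 15), Height = c(70, 80))
pred_full <- predict(full, newdata = new_obs)
pred_slim <- predict(restored, newdata = new_obs)
print(all.equal(pred_full, pred_slim))
```

On a data set as small as trees the saving is modest; the model frame grows with the number of observations, so the difference should be larger for a big fit.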
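A couple of things:
This really is a duplicate question.
In the narrow sense, model=FALSE has already been answered in the other question.
In the wider sense, predict(fit, newdata) really just does a matrix-vector multiplication, so you could save only the coefficient vector and multiply the new design matrix by it.
There are alternative fitting functions. Below is an example using fastLm() from RcppArmadillo, which also happens to be faster.
See the illustration below.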
R> library(RcppArmadillo)
Loading required package: Rcpp
R> flm <- fastLm(Volume ~ Girth, data=trees)
R> predict(flm, newdata=trees[1:5,]) ## can predict as with lm()
[1] 5.10315 6.62291 7.63608 16.24803 17.26120
R> object.size(flm) ## tiny object size ...
3608 bytes
R> stdlm <- lm(Volume ~ Girth, data=trees)
R> object.size(stdlm) ## ... compared to what lm() has
20264 bytes
R> stdlm <- lm(Volume ~ Girth, data=trees, model=FALSE)
R> object.size(stdlm) ## ... even when model=FALSE
15424 bytes
R>
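The matrix-vector view of prediction mentioned in this answer can be sketched in base R as follows. This is only an illustration, assuming a full-rank fit with no NA coefficients and no offset; the names beta, tt, X, and manual are chosen here and are not from the original answer. Only the coefficient vector and the terms object would need to be kept to reproduce point predictions.

```r
fit <- lm(log(Volume) ~ log(Girth) + log(Height), data = trees)

# The two pieces needed for point predictions:
beta <- coef(fit)                    # coefficient vector
tt <- delete.response(terms(fit))    # right-hand side of the formula

# Build the design matrix for new data and multiply by the coefficients.
newdata <- trees[1:5, ]
X <- model.matrix(tt, newdata)
manual <- drop(X %*% beta)

# Compare with predict(); should agree up to numerical tolerance.
print(all.equal(unname(manual), unname(predict(fit, newdata))))
```

This gives fitted values only; standard errors and intervals need more of the fit than the coefficients.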