Consider the following R dataset.
object.size(mtcars)
6736 bytes
# writing this object as rds
saveRDS(mtcars, "mt.rds")
# the file properties show it as 1.218 KB on disk
# reading back the rds file
dataRDS <- readRDS("mt.rds")
object.size(dataRDS)
6736 bytes #this is the same as original mtcars (not surprising)
# writing this object as Stata data
library(foreign)  # provides write.dta() and read.dta()
write.dta(mtcars, "mt.dta")
#clicking the properties of file shows the size as 4.5 KB
#reading back Stata data in R
dataDTA<-read.dta("mt.dta")
object.size(dataDTA)
8656 bytes
# this is larger than the original object size (6736 bytes)
#reading Stata data from Stata gives the size as 2.82 KB
obs: 32 Written by R.
vars: 11
size: 2,816
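Why does the default R object take up more memory when read into R than the same dataset, converted from R to Stata format, takes when read into Stata?
Most of it seems to be a difference in the size of the attributes; you can see that they are stored differently. Comparing the sizes: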
> object.size(attributes(dataDTA)) - object.size(attributes(dataRDS))
1696 bytes
> object.size(dataDTA) - object.size(dataRDS)
1920 bytes
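To see where the attribute difference comes from, one can list the attributes that exist only on the Stata round-trip and estimate the size of each. This is a minimal sketch, assuming the foreign package is installed (it ships with standard R distributions); the temporary file paths and the names extra and attr_sizes are made up for illustration.

```r
library(foreign)  # provides write.dta() and read.dta()

# Hypothetical temporary files so the sketch does not touch the working directory
rds_path <- tempfile(fileext = ".rds")
dta_path <- tempfile(fileext = ".dta")

saveRDS(mtcars, rds_path)
write.dta(mtcars, dta_path)

dataRDS <- readRDS(rds_path)
dataDTA <- read.dta(dta_path)

# Attributes that are present only after the Stata round-trip
extra <- setdiff(names(attributes(dataDTA)), names(attributes(dataRDS)))
print(extra)

# Estimated size in bytes of every attribute on the Stata round-trip object
attr_sizes <- vapply(attributes(dataDTA),
                     function(a) as.numeric(object.size(a)),
                     numeric(1))
print(attr_sizes)
```

read.dta() attaches Stata metadata (for example variable formats and types) as extra attributes on the data frame, which is why the two attribute lists differ.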
The remaining difference is probably because object.size() only gives an estimate of the true size.
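As for the Stata figure in the question, 2,816 bytes is exactly 32 × 11 × 8, which matches the raw values alone if each one is stored as an 8-byte double. That Stata counts only this payload is an inference from the arithmetic, not something the output states. The sketch below compares that payload with what R reports per column; the names raw_bytes and per_column are made up for illustration.

```r
# Raw payload: 32 observations x 11 variables x 8 bytes per double
raw_bytes <- nrow(mtcars) * ncol(mtcars) * 8
print(raw_bytes)  # 2816

# Estimated size of each column vector as R stores it (payload plus vector header)
per_column <- vapply(mtcars,
                     function(col) as.numeric(object.size(col)),
                     numeric(1))
print(per_column)

# Overhead of the column vectors relative to the raw payload
print(sum(per_column) - raw_bytes)

# The data frame as a whole also carries names, row names, and class
print(object.size(mtcars))
```

Whatever object.size(mtcars) reports beyond the summed columns is accounted for by the list structure and its attributes, which is also where the R and Stata round-trip objects differ.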