[R] small object but huge RData file exported

Duncan Murdoch murdoch@dunc@n @end|ng |rom gm@||@com
Wed Oct 20 21:56:50 CEST 2021


On 20/10/2021 9:20 a.m., Jinsong Zhao wrote:
> On 2021/10/20 21:05, Duncan Murdoch wrote:
>> On 20/10/2021 8:57 a.m., Jinsong Zhao wrote:
>>> Hi there,
>>>
>>> I have an RData file, created by save.image(), with a size of about
>>> 74.0 MB (77,608,222 bytes).
>>>
>>> After loading it into R, I measured the size of each object with object.size():
>>>
>>>> object.size(combn.rda.m)
>>> 105448 bytes
>>>> object.size(cross)
>>> 102064 bytes
>>>> object.size(denitr.1)
>>> 25032 bytes
>>>> object.size(rda.denitr.1)
>>> 600280 bytes
>>>> object.size(xh)
>>> 7792 bytes
>>>> object.size(xh.x)
>>> 6064 bytes
>>>> object.size(xh.x.1)
>>> 24144 bytes
>>>> object.size(xh.x.2)
>>> 24144 bytes
>>>> object.size(xh.x.3)
>>> 24144 bytes
>>>> object.size(xh.y)
>>> 2384 bytes
>>>
>>> These are all small objects.
>>>
>>> If I delete the largest one, "rda.denitr.1", and then call
>>> save.image("xx.RData"), the file has a size of 22.6 KB (23,244 bytes).
>>> All seems OK.
>>>
>>> However, when I save(rda.denitr.1, file = "yy.RData"), the file has a
>>> size of 73.9 MB (77,574,869 bytes).
>>>
>>> I don't know why...
>>>
>>> Any hint?
>>
>> As the docs for object.size() say, "Exactly which parts of the memory
>> allocation should be attributed to which object is not clear-cut."  In
>> particular, if a function or formula has an associated environment, it
>> isn't included, but it is sometimes saved in the image.
>>
>> So I'd suspect rda.denitr.1 contains something that references an
>> environment, and it's an environment that would be saved.  (I forget the
>> exact rules, but I think that means it's not the global environment and
>> it's not a package environment.)
>>
>> Duncan Murdoch
> 
> 
> rda.denitr.1 is just a list of length 2:
> rda.denitr.1[[1]] is a vector of length 10;
> rda.denitr.1[[2]] is a list of length 10. rda.denitr.1[[2]][[1]]
> to rda.denitr.1[[2]][[10]] are small RDA objects generated by rda() from
> the vegan package.
> 
> If I run
>   > a <- rda.denitr.1[[2]][[1]]
>   > object.size(a)
> 59896 bytes
>   > save(a, file = "abc.RData")
> the resulting file is also large: 73.9 MB (77,536,611 bytes).
> 
> Jinsong
> 

The rda() function uses formulas.  If it saves the formula in the 
result, then the result references the environment of that formula, 
typically the environment where the formula was created, and the 
contents of that environment can be saved along with the object.
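
A minimal sketch of that effect in base R, not taken from the thread: 
make_formula() and big are hypothetical names, and no vegan object is 
involved.  It compares what object.size() reports for a formula with 
the number of bytes serialize() writes for it; serialize() is used here 
as a stand-in for what save() has to store, before compression.

```r
# Hypothetical helper: creates a formula inside a function call that
# also holds a large local vector.
make_formula <- function() {
  big <- rnorm(1e6)  # roughly 8 MB of doubles, local to this call
  y ~ x              # the formula records this call's environment
}

f <- make_formula()

# object.size() does not count the contents of the attached environment,
# so the formula looks tiny.
object.size(f)

# serialize() follows the environment, so 'big' is written out as well.
length(serialize(f, NULL))

# The environment attached to the formula still holds 'big'.
ls(environment(f))

# One possible workaround (an assumption, not advice from the thread):
# point a copy of the formula at the global environment, which is stored
# as a reference rather than by value.
g <- f
environment(g) <- globalenv()
length(serialize(g, NULL))
```

Changing a formula's environment may affect later code that looks up 
variables through it, so this is only a way to confirm where the bytes 
come from.  Which component of an rda() result holds the formula is not 
stated in this thread, so that step is left out.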

Duncan Murdoch


