[R] small object but huge RData file exported
Jinsong Zhao
j@zh@o @end|ng |rom ye@h@net
Thu Oct 21 08:09:10 CEST 2021
This example demonstrates the same behavior as in my question, or something very close to it.
If I run
> save(formula, file = "abc.RData")
and then, in a newly launched R session, run
> load("abc.RData")
> formula
x ~ y
<environment: 0x00000000171e4be8>
I would like to know what is stored in <environment: 0x00000000171e4be8>,
how to access it, and how to save the object without the environment.
Best,
Jinsong
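
One way to look inside the captured environment, and to save the formula without it, can be sketched with the make_formula() example quoted below. Pointing the formula at globalenv() is an assumption about what is acceptable here: it changes where variables named in the formula are looked up, so it is only safe when nothing in the captured environment is needed later.

```r
make_formula <- function() { large <- rnorm(1e6); x ~ y }
formula <- make_formula()

# The environment attached to the formula is the function's evaluation frame
env <- environment(formula)
ls(env)                              # names of objects kept alive by the formula
length(get("large", envir = env))    # the large vector is still reachable

# Serialized size while the formula still carries that environment
size_before <- length(serialize(formula, connection = NULL))

# Re-point the formula at the global environment, which is stored
# only as a reference when the object is serialized
environment(formula) <- globalenv()
size_after <- length(serialize(formula, connection = NULL))

c(before = size_before, after = size_after)
```

If the environment itself has to be kept, removing only the unwanted objects from it, for example with rm(list = "large", envir = env), is a possible alternative.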
On 2021/10/21 4:06, Henrik Bengtsson wrote:
> Example illustrating what Duncan says:
>
>> make_formula <- function() { large <- rnorm(1e6); x ~ y }
>> formula <- make_formula()
>
> # "Apparent" size of object
>> object.size(formula)
> 728 bytes
>
> # Actual serialization size
>> length(serialize(formula, connection = NULL))
> [1] 8000203
>
> # A better size estimate
>> lobstr::obj_size(formula)
> 8,000,888 B
>
> /Henrik
>
> On Wed, Oct 20, 2021 at 12:57 PM Duncan Murdoch
> <murdoch.duncan using gmail.com> wrote:
>>
>> On 20/10/2021 9:20 a.m., Jinsong Zhao wrote:
>>> On 2021/10/20 21:05, Duncan Murdoch wrote:
>>>> On 20/10/2021 8:57 a.m., Jinsong Zhao wrote:
>>>>> Hi there,
>>>>>
>>>>> I have an RData file, created by save.image(), with a size of about
>>>>> 74.0 MB (77,608,222 bytes).
>>>>>
>>>>> After loading it into R, I measured the size of each object with object.size():
>>>>>
>>>>>> object.size(combn.rda.m)
>>>>> 105448 bytes
>>>>>> object.size(cross)
>>>>> 102064 bytes
>>>>>> object.size(denitr.1)
>>>>> 25032 bytes
>>>>>> object.size(rda.denitr.1)
>>>>> 600280 bytes
>>>>>> object.size(xh)
>>>>> 7792 bytes
>>>>>> object.size(xh.x)
>>>>> 6064 bytes
>>>>>> object.size(xh.x.1)
>>>>> 24144 bytes
>>>>>> object.size(xh.x.2)
>>>>> 24144 bytes
>>>>>> object.size(xh.x.3)
>>>>> 24144 bytes
>>>>>> object.size(xh.y)
>>>>> 2384 bytes
>>>>>
>>>>> These are all small objects.
>>>>>
>>>>> If I delete the largest one, "rda.denitr.1", and then call
>>>>> save.image("xx.RData"), the file is 22.6 KB (23,244 bytes). All seems OK.
>>>>>
>>>>> However, when I save(rda.denitr.1, file = "yy.RData"), the file is
>>>>> 73.9 MB (77,574,869 bytes).
>>>>>
>>>>> I don't know why...
>>>>>
>>>>> Any hint?
>>>>
>>>> As the docs for object.size() say, "Exactly which parts of the memory
>>>> allocation should be attributed to which object is not clear-cut." In
>>>> particular, if a function or formula has an associated environment, it
>>>> isn't included, but it is sometimes saved in the image.
>>>>
>>>> So I'd suspect rda.denitr.1 contains something that references an
>>>> environment, and it's an environment that would be saved. (I forget the
>>>> exact rules, but I think that means it's not the global environment and
>>>> it's not a package environment.)
>>>>
>>>> Duncan Murdoch
>>>
>>>
>>> rda.denitr.1 is just a list of length 2:
>>> rda.denitr.1[[1]] is a vector of length 10;
>>> rda.denitr.1[[2]] is a list of length 10. rda.denitr.1[[2]][[1]]
>>> to rda.denitr.1[[2]][[10]] are small RDA objects generated by rda() from
>>> the vegan package.
>>>
>>> If I run
>>> > a <- rda.denitr.1[[2]][[1]]
>>> > object.size(a)
>>> 59896 bytes
>>> > save(a, file = "abc.RData")
>>> the resulting file is also large: 73.9 MB (77,536,611 bytes).
>>>
>>> Jinsong
>>>
>>
>> The rda() function uses formulas. If it saves the formula in the
>> result, then it references the environment of that formula, typically
>> the environment where the formula was created.
>>
>> Duncan Murdoch
>>
>> ______________________________________________
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
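
Duncan's point about a result object keeping its formula can be checked component by component. The sketch below is only an illustration: serialized_size() and fit_model() are hypothetical names, and the list is a stand-in for a fitted model, not the actual structure returned by rda(). It compares the size reported by object.size() with the length of the serialized object, which includes any referenced environments.

```r
# Hypothetical helper: size of an object as serialize() would write it
# (uncompressed), including the environments it references
serialized_size <- function(x) as.numeric(length(serialize(x, connection = NULL)))

# Stand-in for a model-fitting function that keeps its formula in the result
fit_model <- function() {
  big <- rnorm(1e6)    # large temporary living in the function's frame
  list(call = quote(rda(x ~ y)), terms = x ~ y, scores = 1:10)
}
fit <- fit_model()

# Apparent versus serialized size for each component of the result
sizes <- data.frame(
  component   = names(fit),
  object_size = vapply(fit, function(el) as.numeric(object.size(el)), numeric(1)),
  serialized  = vapply(fit, serialized_size, numeric(1))
)
sizes
```

A component whose serialized size is far larger than its object.size() is a likely place to look for a captured environment.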