[R] Batchjob creates small object but large workspace ???

Izmirlian, Grant (NIH/NCI) izmirlig at mail.nih.gov
Sat Nov 19 01:27:21 CET 2005


I ran into an interesting problem that I think I have solved. I ran a batch job
with "--no-save", electing to save only the objects that were there before the job
started plus one reasonably small object created as the result of the job (~ 19K).
During the course of the job several large objects are generated, but they are not
among the things that are saved.

The interesting problem is that the .RData file ends up around 250 MB larger than
it was previously. Inside of R, object.size() returns a reasonably accurate
estimate, 19K, but somehow there is hidden junk in the file.
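
One way to look for hidden junk is to compare object.size() with the length of the
serialized object, since object.size() does not count an environment attached to an
object while serialization does write it out. The sketch below is only an
illustration under an assumption: that the extra bytes come from such an environment,
which this post does not establish. The names make_result, big and result are
hypothetical stand-ins for the real job.

```r
# Hypothetical stand-in for the job: it returns a small closure, but a large
# temporary lives in the same function frame.
make_result <- function() {
  big <- numeric(1e6)        # about 8 MB, never returned explicitly
  function(x) x + 1          # the closure keeps the frame, including `big`
}

result <- make_result()

# object.size() ignores the closure's environment.
in_memory <- as.numeric(object.size(result))

# serialize() writes the environment too; its length is the uncompressed
# number of bytes needed to store the object.
serialized <- length(serialize(result, NULL))

cat("object.size:", in_memory, "bytes\n")
cat("serialized: ", serialized, "bytes\n")
```

If the two numbers are close for the real object, this explanation can be ruled out
and the extra size has to come from somewhere else.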

As I write, I am trying a possible solution to the problem. Before the save() call
at the bottom of the batch file I delete the unneeded large objects and then call
gc(). My thought is that the save operation needs to generate a large temporary file
and then copy only the requested parts of it, and that this takes nearly all of my
1 GB of system memory, so that the poor garbage collector is shoved off the stack
(or some semblance of this reasoning at least).
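
The cleanup described above could look roughly like the following at the end of the
batch file. This is a minimal sketch, not the actual script: big_tmp, result,
keep_before and out_file are hypothetical names, and the output goes to a temporary
file rather than the real .RData.

```r
# Record what is in the workspace before the job starts.
keep_before <- ls()

# Stand-ins for the job: a large intermediate and a small result.
big_tmp <- numeric(1e6)
result  <- summary(big_tmp)

# Drop the large intermediate and ask R to collect the freed memory.
rm(big_tmp)
invisible(gc())

# Save only the pre-existing objects plus the one small result.
out_file <- tempfile(fileext = ".RData")
save(list = c(keep_before, "result"), file = out_file)

# Size of the saved file on disk, in bytes.
file.info(out_file)$size
```

Checking file.info() right after save() shows whether the file size tracks the
objects named in the list or something larger.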

Oh... (not so) great news... the job just finished and it looks like my idea was
incorrect.

So I am open to suggestions!

.RData before run:    88951444 bytes
.RData after run:    345671147 bytes

size of object saved:   190588 bytes

By the way, I have good reason to trust the 19K estimate, since the information
contained in the object fits on about 3 screens.



