[Rd] Severe memory problem using split()
cstrato at aon.at
Mon Jul 12 22:45:43 CEST 2010
I followed the discussion with great interest,
since I currently have a similar problem:
In a new R session (using xterm) I am importing a simple table
"Hu6800_ann.txt" which has a size of 754KB only:
> ann <- read.delim("Hu6800_ann.txt")
> dim(ann)
[1] 7129   11
When I call "object.size(ann)" the estimated memory used to store "ann"
is already 2MB:
Now I call "split()" and check the estimated memory used, which turns out
to be 3.3GB:
> u2p <- split(ann[,"ProbesetID"],ann[,"UNIT_ID"])
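The effect can be reproduced with synthetic data. The following is only a minimal sketch, under the assumption that "ProbesetID" and "UNIT_ID" were read as factors (the read.delim default in R 2.11.1). The value of n and the generated IDs are made up for illustration and are not taken from "Hu6800_ann.txt".

```r
# Synthetic stand-in for the annotation table (hypothetical data).
n <- 500
ann <- data.frame(
  ProbesetID = factor(sprintf("probe_%04d", seq_len(n))),
  UNIT_ID    = factor(seq_len(n))
)

# Splitting a factor: every list element is itself a factor that
# keeps the complete set of n levels.
u2p_factor <- split(ann[, "ProbesetID"], ann[, "UNIT_ID"])

# Splitting plain character strings instead of a factor.
u2p_char <- split(as.character(ann[, "ProbesetID"]), ann[, "UNIT_ID"])

# Compare the estimated sizes of the table and of both lists.
print(object.size(ann), units = "Kb")
print(object.size(u2p_factor), units = "Kb")
print(object.size(u2p_char), units = "Kb")

# Number of levels carried by a single one-value element.
n_levels_first <- length(levels(u2p_factor[[1]]))
print(n_levels_first)
```

If the assumption holds, the estimate from object.size() would grow roughly with the number of groups times the number of levels, since each list element carries its own copy of the levels attribute.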
During the R session I am running "top" in another xterm and can see
that the memory usage of R increases to about 550MB RSIZE.
Now I do:
It takes about 3 minutes to complete this call, and the memory usage of R
increases to about 1.3GB RSIZE. Furthermore, during evaluation of this
function the free RAM of my Mac drops to less than 8MB free PhysMem,
at which point it has to swap memory. When finished, free PhysMem is 734MB,
but the size of R has increased to 577MB RSIZE.
Calling "split(ann[,"ProbesetID"],ann[,"UNIT_ID"],drop=TRUE)" did not
change the object.size; the call only ran faster and used less
memory on my Mac.
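A small sketch of why "drop=TRUE" might leave the object.size unchanged, again assuming factor input and using made-up IDs: "drop" only removes unused levels of the grouping factor, not the levels of the values being split. The last step, rebuilding each element with factor(), is one possible way to discard unused levels and is not something I have tried on the real table.

```r
# Hypothetical data: n distinct IDs, both vectors stored as factors.
n <- 300
id   <- factor(sprintf("probe_%04d", seq_len(n)))
unit <- factor(seq_len(n))

s_keep <- split(id, unit)
s_drop <- split(id, unit, drop = TRUE)

# drop = TRUE acts on the levels of 'unit' only, so here both
# results have the same estimated size.
same_size <- as.numeric(object.size(s_keep)) == as.numeric(object.size(s_drop))
print(same_size)

# Each element of s_drop still carries all n levels of 'id'.
print(length(levels(s_drop[[1]])))

# Rebuilding each element with factor() discards the unused levels.
s_small <- lapply(s_drop, factor)
print(length(levels(s_small[[1]])))
print(object.size(s_drop), units = "Kb")
print(object.size(s_small), units = "Kb")
```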
Do you have any idea what causes this behavior?
Why is the list "u2p" so large?
Am I making a mistake?
Here is my sessionInfo on a MacBook Pro with 2GB RAM:
R version 2.11.1 (2010-05-31)
attached base packages:
 stats graphics grDevices utils datasets methods base
e.m.a.i.l: cstrato at aon.at