[Rd] Suggestion for memory optimization and as.double() with friends
Henrik Bengtsson
hb at stat.berkeley.edu
Wed Mar 28 23:25:37 CEST 2007
Hi,
when calling as.double() on an object that is already a double, the
object seems to be copied internally, doubling the memory requirement.
See the example below. The same goes for as.character() etc. Is this
intended?
Example:
% R --vanilla
> x <- double(1e7)
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   234019  6.3     467875  12.5   350000   9.4
Vcells 10103774 77.1   11476770  87.6 10104223  77.1
> x <- as.double(x)
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   234113  6.3     467875  12.5   350000   9.4
Vcells 10103790 77.1   21354156 163.0 20103818 153.4
However, couldn't this easily be avoided by letting as.double() return
the object as is if already a double?
Example:
% R --vanilla
> as.double.double <- function(x, ...) x
> x <- double(1e7)
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   234019  6.3     467875  12.5   350000   9.4
Vcells 10103774 77.1   11476770  87.6 10104223  77.1
> x <- as.double(x)
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   234028  6.3     467875  12.5   350000   9.4
Vcells 10103779 77.1   12130608  92.6 10104223  77.1
What's the catch?
The reason I bring this up is that many (most?) functions call
as.double() etc. "just in case" when passing arguments to
.Call(), .Fortran() etc., e.g. stats::smooth.spline():
  fit <- .Fortran(R_qsbart, as.double(penalty), as.double(dofoff),
                  x = as.double(xbar), y = as.double(ybar), w = as.double(wbar), <etc>)
Memory usage then peaks during the actual call, and the garbage
collector cannot reclaim the copies until after the call returns. This
seems to be a waste of memory, especially when the objects are large
(100-1000 MB).
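
On the calling side, the same idea can be sketched as a small guard that
coerces only when the storage type is not already double. This is a
minimal sketch, not anything that exists in R: the name
as_double_if_needed() is hypothetical. It also assumes that any
attributes on the input may be passed through, whereas as.double()
drops attributes (including names), so the two are not interchangeable
in general.

```r
# Hypothetical helper (not part of R): coerce to double only when needed.
as_double_if_needed <- function(x) {
  # Assumption: the callee tolerates attributes on x. as.double() would
  # drop them; this shortcut returns x untouched when it is already double.
  if (is.double(x)) x else as.double(x)
}

x <- double(1e6)
y <- as_double_if_needed(x)      # already double: returned as is
z <- as_double_if_needed(1:10)   # integer input: coerced to double
named <- as_double_if_needed(c(a = 1, b = 2))  # names are kept here
```

A call such as the .Fortran() one above could then wrap its arguments in
the guard instead of calling as.double() unconditionally, provided the
attribute difference is acceptable for that routine.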
Cheers
Henrik