dealing with large objects -- memory wasting ?
Martyn Plummer
plummer@iarc.fr
Fri, 04 Jun 1999 19:14:46 +0200 (CEST)
[Long example of matrix() wasting memory
by Martin Maechler (MM) snipped]
A minor point perhaps, but I think there is an error
in your calculations.
If I understand correctly, the problem is that matrix()
assigns a local copy of its answer before returning it.
So a version which does not do this ...
function (data = NA, nrow = 1, ncol = 1, byrow = FALSE)
{
    if (missing(nrow))
        nrow <- ceiling(length(data)/ncol)
    else if (missing(ncol))
        ncol <- ceiling(length(data)/nrow)
    # return the result directly, without assigning a local copy first
    .Internal(matrix(data, nrow, ncol, byrow))
}
should do better.
Using commands like
rm(X); n <- ... ; p <- 20; X <- matrix(rnorm(n*p), n,p); gc()
the largest value of n I could use successfully was about
18000, which is still less than what you suggest,
MM> Since we have 747 thousands of them ,
MM> constructing X the double size (400'000) shouldn't be a problem ...
and only 50% greater than what you can do with the standard matrix()
function (n ~ 12000).
I think the answer is that your calculations did not take
into account the argument to matrix - rnorm(n*p) - which
also temporarily takes up as much memory as the final matrix.
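
As a rough check of the point about the argument, the sketch below
compares the size of the temporary rnorm(n * p) vector with the size of
the matrix built from it. It uses n = 18000 and p = 20 from the figures
in this message; the variable names are only illustrative.

```r
# Sketch: memory needed by the temporary rnorm(n * p) vector compared
# with the matrix built from it (n and p taken from the message).
n <- 18000
p <- 20

bytes_per_double <- 8                      # R stores doubles in 8 bytes
payload_bytes <- n * p * bytes_per_double  # raw data, excluding headers

tmp <- rnorm(n * p)        # the argument passed to matrix()
X   <- matrix(tmp, n, p)   # the result: same data plus a dim attribute

size_tmp <- as.numeric(object.size(tmp))
size_X   <- as.numeric(object.size(X))

cat("payload:", payload_bytes, "bytes\n")
cat("vector :", size_tmp, "bytes\n")
cat("matrix :", size_X, "bytes\n")
```

Both objects are essentially n * p * 8 bytes plus a small header, so
while both exist the data is held twice.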
With trivial data you can do better:
rm(X); n <- ... ; p <- 20; X <- matrix(0, n,p); gc()
You can assign up to n ~ 37000 with the standard matrix()
function and n ~ 74000 with the modified version, which
is the expected 100% improvement.
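
The two improvement factors can be checked from the figures reported
above. The sketch also compares them with a simple model in which the
largest usable n is inversely proportional to the number of matrix-sized
allocations alive at once. The copy counts (3 and 2 with rnorm() data,
2 and 1 with constant data) are an assumption chosen to illustrate the
argument, not something measured here.

```r
# Largest n that worked, as reported in this message (p = 20 throughout).
n_max <- c(rnorm_standard = 12000, rnorm_modified = 18000,
           zero_standard  = 37000, zero_modified  = 74000)

# Relative gain from dropping the local copy inside matrix().
gain_rnorm <- n_max[["rnorm_modified"]] / n_max[["rnorm_standard"]]
gain_zero  <- n_max[["zero_modified"]]  / n_max[["zero_standard"]]

# ASSUMED model: number of matrix-sized allocations alive at the peak.
copies <- c(rnorm_standard = 3, rnorm_modified = 2,
            zero_standard  = 2, zero_modified  = 1)

# If max n scales as 1 / copies, the gain is the ratio of copy counts.
model_gain_rnorm <- copies[["rnorm_standard"]] / copies[["rnorm_modified"]]
model_gain_zero  <- copies[["zero_standard"]]  / copies[["zero_modified"]]

cat("observed gains:", gain_rnorm, gain_zero, "\n")
cat("model gains   :", model_gain_rnorm, model_gain_zero, "\n")
```

The model reproduces only the ratios. The absolute limits with rnorm()
data are lower than it would predict from the constant-data figures, so
it should not be read as an exact account of the allocations.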
MM>There seem to be worse problems when use
MM>
MM> var(x)
MM>
MM>and x is one of those huge n x p matrices...
I couldn't assign a matrix that was big enough to crash var().
Is there a problem here? The fact that the default value of
y is x is not a problem because of lazy evaluation.
If you assigned y in the body of the function ...
function (x, y, na.rm = FALSE, use)
{
    if (missing(y))
        y <- x
    if (missing(use))
        use <- if (na.rm)
            "complete.obs"
        else "all.obs"
    cov(x, y, use = use)
}
then you would have problems, but this isn't the case.
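
The lazy-evaluation point can be seen in a small sketch. The functions
lazy_demo() and forced_demo() are hypothetical and exist only for this
illustration. A default argument is evaluated only when the body first
uses it, so a default that is never touched costs nothing.

```r
# Default arguments in R are promises: they are evaluated only when
# the function body first uses them.
lazy_demo <- function(x, y = stop("default for y was evaluated")) {
  # y is never used here, so its default expression is never run.
  length(x)
}

res_unused <- lazy_demo(1:3)   # no error: the default is never forced

forced_demo <- function(x, y = stop("default for y was evaluated")) {
  # Assigning from y forces the promise, so the default is run.
  z <- y
  length(x)
}

# Capture the error message instead of stopping the script.
res_forced <- tryCatch(forced_demo(1:3),
                       error = function(e) conditionMessage(e))

print(res_unused)
print(res_forced)
```

The first call returns normally because y is never forced. The second
fails only because the body assigns from y.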
What am I missing?
Martyn
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html