[R] outer(-x, x, pmin) cannot allocate

Peter Dalgaard p.dalgaard at biostat.ku.dk
Thu Dec 23 00:33:16 CET 2004


Berton Gunter <gunter.berton at gene.com> writes:

> David:
> 
> In general, this is not a good question to ask, as one needs to go into the
> bowels of R to find an answer.
> 
> But note that 8000 x 8000 x 4 bytes for double precision  = 256 mb. Now look
> at the code of outer(). Two vectors of this size are created = 512mb. Then
> copies of these must be created to be passed into pmin, I believe, as R
> passes by value. That's 1gb.
> 
> My guess is that "+", as an internal function, avoids the final doubling.
> 
> Corrections/clarifications by knowledgeable R experts cheerfully welcomed.
> I'm on thin ice here.

Well, it's 8 bytes to a double, not 4, so each of those vectors is
512 MB rather than 256...   If you look inside pmin, you'll see that
the first few lines are:

    elts <- list(...)
    mmm <- as.vector(elts[[1]])
    has.na <- FALSE
    for (each in elts[-1]) {
        work <- cbind(mmm, as.vector(each))
        nas <- is.na(work)

which by my count takes about 6 copies (2 in "elts", 1 in "mmm", 1 in
"elts[-1]", 2 in "work") in addition to the 2 input vectors, plus a
logical vector of the same length as "work". And that is before
actually operating on anything! You can never be quite sure that the
copying actually takes place, since R tries to do virtual copies if it
can, but the empirical data suggest that it does get to something
like 10 or 11 copies in total.
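
As a rough back-of-the-envelope check of those numbers (a sketch only,
assuming decimal megabytes, 8 bytes per double and 4 bytes per logical;
the copy counts are the estimates above, not measurements, and the
variable names are just for illustration):

```r
n <- 8000
elems <- n * n                          # 64e6 elements per expanded vector

dbl_mb <- elems * 8 / 1e6               # one double vector of that length, in MB
lgl_mb <- 2 * elems * 4 / 1e6           # is.na(work): two columns, 4 bytes per logical

dbl_mb                                  # 512
lgl_mb                                  # 512

total_gb <- c(10, 11) * dbl_mb / 1000   # 10 or 11 double copies, in GB
total_gb                                # 5.12 and 5.632
```

That is, on the order of 5 GB if every one of those copies really is
made.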

However, somewhat surprisingly, replacing pmin with something simpler
doesn't help a whole lot:

    x <- 0. + 1:8000
    mypmin <- function(x, y) ifelse(x < y, x, y)
    y <- outer(-x, x, mypmin)

This version seems a little better, but still crosses the 3 GB line
(hmm, an rm(y) first would probably have saved half a GB):
  
  mypmin <- function(x,y) {ix <- x<y; y[ix] <- x[ix]; y}
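
For what it's worth, a quick sanity check on a small input suggests
that both replacements agree with pmin for NA-free data such as this.
This is only a sketch: the length 5 is an arbitrary stand-in for 8000,
and mypmin1 and mypmin2 are just labels chosen here for the two
definitions above.

```r
x <- 0. + 1:5   # small stand-in for 1:8000

# The two pmin replacements, under hypothetical names
mypmin1 <- function(x, y) ifelse(x < y, x, y)
mypmin2 <- function(x, y) { ix <- x < y; y[ix] <- x[ix]; y }

ref <- outer(-x, x, pmin)   # reference result from the real pmin

ok1 <- isTRUE(all.equal(outer(-x, x, mypmin1), ref))
ok2 <- isTRUE(all.equal(outer(-x, x, mypmin2), ref))
c(ok1, ok2)
```

Nothing here says anything about memory use at the full size; it only
checks that the answers match on a small case.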

The "+" variant runs in 1.5 GB, which would seem to be the smallest
you can hope for.
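
The 1.5 GB figure is consistent with three vectors of 8000 x 8000
doubles: the two expanded arguments that outer() builds, plus the
result. A minimal sketch of that expansion on a small input (this
mimics what outer() does rather than quoting its source; X, Y and res
are names chosen here for illustration):

```r
x <- 0. + 1:4
n <- length(x)

X <- rep(-x, times = n)   # first argument, recycled down each column
Y <- rep(x, each = n)     # second argument, repeated along each row
res <- X + Y              # one more vector of length n * n
dim(res) <- c(n, n)

same <- isTRUE(all.equal(res, outer(-x, x, "+")))
same

# Three double vectors of length 8000 * 8000, in GB
three_gb <- 3 * 8000 * 8000 * 8 / 1e9
three_gb                  # 1.536
```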
-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907



