[R] Is it possible to avoid copying arrays when calling list()?
MRipley
mrip027 at gmail.com
Fri Aug 16 18:16:07 CEST 2013
Usually R is pretty good about not copying objects when it doesn't need
to. However, the list() function seems to make unnecessary copies. For
example:
> system.time(x<-double(10^9))
   user  system elapsed
  1.772   4.280   7.017
> system.time(y<-double(10^9))
   user  system elapsed
  2.564   3.368   5.943
> system.time(z<-list(x,y))
   user  system elapsed
  5.520   6.748  12.304
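Timings alone do not show where the time goes, so one way to check whether
list() really duplicates its arguments is tracemem(), which reports when a
marked object is copied. The following is only a sketch under assumptions:
n is far smaller than the 10^9 used above so that it runs quickly, and
tracemem() is only available when R was built with memory profiling
(checked here via capabilities("profmem")). Whether a message appears may
depend on the R version.

```r
n <- 1e6                      # assumption: small size for a quick check
x <- double(n)
y <- double(n)

have_trace <- capabilities("profmem")   # TRUE if tracemem() is usable
if (have_trace) {
  invisible(tracemem(x))      # mark x so that any duplication is reported
  invisible(tracemem(y))
}

# A line starting with "tracemem[" printed during this call would mean
# that the marked vector was duplicated.
z <- list(x, y)

if (have_trace) {
  untracemem(x)
  untracemem(y)
}
```

If nothing is printed during list(x, y), the vectors were not duplicated at
that point and the time is being spent somewhere else.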
I have a function where I create two large arrays, manipulate them in
certain ways, and then return both as a list. I'm optimizing the
function, so I'd like to be able to build the return list quickly. The
two large arrays drop out of scope immediately after I make the list and
return it, so copying them is completely unnecessary.
Is there some way to do this? I'm not familiar with manipulating lists
through the .Call interface, and haven't been able to find much about
this in the documentation. Might it be possible to write a fast (but
possibly unsafe) list function using .Call that doesn't make copies of
the arguments?
PS: A few things I've tried. First, this is not due to triggering
garbage collection -- even if I call gc() before list(x,y), it still
takes a long time.
Also, I've tried rewriting the function so that it creates the list at
the beginning, as in:
result <- list(x=double(10^9),y=double(10^9))
and then manipulating result$x and result$y but this made my code run
slower, as R seemed to be making other unnecessary copies while
manipulating elements of a list like this.
I've considered (though not implemented) creating an environment rather
than a list, and returning the environment, but I'd rather find a simple
way of creating a list without making copies if possible.
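For reference, the environment idea might look roughly like the sketch
below. The function name make_arrays, the small n, and the two placeholder
assignments are assumptions for illustration only; the real function is not
shown here. The arrays are ordinary local variables while they are being
manipulated, and the function returns its own frame via environment()
instead of calling list(x, y).

```r
# Hypothetical stand-in for the real function; names and sizes are
# illustrative only.
make_arrays <- function(n) {
  x <- double(n)
  y <- double(n)
  x[1] <- 1        # placeholder for the real manipulation of x
  y[n] <- 2        # placeholder for the real manipulation of y
  environment()    # return the function's own frame, not list(x, y)
}

res <- make_arrays(10)
res$x[1]           # elements are accessed with $ just as with a list
res$y[10]
ls(res)            # the frame also contains the argument n
```

The caller can use res$x and res$y much like list elements, but an
environment has reference semantics: code that modifies res$x through one
reference changes it for everyone holding that environment, which a list
would not do.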