[R] Is it possible to avoid copying arrays when calling list()?
MRipley
mrip027 at gmail.com
Sat Aug 17 14:08:48 CEST 2013
Thanks for the input, but it looks like I found a simple solution.
Turns out that if you assign to lists by name, then R doesn't make extra
copies:
> x<-double(10^9)
> mylist<-list()
> system.time(mylist[[1]]<-x)
   user  system elapsed
  2.992   3.352   6.364
> x<-double(10^9)
> mylist<-list()
> system.time(mylist$x<-x)
   user  system elapsed
      0       0       0
This is on R version 3.0.1.
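For anyone who wants to check this at a smaller scale, here is a rough sketch rather than a definitive test. It assumes R was built with memory profiling, so that tracemem() is available; the names by_index, by_name and traced are made up for illustration. Whether a copy is reported may well depend on the R version.

```r
# Small stand-in for double(10^9) so the check runs quickly.
x <- double(10)

# The two assignment forms compared in the timings above.
by_index <- list()
by_index[[1]] <- x
by_name <- list()
by_name$x <- x

# Both forms store the same values; only the copying behaviour may differ.
identical(by_index[[1]], by_name$x)

# Assumption: tracemem() only works if memory profiling was compiled in.
if (isTRUE(unname(capabilities("profmem")))) {
  tracemem(x)          # from here on, R reports any duplication of x
  traced <- list()     # hypothetical list used only for this check
  traced$x <- x        # a "tracemem[...]" line here would indicate a copy
  traced[[2]] <- x     # same check for assignment by index
  untracemem(x)
}
```

If no tracemem line is printed for an assignment, no copy of x was made at that step.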
On 08/16/2013 10:37 PM, David Winsemius wrote:
> On Aug 16, 2013, at 2:23 PM, Gang Peng wrote:
>
>> >If you don't want to copy the data, you can use environments. You can first
>> >define x and y in the global environment and then in the function, use
>> >function get() to get x, y in the global environment. When you change x and
>> >y in the function, x and y also change in the global environment.
>> >
> That doesn't sound like the behavior I expect in R. Do you care to illustrate this?
>
> -- David.
>> >Best,
>> >Gang
>> >
>> >
>> >2013/8/16 MRipley<mrip027 at gmail.com>
>> >
>>> >>Usually R is pretty good about not copying objects when it doesn't need
>>> >>to. However, the list() function seems to make unnecessary copies. For
>>> >>example:
>>> >>
>>> >>> system.time(x<-double(10^9))
>>> >>   user  system elapsed
>>> >>  1.772   4.280   7.017
>>> >>> system.time(y<-double(10^9))
>>> >>   user  system elapsed
>>> >>  2.564   3.368   5.943
>>> >>> system.time(z<-list(x,y))
>>> >>   user  system elapsed
>>> >>  5.520   6.748  12.304
>>> >>
>>> >>I have a function where I create two large arrays, manipulate them in
>>> >>certain ways, and then return both as a list. I'm optimizing the function,
>>> >>so I'd like to be able to build the return list quickly. The two large
>>> >>arrays drop out of scope immediately after I make the list and return it,
>>> >>so copying them is completely unnecessary.
>>> >>
>>> >>Is there some way to do this? I'm not familiar with manipulating lists
>>> >>through the .Call interface, and haven't been able to find much about this
>>> >>in the documentation. Might it be possible to write a fast (but possibly
>>> >>unsafe) list function using .Call that doesn't make copies of the arguments?
>>> >>
>>> >>PS A few things I've tried. First, this is not due to triggering garbage
>>> >>collection -- even if I call gc() before list(x,y), it still takes a long
>>> >>time.
>>> >>
>>> >>Also, I've tried rewriting the function by creating the list at the
>>> >>beginning as in:
>>> >>result <- list(x=double(10^9),y=double(10^9))
>>> >>and then manipulating result$x and result$y but this made my code run
>>> >>slower, as R seemed to be making other unnecessary copies while
>>> >>manipulating elements of a list like this.
>>> >>
>>> >>I've considered (though not implemented) creating an environment rather
>>> >>than a list, and returning the environment, but I'd rather find a simple
>>> >>way of creating a list without making copies if possible.
>>> >>
>>> >>______________________________________________
>>> >>R-help at r-project.org mailing list
>>> >>https://stat.ethz.ch/mailman/listinfo/r-help
>>> >>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> >>and provide commented, minimal, self-contained, reproducible code.
>>> >>
>> >
> David Winsemius
> Alameda, CA, USA
>
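To illustrate the environment idea discussed in the quoted messages, here is a minimal sketch, not something taken from anyone's actual code. The names make_result, bump, gx and f are hypothetical. It shows that an environment is modified in place when passed to a function, whereas a value fetched with get() and then changed locally leaves the original untouched.

```r
# Hypothetical helper: builds two vectors inside an environment and
# returns the environment instead of a list.
make_result <- function(n) {
  res <- new.env()
  res$x <- double(n)
  res$y <- double(n)
  res$x[1] <- 1
  res
}
res <- make_result(10)

# Environments have reference semantics: the callee changes the
# caller's object, no return value needed.
bump <- function(e) {
  e$x[2] <- 2
  invisible(NULL)
}
bump(res)
res$x[1:2]   # 1 2

# By contrast, get() hands back a value. Modifying the local name x
# inside f() does not change gx outside the function.
gx <- c(0, 0)
f <- function() {
  x <- get("gx")
  x[1] <- 99
  x
}
f()
gx           # still 0 0
```

Returning an environment like res avoids building a list at all, at the cost of callers having to deal with reference semantics.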