[R] Is it possible to avoid copying arrays when calling list()?

MRipley mrip027 at gmail.com
Sat Aug 17 14:08:48 CEST 2013


Thanks for the input, but it looks like I found a simple solution. 
Turns out that if you assign to lists by name, then R doesn't make extra 
copies:

 > x<-double(10^9)
 > mylist<-list()
 > system.time(mylist[[1]]<-x)
    user  system elapsed
   2.992   3.352   6.364

 > x<-double(10^9)
 > mylist<-list()
 > system.time(mylist$x<-x)
    user  system elapsed
       0       0       0

This is on R version 3.0.1.
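
For anyone who wants to check this on their own build without timing billion-element vectors, here is a minimal sketch using tracemem(), which reports each duplication of the tracked object. It assumes R was built with memory profiling (tested through capabilities("profmem")). The smaller vector size and the name can_trace are only for illustration, and what gets reported may well differ between R versions.

```r
# Sketch: see which ways of putting a vector into a list duplicate it.
# Assumption: tracemem() is only usable when R has memory profiling enabled.
can_trace <- capabilities("profmem")

x <- double(1e6)                 # much smaller than 10^9 so it runs quickly
if (can_trace) tracemem(x)       # a message is printed whenever x is copied

mylist <- list()
mylist[[1]] <- x                 # assignment by index
mylist$x <- x                    # assignment by name
z <- list(x, x)                  # construction with list()

if (can_trace) untracemem(x)     # stop tracking x
```

Any "tracemem[...]" line printed between the two calls marks a copy of x, so an approach that prints nothing did not duplicate the vector.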


On 08/16/2013 10:37 PM, David Winsemius wrote:
> On Aug 16, 2013, at 2:23 PM, Gang Peng wrote:
>
>> If you don't want to copy the data, you can use environments. You can first
>> define x and y in the global environment and then in the function, use
>> function get() to get x, y in the global environment. When you change x and
>> y in the function, x and y also change in the global environment.
>>
> That doesn't sound like the behavior I expect in R. Do you care to illustrate this?
>
> -- David.
>> Best,
>> Gang
>>
>>
>> 2013/8/16 MRipley <mrip027 at gmail.com>
>>
>>> Usually R is pretty good about not copying objects when it doesn't need
>>> to.  However, the list() function seems to make unnecessary copies.  For
>>> example:
>>>
>>> > system.time(x<-double(10^9))
>>>    user  system elapsed
>>>   1.772   4.280   7.017
>>> > system.time(y<-double(10^9))
>>>    user  system elapsed
>>>   2.564   3.368   5.943
>>> > system.time(z<-list(x,y))
>>>    user  system elapsed
>>>   5.520   6.748  12.304
>>>
>>> I have a function where I create two large arrays, manipulate them in
>>> certain ways, and then return both as a list.  I'm optimizing the function,
>>> so I'd like to be able to build the return list quickly.  The two large
>>> arrays drop out of scope immediately after I make the list and return it,
>>> so copying them is completely unnecessary.
>>>
>>> Is there some way to do this?  I'm not familiar with manipulating lists
>>> through the .Call interface, and haven't been able to find much about this
>>> in the documentation.  Might it be possible to write a fast (but possibly
>>> unsafe) list function using .Call that doesn't make copies of the arguments?
>>>
>>> PS A few things I've tried.  First, this is not due to triggering garbage
>>> collection -- even if I call gc() before list(x,y), it still takes a long
>>> time.
>>>
>>> Also, I've tried rewriting the function by creating the list at the
>>> beginning as in:
>>> result <- list(x=double(10^9),y=double(10^9))
>>> and then manipulating result$x and result$y but this made my code run
>>> slower, as R seemed to be making other unnecessary copies while
>>> manipulating elements of a list like this.
>>>
>>> I've considered (though not implemented) creating an environment rather
>>> than a list, and returning the environment, but I'd rather find a simple
>>> way of creating a list without making copies if possible.
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
> David Winsemius
> Alameda, CA, USA
>
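
On the environment idea discussed in the quoted thread, here is a minimal sketch of how it might look. It is not taken from anyone's actual code: make_arrays, modify_local, modify_in_place and store are hypothetical names, and the element assignments stand in for the real manipulation. The first part returns an environment instead of a list. The second part shows that get() on its own hands back a value, so a change made inside a function stays local unless it is written back with assign().

```r
# Hypothetical helper: builds two vectors inside an environment and
# returns the environment (a reference) instead of constructing a list.
make_arrays <- function(n) {
  out <- new.env()
  out$x <- double(n)
  out$y <- double(n)
  out$x[1] <- 1                  # stand-in for the real manipulation
  out$y[n] <- 2
  out
}

res <- make_arrays(1e6)          # access the results as res$x and res$y

# get() returns a value: modifying it inside a function changes only
# the function's local variable, not the binding in the environment.
store <- new.env()
store$x <- c(1, 2, 3)

modify_local <- function(env) {
  x <- get("x", envir = env)
  x[1] <- 100                    # local change only
  invisible(NULL)
}

# Writing the result back with assign() does update the environment.
modify_in_place <- function(env) {
  x <- get("x", envir = env)
  x[1] <- 100
  assign("x", x, envir = env)
  invisible(NULL)
}

modify_local(store)              # store$x is left unchanged by this call
```

Calling modify_in_place(store) afterwards replaces the first element of store$x, which is the behaviour the get() suggestion needs in order to work.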


