[R] Is it possible to avoid copying arrays when calling list()?

David Winsemius dwinsemius at comcast.net
Sat Aug 17 04:37:47 CEST 2013


On Aug 16, 2013, at 2:23 PM, Gang Peng wrote:

> If you don't want to copy the data, you can use environments. You can first
> define x and y in the global environment and then in the function, use
> function get() to get x, y in the global environment. When you change x and
> y in the function, x and y also change in the global environment.
> 

That doesn't sound like the behavior I expect in R. Do you care to illustrate this?
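
A minimal sketch of one reading of the suggestion, for anyone following along. The environment `store` and the two helper functions are hypothetical names that are not in the original post, and `store` stands in for the global environment so the example leaves the workspace alone. It contrasts modifying the result of get() with writing through the environment itself:

```r
# Hypothetical stand-in for the global environment, so the sketch
# does not touch the user's workspace.
store <- new.env()
store$x <- c(1, 2, 3)

# get() hands back the value; the replacement below acts on a local
# variable named x, not on the object held in 'store'.
modify_via_get <- function(env) {
  x <- get("x", envir = env)
  x[1] <- 100
  invisible(NULL)
}
modify_via_get(store)
unchanged <- identical(store$x, c(1, 2, 3))

# Writing through the environment itself (or using assign()) does
# change the stored object, because environments are not copied
# when passed to a function.
modify_via_env <- function(env) {
  env$x[1] <- 100
  invisible(NULL)
}
modify_via_env(store)
changed <- identical(store$x, c(100, 2, 3))

print(c(unchanged = unchanged, changed = changed))
```

Both flags should come out TRUE: get() followed by a local modification leaves the stored object alone, and only the write through the environment alters it.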

-- 
David.


> Best,
> Gang
> 
> 
> 2013/8/16 MRipley <mrip027 at gmail.com>
> 
>> Usually R is pretty good about not copying objects when it doesn't need
>> to.  However, the list() function seems to make unnecessary copies.  For
>> example:
>> 
>>> system.time(x<-double(10^9))
>>   user  system elapsed
>>  1.772   4.280   7.017
>>> system.time(y<-double(10^9))
>>   user  system elapsed
>>  2.564   3.368   5.943
>>> system.time(z<-list(x,y))
>>   user  system elapsed
>>  5.520   6.748  12.304
>> 
>> I have a function where I create two large arrays, manipulate them in
>> certain ways, and then return both as a list.  I'm optimizing the function,
>> so I'd like to be able to build the return list quickly.  The two large
>> arrays drop out of scope immediately after I make the list and return it,
>> so copying them is completely unnecessary.
>> 
>> Is there some way to do this?  I'm not familiar with manipulating lists
>> through the .Call interface, and haven't been able to find much about this
>> in the documentation.  Might it be possible to write a fast (but possibly
>> unsafe) list function using .Call that doesn't make copies of the arguments?
>> 
>> PS A few things I've tried.  First, this is not due to triggering garbage
>> collection -- even if I call gc() before list(x,y), it still takes a long
>> time.
>> 
>> Also, I've tried rewriting the function by creating the list at the
>> beginning as in:
>> result <- list(x=double(10^9),y=double(10^9))
>> and then manipulating result$x and result$y but this made my code run
>> slower, as R seemed to be making other unnecessary copies while
>> manipulating elements of a list like this.
>> 
>> I've considered (though not implemented) creating an environment rather
>> than a list, and returning the environment, but I'd rather find a simple
>> way of creating a list without making copies if possible.
>> 
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 

David Winsemius
Alameda, CA, USA


