[R] Is it possible to avoid copying arrays when calling list()?
David Winsemius
dwinsemius at comcast.net
Sat Aug 17 04:37:47 CEST 2013
On Aug 16, 2013, at 2:23 PM, Gang Peng wrote:
> If you don't want to copy the data, you can use environments. You can first
> define x and y in the global environment and then in the function, use
> function get() to get x, y in the global environment. When you change x and
> y in the function, x and y also change in the global environment.
>
That doesn't sound like the behavior I expect in R. Do you care to illustrate this?
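
A minimal sketch of the distinction in question, under stated assumptions: the names store, modify_via_get and modify_in_env are made up for illustration, a small vector stands in for the large arrays, and a dedicated environment stands in for the global one. As far as I understand the semantics, get() hands back a value, so changing that value inside a function leaves the original binding alone. Only an assignment into the environment itself is visible outside.

```r
# Hypothetical example: an explicit environment stands in for the global one.
store <- new.env()
store$x <- c(1, 2, 3)

# get() returns the value bound to "x"; the result is an ordinary local
# variable, so modifying it does not touch the binding in 'env'.
modify_via_get <- function(env) {
  x <- get("x", envir = env)
  x[1] <- 100
  x
}

local_result <- modify_via_get(store)
after_get <- store$x[1]   # value seen from outside after the get()-based change

# Assigning through the environment changes the binding that callers see.
modify_in_env <- function(env) {
  env$x[1] <- 100
  invisible(NULL)
}

modify_in_env(store)
after_env <- store$x[1]   # value seen from outside after the in-environment change

print(c(after_get = after_get, after_env = after_env))
```

Comparing after_get with after_env shows which of the two approaches actually changes what the caller sees.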
--
David.
> Best,
> Gang
>
>
> 2013/8/16 MRipley <mrip027 at gmail.com>
>
>> Usually R is pretty good about not copying objects when it doesn't need
>> to. However, the list() function seems to make unnecessary copies. For
>> example:
>>
>> > system.time(x<-double(10^9))
>>    user  system elapsed
>>   1.772   4.280   7.017
>> > system.time(y<-double(10^9))
>>    user  system elapsed
>>   2.564   3.368   5.943
>> > system.time(z<-list(x,y))
>>    user  system elapsed
>>   5.520   6.748  12.304
>>
>> I have a function where I create two large arrays, manipulate them in
>> certain ways, and then return both as a list. I'm optimizing the function,
>> so I'd like to be able to build the return list quickly. The two large
>> arrays drop out of scope immediately after I make the list and return it,
>> so copying them is completely unnecessary.
>>
>> Is there some way to do this? I'm not familiar with manipulating lists
>> through the .Call interface, and haven't been able to find much about this
>> in the documentation. Might it be possible to write a fast (but possibly
>> unsafe) list function using .Call that doesn't make copies of the arguments?
>>
>> PS A few things I've tried. First, this is not due to triggering garbage
>> collection -- even if I call gc() before list(x,y), it still takes a long
>> time.
>>
>> Also, I've tried rewriting the function by creating the list at the
>> beginning as in:
>> result <- list(x=double(10^9),y=double(10^9))
>> and then manipulating result$x and result$y but this made my code run
>> slower, as R seemed to be making other unnecessary copies while
>> manipulating elements of a list like this.
>>
>> I've considered (though not implemented) creating an environment rather
>> than a list, and returning the environment, but I'd rather find a simple
>> way of creating a list without making copies if possible.
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
David Winsemius
Alameda, CA, USA