[R] Why does vector assignment in R recreate the entire vector?

Duncan Murdoch murdoch.duncan at gmail.com
Wed Sep 1 17:39:18 CEST 2010


On 01/09/2010 11:09 AM, Tal Galili wrote:
> Hello all,
>
> A friend recently brought to my attention that vector assignment actually
> recreates the entire vector on which the assignment is performed.
>
> So for example, the code:
> x[10]<- NA # The original call (short version)
>
> Is really doing this:
> x<- replace(x, list=10, values=NA) # The original call (long version)
> # assigning a whole new vector to x
>
> Which is actually doing this:
> x<- `[<-`(x, list=10, values=NA) # The actual call
>
>
> Assuming this can be explained reasonably to the layman, my question is,
> why is it done this way?
>   

Your friend misled you.  The `[<-` function is primitive.  It acts as 
though it does what you describe, but it is free to do internal 
optimizations, and in many cases it does.  The replace() function is a 
regular R-level function so it has much less freedom and is likely to be 
a lot less efficient.
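
As a rough check of the distinction above, is.primitive() reports whether a function is implemented internally. This is only a sketch: the vectors x and y are made-up illustration data, not anything from the original question.

```r
# `[<-` is a primitive (implemented in C); replace() is an ordinary R closure.
is.primitive(`[<-`)    # TRUE
is.primitive(replace)  # FALSE

# The body of replace() shows it is a thin R-level wrapper around `[<-`.
body(replace)

# Both forms produce the same result; they differ only in how much
# copying R may have to do along the way.
x <- c(1, 2, 3)                          # illustrative data
y <- replace(x, list = 2, values = NA)   # returns a modified copy
x[2] <- NA                               # modifies x via the primitive
identical(x, y)                          # TRUE
```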

For example, in evaluating the expression x[10] <- NA, in most cases R 
knows that the original vector x will never be needed again, so it won't 
be duplicated.  But in evaluating

replace(x, list=10, values=NA)

R can't be sure, so it would make a duplicate copy.

You can see the difference in the following code:

 > x <- 1:1000
 > tracemem(x)
[1] "<0x0547a6c0>"
 > x[10] <- NA
 > x <- replace(x, list=10, values=NA)
tracemem[0x0547a6c0 -> 0x0488a768]: replace

Only the second version caused x to be duplicated.
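
For anyone who wants to repeat the comparison, here is a small self-contained sketch. It assumes an R build with memory profiling enabled, which tracemem() needs, and it uses a plain double vector chosen only for illustration. The exact tracemem messages and addresses vary between R versions and front ends, so none are shown.

```r
x <- c(1.5, 2.5, 3.5)   # illustrative data, not the vector from the session above

if (capabilities("profmem")) {
  tracemem(x)           # ask R to report whenever this object is copied
}

# Direct assignment: at top level R can usually modify x in place,
# so no tracemem message is expected here.
x[2] <- NA

# Going through replace(): x is passed as an argument and modified inside
# the function, so R has to copy it first.
x <- replace(x, list = 2, values = NA)

if (capabilities("profmem")) {
  untracemem(x)         # stop tracing
}
```

Some front ends keep their own references to objects in the workspace, which can cause extra copies to be reported.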

One example that looks as though it is doing unnecessary duplication is 
this:

 > x[10] <- 3
tracemem[0x0488a768 -> 0x04881260]:
tracemem[0x04881260 -> 0x05613368]:

I can see that one duplication is necessary (x is being changed from 
type integer to type double), but why two?
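
The type change itself is easy to confirm with typeof(). The sketch below only illustrates the coercion; it does not explain the second duplication. Writing the value as the integer literal 3L avoids the conversion.

```r
x <- 1:1000
typeof(x)      # "integer"

x[10] <- 3L    # 3L is an integer, so the vector keeps its type
typeof(x)      # "integer"

x[10] <- 3     # 3 is a double, so the whole vector is converted
typeof(x)      # "double"
```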

Duncan Murdoch

> Why won't it just change the relevant pointer in memory?
>   


> On small vectors it makes no difference.
> But on big vectors this might be (so I suspect) costly (in terms of time).
>
>
> I'm curious for your responses on the subject.
>
> Best,
> Tal