[Rd] surprising behaviour of names<-

Thu Mar 12 10:05:36 CET 2009

Berwin A Turlach wrote:
> On Wed, 11 Mar 2009 20:31:18 +0100
> Wacek Kusnierczyk <Waclaw.Marcin.Kusnierczyk at idi.ntnu.no> wrote:
>
>   
>> Simon Urbanek wrote:
>>     
>>> On Mar 11, 2009, at 10:52 , Simon Urbanek wrote:
>>>
>>>       
>>>> Wacek,
>>>>
>>>> Peter gave you a full answer explaining it very well. If you really
>>>> want to be able to trace each instance yourself, you have to learn
>>>> far more about R internals than you apparently know (and Peter
>>>> hinted at that). Internally x=1 and x=c(1) are slightly different
>>>> in that the former has NAMED(x) = 2 whereas the latter has
>>>> NAMED(x) = 0 which is what causes the difference in behavior as
>>>> Peter explained. The reason is that c(1) creates a copy of the 1
>>>> (which is a constant [=immutable] thus requiring a copy) and the
>>>> new copy has no other references and thus can be modified and
>>>> hence NAMED(x) = 0.
>>>>
>>>>         
>>> Errata: to be precise replace NAMED(x) = 0 with NAMED(x) = 1 above
>>> -- since NAMED(c(1)) = 0 and once it's assigned to x it becomes
>>> NAMED(x) = 1 -- this is just a detail on how things work with
>>> assignment, the explanation above is still correct since
>>> duplication happens conditional on NAMED == 2.
>>>       
>> i guess this is what every user needs to know to understand the
>> behaviour one can observe on the surface? 
>>     
>
> Nope, only users who prefer to write '+'(1,2) instead of 1+2, or
> 'names<-'(x, 'foo') instead of names(x)='foo'.
>
>   

well, as far as i remember, it has been said on this list that in r the
infix syntax is equivalent to the prefix syntax, so no one wanting to
use the form above should be afraid of different semantics;  these two
forms should be perfectly equivalent.  after all,

    x = 1
    names(x) = 'foo'
    names(x)

should return NULL, because when the second assignment is made, we need
to make a copy of the value of x, so it is the copy that should have
changed names, not the value of x (which would still be the original 1).

on the other hand, the fact that

    names(x) = 'foo'

is (or so it seems) a shorthand for

    x = 'names<-'(x, 'foo')

is precisely why i'd think that the prefix 'names<-' should never do
destructive modifications, because that's what x = 'names<-'(x, 'foo'),
and thus also names(x) = 'foo', is for.
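
a minimal sketch of the equivalence i have in mind, for anyone who wants to check it.  the variable y is introduced only for illustration, and the backtick-quoted call is just another spelling of 'names<-'(y, 'foo').  the comparison is on the resulting values only; it says nothing about whether a copy was made along the way.

```r
# infix replacement form
x <- 1
names(x) <- "foo"

# explicit prefix form, with the result assigned back to the variable
y <- 1
y <- `names<-`(y, "foo")

# both routes are expected to yield the same named vector
identical(x, y)
```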

i guess the above is sort of blasphemy.

> Attempting to change the name attribute of x via 'names<-'(x, 'foo')
> looks to me as if one relies on a side effect of the function
> 'names<-'; which, in my book would be a bad thing.  

indeed;  so, for coherence, 'names<-' should always do the modification
on a copy.  it would then have semantics different from the infix form
of 'names<-', but at least consistently so.

> I.e. relying on side
> effects of a function, or writing functions with side effects which are
> then called for their side-effects;  this, of course, excludes
> functions like plot() :)  I never had the need to call 'names<-'()
> directly and cannot foresee circumstances in which I would do so.
>   

> Plenty of users, including me, are happy using the latter forms and,
> hence, never have to bother with understanding these implementation
> details.  
>
> Your mileage obviously varies, but that is when you have to learn about
> these internal details.  If you call functions because of their
> side-effects, you better learn what the side-effects are exactly.
>   

well, i can imagine a user using the prefix 'names<-' precisely under
the assumption that it will perform functionally;  i.e., 'names<-'(x,
'foo') will always produce a copy of x with the new names, and never
change the x.  that there will be a destructive modification made to x
on some, but not all, occasions, is hardly a good thing in this context
-- and it's not a situation where a user wants to use the function
"because of its side effects", quite to the contrary.  this was actually
the situation i had when i first discovered the surprising behaviour of
'names<-';  i thought 'names<-' did *not* have side effects.
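
for what it's worth, here is a sketch of how one could get the functional behaviour regardless of what the prefix 'names<-' does internally.  with_names is a hypothetical helper name, not something from base r;  it relies only on the usual rule that assigning to an argument inside a function changes the function's local copy, not the caller's object.  whether a bare 'names<-'(x, 'foo') touches x may depend on the r version, so the sketch deliberately avoids relying on it.

```r
# hypothetical helper: return a renamed copy, leave the caller's object alone
with_names <- function(obj, nm) {
  names(obj) <- nm   # modifies the local obj only
  obj
}

x <- c(1)
y <- with_names(x, "foo")

names(x)   # x itself is not renamed
names(y)   # the returned copy carries the new names
```

stats::setNames(x, "foo") appears to do the same job, so a user who wants a non-destructive rename probably does not need to call 'names<-' directly at all.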

cheers, and thanks for the discussion.
vQ