[Rd] Confused about NAMED

Thu Nov 24 13:04:03 CET 2011

On 11-11-24 6:34 AM, Matthew Dowle wrote:
>>
>> On Nov 24, 2011, at 11:13 , Matthew Dowle wrote:
>>
>>> Hi,
>>>
>>> I expected NAMED to be 1 in all these three cases. It is for one of
>>> them,
>>> but not the other two?
>>>
>>>> R --vanilla
>>> R version 2.14.0 (2011-10-31)
>>> Platform: i386-pc-mingw32/i386 (32-bit)
>>>
>>>> x = 1L
>>>> .Internal(inspect(x))   # why NAM(2)? expected NAM(1)
>>> @2514aa0 13 INTSXP g0c1 [NAM(2)] (len=1, tl=0) 1
>>>
>>>> y = 1:10
>>>> .Internal(inspect(y))   # NAM(1) as expected but why different to x?
>>> @272f788 13 INTSXP g0c4 [NAM(1)] (len=10, tl=0) 1,2,3,4,5,...
>>>
>>>> z = data.frame()
>>>> .Internal(inspect(z))   # why NAM(2)? expected NAM(1)
>>> @24fc28c 19 VECSXP g0c0 [OBJ,NAM(2),ATT] (len=0, tl=0)
>>> ATTRIB:
>>>   @24fc270 02 LISTSXP g0c0 []
>>>     TAG: @3f2120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>>>     @24fc334 16 STRSXP g0c0 [] (len=0, tl=0)
>>>     TAG: @3f2040 01 SYMSXP g0c0 [MARK,gp=0x4000] "row.names"
>>>     @24fc318 13 INTSXP g0c0 [] (len=0, tl=0)
>>>     TAG: @3f2388 01 SYMSXP g0c0 [MARK,gp=0x4000] "class"
>>>     @25be500 16 STRSXP g0c1 [] (len=1, tl=0)
>>>       @1d38af0 09 CHARSXP g0c2 [MARK,gp=0x21,ATT] "data.frame"
>>>
>>> It's a little difficult to search for the word "named" but I tried and
>>> found this in R-ints :
>>>
>>>     "Note that optimizing NAMED = 1 is only effective within a primitive
>>> (as the closure wrapper of a .Internal will set NAMED = 2 when the
>>> promise to the argument is evaluated)"
>>>
>>> So might it be that just looking at NAMED using .Internal(inspect()) is
>>> setting NAMED=2?  But if so, why does y have NAMED==1?
>>
>> This is tricky business... I'm not quite sure I'll get it right, but let's
>> try
>>
>> When you are assigning a constant, the value you assign is already part of
>> the assignment expression, so if you want to modify it, you must
>> duplicate. So NAMED==2 on z<- 1 is basically to prevent you from
>> accidentally "changing the value of 1". If it weren't, then you could get
>> bitten by code like for(i in 1:2) {z<- 1; if(i==1) z[1]<- 2}.
>>
>> If you're assigning the result of a computation, then the object only
>> exists once, so
>> z<- 0+1  gets NAMED==1.
>>
>> However, if the computation is done by returning a named value from within
>> a function, as in
>>
>>> f<- function(){v<- 1+0; v}
>>> z<- f()
>>
>> then again NAMED==2. This is because the side effects of the function
>> _might_ result in something having a hold on the function environment,
>> e.g. if we had
>>
>> e<- NULL
>> f<- function(){e<<-environment(); v<- 1+0; v}
>> z<- f()
>>
>> then z[1]<- 5 would change e$v too. As it happens, there aren't any side
>> effects in the former case, but R loses track and assumes the worst.
>>
>
> Thanks a lot, think I follow. That explains x vs y, but why is z NAMED==2?
> The result of data.frame() is an object that exists once (similar to 1:10)
> so shouldn't it be NAMED==1 too?  Or, R loses track and assumes the worst
> even on its own functions such as data.frame()?
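
Peter's environment example above can be checked directly.  This is only a minimal sketch in base R, reusing the names e, f, v and z from his message.  Whatever the NAMED bookkeeping does internally, the visible result should be that modifying z leaves e$v alone:

```r
e <- NULL
f <- function() {
  e <<- environment()  # keep a handle on the function's environment
  v <- 1 + 0
  v                    # return the value bound to v
}
z <- f()
z[1] <- 5              # has to modify a duplicate, not the vector bound to e$v
print(e$v)
print(z)
```

If the assignment modified the vector in place, e$v would change as well; the duplication triggered by NAMED == 2 is what prevents that.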

R has several types of functions -- see the R Internals manual for 
details.  data.frame() is a plain R function, so it is treated no 
differently than any user-written function.  On the other hand, the 
internal function that implements the : operator is a "primitive", so it 
has complete control over its return value, and it can set NAMED in the 
most efficient way.
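
The distinction between the two kinds of function can be seen from the prompt.  A small sketch using only base R (typeof and is.primitive):

```r
typeof(data.frame)        # "closure": an ordinary function written in R
typeof(`:`)               # "builtin": one of the two primitive types
is.primitive(data.frame)  # FALSE
is.primitive(`:`)         # TRUE
```

Anything reporting "closure" goes through the usual function-call machinery, so its return value is treated like that of a user-written function.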

So you might think that returning the result of a call to a primitive 
would be more efficient, e.g. changing Peter's example to

f<- function(){v<- 1+0; v + 0}

which does return a value with NAMED == 1.  But that's because 
internally it had to make a copy of v before adding 0 to it, so you've 
probably really made it less efficient:  the original version might 
never modify the result, so it might never make a copy.
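
To compare the two versions yourself, something like the following sketch could be used.  The names f1, f2, z1 and z2 are just made up for illustration, and I'm not showing the output on purpose: what .Internal(inspect()) prints depends on the R version and build, and other versions may report the flag differently from the NAM() field in the transcript above.

```r
f1 <- function() { v <- 1 + 0; v }      # Peter's version: returns a named value
f2 <- function() { v <- 1 + 0; v + 0 }  # returns the result of a primitive call

z1 <- f1()
z2 <- f2()

# Same value either way; only the internal bookkeeping can differ.
.Internal(inspect(z1))
.Internal(inspect(z2))
```

Both functions return the same value, so any difference shows up only in the flags printed by inspect, not in the results.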

Duncan Murdoch