[Rd] surprising behaviour of names<-

Thomas Lumley tlumley at u.washington.edu
Mon Mar 16 09:10:33 CET 2009


Wacek,

In this case I think `*tmp*` dates from the days before backticks, when it was not a syntactically valid name (it still isn't) and it was much, much harder to use such names, so the collision issue really didn't exist.
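
To make the point about backticks concrete, here is a minimal sketch. It only shows that a bare *tmp* is rejected by the parser, while backticks (or assign()/get()) let the same characters act as a variable name. The environment `e` and the names `bare_ok`, `quoted_ok` and `val` are illustrative, not anything from the R sources.

```r
# A bare *tmp* is not syntactic: the parser rejects it outright.
bare_ok <- tryCatch({ parse(text = "*tmp* <- 1"); TRUE },
                    error = function(e) FALSE)

# With backticks the same characters parse as an ordinary symbol.
quoted_ok <- tryCatch({ parse(text = "`*tmp*` <- 1"); TRUE },
                      error = function(e) FALSE)

# Bind and read the name in a throwaway environment, so the sketch
# does not touch the workspace (or any real `*tmp*` in it).
e <- new.env()
assign("*tmp*", 1, envir = e)
val <- eval(quote(`*tmp*`), e)
```

Once backticks make the name this easy to reach, a collision with user code becomes possible.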

You're right about the documentation.
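
As a rough sketch of the expansion the documentation describes, the following spells it out with an ordinary temporary. Here `tmp` is a hypothetical stand-in for `*tmp*`, chosen so the example does not depend on what the interpreter does internally, and the final rm() is only meant to mirror the unbinding observed in the quoted message below.

```r
x <- 1:10
y <- x

# Complex assignment as the user writes it.
x[3:5] <- 13:15

# The documented expansion, using an ordinary temporary name.
tmp <- y
y <- `[<-`(tmp, 3:5, value = 13:15)
rm(tmp)  # stand-in for the unbinding step the documentation leaves out

# The same pattern for names<-.
z <- c(1, 2)
names(z) <- c("a", "b")
w <- `names<-`(c(1, 2), value = c("a", "b"))
```

Both pairs (x and y, z and w) should come out identical; the only difference from the real mechanism is which name holds the temporary.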

       -thomas


On Sun, 15 Mar 2009, Wacek Kusnierczyk wrote:

> Berwin A Turlach wrote:
>>
>> Obviously, assuming that R really executes
>> 	`*tmp*` <- x
>> 	x <- "names<-"(`*tmp*`, value=c("a","b"))
>> under the hood, in the C code, then *tmp* does not end up in the symbol
>> table and does not persist beyond the execution of
>> 	names(x) <- c("a","b")
>
> to prove that i take you seriously, i have peeked into the code, and
> found that indeed there is a temporary binding for *tmp* made behind the
> scenes -- sort of. unfortunately, it is not done carefully enough to
> avoid possible interference with the user's code:
>
> '*tmp*' = 0
> `*tmp*`
> # 0
>
> x = 1
> names(x) = 'foo'
> `*tmp*`
> # error: object "*tmp*" not found
>
> `*ugly*`
>
> given that `*tmp*` is a perfectly legal (though some would say
> 'non-standard') name, it would be good if somewhere here a warning were
> issued -- perhaps where i assign to `*tmp*`, because `*tmp*` is not just
> any non-standard name, but one that is 'obviously' used under the hood
> to perform black magic.
>
> it also appears that the explanation given in, e.g., the r language
> definition (draft, of course) sec. 3.4.4:
>
> "
> Assignment to subsets of a structure is a special case of a general
> mechanism for complex assignment:
> x[3:5] <- 13:15
> The result of this command is as if the following had been executed
> `*tmp*` <- x
> x <- "[<-"(`*tmp*`, 3:5, value=13:15)
> "
>
> is incomplete (because the final result is not '*tmp*' having the value
> of x, as it might seem, but rather '*tmp*' having been unbound).
>
> so the suggestion for the documenters is to add to the end of the
> section (or wherever else it is appropriate) a warning to the effect
> that in the end '*tmp*' will be removed, even if the user has explicitly
> defined it earlier in the same scope.
>
> or maybe have the implementation not rely on a user-forgeable name? for
> example, the '.Last.value' name is automatically bound to the most
> recently returned value, but it resides in package:base and does not
> collide with bindings using it made by the user:
>
> .Last.value = 0
>
> 1
> .Last.value
> # 0, not 1
>
> 1
> base::.Last.value
> # 1, not 0
>
>
> why could not '*tmp*' be bound and unbound outside of the user's
> namespace? (i guess it's easier to update the docs -- or just ignore the
> issue.)
>
>
> on the margin, trace('<-') will pick up only one of the uses of '<-'
> suggested by the code above:
>
> x <- 1:10
>
> trace('<-')
> x[3:5] <- 13:15
> # trace: x[3:5] <- 13:15
> # trace: x <- `[<-`(`*tmp*`, 3:5, value = 13:15)
>
> which is somewhat confusing, because then '*tmp*' appears in the trace
> somewhat ex machina. (again, the explanation is in the source code, but
> the trace output could have been more informative.)
>
> cheers,
> vQ
>
>

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle


