[Rd] surprising behaviour of names<-

Wacek Kusnierczyk Waclaw.Marcin.Kusnierczyk at idi.ntnu.no
Sun Mar 15 21:01:33 CET 2009


Berwin A Turlach wrote:
>
> Obviously, assuming that R really executes 
> 	*tmp* <- x
> 	x <- "names<-"('*tmp*', value=c("a","b"))
> under the hood, in the C code, then *tmp* does not end up in the symbol
> table and does not persist beyond the execution of 
> 	names(x) <- c("a","b")
>
>   

to prove that i take you seriously, i have peeked into the code, and
found that indeed there is a temporary binding for *tmp* made behind the
scenes -- sort of. unfortunately, it is not done carefully enough to
avoid possible interference with the user's code:

'*tmp*' = 0
`*tmp*`
# 0

x = 1
names(x) = 'foo'
`*tmp*`
# error: object "*tmp*" not found

`*ugly*`

given that `*tmp*`is a perfectly legal (though some would say
'non-standard') name, it would be good if somewhere here a warning were
issued -- perhaps where i assign to `*tmp*`, because `*tmp*` is not just
any non-standard name, but one that is 'obviously' used under the hood
to perform black magic.

it also appears that the explanation given in, e.g., the r language
definition (draft, of course) sec. 3.4.4:

"
Assignment to subsets of a structure is a special case of a general
mechanism for complex
assignment:
x[3:5] <- 13:15
The result of this commands is as if the following had been executed
‘*tmp*‘ <- x
x <- "[<-"(‘*tmp*‘, 3:5, value=13:15)
"

is incomplete (because the final result is not '*tmp*' having the value
of x, as it might seem, but rather '*tmp*' having been unbound).

so the suggestion for the documenters is to add to the end of the
section (or wherever else it is appropriate) a warning to the effect
that in the end '*tmp*' will be removed, even if the user has explicitly
defined it earlier in the same scope.

or maybe have the implementation not rely on a user-forgeable name? for
example, the '.Last.value' name is automatically bound to the most
recently returned value, but it resides in package:base and does not
collide with bindings using it made by the user:

.Last.value = 0

1
.Last.value
# 0, not 1

1
base::.Last.value
# 1, not 0


why could not '*tmp*' be bound and unbound outside of the user's
namespace? (i guess it's easier to update the docs -- or just ignore the
issue.)


on the margin, traceback('<-') will pick only one of the uses of '<-'
suggested by the code above:

x <- 1:10

trace('<-')
x[3:5] <- 13:15
# trace: x[3:5] <- 13:15
# trace: x <- `[<-`(`*tmp*`, 3:5, value = 13:15)

which is somewhat confusing, because then '*tmp*' appears in the trace
somewhat ex machina. (again, the explanation is in the source code, but
the traceback could have been more informative.)

cheers,
vQ



More information about the R-devel mailing list