[R] Peculiar behavior of attached objects

Greg Hammett hammett at princeton.edu
Sat Aug 17 16:38:06 CEST 2002

I've just discovered R and think it is terrific.  I quickly reproduced
results with a few lines of R commands that 7 years ago I had to do with
a larger Fortran code and many calls to NAG routines.  (I'm mostly a
computational plasma physicist, but occasionally delve into statistical
analysis of data.)

But I've come across a very peculiar behavior of attached objects that
cost me hours of searching for a bug, and it would be nice if the R
developers could implement a small change to make the language easier to
use.

The problem was originally buried in a much larger code, but I've boiled
it down to a six-line example:


> d <- data.frame(y=10)
> attach(d)
> d$y <- 20


The online help for attach() warns not to assign to the short variable
name "y", as that creates a new variable named "y" and the original
variable "d$y" remains unchanged.  So I assumed that I could assign to
the fully qualified name "d$y", and indeed that successfully changed the
value of d$y:


> d$y
[1] 20
> y
[1] 10
> ls()
[1] "d"


However, unbeknownst to me at first, it also created a new variable "y"
that keeps the original value of "d$y" and no longer points to the
present value of "d$y".  Furthermore, this new variable "y" doesn't
show up in the list of objects reported by ls()!  (This is unlike the
example given in help(attach), where the new variable "height" created
by the assignment shows up in the ls() object list.)  If a user assumes
that "y" points to the present value of "d$y", as the attach() command
usually does, he will have bugs that will be very hard to track down.
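
For anyone trying to see where the second "y" lives, here is a minimal
sketch.  It assumes a fresh session in which nothing else named "y" or
"d" exists.  find() and the name argument of ls() are standard R, but
the names where_y and attached_names are only illustrative.

```r
d <- data.frame(y = 10)
attach(d)              # puts a copy of d on the search path, under the name "d"
d$y <- 20              # changes d in the workspace, not the attached copy

where_y <- find("y")          # search-path entries that contain an object "y"
attached_names <- ls("d")     # objects inside the attached entry "d"
print(where_y)
print(attached_names)
print(c(workspace = d$y, attached = y))

detach("d")            # clean up the search path
```

In other words, ls() with no arguments reports only the workspace, while
ls("d") reports what attach() placed on the search path.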

Although the new variable "y" is hidden from the ls() list of objects,
it will be removed by doing a detach("d") command:


> detach("d")
> y
Error: Object "y" not found


I can't think of any good reason why R should behave like this.  I've
tried this same example in Splus, and was surprised to see that it has
the same behavior, so I suppose R at least has compatible
peculiarities.  I understand that assigning to a short variable name
when attach is operational is supposed to create a new variable instead
of modifying the original:

> d <- data.frame(y=10)
> attach(d)
> y <- 20
> d$y
[1] 10

and that a lot of R code might have been written assuming this behavior
so it probably shouldn't be changed at this point.  But if one makes an
assignment to a fully qualified long variable name, I can't think of any
good reason for a new semi-hidden variable to be created.  Thus I think
that R should instead do the following:

> d <- data.frame(y=10)
> attach(d)
> d$y <- 20
> y
[1] 20

This seems to me to be a much more natural and intuitive behavior that
the user should expect.  Compatibility issues may require adding a
switch to allow users to get the old behavior if they really wanted, but
I can't think of how any users could have relied on this undocumented
behavior.
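
In the meantime, two possible workarounds can be sketched as follows.
This is only an illustration, assuming a version of R that provides
with(); the names via_with and via_reattach are made up for the example.

```r
d <- data.frame(y = 10)
d$y <- 20
via_with <- with(d, y)     # evaluated against the current d, no attach() needed

attach(d)
d$y <- 30                  # the attached copy still holds the old value
detach("d")                # drop the stale attached copy
attach(d)                  # attach the updated data frame
via_reattach <- y          # now found in the freshly attached copy
detach("d")

print(c(via_with = via_with, via_reattach = via_reattach))
```

The first form avoids the stale copy altogether; the second keeps the
short names but requires remembering to re-attach after every change.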


I'm new to R, so perhaps I'm missing something that could be explained
to me.  If it is decided not to change R's behavior, then at the least 
I suggest that the example given by help(attach) be extended by
appending the following:

     women$height <- height*2.54  ## Don't try to do this either, as it
     ## will still create a new variable "height" with the original
     ## values of women$height.  I.e., height no longer points to the
     ## present value of women$height:

     sd(women$height-height)   # shows 6.88709

     ## furthermore, this new variable is not listed by ls() and
     ## disappears after doing detach("women")
     detach("women")
     height   # gives an error message
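
For checking the numbers in that proposed addition, a self-contained
sketch follows.  It assumes a fresh session and the "women" dataset that
ships with R; the names spread, in_workspace and in_attached are only
illustrative.

```r
attach(women)                       # attach the built-in dataset
women$height <- height * 2.54       # creates a modified copy of women in the workspace

spread <- sd(women$height - height) # nonzero because the attached height is unchanged
print(round(spread, 5))             # 6.88709, the value quoted above

in_workspace <- "height" %in% ls()         # is "height" listed by plain ls()?
in_attached  <- "height" %in% ls("women")  # is it in the attached entry?
print(c(in_workspace = in_workspace, in_attached = in_attached))

detach("women")                     # remove the attached copy from the search path
rm(women)                           # remove the modified workspace copy
```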

Greg Hammett    hammett at princeton.edu
Lecturer with rank of Professor, 
   Astrophysical Sciences, Princeton University
Principal Research Physicist, 
   Princeton Plasma Physics Laboratory