[R] environments again
Thomas Lumley
tlumley at u.washington.edu
Mon Dec 17 23:55:34 CET 2001
Yes, there's a bug. It's not as simple as your email suggests -- and it
provides a nice illustration of why a reproducible example is much more
helpful than a hypothesis ("adding an extra argument makes it unable to
find the first argument").
Also note that the problem doesn't happen if the variables are in a
data= argument, which is a simple way to stop this happening and is
generally a Good Thing.
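As a minimal sketch of that workaround (ff2 and d are names made up here
for illustration; the variables match the example further down), the same
kind of call with the variables collected into a data frame would look
roughly like this:

```r
ff2 <- function() {
  # Put the variables in a data frame instead of leaving them as
  # local variables of the function.
  d <- data.frame(why = 1:10, ex = rep(0:1, 5), ess = rep(1:5, 2))
  # With data=, the variables are looked up in d first, so the lookup
  # does not depend on the environment attached to the formula.
  aov(why ~ ex + Error(ess), data = d)
}
fit <- ff2()
print(fit)
```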
Read on for more detail than you probably want. Here's a simpler version
that shows the problem and doesn't use common variable names -- with your
functions and the internals of aov there were too many things called `y'
for my taste.
ff<-function(){
  why<-1:10
  ex<-rep(0:1,5)
  ess<-rep(1:5,2)
  print(aov(why~ex)) # works
  print(aov(why~ex+Error(ess))) # doesn't
}
So the problem has something to do with Error() terms.
The traceback() shows that the error occurs inside aov, when it creates
a new lm(why~ess) call to handle the Error(). At this point we have
Browse[1]> ecall
lm(formula = why ~ ess, singular.ok = TRUE, method = "qr", qr = TRUE)
Browse[1]> eval(ecall,parent.frame())
Error in eval(expr, envir, enclos) : Object "why" not found
but evaluating what is seemingly the same call, written out explicitly, works
Browse[1]> eval(quote(lm(formula = why ~ ess, singular.ok = TRUE, method =
"qr", qr = TRUE)),parent.frame())
Call:
lm(formula = why ~ ess, method = "qr", qr = TRUE, singular.ok = TRUE)
Coefficients:
(Intercept)          ess
        2.5          1.0
This suggests that we have a problem with formula environments, and indeed
Browse[1]> ls(env=environment(formula(ecall)))
 [1] "Call"        "Terms"       "allTerms"    "contrasts"   "data"
 [6] "eTerm"       "ecall"       "errorterm"   "formula"     "indError"
[11] "intercept"   "lmcall"      "opcons"      "projections" "qr"
where the original formula argument has
Browse[1]> ls(env=environment(formula))
[1] "ess" "ex" "why"
agreeing with
Browse[1]> ls(env=parent.frame())
[1] "ess" "ex" "why"
So it's a bug in aov() caused by the relatively new scoping rules for
formulas, where variables that aren't found in a specified data frame are
now sought in the environment of the formula.
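To make the scoping rule concrete, here is a small sketch (make_formula is
a made-up name) showing that a formula remembers the environment it was
created in, and that lm() finds the variables there when no data= is given:

```r
make_formula <- function() {
  why <- 1:10
  ess <- rep(1:5, 2)
  # The formula's environment is the frame of this function call,
  # which is where why and ess live.
  why ~ ess
}
f <- make_formula()
vars <- ls(environment(f))   # names visible in the formula's environment
print(vars)
# No data= argument, and why/ess do not exist at top level, so the
# variables have to come from environment(f).
fit1 <- lm(f)
print(coef(fit1))
```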
In most cases this is an improvement over the previous rules, but it
causes problems for functions that do surgery on formulas, like aov() and
coxph().
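One way a function that rebuilds formulas can avoid this is to copy the
original formula's environment onto the new one. The following is only a
sketch of that idea, with made-up names (rebuild, demo_fit), and is not
necessarily how aov() itself should be or will be fixed:

```r
# Hypothetical helper: build a new formula with the same response as
# `formula` but a different right-hand side, given as a string.
rebuild <- function(formula, rhs) {
  new <- as.formula(paste(deparse(formula[[2]]), "~", rhs))
  # `new` was created inside rebuild(), so its environment is this
  # function's frame; point it back at the original formula's environment.
  environment(new) <- environment(formula)
  new
}

demo_fit <- function() {
  why <- 1:10
  ex <- rep(0:1, 5)
  ess <- rep(1:5, 2)
  f <- why ~ ex
  g <- rebuild(f, "ess")   # why ~ ess, same environment as f
  lm(g)                    # why and ess are found via environment(g)
}
fit2 <- demo_fit()
print(coef(fit2))
```

Without the environment<- line, the rebuilt formula would point at the
helper's own frame, which is the same kind of mismatch seen in the ls()
output above.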
I think a fix should be simple but it may be too late for 1.4.0, which is
due nearly tomorrow.
-thomas