[R] Strange scope problem
Angelo Canty
canty at math.mcmaster.ca
Thu Oct 16 01:50:47 CEST 2003
Hi,
I have come across the following problem which seems to be a scoping
issue but I'm at a loss to see why this is so or to find a good
workaround.
Suppose I have a function to get a prediction after model selection
using the step function.
step.pred <- function(dat, x0) {
fit.model <- step(lm(y~., data=dat), trace=F)
predict(fit.model, x0, se.fit=T)
}
This function works sometimes for example
set.seed(1)
X.1 <- data.frame(x1=rnorm(20), x2=rnorm(20), x3=rnorm(20))
y.1 <- 5+as.matrix(X.1[,1:2])%*%matrix(c(1,1))+rnorm(20)
Xy.1 <- data.frame(X.1,y=y.1)
x0.1 <- data.frame(x1=-1,x2=-1, x3=-1)
step.pred(Xy.1, x0.1)
$fit
[1] 3.359540
$se.fit
[1] 0.523629
$df
[1] 16
$residual.scale
[1] 1.093526
but most often it crashes as in
set.seed(2)
X.2 <- data.frame(x1=rnorm(20), x2=rnorm(20), x3=rnorm(20))
y.2 <- 5+as.matrix(X.2[,1:2])%*%matrix(c(1,1))+rnorm(20)
Xy.2 <- data.frame(X.2,y=y.2)
x0.2 <- data.frame(x1=-1,x2=-1, x3=-1)
step.pred(Xy.2, x0.2)
Error in model.frame.default(formula = y ~ x1 + x2, data = dat,
drop.unused.levels = TRUE) :
Object "dat" not found
The difference seems to be that for the first dataset, step retains
all three variables whereas for the second it drops one of them.
> step(lm(y~.,data=Xy.1), trace=F)
Call:
lm(formula = y ~ x1 + x2 + x3, data = Xy.1)
Coefficients:
(Intercept) x1 x2 x3
4.8347 0.8937 1.0331 -0.4516
> step(lm(y~.,data=Xy.2), trace=F)
Call:
lm(formula = y ~ x1 + x2, data = Xy.2)
Coefficients:
(Intercept) x1 x2
5.0802 0.9763 1.1369
One possible workaround is to explicitely assign the local variable
dat in the .GlobalEnv as in
step.pred1 <- function(dat, x0) {
assign("dat",dat, envir=.GlobalEnv)
fit.model <- step(lm(y~., data=dat), trace=F)
predict(fit.model, x0, se.fit=T)
}
I don't like this method since it would overwrite anything else called
dat in .GlobalEnv. I realize that I could give it an obscure name but
the potential for damage still remains. Am I missing something obvious
here? If not, is it possible to work around this problem in such a way
that .GlobalEnv does not need to be touched?
In S-Plus I would use
assign("dat",dat, frame=1)
which works but that is not available (AFAIK) in R. Is there
something similar that I can use?
I am using R 1.6.1 for Unix on a Sun Workstation. I know that I need
to upgrade but our sysadmin doesn't regard it as priority!
Thanks for any help you can give for this.
Angelo
------------------------------------------------------------------
| Angelo J. Canty Email: cantya at mcmaster.ca |
| Mathematics and Statistics Phone: (905) 525-9140 x 27079 |
| McMaster University Fax : (905) 522-0935 |
| 1280 Main St. W. |
| Hamilton ON L8S 4K1 |
More information about the R-help
mailing list