[Rd] [R] difference in using with() and the "data" argument in glm (PR#9338)

murdoch at stats.uwo.ca
Fri Nov 3 15:12:29 CET 2006


I've redirected this reply from r-help to the bugs list.

On 11/3/2006 8:25 AM, vito muggeo wrote:
> Dear all,
> I am dealing with the following (apparently simple) problem:
> For some reason I am interested in passing variables from a dataframe
> to a specific environment, and in fitting a standard glm:
> 
> dati<-data.frame(y=rnorm(10),x1=runif(10),x2=runif(10))
> KK<-new.env()
> for(i in 1:ncol(dati)) assign(names(dati[i]),dati[[i]],envir=KK)
> #Now the following two lines work correctly:
> coef(glm(y~x1+x2,data=KK))
> with(KK,coef(glm(y~x1+x2)))
> 
> #However, if I write the above code inside a function, with() does not
> #appear to work:
> 
> ff<-function(Formula,Data,method=1){
>      KK<-new.env()
>      for(i in 1:ncol(Data)) assign(names(Data[i]),Data[[i]],envir=KK)
>      o<-if(method==1) glm(Formula,data=KK) else with(KK,glm(Formula))
>      o}
> 
>  > ff(y~x1+x2,dati,1) #it works
> Call:  glm(formula = Formula, data = KK)
> ..[SNIP]..
>  > ff(y~x1+x2,dati,2) #it does not
> Error in eval(expr, envir, enclos) : object "y" not found
>  >
> 
> Could anyone explain this difference? I believed that
> "with(data,glm(formula))" and "glm(formula,data)" were equivalent.

I think this is a bug in terms.formula.  Near the end it has

     environment(terms) <- environment(x)

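where x is the formula.  In your function that is the environment in
which the formula was created (the caller of ff), not KK.  Since "y"
isn't defined in that environment, it fails.  It would work for you with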

     environment(terms) <- data

but see below.
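
To see which environment the formula carries, here is a minimal sketch
(not from the original report; show_env is a hypothetical helper and the
seed is arbitrary).  It only illustrates that a formula passed as an
argument keeps the environment where it was written:

```r
set.seed(1)  # arbitrary seed, only for reproducibility
dati <- data.frame(y = rnorm(10), x1 = runif(10), x2 = runif(10))

# show_env is a hypothetical helper: it returns the environment
# attached to the formula it receives.
show_env <- function(Formula) environment(Formula)

# The formula is created where it is written (the calling environment),
# not inside the function that receives it.
caller_env <- environment()
f_env <- show_env(y ~ x1 + x2)
same_env <- identical(f_env, caller_env)

# "y" exists only as a column of dati, so it is not defined directly
# in the formula's environment.
y_visible <- exists("y", envir = f_env, inherits = FALSE)
```

Here same_env should be TRUE and y_visible should be FALSE, which is the
situation glm() finds itself in when it is called without "data".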

A workaround that should work for you is to put

environment(Formula) <- KK

before the call to glm.
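
As a sketch of that workaround applied to the function from the report
(ff_fixed is a hypothetical name, the seed is arbitrary, and only the
with() branch is kept):

```r
set.seed(1)  # arbitrary seed, only for reproducibility
dati <- data.frame(y = rnorm(10), x1 = runif(10), x2 = runif(10))

# ff_fixed is a hypothetical name: the with() branch of ff plus the
# workaround line.
ff_fixed <- function(Formula, Data) {
  KK <- new.env()
  for (i in seq_along(Data)) assign(names(Data)[i], Data[[i]], envir = KK)
  environment(Formula) <- KK   # the workaround: variables are now found in KK
  with(KK, glm(Formula))
}

fit_env <- ff_fixed(y ~ x1 + x2, dati)

# Reference fit using the data frame directly, for comparison.
fit_df <- glm(y ~ x1 + x2, data = dati)
same_coef <- all.equal(coef(fit_env), coef(fit_df))
```

Both fits use the same data, so the coefficients should agree.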

I'm not going to make the patch I suggest above, because I don't think 
it's consistent with the expected behaviour of glm() in the case where 
some of the terms in the formula are supposed to come from 
environment(x), and some from "data".

I don't know how to handle that case properly:  I think it requires a 
different search strategy than R employs (but I might be wrong).  This 
isn't a problem with the workaround I suggested to you, because there 
environment(x) is an ancestor of KK (reached through the frame of ff), 
but that wouldn't be true in general.
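
A small sketch of the mixed case I have in mind (make_fit is a
hypothetical function and the seed is arbitrary): "y" and "x1" come from
"data", while "x2" exists only in the environment of the formula.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
d <- data.frame(y = rnorm(10), x1 = runif(10))

# make_fit is a hypothetical function used only for illustration.
make_fit <- function(data) {
  x2 <- runif(nrow(data))          # lives only in this function's frame
  glm(y ~ x1 + x2, data = data)    # the formula's environment is this frame
}

fit <- make_fit(d)
coef_names <- names(coef(fit))
```

This works because variables missing from "data" are looked up in the
environment of the formula.  Pointing the terms at "data" alone would
lose "x2" here unless that environment can still be reached from "data".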

Duncan Murdoch


