[Rd] issues with environment handling in model.frame()

Berry, Charles ccberry at health.ucsd.edu
Sat May 2 20:09:35 CEST 2020



> On May 2, 2020, at 5:30 AM, Antoine Fabri <antoine.fabri using gmail.com> wrote:
> 
> Dear all,
> 
> model.frame behaves in a way I don't expect when both its formula and
> subset argument are passed through a function call.
> 

See the help page

?formula

in particular the section headed 'Environments'.
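
As a minimal sketch of what that section describes (the names make_formula and top_level are hypothetical, used only for illustration): a formula remembers the environment in which it was created, and environment() reports it.

```r
# Hypothetical helper: builds a formula inside a function call.
make_formula <- function() {
  ~wool                          # created in this call's evaluation frame
}

top_level <- ~wool               # created wherever this line is evaluated
inner     <- make_formula()

# The formula built inside the function carries the function's frame,
# not the environment of the code that called the function.
same_env <- identical(environment(inner), environment(top_level))
print(same_env)                  # FALSE
```

Run at the R prompt, environment(top_level) is the global environment, while environment(inner) is a fresh environment created for that one call.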

Then look at the help for 

?model.frame

which says

"All the variables in formula, subset and in ... are looked for first in data and then in the environment of formula (see the help for formula for further details) and collected into a data frame."


> This works as expected:
> 
> model.frame(~wool, warpbreaks, breaks < 15)
> #>    wool
> #> 14    A
> #> 23    A
> #> 29    B
> #> 50    B
> fun1 <- function(y) model.frame(~wool, warpbreaks, y)
> fun1(with(warpbreaks, breaks < 15))
> #>    wool
> #> 14    A
> #> 23    A
> #> 29    B
> #> 50    B
> 

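Here the formula `~wool' has the environment created when fun1 is called, and `y' is a variable in that same environment.

So model.frame finds everything it needs: `wool' in `warpbreaks', and `y' in the formula's environment after first looking for it in `warpbreaks'.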
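
One way to check this, as a rough sketch (fun1_check is a hypothetical variant of fun1 written only for this illustration), is to ask whether `y' is bound in the formula's environment:

```r
# Hypothetical variant of fun1 that inspects instead of calling model.frame.
fun1_check <- function(y) {
  f <- ~wool                     # formula created inside the function
  # Is `y` bound directly in the formula's environment (no parent lookup)?
  exists("y", envir = environment(f), inherits = FALSE)
}

found <- fun1_check(with(warpbreaks, breaks < 15))
print(found)                     # TRUE
```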


> but this doesn't:
> 
> fun2 <- function(x, y) model.frame(x, warpbreaks, y)
> fun2(~wool, with(warpbreaks, breaks < 15))
> #> Error in eval(substitute(subset), data, env): object 'y' not found
> 


Here the formula was created at top level, so its environment is `<environment: R_GlobalEnv>', while `y' exists only in the environment created when fun2 is called.

So model.frame looks in warpbreaks, then in `<environment: R_GlobalEnv>' and doesn't find y in either.
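
The same kind of check, again only a sketch (fun2_check is hypothetical, and it assumes no object named `y' has been defined in the environment you call it from), shows the mismatch:

```r
# Hypothetical variant of fun2 that reports where `y` is bound.
fun2_check <- function(x, y) {
  c(
    # the formula's environment is the caller's, which has no `y`
    in_formula_env = exists("y", envir = environment(x), inherits = FALSE),
    # the function's own frame does have `y` (it is an argument)
    in_call_frame  = exists("y", envir = environment(), inherits = FALSE)
  )
}

res <- fun2_check(~wool, with(warpbreaks, breaks < 15))
print(res)
```

Under that assumption in_formula_env is FALSE and in_call_frame is TRUE, which is why the lookup of `y' fails.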

Similar issues apply in the use of formulas below.

> model.frame is used by xtabs() and aggregate() so the following won't work
> either:
> fun3 <- function(x, y) xtabs(x, warpbreaks, y)
> fun3(~wool, with(warpbreaks, breaks < 15))
> #> Error in eval(substitute(subset), data, env): object 'y' not found
> 
> fun4 <- function(x, y) aggregate(x, warpbreaks, length, subset = y)
> fun4(breaks ~ wool, with(warpbreaks, breaks < 15))
> #> Error in eval(substitute(subset), data, env): object 'y' not found
> 


Each of these cases can be made to work by inserting

	environment(x) <- environment()

before the call to model.frame in each of your fun[2-4] functions.
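
For fun2, for example, that would look something like this (fun2_fixed is just a name chosen for the sketch):

```r
# fun2 with the formula's environment pointed at the function's own frame,
# so that `y` can be found when model.frame evaluates the subset argument.
fun2_fixed <- function(x, y) {
  environment(x) <- environment()   # changes only the local copy of x
  model.frame(x, warpbreaks, y)
}

mf <- fun2_fixed(~wool, with(warpbreaks, breaks < 15))
print(mf)                           # same four rows as the direct call
```

The assignment modifies only the copy of the formula inside the function; the formula object held by the caller keeps its original environment.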

However, this can lead to headaches downstream if you need to save the formula and use it in another function.  If this is where you are headed, some time spent studying lm() and the methods for "lm" objects may help.
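
To illustrate the kind of headache meant here, under the assumption that the reset formula ends up stored in the result (fun2_fixed is a hypothetical name for a version of fun2 with the environment reset applied): the terms kept in the model frame now point at the function's frame rather than at your workspace.

```r
# Hypothetical wrapper that resets the formula's environment before use.
fun2_fixed <- function(x, y) {
  environment(x) <- environment()
  model.frame(x, warpbreaks, y)
}

mf <- fun2_fixed(~wool, with(warpbreaks, breaks < 15))

# The terms stored with the model frame carry the formula's environment.
saved_env <- environment(attr(mf, "terms"))
print(identical(saved_env, globalenv()))   # FALSE
print(ls(saved_env))                       # "x" "y"
```

Any later code that evaluates names in that stored environment searches the frame of the finished call first, not the environment where the formula was originally written.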

[rest deleted]

HTH,

Chuck

