[R] model.frame: how does one use it?

hadley wickham h.wickham at gmail.com
Fri Jun 15 21:29:31 CEST 2007


On 6/15/07, Deepayan Sarkar <deepayan.sarkar at gmail.com> wrote:
> On 6/15/07, Dirk Eddelbuettel <edd at debian.org> wrote:
> >
> > Philipp Benner filed a Debian bug report against r-cran-rpart, aka rpart.
> > In short, the issue has to do with how rpart evaluates a formula and
> > supporting arguments, in particular 'weights'.
> >
> > A simple contrived example is
> >
> > -----------------------------------------------------------------------------
> > library(rpart)
> >
> > ## using data from help(rpart), set up simple example
> > myformula <- formula(Kyphosis ~ Age + Number + Start)
> > mydata <- kyphosis
> > myweight <- abs(rnorm(nrow(mydata)))
> >
> > goodFunction <- function(mydata, myformula, myweight) {
> >   hyp <- rpart(myformula, data=mydata, weights=myweight, method="class")
> >   prev <- hyp
> > }
> > goodFunction(mydata, myformula, myweight)
> > cat("Ok\n")
> >
> > ## now remove myweight and try to compute it inside a function
> > rm(myweight)
> >
> > badFunction <- function(mydata, myformula) {
> >   myweight <- abs(rnorm(nrow(mydata)))
> >   mf <- model.frame(myformula, mydata, weights=myweight)
> >   print(head(mf))
> >   hyp <- rpart(myformula,
> >                data=mf,
> >                weights=myweight,
> >                method="class")
> >   prev <- hyp
> > }
> > badFunction(mydata, myformula)
> > cat("Done\n")
> > -----------------------------------------------------------------------------
> >
> > Here goodFunction works, but only because myweight (with useless random
> > weights, but that is not the point here) is found from the calling
> > environment.
> >
> > badFunction fails after we remove myweight from there:
> >
> > :~> cat /tmp/philipp.R | R --slave
> > Ok
> > Error in eval(expr, envir, enclos) : object "myweight" not found
> > Execution halted
> > :~>
> >
> > As I was able to replicate it, I reported this to the package maintainer.  It
> > turns out that seemingly all is well as this is supposed to work this way,
> > and I got a friendly pointer to study model.frame and its help page.
> >
> > Now I am stuck as I can't make sense of model.frame -- see badFunction
> > above. I would greatly appreciate any help in making rpart work with a local
> > argument weights so that I can tell Philipp that there is no bug.  :)
>
> I don't know if ?model.frame is the best page to look at. There's a
> more detailed description at
>
> http://developer.r-project.org/nonstandard-eval.pdf
>
> but here are the non-standard evaluation rules as I understand them:
> given a name in either (1) the formula or (2) ``special'' arguments like
> 'weights' in this case, or 'subset', try to find the name
>
> 1. in 'data'
> 2. failing that, in environment(formula)
> 3. failing that, in the enclosing environment, and so on.
>
> By 'name', I mean a symbol, such as 'Age' or 'myweight'.  So
> basically, everything is as you would expect if the name is visible in
> data, but if not, the search starts in the environment of the formula,
> not the environment where the function call is being made (which is
> the standard evaluation behaviour).  This is a feature, not a bug
> (things would be a lot more confusing if it were the other way round).

Could you give an example?  It's always seemed confusing to me and I
don't see why looking in the environment of the formula helps.

Hadley
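
For reference, here is a minimal sketch of the lookup order quoted above
(data first, then environment(formula), then its enclosing environments).
It uses lm() rather than rpart() so that it runs without extra packages,
on the assumption that both build their model frame through model.frame()
in the same way. The names d, local_w, fit_fails, fit_works and fit_fixed
are made up for illustration and do not come from the thread.

```r
## Toy data: 'y' and 'x' live in the data frame, the weights do not.
set.seed(1)
d <- data.frame(x = 1:10, y = rnorm(10))

## Case 1: the formula is created outside the function, so
## environment(f_outer) is NOT the function's frame.  'local_w' is
## looked up in 'data', then in environment(f_outer), and is not found.
f_outer <- y ~ x
fit_fails <- function(form, data) {
  local_w <- rep(1, nrow(data))        # visible only inside this call
  lm(form, data = data, weights = local_w)
}
res1 <- try(fit_fails(f_outer, d), silent = TRUE)
inherits(res1, "try-error")            # the call fails

## Case 2: the formula is created inside the function, so
## environment(form) is the function's frame and 'local_w' is found
## there (rule 2 in the list above).
fit_works <- function(data) {
  local_w <- rep(2, nrow(data))
  form <- y ~ x
  lm(form, data = data, weights = local_w)
}
res2 <- fit_works(d)

## Case 3: one possible workaround when the formula is passed in:
## store the weights in 'data', so they are found by rule 1.
## (Assumes the formula names its terms explicitly; with 'y ~ .' the
## extra column would also be picked up as a predictor.)
fit_fixed <- function(form, data) {
  data$local_w <- rep(3, nrow(data))
  lm(form, data = data, weights = local_w)
}
res3 <- fit_fixed(f_outer, d)
```

Another option for Case 3 would be to reset the formula's environment
inside the function with environment(form) <- environment(), at the cost
of changing where any other free names in the formula are looked up.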


