[R] invalid variable type in model.frame within a function

Ingmar Visser I.Visser at uva.nl
Fri Mar 24 13:52:45 CET 2006


> On Thu, 23 Mar 2006, Ingmar Visser wrote:
> 
>> Dear expeRts,
>> 
>> I came across the following error in using model.frame:
>> 
>> # make a data.frame
>> jet=data.frame(y=rnorm(10),x1=rnorm(10),x2=rnorm(10),rvar=rnorm(10))
>> # spec of formula
>> mf1=y~x1+x2
>> # make the model.frame
>> mf=model.frame(formula=mf1,data=jet,weights=rvar)
>> 
>> Which gives the desired output:
> <output snipped>
>> However, doing this inside another function like this:
>> 
>> makemodelframe <- function(formula,data,weights) {
>>    mf=model.frame(formula=formula,data=data,weights=weights)
>>    mf
>> }
>> 
>> produces the following error:
>> 
>>> makemodelframe(mf1,jet,weights=rvar)
>> Error in model.frame(formula, rownames, variables, varnames, extras,
>> extranames,  :
>>    invalid variable type
>> 
>> 
>> Searching the R-help archives I came across bug-reports about this but
>> couldn't figure out whether the bug was solved or whether there are
>> work-arounds available.
> 
> It's not a bug. There have been bug reports about related issues (and also
> about this issue, but they tend to be marked "not a bug").
> 
> If you think about it, how could
>     makemodelframe(mf1,jet,weights=rvar)
> 
> possibly work?
> 
> R passes variables by value, so rvar has to be evaluated before the
> function is called. But rvar is not the name of any global
> variable (it's just a column in the data frame), so how can R know where to
> look?
> 
> The reason that people think it might work is by analogy with model.frame
> and the regression commands, where
>    model.frame(y~x, data=d, weights=w)
> does somehow retrieve d$w as the weight.  This analogy tends to override
> programming commonsense and make people believe that R will somehow know
> where to find the weights.
> 
> Now, since model.frame() *does* manage to find the weights, it must be
> possible, and it is.  That doesn't make it a good idea, though. Regression
> commands and model.frame() do some fairly advanced trickery to make it
> work. This is documented on developer.r-project.org.
> 
> I don't think it's a good idea for people to write code like this. I
> should admit (especially since it's Lent at the moment, and so is an
> appropriate time to repent one's past errors) that I lobbied Ross and
> Robert to make model.frame() work compatibly with S-PLUS in its treatment
> of weights= arguments (when porting the survival package, nearly ten
> years ago).  They were reluctant at the time, and I now think they were
> right, although this level of S-PLUS compatibility might have been
> unavoidable.
> 
> I would advise writing your code so that the call looks like
>    makemodelframe(mf1,jet,weights=~rvar)
> That is, pass all the variables that are going to be evaluated in the
> data= argument as formulas (or as quoted expressions).  This is basically
> what lme() does, where you supply two formulas and then various other bits
> and pieces as objects. It is what my survey package does.
> 
> Then a user can do
>    makemodelframe(mf1,jet,weights=rvar)
> if rvar is a variable in the current environment and
>    makemodelframe(mf1,jet,weights=~rvar)
> if rvar is a variable in the data= argument, and both will work.

I'm still getting the same error using:

> jet=data.frame(y=rnorm(10),x1=rnorm(10),x2=rnorm(10),rvar=rnorm(10))
> # spec of formula
> mf1=y~x1+x2
> 
> makemodelframe <- function(formula,data,weights) {
+     mf=model.frame(formula=formula,data=data,weights=weights)
+     mf
+ }
> 
> makemodelframe(mf1,jet,weights=jet$rvar)
Error in model.frame(formula, rownames, variables, varnames, extras,
extranames,  : 
    invalid variable type
> makemodelframe(mf1,jet,weights=~rvar)
Error in model.frame(formula, rownames, variables, varnames, extras,
extranames,  : 
    invalid variable type
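
A possible explanation, offered as a sketch rather than a confirmed
diagnosis: inside the wrapper, model.frame() only sees the unevaluated
symbol `weights`, which it looks up first in data and then in the
environment of the formula. Neither holds the wrapper's argument, so the
lookup presumably ends at the function stats::weights, hence "invalid
variable type". The wrapper also never unpacks a one-sided formula such as
~rvar. The version below is one way the suggested interface might be
implemented; the name makemodelframe2 and the use of do.call() are my own
assumptions, not something given in this thread.

```r
# Sketch: accept weights either as a vector or as a one-sided formula (~rvar).
makemodelframe2 <- function(formula, data, weights = NULL) {
  if (inherits(weights, "formula")) {
    # Evaluate the right-hand side of ~rvar in data, falling back to the
    # environment in which the weights formula was created.
    weights <- eval(weights[[2L]], data, environment(weights))
  }
  args <- list(formula = formula, data = data)
  if (!is.null(weights)) args$weights <- weights
  # do.call() hands model.frame() the evaluated values, so it does not have
  # to look up a symbol named `weights`.
  do.call(stats::model.frame, args)
}

set.seed(1)
jet <- data.frame(y = rnorm(10), x1 = rnorm(10), x2 = rnorm(10), rvar = rnorm(10))
mf1 <- y ~ x1 + x2

mf_a <- makemodelframe2(mf1, jet, weights = ~rvar)     # rvar is a column of jet
mf_b <- makemodelframe2(mf1, jet, weights = jet$rvar)  # an ordinary vector
```

In both cases the weights should end up in the "(weights)" column of the
returned model frame.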

> There is some discussion of this in a note on "Nonstandard evaluation" on
> the developer.r-project.org webpage, including a function that will
> produce a single model frame from multiple formulas.
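
For reference, lm() builds its model frame with a related idiom: it
captures its own call with match.call(), changes the function being called
to model.frame, and evaluates the result in the caller's frame, so that an
unquoted column name such as rvar is resolved against data= just as in a
direct call. A minimal sketch under that assumption follows; the name
makemodelframe_nse is hypothetical, and this may or may not match what the
note itself recommends.

```r
# Sketch of the match.call() idiom, assuming all three arguments are given.
makemodelframe_nse <- function(formula, data, weights) {
  cl <- match.call()                     # the call as the user typed it
  cl[[1L]] <- quote(stats::model.frame)  # same arguments, different function
  eval(cl, parent.frame())               # evaluate where the user called us
}

set.seed(1)
jet <- data.frame(y = rnorm(10), x1 = rnorm(10), x2 = rnorm(10), rvar = rnorm(10))
mf1 <- y ~ x1 + x2

mf <- makemodelframe_nse(mf1, jet, weights = rvar)  # rvar found in jet
```

This reproduces the behaviour that the advice above cautions against, so
it is shown to explain the mechanism rather than to recommend it.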
> 
> 
> Now, I think there are some exceptions to this recommendation, and I don't
> have a very clear definition of them. I think of them as "macro-like"
> functions that evaluate a supplied expression in some special context.
> Functions like this in base R include with() and capture.output(), and you
> will find some more nice simple examples in the mitools package. For these
> functions it really isn't ambiguous where the evaluation takes place.  A
> related issue is functions such as the plot() methods that use the
> unevaluated forms of their arguments as labels. Again, the evaluation
> of the labels isn't ambiguous, because it doesn't even happen.
> 
> With a few exceptions like these, though, I think it's a bad idea
> to subvert the pass-by-value illusion in R. This was a lot more than you
> probably wanted to know.
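
To make the "macro-like" exceptions concrete, here is a small sketch using
the base functions named above. The data and the helper label_of() are made
up for illustration; label_of() only mimics the way plot() methods derive a
default label from an unevaluated argument.

```r
jet <- data.frame(y = c(1, 2, 3), rvar = c(10, 20, 30))

# with() evaluates the expression inside the data frame; there is no doubt
# about where y and rvar come from.
total <- with(jet, sum(y * rvar))

# capture.output() evaluates the expression and collects what it prints.
out <- capture.output(print(1:3))

# The argument is never evaluated here; only its text is used as a label.
label_of <- function(x) deparse(substitute(x))
lab <- label_of(jet$rvar)
```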

ingmar



