[R] invalid variable type in model.frame within a function
Ingmar Visser
I.Visser at uva.nl
Fri Mar 24 13:52:45 CET 2006
> On Thu, 23 Mar 2006, Ingmar Visser wrote:
>
>> Dear expeRts,
>>
>> I came across the following error in using model.frame:
>>
>> # make a data.frame
>> jet=data.frame(y=rnorm(10),x1=rnorm(10),x2=rnorm(10),rvar=rnorm(10))
>> # spec of formula
>> mf1=y~x1+x2
>> # make the model.frame
>> mf=model.frame(formula=mf1,data=jet,weights=rvar)
>>
>> Which gives the desired output:
> <output snipped>
>> However, doing this inside another function like this:
>>
>> makemodelframe <- function(formula,data,weights) {
>> mf=model.frame(formula=formula,data=data,weights=weights)
>> mf
>> }
>>
>> produces the following error:
>>
>>> makemodelframe(mf1,jet,weights=rvar)
>> Error in model.frame(formula, rownames, variables, varnames, extras,
>> extranames, :
>> invalid variable type
>>
>>
>> Searching the R-help archives I came across bug-reports about this but
>> couldn't figure out whether the bug was solved or whether there are
>> work-arounds available.
>
> It's not a bug. There have been bug reports about related issues (and also
> about this issue, but they tend to be marked "not a bug").
>
> If you think about it, how could
> makemodelframe(mf1,jet,weights=rvar)
>
> possibly work?
>
> R passes variables by value, so rvar has to be evaluated before the
> function is called. But rvar is not the name of any global
> variable (it's just a column in the data frame), so how can R know where to
> look?
>
> The reason that people think it might work is by analogy with model.frame
> and the regression commands, where
> model.frame(y~x, data=d, weights=w)
> does somehow retrieve d$w as the weight. This analogy tends to override
> programming commonsense and make people believe that R will somehow know
> where to find the weights.
>
> Now, since model.frame() *does* manage to find the weights, it must be
> possible, and it is. That doesn't make it a good idea, though. Regression
> commands and model.frame() do some fairly advanced trickery to make it
> work. This is documented on developer.r-project.org.
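
For reference, a minimal sketch of the kind of trickery meant here: the match.call() idiom that lm() and similar modelling functions use. The wrapper name makemodelframe_nse is hypothetical and not from this thread; it re-issues the user's call as a call to model.frame() in the caller's frame, so weights=rvar is never evaluated inside the wrapper.

```r
# Hypothetical wrapper (name invented for illustration) using the
# match.call() idiom found in lm() and similar functions.
makemodelframe_nse <- function(formula, data, weights) {
  cl <- match.call()                     # the call as typed, arguments unevaluated
  cl[[1L]] <- quote(stats::model.frame)  # replace the function being called
  eval(cl, parent.frame())               # evaluate the new call in the caller's frame
}

set.seed(1)
jet <- data.frame(y = rnorm(10), x1 = rnorm(10), x2 = rnorm(10), rvar = rnorm(10))
mf1 <- y ~ x1 + x2

# rvar is looked up in jet by model.frame(), exactly as in a direct call
mf <- makemodelframe_nse(mf1, jet, weights = rvar)
```

This reproduces the behaviour of the direct model.frame() call, including the ambiguity about where rvar is looked up that the rest of this reply warns about.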
>
> I don't think it's a good idea for people to write code like this. I
> should admit (especially since it's Lent at the moment, and so is an
> appropriate time to repent one's past errors) that I lobbied Ross and
> Robert to make model.frame() work compatibly with S-PLUS in its treatment
> of weights= arguments (when porting the survival package, nearly ten
> years ago). They were reluctant at the time, and I now think they were
> right, although this level of S-PLUS compatibility might have been
> unavoidable.
>
> I would advise writing your code so that the call looks like
> makemodelframe(mf1,jet,weights=~rvar)
> That is, pass all the variables that are going to be evaluated in the
> data= argument as formulas (or as quoted expressions). This is basically
> what lme() does, where you supply two formulas and then various other bits
> and pieces as objects. It is what my survey package does.
>
> Then a user can do
> makemodelframe(mf1,jet,weights=rvar)
> if rvar is a variable in the current environment and
> makemodelframe(mf1,jet,weights=~rvar)
> if rvar is a variable in the data= argument, and both will work.
I'm still getting the same error using:
> jet=data.frame(y=rnorm(10),x1=rnorm(10),x2=rnorm(10),rvar=rnorm(10))
> # spec of formula
> mf1=y~x1+x2
>
> makemodelframe <- function(formula,data,weights) {
+ mf=model.frame(formula=formula,data=data,weights=weights)
+ mf
+ }
>
> makemodelframe(mf1,jet,weights=jet$rvar)
Error in model.frame(formula, rownames, variables, varnames, extras,
extranames, :
invalid variable type
> makemodelframe(mf1,jet,weights=~rvar)
Error in model.frame(formula, rownames, variables, varnames, extras,
extranames, :
invalid variable type
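
A likely explanation for both errors: inside the wrapper, model.frame() receives the expression `weights` and looks that name up in data and then in the environment of the formula (here the global environment), not in the wrapper's own frame. It probably finds the function stats::weights rather than a vector, hence "invalid variable type". Below is a minimal sketch of a wrapper that follows the advice quoted above, assuming weights is either a one-sided formula or an ordinary numeric vector. The rewritten body is an illustration, not code from this thread.

```r
# Hypothetical rewrite of makemodelframe: weights may be a one-sided
# formula (evaluated in `data`) or an ordinary numeric vector.
makemodelframe <- function(formula, data, weights = NULL) {
  if (inherits(weights, "formula")) {
    # Assumes a one-sided formula such as ~rvar, whose right-hand side is
    # weights[[2]]. Evaluate it in `data`, falling back to the environment
    # in which the formula was created.
    weights <- eval(weights[[2L]], data, environment(weights))
  }
  # do.call() hands model.frame() the already-evaluated weights, so there
  # is no symbol left for it to look up in the wrong place.
  do.call(stats::model.frame,
          list(formula = formula, data = data, weights = weights))
}

set.seed(1)
jet <- data.frame(y = rnorm(10), x1 = rnorm(10), x2 = rnorm(10), rvar = rnorm(10))
mf1 <- y ~ x1 + x2

mf_a <- makemodelframe(mf1, jet, weights = ~rvar)     # column of the data frame
mf_b <- makemodelframe(mf1, jet, weights = jet$rvar)  # ordinary vector
```

With this version both calls should return a model frame containing a "(weights)" column, and the caller states explicitly where rvar is to be found.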
> There is some discussion of this in a note on "Nonstandard evaluation" on
> the developer.r-project.org webpage, including a function that will
> produce a single model frame from multiple formulas.
>
>
> Now, I think there are some exceptions to this recommendation, and I don't
> have a very clear definition of them. I think of them as "macro-like"
> functions that evaluate a supplied expression in some special context.
> Functions like this in base R include with() and capture.output(), and you
> will find some more nice simple examples in the mitools package. For these
> functions it really isn't ambiguous where the evaluation takes place. A
> related issue is functions such as the plot() methods that use the
> unevaluated forms of their arguments as labels. Again, the evaluation
> of the labels isn't ambiguous, because it doesn't even happen.
>
> With a few exceptions like these, though, I think it's a bad idea
> to subvert the pass-by-value illusion in R. This was a lot more than you
> probably wanted to know.
ingmar