[R] Caution on the use of model.matrix.
Rolf Turner
rolf at math.unb.ca
Thu Jun 2 21:14:34 CEST 2005
Brian Ripley wrote:
> <snip> But the real problem is more likely that Rolf has not passed
> model.matrix a model frame, so it calls model.frame() internally. The
> help page is a bit confused in that it says
>
> data: a data frame created with 'model.frame'.
>
> which the default for the argument is not. So a better solution would
> then be to call model.frame and pass a model frame to model.matrix.
>
> delete.response() might also be useful.
>
> The suggested warning only applies if `data' is not supplied.
I don't grok this. I ***did*** supply data (in the form
of a data frame, not a model frame). My call was of the form
X <- model.matrix(fmla,XXX)
where (originally) ``fmla'' was a formula with the structure
``y ~ x + w + z'', and XXX was a data frame with columns
``y'', ``x'', ``w'', and ``z''. (The response variable ``y''
had NAs in it, which caused the problem.) The data frame XXX was
``input data''; it was not created with model.frame, but it
was data nonetheless.
I replaced the foregoing call with
X <- model.matrix(fmla[-2],XXX)
(the ``-2'' causing the ``y'' part of the formula to
be discarded) and got the results I wanted.
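
For anyone wanting to see the effect, here is a minimal sketch. The data
values are made up for illustration (only the names fmla, XXX, y, x, w
and z come from the description above), and it assumes the default
setting options(na.action = "na.omit").

```r
# Hypothetical data: the response y has NAs, the predictors do not.
XXX <- data.frame(y = c(1.2, NA, 3.1, NA, 5.0),
                  x = c(1, 2, 3, 4, 5),
                  w = c(0.5, 0.1, 0.9, 0.3, 0.7),
                  z = c(3, 1, 4, 1, 5))
fmla <- y ~ x + w + z

# XXX is a plain data frame, so model.matrix() calls model.frame()
# internally; with na.action = na.omit the rows where y is NA are dropped.
X_full <- model.matrix(fmla, XXX)

# Dropping the second element of the formula removes the response,
# leaving ~ x + w + z, so NAs in y no longer affect the rows kept.
X_rhs <- model.matrix(fmla[-2], XXX)

nrow(X_full)  # rows with a non-missing y only
nrow(X_rhs)   # all rows of XXX
```

The columns of the two matrices are the same; only the set of rows
differs.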
There may be a better way of achieving my goal, but
I'm happy with my method --- unless someone points out
lurking hazards that have so far not been apparent to me.
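
One possible alternative, along the lines of the model.frame and
delete.response() suggestion quoted above, might look like the sketch
below. This is my reading of that suggestion, not something taken from
the quoted message verbatim; the data values are invented, and the
names tt and mf are purely illustrative.

```r
# Hypothetical data: the response y has NAs, the predictors do not.
XXX <- data.frame(y = c(1.2, NA, 3.1, NA, 5.0),
                  x = c(1, 2, 3, 4, 5),
                  w = c(0.5, 0.1, 0.9, 0.3, 0.7),
                  z = c(3, 1, 4, 1, 5))
fmla <- y ~ x + w + z

# Build the terms object and strip the response from it.
tt <- delete.response(terms(fmla))

# Construct the model frame explicitly from the response-free terms,
# then hand that model frame to model.matrix().
mf <- model.frame(tt, XXX)
X <- model.matrix(tt, mf)

nrow(X)  # all rows of XXX, since y is not part of the model frame
```

Note that rows with NAs in the predictors themselves would still be
subject to the na.action in force when model.frame() is called.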
I merely wanted to point out to others the somewhat
unintuitive behaviour of model.matrix.
cheers,
Rolf Turner
rolf at math.unb.ca