[Rd] model.matrix and na.action
Ben Bolker
bbolker at gmail.com
Wed Apr 29 21:07:45 CEST 2015
I've finally been able to piece this together, but I wonder whether
I've got it right, and whether the behaviour of `model.matrix` with
respect to `na.action` is documented more *explicitly* anywhere.
* model.matrix() respects the 'na.action' argument associated with
its data.
* If the 'data' argument is a model frame with an "na.action"
attribute, then that is used.
* If the 'data' argument is _not_ a model frame (which does go
against the implicit suggestion of ?model.matrix), model.frame() is
called on the data, which means that by default the global
options("na.action") setting is used.
* The intended design is that one should first construct the model
frame using an explicit `na.action` and then pass it to `model.matrix`.
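
The points above can be sketched as follows. This is only a minimal illustration under my reading of the behaviour, not a statement of the documented contract; the toy data frame `d` and the object names (`mf_pass`, `X_raw`, etc.) are made up for the example, and the global option is set explicitly so that the last case does not depend on the session's settings.

```r
## Hypothetical toy data: two missing values in x
d <- data.frame(x = c(NA, NA, 1:5))

## Build the model frame first, with an explicit na.action ...
mf_pass <- model.frame(~ x, data = d, na.action = na.pass)
mf_omit <- model.frame(~ x, data = d, na.action = na.omit)

## ... then pass the model frame (not the raw data) to model.matrix()
X_pass <- model.matrix(~ x, mf_pass)
X_omit <- model.matrix(~ x, mf_omit)

## Raw data frame: model.frame() is called internally and falls back
## on the global option, set here explicitly and restored afterwards
op <- options(na.action = "na.omit")
X_raw <- model.matrix(~ x, d)
options(op)

nrow(X_pass)  # 7: NA rows kept
nrow(X_omit)  # 5: NA rows dropped
nrow(X_raw)   # 5: NA rows dropped via options("na.action")
```

Going through model.frame() explicitly keeps the NA handling visible at the call site instead of leaving it to whatever options("na.action") happens to be.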
(After spending a few hours figuring this out and constructing the
e-mail, it has turned from a question into a request for confirmation
... I do think a couple of extra sentences of explication in the
documentation for dummies like me wouldn't hurt; I would be happy to
submit a documentation patch if that seems worthwhile.)
--------
I've tried looking through model.matrix.default and through the
modelmatrix function in src/library/stats/src/model.c , but it's
pretty hairy ...
Related discussion:
http://stackoverflow.com/questions/5616210/model-matrix-with-na-action-null
http://stackoverflow.com/questions/6447708/model-matrix-generates-fewer-rows-than-original-data-frame
https://stat.ethz.ch/pipermail/r-help/2008-December/183509.html
https://stat.ethz.ch/pipermail/r-help/2001-August/014483.html (BDR
says here "?model.matrix does tell you the second argument should be
the result of model.frame, which is a pretty strong hint." ...)
==========
mm <- function(newdata, form = ~x, na.action = na.pass, set.opts = FALSE) {
    if (set.opts) {
        op <- options(na.action = na.action)
        on.exit(options(op))
    }
    ## try with raw data and with model.frame with na.action specified
    X1 <- model.matrix(form, mfnew <- model.frame(newdata,
                                                  na.action = na.action))
    X2 <- model.matrix(form, newdata)
    return(c(any(is.na(X1[, "x"])), any(is.na(X2[, "x"]))))
}
options("na.action") ## na.omit
d <- data.frame(x=c(NA,NA,1:5))
mm(d) ## TRUE FALSE
mm(d,set.opts=TRUE) ## TRUE TRUE