[R-sig-ME] NAs in fixed effects
Ben Bolker
bbolker at gmail.com
Sun Aug 14 16:05:06 CEST 2011
On 11-01-05 08:18 AM, Henrik Thurfjell wrote:
> Dear listmembers
>
> I have a rather large dataset which needs one random effect (group)
> to be analysed properly. I also have many explanatory variables,
> each with a few NAs in different places. I can easily enough fit a
> model with 2 fixed effects, but as the number of fixed effects
> increases so does the number of NAs, as they are not in the same
> rows. I am no statistician, and this may be a naive question, but is
> there a way to fit each fixed effect with its full data? My first
> idea was to use lme, where na.pass works, but I found this comment
> by Bates:
>
> "I don't think you want to use na.pass here. The underlying C code
> for fitting lme or lmer models doesn't take kindly to finding NA's in
> the data."
>
> I couldn't make na.pass work in any other package dealing with mixed
> models.
>
> There are a few NAs scattered throughout the data, which severely
> reduces the usable data (although each single variable has only
> 0-10% NAs).
>
> This may illustrate what my problem is:
>
> x <- c(1:10)
> y <- c(1:10)
> pa <- c(NA, 2:10)
> pb <- c(1, NA, 3:10)
> pc <- c(1:2, NA, 4:10)
> pd <- c(1:3, NA, 5:10)
> pe <- c(1:4, NA, 6:10)
> pf <- c(1:5, NA, 7:10)
> group <- factor(rep(c("A", "B"), each = 5))
>
> Ignore that the data are not enough to analyse and that the random
> effect has only two levels; that is not the important bit. I want my
> model (x~y+pa+pb+pc+pd+pe+pf+(1|group)) to use 10 values for y and 9
> values for each of the p variables, not 4 values for all of them.
>
> Is that even possible?

This is a can of worms; I don't think there is a universally
accepted, simple solution.

I would recommend that you start by finding a copy of Frank Harrell's
book _Regression Modeling Strategies_ and reading the chapter on
imputation. As a very quick and dirty start, you might consider
replacing the NAs in your data with the means or medians of the other
values: there is an impute() function in the e1071 package that does
this (also see the mclust and Hmisc packages for more complex imputation
strategies).
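
As a rough sketch of the quick-and-dirty route, here is median imputation
on the toy data from the question, written in base R so it runs without
extra packages. The names dat, dat_imp and impute_median are made up for
this illustration; impute(x, what = "median") from e1071 should give the
same fill-in for a numeric matrix or data frame.

```r
# Toy data from the question: one NA per predictor, each in a different row
dat <- data.frame(
  x = 1:10, y = 1:10,
  pa = c(NA, 2:10), pb = c(1, NA, 3:10), pc = c(1:2, NA, 4:10),
  pd = c(1:3, NA, 5:10), pe = c(1:4, NA, 6:10), pf = c(1:5, NA, 7:10),
  group = factor(rep(c("A", "B"), each = 5))
)

# Rows that would survive listwise deletion (the default na.omit behaviour)
n_complete_before <- sum(complete.cases(dat))

# Hypothetical helper: replace each NA with the median of the observed values
impute_median <- function(v) {
  v[is.na(v)] <- median(v, na.rm = TRUE)
  v
}

# Impute only the numeric columns; the grouping factor is left untouched
num_cols <- sapply(dat, is.numeric)
dat_imp <- dat
dat_imp[num_cols] <- lapply(dat[num_cols], impute_median)

# Rows available for fitting after imputation
n_complete_after <- sum(complete.cases(dat_imp))
```

After this step all 10 rows are complete instead of 4, so the full model
can be fitted to dat_imp. Keep in mind that filling in a single value per
NA treats the imputed values as if they had been observed, which is one
reason to follow up with the imputation chapter mentioned above.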
Ben Bolker