[R-sig-ME] missing data in lme, lmer, PROC MIXED

Mon Jul 28 15:05:32 CEST 2008

Ken,

Does Mplus actually impute values for the missing cells in the model
matrix for the fixed effects? Is this a default behavior of Mplus, or
does one need to be cognizant of this and implement a particular
imputation strategy?

In general, this kind of question comes up all the time on the
multilevel listserv. There are constant suggestions that many of the
multilevel software packages automagically "handle" missing data because
they use "maximum likelihood". 
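
To make concrete what row-wise deletion does to long-format data (the
behavior attributed to na.omit in the quoted thread below), here is a
minimal sketch in plain Python rather than R. The data, the column names,
and the helper functions are all made up for illustration; this is not how
lme, lmer, or PROC MIXED are implemented.

```python
# Hypothetical long-format data: one row per (subject, visit).
# None marks a missing value; all names and numbers are invented.
rows = [
    {"subject": "s1", "visit": 1, "x": 0.5, "y": 10.1},
    {"subject": "s1", "visit": 2, "x": 0.7, "y": None},   # missing response
    {"subject": "s1", "visit": 3, "x": 0.9, "y": 11.4},
    {"subject": "s2", "visit": 1, "x": None, "y": 9.8},   # missing covariate
    {"subject": "s2", "visit": 2, "x": 1.1, "y": 10.0},
    {"subject": "s3", "visit": 1, "x": None, "y": None},  # nothing usable
]


def omit_incomplete(rows):
    """Keep only rows in which no field is missing (row-wise deletion)."""
    return [r for r in rows if all(v is not None for v in r.values())]


def rows_per_subject(rows):
    """Count how many rows each subject contributes."""
    counts = {}
    for r in rows:
        counts[r["subject"]] = counts.get(r["subject"], 0) + 1
    return counts


kept = omit_incomplete(rows)
before = rows_per_subject(rows)
after = rows_per_subject(kept)

print("rows per subject before deletion:", before)
print("rows per subject after deletion: ", after)
```

In this toy data, s1 and s2 each lose one row but still contribute their
remaining visits to the fit, while s3 disappears entirely. Nothing is
imputed: a row with a missing covariate is simply dropped.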

> -----Original Message-----
> From: r-sig-mixed-models-bounces at r-project.org 
> [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf 
> Of Ken Beath
> Sent: Monday, July 28, 2008 8:22 AM
> To: MHH Stevens
> Cc: R Mixed Models; Stevens,Martin Henry H. Dr.
> Subject: Re: [R-sig-ME] missing data in lme, lmer, PROC MIXED
> 
> On 28/07/2008, at 9:04 PM, M Henry H Stevens wrote:
> 
> > Thanks Ken. I have been assuming that they meant missing covariates (a
> > subject provided most of the predictors, but not all). So I take it
> > that SAS does no imputation on its own -- that the user would need to do
> > that (if they wanted?). lme does not do anything like that.
> >
> 
> Yes, neither SAS nor R nor most programs handle missing covariates
> automatically. The only program I know of is Mplus, which is a general
> latent variable modelling program. I turned off the missing data
> handling, as for one model it resulted in an 11-dimensional integration.
> 
> Ken
> 
> > Hank
> >
> > On Sat, 2008-07-26 at 22:39 -0400, Ken Beath wrote:
> >> On 26/07/2008, at 7:28 AM, M Henry H Stevens wrote:
> >>
> >>> Hi folks,
> >>> I have colleagues who comfortably state that "missing data" are ok in
> >>> "mixed models" - because "the program (PROC MIXED) handles missing
> >>> data" -- I have a hard time imagining what it does.
> >>>
> >>> To those of you who use both R and SAS, I was wondering if you might
> >>> share insight into what these do.
> >>>
> >>> As far as I know, for lme:
> >>> 'na.action="na.omit"' or na.exclude removes the rows with any
> >>> missing data.
> >>>
> >>
> >> This depends. If the missing data is the dependent variable and it is
> >> missing at random then, as mixed models are fitted using maximum
> >> likelihood, it will produce results that are optimal. Roughly (there are
> >> some really technical definitions for missing data and I haven't checked
> >> them) if we don't know the outcome and the reason it is missing isn't due
> >> to its value or the other data then we can simply leave it out of the
> >> likelihood equation, as it has no useful information. A problem is when
> >> the data being missing does provide this sort of information, which is
> >> very difficult to model. An example is if observations above a certain
> >> value are more likely to be missing.
> >>
> >> An alternative method of dealing with repeated data is to produce a
> >> summary for each subject or cluster, for example by averaging the last
> >> three visits. This doesn't correctly handle missing data, although the
> >> loss in efficiency is usually small and it can work well, provided
> >> only a small proportion is missing.
> >>
> >> What R and SAS don't deal with directly is missing data in the
> >> covariates. This takes a bit more work, for example using multiple
> >> imputation. Here the complete case method, where an observation with
> >> any missing data is removed, will result in a loss of efficiency
> >> compared to what can be achieved.
> >>
> >> Ken
> > -- 
> >
> > Dr. Hank Stevens, Associate Professor
> > 338 Pearson Hall
> > Botany Department
> > Miami University
> > Oxford, OH 45056
> >
> > Office: (513) 529-4206
> > Lab: (513) 529-4262
> > FAX: (513) 529-4243
> > http://www.cas.muohio.edu/~stevenmh/
> > http://www.cas.muohio.edu/ecology
> > http://www.muohio.edu/botany/
> >
> > "If the stars should appear one night in a thousand years, how would
> > men believe and adore." -Ralph Waldo Emerson, writer and philosopher
> > (1803-1882)