[R-sig-ME] missing data in lme, lmer, PROC MIXED
Ken Beath
kjbeath at kagi.com
Sun Jul 27 04:39:26 CEST 2008
On 26/07/2008, at 7:28 AM, M Henry H Stevens wrote:
> Hi folks,
> I have colleagues who comfortably state that "missing data" are ok in
> "mixed models" - because "the program (PROC MIXED) handles missing
> data" -- I have a hard time imagining what it does.
>
> To those of you who use both R and SAS, I was wondering if you might
> share insight into what these do.
>
> As far as I know, for lme:
> 'na.action="na.omit"' or na.exclude removes the rows with any
> missing data.
>
This depends. If the missing data are in the dependent variable and
are missing at random then, as mixed models are fitted using maximum
likelihood, the results will be optimal. Roughly (there are some
really technical definitions for missing data and I haven't checked
them) if we don't know the outcome and the reason it is missing isn't
due to its own value, given the data we did observe, then we can
simply leave it out of the likelihood as it has no useful
information. The problem comes when the fact that a value is missing
does provide this sort of information, which is very difficult to
model. An example is if observations above a certain value are more
likely to be missing.
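As a rough illustration of the missing-response case, here is a
minimal sketch in R. It assumes the Orthodont data that ships with
nlme, and the rows set to NA are chosen arbitrarily for the example
(they are not from any real missingness pattern). It shows only the
mechanics: na.omit drops the rows with a missing response, while each
subject's other visits stay in the likelihood.

```r
library(nlme)  # recommended package, ships with standard R installations

# Orthodont: 27 subjects with 4 visits each (108 rows), ordered by subject
d <- as.data.frame(Orthodont)

# Hypothetical missingness: blank the response on five arbitrarily chosen
# rows, each belonging to a different subject
miss <- c(2, 7, 12, 13, 50)
d$distance[miss] <- NA

# na.omit drops only the rows with a missing response; each subject's
# remaining visits still enter the likelihood
fit <- lme(distance ~ age, random = ~ 1 | Subject,
           data = d, na.action = na.omit)

n_rows_used <- fit$dims$N        # rows contributing to the fit
n_subjects  <- nrow(ranef(fit))  # subjects with an estimated random effect
fixef(fit)
```

Here 103 of the 108 rows are used and all 27 subjects are retained.
Whether the resulting estimates can be trusted still depends on the
missing-at-random assumption, which the code cannot check.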
An alternative method of dealing with repeated measures data is to
produce a summary for each subject or cluster, for example by
averaging the last three visits. This doesn't correctly handle missing
data, although the loss in efficiency is usually small and it can work
well provided only a small proportion is missing.
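The summary approach might look something like the sketch below. It
again assumes the Orthodont data from nlme with a few responses
blanked arbitrarily, and the names last3, summ and fit_summ are made
up for the example. Note that a subject with a missing visit is
averaged over fewer values, which is the sense in which missing data
are not handled correctly.

```r
library(nlme)  # used here only for the Orthodont example data

d <- as.data.frame(Orthodont)          # visits at ages 8, 10, 12 and 14
d$distance[c(2, 7, 12, 13, 50)] <- NA  # hypothetical missing responses

# Summary measure: mean of whatever is observed among the last three visits.
# The formula interface of aggregate() drops rows with NA by default, so a
# subject with a missing visit is silently averaged over fewer values.
last3 <- subset(d, age >= 10)
summ  <- aggregate(distance ~ Subject + Sex, data = last3, FUN = mean)

# One row per subject, so an ordinary linear model can be used
fit_summ <- lm(distance ~ Sex, data = summ)
coef(fit_summ)
```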
What R and SAS don't deal with directly is missing data in the
covariates. This takes a bit more work, for example using multiple
imputation. Here the complete case method, where an observation with
any missing data is removed, will result in a loss of efficiency
compared to methods that make use of the partially observed rows.
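To see what the complete case method throws away, here is a small
sketch, again assuming the Orthodont data from nlme and a made-up
pattern in which Sex is unrecorded for three subjects. It does not
attempt multiple imputation; it only counts what is discarded when a
covariate is missing.

```r
library(nlme)

d <- as.data.frame(Orthodont)
# Hypothetical missing covariate: Sex unrecorded for three subjects
d$Sex[d$Subject %in% c("M01", "F01", "F02")] <- NA

# Complete-case analysis: every row with a missing covariate is dropped,
# even though its response was observed
cc <- complete.cases(d[, c("distance", "age", "Sex")])
rows_lost     <- sum(!cc)
subjects_kept <- length(unique(d$Subject[cc]))

fit_cc <- lme(distance ~ age + Sex, random = ~ 1 | Subject,
              data = d, na.action = na.omit)
fit_cc$dims$N  # rows actually used; equals sum(cc)
```

In this made-up case 12 observed responses, and three whole subjects,
contribute nothing to the fit.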
Ken