[R-sig-ME] Best way to handle missing data?

landon hurley ljrhurley at gmail.com
Fri Feb 27 06:36:22 CET 2015

On 02/26/2015 09:30 PM, Bonnie Dixon wrote:
> Dear list;
> A member of my dissertation committee who uses SAS, recommended that I use
> full information maximum likelihood estimation (FIML) (described here:
> http://www.statisticalhorizons.com/wp-content/uploads/MissingDataByML.pdf),
> which is the easiest way to handle missing data in SAS.  Is there an
> equivalent procedure in R?

If you are interested in maximum likelihood methods, you can use either
ML or REML, specified with the method argument of the nlme command. ML is
already the default method for nlme(), so you shouldn't need to do
anything beyond specifying the model (note that lme() in the same package
defaults to REML, so there you would set method = "ML"). This is assuming
you are content with the multivariate normal assumption, and are not
trying to analyse discrete outcomes.

From your email, it seems that you are saying that the number of
observations/groups being reported is not the number you are expecting.
Is that correct?
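
For what it's worth, a quick way to check which estimation method was
used, and how many observations and groups actually entered the fit,
might look like the sketch below. It is only an illustration: it uses the
Orthodont data that ships with nlme rather than your data, the rows set
to NA are arbitrary, and the model formula is just a placeholder. Note
that lme() has na.action = na.fail as its default, so rows with missing
values either stop the fit or, with na.omit as here, are dropped.

```r
library(nlme)

# Orthodont ships with nlme (108 rows, 27 subjects). Blank out a few
# outcomes to mimic missing data; the rows chosen are arbitrary.
dat <- as.data.frame(Orthodont)
dat$distance[c(3, 10, 25)] <- NA

# lme() defaults to method = "REML" and na.action = na.fail,
# so both are set explicitly here.
fit <- lme(distance ~ age, random = ~ 1 | Subject, data = dat,
           method = "ML", na.action = na.omit)

fit$method       # estimation method actually used
fit$dims$N       # number of observations used in the fit
fit$dims$ngrps   # number of groups per grouping level
```

Comparing fit$dims$N with nrow(dat) shows how many rows were dropped
because of missing values.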

> I actually did try mice also (method "2l.norm"), but it seemed that Amelia
> was preferable for imputation.  Mice seems to only be able to impute one
> variable, whereas Amelia can impute as many variables as have missing data
> producing 100% complete data sets as output.

Mice will impute the entire dataset. Offhand, I believe the syntax
would look something like mice(data, m= , method= , maxit= ), where m is
the number of independent datasets being imputed (generally you want
25+), maxit is the number of iterations (at least 10), and method is a
character vector indicating how you want to impute each of the
variables, in the same order that they appear if you use the command
names(data). If you specified only 2l.norm, it should have attempted to
impute all the variables with missing values using that method, which
may not have worked. What mice does is impute each incomplete variable
in turn, using the other variables to predict its missing values. It
cycles through the variables maxit times (random draws using Gibbs
sampling) within each imputation, and the whole process is repeated m
times.
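
As a rough sketch of that call (not your model: it uses the small nhanes
data set bundled with mice, and "pmm" is just a stand-in method I picked
for illustration, not the two-level method you tried):

```r
library(mice)

# nhanes ships with mice: age is complete; bmi, hyp and chl have NAs.
# One method per column, in the order given by names(nhanes);
# "" means "do not impute this column".
meth <- c("", "pmm", "pmm", "pmm")

imp <- mice(nhanes, m = 25, method = meth, maxit = 10,
            seed = 1, printFlag = FALSE)

first <- complete(imp, 1)   # the first of the m completed data sets
anyNA(first)                # check that no missing values remain

# Fit the same model on every completed data set and pool the results.
fits <- with(imp, lm(chl ~ age + bmi))
summary(pool(fits))
```

Every variable with missing values gets imputed, so each of the m
completed data sets has no NAs left.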

Again, nlme uses maximum likelihood by default, so you shouldn't need
to change anything, as long as you are content with the MVN and missing
at random assumptions for your data.


-- 
Violence is the last refuge of the incompetent.


More information about the R-sig-mixed-models mailing list