[R-sig-ME] Best way to handle missing data?

Fri Feb 27 06:36:22 CET 2015

On 02/26/2015 09:30 PM, Bonnie Dixon wrote:
> Dear list;
> 
> A member of my dissertation committee who uses SAS, recommended that I use
> full information maximum likelihood estimation (FIML) (described here:
> http://www.statisticalhorizons.com/wp-content/uploads/MissingDataByML.pdf),
> which is the easiest way to handle missing data in SAS.  Is there an
> equivalent procedure in R?

If you want maximum likelihood estimation, the nlme package supports
both ML and REML, chosen with the method argument. The nlme() function
defaults to method = "ML"; lme() defaults to "REML", so pass
method = "ML" there. Beyond that, you shouldn't need to do anything
other than specify the model. From your email, it seems that you are
saying that the number of observations/groups reported is not the
number you were expecting. Is that correct? This is assuming you are
content with the multivariate normal assumption, and are not trying to
analyse discrete outcomes.
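
A minimal sketch of such a fit, to show where the method is set and
where the observation/group counts can be read off. It assumes the
Orthodont example data that ships with nlme; the ten missing outcome
values are introduced artificially, and the model formula is only an
illustration, not your model. Note that na.action defaults to na.fail,
so rows with missing values must be dropped explicitly.

```r
library(nlme)

# Orthodont ships with nlme: 108 distance measurements on 27 children.
dat <- as.data.frame(Orthodont)

# Hypothetical missingness: blank out 10 outcome values for illustration.
set.seed(1)
dat$distance[sample(nrow(dat), 10)] <- NA

# lme() defaults to method = "REML" and na.action = na.fail,
# so both are set explicitly here.
fit <- lme(distance ~ age + Sex, random = ~ 1 | Subject,
           data = dat, method = "ML", na.action = na.omit)

fit$method      # estimation method used for the fit
fit$dims$N      # number of observations actually used
fit$dims$ngrps  # number of groups at each grouping level
```

Comparing fit$dims$N and fit$dims$ngrps with the counts in the raw data
shows how many rows and groups were dropped because of missing values.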

> I actually did try mice also (method "2l.norm"), but it seemed that Amelia
> was preferable for imputation.  Mice seems to only be able to impute one
> variable, whereas Amelia can impute as many variables as have missing data
> producing 100% complete data sets as output.

Mice will impute the entire dataset. Offhand, I believe the syntax
would look something like mice(data, m= , method= , maxit= ), where m is
the number of independent datasets being imputed (generally you want
25+), maxit is the number of iterations (at least 10), and method is a
character vector saying how to impute each of the variables, in the
same order that they appear if you use the command names(data). If you
specified only 2l.norm, it should have attempted to impute all the
variables using that method, which may not have worked. What mice does
is impute each variable in turn, using the other variables to predict
the missing values. It cycles through the variables maxit times (random
draws using Gibbs sampling) within each imputation, and does this for
each of the m imputed datasets.
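
A short sketch of that call, assuming the mice package is installed and
using the small nhanes example data that ships with it. The choice of
"pmm" (predictive mean matching) for every incomplete variable is only
to keep the example simple; it is not the 2l.norm method from your
message, and the lm() model at the end is purely illustrative.

```r
library(mice)

# nhanes ships with mice: 25 rows; age is complete, the rest have NAs.
names(nhanes)   # order of the variables, which the method vector follows

imp <- mice(nhanes,
            m = 25,                               # number of imputed datasets
            method = c("", "pmm", "pmm", "pmm"),  # "" = nothing to impute (age)
            maxit = 10,                           # iterations per imputation
            seed = 123,
            printFlag = FALSE)

# Extract the first completed dataset; every variable is filled in.
completed1 <- complete(imp, 1)
anyNA(completed1)

# Fit the same model on each imputed dataset and pool the estimates.
fits <- with(imp, lm(chl ~ age + bmi))
summary(pool(fits))
```

All incomplete variables are imputed in a single call, so each
completed dataset has no missing values left.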

Again, nlme() uses maximum likelihood by default (lme() needs
method = "ML"), so you shouldn't need to change anything else, as long
as you are content with the MVN and missing at random assumptions for
your data.

landon

-- 
Violence is the last refuge of the incompetent.