[R-sig-ME] Best way to handle missing data?
bmdixon at ucdavis.edu
Mon Mar 2 20:37:49 CET 2015
Thanks for this suggestion, Malcolm. Here is an example in which I use
Amelia/Zelig with the "africa" data set that is available in Amelia.
I extracted the average standard deviation of the random effects from the
result produced by Zelig. (In this example, I am using the version of the
summary.MI function found here:
Perhaps this approach will work for my purposes.
# Get packages
library(Amelia)
library(Zelig)
library(ZeligMultilevel)
# Look at the data
data(africa)
summary(africa)
# Impute the missing data
africa.am <- amelia(x = africa,
                    m = 30,
                    cs = "country",
                    ts = "year",
                    logs = "gdp_pc")
# Create a model:
zelig(formula = gdp_pc ~ infl + tag(infl | country),
      data = africa.am$imputations,
      model = "ls.mixed")
# The combined fixed effects:
# The average standard deviation of the random intercepts and slopes:
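For comparison, here is a minimal sketch of the route Malcolm suggested: skip Zelig, fit the model to each imputed data set with lme4, and average the estimates. It is not the Zelig-based code from this post. The object names (fits, fixed.avg, re.avg), the seed, and p2s = 0 are assumptions added for illustration, and (infl | country) is assumed to be the lme4 counterpart of tag(infl | country).

```r
library(Amelia)  # amelia() and the africa data set
library(lme4)    # lmer(), fixef(), VarCorr()

data(africa)

# Impute as in the example above; p2s = 0 only silences progress output,
# and the seed is an arbitrary choice made here for reproducibility
set.seed(2015)
africa.am <- amelia(x = africa, m = 30, cs = "country", ts = "year",
                    logs = "gdp_pc", p2s = 0)

# Fit the same random-intercept, random-slope model to each imputed data set
fits <- lapply(africa.am$imputations, function(d) {
  lmer(gdp_pc ~ infl + (infl | country), data = d)
})

# Fixed effects: one column per imputation, averaged across imputations
fixed.avg <- rowMeans(sapply(fits, fixef))

# Random effects: as.data.frame(VarCorr()) gives one row per component.
# The sdcor column holds standard deviations, or a correlation on the
# row where var2 is not NA.
vc <- lapply(fits, function(f) as.data.frame(VarCorr(f)))
re.avg <- data.frame(vc[[1]][, c("grp", "var1", "var2")],
                     sdcor = rowMeans(sapply(vc, function(v) v$sdcor)))

fixed.avg
re.avg
```

This averages on the standard deviation scale, as in the post. Averaging the vcov column instead and taking square roots afterwards would give slightly different numbers, so it is worth stating which scale was used when reporting.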
On Fri, Feb 27, 2015 at 4:47 AM, Malcolm Fairbrother <
M.Fairbrother at bristol.ac.uk> wrote:
> Hi Bonnie,
> I have not seen a formal treatment of this issue, but from the Amelia
> documentation, my understanding is that if you want an estimate of the
> random effects variance, you can just take the average of the estimates
> from the model fitted to each imputed dataset. This is true for any
> parameter, from the sounds of what Honaker, King, and Blackwell have written:
> "you can combine directly and use as the multiple imputation estimate of
> this parameter, q̄, the average of the m separate estimates"
> Even if Zelig doesn't report the RE variance estimates automatically, they
> must be "in there" somewhere... I'm sure you can extract them. Or maybe
> skip Zelig, and just use Amelia, and extract the estimated RE variances
> from each fitted model (presumably using lme4)?
> Date: Thu, 26 Feb 2015 21:20:33 -0800
>> From: Bonnie Dixon <bmdixon at ucdavis.edu>
>> To: Mitchell Maltenfort <mmalten at gmail.com>
>> Cc: "r-sig-mixed-models at r-project.org"
>> <r-sig-mixed-models at r-project.org>
>> Subject: Re: [R-sig-ME] Best way to handle missing data?
>> I actually did try mice also (method "2l.norm"), but it seemed that Amelia
>> was preferable for imputation. Mice seems to only be able to impute one
>> variable, whereas Amelia can impute as many variables as have missing data,
>> producing 100% complete data sets as output.
>> However, most of the missing data in the data set I am working with is in
>> just one variable, so I could consider using mice, and just imputing the
>> variable that has the most missing data, while omitting observations that
>> have missing data in any of the other variables. But the pooled results
>> from mice only seem to include the fixed effects of the model, so this
>> still leaves me wondering how to report the random effects, which are very
>> important to my research question.
>> When using Amelia to impute, the packages Zelig and ZeligMultilevel can be
>> used to combine the results from each of the models. But again, only the
>> fixed effects seem to be included in the output, so I am not sure how to
>> report on the random effects.
>> On Thu, Feb 26, 2015 at 8:33 PM, Mitchell Maltenfort <mmalten at gmail.com>
>> wrote:
>> > Mice might be the package you need
>> > On Thursday, February 26, 2015, Bonnie Dixon <bmdixon at ucdavis.edu>
>> > wrote:
>> >> Dear list;
>> >> I am using nlme to create a repeated measures (i.e. 2 level) model. There
>> >> is missing data in several of the predictor variables. What is the best
>> >> way to handle this situation? The variable with (by far) the most missing
>> >> data is the best predictor in the model, so I would not want to remove it.
>> >> I am also trying to avoid omitting the observations with missing data,
>> >> because that would require omitting almost 40% of the observations and
>> >> would result in a substantial loss of power.
>> >> A member of my dissertation committee, who uses SAS, recommended that I use
>> >> full information maximum likelihood estimation (FIML) (described here:
>> >> ),
>> >> which is the easiest way to handle missing data in SAS. Is there an
>> >> equivalent procedure in R?
>> >> Alternatively, I have tried several approaches to multiple imputation. For
>> >> example, I used the package Amelia, which appears to handle the
>> >> structure of the data appropriately, to generate five imputed versions of
>> >> the data set, and then used lapply to run my model on each. But I am not
>> >> sure how to combine the resulting five models into one final result. I
>> >> will need a final result that enables me to report, not just the fixed
>> >> effects of the model, but also the random effects variance components and,
>> >> ideally, the distributions across the population of the random intercepts
>> >> and slopes, and correlations between them.
>> >> Many thanks for any suggestions on how to proceed.
>> >> Bonnie