[R-sig-ME] Using broom.mixed library with lme4

Wed Nov 11 07:52:19 CET 2020

If you're only interested in the pooled estimate of the random effect, not its standard error (or within-imputation variance) then you can just take its mean across the imputations.

Additionally, the miceadds package has a functionality for pooling both fixed and random effects from lme4. The summary gives standard errors for the random effects, but I don't know how they've been derived and (as Phillip said) I wouldn't recommend using or relying on them for inference.

A quick example is:

library(lme4)
library(miceadds)

data("sleepstudy")
sleepstudy$Reaction[sample(1:180,30)] = NA

imp <- mice(sleepstudy, m = 5, print=F)
fit <- with(imp, lmer(Reaction ~ Days + (Days | Subject)))

summary(lmer_pool(fit$analyses))

Dr Andrew Johnson
BPsych(Hons), MBiostat, PhD, GStat.
Research Associate | School of Psychology

Curtin University 
Email | andrew.johnson using curtin.edu.au 
Web | www.curtin.edu.au 

CRICOS Provider Code 00301J 

-----Original Message-----
From: R-sig-mixed-models <r-sig-mixed-models-bounces using r-project.org> On Behalf Of Simon Harmel
Sent: Wednesday, 11 November 2020 8:42 AM
To: Phillip Alday <phillip.alday using mpi.nl>
Cc: r-sig-mixed-models <r-sig-mixed-models using r-project.org>; Zach Simpson <zpsimpso using gmail.com>
Subject: Re: [R-sig-ME] Using broom.mixed library with lme4

Much appreciated Phillip!

On Tue, Nov 10, 2020 at 6:23 PM Phillip Alday <phillip.alday using mpi.nl> wrote:

> As I indicated in my initial response, I'm guessing no because mice 
> depends on the model fitting function returning standard errors for 
> imputed estimates and lme4 does not do that for random effects. (The 
> sampling distribution of the random effects is very skewed, so 
> standard errors really don't make much sense as a summary.)
>
> Although I really, really don't recommend it, there are R packages 
> implementing standard errors for variance components in lme4. (To show 
> my lack of recommendation, I'm not providing links.) With those, you 
> could then compute the pooled estimates / implement the necessary 
> functions to have mice do it for you.
>
> Phillip
>
> On 11/11/20 12:16 am, Simon Harmel wrote:
> > Sorry guys, I found what the problem was! I didn't use `summary()` 
> > after the pool() call! However, it there a way to find out how the 
> > estimates of random-effects can be pooled across the imputations?
> >
> > On Tue, Nov 10, 2020 at 5:01 PM Phillip Alday <phillip.alday using mpi.nl 
> > <mailto:phillip.alday using mpi.nl>> wrote:
> >
> >     I'm a bit confused now and think a point is being missed, 
> > although
> I'm
> >     not sure where.
> >
> >     Zach pointed out that the Rubin Rule is being applied. Both Zach 
> > and
> I
> >     pointed out that you still get the pooled estimate and that the 
> > ubar
> and
> >     b columns are uncertainty on the pooled estimate. In some quick tries
> >     with example datasets, it seems that you don't get the ubar and b
> >     columns for lm() fits, but I'm guessing that this means that it's not
> >     possible to break down the variance of t (total variance of estimate)
> >     into between and within estimates for non mixed models.
> >
> >     In other words, it seems that pool does more for lmer() than it 
> > does
> for
> >     lm(), not less.
> >
> >     Your original question asked about the random effects, and we pointed
> >     out that the the pool() table doesn't have the random effects 
> > from
> the
> >     lmer() fit.
> >
> >     This becomes even clearer if you call summary() on the value of
> pool():
> >
> >     > summary(pool(fit))
> >              term  estimate  std.error statistic        df      p.value
> >     1 (Intercept) 4.8932366 0.08966606  54.57178 521.81717 0.000000e+00
> >     2         sex 0.8772664 0.05428838  16.15938  11.17712 4.226618e-09
> >
> >     (and that use of summary() is from the documentation Zach linked).
> >
> >     Can you try to pose your question a different way if we're
> >     misunderstanding?
> >
> >     Best,
> >     Phillip
> >
> >     On 10/11/20 11:26 pm, Simon Harmel wrote:
> >     > Thanks.  I think a point is being missed. I did have carefully
> >     read the
> >     > documentation.  And pool() is in fact supposed to apply the 
> > RUBIN
> RULE
> >     > and pool across the estimates from 5 imputed datasets.
> >     >
> >     > For lm() objects, pool() does what I describe above. But it 
> > seems,
> it
> >     > doesn't do the same for the lmer() objects.
> >     >
> >     > That was really the core of my question and l was wondering if a
> >     > solution to it might be available?
> >     >
> >     >
> >     > There is no documentation on that.
> >     >
> >     > On Tue, Nov 10, 2020, 3:28 PM Zach Simpson <zpsimpso using gmail.com
> >     <mailto:zpsimpso using gmail.com>
> >     > <mailto:zpsimpso using gmail.com <mailto:zpsimpso using gmail.com>>> wrote:
> >     >
> >     >     Just to add to Phillip's answer, the ubar has to do with the
> >     multiple
> >     >     imputation procedure: ubar is the within-imputation variance
> >     of the
> >     >     sex term. You fitted 5 lmer models on 5 'completed' datasets
> whose
> >     >     NA's were filled using default imputation procedures from mice.
> >     >     mice::pool() combines these m=5 fits using Rubin's rules.
> There's
> >     >     uncertainty in model estimates due to the data itself as well
> >     as the
> >     >     imputation uncertainty.
> >     >
> >     >     Stef van Buuren has put online an enormous amount of
> >     documentation,
> >     >     which would pay dividends to read:
> >     >
> >     >     https://amices.org/mice/
> >     >
> >     >     HTH
> >     >     Zach
> >     >
> >     >     > Why do you think ubar is for the random effects? In my very
> >     quick skim
> >     >     > of the documentation, I didn't see anything indicating that.
> >     >     Looking at
> >     >     > the structures in `fit`, I see:
> >     >     >
> >     >     > > tidy(fit$analyses[[5]])
> >     >     > # A tibble: 4 x 6
> >     >     >   effect   group    term            estimate std.error
> statistic
> >     >     >   <chr>    <chr>    <chr>              <dbl>     <dbl>
>  <dbl>
> >     >     > 1 fixed    NA       (Intercept)        4.91     0.0855
> 57.4
> >     >     > 2 fixed    NA       sex                0.853    0.0350
> 24.4
> >     >     > 3 ran_pars school   sd__(Intercept)    0.820   NA
>  NA
> >     >     > 4 ran_pars Residual sd__Observation    0.767   NA
>  NA
> >     >     >
> >     >     > None of the random effects line up with the output of pool().
> >     >     Moreover,
> >     >     > the pool() documentation notes that it needs the standard
> >     error of
> >     >     each
> >     >     > estimate, but lme4 doesn't produce those (for good reason)
> >     for random
> >     >     > effects, so pool() won't produce pooled estimates for the
> random
> >     >     effects.
> >     >     >
> >     >     > The pool() documentation mentions the mipo class, so I
> >     looked at ?mipo
> >     >     > and found this:
> >     >     >
> >     >     >
> >     >     >        ‘estimate’  Pooled complete data estimate
> >     >     >        ‘ubar’      Within-imputation variance of ‘estimate’
> >     >     >        ‘b’         Between-imputation variance of ‘estimate’
> >     >     >        ‘t’         Total variance, of ‘estimate’
> >     >     >        ‘dfcom’     Degrees of freedom in complete data
> >     >     >        ‘df’        Degrees of freedom of $t$-statistic
> >     >     >        ‘riv’       Relative increase in variance
> >     >     >        ‘lambda’    Proportion attributable to the missingness
> >     >     >        ‘fmi’       Fraction of missing information
> >     >     >
> >     >     >
> >     >     > So `ubar` and `b` are perhaps random effects, but not in the
> >     sense
> >     >     > you're thinking of, but rather the random effects that go
> into
> >     >     > imputation procedures (this is a guess on my part). I don't
> >     know much
> >     >     > about imputation, but I suspect this is analogous to the
> >     parallels
> >     >     > between mixed models and meta-analysis
> >     >     >
> >     (http://www.metafor-project.org/doku.php/tips:rma_vs_lm_lme_lmer).
> But
> >     >     > again, this is rapidly getting out of my area of expertise
> and
> >     >     into the
> >     >     > expertise of other members of this list (e.g. Wolfgang
> >     Viechtbauer for
> >     >     > meta analysis).
> >     >     >
> >     >     > Phillip
> >     >     >
> >     >     > On 10/11/20 7:17 am, Simon Harmel wrote:
> >     >     > > Dear All,
> >     >     > >
> >     >     > > Belwo, I've used library `broom.mixed` and imputed some
> data
> >     >     with library
> >     >     > > `mice` to then fit a "random-intercept" `lmer()` model.
> >     >     > >
> >     >     > > BUT I wonder why after I `pool()` my analyses, there is an
> >     extra
> >     >     "ubar"
> >     >     > > (random-effect) for slope (`sex`) which is not even in the
> >     model?!
> >     >     > >
> >     >     > > library(mice)
> >     >     > > library(lme4)
> >     >     > > library(broom.mixed)
> >     >     > >
> >     >     > > imp <- mice(popmis, m = 5) # `popmis` is a dataset from
> `mice`
> >     >     > >
> >     >     > > fit <- with(data = imp, exp = lme4::lmer(popular ~ sex +
> >     >     (1|school)))
> >     >     > >
> >     >     > > pool(fit)
> >     >     > >
> >     >     > > ### `ubar` is the random effect for intercept
> >     (0.007524509) BUT
> >     >     WHY we see
> >     >     > > a ubar ALSO for `sex` (0.001177781)?
> >     >     > >
> >     >     > > Class: mipo    m = 5
> >     >     > >          term m  estimate        ubar            b
> >        t dfcom
> >     >     > >  df
> >     >     > > 1 (Intercept) 5 4.9007789 0.007524509 0.0004845564
> >     0.008105977  1996
> >     >     > > 547.44383
> >     >     > > 2         sex 5 0.8617941 0.001177781 0.0015867795
> >     0.003081916  1996
> >     >     > >  10.33653
> >     >     > >         riv     lambda       fmi
> >     >     > > 1 0.0772765 0.07173321 0.0751060
> >     >     > > 2 1.6167147 0.61784141 0.6751515
> >     >     > >
> >     >     > >       [[alternative HTML version deleted]]
> >     >     > >
> >     >     > > _______________________________________________
> >     >     > > R-sig-mixed-models using r-project.org
> >     <mailto:R-sig-mixed-models using r-project.org>
> >     >     <mailto:R-sig-mixed-models using r-project.org
> >     <mailto:R-sig-mixed-models using r-project.org>> mailing list
> >     >     > > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >     >     > >
> >     >
> >
>

	[[alternative HTML version deleted]]

_______________________________________________
R-sig-mixed-models using r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models