[R-sig-ME] Using broom.mixed library with lme4

Wed Nov 11 00:44:24 CET 2020

As I indicated in my initial response, I'm guessing no because mice
depends on the model fitting function returning standard errors for
imputed estimates and lme4 does not do that for random effects. (The
sampling distribution of the random effects is very skewed, so standard
errors really don't make much sense as a summary.)

Although I really, really don't recommend it, there are R packages
implementing standard errors for variance components in lme4. (To show
my lack of recommendation, I'm not providing links.) With those, you
could then compute the pooled estimates / implement the necessary
functions to have mice do it for you.

Phillip

On 11/11/20 12:16 am, Simon Harmel wrote:
> Sorry guys, I found what the problem was! I didn't use `summary()` after
> the pool() call! However, it there a way to find out how the estimates
> of random-effects can be pooled across the imputations?
> 
> On Tue, Nov 10, 2020 at 5:01 PM Phillip Alday <phillip.alday using mpi.nl
> <mailto:phillip.alday using mpi.nl>> wrote:
> 
>     I'm a bit confused now and think a point is being missed, although I'm
>     not sure where.
> 
>     Zach pointed out that the Rubin Rule is being applied. Both Zach and I
>     pointed out that you still get the pooled estimate and that the ubar and
>     b columns are uncertainty on the pooled estimate. In some quick tries
>     with example datasets, it seems that you don't get the ubar and b
>     columns for lm() fits, but I'm guessing that this means that it's not
>     possible to break down the variance of t (total variance of estimate)
>     into between and within estimates for non mixed models.
> 
>     In other words, it seems that pool does more for lmer() than it does for
>     lm(), not less.
> 
>     Your original question asked about the random effects, and we pointed
>     out that the the pool() table doesn't have the random effects from the
>     lmer() fit.
> 
>     This becomes even clearer if you call summary() on the value of pool():
> 
>     > summary(pool(fit))
>              term  estimate  std.error statistic        df      p.value
>     1 (Intercept) 4.8932366 0.08966606  54.57178 521.81717 0.000000e+00
>     2         sex 0.8772664 0.05428838  16.15938  11.17712 4.226618e-09
> 
>     (and that use of summary() is from the documentation Zach linked).
> 
>     Can you try to pose your question a different way if we're
>     misunderstanding?
> 
>     Best,
>     Phillip
> 
>     On 10/11/20 11:26 pm, Simon Harmel wrote:
>     > Thanks.  I think a point is being missed. I did have carefully
>     read the
>     > documentation.  And pool() is in fact supposed to apply the RUBIN RULE
>     > and pool across the estimates from 5 imputed datasets.
>     >
>     > For lm() objects, pool() does what I describe above. But it seems, it
>     > doesn't do the same for the lmer() objects. 
>     >
>     > That was really the core of my question and l was wondering if a
>     > solution to it might be available?
>     >
>     >
>     > There is no documentation on that.
>     >
>     > On Tue, Nov 10, 2020, 3:28 PM Zach Simpson <zpsimpso using gmail.com
>     <mailto:zpsimpso using gmail.com>
>     > <mailto:zpsimpso using gmail.com <mailto:zpsimpso using gmail.com>>> wrote:
>     >
>     >     Just to add to Phillip's answer, the ubar has to do with the
>     multiple
>     >     imputation procedure: ubar is the within-imputation variance
>     of the
>     >     sex term. You fitted 5 lmer models on 5 'completed' datasets whose
>     >     NA's were filled using default imputation procedures from mice.
>     >     mice::pool() combines these m=5 fits using Rubin's rules. There's
>     >     uncertainty in model estimates due to the data itself as well
>     as the
>     >     imputation uncertainty.
>     >
>     >     Stef van Buuren has put online an enormous amount of
>     documentation,
>     >     which would pay dividends to read:
>     >
>     >     https://amices.org/mice/
>     >
>     >     HTH
>     >     Zach
>     >
>     >     > Why do you think ubar is for the random effects? In my very
>     quick skim
>     >     > of the documentation, I didn't see anything indicating that.
>     >     Looking at
>     >     > the structures in `fit`, I see:
>     >     >
>     >     > > tidy(fit$analyses[[5]])
>     >     > # A tibble: 4 x 6
>     >     >   effect   group    term            estimate std.error statistic
>     >     >   <chr>    <chr>    <chr>              <dbl>     <dbl>     <dbl>
>     >     > 1 fixed    NA       (Intercept)        4.91     0.0855      57.4
>     >     > 2 fixed    NA       sex                0.853    0.0350      24.4
>     >     > 3 ran_pars school   sd__(Intercept)    0.820   NA           NA
>     >     > 4 ran_pars Residual sd__Observation    0.767   NA           NA
>     >     >
>     >     > None of the random effects line up with the output of pool().
>     >     Moreover,
>     >     > the pool() documentation notes that it needs the standard
>     error of
>     >     each
>     >     > estimate, but lme4 doesn't produce those (for good reason)
>     for random
>     >     > effects, so pool() won't produce pooled estimates for the random
>     >     effects.
>     >     >
>     >     > The pool() documentation mentions the mipo class, so I
>     looked at ?mipo
>     >     > and found this:
>     >     >
>     >     >
>     >     >        ‘estimate’  Pooled complete data estimate
>     >     >        ‘ubar’      Within-imputation variance of ‘estimate’
>     >     >        ‘b’         Between-imputation variance of ‘estimate’
>     >     >        ‘t’         Total variance, of ‘estimate’
>     >     >        ‘dfcom’     Degrees of freedom in complete data
>     >     >        ‘df’        Degrees of freedom of $t$-statistic
>     >     >        ‘riv’       Relative increase in variance
>     >     >        ‘lambda’    Proportion attributable to the missingness
>     >     >        ‘fmi’       Fraction of missing information
>     >     >
>     >     >
>     >     > So `ubar` and `b` are perhaps random effects, but not in the
>     sense
>     >     > you're thinking of, but rather the random effects that go into
>     >     > imputation procedures (this is a guess on my part). I don't
>     know much
>     >     > about imputation, but I suspect this is analogous to the
>     parallels
>     >     > between mixed models and meta-analysis
>     >     >
>     (http://www.metafor-project.org/doku.php/tips:rma_vs_lm_lme_lmer). But
>     >     > again, this is rapidly getting out of my area of expertise and
>     >     into the
>     >     > expertise of other members of this list (e.g. Wolfgang
>     Viechtbauer for
>     >     > meta analysis).
>     >     >
>     >     > Phillip
>     >     >
>     >     > On 10/11/20 7:17 am, Simon Harmel wrote:
>     >     > > Dear All,
>     >     > >
>     >     > > Belwo, I've used library `broom.mixed` and imputed some data
>     >     with library
>     >     > > `mice` to then fit a "random-intercept" `lmer()` model.
>     >     > >
>     >     > > BUT I wonder why after I `pool()` my analyses, there is an
>     extra
>     >     "ubar"
>     >     > > (random-effect) for slope (`sex`) which is not even in the
>     model?!
>     >     > >
>     >     > > library(mice)
>     >     > > library(lme4)
>     >     > > library(broom.mixed)
>     >     > >
>     >     > > imp <- mice(popmis, m = 5) # `popmis` is a dataset from `mice`
>     >     > >
>     >     > > fit <- with(data = imp, exp = lme4::lmer(popular ~ sex +
>     >     (1|school)))
>     >     > >
>     >     > > pool(fit)
>     >     > >
>     >     > > ### `ubar` is the random effect for intercept
>     (0.007524509) BUT
>     >     WHY we see
>     >     > > a ubar ALSO for `sex` (0.001177781)?
>     >     > >
>     >     > > Class: mipo    m = 5
>     >     > >          term m  estimate        ubar            b       
>        t dfcom
>     >     > >  df
>     >     > > 1 (Intercept) 5 4.9007789 0.007524509 0.0004845564
>     0.008105977  1996
>     >     > > 547.44383
>     >     > > 2         sex 5 0.8617941 0.001177781 0.0015867795
>     0.003081916  1996
>     >     > >  10.33653
>     >     > >         riv     lambda       fmi
>     >     > > 1 0.0772765 0.07173321 0.0751060
>     >     > > 2 1.6167147 0.61784141 0.6751515
>     >     > >
>     >     > >       [[alternative HTML version deleted]]
>     >     > >
>     >     > > _______________________________________________
>     >     > > R-sig-mixed-models using r-project.org
>     <mailto:R-sig-mixed-models using r-project.org>
>     >     <mailto:R-sig-mixed-models using r-project.org
>     <mailto:R-sig-mixed-models using r-project.org>> mailing list
>     >     > > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>     >     > >
>     >
>