[R-sig-ME] Perfectly correlated random effects (when they shouldn't be)

svm steven.v.miller at gmail.com
Sat Jul 18 22:29:53 CEST 2015


I genuinely had no idea about the brms package in R. Is it new? It looks
very intriguing, though the syntax I'm trying from the reference manual
doesn't appear to work for me. Do you mind if I see the code you used?
Maybe I'm doing it wrong.

I followed Ben's advice and pursued a blme strategy with a disaggregated
version of the region random effect. I ran the full models (i.e., not the
simple reproducible example) and got something that makes more sense. I
also corrected for various problematic data points (e.g., countries whose
surveys were likely botched in execution). Here is the summary of the
random effects.

Random effects:
 Groups       Name        Variance Std.Dev. Corr
 country:wave (Intercept) 0.16091  0.4011
 country      (Intercept) 0.34949  0.5912
 region       (Intercept) 0.04604  0.2146
              eti         0.24501  0.4950    0.20
              sti         0.18997  0.4359   -0.45 -0.04
Number of obs: 208551, groups:  country:wave, 143; country, 82; region, 13
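
A call along these lines would produce output like the above (a sketch
only: 'dat' stands in for the full survey data frame, 'eti' and 'sti' are
the disaggregated country-wave predictors shown above, and the other
controls are omitted):

library(blme)  # Bayesian wrapper around lme4 (Chung et al. 2013 priors)

M_full <- bglmer(y ~ x1 + eti + sti +
                   (1 | country) + (1 | country:wave) +
                   (1 + eti + sti | region),
                 data = dat, family = binomial(link = "logit"))
summary(M_full)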

I think this makes sense. The regions ultimately don't vary that much from
each other, which is intuitive given the data. Importantly, the
correlations among the random slopes are not 0, -1, or 1.

In a full model (with controls), the z value associated with x1 is 1.741
(coefficient: .516, SE: .296), which falls below the conventional cutoff of
z = 1.96. My field is something of a "cult of p < .05" crowd, but I'm not
terribly bothered by this. I do believe what I'm doing here is a very
conservative test of whether x1 is discernible from zero across the range
of the data.
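
For what it's worth, that z value and a two-sided p value follow directly
from the reported estimate and standard error:

est <- 0.516
se  <- 0.296
z   <- est / se            # ~1.74 (the summary reports 1.741 from unrounded inputs)
p   <- 2 * pnorm(-abs(z))  # ~0.08, two-sided
c(z = z, p = p)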

As for the comparison between x1 in fit2_brm versus the maximal model in
lme4: I honestly don't know. I learn something new each time I do a mixed
effects analysis, but I can offer no immediate answer for that.

I definitely do appreciate the input of this listserv. I'm more than
grateful for your time and patience with me.

Best regards,
Steve

On Sat, Jul 18, 2015 at 2:40 PM, Paul Buerkner <paul.buerkner at gmail.com>
wrote:

> I managed to fit the models with Bayesian techniques using the brms
> package.
>
> Because your data set is so large, I decided to run only 500 iterations,
> of which 100 are warmup (i.e., burn-in). As brms uses Stan on the back
> end, the quality of the samples is relatively high, so 400 post-warmup
> Markov chain samples should be enough for our purposes.
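>
> A sketch of the kind of call this was (exact argument names depend on the
> brms version -- older releases used n.chains/n.iter/n.warmup, current ones
> use chains/iter/warmup):
>
> library(brms)
> fit1_brm <- brm(y ~ x1 + x2 + (1 | country) + (1 | country:wave) +
>                   (1 + x1 + x2 | region),
>                 data = data, family = bernoulli(link = "logit"),
>                 chains = 1, iter = 500, warmup = 100)
> summary(fit1_brm)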
>
> Below I present the results of the maximal model as well as the model
> without any random effects for x1 and x2. The related plots are attached.
>
> First, the maximal model (fit1_brm):
>
> Family: bernoulli (logit)
> Formula: y ~ x1 + x2 + (1 | country) + (1 | country:wave) + (1 + x1 +
> x2 | region)
>    Data: data (Number of observations: 212570)
> Samples: 1 chains, each with n.iter = 500; n.warmup = 100; n.thin = 1;
>          total post-warmup samples = 400
>
> Random Effects:
> ~country (Number of levels: 82)
>               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
> sd(Intercept)     0.59      0.08     0.44     0.75        140    1
>
> ~country:wave (Number of levels: 143)
>               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
> sd(Intercept)     0.42      0.04     0.34     0.51        148    1
>
> ~region (Number of levels: 6)
>                   Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
> sd(Intercept)         0.24      0.20     0.01     0.77        300    1
> sd(x1)                0.41      0.38     0.02     1.44        398    1
> sd(x2)                0.57      0.45     0.03     1.75        289    1
> cor(Intercept,x1)     0.04      0.47    -0.84     0.86        400    1
> cor(Intercept,x2)    -0.20      0.46    -0.90     0.70        400    1
> cor(x1,x2)           -0.14      0.50    -0.92     0.87        400    1
>
> Fixed Effects:
>           Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
> Intercept    -0.61      0.15    -0.91    -0.36        400    1
> x1            0.43      0.39    -0.31     1.20        400    1
> x2            0.02      0.40    -0.69     0.81        344    1
>
> Samples were drawn using NUTS(diag_e). For each parameter, Eff.Sample is a
> crude measure of effective sample size, and Rhat is the potential scale
> reduction factor on split chains (at convergence, Rhat = 1).
>
>
>
>
> From what I see, the results differ from lme4 in four aspects.
>
> 1. Indeed, there does not seem to be enough evidence in the data to
> estimate the random-effects correlations reasonably, as the huge CIs
> imply. The estimates (posterior means in this case) are close to 0
> instead of -1 or 1, which seems more intuitive from my perspective.
> 2. While the random-effects standard deviations of country and
> country:wave are similar to lme4 (their posterior distributions are
> nearly symmetric; see plot_fit1_brm in the attachment), the sds of region
> are considerably larger than the ones estimated by lme4. This is because
> the mode is typically smaller than the mean when the distribution is
> right-skewed (see the toy illustration after this list). In particular,
> from my perspective, they do not seem so small that removing them from
> the model would be justified (I am currently running the same model
> without correlations, but the results shouldn't be too different).
> 3. The estimate of x1 is a bit smaller than in lme4 (0.43 against 0.49).
> Unfortunately, I do not have any immediate explanation for this...
> 4. The standard error of x1 has nearly doubled as compared to lme4 (0.39
> against 0.2). As a result, the CI is almost twice as large. Most likely,
> this is because the random-effects sd of x1 is much larger than in lme4,
> which also affects the standard error of the fixed effect.
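>
> A toy illustration of the mode-versus-mean point in 2. (simulated draws,
> not the actual posterior): for a right-skewed distribution the mode sits
> well below the mean, so a mode-based (ML/MAP) estimate of an sd looks
> smaller than the posterior mean that brms reports.
>
> set.seed(1)
> draws <- rlnorm(4000, meanlog = log(0.3), sdlog = 0.8)  # toy right-skewed "sd" posterior
> dens  <- density(draws)
> c(mean = mean(draws), mode = dens$x[which.max(dens$y)])  # mode well below mean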
>
> To test whether 4. is solely due to the random effects of x1, I
> completely removed them in the second model (fit2_brm):
>
> Family: bernoulli (logit)
> Formula: y ~ x1 + x2 + (1 | country) + (1 | country:wave) + (1 | region)
>    Data: data (Number of observations: 212570)
> Samples: 1 chains, each with n.iter = 500; n.warmup = 100; n.thin = 1;
>          total post-warmup samples = 400
>
> Random Effects:
> ~country (Number of levels: 82)
>               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
> sd(Intercept)     0.61      0.08     0.49     0.77         96 1.03
>
> ~country:wave (Number of levels: 143)
>               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
> sd(Intercept)      0.4      0.04     0.34     0.49        197 1.01
>
> ~region (Number of levels: 6)
>               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
> sd(Intercept)     0.27      0.19     0.03     0.72        278    1
>
> Fixed Effects:
>           Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
> Intercept    -0.61      0.15    -0.95    -0.31        400 1.01
> x1            0.42      0.25    -0.05     0.89        400 1.00
> x2           -0.04      0.22    -0.46     0.37        400 1.00
>
> Samples were drawn using NUTS(diag_e). For each parameter, Eff.Sample is a
> crude measure of effective sample size, and Rhat is the potential scale
> reduction factor on split chains (at convergence, Rhat = 1).
>
>
> The standard error of x1 indeed went down, ending up close to what lme4
> estimated (0.25 against 0.24). What I find suspicious is that the maximal
> model in lme4 (i.e., the one including random effects for x1) estimates a
> lower standard error for the fixed effect of x1 than the model without
> those random effects (see the results of Reinhold Kliegl). From my
> perspective, this is unlikely to be correct. What's your opinion on this?
>
>
> 2015-07-15 23:18 GMT+02:00 Ben Bolker <bbolker at gmail.com>:
>
> >
> > On 15-07-15 04:38 PM, Ben Bolker wrote:
> > > On 15-07-15 12:52 PM, Paul Buerkner wrote:
> > >> If you look at the results from a Bayesian perspective, it seems
> > >> to be a typical "problem" of ML procedures estimating the mode.
> > >>
> > >> The mode is nothing special, just the point where the density is
> > >> maximal. When you have a skewed distribution (as is usual for
> > >> correlations), the mode will often be close to the boundaries of
> > >> the parameter space (-1 or 1 in this case). The posterior
> > >> distribution of the correlation, however, can still be very wide,
> > >> ranging from a strong negative correlation to a strong positive
> > >> correlation, especially when the number of levels of a grouping
> > >> factor is not that large. In those cases, zero (i.e.,
> > >> insignificant) correlation is a very likely value even if the
> > >> mode itself is extreme.
> > >>
> > >> I tried fitting your models with Bayesian R packages (brms and
> > >> MCMCglmm). Unfortunately, because you have so many observations
> > >> and quite a few random effects, they run relatively slowly, so I am
> > >> still waiting for the results.
> > >
> >
> >  You can also use blme, which implements a very thin Bayesian wrapper
> > around [g]lmer and does maximum _a posteriori_ (i.e., Bayesian mode)
> > estimation with weak (but principled) priors on the random effects --
> > it's based on
> >
> > Chung, Yeojin, Sophia Rabe-Hesketh, Vincent Dorie, Andrew Gelman, and
> > Jingchen Liu. “A Nondegenerate Penalized Likelihood Estimator for
> > Variance Parameters in Multilevel Models.” Psychometrika 78, no. 4
> > (March 12, 2013): 685–709. doi:10.1007/s11336-013-9328-2.
> >
> >   Profile 'likelihood' confidence intervals based on blme will get you
> > a reasonable approximation of the width of the credible interval,
> > although it's a little bit of a cheesy/awkward combination between
> > marginal (proper Bayesian) and conditional (MAP/cheesy-Bayesian)
> > measures of uncertainty.
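> >
> >   A sketch of that route (bglmer from blme, then profile CIs via the
> > merMod confint method; the formula is the one from the reproducible
> > example further down, and the profiling can be slow on data this size):
> >
> > library(blme)
> > M1b <- bglmer(y ~ x1 + x2 + (1 | country) + (1 | country:wave) +
> >                 (1 + x1 + x2 | region),
> >               data = Data, family = binomial(link = "logit"))
> > confint(M1b, method = "profile")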
> >
> >
> >
> >
> >
> > >> 2015-07-15 3:45 GMT+02:00 svm <steven.v.miller at gmail.com>:
> > >>
> > >>> I considered that. I disaggregated the region random effect
> > >>> from 6 to 18 (the latter of which approximates the World Bank's
> > >>> region classification). I'm still encountering the same curious
> > >>> issue.
> > >>>
> > >>> Random effects:
> > >>>  Groups       Name        Variance  Std.Dev. Corr
> > >>>  country:wave (Intercept) 0.1530052 0.39116
> > >>>  country      (Intercept) 0.3735876 0.61122
> > >>>  wbregion     (Intercept) 0.0137822 0.11740
> > >>>               x1          0.0009384 0.03063  -1.00
> > >>>               x2          0.0767387 0.27702  -1.00  1.00
> > >>> Number of obs: 212570, groups: country:wave, 143; country, 82; wbregion, 18
> > >>>
> > >>> For what it's worth: the model estimates fine. The results are
> > >>> intuitive and theoretically consistent. They also don't change if I
> > >>> remove that region random effect. I'd like to keep the region random
> > >>> effect (with varying slopes) in the model. I'm struggling with what I
> > >>> should think about the perfect correlations.
> > >>>
> > >>> On Tue, Jul 14, 2015 at 9:07 PM, Jake Westfall
> > >>> <jake987722 at hotmail.com> wrote:
> > >>>
> > >>>> Hi Steve,
> > >>>>
> > >>>>
> > >>>> I think the issue is that estimating 3 variances and 3
> > >>>> covariances for regions is quite ambitious given that there
> > >>>> are only 6 regions. I think it's not surprising that the
> > >>>> model has a hard time getting good estimates of those
> > >>>> parameters.
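> > >>>>
> > >>>> (For reference, that is p = 3 terms varying by region, hence
> > >>>> p * (p + 1) / 2 = 6 variance-covariance parameters -- in R,
> > >>>> 3 + choose(3, 2) -- estimated from only 6 region-level units.)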
> > >>>>
> > >>>>
> > >>>> Jake
> > >>>>
> > >>>>> Date: Tue, 14 Jul 2015 20:53:01 -0400
> > >>>>> From: steven.v.miller at gmail.com
> > >>>>> To: r-sig-mixed-models at r-project.org
> > >>>>> Subject: [R-sig-ME] Perfectly correlated random effects (when they shouldn't be)
> > >>>>>
> > >>>>> Hi all,
> > >>>>>
> > >>>>> I'm a long-time reader and wanted to raise a question I've seen
> > >>>>> asked here before about correlated random effects. Past answers I
> > >>>>> have encountered on this listserv explain that perfectly correlated
> > >>>>> random effects suggest model overfitting and variances of random
> > >>>>> effects that are effectively zero, which can be omitted for a
> > >>>>> simpler model. In my case, I don't think that's what is happening
> > >>>>> here, though I could well be fitting a poor model in glmer.
> > >>>>>
> > >>>>> I'll describe the nature of the data first. I'm modeling
> > >>>>> individual-level survey data for countries across multiple waves
> > >>>>> and am estimating the region of the globe as a random effect as
> > >>>>> well. I have three random effects (country, country-wave, and
> > >>>>> region). In the region random effect, I am allowing
> > >>>>> country-wave-level predictors to have varying slopes. My inquiry is
> > >>>>> whether some country-wave-level contextual indicator can have an
> > >>>>> overall effect (as a fixed effect), the effect of which can vary by
> > >>>>> region. In other words: is the effect of some country-level
> > >>>>> indicator (e.g. unemployment) in a given year different for
> > >>>>> countries in Western Europe than for countries in Africa, even if,
> > >>>>> on average, there is a positive or negative association at the
> > >>>>> individual level? These country-wave-level predictors that I allow
> > >>>>> to vary by region are the ones reporting perfect correlation, and
> > >>>>> I'm unsure how to interpret that (or if I'm estimating the model
> > >>>>> correctly).
> > >>>>>
> > >>>>> I should also add that I have individual-level predictors
> > >>>>> as well as country-wave-level predictors, though it's the
> > >>>>> latter that concerns me. Further, every non-binary
> > >>>>> indicator in the model is standardized by two standard
> > >>>>> deviations.
> > >>>>>
> > >>>>> For those interested, I have a reproducible (if rather large)
> > >>>>> example below. Dropbox link to the data is here:
> > >>>>>
> > >>>>> https://www.dropbox.com/s/t29jfwm98tsdr71/correlated-random-effects.csv?dl=0
> > >>>>>
> > >>>>> In this reproducible example, y is the outcome variable and x1 and
> > >>>>> x2 are two country-wave-level predictors I allow to vary by region.
> > >>>>> Both x1 and x2 are interval-level predictors that I standardized to
> > >>>>> have a mean of zero and a standard deviation of .5 (per Gelman's
> > >>>>> (2008) recommendation).
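> > >>>>>
> > >>>>> (Concretely, that two-SD rescaling for a numeric predictor is a
> > >>>>> sketch along these lines, not necessarily the exact code used:
> > >>>>> x1_std <- (x1 - mean(x1, na.rm = TRUE)) / (2 * sd(x1, na.rm = TRUE))
> > >>>>> which leaves x1_std with mean 0 and standard deviation 0.5.)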
> > >>>>>
> > >>>>> I estimate the following model.
> > >>>>>
> > >>>>> summary(M1 <- glmer(y ~ x1 + x2 + (1 | country) + (1 | country:wave) +
> > >>>>>                       (1 + x1 + x2 | region),
> > >>>>>                     data = subset(Data), family = binomial(link = "logit")))
> > >>>>>
> > >>>>> The results are theoretically intuitive. I think they make sense.
> > >>>>> However, I get a report of perfect correlation for the varying
> > >>>>> slopes of the region random effect.
> > >>>>>
> > >>>>> Random effects:
> > >>>>>  Groups       Name        Variance Std.Dev. Corr
> > >>>>>  country:wave (Intercept) 0.15915  0.3989
> > >>>>>  country      (Intercept) 0.32945  0.5740
> > >>>>>  region       (Intercept) 0.01646  0.1283
> > >>>>>               x1          0.02366  0.1538   1.00
> > >>>>>               x2          0.13994  0.3741  -1.00 -1.00
> > >>>>> Number of obs: 212570, groups: country:wave, 143; country, 82; region, 6
> > >>>>>
> > >>>>> What should I make of this, and am I estimating this model wrong?
> > >>>>> For what it's worth, the dotplot of the region random effect (with
> > >>>>> conditional variance) makes sense and is theoretically intuitive,
> > >>>>> given my data. (http://i.imgur.com/mrnaJ77.png)
> > >>>>>
> > >>>>> Any help would be greatly appreciated.
> > >>>>>
> > >>>>> Best regards, Steve
> > >>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>>
> > >>>
> > >>> --
> > >>> Steven V. Miller
> > >>> Assistant Professor, Department of Political Science
> > >>> Clemson University
> > >>> http://svmiller.com
> > >>>
> > >>>
> > >>>
> > >>
> > >>
> > >>
> > >
> >
> >
> >
>
>



