[R-sig-ME] Perfectly correlated random effects (when they shouldn't be)

Thierry Onkelinx thierry.onkelinx at inbo.be
Sun Jul 19 14:52:01 CEST 2015


Dear all,

I tried to tackle this with the INLA package. See
http://rpubs.com/INBOstats/perfect_correlation for the results.
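A minimal sketch of what such a model looks like in INLA (the actual code is
in the rpubs document above; the formula below is my assumption based on the
models discussed in this thread, not a copy of it):

```r
library(INLA)
# country:wave grouping as a single factor
data$cw <- interaction(data$country, data$wave)
fit_inla <- inla(y ~ x1 + x2 +
                   f(country, model = "iid") +
                   f(cw, model = "iid") +
                   f(region, model = "iid"),
                 family = "binomial", data = data)
summary(fit_inla)
```

Note that correlated random slopes by region would need one of INLA's
multivariate latent models (e.g. "iid3d") rather than a plain "iid" term.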

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

2015-07-18 23:38 GMT+02:00 Paul Buerkner <paul.buerkner op gmail.com>:

> That looks better.
>
> I used the developmental version of brms from github available through
>
> library(devtools)
> install_github("paul-buerkner/brms")
>
> because the release version on CRAN does not yet support all relevant
> features (the country:wave syntax is not accepted by brms 0.3.0).
>
> The syntax I used was
>
> library(brms)
> fit1_brm <- brm(y ~ x1 + x2 + (1 | country) + (1 | country:wave) +
>                   (1 + x1 + x2 | region),
>                 data = data, family = "bernoulli",
>                 n.warmup = 100, n.iter = 500, n.chains = 1)
> summary(fit1_brm)
> plot(fit1_brm)
>
> fit2_brm <- brm(y ~ x1 + x2 + (1 | country) + (1 | country:wave) +
>                   (1 | region),
>                 data = data, family = "bernoulli",
>                 n.warmup = 100, n.iter = 500, n.chains = 1)
> summary(fit2_brm)
> plot(fit2_brm)
>
> Remember that you have to install Stan to make brms work. It will take
> several hours to fit those models, so be patient. :-)
>
> 2015-07-18 22:29 GMT+02:00 svm <steven.v.miller op gmail.com>:
>
> > Genuinely had no idea about that brms package in R. Is it new? It looks
> > very intriguing, though the syntax I'm seeing in the reference manual
> > doesn't appear to be working. Do you mind if I see the code? Maybe I'm
> > doing it wrong.
> >
> > I followed Ben's advice and pursued a blmer strategy with a disaggregated
> > version of the random effects variable. I ran the full models (so: not the
> > simple reproducible example) and got something that makes more sense. I
> > also corrected for various weird data points (i.e. countries whose surveys
> > were likely botched in execution). Here is the summary of the random
> > effects.
> >
> > Random effects:
> >  Groups       Name        Variance Std.Dev. Corr
> >  country:wave (Intercept) 0.16091  0.4011
> >  country      (Intercept) 0.34949  0.5912
> >  region       (Intercept) 0.04604  0.2146
> >               eti         0.24501  0.4950    0.20
> >               sti         0.18997  0.4359   -0.45 -0.04
> > Number of obs: 208551, groups:  country:wave, 143; country, 82; region, 13
> >
> > I think this makes sense. The regions ultimately don't vary that much from
> > each other, which I think is intuitive, given the data. The correlations
> > among the random slopes are, importantly, not 0, -1, or 1.
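> > For reference, the blmer strategy above amounts to something like the
> > following (a sketch only; the exact call, controls, and covariance priors
> > are not shown in this thread):
> >
> > ```r
> > library(blme)
> > # bglmer: glmer with weak priors on the random-effects covariances,
> > # which keeps the correlation estimates away from the +/-1 boundary
> > fit_blmer <- bglmer(y ~ eti + sti + (1 | country) + (1 | country:wave) +
> >                       (1 + eti + sti | region),
> >                     data = data, family = binomial(link = "logit"))
> > summary(fit_blmer)
> > ```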
> >
> > In a full model (with controls), the z-value associated with x1 is 1.741
> > (coeff: .516, se: .296), which is below the conventional cutoff of
> > z = 1.96. My field is kind of a "cult of p < .05" crowd, but I'm not
> > terribly bothered by this. I do believe what I'm doing here is a very
> > conservative test of whether x1 is discernible from zero across the range
> > of the data.
> >
> > As for the comparison between x1 in fit2_brm versus the maximal model in
> > lme4: I honestly don't know. I learn something new each time I do a mixed
> > effects analysis, but I can offer no immediate answer for that.
> >
> > I definitely do appreciate the input of this listserv. I'm more than
> > grateful for your time and patience with me.
> >
> > Best regards,
> > Steve
> >
> > On Sat, Jul 18, 2015 at 2:40 PM, Paul Buerkner <paul.buerkner op gmail.com>
> > wrote:
> >
> >> I managed to fit the models with bayesian techniques using the package
> >> brms.
> >>
> >> Because your data is so huge, I decided to run only 500 iterations, of
> >> which 100 are warmup (i.e. burnin). As brms uses Stan on the backend, the
> >> quality of the samples is relatively high, so 400 Markov samples should
> >> be enough for our purposes.
> >>
> >> Below I present the results of the maximal model as well as the model
> >> without any random effects for x1 and x2. The related plots are attached.
> >>
> >> First, the maximal model (fit1_brm):
> >>
> >> Family: bernoulli (logit)
> >> Formula: y ~ x1 + x2 + (1 | country) + (1 | country:wave) + (1 + x1 +
> >> x2 | region)
> >>    Data: data (Number of observations: 212570)
> >> Samples: 1 chains, each with n.iter = 500; n.warmup = 100; n.thin = 1;
> >>          total post-warmup samples = 400
> >>
> >> Random Effects:
> >> ~country (Number of levels: 82)
> >>               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
> >> sd(Intercept)     0.59      0.08     0.44     0.75        140    1
> >>
> >> ~country:wave (Number of levels: 143)
> >>               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
> >> sd(Intercept)     0.42      0.04     0.34     0.51        148    1
> >>
> >> ~region (Number of levels: 6)
> >>                   Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
> >> sd(Intercept)         0.24      0.20     0.01     0.77        300    1
> >> sd(x1)                0.41      0.38     0.02     1.44        398    1
> >> sd(x2)                0.57      0.45     0.03     1.75        289    1
> >> cor(Intercept,x1)     0.04      0.47    -0.84     0.86        400    1
> >> cor(Intercept,x2)    -0.20      0.46    -0.90     0.70        400    1
> >> cor(x1,x2)           -0.14      0.50    -0.92     0.87        400    1
> >>
> >> Fixed Effects:
> >>           Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
> >> Intercept    -0.61      0.15    -0.91    -0.36        400    1
> >> x1            0.43      0.39    -0.31     1.20        400    1
> >> x2            0.02      0.40    -0.69     0.81        344    1
> >>
> >> Samples were drawn using NUTS(diag_e). For each parameter, Eff.Sample is
> >> a crude measure of effective sample size, and Rhat is the potential scale
> >> reduction factor on split chains (at convergence, Rhat = 1).
> >>
> >>
> >> From what I see, the results differ from lme4 in four aspects.
> >>
> >> 1. Indeed, there does not seem to be enough evidence in the data to
> >> estimate the random effects correlations reasonably, as implied by the
> >> huge CIs. The estimates (means in this case) are close to 0 instead of
> >> -1 or 1, which seems more intuitive from my perspective.
> >> 2. While the random effects standard deviations of country and
> >> country:wave are similar to lme4 (their posterior distributions are
> >> nearly symmetric; see plot_fit1_brm in the attachment), the sds of region
> >> are considerably larger than the ones estimated by lme4. This is because
> >> the mode is always smaller than the mean when the distributions are right
> >> skewed. In particular, from my perspective, they do not seem to be so
> >> small that removing them from the model would be justified (I am
> >> currently running the same model without correlations, but the results
> >> shouldn't be too different).
> >> 3. The estimate of x1 is a bit smaller than in lme4 (0.43 against 0.49).
> >> Unfortunately, I do not have any immediate explanation for this...
> >> 4. The standard error of x1 has nearly doubled as compared to lme4 (0.39
> >> against 0.2). As a result, the CI is almost twice as large. Most likely,
> >> this is because the random effects sd of x1 is much larger than in lme4,
> >> which also affects the standard error of the fixed effect.
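> >> Point 2 is easy to see in miniature with base R: for a right-skewed
> >> distribution (like the posterior of a random-effects sd), the mode sits
> >> below the mean (toy numbers, not from these models):
> >>
> >> ```r
> >> set.seed(1)
> >> x <- rgamma(1e5, shape = 2, rate = 4)          # a right-skewed "posterior"
> >> d <- density(x)
> >> c(mode = d$x[which.max(d$y)], mean = mean(x))  # mode < mean
> >> ```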
> >>
> >> To test whether point 4 is solely due to the random effects of x1, I
> >> completely removed them in the second model (fit2_brm):
> >>
> >> Family: bernoulli (logit)
> >> Formula: y ~ x1 + x2 + (1 | country) + (1 | country:wave) + (1 | region)
> >>    Data: data (Number of observations: 212570)
> >> Samples: 1 chains, each with n.iter = 500; n.warmup = 100; n.thin = 1;
> >>          total post-warmup samples = 400
> >>
> >> Random Effects:
> >> ~country (Number of levels: 82)
> >>               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
> >> sd(Intercept)     0.61      0.08     0.49     0.77         96 1.03
> >>
> >> ~country:wave (Number of levels: 143)
> >>               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
> >> sd(Intercept)      0.4      0.04     0.34     0.49        197 1.01
> >>
> >> ~region (Number of levels: 6)
> >>               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
> >> sd(Intercept)     0.27      0.19     0.03     0.72        278    1
> >>
> >> Fixed Effects:
> >>           Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
> >> Intercept    -0.61      0.15    -0.95    -0.31        400 1.01
> >> x1            0.42      0.25    -0.05     0.89        400 1.00
> >> x2           -0.04      0.22    -0.46     0.37        400 1.00
> >>
> >> Samples were drawn using NUTS(diag_e). For each parameter, Eff.Sample is
> >> a crude measure of effective sample size, and Rhat is the potential scale
> >> reduction factor on split chains (at convergence, Rhat = 1).
> >>
> >>
> >> The standard error of x1 indeed went down, to close to what lme4
> >> estimated (0.25 against 0.24). What I find suspicious is that the maximal
> >> model in lme4 (i.e. the one including random effects for x1) estimates a
> >> lower standard error for the fixed effect of x1 than the model without
> >> those random effects (see the results of Reinhold Klingl). From my
> >> perspective, this is unlikely to be correct. What's your opinion on this?
> >>
> >>
> >> 2015-07-15 23:18 GMT+02:00 Ben Bolker <bbolker op gmail.com>:
> >>
> >> >
> >> > On 15-07-15 04:38 PM, Ben Bolker wrote:
> >> > > On 15-07-15 12:52 PM, Paul Buerkner wrote:
> >> > >> If you look at the results from a Bayesian perspective, it seems
> >> > >> to be a typical "problem" of ML procedures estimating the mode.
> >> > >>
> >> > >> The mode is nothing special, just the point where the density is
> >> > >> maximal. When you have a skewed distribution (as is usual for
> >> > >> correlations), the mode will often be close to the borders of the
> >> > >> region of definition (-1 or 1 in this case). The posterior
> >> > >> distribution of the correlation, however, can still be very wide,
> >> > >> ranging from strong negative correlation to strong positive
> >> > >> correlation, especially when the number of levels of a grouping
> >> > >> factor is not that large. In those cases, zero (i.e.
> >> > >> insignificant) correlation is a very likely value even if the
> >> > >> mode itself is extreme.
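> >> > >> As a toy illustration of how uncertain a correlation is with only
> >> > >> a handful of levels (six regions here, so this is only an analogy,
> >> > >> not your model):
> >> > >>
> >> > >> ```r
> >> > >> set.seed(42)
> >> > >> x <- rnorm(6); y <- x + rnorm(6)  # 6 "groups"
> >> > >> cor.test(x, y)$conf.int  # interval spans most of [-1, 1]
> >> > >> ```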
> >> > >>
> >> > >> I tried fitting your models with Bayesian R packages (brms and
> >> > >> MCMCglmm). Unfortunately, because you have so many observations
> >> > >> and quite a few random effects, they run relatively slowly, so I
> >> > >> am still waiting for the results.
> >> > >
> >> >
> >> >  You can also use blme, which implements a very thin Bayesian wrapper
> >> > around [g]lmer and does maximum _a posteriori_ (i.e. Bayesian mode)
> >> > estimation with weak (but principled) priors on the random effects --
> >> > it's based on
> >> >
> >> > Chung, Yeojin, Sophia Rabe-Hesketh, Vincent Dorie, Andrew Gelman, and
> >> > Jingchen Liu. “A Nondegenerate Penalized Likelihood Estimator for
> >> > Variance Parameters in Multilevel Models.” Psychometrika 78, no. 4
> >> > (March 12, 2013): 685–709. doi:10.1007/s11336-013-9328-2.
> >> >
> >> >   Profile 'likelihood' confidence intervals based on blme will get you
> >> > a reasonable approximation of the width of the credible interval,
> >> > although it's a little bit of a cheesy/awkward combination between
> >> > marginal (proper Bayesian) and conditional (MAP/cheesy-Bayesian)
> >> > measures of uncertainty.
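> >> > A sketch of that workflow (formula taken from earlier in the thread;
> >> > profiling will be slow on a data set this size):
> >> >
> >> > ```r
> >> > library(blme)
> >> > # MAP fit with blme's default weak covariance priors
> >> > fit_blme <- bglmer(y ~ x1 + x2 + (1 | country) + (1 | country:wave) +
> >> >                      (1 + x1 + x2 | region),
> >> >                    data = Data, family = binomial(link = "logit"))
> >> > # profile CIs as a rough stand-in for credible-interval width
> >> > confint(fit_blme, method = "profile")
> >> > ```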
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > >> 2015-07-15 3:45 GMT+02:00 svm <steven.v.miller op gmail.com>:
> >> > >>
> >> > >>> I considered that. I disaggregated the region random effect
> >> > >>> from 6 to 18 (the latter of which approximates the World Bank's
> >> > >>> region classification). I'm still encountering the same curious
> >> > >>> issue.
> >> > >>>
> >> > >>> Random effects:
> >> > >>>  Groups       Name        Variance  Std.Dev. Corr
> >> > >>>  country:wave (Intercept) 0.1530052 0.39116
> >> > >>>  country      (Intercept) 0.3735876 0.61122
> >> > >>>  wbregion     (Intercept) 0.0137822 0.11740
> >> > >>>               x1          0.0009384 0.03063  -1.00
> >> > >>>               x2          0.0767387 0.27702  -1.00  1.00
> >> > >>> Number of obs: 212570, groups: country:wave, 143; country, 82;
> >> > >>> wbregion, 18
> >> > >>>
> >> > >>> For what it's worth: the model estimates fine. The results are
> >> > >>> intuitive and theoretically consistent. They also don't change
> >> > >>> if I were to remove that region random effect. I'd like to keep
> >> > >>> the region random effect (with varying slopes) in the model.
> >> > >>> I'm struggling with what I should think about the perfect
> >> > >>> correlations.
> >> > >>>
> >> > >>> On Tue, Jul 14, 2015 at 9:07 PM, Jake Westfall
> >> > >>> <jake987722 op hotmail.com> wrote:
> >> > >>>
> >> > >>>> Hi Steve,
> >> > >>>>
> >> > >>>>
> >> > >>>> I think the issue is that estimating 3 variances and 3
> >> > >>>> covariances for regions is quite ambitious given that there
> >> > >>>> are only 6 regions. I think it's not surprising that the
> >> > >>>> model has a hard time getting good estimates of those
> >> > >>>> parameters.
> >> > >>>>
> >> > >>>>
> >> > >>>> Jake
> >> > >>>>
> >> > >>>>> Date: Tue, 14 Jul 2015 20:53:01 -0400
> >> > >>>>> From: steven.v.miller op gmail.com
> >> > >>>>> To: r-sig-mixed-models op r-project.org
> >> > >>>>> Subject: [R-sig-ME] Perfectly correlated random effects (when
> >> > >>>>> they shouldn't be)
> >> > >>>>
> >> > >>>>>
> >> > >>>>> Hi all,
> >> > >>>>>
> >> > >>>>> I'm a long-time reader and wanted to raise a question I've seen
> >> > >>>>> asked here before about correlated random effects. Past answers
> >> > >>>>> I have encountered on this listserv explain that perfectly
> >> > >>>>> correlated random effects suggest model overfitting and
> >> > >>>>> variances of random effects that are effectively zero and can be
> >> > >>>>> omitted for a simpler model. In my case, I don't think that's
> >> > >>>>> what is happening here, though I could well be fitting a poor
> >> > >>>>> model in glmer.
> >> > >>>>>
> >> > >>>>> I'll describe the nature of the data first. I'm modeling
> >> > >>>>> individual-level survey data for countries across multiple
> >> > >>>>> waves and am estimating the region of the globe as a random
> >> > >>>>> effect as well. I have three random effects (country,
> >> > >>>>> country-wave, and region). In the region random effect, I am
> >> > >>>>> allowing country-wave-level predictors to have varying slopes.
> >> > >>>>> My inquiry is whether some country-wave-level contextual
> >> > >>>>> indicator can have an overall effect (as a fixed effect), the
> >> > >>>>> effect of which can vary by region. In other words: is the
> >> > >>>>> effect of some country-level indicator (e.g. unemployment) in a
> >> > >>>>> given year different for countries in Western Europe than for
> >> > >>>>> countries in Africa even if, on average, there is a positive or
> >> > >>>>> negative association at the individual level? These
> >> > >>>>> country-wave-level predictors that I allow to vary by region
> >> > >>>>> are the ones reporting perfect correlation and I'm unsure how
> >> > >>>>> to interpret that (or if I'm estimating the model correctly).
> >> > >>>>>
> >> > >>>>> I should also add that I have individual-level predictors as
> >> > >>>>> well as country-wave-level predictors, though it's the latter
> >> > >>>>> that concerns me. Further, every non-binary indicator in the
> >> > >>>>> model is standardized by two standard deviations.
> >> > >>>>>
> >> > >>>>> For those interested, I have a reproducible (if rather large)
> >> > >>>>> example below. Dropbox link to the data is here:
> >> > >>>>>
> >> > >>>>> https://www.dropbox.com/s/t29jfwm98tsdr71/correlated-random-effects.csv?dl=0
> >> > >>>>>
> >> > >>>>> In this reproducible example, y is the outcome variable and x1
> >> > >>>>> and x2 are two country-wave-level predictors I allow to vary by
> >> > >>>>> region. Both x1 and x2 are interval-level predictors that I
> >> > >>>>> standardized to have a mean of zero and a standard deviation of
> >> > >>>>> .5 (per Gelman's (2008) recommendation).
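> >> > >>>>> The rescaling itself is one line (for numeric predictors,
> >> > >>>>> arm::rescale() does essentially the same thing; the helper name
> >> > >>>>> here is just for illustration):
> >> > >>>>>
> >> > >>>>> ```r
> >> > >>>>> # center, then divide by two standard deviations (Gelman 2008)
> >> > >>>>> rescale2sd <- function(x)
> >> > >>>>>   (x - mean(x, na.rm = TRUE)) / (2 * sd(x, na.rm = TRUE))
> >> > >>>>> Data$x1 <- rescale2sd(Data$x1)
> >> > >>>>> ```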
> >> > >>>>>
> >> > >>>>> I estimate the following model.
> >> > >>>>>
> >> > >>>>> summary(M1 <- glmer(y ~ x1 + x2 + (1 | country) +
> >> > >>>>>   (1 | country:wave) + (1 + x1 + x2 | region),
> >> > >>>>>   data = subset(Data), family = binomial(link = "logit")))
> >> > >>>>>
> >> > >>>>> The results are theoretically intuitive. I think they make
> >> > >>>>> sense. However, I get a report of perfect correlation for the
> >> > >>>>> varying slopes of the region random effect.
> >> > >>>>>
> >> > >>>>> Random effects:
> >> > >>>>>  Groups       Name        Variance Std.Dev. Corr
> >> > >>>>>  country:wave (Intercept) 0.15915  0.3989
> >> > >>>>>  country      (Intercept) 0.32945  0.5740
> >> > >>>>>  region       (Intercept) 0.01646  0.1283
> >> > >>>>>               x1          0.02366  0.1538   1.00
> >> > >>>>>               x2          0.13994  0.3741  -1.00 -1.00
> >> > >>>>> Number of obs: 212570, groups: country:wave, 143; country, 82;
> >> > >>>>> region, 6
> >> > >>>>>
> >> > >>>>> What should I make of this, and am I estimating this model
> >> > >>>>> wrong? For what it's worth, the dotplot of the region random
> >> > >>>>> effect (with conditional variance) makes sense and is
> >> > >>>>> theoretically intuitive, given my data.
> >> > >>>>> (http://i.imgur.com/mrnaJ77.png)
> >> > >>>>>
> >> > >>>>> Any help would be greatly appreciated.
> >> > >>>>>
> >> > >>>>> Best regards, Steve
> >> > >>>>>
> >> > >>>>> [[alternative HTML version deleted]]
> >> > >>>>>
> >> > >>>>> _______________________________________________
> >> > >>>>> R-sig-mixed-models op r-project.org mailing list
> >> > >>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >> > >>>>
> >> > >>>
> >> > >>>
> >> > >>>
> >> > >>> -- Steven V. Miller Assistant Professor Department of Political
> >> > >>> Science Clemson University http://svmiller.com
> >> > >>>
> >> > >>
> >> > >
> >> >
> >>
> >
>



