[R-sig-ME] Perfectly correlated random effects (when they shouldn't be)

Reinhold Kliegl reinhold.kliegl at gmail.com
Wed Jul 15 13:25:44 CEST 2015


Actually, I overlooked that the region-related variance component of x2 can
also be removed from the GLMM without loss of goodness of fit. Thus, I end up
with a varying-intercepts-only GLMM (no varying slopes) as the most
parsimonious model for this data set. Here is the anova() output:

> anova(zcpM3, zcpM2, zcpM1, M1)
Data: data
Models:
zcpM3: y ~ x1 + x2 + (1 | country) + (1 | country:wave) + (1 | region)
zcpM2: y ~ x1 + x2 + (1 | country) + (1 | country:wave) + (1 + x2 || region)
zcpM1: y ~ x1 + x2 + (1 | country) + (1 | country:wave) + (1 + x1 + x2 || region)
M1:    y ~ x1 + x2 + (1 | country) + (1 | country:wave) + (1 + x1 + x2 | region)
      Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)
zcpM3  6 258193 258255 -129091   258181                            (intercept-only, parsimonious GLMM)
zcpM2  7 258195 258267 -129091   258181 0.1882      1     0.6644
zcpM1  8 258197 258279 -129091   258181 0.0000      1     1.0000
M1    11 258201 258314 -129090   258179 1.8885      3     0.5959   (overparameterized maximal GLMM)
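
zcpM3 does not appear in the script further below; given its formula in the
anova() output above, it would presumably be fit along these lines, mirroring
the other glmer() calls in the script:

zcpM3 <- glmer(y ~ x1 + x2 + (1 | country) + (1 | country:wave) + (1 | region),
               data=data, family=binomial(link="logit"))
anova(zcpM3, zcpM2, zcpM1, M1)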

There is another interesting aspect to this reduction of model complexity:
the parsimonious GLMM yields the SMALLEST z-value for the significant fixed
effect of x1. It is thus a counterexample to the claim that reduced
(non-maximal) models are always anti-conservative. Whether that happens may
depend on whether the maximal model is overparameterized. Here are the
corresponding fixed-effect estimates for x1 and x2:

Fixed effects for x1 across GLMMs:
                    Estimate Std. Error z value Pr(>|z|)
zcpM3: x1           0.51004    0.23859    2.138  0.0325   *    (intercept-only, parsimonious GLMM)
zcpM2: x1           0.515822   0.167005   3.089  0.00201  **
zcpM1: x1           0.516168   0.149704   3.448  0.000565 ***
M1:    x1           0.48913    0.19972    2.449  0.0143   *    (overparameterized maximal GLMM)

Fixed effects for x2 across GLMMs:
                    Estimate Std. Error z value Pr(>|z|)
zcpM3: x2          -0.02743    0.15881   -0.173  0.8629        (intercept-only, parsimonious GLMM)
zcpM2: x2          -0.008715   0.151586  -0.057  0.95416
zcpM1: x2          -0.008701   0.155332  -0.056  0.955331
M1:    x2           0.04422    0.17376    0.254  0.7991        (overparameterized maximal GLMM)
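
These rows were pulled from the individual model summaries; a small helper to
collect them in one place (a sketch, assuming the fitted objects from the
script below plus zcpM3 as above):

mods <- list(zcpM3=zcpM3, zcpM2=zcpM2, zcpM1=zcpM1, M1=M1)
t(sapply(mods, function(m) summary(m)$coefficients["x1", ]))  # x1 rows
t(sapply(mods, function(m) summary(m)$coefficients["x2", ]))  # x2 rows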

On Wed, Jul 15, 2015 at 10:57 AM, Reinhold Kliegl <reinhold.kliegl at gmail.com> wrote:

> It looks like your model is overparameterized. Specifically, although you
> have a very large number of observations (N = 212570), the data do not
> support the six model parameters you are trying to estimate for the
> random-effects term of region. You can see this quickly with the rePCA()
> function. There was also a convergence warning for me, but perhaps that was
> because I worked with the full data set (as provided in the Dropbox file),
> whereas you used a subset.
>
> As a first step, I forced the correlation parameters to zero with the
> ||-syntax (zcpM1 below). A LRT suggests that dropping the three correlation
> parameters does not decrease the goodness of fit: Chi-sq(3) = 1.9, p = .5959.
> This could very well be a consequence of the small number of regions, as
> Jake pointed out; even 18 regions may not be enough for reliable estimation
> of three correlation parameters.
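>
> As an aside: for numeric covariates such as x1 and x2, the ||-notation is
> just shorthand for independent random-effects terms, so zcpM1 could
> equivalently be written out explicitly (a sketch, for illustration only):
>
> zcpM1b <- glmer(y ~ x1 + x2 + (1 | country) + (1 | country:wave) +
>                   (1 | region) + (0 + x1 | region) + (0 + x2 | region),
>                 data=data, family=binomial(link="logit"))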
>
> Moreover, rePCA() suggests that even the three remaining model parameters
> for the random-effects term of region are only marginally supported by the
> data. In the zero-correlation-parameter GLMM, the variance component for x1
> was estimated to be suspiciously small. Therefore, in a second step, I
> removed the region-related x1 variance component. Again, according to a LRT
> there was no significant loss in goodness of fit: Chi-sq(1) = 0.
>
> The R script is inserted below. Please note that taking a correlation
> parameter or a variance component out of a model does not mean that it is
> zero; it means that the data do not support a model assuming that it is
> different from zero. zcpM2 (below) looks like an adequate parsimonious GLMM
> for these data.
>
>
> ###
>
> # Note: the original post used a subset of the data, but it is not clear
> # how the subset was selected. I used all the data.
>
> library(lme4)
> library(RePsychLing)   # for rePCA()
>
> data <- read.csv("correlated-random-effects.csv", header=TRUE)
> str(data)
> data$country <- factor(data$country)
> data$region  <- factor(data$region)
>
> # maximal GLMM: no convergence problem
> print(summary(M1 <- glmer(y ~ x1 + x2 + (1 | country) + (1 | country:wave) +
>                             (1 + x1 + x2 | region),
>                           data=data, family=binomial(link="logit"))))
> summary(rePCA(M1))
>
> # zero-correlation-parameter GLMM: convergence warning
> print(summary(zcpM1 <- glmer(y ~ x1 + x2 + (1 | country) + (1 | country:wave) +
>                                (1 + x1 + x2 || region),
>                              data=data, family=binomial(link="logit"))))
> summary(rePCA(zcpM1))
>
> # drop the region-related x1 variance component: ok
> print(summary(zcpM2 <- glmer(y ~ x1 + x2 + (1 | country) + (1 | country:wave) +
>                                (1 + x2 || region),
>                              data=data, family=binomial(link="logit"))))
> summary(rePCA(zcpM2))
>
> anova(zcpM2, zcpM1, M1)
>
> ####
>
>
>
>
> On Wed, Jul 15, 2015 at 3:45 AM, svm <steven.v.miller at gmail.com> wrote:
>
>> I considered that. I disaggregated the region random effect from 6 to 18
>> (the latter of which approximates the World Bank's region classification).
>> I'm still encountering the same curious issue.
>>
>> Random effects:
>>  Groups       Name        Variance  Std.Dev. Corr
>>  country:wave (Intercept) 0.1530052 0.39116
>>  country      (Intercept) 0.3735876 0.61122
>>  wbregion     (Intercept) 0.0137822 0.11740
>>               x1          0.0009384 0.03063  -1.00
>>               x2          0.0767387 0.27702  -1.00  1.00
>> Number of obs: 212570, groups: country:wave, 143; country, 82; wbregion, 18
>>
>> For what it's worth, the model estimates fine. The results are intuitive
>> and theoretically consistent, and they don't change if I remove the region
>> random effect. Still, I'd like to keep the region random effect (with
>> varying slopes) in the model, and I'm struggling with what to make of the
>> perfect correlations.
>>
>> On Tue, Jul 14, 2015 at 9:07 PM, Jake Westfall <jake987722 at hotmail.com> wrote:
>>
>> > Hi Steve,
>> >
>> >
>> > I think the issue is that estimating 3 variances and 3 covariances for
>> > regions is quite ambitious given that there are only 6 regions. I think
>> > it's not surprising that the model has a hard time getting good estimates
>> > of those parameters.
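>> >
>> > In other words, a three-dimensional random-effects term (intercept, x1,
>> > x2) implies k*(k+1)/2 = 6 variance-covariance parameters, estimated here
>> > from only 6 regions. A quick sketch of the count:
>> >
>> > k <- 3            # intercept plus slopes for x1 and x2
>> > k * (k + 1) / 2   # 6 (co)variance parameters for the region term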
>> >
>> >
>> > Jake
>> >
>> > > Date: Tue, 14 Jul 2015 20:53:01 -0400
>> > > From: steven.v.miller at gmail.com
>> > > To: r-sig-mixed-models at r-project.org
>> > > Subject: [R-sig-ME] Perfectly correlated random effects (when they shouldn't be)
>> >
>> > >
>> > > Hi all,
>> > >
>> > > I'm a long-time reader and wanted to raise a question I've seen asked
>> > > here before about correlated random effects. Past answers I have
>> > > encountered on this listserv explain that perfectly correlated random
>> > > effects suggest model overfitting and variances of random effects that
>> > > are effectively zero and can be omitted for a simpler model. In my case,
>> > > I don't think that's what is happening here, though I could well be
>> > > fitting a poor model in glmer.
>> > > I'll describe the nature of the data first. I'm modeling individual-level
>> > > survey data for countries across multiple waves and am estimating the
>> > > region of the globe as a random effect as well. I have three random effects
>> > > (country, country-wave, and region). In the region random effect, I am
>> > > allowing country-wave-level predictors to have varying slopes. My inquiry
>> > > is whether some country-wave-level contextual indicator can have an overall
>> > > effect (as a fixed effect) that also varies by region. In other words: is
>> > > the effect of some country-level indicator (e.g. unemployment) in a given
>> > > year different for countries in Western Europe than for countries in
>> > > Africa, even if, on average, there is a positive or negative association
>> > > at the individual level? These country-wave-level predictors that I allow
>> > > to vary by region are the ones reporting perfect correlation, and I'm
>> > > unsure how to interpret that (or whether I'm estimating the model
>> > > correctly).
>> > >
>> > > I should also add that I have individual-level predictors as well as
>> > > country-wave-level predictors, though it's the latter that concerns me.
>> > > Further, every non-binary indicator in the model is standardized by two
>> > > standard deviations.
>> > >
>> > > For those interested, I have a reproducible (if rather large) example
>> > > below. Dropbox link to the data is here:
>> > >
>> > > https://www.dropbox.com/s/t29jfwm98tsdr71/correlated-random-effects.csv?dl=0
>> > >
>> > > In this reproducible example, y is the outcome variable and x1 and x2 are
>> > > two country-wave-level predictors I allow to vary by region. Both x1 and
>> > > x2 are interval-level predictors that I standardized to have a mean of
>> > > zero and a standard deviation of .5 (per Gelman's (2008) recommendation).
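>> > >
>> > > That rescaling can be done by hand (or with arm::rescale()); a sketch by
>> > > hand, assuming x1 and x2 are numeric columns of the data frame:
>> > >
>> > > Data$x1 <- (Data$x1 - mean(Data$x1, na.rm=TRUE)) / (2 * sd(Data$x1, na.rm=TRUE))
>> > > Data$x2 <- (Data$x2 - mean(Data$x2, na.rm=TRUE)) / (2 * sd(Data$x2, na.rm=TRUE))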
>> > >
>> > > I estimate the following model.
>> > >
>> > > summary(M1 <- glmer(y ~ x1 + x2 + (1 | country) + (1 | country:wave) +
>> > >                       (1 + x1 + x2 | region),
>> > >                     data=subset(Data), family=binomial(link="logit")))
>> > >
>> > > The results are theoretically intuitive. I think they make sense. However,
>> > > I get a report of perfect correlation for the varying slopes of the region
>> > > random effect.
>> > >
>> > > Random effects:
>> > >  Groups       Name        Variance Std.Dev. Corr
>> > >  country:wave (Intercept) 0.15915  0.3989
>> > >  country      (Intercept) 0.32945  0.5740
>> > >  region       (Intercept) 0.01646  0.1283
>> > >               x1          0.02366  0.1538   1.00
>> > >               x2          0.13994  0.3741  -1.00 -1.00
>> > > Number of obs: 212570, groups: country:wave, 143; country, 82; region, 6
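>> > >
>> > > For reference, that correlation block for the region term can also be
>> > > extracted directly with something like:
>> > >
>> > > attr(VarCorr(M1)$region, "correlation")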
>> > >
>> > > What should I make of this, and am I estimating this model wrong? For what
>> > > it's worth, the dotplot of the region random effect (with conditional
>> > > variance) makes sense and is theoretically intuitive, given my data
>> > > (http://i.imgur.com/mrnaJ77.png).
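>> > >
>> > > The dotplot was presumably produced along these lines, using lme4's
>> > > lattice-based method for ranef objects:
>> > >
>> > > library(lattice)
>> > > dotplot(ranef(M1, condVar=TRUE))$region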
>> > >
>> > > Any help would be greatly appreciated.
>> > >
>> > > Best regards,
>> > > Steve
>> > >
>> >
>>
>>
>>
>> --
>> Steven V. Miller
>> Assistant Professor
>> Department of Political Science
>> Clemson University
>> http://svmiller.com
>>
>>
>
>
