[R-sig-ME] "nesting for free" in lme4. Do I have this right?

Thu Apr 5 22:51:58 CEST 2018

I don't (yet) have permission to give you this data, but I've asked
for permission and hope I can show in future if you are interested.

One of the students showed me a model and I said "oh, no, that won't
work" but glmer calculated estimates. He ran a model for US public
opinion with "(1|state) + (1|region)" and I thought it should be "(1 |
region/state)", but it turns out results are the same. This is what I
mean "nesting for free" in lme4.

This author's field of study uses lme4 to calculate shrinkage
estimates for states, gender/ethnic subgroups. Things that I would
treat as individual level fixed effects, like gender, age, and
education, are inserted as random effects. Then they use the
conditional modes for constructing other quantities they want.

After the student ran his model, I suggested instead: "(1 | state)".
This also converges, but the state-by-state effect estimates differ
from the nested model. I thought that was interesting.  The main
purpose here is to retrieve the conditional modes, NOT the variance
estimates.

Which model is better/correct?

Nested

Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: y ~ (1 | regionf/state)
Control: glmerControl(optimizer = "Nelder_Mead")

     AIC      BIC   logLik deviance df.resid
 18888.9  18911.5  -9441.4  18882.9    13875

Scaled residuals:
    Min      1Q  Median      3Q     Max
-1.0856 -0.8567 -0.7726  1.1097  1.4309

Random effects:
 Groups        Name        Variance Std.Dev.
 state:regionf (Intercept) 0.05461  0.2337
 regionf       (Intercept) 0.01226  0.1107
Number of obs: 13878, groups:  state:regionf, 50; regionf, 4

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.26732    0.06965  -3.838 0.000124 ***
---

Here are region and first few state conditional modes from the nested model:

$state
                (Intercept)
Alabama        -0.188083365
Alaska         -0.103191227
Arizona         0.049081765
Arkansas       -0.108260203
California      0.436002015
Colorado        0.035086075
Connecticut     0.046271020
Delaware        0.080796357
Florida         0.062439432
...

$regionf
           (Intercept)
Northeast  0.139654257
Midwest   -0.002894665
South     -0.106236023
West      -0.027772517

For reference, the first few random effects from (1 | state) are:

> ranef(individual.modeld)
$state
                (Intercept)
Alabama        -0.260553538
Alaska         -0.116579160
Arizona         0.046677297
Arkansas       -0.172353191
California      0.435397028
Colorado        0.032704790
Connecticut     0.162169289
Delaware        0.057368626
Florida        -0.013638881
....

The "total state effect" for, say, Alabama is the sum of the South
effect and the Alabama effect,
and that's different from the (1 | state) model.

Alabama nested: -0.106236023 -0.188083365 = -0.2943194

Alabama from (1 | state): -0.26

Those are slightly different. Is one better?

Seems to me the difference between the two is simply the shrinkage
effect of the PLS estimator, there's no substantively important
meaning in it.  The shrinkage is applied separately to (1 | region)
and (1 | state). The correlation between region and state effects is
assumed 0.  It is manufacturing 2 random effect layers where there
should be just one.

If state is treated as a fixed effect, of course, the region variable
is unidentifiable. Should the random effect model really be allowed to
manufacture a difference?  That makes me think (1 | state) is the
"right model" and "(1 | region/state)" is different only as an
artifact.

Likelihood ratio test seems to say there is no big difference, which
is a relief.  But what if there were?

individual.modeld: y ~ (1 | state)
individual.modelc: y ~ (1 | regionf/state)
                  Df   AIC   BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)
individual.modeld  2 18889 18904 -9442.6    18885
individual.modelc  3 18889 18912 -9441.4    18883 2.2477      1     0.1338

More and more, I have the feeling that the random effects model
fabricates an ability to estimate quantities that are not really
indentifiable.  Estimating state-level-random effects on top of
randomly assigned regional scores is only possible because of the
technicalities related to shrinkage. Maybe. I'm not sure, that's why
I'm asking you what you think.

pj
-- 
Paul E. Johnson   http://pj.freefaculty.org
Director, Center for Research Methods and Data Analysis http://crmda.ku.edu

To write to me directly, please address me at pauljohn at ku.edu.