[R-sig-ME] "nesting for free" in lme4. Do I have this right?
Jake Westfall
jake.a.westfall at gmail.com
Thu Apr 5 23:07:40 CEST 2018
Hi Paul,
The "(1|state) + (1|region)" and "(1|region/state)" models are equivalent
if states are explicitly nested (as opposed to implicitly nested) in
regions. Explicitly nested means that the states in each region are labeled
with unique IDs that don't repeat across regions. This equivalence is
because (1|region/state) is just internally expanded to (1|region) +
(1|region:state), but the "region:state" interaction term has the same
number of unique levels as a simple "state" term if all states have unique
IDs.
The model with just "(1|state)" is not equivalent to either of the
aforementioned models. It has fewer parameters and it lumps some of the
region variance in with the state variance.
As for identifiability, because you have many observations within each
state, a model with random effects for both state and region is definitely
identifiable, there shouldn't be any technical or conceptual problem. And
it's probably a better model since it appears there is at least non-trivial
region variance in your data and the model converges fine.
Jake
On Thu, Apr 5, 2018 at 3:51 PM, Paul Johnson <pauljohn32 at gmail.com> wrote:
> I don't (yet) have permission to give you this data, but I've asked
> for permission and hope I can show in future if you are interested.
>
> One of the students showed me a model and I said "oh, no, that won't
> work" but glmer calculated estimates. He ran a model for US public
> opinion with "(1|state) + (1|region)" and I thought it should be "(1 |
> region/state)", but it turns out results are the same. This is what I
> mean "nesting for free" in lme4.
>
> This author's field of study uses lme4 to calculate shrinkage
> estimates for states, gender/ethnic subgroups. Things that I would
> treat as individual level fixed effects, like gender, age, and
> education, are inserted as random effects. Then they use the
> conditional modes for constructing other quantities they want.
>
> After the student ran his model, I suggested instead: "(1 | state)".
> This also converges, but the state-by-state effect estimates differ
> from the nested model. I thought that was interesting. The main
> purpose here is to retrieve the conditional modes, NOT the variance
> estimates.
>
> Which model is better/correct?
>
> Nested
>
> Generalized linear mixed model fit by maximum likelihood (Laplace
> Approximation) [glmerMod]
> Family: binomial ( logit )
> Formula: y ~ (1 | regionf/state)
> Control: glmerControl(optimizer = "Nelder_Mead")
>
> AIC BIC logLik deviance df.resid
> 18888.9 18911.5 -9441.4 18882.9 13875
>
> Scaled residuals:
> Min 1Q Median 3Q Max
> -1.0856 -0.8567 -0.7726 1.1097 1.4309
>
> Random effects:
> Groups Name Variance Std.Dev.
> state:regionf (Intercept) 0.05461 0.2337
> regionf (Intercept) 0.01226 0.1107
> Number of obs: 13878, groups: state:regionf, 50; regionf, 4
>
> Fixed effects:
> Estimate Std. Error z value Pr(>|z|)
> (Intercept) -0.26732 0.06965 -3.838 0.000124 ***
> ---
>
>
> Here are region and first few state conditional modes from the nested
> model:
>
> $state
> (Intercept)
> Alabama -0.188083365
> Alaska -0.103191227
> Arizona 0.049081765
> Arkansas -0.108260203
> California 0.436002015
> Colorado 0.035086075
> Connecticut 0.046271020
> Delaware 0.080796357
> Florida 0.062439432
> ...
>
> $regionf
> (Intercept)
> Northeast 0.139654257
> Midwest -0.002894665
> South -0.106236023
> West -0.027772517
>
> For reference, the first few random effects from (1 | state) are:
>
> > ranef(individual.modeld)
> $state
> (Intercept)
> Alabama -0.260553538
> Alaska -0.116579160
> Arizona 0.046677297
> Arkansas -0.172353191
> California 0.435397028
> Colorado 0.032704790
> Connecticut 0.162169289
> Delaware 0.057368626
> Florida -0.013638881
> ....
>
> The "total state effect" for, say, Alabama is the sum of the South
> effect and the Alabama effect,
> and that's different from the (1 | state) model.
>
> Alabama nested: -0.106236023 -0.188083365 = -0.2943194
>
> Alabama from (1 | state): -0.26
>
> Those are slightly different. Is one better?
>
> Seems to me the difference between the two is simply the shrinkage
> effect of the PLS estimator, there's no substantively important
> meaning in it. The shrinkage is applied separately to (1 | region)
> and (1 | state). The correlation between region and state effects is
> assumed 0. It is manufacturing 2 random effect layers where there
> should be just one.
>
> If state is treated as a fixed effect, of course, the region variable
> is unidentifiable. Should the random effect model really be allowed to
> manufacture a difference? That makes me think (1 | state) is the
> "right model" and "(1 | region/state)" is different only as an
> artifact.
>
> Likelihood ratio test seems to say there is no big difference, which
> is a relief. But what if there were?
>
> individual.modeld: y ~ (1 | state)
> individual.modelc: y ~ (1 | regionf/state)
> Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
> individual.modeld 2 18889 18904 -9442.6 18885
> individual.modelc 3 18889 18912 -9441.4 18883 2.2477 1 0.1338
>
>
> More and more, I have the feeling that the random effects model
> fabricates an ability to estimate quantities that are not really
> indentifiable. Estimating state-level-random effects on top of
> randomly assigned regional scores is only possible because of the
> technicalities related to shrinkage. Maybe. I'm not sure, that's why
> I'm asking you what you think.
>
> pj
> --
> Paul E. Johnson http://pj.freefaculty.org
> Director, Center for Research Methods and Data Analysis
> http://crmda.ku.edu
>
> To write to me directly, please address me at pauljohn at ku.edu.
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
[[alternative HTML version deleted]]
More information about the R-sig-mixed-models
mailing list