[R-sig-ME] "nesting for free" in lme4. Do I have this right?

Fri Apr 6 17:16:56 CEST 2018

I agree that the terminology could go either way.  FWIW the GLMM FAQ
uses Thierry's convention, i.e. "If the lower-level random effect has
the same labels within each larger group (e.g. blocks 1, 2, 3, 4
within sites A, B, and C) then the explicit nesting (1|a/b) is
required. It seems to be considered best practice to code the nested
level uniquely (e.g. A1, A2, …, B1, B2, …) so that confusion between
nested and crossed effects is less likely" (similarly in my chapter 14
of Fox et al's Ecological Statistics book).

On Fri, Apr 6, 2018 at 10:47 AM, Jake Westfall
<jake.a.westfall at gmail.com> wrote:
> Hi Thierry,
>
> I see your point. I usually think of implicit vs. explicit nesting in terms
> of how the factor levels are labeled in the dataset: the levels are
> explicitly nested if you can see the nesting just from looking at the data
> frame, and they're implicitly nested if you can't tell that they're nested
> just from looking at the data frame--because they appear to be
> crossed--instead you have to have outside, background knowledge about the
> dataset. If you instead view the issue from the perspective of the syntax,
> then yes, it would make sense to call (1|region/state) explicit and (1|region)
> + (1|state) implicit.
>
> Jake
>
> On Fri, Apr 6, 2018 at 3:08 AM, Thierry Onkelinx <thierry.onkelinx at inbo.be>
> wrote:
>
>> Dear Jake,
>>
>> IMHO you switched the implicit and explicit nesting. (1|region/state)
>> is explicit because you tell the model that those should be nested.
>> Implicit nesting occurs when you add the random effect as crossed
>> (1|region) + (1|state) but since each state has a unique ID and each
>> state occurs in only one region, the result is identical to the nested
>> random effects. Hence the implicit nesting.
>>
>> Best regards,
>>
>> ir. Thierry Onkelinx
>> Statisticus / Statistician
>>
>> Vlaamse Overheid / Government of Flanders
>> INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE
>> AND FOREST
>> Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
>> thierry.onkelinx at inbo.be
>> Havenlaan 88 bus 73, 1000 Brussel
>> www.inbo.be
>>
>> ////////////////////////////////////////////////////////////
>> ///////////////////////////////
>> To call in the statistician after the experiment is done may be no
>> more than asking him to perform a post-mortem examination: he may be
>> able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
>> The plural of anecdote is not data. ~ Roger Brinner
>> The combination of some data and an aching desire for an answer does
>> not ensure that a reasonable answer can be extracted from a given body
>> of data. ~ John Tukey
>> ////////////////////////////////////////////////////////////
>> ///////////////////////////////
>>
>>
>>
>>
>> 2018-04-05 23:07 GMT+02:00 Jake Westfall <jake.a.westfall at gmail.com>:
>> > Hi Paul,
>> >
>> > The "(1|state) + (1|region)" and "(1|region/state)" models are equivalent
>> > if states are explicitly nested (as opposed to implicitly nested) in
>> > regions. Explicitly nested means that the states in each region are
>> labeled
>> > with unique IDs that don't repeat across regions. This equivalence is
>> > because (1|region/state) is just internally expanded to (1|region) +
>> > (1|region:state), but the "region:state" interaction term has the same
>> > number of unique levels as a simple "state" term if all states have
>> unique
>> > IDs.
>> >
>> > The model with just "(1|state)" is not equivalent to either of the
>> > aforementioned models. It has fewer parameters and it lumps some of the
>> > region variance in with the state variance.
>> >
>> > As for identifiability, because you have many observations within each
>> > state, a model with random effects for both state and region is
>> definitely
>> > identifiable, there shouldn't be any technical or conceptual problem. And
>> > it's probably a better model since it appears there is at least
>> non-trivial
>> > region variance in your data and the model converges fine.
>> >
>> > Jake
>> >
>> > On Thu, Apr 5, 2018 at 3:51 PM, Paul Johnson <pauljohn32 at gmail.com>
>> wrote:
>> >
>> >> I don't (yet) have permission to give you this data, but I've asked
>> >> for permission and hope I can show in future if you are interested.
>> >>
>> >> One of the students showed me a model and I said "oh, no, that won't
>> >> work" but glmer calculated estimates. He ran a model for US public
>> >> opinion with "(1|state) + (1|region)" and I thought it should be "(1 |
>> >> region/state)", but it turns out results are the same. This is what I
>> >> mean "nesting for free" in lme4.
>> >>
>> >> This author's field of study uses lme4 to calculate shrinkage
>> >> estimates for states, gender/ethnic subgroups. Things that I would
>> >> treat as individual level fixed effects, like gender, age, and
>> >> education, are inserted as random effects. Then they use the
>> >> conditional modes for constructing other quantities they want.
>> >>
>> >> After the student ran his model, I suggested instead: "(1 | state)".
>> >> This also converges, but the state-by-state effect estimates differ
>> >> from the nested model. I thought that was interesting.  The main
>> >> purpose here is to retrieve the conditional modes, NOT the variance
>> >> estimates.
>> >>
>> >> Which model is better/correct?
>> >>
>> >> Nested
>> >>
>> >> Generalized linear mixed model fit by maximum likelihood (Laplace
>> >>   Approximation) [glmerMod]
>> >>  Family: binomial  ( logit )
>> >> Formula: y ~ (1 | regionf/state)
>> >> Control: glmerControl(optimizer = "Nelder_Mead")
>> >>
>> >>      AIC      BIC   logLik deviance df.resid
>> >>  18888.9  18911.5  -9441.4  18882.9    13875
>> >>
>> >> Scaled residuals:
>> >>     Min      1Q  Median      3Q     Max
>> >> -1.0856 -0.8567 -0.7726  1.1097  1.4309
>> >>
>> >> Random effects:
>> >>  Groups        Name        Variance Std.Dev.
>> >>  state:regionf (Intercept) 0.05461  0.2337
>> >>  regionf       (Intercept) 0.01226  0.1107
>> >> Number of obs: 13878, groups:  state:regionf, 50; regionf, 4
>> >>
>> >> Fixed effects:
>> >>             Estimate Std. Error z value Pr(>|z|)
>> >> (Intercept) -0.26732    0.06965  -3.838 0.000124 ***
>> >> ---
>> >>
>> >>
>> >> Here are region and first few state conditional modes from the nested
>> >> model:
>> >>
>> >> $state
>> >>                 (Intercept)
>> >> Alabama        -0.188083365
>> >> Alaska         -0.103191227
>> >> Arizona         0.049081765
>> >> Arkansas       -0.108260203
>> >> California      0.436002015
>> >> Colorado        0.035086075
>> >> Connecticut     0.046271020
>> >> Delaware        0.080796357
>> >> Florida         0.062439432
>> >> ...
>> >>
>> >> $regionf
>> >>            (Intercept)
>> >> Northeast  0.139654257
>> >> Midwest   -0.002894665
>> >> South     -0.106236023
>> >> West      -0.027772517
>> >>
>> >> For reference, the first few random effects from (1 | state) are:
>> >>
>> >> > ranef(individual.modeld)
>> >> $state
>> >>                 (Intercept)
>> >> Alabama        -0.260553538
>> >> Alaska         -0.116579160
>> >> Arizona         0.046677297
>> >> Arkansas       -0.172353191
>> >> California      0.435397028
>> >> Colorado        0.032704790
>> >> Connecticut     0.162169289
>> >> Delaware        0.057368626
>> >> Florida        -0.013638881
>> >> ....
>> >>
>> >> The "total state effect" for, say, Alabama is the sum of the South
>> >> effect and the Alabama effect,
>> >> and that's different from the (1 | state) model.
>> >>
>> >> Alabama nested: -0.106236023 -0.188083365 = -0.2943194
>> >>
>> >> Alabama from (1 | state): -0.26
>> >>
>> >> Those are slightly different. Is one better?
>> >>
>> >> Seems to me the difference between the two is simply the shrinkage
>> >> effect of the PLS estimator, there's no substantively important
>> >> meaning in it.  The shrinkage is applied separately to (1 | region)
>> >> and (1 | state). The correlation between region and state effects is
>> >> assumed 0.  It is manufacturing 2 random effect layers where there
>> >> should be just one.
>> >>
>> >> If state is treated as a fixed effect, of course, the region variable
>> >> is unidentifiable. Should the random effect model really be allowed to
>> >> manufacture a difference?  That makes me think (1 | state) is the
>> >> "right model" and "(1 | region/state)" is different only as an
>> >> artifact.
>> >>
>> >> Likelihood ratio test seems to say there is no big difference, which
>> >> is a relief.  But what if there were?
>> >>
>> >> individual.modeld: y ~ (1 | state)
>> >> individual.modelc: y ~ (1 | regionf/state)
>> >>                   Df   AIC   BIC  logLik deviance  Chisq Chi Df
>> Pr(>Chisq)
>> >> individual.modeld  2 18889 18904 -9442.6    18885
>> >> individual.modelc  3 18889 18912 -9441.4    18883 2.2477      1
>>  0.1338
>> >>
>> >>
>> >> More and more, I have the feeling that the random effects model
>> >> fabricates an ability to estimate quantities that are not really
>> >> indentifiable.  Estimating state-level-random effects on top of
>> >> randomly assigned regional scores is only possible because of the
>> >> technicalities related to shrinkage. Maybe. I'm not sure, that's why
>> >> I'm asking you what you think.
>> >>
>> >> pj
>> >> --
>> >> Paul E. Johnson   http://pj.freefaculty.org
>> >> Director, Center for Research Methods and Data Analysis
>> >> http://crmda.ku.edu
>> >>
>> >> To write to me directly, please address me at pauljohn at ku.edu.
>> >>
>> >> _______________________________________________
>> >> R-sig-mixed-models at r-project.org mailing list
>> >> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>> >>
>> >
>> >         [[alternative HTML version deleted]]
>> >
>> > _______________________________________________
>> > R-sig-mixed-models at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>
>         [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models