[R-sig-ME] Dropping correlations bet. random-effects in lme4 syntax
Phillip Alday
ph||||p@@|d@y @end|ng |rom mp|@n|
Wed Oct 7 23:39:12 CEST 2020
Please keep the list in CC.
I would strongly advise against statistical testing of the correlation
parameters. You can technically do a likelihood ratio test comparing
models fit with | and with ||, because the zero-correlation models are a
special case of the unrestricted-correlation model, but DO NOT DO THIS.
The estimates of the correlation parameters are notoriously unstable: we
typically don't have nearly enough data to estimate them (because you
need both a large number of observations AND a large number of groups).
Moreover, the correlation estimates are intertwined with the estimates
of the variance parameters. The sampling distribution for the variance
parameters often looks a lot like a mixture of a point-mass at zero and
some other, bounded distribution. This results in the sampling
distribution of the correlation parameter also having a point mass at
±1. We have an example of this in the MixedModels.jl docs
(https://juliastats.org/MixedModels.jl/dev/bootstrap/); there is
probably an lme4 equivalent somewhere on the internet.
For a more practical take on what happens when leave out the correlation
parameters, see e.g. this blog post by John Kruschke
https://doingbayesiandataanalysis.blogspot.com/2019/07/shrinkage-in-hierarchical-models-random.html
or this take on dropping correlations to simplify model structure but
still do good inference by Reinhold Kliegl
https://rpubs.com/Reinhold/22193
Phillip
On 4/10/20 9:33 am, Simon Harmel wrote:
> Hi Phillip,
>
> I need to ask one follow up question. Suppose I want to test only one
> correlation in `m1`'s random-effects covariance matrix below (pick
> anyone correlation you want) for its statistical significance. I assume
> that I must use a likelihood ratio test that compares `m1` with a
> "second model" in which that correlation is absent, right?
>
> How would you specify that "second model" so you can perform a
> likelihood ratio test?
>
> m1 <- lmer(y ~ A * B * C + (A * B * C | group), data = data)
>
> On Sat, Oct 3, 2020 at 1:54 PM Simon Harmel <sim.harmel using gmail.com
> <mailto:sim.harmel using gmail.com>> wrote:
>
> Awesome, thanks!
>
> On Sat, Oct 3, 2020 at 12:40 PM Phillip Alday <phillip.alday using mpi.nl
> <mailto:phillip.alday using mpi.nl>> wrote:
>
> On 03/10/2020 19:24, Simon Harmel wrote:
>> Thanks for the additional information about the use of `0+` in
>> the context of categorical variables:)
>>
>> So, by splitting the specification of the grouping variable like:
>>
>> lmer(y ~ A * B * C + (A * C | group) + (B|group) , data = data)
>>
>> I get the below correlation matrix. So, here we have created 2
>> different intercepts, one for each "split" of the same
>> grouping variable, right?
>
> Yes, by default R adds in an intercept. I recommend always
> writing 1+ or 0+ to make explicit what you want.
>
> (This was what I meant with "Note that life gets tricky with the
> interaction terms." in my first reply.)
>
>>
>> I ask this because B has no correlation with the first split's
>> (intercept) but it has a singular correlation with the second
>> split's (intercept). The output looks confusing to a novice
>> like me.
>>
>>
>> (Intercept) A C A:C (Intercept) B
>> (Intercept) 1.000 -0.794 -0.195 0.953 0 0
>> A -0.794 1.000 0.278 -0.854 0 0
>> C -0.195 0.278 1.000 0.028 0 0
>> A:C 0.953 -0.854 0.028 1.000 0 0
>> (Intercept) 0.000 0.000 0.000 0.000 1 -1
>> B 0.000 0.000 0.000 0.000 -1 1
>
> Singularity isn't a problem per se -- it's mathematically well
> defined and being able to fit singular covariance matrices for
> the random effects was one of the algorithmic innovations lme4
> has compared to its predecessor nlme.
>
> Singularity can be a problem for inference (because it's often a
> sign of overfitting).... but it makes sense here. B has no
> correlation with the first intercept because you set it to be
> zero via your formula specification, so that term is forced to
> be zero in the model computation. The second intercept is
> perfectly correlated with B because it (the second intercept) is
> redundant in the model, but it can't correlate with the first
> intercept (because that correlation is forced to zero by your
> model specification) and so it collapses into B.
>
> If B is continuous, you can avoid this with 0+B. If B is
> categorical, then 0+B will still be overparameterized, but you
> can try 0+dummy(B) to set up indicator variable for one of the
> two levels. (I can tell from your output that B is continuous or
> only has two levels because there is only one slope for it.)
>
> Phillip
>
>>
>> On Sat, Oct 3, 2020 at 11:52 AM Phillip Alday
>> <phillip.alday using mpi.nl <mailto:phillip.alday using mpi.nl>> wrote:
>>
>> I have no idea what 0* means, but 0+ means "suppress the
>> intercept" (which has knock-on effects for categorical
>> variables and whether they're represented in the model as
>> (nlevels-1) contrasts or nlevels).
>>
>> For the other things: try it out. The output of
>> summary(m1) will show you which levels and correlations
>> were kept.
>>
>> On 03/10/2020 18:44, Simon Harmel wrote:
>>> Thanks Phillip. What would be the meaning of placing `0
>>> +` next to any of the random effects (e.g., B) as shown
>>> in m2?
>>>
>>> m1 <- lmer(y ~ A * B * C + (A * C | group) + (B|group) ,
>>> data = data)
>>>
>>> m2 <- lmer(y ~ A * B * C + (A * 0+ B * C | group), data
>>> = data)
>>>
>>> On Sat, Oct 3, 2020 at 11:33 AM Phillip Alday
>>> <phillip.alday using mpi.nl <mailto:phillip.alday using mpi.nl>> wrote:
>>>
>>> You can split the specification of your grouping to
>>> achieve this, at
>>> least in part:
>>>
>>> lmer(y ~ A * B * C + (A * C | group) + (B|group) ,
>>> data = data)
>>>
>>> Note that life gets tricky with the interaction terms.
>>>
>>> Phillip
>>>
>>> On 03/10/2020 06:35, Simon Harmel wrote:
>>> > Hello all,
>>> >
>>> > I know to drop all correlations among all level-1
>>> predictors in the random
>>> > part of an lmer() call, I can use `||`. But I was
>>> wondering how to drop
>>> > correlations (a) "individually" or (b) "in pairs"?
>>> >
>>> > Example of (a) is how to drop the correlation of B
>>> with others (A & C)?
>>> > Example of (b) is how to drop the correlation
>>> between B and C?
>>> >
>>> > lmer(y ~ A * B * C + (A * B * C || group), data =
>>> data)
>>> >
>>> > Thanks,
>>> > Simon
>>> >
>>> > [[alternative HTML version deleted]]
>>> >
>>> > _______________________________________________
>>> > R-sig-mixed-models using r-project.org
>>> <mailto:R-sig-mixed-models using r-project.org> mailing list
>>> >
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>
More information about the R-sig-mixed-models
mailing list