[R-sig-ME] Problem specifying uncorrelated random intercepts and slopes for a multi-df covariate

Thu Aug 4 17:07:39 CEST 2022

Many thanks, Ben and João. I did as advised, converting x2 into two dummies
and specifying the random effects as *(x2B+x2C||id)*. This yields the
correct number of estimated parameters (1 random intercepts, 2 random
slopes). However, there's something I don't understand about the results.

Firstly, there's a warning about a singular fit, which I take to mean that
some parameters are inestimable. Judging from the following output, I
gather that it must be the 2 random slopes, which are estimated at
essentially zero:

*> summary(slopes.nocorr)*
*...*

*Random effects: Groups Name        Variance  Std.Dev.  id     (Intercept)
5.539e-01 7.443e-01 id.1   x2B         1.546e-14 1.243e-07 id.2   x2C
  0.000e+00 0.000e+00Number of obs: 1405, groups:  id, 292Fixed effects:
        Estimate Std. Error z value Pr(>|z|)    (Intercept) -0.93057
 0.27450  -3.390 0.000699 ***x1           0.51158    0.26550   1.927
0.053997 .  x2B          2.54505    0.20936  12.156  < 2e-16 ***x2C
 2.30179    0.30480   7.552 4.29e-14 ***x3          -0.77494    0.11660
 -6.646 3.01e-11 ***x4           0.24489    0.04957   4.940 7.80e-07 ***x5
          0.28619    0.13810   2.072 0.038235 *  x6          -1.07816
 0.90224  -1.195 0.232091    x7          -0.67521    0.32810  -2.058
0.039595 *  *
*x8          -0.76275    0.28824  -2.646 0.008138 ** *

It seems very strange that the random slopes should be inestimable: x2B and
x2C are not exceedingly scarce conditions: there are 277 observations of
the former, 91 of the latter. There are 52 IDs with observations of both xB
= 1 and xB = 0. And there are 33 IDs with observations of both xC = 1 and
xC = 0. So, I don't understand why the random slopes couldn't be estimated.

Stranger still, if I fit an otherwise identical model with *correlated* random
effects, the random-slope estimates suddenly do differ from 0 (although
there's still a singularity warning). Like so:

*> slopes.corr <-   glmer(y ~ (x2B+x2C|id)  + x1 + x2B + x2C + x3 + x4 + x5
+ x6 + x7 + x8, family = binomial, data = mydata, control =
glmerControl(optimizer = "optimx", optCtrl = list(method = "nlm")), nAGQ =
1)*
*...*

*Random effects: Groups Name        Variance Std.Dev. Corr        id
(Intercept) 0.5827   0.7633                      x2B         0.0798
0.2825   -0.22              x2C         0.2060   0.4539   -0.68  0.86Number
of obs: 1405, groups:  id, 292*

*Fixed effects:            Estimate Std. Error z value Pr(>|z|)
 (Intercept) -0.98131    0.30416  -3.226  0.00125 ** x1           0.56391
 0.29692   1.899  0.05754 .  x2B          2.56529    0.28735   8.928  <
2e-16 ***x2C          2.17394    0.41814   5.199 2.00e-07 ***x3
 -0.77852    0.11699  -6.655 2.84e-11 ***x4           0.24384    0.04982
4.895 9.84e-07 ***x5           0.28790    0.14005   2.056  0.03981 *  x6
       -1.08438    0.91036  -1.191  0.23359    x7          -0.66753
 0.32962  -2.025  0.04285 *  x8          -0.75425    0.28913  -2.609
 0.00909 ** *

It boggles my mind that the 2 random slopes should be inestimable in the
simpler model (with no correlation params) but somehow become estimable
when you introduce 3 more parameters by allowing random-effect
correlations. My brain has melted. Does anyone have a clue what's going on?
The anonymized datafile is available here <https://file.io/VKruszwJBwcK>.

Best,

Juho

to 4. elok. 2022 klo 0.30 João Veríssimo (jl.verissimo using gmail.com) kirjoitti:

> (1+x2 || id) is shorter notation for (1 | id) + (0 + x2 | id ).
> And because x2 is a factor, suppressing the intercept leads to the
> 'cell-mean coding' of x2: what is being estimated is the between-id
> variation around the means of each level, A, B, and C (and their
> correlation).
>
> In order to get what you want, turn x2 into two numeric variables
> according to its contrasts. For example:
> x2num1 <- ifelse(x2=="B", 1, 0)
> x2num2 <- ifelse(x2=="C", 1, 0)
>
> Then (1 + x2num1 + x2num2 || id) will give you the random intercept, two
> random slopes and no correlations.
>
> João
>
> On 03/08/2022 21:10, Juho Kristian Ruohonen wrote:
> > Dear List,
> >
> > This is a logistic GLMM with 1 grouping factor + 8 fixed-effect
> covariates.
> > One of the fixed effects, namely x2, has three unordered categories. This
> > is the covariate for whose 2 non-reference categories I want to estimate
> > random slopes, along with the random intercepts with which I don't expect
> > the slopes to be correlated. But I fail:
> >
> >
> >
> > *> VarCorr(glmer(y ~ (x2||id) + x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8,
> > family = binomial, data = mydata, control = glmerControl(optimizer =
> > "optimx", optCtrl = list(method = "nlm")), nAGQ = 1))*
> >
> >
> > *boundary (singular) fit: see help('isSingular')*
> >
> >
> >
> >
> > * Groups Name        Std.Dev. Corr        id     (Intercept) 0.00000
> >         id.1   x2A         0.76331                     x2B
>  0.75422
> >   0.931              x2C         0.56139  0.807 0.967*
> >
> > ^ Why is it reporting correlations when I told it not to? And why is it
> > reporting the intercept variance as zero (which is wholly implausible)?
> And
> > why is it reporting a "random slope" for the reference category of x2?
> It's
> > the reference category, for crying out loud! It's not supposed to get an
> > estimate.
> >
> > Consultation of the lme4 manual
> > <https://cran.r-project.org/web/packages/lme4/vignettes/lmer.pdf> (page
> 7)
> > suggests the following alternative syntax for specifying random slopes
> > uncorrelated with the random intercepts:
> >
> > *> VarCorr(glmer(y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + (1|id) +
> > (0+x2|id), family = binomial, data = mydata, control =
> > glmerControl(optimizer = "optimx", optCtrl = list(method = "nlm")), nAGQ
> =
> > 1))*
> >
> > *boundary (singular) fit: see help('isSingular')*
> >
> >
> >
> >
> >
> > * Groups Name        Std.Dev. Corr        id     (Intercept) 0.00000
> >         id.1   x2A         0.76331                     x2B
>  0.75422
> >   0.931              x2C         0.56139  0.807 0.967*
> >
> > ^ The exact same strangeness persists. Correlations are being estimated
> > against my wishes, and there's a nonsensical parameter supposedly
> > ostensibly representing the reference category, plus an implausible zero
> > value reported on the random intercepts. What am I doing wrong?
> >
> > Best,
> >
> > Juho
> >
> >       [[alternative HTML version deleted]]
> >
> > _______________________________________________
> > R-sig-mixed-models using r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
>

	[[alternative HTML version deleted]]