[R-sig-ME] Fwd: Nonsensical results in glmer (bglmer)

Francesco Romano |brom@no77 @end|ng |rom gm@||@com
Mon Jul 31 14:42:59 CEST 2023


Hello Tom,

Thank you so much for pitching in.

I haven't touched the levels for group so the ref level is L1-L2 as far as
I can see.

> contrasts(masterPT$Group)
   2
L1 0
L2 1

What seems strange to me is that the output is calling the levels 1 and 2
rather than
what they actually are, L1 and L2. This is new to me because in the past I
have always seen
something like GroupL1 or GroupL2 in the fixed-effects section of the
output.

Just in case, I reset the contrast scheme to treatment and ran the analysis
again (the binomial outcome was recoded as 0/1 as 'Correct'):

> contrasts(masterPT$Group) = contr.treatment(2)
> contrasts(masterPT$Type) = contr.treatment(2)
> freqmodel4<-bglmer(Correct~Type*Group+(1+Type|SUBJ)+(1+Group|ITEM),
family = binomial(link="logit"), data=masterPT,
control=glmerControl(optimizer = "bobyqa"), nAGQ=1)
> summary(freqmodel4)
Cov prior  : SUBJ ~ wishart(df = 4.5, scale = Inf, posterior.scale = cov,
common.scale = TRUE)
           : ITEM ~ wishart(df = 4.5, scale = Inf, posterior.scale = cov,
common.scale = TRUE)
Prior dev  : -5.3929

Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) ['bglmerMod']
 Family: binomial  ( logit )
Formula: Correct ~ Type * Group + (1 + Type | SUBJ) + (1 + Group | ITEM)
   Data: masterPT
Control: glmerControl(optimizer = "bobyqa")

     AIC      BIC   logLik deviance df.resid
  1676.8   1734.1   -828.4   1656.8     2265

Scaled residuals:
    Min      1Q  Median      3Q     Max
-4.8595 -0.1106  0.1315  0.3447  4.1192

Random effects:
 Groups Name        Variance Std.Dev. Corr
 SUBJ   (Intercept) 1.915    1.384
        Type2       1.642    1.281    -0.74
 ITEM   (Intercept) 8.970    2.995
        Group2      6.479    2.545    -0.74
Number of obs: 2275, groups:  SUBJ, 65; ITEM, 35

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)    2.7380     0.7070   3.872 0.000108 ***
Type2          2.5683     1.3313   1.929 0.053707 .
Group2        -1.3802     0.6907  -1.998 0.045681 *
Type2:Group2  -4.0366     1.2383  -3.260 0.001115 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr) Type2  Group2
Type2       -0.446
Group2      -0.750  0.312
Type2:Grop2  0.326 -0.792 -0.442

Now this has basically turn things around the way they should be. Group2 is
in fact the L1 group so the coefficient makes sense.
It seems like recoding the outcome to 0/1 from correct/incorrect did the
trick. I confirmed this by relevelling:

> freqmodel5<-bglmer(Correct~Type*relevel(masterPT$Group, ref =
"L2")+(1+Type|SUBJ)+(1+Group|ITEM), family = binomial(link="logit"),
data=masterPT, control=glmerControl(optimizer = "bobyqa"), nAGQ=1)
> summary(freqmodel5)
Cov prior  : SUBJ ~ wishart(df = 4.5, scale = Inf, posterior.scale = cov,
common.scale = TRUE)
           : ITEM ~ wishart(df = 4.5, scale = Inf, posterior.scale = cov,
common.scale = TRUE)
Prior dev  : -5.3929

Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) ['bglmerMod']
 Family: binomial  ( logit )
Formula: Correct ~ Type * relevel(masterPT$Group, ref = "L2") + (1 + Type |
     SUBJ) + (1 + Group | ITEM)
   Data: masterPT
Control: glmerControl(optimizer = "bobyqa")

     AIC      BIC   logLik deviance df.resid
  1676.8   1734.1   -828.4   1656.8     2265

Scaled residuals:
    Min      1Q  Median      3Q     Max
-4.8595 -0.1106  0.1315  0.3447  4.1192

Random effects:
 Groups Name        Variance Std.Dev. Corr
 SUBJ   (Intercept) 1.915    1.384
        Type2       1.642    1.281    -0.74
 ITEM   (Intercept) 8.970    2.995
        Group2      6.479    2.545    -0.74
Number of obs: 2275, groups:  SUBJ, 65; ITEM, 35

Fixed effects:
                                            Estimate Std. Error z value
Pr(>|z|)
(Intercept)                                   1.3577     0.4946   2.745
 0.00605 **
Type2                                        -1.4683     0.8340  -1.761
 0.07832 .
relevel(masterPT$Group, ref = "L2")L1         1.3802     0.6906   1.999
 0.04566 *
Type2:relevel(masterPT$Group, ref = "L2")L1   4.0366     1.2381   3.260
 0.00111 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr) Type2  r(Pr="
Type2       -0.546
r(PT$G,r="L -0.325  0.157
T2:(PT$Gr="  0.151 -0.221 -0.442

The coefficient now correctly indicates a higher chance of scoring a
correct answer when going from the L2 to the L1 group by 34%.

Thanks Tom and Thierry!



Francesco Romano PhD


On Mon, Jul 31, 2023 at 8:52 AM Tom Fritzsche <tom.fritzsche using uni-potsdam.de>
wrote:

> Dear Francsco,
>
> In your new model, it seems you don't set your group reference level
> to 2 (unless you did that independently before, which I would also
> recommend).
> Could it be that now group 1 (the native language group) is the
> reference level so that the results make sense?
> You could check by looking at
> contrasts(masterPT$Group)
> to see what the contrast specification is.
>
> Best,
> Tom
>
> ---
> Tom Fritzsche (pronoun: he)
> University of Potsdam
> Department of Linguistics
> Karl-Liebknecht-Str. 24-25
> 14476 Potsdam
> Germany
>
> Office:  House 14 / Room 1.40
> Phone: +49 331 977 2296
> Fax:      +49 331 977 2095
> Email:   tom.fritzsche using uni-potsdam.de
> https://www.ling.uni-potsdam.de/~fritzsche/
> https://orcid.org/0000-0002-7917-514X
>
> On Mon, 31 Jul 2023 at 08:38, Francesco Romano <fbromano77 using gmail.com>
> wrote:
> >
> > Hello Thierry,
> >
> > Unfortunately that doesn't seem to make any difference. The result is the
> > same: *Group2        -1.3802     0.6907  -1.998 0.045681 *  *
> > I paste the output below.
> >
> > #create a new dummy variable recoding RESP#
> >
> > > masterPT$Correct <- 0
> >
> > #recode correct responses as a score of 1#
> >
> > > masterPT$Correct<-ifelse(masterPT$RESP=="correct",1,0)
> >
> > > summary(masterPT$Correct)
> >    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
> >  0.0000  0.0000  1.0000  0.7371  1.0000  1.0000
> >
> > > freqmodel4<-bglmer(Correct~Type*Group+(1+Type|SUBJ)+(1+Group|ITEM),
> > family = binomial(link="logit"), data=masterPT,
> > control=glmerControl(optimizer = "bobyqa"), nAGQ=1)
> > > summary(freqmodel4)
> > Cov prior  : SUBJ ~ wishart(df = 4.5, scale = Inf, posterior.scale = cov,
> > common.scale = TRUE)
> >            : ITEM ~ wishart(df = 4.5, scale = Inf, posterior.scale = cov,
> > common.scale = TRUE)
> > Prior dev  : -5.3929
> >
> > Generalized linear mixed model fit by maximum likelihood (Laplace
> > Approximation) ['bglmerMod']
> >  Family: binomial  ( logit )
> > Formula: Correct ~ Type * Group + (1 + Type | SUBJ) + (1 + Group | ITEM)
> >    Data: masterPT
> > Control: glmerControl(optimizer = "bobyqa")
> >
> >      AIC      BIC   logLik deviance df.resid
> >   1676.8   1734.1   -828.4   1656.8     2265
> >
> > Scaled residuals:
> >     Min      1Q  Median      3Q     Max
> > -4.8595 -0.1106  0.1315  0.3447  4.1192
> >
> > Random effects:
> >  Groups Name        Variance Std.Dev. Corr
> >  SUBJ   (Intercept) 1.915    1.384
> >         Type2       1.642    1.281    -0.74
> >  ITEM   (Intercept) 8.970    2.995
> >         Group2      6.479    2.545    -0.74
> > Number of obs: 2275, groups:  SUBJ, 65; ITEM, 35
> >
> > Fixed effects:
> >              Estimate Std. Error z value Pr(>|z|)
> > (Intercept)    2.7380     0.7070   3.872 0.000108 ***
> > Type2          2.5683     1.3313   1.929 0.053707 .
> > Group2        -1.3802     0.6907  -1.998 0.045681 *
> > Type2:Group2  -4.0366     1.2383  -3.260 0.001115 **
> > ---
> > Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> >
> > Correlation of Fixed Effects:
> >             (Intr) Type2  Group2
> > Type2       -0.446
> > Group2      -0.750  0.312
> > Type2:Grop2  0.326 -0.792 -0.442
> >
> > Best,
> >
> > Francesco Romano PhD
> >
> >
> > On Sun, Jul 30, 2023 at 6:53 PM Thierry Onkelinx <
> thierry.onkelinx using inbo.be>
> > wrote:
> >
> > > Dear Francesco,
> > >
> > > Don't use a factor response variable. Use either FALSE/TRUE or 0/1.
> Note
> > > that as.factor() would use "correct" as the first level and
> "incorrect" as
> > > the second level. Maybe the model uses the first levels as FALSE and
> > > the second as TRUE. Setting TRUE and FALSE yourselves eliminates such
> > > ambiguity.
> > > I prefer to do any transformations like as.factor() or relevel() prior
> to
> > > fitting the model. Then you have the same variables available in the
> > > dataset (e.g. for plotting).
> > >
> > > Best regards,
> > >
> > > ir. Thierry Onkelinx
> > > Statisticus / Statistician
> > >
> > > Vlaamse Overheid / Government of Flanders
> > > INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE
> AND
> > > FOREST
> > > Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
> > > thierry.onkelinx using inbo.be
> > > Havenlaan 88 bus 73, 1000 Brussel
> > > www.inbo.be
> > >
> > >
> > >
> ///////////////////////////////////////////////////////////////////////////////////////////
> > > To call in the statistician after the experiment is done may be no more
> > > than asking him to perform a post-mortem examination: he may be able
> to say
> > > what the experiment died of. ~ Sir Ronald Aylmer Fisher
> > > The plural of anecdote is not data. ~ Roger Brinner
> > > The combination of some data and an aching desire for an answer does
> not
> > > ensure that a reasonable answer can be extracted from a given body of
> data.
> > > ~ John Tukey
> > >
> > >
> ///////////////////////////////////////////////////////////////////////////////////////////
> > >
> > > <https://www.inbo.be>
> > >
> > >
> > > Op zo 30 jul 2023 om 17:46 schreef Francesco Romano <
> fbromano77 using gmail.com
> > > >:
> > >
> > >> Dear all,
> > >>
> > >>
> > >> I wonder if anyone can account for a counterintuitive result in my
> > >> analyses
> > >> of a logistic regression run with glmer. The data, where I only use
> the
> > >> columns RESP as a binomial outcome coded as character (correct vs.
> > >> incorrect), Type as a factor with 2 levels where 'islands'
> constitutes the
> > >> reference level, Group as a factor with two levels where 'L1'
> constitutes
> > >> the reference level, and SUBJ and ITEM as random effects, is
> attached. In
> > >> this particular analysis I use bglmer but the same result ensues from
> > >> glmer
> > >> or even a bernoulli brm analysis in the Bayesian framework (i.e. brms
> > >> package).
> > >>
> > >> The following output shows that the group L1, the reference group,
> has a
> > >> lower probability of a correct score compared to L2. If the output is
> > >> garbled in your email, feel free to run the code yourself to get a
> clearer
> > >> picture. Under the fixed-effects of the output, we interpret the
> > >> coefficient -1.38 of r*elevel(masterPT$Group, ref = "L2")L1
> -1.3802
> > >>     0.6907  -1.998  0.04567 * *to mean that the probability of
> scoring a
> > >> correct answer is approximately 34% lower in the L1 than the L2 group
> > >> (following Gelman and Hill, 2007, p.93) we divide the coeffecient
> -1.38
> > >> expressed in log odds by 4 to obtain an approximate corresponding
> > >> probability). This is counterintuitive because the L1 group is
> basically a
> > >> group of native speakers of Spanish, the language being tested, while
> the
> > >> L2 is a bilingual group being tested in Spanish as a foreign language
> > >> which
> > >> they learned later in life. Even playing devil's advocate, a quick
> look at
> > >> a prop table or even the figures also attached shows the L1 group
> exhibit
> > >> lower counts of incorrect responses. In the figures, this can be
> easily
> > >> seen by comparing the amount of light blue splash for either of the
> Type
> > >> levels between the L1 and L2 groups. There is far less of a splash in
> the
> > >> L1 data which suggests they should statisticall have a higher chance
> of
> > >> selecting a correct response.
> > >>
> > >> > freqmodel2<-bglmer(as.factor(RESP)~Type*relevel(masterPT$Group, ref
> =
> > >> "L2")+(1+Type|SUBJ)+(1+Group|ITEM), family = binomial(link="logit"),
> > >> data=masterPT, control=glmerControl(optimizer = "bobyqa"), nAGQ=1)
> > >> > summary(freqmodel2)
> > >> Cov prior  : SUBJ ~ wishart(df = 4.5, scale = Inf, posterior.scale =
> cov,
> > >> common.scale = TRUE)
> > >>            : ITEM ~ wishart(df = 4.5, scale = Inf, posterior.scale =
> cov,
> > >> common.scale = TRUE)
> > >> Prior dev  : -5.3929
> > >>
> > >> Generalized linear mixed model fit by maximum likelihood (Laplace
> > >> Approximation) ['bglmerMod']
> > >>  Family: binomial  ( logit )
> > >> Formula: as.factor(RESP) ~ Type * relevel(masterPT$Group, ref = "L2")
> +
> > >>  (1 + Type | SUBJ) + (1 + Group | ITEM)
> > >>    Data: masterPT
> > >> Control: glmerControl(optimizer = "bobyqa")
> > >>
> > >>      AIC      BIC   logLik deviance df.resid
> > >>   1676.8   1734.1   -828.4   1656.8     2265
> > >>
> > >> Scaled residuals:
> > >>     Min      1Q  Median      3Q     Max
> > >> -4.1192 -0.3447 -0.1315  0.1106  4.8595
> > >>
> > >> Random effects:
> > >>  Groups Name        Variance Std.Dev. Corr
> > >>  SUBJ   (Intercept) 1.915    1.384
> > >>         Type2       1.642    1.281    -0.74
> > >>  ITEM   (Intercept) 8.970    2.995
> > >>         Group2      6.479    2.545    -0.74
> > >> Number of obs: 2275, groups:  SUBJ, 65; ITEM, 35
> > >>
> > >> Fixed effects:
> > >>                                             Estimate Std. Error z
> value
> > >> Pr(>|z|)
> > >> (Intercept)                                  -1.3577     0.4947
> -2.745
> > >>  0.00606 **
> > >> Type2                                         1.4683     0.8340
>  1.760
> > >>  0.07834 .
> > >> relevel(masterPT$Group, ref = "L2")L1        -1.3802     0.6907
> -1.998
> > >>  0.04567 *
> > >> Type2:relevel(masterPT$Group, ref = "L2")L1  -4.0366     1.2383
> -3.260
> > >>  0.00112 **
> > >> ---
> > >> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> > >>
> > >> Correlation of Fixed Effects:
> > >>             (Intr) Type2  r(Pr="
> > >> Type2       -0.546
> > >> r(PT$G,r="L -0.325  0.157
> > >> T2:(PT$Gr="  0.151 -0.221 -0.442
> > >>
> > >>
> > >> What am I missing here?
> > >> Am I interpreting something wrong?
> > >>
> > >> Many thanks in advance for any help,
> > >>
> > >> Best,
> > >>
> > >> Francesco Romano PhD
> > >> _______________________________________________
> > >> R-sig-mixed-models using r-project.org mailing list
> > >> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> > >>
> > >
> >
> >         [[alternative HTML version deleted]]
> >
> > _______________________________________________
> > R-sig-mixed-models using r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>

	[[alternative HTML version deleted]]



More information about the R-sig-mixed-models mailing list