[R-sig-ME] Different lmer results using contrasts() vs numeric coding

Dan McCloy drmccloy at uw.edu
Tue Jan 26 22:48:06 CET 2016


Using numeric variables is not the same as hand-coding the contrasts. If
you pass in a numeric variable, the modeling function will treat it as a
continuous predictor, and you will get a single slope coefficient no matter
how many "levels" it represents. Try something like

contrasts(data$type) <- cbind(type1 = c(-2, 1, 1), type2 = c(0, 1, -1))  # one row per factor level (SC, IC, IIH)

to specify the contrast matrix by hand.
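
As a quick illustration of the difference (a toy sketch, not your data):

f <- factor(c("SC", "IC", "IIH"), levels = c("SC", "IC", "IIH"))
contrasts(f) <- cbind(type1 = c(-2, 1, 1), type2 = c(0, 1, -1))
x <- c(-2, 1, 1)      # the same rows "hand-coded" as one numeric variable

model.matrix(~ f)     # intercept plus two contrast columns (type1, type2)
model.matrix(~ x)     # intercept plus a single numeric slope column
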
Dear R Mixed Models List,



I'm working on an LMM for psycholinguistic data with a 3x2 fixed-effects
structure and crossed subject and item random effects. I ran into a
confusing result when I compared using contrasts() on the fixed factor
variables vs. 'hand-coding' these contrasts as numeric variables (carrying
the same values that contrasts() assigns).
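
For reference, the numeric versions correspond to something like the
following (just a sketch matching the contrast values shown further down,
not necessarily how they were built):

data$type1    <- ifelse(data$type == "SC", -2, 1)
data$type2    <- ifelse(data$type == "IC", 1, ifelse(data$type == "IIH", -1, 0))
data$priming1 <- ifelse(data$priming == "primed", 1, -1)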



The summaries for the two models that include all fixed factor terms are
identical. Here is the syntax used for each:



modelHandCoded <- lmer(invRT ~ 1 + type1 + type2 + priming1 +
type1:priming1 + type2:priming1 + (1|pp) + (1|word), data = data)



# numeric variables to define contrasts

table(data$type1)
  -2    1
1503 3014

table(data$type2)
  -1    0    1
1520 1503 1494

table(data$priming1)
  -1    1
2253 2264



modelContrasts <- lmer(invRT ~ 1 + type + priming + type:priming + (1|pp) +
(1|word), data = data)



# factor variables with contrasts

contrasts(data$type)          # [,1] = 'type1' above, [,2] = 'type2' above
    [,1] [,2]
SC    -2    0
IC     1    1
IIH    1   -1

contrasts(data$priming)       # [,1] = 'priming1' above
         [,1]
unprimed   -1
primed      1



However, when I remove the effect of the 3-level fixed factor 'type' (while
still including its interaction with the 2-level factor 'priming'), the two
models no longer produce the same results. Here is the syntax for the two
models without 'type':



modelHandCoded.NoType <- lmer(invRT ~ 1 + priming1 + type1:priming1 +
type2:priming1 + (1|pp) + (1|word), data = data)



modelContrasts.NoType <- lmer(invRT ~ 1 + priming + type:priming + (1|pp) +
(1|word), data = data)
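
For reference, what each reduced formula actually expands to can be checked
with something like this (a sketch, using the same data frame):

colnames(model.matrix(~ 1 + priming1 + type1:priming1 + type2:priming1, data))
colnames(model.matrix(~ 1 + priming + type:priming, data))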



The summary for the hand-coded model includes the 3 fixed effects I
expected (priming1, type1:priming1, type2:priming1):



summary(modelHandCoded.NoType)

...

Fixed effects:
                 Estimate Std. Error         df t value Pr(>|t|)
(Intercept)     1.428e+00  3.244e-02  2.900e+01  44.031   <2e-16 ***
priming1        8.878e-03  3.572e-03  4.384e+03   2.485    0.013 *
type1:priming1  8.416e-04  2.527e-03  4.385e+03   0.333    0.739
type2:priming1 -1.790e-04  4.373e-03  4.383e+03  -0.041    0.967



However, the summary for the contrasts() model includes separate interaction
terms for each level of priming:



summary(modelContrasts.NoType)

...

Fixed effects:
                 Estimate Std. Error         df t value Pr(>|t|)
(Intercept)     1.428e+00  3.243e-02  2.900e+01  44.046   <2e-16 ***
priming1        8.872e-03  3.572e-03  4.386e+03   2.484   0.0130 *
priming0:type1  6.289e-03  4.811e-03  3.130e+02   1.307   0.1921
priming1:type1  7.957e-03  4.802e-03  3.110e+02   1.657   0.0985 .
priming0:type2 -9.782e-03  8.323e-03  3.120e+02  -1.175   0.2408
priming1:type2 -1.009e-02  8.316e-03  3.110e+02  -1.213   0.2261



When I compare each reduced model to the full model, I find that there's a
difference between the full model and the reduced hand-coded model, but not
between the full model and the reduced model using contrasts(). The
Df/AIC/BIC/logLik for the latter two models are identical, so it appears
that removing the 'type' term had no effect. (This is true for comparisons
with both the hand-coded and contrasts() versions of the full model.) Here
are the results of the anova() for each comparison:
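
(The comparisons were made with calls along these lines; a sketch, since
either version of the full model gives the same result:)

anova(modelHandCoded, modelHandCoded.NoType)   # full vs. reduced, hand-coded
anova(modelContrasts, modelContrasts.NoType)   # full vs. reduced, contrasts()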



Model with full fixed-effects structure vs. hand-coded model with 'type'
removed

       Df    AIC    BIC logLik deviance  Chisq Chi Df Pr(>Chisq)
..1     7 168.14 213.05 -77.07   154.14
object  9 167.20 224.94 -74.60   149.20 4.9406      2    0.08456 .



Model with full fixed-effects structure vs. contrast-coded model with
'type' removed

       Df   AIC    BIC logLik deviance Chisq Chi Df Pr(>Chisq)
object  9 167.2 224.94  -74.6    149.2
..1     9 167.2 224.94  -74.6    149.2     0      0  < 2.2e-16 ***



Can anyone explain why the two reduced models differ depending on whether
the fixed factor variables are hand-coded numeric vs. factors with
contrasts() assigned? Also, why is there no effect of removing a fixed
factor term when contrasts() are used?  Apologies if I'm missing something
obvious!



Thanks,

Becky

_____________________________________________________

Dr Becky Gilbert

Research Associate

Psychology and Language Sciences

University College London

London WC1H 0AP
