[R-sig-ME] Contrasts for interactions in lmer
Reinhold Kliegl
reinhold.kliegl at gmail.com
Fri Aug 13 13:57:11 CEST 2010
This is not a list about eye-movement analyses, but very briefly I
would very strongly advise against including non-fixated words with a
zero value in any such analysis. There are many applications for
non-negative DVs where incrementing the DV by one to get around zero
before log-transforming them is perfectly ok. Fixation durations
simply do not belong to these measures.
I see two options. One is the one you already tried, that is, to
restrict the analysis to fixated words (and discuss this appropriately
and document the limits of generalizability of your results). Your
residual plot shows that from a statistical point of view this looks
ok. The fact that the interaction is not significant could well be
related to the reduced statistical power you have after throwing out
the zero-fixation cases, but it may also be that the interaction is
simply not there.
Alternatively, you might want to switch to a dichotomous
variable for the critical words (fixated vs. not fixated), which would
require a generalized linear mixed model (fixated ~ . , family =
binomial). This way you could keep all data in the analysis.
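
The two options can be sketched in R roughly as follows. This is only a minimal illustration on simulated stand-in data: the data frame d, the proportion of zeros, and the reduced formula with COND alone are assumptions made for the example, not the actual data or model. The model fits are guarded so the data preparation runs even where lme4 is not installed.

```r
# Simulated stand-in data; every name and value here is an illustrative assumption.
set.seed(1)
d <- expand.grid(SUBJECT = factor(1:20), ITEM = factor(1:12))
d$COND <- factor(ifelse((as.integer(d$SUBJECT) + as.integer(d$ITEM)) %% 2 == 0,
                        "a", "b"))
# Re-reading time: 0 when the word was not re-fixated, positive otherwise
d$RRT <- ifelse(runif(nrow(d)) < 0.4, 0, rlnorm(nrow(d), meanlog = 5, sdlog = 0.5))

# Option 1: keep only fixated words, so log(RRT) is defined without adding a constant
d_fix <- subset(d, RRT > 0)

# Option 2: dichotomous DV (fixated vs. not fixated), keeping all observations
d$fixated <- as.integer(d$RRT > 0)

if (requireNamespace("lme4", quietly = TRUE)) {
  # Option 1: linear mixed model on log durations of fixated words only
  m_log <- lme4::lmer(log(RRT) ~ COND + (1 | SUBJECT) + (1 | ITEM), data = d_fix)
  # Option 2: logistic mixed model on the fixated/not-fixated indicator
  m_bin <- lme4::glmer(fixated ~ COND + (1 | SUBJECT) + (1 | ITEM),
                       data = d, family = binomial)
  print(summary(m_bin))
}
```

With simulated data that contain no real subject or item variability, lme4 may report a singular fit; that is a property of the toy data, not of the approach.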
Reinhold Kliegl
On Fri, Aug 13, 2010 at 1:15 PM, Paul Metzner <paul.metzner at gmail.com> wrote:
> First, to clear things up, RRT is not reciprocal reading time, but re-reading time (time spent on a region after it was fixated and left).
>
>> I saw that now you use log-transformed DVs and that in my
>> experience is a good choice for durations collected in eye tracking.
>> Nevertheless you should check the distribution of model residuals to
>> back up this decision. Anyway, the log-transformation of RRT may have
>> lifted the t-value for the PCU X COND interaction. So I am curious
>> whether it did or not?
>
> I am very unsure whether or not I should log-transform. Because most of the duration variables include zeroes, and because these zeroes are informative as well, I don't want to exclude them. To still be able to log-transform, I would have to add a constant value, and I am uncertain whether that is a valid operation. The plot of fitted values against residuals doesn't look good either with raw data or with log-transformed data. The distribution of the model residuals looks at least ok when zeroes are excluded (and even better with log(RRT)), but that would no longer be the DV that I am interested in. See these plots for illustration: http://amor.rz.hu-berlin.de/~metznerp/Rplot.pdf
> However, the log-transformation did indeed lift the t-value for PCU x COND, but only marginally (from 1.492 to 1.581). Generally, the log-transform seems to have little impact on the main effects and interactions (two or three effects disappear without log-transformation).
>
>
>
> On 13 Aug 2010, at 12:25, Reinhold Kliegl wrote:
>
>> Maybe I responded too quickly ...
>>
>> First, I guess for a two-level factor a sum contrast can also be
>> called a Helmert contrast; it is a bit unusual, I think.
>>
>> Second, the story about the fixed-effect correlations is complicated.
>> What I wrote for balanced designs returning a zero correlation matrix
>> of fixed effects assumes also that contrasts for the fixed effects are
>> orthogonal and that the variance components are specified only for the
>> intercepts, as you had set up your first model. If you specify a
>> non-orthogonal set of treatment contrasts, the fixed-effects
>> correlations will be 0.5. Thus, these correlations inform about the
>> correlations of the predictors in the model matrix.
>> Moreover, the story changes again if you estimate values for
>> parameters representing (co-)variance components for random effects:
>> in a balanced design, the fixed-effect correlations then return values that
>> (sometimes?) are close to within-subject correlations (i.e.,
>> correlations unadjusted for shrinkage); maybe for balanced designs
>> with orthogonal predictors there is actually a specification under
>> which they are actually identical with them. This would be cool.
>>
>> Third, I think the fixed-effect part of the model you give now looks
>> fine; it is defensible (and sometimes necessary) to exclude
>> non-significant higher-order interactions. I still don't think you
>> need the variance components associated with COND and DIR for subjects
>> and you may be communicating the wrong thing, but opinions may differ
>> on this, because non-significance of these components is a thorny
>> issue.
>> In this case, as far as I can see, these correlations are not
>> really related to the COND x PCU interaction you are interested in.
>> Significant effect correlations can be mapped onto a different kind
>> of interaction (e.g., subjects with a large COND effect may tend to
>> have a larger DIR effect than subjects with a small COND effect), but
>> this does not bear on your PCU covariate, at least not directly. (This
>> could happen independent of a DIR x COND interaction in the
>> fixed-effect part of the model.)
>> I saw that now you use log-transformed DVs and that in my
>> experience is a good choice for durations collected in eye tracking.
>> Nevertheless you should check the distribution of model residuals to
>> back up this decision. Anyway, the log-transformation of RRT may have
>> lifted the t-value for the PCU X COND interaction. So I am curious
>> whether it did or not?
>>
>> Reinhold Kliegl
>>
>>
>> On Fri, Aug 13, 2010 at 10:20 AM, Paul Metzner <paul.metzner at gmail.com> wrote:
>>> Thank you for the quick answer!
>>>
>>>> (1) The Fixed Effects correlations are probably not what you are
>>>> after. For example, in a perfectly balanced design, these correlations
>>>> will be zero.
>>>
>>> They are not, but like you suggested, I wanted them to be at least close to zero. When I changed the model like mentioned before, I noticed an increase in fixed effects correlations and a curious change in contrast coding (see below), that I couldn't explain.
>>> My main interest is the fixed-effects interactions. My hypothesis is that subjects with a higher PCU will be affected more strongly by the condition manipulation. Also, in some studies only one kind of verb (DIR) has been shown to evoke the effect, hence the desired interaction of COND and DIR. But, because I really don't want individual differences over and above what is explained by PCU, I implemented the random effect term as you suggested and re-included the factors contributing to the interactions. My model now looks like this:
>>>
>>> lmer(log(RRT)~COND + PCU + COND:PCU + DIR + COND:DIR + (1+COND+DIR|SUBJECT) + (1|ITEM), data=fm3)
>>>
>>> Although including the covariance component did not improve model fit, I decided to leave it in the model for the reasons mentioned above. I did, however, exclude the three-way interaction COND:DIR:PCU.
>>>
>>>> (3) You used a sum contrast specification for the two factors (COND
>>>> and DIR). This is fine. For two-level factors there is no point in
>>>> specifying Helmert contrasts. So it is unclear what you are referring to
>>>> in this context.
>>>
>>> Being a novice to contrast coding, I thought it was the same. Coincidentally, that seems to be the case for two-level factors. Thanks again for the suggestions!
>>>
>>> Paul
>>>
>>>
>>> On 12 Aug 2010, at 11:24, Reinhold Kliegl wrote:
>>>
>>>> There is a bit of evidence for an interaction of COND and PCU:
>>>>>> COND1:PCU 48.309 29.850 1.618
>>>> If the t-value were larger it would indicate that slopes for the
>>>> regression of RRT on PCU differ between the two conditions.
>>>>
>>>> There is no statistical support for the interaction of DIR and PCU
>>>>>> PCU:DIR1 -26.835 29.814 -0.900
>>>>
>>>> Now to some of your questions relating to correlations:
>>>> (1) The Fixed Effects correlations are probably not what you are
>>>> after. For example, in a perfectly balanced design, these correlations
>>>> will be zero.
>>>>
>>>> (2) I suspect what you might be after are effect correlations related
>>>> to subjects or items. Assuming cond and verb bias are within-subject
>>>> effects, you could get an estimate of the parameter for the covariance
>>>> component with the following specification.
>>>> RRT ~ COND * PCU * DIR + (1 + COND + DIR | SUBJECT) + (1 | ITEM)
>>>>
>>>> You should check whether adding these variance components to the model
>>>> improves the goodness of fit, for example with an ANOVA.
>>>>
>>>> (3) You used a sum contrast specification for the two factors (COND
>>>> and DIR). This is fine. For two-level factors there is no point in
>>>> specifying Helmert contrasts. So it is unclear what you are referring to
>>>> in this context.
>>>>
>>>> Finally, it is generally a bad idea to specify models with
>>>> interaction terms leaving out the factors contributing to the
>>>> interactions. If you do so, you need to have very good theoretical
>>>> reasons.
>>>>
>>>> Reinhold Kliegl
>>>>
>>>>
>>>> On Thu, Aug 12, 2010 at 10:44 AM, Paul Metzner <paul.metzner at gmail.com> wrote:
>>>>> Dear all.
>>>>>
>>>>> I am currently analyzing eye-tracking data and am interested in a main effect of condition (COND) plus its interaction with subjects' operation span (PCU) and the direction of a verb bias (1 or 2). The contrasts are:
>>>>>
>>>>>> contrasts(COND)
>>>>>>   [,1]
>>>>>> a   -1
>>>>>> b    1
>>>>>
>>>>> and
>>>>>
>>>>>> contrasts(DIR)
>>>>>>   [,1]
>>>>>> 1   -1
>>>>>> 2    1
>>>>>
>>>>> PCU is a continuous predictor which I centered by subtracting the mean (the problem does, however, persist when I split the sample into extreme groups and work with a categorical predictor). With the following model, I don't get a correlation between the fixed effects:
>>>>>
>>>>>> Linear mixed model fit by REML
>>>>>> Formula: RRT ~ COND * PCU * DIR + (1 | SUBJECT) + (1 | ITEM)
>>>>>> Data: fm3
>>>>>>    AIC   BIC logLik deviance REMLdev
>>>>>>  46733 46801 -23355    46768   46711
>>>>>> Random effects:
>>>>>>  Groups   Name        Variance Std.Dev.
>>>>>>  SUBJECT  (Intercept)  8918.29  94.437
>>>>>>  ITEM     (Intercept)   404.85  20.121
>>>>>>  Residual             34881.69 186.766
>>>>>> Number of obs: 3503, groups: SUBJECT, 59; ITEM, 59
>>>>>>
>>>>>> Fixed effects:
>>>>>>                Estimate Std. Error t value
>>>>>> (Intercept)     122.900     12.963   9.481
>>>>>> COND1            15.924      3.165   5.031
>>>>>> PCU             139.411    120.025   1.162
>>>>>> DIR1             -7.746      4.107  -1.886
>>>>>> COND1:PCU        48.309     29.850   1.618
>>>>>> COND1:DIR1       -3.396      3.164  -1.073
>>>>>> PCU:DIR1        -26.835     29.814  -0.900
>>>>>> COND1:PCU:DIR1   -8.069     29.838  -0.270
>>>>>>
>>>>>> Correlation of Fixed Effects:
>>>>>>             (Intr) COND1  PCU    DIR1   COND1:PCU COND1:D PCU:DI
>>>>>> COND1        0.002
>>>>>> PCU          0.004 -0.001
>>>>>> DIR1         0.002 -0.004  0.004
>>>>>> COND1:PCU   -0.001 -0.001  0.003  0.000
>>>>>> COND1:DIR1  -0.001  0.000  0.000  0.007  0.021
>>>>>> PCU:DIR1     0.005  0.000 -0.003  0.000 -0.009    -0.005
>>>>>> COND1:PCU:D  0.000  0.021 -0.002 -0.004 -0.009    -0.001   0.011
>>>>>
>>>>> But, since I'm mainly interested in the interactions and not so much the main effects of PCU and DIR, I changed the model to the following:
>>>>>
>>>>>> Linear mixed model fit by REML
>>>>>> Formula: RRT ~ COND + COND:PCU + COND:DIR + (1 | SUBJECT) + (1 | ITEM)
>>>>>> Data: fm3
>>>>>>    AIC   BIC logLik deviance REMLdev
>>>>>>  46744 46800 -23363    46769   46726
>>>>>> Random effects:
>>>>>>  Groups   Name        Variance Std.Dev.
>>>>>>  SUBJECT  (Intercept)  8911.15  94.399
>>>>>>  ITEM     (Intercept)   406.16  20.153
>>>>>>  Residual             34869.91 186.735
>>>>>> Number of obs: 3503, groups: SUBJECT, 59; ITEM, 59
>>>>>>
>>>>>> Fixed effects:
>>>>>>             Estimate Std. Error t value
>>>>>> (Intercept)  122.962     12.959   9.489
>>>>>> COND1         15.941      3.164   5.039
>>>>>> CONDa:PCU     91.049    123.553   0.737
>>>>>> CONDb:PCU    187.055    123.714   1.512
>>>>>> CONDa:DIR1    -4.340      5.168  -0.840
>>>>>> CONDb:DIR1   -11.160      5.204  -2.144
>>>>>>
>>>>>> Correlation of Fixed Effects:
>>>>>>            (Intr) COND1  CONDa:PCU CONDb:PCU CONDa:DIR1
>>>>>> COND1       0.002
>>>>>> CONDa:PCU   0.004 -0.001
>>>>>> CONDb:PCU   0.004 -0.001  0.883
>>>>>> CONDa:DIR1  0.002 -0.003  0.006     0.000
>>>>>> CONDb:DIR1  0.001 -0.003  0.000     0.006     0.256
>>>>>
>>>>> Now I do get a considerable correlation between the interactions. From the output (CONDa:…, CONDb:…), I infer that the model didn't always use Helmert coding for condition but applied something else for the interactions. Is that right? When I code COND numerically as -1 and 1, the correlations turn out fine, which supports my conclusion. I would be very grateful for suggestions.
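
The change in coding can be reproduced with base R alone. In R's formula machinery, a factor inside an interaction is coded by its contrasts only when the corresponding lower-order term is also in the model; otherwise it is expanded into one indicator column per level. The sketch below uses a made-up balanced toy design (the data frame d, SUBJ, and CONDnum are assumptions for illustration) and only inspects the column names of the model matrix.

```r
# Toy balanced design; the data are made up, only the column names matter here.
d <- expand.grid(COND = factor(c("a", "b")),
                 DIR  = factor(c("1", "2")),
                 SUBJ = 1:6)
d$PCU <- d$SUBJ - mean(d$SUBJ)   # centered continuous covariate

# Sum contrasts as in the post: first level -1, second level +1
contrasts(d$COND) <- matrix(c(-1, 1), ncol = 1)
contrasts(d$DIR)  <- matrix(c(-1, 1), ncol = 1)

# Full factorial: every interaction has its lower-order terms, so contrasts are used
full <- colnames(model.matrix(~ COND * PCU * DIR, data = d))

# Main effects of PCU and DIR dropped: COND is expanded to indicator columns
# (CONDa, CONDb) inside COND:PCU and COND:DIR
reduced <- colnames(model.matrix(~ COND + COND:PCU + COND:DIR, data = d))

# Numeric -1/+1 coding of COND: a single product column, no expansion
d$CONDnum <- ifelse(d$COND == "a", -1, 1)
numeric_coding <- colnames(model.matrix(~ CONDnum + CONDnum:PCU, data = d))

print(full)
print(reduced)
print(numeric_coding)
```

Under this reading, CONDa:PCU and CONDb:PCU are separate PCU slopes within each condition rather than a single interaction contrast, which is consistent with the numeric -1/+1 coding behaving differently.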
>>>>>
>>>>> Thanks,
>>>>> Paul
>>>>>
>>>>> ---
>>>>> Paul Metzner
>>>>>
>>>>> Humboldt-Universität zu Berlin
>>>>> Philosophische Fakultät II
>>>>> Institut für deutsche Sprache und Linguistik
>>>>>
>>>>> Post: Unter den Linden 6 | 10099 Berlin | Deutschland
>>>>> Besuch: Dorotheenstraße 24 | 10117 Berlin | Deutschland
>>>>>
>>>>> +49-(0)30-2093-9726
>>>>> paul.metzner at gmail.com
>>>>> http://amor.rz.hu-berlin.de/~metznerp/
>>>>>
>>>>> _______________________________________________
>>>>> R-sig-mixed-models at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>>
>>>
>>>
>
>
> ---
> Paul Metzner
> Manfred-von-Richthofen-Str. 13
> 12101 Berlin
> Deutschland
>
> Tel.: +49-(0)30-6730-9220
> Mobil: +49-(0)17-8288-1059
>
> paul.metzner at gmail.com
> http://amor.rz.hu-berlin.de/~metznerp/
>
>