[R-sig-ME] To conjoin or not to conjoin factorial variables?

Mon Oct 19 01:28:37 CEST 2009

On Sun, Oct 18, 2009 at 3:26 PM, Antoine Tremblay <trea26 at gmail.com> wrote:
> On Sun, Oct 18, 2009 at 12:24 PM, Douglas Bates <bates at stat.wisc.edu> wrote:
>> On Sat, Oct 17, 2009 at 10:33 AM, Antoine Tremblay <trea26 at gmail.com> wrote:
>>> Hello all,
>>>
>>> We are interested in an interaction between FACTOR A (levels "a" and
>>> "b"), FACTOR B (levels "c" and "d") and variable TIME (which we model
>>> with a 5 knot restricted cubic spline). That is:
>>>
>>>              m1=lmer(LogRT~A*B*rcs(TIME,5)+(1|Subject)+(1|Item)+(1|TIME)+(0+TIME|Subject),data=dat)
>>>          (1)
>>
>> I'm not sure I understand the model specification.  Is TIME numeric or
>> a factor or ...?
>
> TIME is a numeric variable ranging from 1 to 1600.
>
>> It is unusual to have a term (1|TIME), which would
>> indicate that TIME is a factor with a large number of levels, and
>> another term of the form (0+TIME|Subject), which would indicate that
>> TIME is a continuous covariate or a factor with a small number of
>> levels.
>
> Please correct me if I'm wrong, but I understood from Baayen, Davidson
> & Bates (2008) [Baayen, R.H., Davidson, D.J. & Bates, D.M. (2008).
> Mixed-effects modeling with crossed random effects for subjects and
> items. Journal of Memory and Language, 59, Special Issue: Emerging
> Data Analysis Techniques, 390-412] that putting a variable in the
> random-effects structure (in LMER) can model potential
> heteroscedasticity in that variable.

> So I tested (using the log-likelihood ratio test) whether having
> (1|TIME) in the model was warranted or not and it was. I thus took
> this to mean that there was significant heteroscedasticity in TIME
> (i.e., the difference between the mean variance of all time points and
> the variance of each time-point is big enough to be statistically
> significant).

> Regarding (0+TIME|Subject), I used a log-likelihood ratio test to
> see whether the slope for TIME differed significantly between
> subjects (i.e., heteroscedasticity regarding the slopes for TIME).

I'm afraid I am still confused.  If TIME is on a continuous scale then
(1|TIME) doesn't make sense.  If TIME is a factor then
(0+TIME|Subject) will result in an attempt to estimate a huge
variance-covariance matrix.
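
To make the size of the problem concrete, here is a minimal base-R sketch. It assumes TIME takes the integer values 1 to 1600 (the range given above; the actual number of distinct values in dat is not known), and the names n_levels, n_varcov and time_toy are purely illustrative.

```r
# Assumption: TIME takes the integer values 1..1600, as described in the thread.
n_levels <- length(unique(1:1600))

# A term (0 + f | g) with a k-level factor f asks for a k x k
# variance-covariance matrix: k variances plus k * (k - 1) / 2 covariances.
n_varcov <- n_levels * (n_levels + 1) / 2
n_varcov

# Toy illustration of the two readings of TIME (8 values instead of 1600):
time_toy <- 1:8
cols_numeric <- ncol(model.matrix(~ 0 + time_toy))          # one slope column
cols_factor  <- ncol(model.matrix(~ 0 + factor(time_toy)))  # one column per level
c(numeric = cols_numeric, factor = cols_factor)
```

With TIME numeric, (0+TIME|Subject) contributes a single variance for the by-subject slope, whereas (1|TIME) then treats every distinct TIME value as its own group.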

>>> Because (i) plotLMER.fnc cannot plot 3-way interactions,
>>
>> Are you referring to a function in the languageR package?
>
> Yes, I am referring to plotLMER in the languageR package.
>
>>
>>> and (ii) we
>>> are unable to look at the contrasts of interest, which are ("ac" vs.
>>> "bc"), ("ad" vs. "bd"), ("bc" vs. "bd"), and ("ac" vs. "ad"), we
>>> decided to collapse factors A and B into a new variable ConjVar with 4
>>> levels "ac", "ad", "bc", and "bd". The model thus becomes:
>>>
>>>              m2=lmer(LogRT~ConjVar*rcs(TIME,5)+(1|Subject)+(1|Item)+(1|TIME)+(0+TIME|Subject),data=dat)
>>>          (2)
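
A rough sketch of the conjoining step and of how the two fixed-effects specifications relate. Everything here is an assumption made for illustration: the data are simulated, lm() stands in for lmer() (so the random effects are ignored), and splines::ns() with df = 4 stands in for rcs(TIME,5) from the original models.

```r
library(splines)  # ns(): natural cubic spline, used here as a stand-in for rcs()

set.seed(1)
n <- 400
dat <- data.frame(
  A    = factor(sample(c("a", "b"), n, replace = TRUE)),
  B    = factor(sample(c("c", "d"), n, replace = TRUE)),
  TIME = runif(n, 1, 1600)
)
dat$LogRT <- rnorm(n)  # simulated response with no real effects

# Conjoin A and B into one 4-level factor: "ac", "ad", "bc", "bd"
dat$ConjVar <- interaction(dat$A, dat$B, sep = "", lex.order = TRUE)

f1 <- lm(LogRT ~ A * B * ns(TIME, df = 4), data = dat)
f2 <- lm(LogRT ~ ConjVar * ns(TIME, df = 4), data = dat)

# Both formulas span the same fixed-effects space, so the fits coincide
same_fit <- isTRUE(all.equal(unname(fitted(f1)), unname(fitted(f2))))
n_coef   <- c(length(coef(f1)), length(coef(f2)))

# Degrees of freedom of the highest-order interaction row in each anova table
df_three_way <- anova(f1)$Df[7]  # A:B:ns(TIME) only
df_conj      <- anova(f2)$Df[3]  # ConjVar:ns(TIME)
c(df_three_way = df_three_way, df_conj = df_conj)
```

In this sketch the two fits are reparametrizations of one another, but the interaction rows do not test the same hypothesis: the ConjVar-by-spline row pools the A-by-time, B-by-time and A-by-B-by-time terms, while the three-way row in the first table covers only the last of these. Whether this fully accounts for the discrepancy reported for the lmer fits is not something the sketch can establish.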
>>>
>>> We find significant differences in the first spline only between
>>> levels "ac" and "bc", between "ad" and "bd", between "bc" and "bd",
>>> but not between "ac" and "ad". Having the ConjVar also enables us to
>>> plot the ConjVar*rcs(TIME,5) interaction with plotLMER.fnc():
>>>
>>>              plotLMER.fnc(m2,pred="TIME",intr=list("ConjVar",levels(dat$ConjVar),"mid",list(1:4,rep(1,4))),lwd=2)
>>>          (3)
>>>
>>> Now, here comes the part we don't understand.
>>>
>>> If we do "anova(m1)", the interaction A*B*rcs(TIME,5) is not
>>> significant, but if we look at the table returned by "anova(m2)", then
>>> the ConjVar*rcs(TIME,5) interaction is highly significant. The
>>> questions we have are the following:
>>>
>>>   (i)  Is it correct to conjoin factors A and B into ConjVar and run
>>> our analyses using this variable?
>>>
>>>   (ii) Why is the interaction A*B*rcs(TIME,5) not significant in (1)
>>> but highly significant in (2)?
>>>
>>>   (iii) Would the proper steps here rather be:
>>>                        (I) run the model with A*B*rcs(TIME,5) and see
>>> if this interaction is significant
>>>                            (as shown in the "anova(m1)" table);
>>>                        (II) and, if it is significant, then refit a
>>> model with the conjoined variable ConjVar and
>>>                             determine where the actual differences
>>> are and plot them?
>>>
>>> Thank you very much for your time,
>>>
>>> --
>>> Antoine Tremblay
>>> Department of Neuroscience
>>> Georgetown University
>>> Washington DC
>>>
>>> _______________________________________________
>>> R-sig-mixed-models at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>