[R-sig-ME] time*treatment vs time + time:treatment in RCTs

Mon Aug 29 17:11:28 CEST 2022

I strongly suspect that 'time' is treated as a factor in the examples Jorge is referring to. In this case, the two formulations are just different parameterizations of the same model. We can use the 'Orthodont' data to illustrate this. Think of 'age' as the time variable (as a four-level factor) and 'Sex' as the treatment variable (as a two-level factor). In fact, I will throw in a third parameterization, which I think is even more intuitive.

library(lme4)

data("Orthodont", package="nlme")

Orthodont$age <- factor(Orthodont$age)

res1 <- lmer(distance ~ age*Sex + (1 | Subject), data=Orthodont)
summary(res1)

res2 <- lmer(distance ~ age + age:Sex + (1 | Subject), data=Orthodont)
summary(res2)

res3 <- lmer(distance ~ 0 + age + age:Sex + (1 | Subject), data=Orthodont)
summary(res3)

logLik(res1)
logLik(res2)
logLik(res3)

The fit is identical.
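
Equal log-likelihoods already suggest this, but one can also compare the fitted values directly. The following is a minimal sketch of such a check; the object names 'dat', 'n_fixed', 'p1' to 'p3', and 'same_fit' are my own choices for illustration, and I convert the data to a plain data frame first just to keep things simple.

```r
library(lme4)

data("Orthodont", package = "nlme")
dat <- as.data.frame(Orthodont)   # drop the groupedData classes
dat$age <- factor(dat$age)        # treat time as a four-level factor

res1 <- lmer(distance ~ age * Sex + (1 | Subject), data = dat)
res2 <- lmer(distance ~ age + age:Sex + (1 | Subject), data = dat)
res3 <- lmer(distance ~ 0 + age + age:Sex + (1 | Subject), data = dat)

# each parameterization uses the same number of fixed effects
n_fixed <- sapply(list(res1, res2, res3), function(m) length(fixef(m)))
n_fixed

# population-level predictions (random effects set to zero)
p1 <- predict(res1, re.form = NA)
p2 <- predict(res2, re.form = NA)
p3 <- predict(res3, re.form = NA)

# TRUE if all three models give the same predictions
same_fit <- isTRUE(all.equal(p1, p2, tolerance = 1e-6, check.attributes = FALSE)) &&
            isTRUE(all.equal(p1, p3, tolerance = 1e-6, check.attributes = FALSE))
same_fit
```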

In 'res3', the 'age' coefficients are the estimated means of the reference group (in this case 'Male') at each of the 4 timepoints, and the age:Sex coefficients are the differences between the Female and Male groups at those 4 timepoints.
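
To make this interpretation concrete, here is a small sketch that compares the 'res3' coefficients with the raw cell means. Since the Orthodont data are balanced and complete, I would expect the two to coincide; with missing data or unbalanced designs the coefficients are model-based estimates and need not match the raw means. The names 'dat', 'cell_means', 'male_est', and 'diff_est' are just illustrative.

```r
library(lme4)

data("Orthodont", package = "nlme")
dat <- as.data.frame(Orthodont)   # drop the groupedData classes
dat$age <- factor(dat$age)

res3 <- lmer(distance ~ 0 + age + age:Sex + (1 | Subject), data = dat)

# raw cell means: rows are the ages, columns are Male and Female
cell_means <- with(dat, tapply(distance, list(age, Sex), mean))

b <- fixef(res3)
ages <- levels(dat$age)

# coefficients for the reference group (Male) at each age
male_est <- unname(b[paste0("age", ages)])

# coefficients for the Female minus Male difference at each age
diff_est <- unname(b[paste0("age", ages, ":SexFemale")])

# side-by-side comparison with the raw means
cbind(male_est,
      male_raw = cell_means[, "Male"],
      diff_est,
      diff_raw = cell_means[, "Female"] - cell_means[, "Male"])
```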

Since these are all just different parameterizations of the same model, there is no reason for preferring one over the other.

One has to be careful though when using anova() on those models, esp. with respect to the age:Sex test. In anova(res1), the test examines if the difference between males and females is the same at all 4 timepoints, while in anova(res2) and anova(res3) the test examines if the difference is 0 at all 4 timepoints. However, one could get either test out of all three parameterizations, by forming appropriate contrasts. So again, no reason to prefer one over the other (except maybe convenience depending on what one would like to test).
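
As a rough sketch of what "forming appropriate contrasts" could look like, the code below computes Wald-type chi-square statistics for both hypotheses from 'res1' and from 'res3'. The helper 'wald()' is something I wrote for this illustration (it is not part of lme4), and the statistics are simple chi-square versions that ignore any denominator degrees of freedom, so treat this as a demonstration of the idea rather than a recommended testing procedure.

```r
library(lme4)

data("Orthodont", package = "nlme")
dat <- as.data.frame(Orthodont)   # drop the groupedData classes
dat$age <- factor(dat$age)

res1 <- lmer(distance ~ age * Sex + (1 | Subject), data = dat)
res3 <- lmer(distance ~ 0 + age + age:Sex + (1 | Subject), data = dat)

# hypothetical helper: Wald test of H0: C %*% beta[idx] = 0
wald <- function(model, idx, C) {
  b <- fixef(model)[idx]
  V <- as.matrix(vcov(model))[idx, idx]
  est <- C %*% b
  stat <- drop(t(est) %*% solve(C %*% V %*% t(C), est))
  c(chisq = stat, df = nrow(C))
}

int1 <- c("age10:SexFemale", "age12:SexFemale", "age14:SexFemale")
int3 <- c("age8:SexFemale", int1)

# H0: the Female - Male difference is the same at all 4 timepoints
equal1 <- wald(res1, int1, diag(3))               # interaction terms are 0
equal3 <- wald(res3, int3, cbind(-1, diag(3)))    # each difference minus the age 8 one

# H0: the Female - Male difference is 0 at all 4 timepoints
zero1 <- wald(res1, c("SexFemale", int1), cbind(1, rbind(0, diag(3))))
zero3 <- wald(res3, int3, diag(4))

rbind(equal1, equal3, zero1, zero3)
```

The two 'equal' rows should agree with each other, as should the two 'zero' rows, which is the point: the hypothesis being tested is determined by the contrast, not by the parameterization.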

Best,
Wolfgang

>-----Original Message-----
>From: R-sig-mixed-models [mailto:r-sig-mixed-models-bounces using r-project.org] On
>Behalf Of Douglas Bates
>Sent: Monday, 29 August, 2022 16:14
>To: Phillip Alday
>Cc: R-mixed models mailing list; Jorge Teixeira
>Subject: Re: [R-sig-ME] time*treatment vs time + time:treatment in RCTs
>
>M2 is an appropriate model if time corresponds to "time on treatment" or in
>general if the covariate over which the measurements are repeated has a
>scale where 0 is meaningful.  I think of it as the "zero dose" model
>because zero dose of treatment 1 is the same as zero dose of treatment 2 is
>the same as zero dose of the placebo.  Similarly zero time on treatment is
>the same for any of the treatments or the placebo.
>
>In those cases we would not expect a main effect for treatment because that
>corresponds to systematic differences before the study begins (or at zero
>dose), but we would expect an interaction of time (or dose) with treatment.
>
>On Mon, Aug 29, 2022 at 8:28 AM Phillip Alday <me using phillipalday.com> wrote:
>
>> On 8/29/22 05:53, Jorge Teixeira wrote:
>> > Hi. In medicine's RCTs, with 3 or more time-points, whenever LMMs are used
>> > and the code is available, a variation of  y ~ time*treatment + (1 | ID)
>> > *(M1)* is always used (from what I have seen).
>> >
>> > Recently I came across the model  time + time:treatment + (1 | ID)* (M2)*
>> > in Solomon Kurz's blog and in the book of Galecki (LMMs using R).
>> >
>> > Questions:
>> > *1)* Are there any modelling reasons for M2 to be less used in medicine's
>> > RCTs?
>>
>> It depends a bit on what `y` is: change from baseline or the 'raw'
>> measure. If it's the raw measure, then (M2) doesn't include a
>> description of differences at baseline between the groups.
>>
>> Perhaps most importantly though: (M2) violates the principle of
>> marginality discussed e.g. in Venables' Exegeses on Linear Models
>> (https://www.stats.ox.ac.uk/pub/MASS3/Exegeses.pdf)
>>
>> > *2)* Can anyone explain, in layman terms, what is the estimand in M2? I
>> > still struggle to understand what model is really measuring.
>>
>> Approximately the same thing as M1, except that the "overall" effect of
>> treatment is assumed to be zero. "Overall" is a bit vague because it
>> depends on the contrast coding used for time and treatment.
>>
>> You can see this for yourself. M1 can also be written as:
>>
>> y ~ time + time:treatment + treatment + (1|ID).
>>
>> If you force the coefficient on treatment to be zero, then you have M2.
>>
>> > *3)* On a general basis, in a RCT with 3 time points (baseline, 3-month and
>> > 4-month), would you tend to gravitate more towards model 1 or 2?
>>
>> Definitely (1).
>>
>> PS: When referencing a blog entry, please provide a link to it. :)
>>
>> > Thank you
>> > Jorge