[R-sig-ME] longitudinal with 2 time points

Tue Aug 24 10:02:01 CEST 2010

Hi Marc,

I have to admit that I didn't get a chance to read the article carefully before 
my previous reply, so I waited to respond until I had finally read it. Thanks 
for your excellent explanation below. I agree that the coefficient for treatment 
estimates the difference between treatment and control in the CHANGE of glucose 
at week 4 from baseline.

Now my dataset has become a little bit more complicated: each glucose test was 
done twice (blood was drawn from the left arm and the right arm and tested 
separately). So for each patient, at each time point, there are 2 measurements 
(one from each arm). So I think I should now include the factor "arm" as a 
random effect:

lmer(wk4.glucose ~ baseline.glucose + treatment + gender + age + (1|subject/time))

What do you think of this model specification?

Additionally, since I am now using a mixed model, if I code a new variable “time” 
(either 0 or 4) and a new response variable “y”, how do I specify a mixed model 
with 2 random effects, one with respect to the “time” variable (2 time points per 
subject per arm), the other with respect to the “arm” variable (2 arms per subject 
per time point)?

Thanks a lot!
 John

----- Original Message ----
From: Marc Schwartz <marc_schwartz at me.com>
To: array chip <arrayprofile at yahoo.com>
Cc: r-sig-mixed-models at r-project.org
Sent: Fri, August 13, 2010 7:24:59 AM
Subject: Re: [R-sig-ME] longitudinal with 2 time points

John,

That you are asking this question indicates that either you have yet to read the 
article or that you need to re-read it, as you have not comprehended the 
content.

The beta coefficient for treatment IS the difference in mean glucose change 
between baseline and 4 weeks **attributable to treatment**, after adjusting for 
any baseline differences in glucose between the two groups. That is also 
presuming that there is no interaction at baseline.

For example, let's say that the beta for treatment is -20. Then, at 4 weeks, 
given the same baseline glucose level, we would predict that, on average, the 
treatment group will have a glucose level 20 mg/dl less than the control group. 

In the absence of an interaction, we would estimate the same average treatment 
difference at 4 weeks of 20 mg/dl whether the baseline glucose was 300 mg/dl or 
100 mg/dl. 
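
A minimal sketch of this point in R, using simulated data. The data frame `d`, the sample size, the noise level, and the "true" treatment effect of -20 are all assumptions made for illustration, not values from any study:

```r
# Simulated data; every number below is an illustrative assumption.
set.seed(1)
n <- 200
d <- data.frame(
  baseline.glucose = runif(n, 80, 320),
  treatment = factor(rep(c("control", "active"), each = n / 2),
                     levels = c("control", "active"))
)
d$wk4.glucose <- 40 + 0.6 * d$baseline.glucose -
  20 * (d$treatment == "active") + rnorm(n, sd = 15)

# ANCOVA: week 4 value adjusted for baseline, no interaction term
fit <- lm(wk4.glucose ~ baseline.glucose + treatment, data = d)

# Predicted week 4 glucose for both groups at baseline 100 and 300 mg/dl
nd <- expand.grid(
  baseline.glucose = c(100, 300),
  treatment = factor(c("control", "active"), levels = c("control", "active"))
)
nd$pred <- predict(fit, newdata = nd)

# Active minus control at each baseline level
diff_by_baseline <- with(nd, pred[treatment == "active"] -
                             pred[treatment == "control"])
diff_by_baseline
coef(fit)["treatmentactive"]
```

Because the model has no interaction term, both elements of `diff_by_baseline` equal the fitted treatment coefficient, whatever the baseline value.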

However, given regression to the mean, we might reasonably expect the patient 
with a 300 mg/dl baseline level to have a greater mean reduction at 4 weeks as 
compared to the patient with a 100 mg/dl baseline level. 

We might also expect a patient with a glucose level at the low end of the 
baseline range (eg. 50 mg/dl) to experience an average increase in glucose level 
at 4 weeks, presuming that your inclusion/exclusion criteria permitted patients 
with below normal glucose levels. But the difference will still be, on average, 
20 mg/dl between the two treatment groups.

So the patient with a 300 mg/dl baseline level might have an average reduction 
to 200 mg/dl at 4 weeks on the control treatment, whereas the same patient on 
the active treatment would have an average reduction to 180 mg/dl (a difference 
of -20).

The patient with a 100 mg/dl baseline level might have an average reduction to 
90 mg/dl at 4 weeks on the control treatment, whereas the same patient on the 
active treatment would have an average reduction to 70 mg/dl (again, a 
difference of -20).

The patient with a 50 mg/dl baseline level might have an average increase to 90 
mg/dl at 4 weeks on the control treatment, whereas the same patient on the 
active treatment would have an average increase to 70 mg/dl (yet again, a 
difference of -20).

So your conclusion would be that on average, between baseline and 4 weeks, 
glucose levels were reduced by 20 mg/dl more in the active treatment group 
relative to control.
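
One way to see why the treatment coefficient speaks to CHANGE is to fit the model twice, once with the week 4 value as the response and once with the change from baseline, keeping baseline as a covariate in both. The sketch below uses simulated data (all variable names and numbers are assumptions for illustration). For ordinary least squares the two fits give the same treatment estimate and standard error; only the baseline slope shifts by exactly 1.

```r
# Simulated data; every number below is an illustrative assumption.
set.seed(2)
n <- 200
baseline  <- runif(n, 80, 320)
treatment <- rep(c(0, 1), each = n / 2)   # 0 = control, 1 = active
wk4       <- 40 + 0.6 * baseline - 20 * treatment + rnorm(n, sd = 15)
change    <- wk4 - baseline

# Same covariates, two different responses
fit_followup <- lm(wk4 ~ baseline + treatment)
fit_change   <- lm(change ~ baseline + treatment)

# Treatment estimate, standard error, t value and p value agree
summary(fit_followup)$coefficients["treatment", ]
summary(fit_change)$coefficients["treatment", ]

# Only the baseline slope differs, by exactly 1
slope_shift <- coef(fit_followup)["baseline"] - coef(fit_change)["baseline"]
slope_shift
```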

This difference is the vertical separation in the two parallel fitted regression 
lines as shown in the figure in the paper.

So the method is answering exactly the question the investigator is asking.

Marc

On Aug 13, 2010, at 1:02 AM, array chip wrote:

> Marc,
> 
> Thanks for sharing your insights. Let's take this model as an example:
> 
>  lm(wk4.glucose ~ baseline.glucose + treatment + gender + age)
> 
> Because the investigator is interested in knowing whether the CHANGE of glucose 
> in week 4 from baseline is different between treatment and control, is it still 
> legitimate to ask whether and HOW we can test this hypothesis? I think the 
> coefficient of the treatment factor is only testing whether the week 4 glucose 
> level is different between treatment and control, but not testing whether 
> the CHANGE of week 4 glucose level with respect to baseline is different 
> between treatment and control.
> 
> Thanks again for your suggestion.
> 
> Yi
> 
> 
> 
>  
> 
> 
> ----- Original Message ----
> From: Marc Schwartz <marc_schwartz at me.com>
> To: array chip <arrayprofile at yahoo.com>
> Cc: Charles E. (Ted) Wright <cewright at uci.edu>; John Maindonald 
> <john.maindonald at anu.edu.au>; r-sig-mixed-models at r-project.org
> Sent: Thu, August 12, 2010 6:02:29 AM
> Subject: Re: [R-sig-ME] longitudinal with 2 time points
> 
> Hi John,
> 
> If you read that article, you will see that your use of delta.y as the 
> dependent variable does not make sense.
> 
> Thus, I would re-express your model 5 as:
> 
>  lm(wk4.glucose ~ baseline.glucose + treatment + gender + age)
> 
> and as noted, check for the interaction between baseline glucose and 
> treatment:
> 
>  lm(wk4.glucose ~ baseline.glucose * treatment + gender + age)
> 
> 
> You might also want to consider using a spline function on age, presuming that 
> age is hopefully measured as a continuous variable (e.g. not ordinal groups).
> 
> Since the ANCOVA based approach described in the paper is essentially an OLS 
> linear regression, you can of course include the additional covariates for 
> adjustment. If the interaction term p value is >0.1 (a common threshold), you 
> can remove it, and the beta coefficient and its CIs for the treatment factor are 
> your estimated treatment effect relative to your control.
> 
> For the presentation of the results, besides the obvious tabular summaries and 
> the scatter/regression lines plot, include a series of plots showing selected 
> baseline values and the treatment versus control predicted follow up values and 
> CIs for the same baseline value in each plot. This visually shows the common 
> estimated treatment effect for each baseline value, which will also tend to 
> reveal regression to the mean. This presentation is especially helpful if the 
> interaction term is retained, which therefore shows how the treatment effect 
> varies and will reverse, over the range of the baseline values. You can select 
> a series of clinically relevant values over the range of the observed baseline 
> values, and/or by default, select a five number plus mean series over the 
> observed baseline values.
> 
> I don't see a role for a mixed effects model here, given that this is a pretty 
> straightforward "change from baseline" type design, but there are many here 
> with greater expertise than I. If this were a cross-over design, or if you had 
> multiple measures of glucose for each patient at each time point, more than two 
> time points, or a multi-center study, then a mixed effects model would make more 
> sense to me.
> 
> HTH,
> 
> Marc
> 
> 
> 
> On Aug 12, 2010, at 12:39 AM, array chip wrote:
> 
>> Hi Marc,
>> 
>> Thanks for the reference. I will definitely read it. Please see my response to 
>> John's reply. Your model is another model I should add to the 5 models I 
>> proposed in that email. What are your overall thoughts on these different 
>> models?
>> 
>> Thank you for sharing.
>> 
>> John
>> 
>> 
>> 
>> ----- Original Message ----
>> From: Marc Schwartz <marc_schwartz at me.com>
>> To: Charles E. (Ted) Wright <cewright at uci.edu>; array chip 
>> <arrayprofile at yahoo.com>
>> Cc: John Maindonald <john.maindonald at anu.edu.au>; 
>> r-sig-mixed-models at r-project.org
>> Sent: Wed, August 11, 2010 6:20:13 AM
>> Subject: Re: [R-sig-ME] longitudinal with 2 time points
>> 
>> Hi,
>> 
>> I'll throw in a reference that covers some of these issues:
>> 
>> Statistics Notes
>> Analysing controlled trials with baseline and follow up measurements
>> Vickers and Altman
>> BMJ. 2001 November 10; 323(7321): 1123–1124.
>> https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1121605/
>> 
>> 
>> The basic model specification would of course be:
>> 
>>  lm(`4Wks` ~ Baseline + Group)
>> 
>> You will also want to test for an interaction between the baseline score and 
>> your grouping factor, in case the observed group (e.g. treatment) effect is 
>> dependent upon the value of the baseline measurement. In this case, unlike in 
>> the above paper, you of course end up with crossing fitted regression lines, 
>> rather than parallel lines.
>> 
>> HTH,
>> 
>> Marc Schwartz
>> 
>> 
>> On Aug 11, 2010, at 7:34 AM, Charles E. (Ted) Wright wrote:
>> 
>>> Keep in mind that running an ANOVA on the difference is not the same thing as 
>>> using the baseline data as a covariate in an ANOVA on the Week 4 data. 
>>> Essentially the ANOVA on the differences is like the ANCOVA with the slope 
>>> constrained to be 1.
>>> 
>>> Ted Wright
>>> 
>>> On Wed, 11 Aug 2010, John Maindonald wrote:
>>> 
>>>> All these are possibilities, except maybe making baseline measurement
>>>> a random factor.  This would make sense only if data divide into groups,
>>>> and you want the baseline effect to vary randomly from group to group.
>>>> That may limit your ability to estimate parameters that are of interest.
>>>> In most circumstances that I am familiar with, it makes better sense to
>>>> treat baseline effect as fixed.
>>>> 
>>>> John.
>>>> 
>>>> On 11/08/2010, at 8:11 AM, array chip wrote:
>>>> 
>>>>> Hi, I am wondering if it is still meaningful to run a mixed model if a
>>>>> longitudinal dataset has only 2 time points (baseline and week 4)? Would it
>>>>> be more appropriate to simply take the difference between the 2 time points
>>>>> and run ANOVA (ANCOVA) on the difference? What about still running a mixed
>>>>> model on the difference of the 2 time points, but adding the baseline
>>>>> measurement as a random factor?
>>>>> 
>>>>> Thanks for sharing your thoughts.
>>>>> 
>>>>> John