[R-sig-ME] Should I include participants with baseline score only (missing afterwards) in a longitudinal study?

Tue Jul 31 16:08:45 CEST 2018

Thanks for catching that, Jon. I wasn't paying attention to the full
details of the study.

My implicit assumption was "missing at random". If the data are missing
not at random, then you should either exclude the participants with
missing follow-ups or model the missingness explicitly, say with a
hurdle model.
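
As a rough illustration of the simplest case, where dropout is unrelated
to any score, here is a minimal sketch on simulated data (not the data
from this thread). The data frame sim, the column score, the true means
and the standard deviations are all made up for the example. It fits the
same kind of random-intercept model with nlme::lme, without the age
covariate, once with the baseline-only participants and once without
them, and prints the fixed-effect estimates and standard errors side by
side.

```r
library(nlme)

set.seed(1)
n <- 98                                  # participants, as in the thread
id <- rep(seq_len(n), each = 3)
time <- rep(c(0, 1, 3), times = n)       # months since baseline
u <- rnorm(n, sd = 10)                   # hypothetical random intercepts
# hypothetical true means: 50 at baseline, 60 at 1 month, 70 at 3 months
mu <- c(50, 60, 70)[match(time, c(0, 1, 3))]
sim <- data.frame(id = factor(id), time = factor(time),
                  score = mu + u[id] + rnorm(3 * n, sd = 5))

# 11 randomly chosen participants contribute a baseline score only;
# here dropout is unrelated to any score (an assumption of this sketch)
dropped <- sample(seq_len(n), 11)
with_baseline_only <- sim[!(id %in% dropped & time != 0), ]
at_least_two <- droplevels(sim[!(id %in% dropped), ])

fit_all <- lme(score ~ -1 + time, random = ~ 1 | id,
               data = with_baseline_only, method = "ML")
fit_sub <- lme(score ~ -1 + time, random = ~ 1 | id,
               data = at_least_two, method = "ML")

# fixed-effect estimates and standard errors from both fits
se_all <- sqrt(diag(vcov(fit_all)))
se_sub <- sqrt(diag(vcov(fit_sub)))
print(cbind(est_all = fixef(fit_all), se_all,
            est_sub = fixef(fit_sub), se_sub))
```

With dropout of this kind, I would expect the baseline standard error
to be somewhat smaller in the fit that keeps all 98 participants, and
the follow-up estimates to shift slightly between the two fits. If
dropout depended on unobserved values instead, this comparison alone
would not tell you which fit to trust.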

Phillip

On 07/31/2018 04:02 PM, Jon Baron wrote:
> On 07/31/18 15:34, Phillip Alday wrote:
>> The model with the additional baseline-only participants will have less
>> uncertainty about the estimates concerning the baseline. This reduced
>> uncertainty will help "pin" those values, which may also impact other
>> estimates.
>>
>> As a simple example, think of a line passing through two points. Your
>> job is to determine the slope of the line, but this is made more
>> complicated by you not being totally certain about the position of the
>> two points. If you can reduce the uncertainty in the position of just one
>> point, then this will still reduce the possible range of slopes and may
>> even cause your estimate of the slope to tend towards a
>> particular/different value.
>>
>> As for your particular inference: I would tend to keep the data in so
>> that my estimates of function at baseline were as good as possible, even
>> though this extra data adds no information about function at 1 or 3
>> months. The reduction in uncertainty about the location of the baseline is
>> potentially useful in its own right and may even help give better
>> estimates of the slope (=difference between baseline and subsequent
>> measurement) by creating additional constraints.
>>
>> Phillip
> 
> This makes sense if those who died after baseline did not differ
> systematically from those who survived. But, if there is any reason for
> the baseline measure to correlate with longevity, I think it would be
> safer to remove the 11 subjects, even at the expense of some
> additional error.
> 
> Jon
> 
>> On 07/31/2018 08:24 AM, K Imran M wrote:
>>> Hi everyone,
>>>
>>> I did a longitudinal study in which I collected a functional score at 3
>>> different times (baseline, 1 month after baseline and 3 months after
>>> baseline) from 98 patients. Eleven patients died right after baseline,
>>> so they have a functional score at baseline only and no scores at
>>> 1 month or 3 months after baseline.
>>>
>>> My question is: should I remove these 11 patients from the dataset
>>> (because they provide only 1 score)?
>>>
>>> Next, I ran the nlme::lme function on 2 datasets: the first contained
>>> all 98 participants (11 with only 1 score, at baseline), and the second
>>> contained only participants with at least 2 scores (baseline + 1 month,
>>> baseline + 3 months, or baseline + 1 month + 3 months).
>>> I noticed that the lme estimates for the two datasets are slightly
>>> different. How can I explain this?
>>>
>>> In the analysis above, I used a random intercept model (participants as
>>> the random effect) with time (baseline, 1 month after baseline and 3
>>> months after baseline) treated as a factor variable. The covariate is age.
>>>
>>> The datasets (edited for privacy) are available at these links:
>>> dat.a
>>> (https://drive.google.com/open?id=1jAAFnrUfuTsVQST7EE3vjrh0_71ziAut)
>>> dat.b
>>> (https://drive.google.com/open?id=1caGTd6SNnzbHSln84jw9b_lVHhnz7Qij)
>>>
>>> And the R code is here:
>>> #######
>>> library(haven)
>>> dat.a <- read_dta("test_complete_data.dta")
>>> dat.b <- read_dta("test_complete_with_at_discharge.dta")
>>>
>>> # mixed model
>>> library(nlme)
>>> mod.dta.a <- lme(barthel ~ -1 + age + factor(time), random = ~1 | id,
>>>                  data = dat.a, na.action = 'na.omit', method = 'ML')
>>> mod.dta.b <- lme(barthel ~ -1 + age + factor(time), random = ~1 | id,
>>>                  data = dat.b, na.action = 'na.omit', method = 'ML')
>>>
>>> # res
>>> summary(mod.dta.a)
>>> summary(mod.dta.b)
>>> #####
>>>
>>>
>>> So let me rephrase the questions (let us assume we are not interested in
>>> the mechanism of missingness, only in the estimation from the mixed
>>> model):
>>> 1) Should I include patients who have only 1 measurement in a
>>> longitudinal study in my model?
>>> 2) Why do the estimates differ between the dataset restricted to
>>> participants with at least 2 measurements and the dataset that also
>>> contains participants with only 1 measurement? A simple explanation
>>> should be fine for me.
>>>
>>> I apologize for my lack of math and stat skill. I really appreciate your
>>> time in responding to this question.
>>>
>>> Thank you.
>>>
>>> Best wishes
>>>
>>> Kamarul Imran
>>> Universiti Sains Malaysia
>>>
>>> _______________________________________________
>>> R-sig-mixed-models using r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>
>>
>