[R-sig-ME] Should I include participants with baseline score only (missing afterwards) in a longitudinal study?

Tue Jul 31 18:18:15 CEST 2018

Phillip and Jon,

Very, very helpful explanation and insight. I really appreciate it.

KIM

On Tue, Jul 31, 2018 at 10:08 PM Phillip Alday <phillip.alday using mpi.nl> wrote:

> Thanks for catching that, Jon. I wasn't paying attention to the full
> details of the study.
>
> My implicit assumption was "missing at random". If the data are missing not
> at random, then you should either exclude the cases with missing values or
> model the missingness explicitly with, say, a hurdle model.
>
> Phillip
>
> On 07/31/2018 04:02 PM, Jon Baron wrote:
> > On 07/31/18 15:34, Phillip Alday wrote:
> >> The model with the additional baseline-only participants will have less
> >> uncertainty about the estimates concerning the baseline. This reduced
> >> uncertainty will help "pin" those values, which may also impact other
> >> estimates.
> >>
> >> As a simple example, think of a line passing through two points. Your
> >> job is to determine the slope of the line, but this is made more
> >> complicated by you not being totally certain about the position of the
> >> two points. If you can reduce the uncertainty in the position of just one
> >> point, then this will still reduce the possible range of slopes and may
> >> even cause your estimate of the slope to tend towards a
> >> particular/different value.
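
A minimal sketch of the two-point analogy, in base R. The means (40 and 55), the standard deviations and the gap between the points are made-up values, and the two points are drawn independently, which a mixed model with a shared random intercept would not assume. It only illustrates that tightening one point narrows the spread of plausible slopes.

```r
set.seed(42)  # arbitrary seed, for reproducibility only

n_draws <- 10000
x_gap <- 1  # horizontal distance between the two points (made-up value)

# Draw noisy positions for both points and return the implied slopes.
simulate_slopes <- function(sd_first, sd_second) {
  y_first <- rnorm(n_draws, mean = 40, sd = sd_first)
  y_second <- rnorm(n_draws, mean = 55, sd = sd_second)
  (y_second - y_first) / x_gap
}

# Same uncertainty on both points, then less uncertainty on the first only.
slopes_before <- simulate_slopes(sd_first = 2, sd_second = 2)
slopes_after <- simulate_slopes(sd_first = 1, sd_second = 2)

# Spread of plausible slopes; for independent points the theoretical value
# is sqrt(sd_first^2 + sd_second^2) / x_gap.
c(before = sd(slopes_before), after = sd(slopes_after))
```

The second spread should come out smaller than the first, even though nothing new was learned about the second point.
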
> >>
> >> As for your particular inference: I would tend to keep the data in so
> >> that my estimates of function at baseline were as good as possible, even
> >> though this extra data adds no information about function at 1 or 3
> >> months. The reduction in uncertainty about the location of the baseline is
> >> potentially useful in its own right and may even help give better
> >> estimates of the slope (=difference between baseline and subsequent
> >> measurement) by creating additional constraints.
> >>
> >> Phillip
> >
> > This makes sense if those who died after baseline did not differ
> > systematically from those who survived. But, if there is any reason for
> > the baseline measure to correlate with longevity, I think it would be
> > safer to remove the 11 subjects, even at the expense of some
> > additional error.
> >
> > Jon
> >
> >> On 07/31/2018 08:24 AM, K Imran M wrote:
> >>> Hi everyone,
> >>>
> >>> I conducted a longitudinal study in which I collected a functional score
> >>> at 3 different times (baseline, 1 month after baseline and 3 months after
> >>> baseline) from 98 patients. Eleven patients died right after baseline, so
> >>> they have a functional score at baseline only and no scores at 1 month
> >>> or 3 months after baseline.
> >>>
> >>> My question is: should I remove these 11 patients from the dataset
> >>> (because they provide only 1 score)?
> >>>
> >>> Next, I ran the nlme::lme function on 2 datasets: the first contained all
> >>> 98 participants (11 with only 1 score at baseline), and the second
> >>> contained only participants with at least 2 scores (baseline + 1 month,
> >>> baseline + 3 months, or baseline + 1 month + 3 months).
> >>> I noticed that the lme estimates for the two datasets are slightly
> >>> different. How can I explain this?
> >>>
> >>> In the analysis above, I used a random intercept model (participants as the
> >>> random effect) with time (baseline, 1 month after baseline and 3 months
> >>> after baseline) treated as a factor variable. The covariate is age.
> >>>
> >>> The datasets (edited due to privacy) are available from these links:
> >>> dat.a
> >>> (https://drive.google.com/open?id=1jAAFnrUfuTsVQST7EE3vjrh0_71ziAut)
> >>> dat.b
> >>> (https://drive.google.com/open?id=1caGTd6SNnzbHSln84jw9b_lVHhnz7Qij)
> >>>
> >>> And the R code is here:
> >>> #######
> >>> library(haven)
> >>> dat.a <- read_dta("test_complete_data.dta")
> >>> dat.b <- read_dta("test_complete_with_at_discharge.dta")
> >>>
> >>> # mixed model
> >>> library(nlme)
> >>> mod.dta.a <- lme(barthel ~ -1 + age + factor(time), random = ~1| id,
> >>>                  data = dat.a, na.action = 'na.omit', method = 'ML')
> >>> mod.dta.b <- lme(barthel ~ -1 + age + factor(time), random = ~1| id,
> >>>                  data = dat.b, na.action = 'na.omit', method = 'ML')
> >>>
> >>> # res
> >>> summary(mod.dta.a)
> >>> summary(mod.dta.b)
> >>> #####
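
One way to line up the fixed effects from two such fits is sketched below. Since the linked data files are not reproduced here, it uses the Orthodont data that ships with nlme as a stand-in. The helper name compare_fixed and the choice of five children treated as lost after the first visit are assumptions for illustration only.

```r
library(nlme)

# Orthodont ships with nlme: 27 children measured at ages 8, 10, 12 and 14.
ortho <- as.data.frame(Orthodont)

# Pretend the first five children were lost after the first visit (age 8).
lost <- levels(ortho$Subject)[1:5]
with_singletons <- ortho[!(ortho$Subject %in% lost & ortho$age > 8), ]
without_singletons <- droplevels(ortho[!(ortho$Subject %in% lost), ])

# Random-intercept model with one mean per time point (no global intercept).
fit_time_means <- function(d) {
  lme(distance ~ -1 + factor(age), random = ~ 1 | Subject,
      data = d, method = "ML")
}

# Hypothetical helper: estimates and standard errors of two fits side by side.
# It assumes both fits have the same fixed-effect terms in the same order.
compare_fixed <- function(fit_a, fit_b) {
  data.frame(
    term = names(fixef(fit_a)),
    est_a = unname(fixef(fit_a)),
    se_a = unname(sqrt(diag(vcov(fit_a)))),
    est_b = unname(fixef(fit_b)),
    se_b = unname(sqrt(diag(vcov(fit_b))))
  )
}

# Fit a drops the five children; fit b keeps their first visit only.
comparison <- compare_fixed(fit_time_means(without_singletons),
                            fit_time_means(with_singletons))
comparison
```

In this layout, columns ending in _a come from the data without the single-score children and columns ending in _b from the data that keep them.
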
> >>>
> >>>
> >>> So let me rephrase the questions (let us assume we are interested not in
> >>> the mechanism of missingness but purely in the estimation from the mixed
> >>> model):
> >>> 1) Should I include patients who have only 1 measurement in a
> >>> longitudinal study in my model?
> >>> 2) Why do the estimates differ between the dataset in which every
> >>> participant has at least 2 measurements and the dataset that also
> >>> contains participants with only 1 measurement? A simple explanation
> >>> should be fine for me.
> >>>
> >>> I apologize for my lack of math and stat skills. I really appreciate your
> >>> time in responding to this question.
> >>>
> >>> Thank you.
> >>>
> >>> Best wishes
> >>>
> >>> Kamarul Imran
> >>> Universiti Sains Malaysia
> >>>
> >>> _______________________________________________
> >>> R-sig-mixed-models using r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >>>
> >
>
