[R-sig-ME] Should I include participants with baseline score only (missing afterwards) in a longitudinal study?
phillip@@ld@y @ending from mpi@nl
Tue Jul 31 15:34:11 CEST 2018
The model with the additional baseline-only participants will have less
uncertainty about the estimates concerning the baseline. This reduced
uncertainty will help "pin" those values, which may also impact other
estimates.
As a simple example, think of a line passing through two points. Your
job is to determine the slope of the line, but this is made more
complicated by you not being totally certain about the position of the
two points. If you can reduce the uncertainty in the position of just one
point, then this will still reduce the possible range of slopes and may
even cause your estimate of the slope to tend towards a
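
To make the two-point picture concrete, here is a minimal simulation sketch in base R. The point positions, the noise levels and the names (slope_draws, sd1, sd2) are made up for illustration and have nothing to do with the data in this thread. It only shows that shrinking the uncertainty of one point narrows the spread of plausible slopes.

```r
set.seed(1)
n_draws <- 10000

# Hypothetical "true" positions of the two points
x1 <- 0; x2 <- 1
y1_true <- 2; y2_true <- 5

# Draw noisy positions for both points and return the implied slopes
slope_draws <- function(sd1, sd2) {
  y1 <- rnorm(n_draws, mean = y1_true, sd = sd1)
  y2 <- rnorm(n_draws, mean = y2_true, sd = sd2)
  (y2 - y1) / (x2 - x1)
}

# Both points equally uncertain
slopes_vague <- slope_draws(sd1 = 1, sd2 = 1)
# Only the first point (the "baseline") is known more precisely
slopes_pinned <- slope_draws(sd1 = 0.2, sd2 = 1)

# Spread of the slope; in theory sqrt(sd1^2 + sd2^2) / (x2 - x1)
sd(slopes_vague)
sd(slopes_pinned)
```

The second standard deviation should come out smaller than the first, even though nothing new was learned about the second point.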
As for your particular inference: I would tend to keep the data in so
that my estimates of function at baseline were as good as possible, even
though this extra data adds no information about function at 1 or 3
months. The reduction in uncertainty about the location of the baseline is
potentially useful in its own right and may even help give better
estimates of the slope (=difference between baseline and subsequent
measurement) by creating additional constraints.
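
If it helps, the same comparison can be tried on simulated data. This is only a sketch under stated assumptions: the effect sizes, variances and the choice of which 11 participants have a baseline score only are all invented, and the names dat_all and dat_completers are hypothetical. Only the model formula is copied from the original post.

```r
library(nlme)
set.seed(42)

n <- 98
age <- round(rnorm(n, mean = 65, sd = 10))
u <- rnorm(n, mean = 0, sd = 10)   # participant-level random intercepts

# One row per participant and time point (0, 1 and 3 months)
dat <- expand.grid(id = seq_len(n), time = c(0, 1, 3))
dat$age <- age[dat$id]

# Made-up fixed effects: age slope and a shift for each time point
time_effect <- c("0" = 0, "1" = 8, "3" = 15)
dat$barthel <- 80 - 0.5 * dat$age +
  unname(time_effect[as.character(dat$time)]) +
  u[dat$id] + rnorm(nrow(dat), mean = 0, sd = 5)

# Assume participants 1 to 11 have a baseline score only
dropouts <- 1:11
dat_all <- dat[!(dat$id %in% dropouts & dat$time > 0), ]
dat_completers <- dat[!(dat$id %in% dropouts), ]

fit_all <- lme(barthel ~ -1 + age + factor(time), random = ~ 1 | id,
               data = dat_all, method = "ML")
fit_completers <- lme(barthel ~ -1 + age + factor(time), random = ~ 1 | id,
                      data = dat_completers, method = "ML")

# Fixed-effect estimates side by side
cbind(all = fixef(fit_all), completers = fixef(fit_completers))

# Standard errors side by side
cbind(all = sqrt(diag(vcov(fit_all))),
      completers = sqrt(diag(vcov(fit_completers))))
```

The estimates from the two fits will generally differ a little, because the baseline-only rows contribute to the baseline and age terms and, through the shared random intercept variance, to everything else.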
On 07/31/2018 08:24 AM, K Imran M wrote:
> Hi everyone,
> I did a longitudinal study where I collected functional score at 3
> different times (baseline, 1 month after baseline and 3 months after
> baseline) from 98 patients. There were 11 patients who died right after
> baseline (so they have functional score at baseline only, and they did not
> have the scores at 1 month after baseline or 3 months after baseline).
> My question is should I remove 11 patients from the dataset (because they
> only provide 1 score?)
> What I did next was run the nlme::lme function on 2 datasets, the
> first dataset that contained 98 participants (11 with only 1 score at
> baseline) and the second dataset with participants with at least 2 scores
> (baseline + 1 month or baseline + 3 month or baseline + 1 month + 3 month).
> I noticed the lme estimates for the two datasets are slightly different.
> How can I explain this?
> In the analysis above, I used a random intercept model (participants as the
> random effect) with time (baseline, 1 month after baseline and 3 months
> after baseline) treated as a factor variable. The covariate is age.
> The datasets (edited due to privacy) are available from these links:
> dat.a (https://drive.google.com/open?id=1jAAFnrUfuTsVQST7EE3vjrh0_71ziAut)
> dat.b (https://drive.google.com/open?id=1caGTd6SNnzbHSln84jw9b_lVHhnz7Qij)
> And the R codes are here:
> library(haven)  # provides read_dta()
> library(nlme)   # provides lme()
> dat.a <- read_dta("test_complete_data.dta")
> dat.b <- read_dta("test_complete_with_at_discharge.dta")
> # mixed model
> mod.dta.a <- lme(barthel ~ -1 + age + factor(time), random = ~1| id,
> data = dat.a, na.action = 'na.omit', method =
> mod.dta.b <- lme(barthel ~ -1 + age + factor(time), random = ~1| id,
> data = dat.b, na.action = 'na.omit', method = 'ML')
> # res
> So let me rephrase the questions (Let us assume we are not interested in
> the mechanism of missingness but purely on the estimation from mixed model)
> 1) should I include patients that have only 1 measurement in a longitudinal
> study in my model?
> 2) why are the estimates different between the dataset of participants with
> at least 2 measurements and the dataset that also contains participants with
> only 1 measurement? A simple explanation should be fine for me.
> I apologize for my lack of math and stat skill. I really appreciate your
> time in responding to this question.
> Thank you.
> Best wishes
> Kamarul Imran
> Universiti Sains Malaysia