[R-sig-ME] subject level predictions with lme4 from incomplete longitudinal profile

Sun Nov 2 18:00:18 CET 2014

Dear all,

I am interested in using lme4 to make subject-level predictions from longitudinal data. I have 7 longitudinal observations on each of more than 100 subjects with which to fit the model (call it z1). The goal is to use the first 2 observations of a new subject to predict the remaining 5 time points for that subject. One way seems to be to add the first 2 time points of the new subject to the dataset of subjects with complete longitudinal profiles and refit the model, which yields the random coefficients for the new subject that are needed for prediction. My question is whether refitting can be avoided: can the fitted model z1, together with the design matrix and response values for the first 2 time points of the new subject, be used to compute that subject's random coefficients, so that subject-level predictions can be obtained?

See below an illustration using a subset of the entire dataset:

> #get file http://bioinformaticsprb.med.wayne.edu/shares/uspar.zip and unzip it
> load("uspar.RData")
> str(dat)
'data.frame':  94 obs. of  5 variables:
$ ID: int  1 1 1 1 1 1 1 2 2 2 ...
$ t : num  12.6 15.2 18.5 21.7 25.7 ...
$ X : num  2.53 2.72 2.92 3.08 3.25 ...
$ X2: num  31.8 41.2 53.8 66.6 83.3 ...
$ Y : num  2.67 2.88 3.06 3.23 3.38 ...
> head(dat)
  ID        t        X       X2        Y
1  1 12.56409 2.530843 31.79774 2.674149
2  1 15.16409 2.718930 41.23011 2.884801
3  1 18.46409 2.915828 53.83811 3.063391
4  1 21.66409 3.075656 66.63130 3.226844
5  1 25.66409 3.245093 83.28236 3.380995
6  1 28.56409 3.352150 95.75113 3.462606
>
> library(lme4)
> library(lattice)
> print(xyplot(Y ~ t | ID, dat, aspect = "xy",
+              layout = c(4,4), type = c("g", "p", "r"),
+              xlab = "t",
+              ylab = "Y"))
>
> #so straight lines are not good enough, therefore will try some transformations of the time t
> dat$X=log(dat$t)
> dat$X2=dat$t*log(dat$t)
>
> #fit a mixed model without subject 14
> z1=lmer(Y~X+X2+(1+X|ID),data=dat[1:87,])
> coef(z1)
$ID
   (Intercept)        X           X2
1  -0.53171325 1.326889 -0.004786795
2  -0.47307768 1.298678 -0.004786795
3  -0.94360603 1.426778 -0.004786795
4  -0.96673528 1.419102 -0.004786795
5  -0.17684352 1.223741 -0.004786795
6  -0.25485719 1.244443 -0.004786795
7  -0.43006443 1.292671 -0.004786795
8  -0.16374444 1.236175 -0.004786795
9  -0.56353815 1.327487 -0.004786795
10 -0.03938627 1.199694 -0.004786795
11 -0.88569466 1.420670 -0.004786795
12 -0.98283470 1.428146 -0.004786795
13 -0.17773363 1.214458 -0.004786795

attr(,"class")
[1] "coef.mer"
>
> #Question:
> #Say I only had the first 2 time points for subject 14 and wanted to make
> #subject level predictions for it for all time points.
> #One way seems to be to fit a new model z2 that includes the first 2 points
> # of this subject.
>
> z2=lmer(Y~X+X2+(1+X|ID),data=dat[1:89,])
> coef(z2)
$ID
   (Intercept)        X           X2
1  -0.52677880 1.324643 -0.004759445
2  -0.46715932 1.296103 -0.004759445
3  -0.93644456 1.423830 -0.004759445
4  -0.95853280 1.415826 -0.004759445
5  -0.17203704 1.221526 -0.004759445
6  -0.24932519 1.241990 -0.004759445
7  -0.42432616 1.290164 -0.004759445
8  -0.15979241 1.234259 -0.004759445
9  -0.55790691 1.325007 -0.004759445
10 -0.03503148 1.197623 -0.004759445
11 -0.87957274 1.418050 -0.004759445
12 -0.97501533 1.424990 -0.004759445
13 -0.17154433 1.211794 -0.004759445
14 -0.65875042 1.348014 -0.004759445

attr(,"class")
[1] "coef.mer"
>
> dat$Ypred=predict(z2,dat)
> print(xyplot(Ypred ~ Y | ID, dat, aspect = "xy",
+              layout = c(4,4), type = c("g", "p", "r"),
+              xlab = "Y",
+              ylab = "Ypred"))
>
> #Q1: Is there another way that would not require fitting model z2 but would simply use
> # info from z1 and the data below for subject 14?
> dat[88:89,c("X","X2","Y")]
          X       X2        Y
88 2.706981 40.56130 2.803360
89 2.843977 48.87079 2.933857
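
For concreteness, the quantity in question is presumably the conditional mean of the new subject's random effects given its observed responses, with all model parameters held fixed at their z1 estimates: b = D Z' (Z D Z' + s2 I)^(-1) (y - X beta). Below is a minimal sketch of that calculation under those assumptions. The name eb_ranef is a hypothetical helper, not an lme4 function, and the commented usage assumes the objects z1 and dat from the session above.

```r
# Hypothetical helper (not part of lme4): conditional mean of the random
# effects for one new subject, treating the fitted parameters as known.
#   beta : fixed-effect estimates
#   D    : covariance matrix of the random effects
#   s2   : residual variance
#   Xn   : fixed-effects design matrix for the new subject's observed rows
#   Zn   : random-effects design matrix for the same rows
#   y    : observed responses for the same rows
eb_ranef <- function(beta, D, s2, Xn, Zn, y) {
  D     <- as.matrix(D)
  V     <- Zn %*% D %*% t(Zn) + s2 * diag(nrow(Zn))  # marginal covariance of y
  resid <- y - Xn %*% beta                           # population-level residuals
  drop(D %*% t(Zn) %*% solve(V, resid))
}

# Usage with the session above (assumes z1 and dat exist):
# new    <- dat[88:89, ]
# b14    <- eb_ranef(beta = fixef(z1), D = VarCorr(z1)$ID, s2 = sigma(z1)^2,
#                    Xn = cbind(1, new$X, new$X2), Zn = cbind(1, new$X),
#                    y = new$Y)
# coef14 <- fixef(z1) + c(b14, 0)   # X2 has no random effect in this model
# Ypred14 <- cbind(1, dat$X[dat$ID == 14], dat$X2[dat$ID == 14]) %*% coef14
```

The resulting coefficients would not be expected to match row 14 of coef(z2) exactly, because z2 also re-estimates the fixed effects and variance components with the two extra observations included.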

Thanks,
Adi Tarca
