[R] longitudinal imputation with PAN

Joanne Hosking joanne.hosking at pms.ac.uk
Mon Sep 24 15:55:20 CEST 2007


Hello all,

I am working on a longitudinal study of children in the UK and am trying the PAN package for imputation of missing data, since it fulfils the critical criterion of taking into account each subject's individual trend over time as well as the population trend.

To validate the procedure I started by deleting some known values. We have 6 annual measures of height on 300 children; I imputed the deleted values using PAN and compared the imputed values with the real values. In most individuals the imputed values fit the individual trend extremely well!

However, for a handful of individuals the imputed value was lower than the previous (real) height, or higher than the next (real) height, making it appear that height went down, which in reality it never does. So my question is: why does this happen when it seems to work so well for the majority of individuals? Am I doing something wrong?

As a novice user of R (and new to this area of statistics), I wondered if anyone could point me in the right direction, since the mixed-effects design (plus the potential ease and speed) of the PAN procedure for longitudinal data imputation is very appealing.
I would very much appreciate any advice you could give me. Many thanks in advance.

Jo Hosking

The code and a small sample of the data are shown below (I could supply more data to anyone willing).

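library(pan)  # provides pan()

# Long-format data: one row per child per visit
impht.data <- read.delim("impht_long_trunc.dat", header = TRUE)
impht.data$sex <- factor(impht.data$sex, labels = c("Boys", "Girls"))
impht.data$visit <- factor(impht.data$visit)
impht.data$code <- factor(impht.data$code)

y <- impht.data$htmiss   # height with some known values deleted
subj <- impht.data$code  # child identifier
# cbind() stores the factors sex and visit as their integer codes
pred <- cbind(impht.data$age, impht.data$sex, impht.data$visit)
xcol <- 1:3  # fixed effects: all three columns of pred
zcol <- 1    # random effect: column 1 of pred, which is age here
prior <- list(a = 1, Binv = 1, c = 1, Dinv = 1)
ht1 <- pan(y, subj, pred, xcol, zcol, prior, seed = 13579, iter = 1000)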
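
One thing that may be worth checking, offered as a sketch and not as a diagnosis: in the call above, pred has no column of ones, so zcol = 1 puts the random effect on age and not on an intercept. As far as I know pan() does not add an intercept itself (an assumption to confirm in the package help). The block below uses base R only and shows one way to build a predictor matrix with an explicit intercept column and dummy-coded sex. The data frame d is a small hypothetical example: the ages come from the sample rows, but the sex values are made up so that both levels appear.

```r
# Hypothetical mini data set: ages are from the sample rows, sex is invented
# here so that both levels are present.
d <- data.frame(
  code  = rep(1:2, each = 3),
  sex   = factor(rep(c(1, 2), each = 3), levels = 1:2,
                 labels = c("Boys", "Girls")),
  visit = rep(1:3, times = 2),
  age   = c(4.87, 5.86, 6.88, 4.84, 6.00, 6.86)
)

# model.matrix() adds an intercept column of ones and turns the factor sex
# into a 0/1 dummy column, instead of the 1/2 integer codes cbind() would keep.
pred <- model.matrix(~ age + sex, data = d)
colnames(pred)

xcol <- seq_len(ncol(pred))  # fixed effects: intercept, age, sex
zcol <- 1                    # random effect on column 1, now the intercept
# zcol <- 1:2 would request a random intercept and a random age slope;
# the prior would then need matching dimensions (not shown here).
```

Whether a random intercept, a random slope, or both suits the height data is a modelling choice; the sketch only makes explicit which column each index refers to.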

code    sex     visit   age     ht      htmiss
1       2       1       4.87    105     105
1       2       2       5.86    109.6
1       2       3       6.88    116.4   116.4
1       2       4       7.72    121.2   121.2
1       2       5       8.72    126.7   126.7
1       2       6       9.71    132.3   132.3
2       2       1       4.84    107.1   107.1
2       2       2       6       115.7   115.7
2       2       3       6.86    121.4   121.4
2       2       4       7.69    126.5   126.5
2       2       5       8.7     134.15  134.15
2       2       6       9.76    140
3       2       1       4.62    103     103
3       2       2       5.69    108.9   108.9
3       2       3       6.87    115.1
3       2       4       7.55    118.6   118.6
3       2       5       8.46    123.6   123.6
3       2       6       9.63    128.9   128.9
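
The validation described above (compare imputed values with the deleted ones, then look for heights that appear to go down) could be automated along the following lines. This is a minimal sketch in base R using the 18 sample rows. The column htimp and the three values placed in it are HYPOTHETICAL, invented only to exercise the check; they are not output from pan().

```r
# The 18 sample rows from the post (three children, six visits each).
d <- data.frame(
  code  = rep(1:3, each = 6),
  visit = rep(1:6, times = 3),
  ht    = c(105, 109.6, 116.4, 121.2, 126.7, 132.3,
            107.1, 115.7, 121.4, 126.5, 134.15, 140,
            103, 108.9, 115.1, 118.6, 123.6, 128.9)
)
d$htmiss <- d$ht
d$htmiss[c(2, 12, 15)] <- NA  # the three values deleted in the sample

# HYPOTHETICAL imputed values (made up for illustration, not pan() output).
d$htimp <- d$htmiss
d$htimp[is.na(d$htmiss)] <- c(110.1, 133.0, 114.8)

# Change in height since the previous visit, computed within each child.
d <- d[order(d$code, d$visit), ]
d$delta <- ave(d$htimp, d$code, FUN = function(h) c(NA, diff(h)))

# Children whose completed series decreases at least once.
flagged <- unique(d$code[!is.na(d$delta) & d$delta < 0])

# Imputation error for the deleted values (imputed minus real).
err <- d$htimp[is.na(d$htmiss)] - d$ht[is.na(d$htmiss)]
```

With real output, htimp would be filled from the imputed data returned by the pan() run, and flagged would list the children to inspect by eye.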


