[R] Question regarding significance of a covariate in a coxme survival model

Wed Aug 4 19:22:22 CEST 2010

Hi,
I am running a Cox Mixed Effects Hazard model using the library coxme. I 
am trying to model time to onset (age_sym1) of thought problems (e.g. 
hearing voices) (sym1).  As I have siblings in my dataset,  I have 
decided to account for this by including a random effect for family 
(famid). My covariate of interest is Mother's diagnosis where a 0 is 
bipolar, 1 is control, and 2 is major depression.  I am trying to fit 
the following model.

thorp1 <- coxme(Surv(age_sym1, sym1) ~ lifedxm + (1 | famid), data = 
bip.surv)

Which provides the following output:

-------------------------------------------------
 > thorp1
Cox mixed-effects model fit by maximum likelihood
   Data: bip.surv
   events, n = 99, 189 (3 observations deleted due to missingness)
   Iterations= 10 54
                     NULL Integrated Penalized
Log-likelihood -479.0372 -467.3549 -435.2096
                   Chisq    df          p   AIC     BIC
Integrated loglik 23.36 3.00 3.3897e-05 17.36     9.58
  Penalized loglik 87.66 47.27 3.2374e-04 -6.88 -129.54
Model: Surv(age_sym1, sym1) ~ lifedxm + (1 | famid)
Fixed coefficients
                      coef exp(coef) se(coef)      z      p
lifedxmCONTROL -1.0576257 0.3472794 0.3872527 -2.73 0.0063
lifedxmMAJOR   -0.6329957 0.5309987 0.3460847 -1.83 0.0670
Random effects
  Group Variable Std Dev    Variance
  famid Intercept 0.9877770 0.9757033

--------------------------------------------------

So it appears that there is a difference in hazard rates associated with 
Mother's diagnosis between BIPOLAR and CONTROL. Just to be safe, I fit a 
model without this covariate present and decided to compare the models 
with AIC and see if fit decreased with this covariate not in the model.

----------------------------------------------------------
   thorp1.n <- coxme(Surv(age_sym1, sym1) ~ (1 | famid), data = bip.surv)
 > thorp1.n
Cox mixed-effects model fit by maximum likelihood
   Data: bip.surv
   events, n = 99, 189 (3 observations deleted due to missingness)
   Iterations= 10 54
                     NULL Integrated Penalized
Log-likelihood -479.0372 -471.3333 -436.0478
                   Chisq    df          p    AIC     BIC
Integrated loglik 15.41 1.00 0.00008663 13.41     10.81
  Penalized loglik 85.98 49.48 0.00099915 -12.97 -141.37
Model:  Surv(age_sym1, sym1) ~ (1 | famid)
Random effects
  Group Variable Std Dev Variance
  famid Intercept 1.050949 1.104495
------------------------------------------------------------

The problem is that the AIC for the model w/o lifedxm is 13.41 and the 
model w/ lifedxm is 17.36. So fit actually improved without lifedxm in 
the model but really the fit is indistinguishable if I use Burnham & 
Anderson, 2002's criteria.

So my conundrum is this - The AIC and the p-values appear to disagree. 
The p-value would suggest that lifedxm should be included in the model 
and the AIC says it doesn't improve fit. In general, I tend to favor the 
AIC (I usually work with large sample sizes and multilevel models) but I 
am just beginning with survival models and I am curious if a survival 
model guru might suggest whether lifedxm needs to be in the model or not 
based on my results here? Would people generally use the AIC in this 
situation?  Also, I can't run anova() on this models because of the 
random effect.

I am happy to provide the data if necessary.

Please cc me on a reply.

Thanks,
Chris