[R] Question regarding significance of a covariate in a coxme survival model

Teresa Iglesias tliglesias at ucdavis.edu
Fri Aug 27 22:32:06 CEST 2010


Christopher David Desjardins <desja004 <at> umn.edu> writes:

> 
> Hi,
> I am running a Cox Mixed Effects Hazard model using the library coxme. I 
> am trying to model time to onset (age_sym1) of thought problems (e.g. 
> hearing voices) (sym1).  As I have siblings in my dataset,  I have 
> decided to account for this by including a random effect for family 
> (famid). My covariate of interest is Mother's diagnosis where a 0 is 
> bipolar, 1 is control, and 2 is major depression.  I am trying to fit 
> the following model.
> 
> thorp1 <- coxme(Surv(age_sym1, sym1) ~ lifedxm + (1 | famid), data = 
> bip.surv)
> 
> Which provides the following output:
> 
> -------------------------------------------------
>  > thorp1
> Cox mixed-effects model fit by maximum likelihood
>    Data: bip.surv
>    events, n = 99, 189 (3 observations deleted due to missingness)
>    Iterations= 10 54
>                      NULL Integrated Penalized
> Log-likelihood -479.0372 -467.3549 -435.2096
>                    Chisq    df          p   AIC     BIC
> Integrated loglik 23.36 3.00 3.3897e-05 17.36     9.58
>   Penalized loglik 87.66 47.27 3.2374e-04 -6.88 -129.54
> Model: Surv(age_sym1, sym1) ~ lifedxm + (1 | famid)
> Fixed coefficients
>                       coef exp(coef) se(coef)      z      p
> lifedxmCONTROL -1.0576257 0.3472794 0.3872527 -2.73 0.0063
> lifedxmMAJOR   -0.6329957 0.5309987 0.3460847 -1.83 0.0670
> Random effects
>   Group Variable Std Dev    Variance
>   famid Intercept 0.9877770 0.9757033
> 
> --------------------------------------------------
> 
> So it appears that there is a difference in hazard rates associated with 
> Mother's diagnosis between BIPOLAR and CONTROL. Just to be safe, I fit a 
> model without this covariate present and decided to compare the models 
> with AIC and see if fit decreased with this covariate not in the model.
> 
> ----------------------------------------------------------
>    thorp1.n <- coxme(Surv(age_sym1, sym1) ~ (1 | famid), data = bip.surv)
>  > thorp1.n
> Cox mixed-effects model fit by maximum likelihood
>    Data: bip.surv
>    events, n = 99, 189 (3 observations deleted due to missingness)
>    Iterations= 10 54
>                      NULL Integrated Penalized
> Log-likelihood -479.0372 -471.3333 -436.0478
>                    Chisq    df          p    AIC     BIC
> Integrated loglik 15.41 1.00 0.00008663 13.41     10.81
>   Penalized loglik 85.98 49.48 0.00099915 -12.97 -141.37
> Model:  Surv(age_sym1, sym1) ~ (1 | famid)
> Random effects
>   Group Variable Std Dev Variance
>   famid Intercept 1.050949 1.104495
> ------------------------------------------------------------
> 
> The problem is that the AIC for the model w/o lifedxm is 13.41 and the 
> model w/ lifedxm is 17.36. So fit actually improved without lifedxm in 
> the model but really the fit is indistinguishable if I use Burnham & 
> Anderson, 2002's criteria.
> 
> So my conundrum is this - The AIC and the p-values appear to disagree. 
> The p-value would suggest that lifedxm should be included in the model 
> and the AIC says it doesn't improve fit. In general, I tend to favor the 
> AIC (I usually work with large sample sizes and multilevel models) but I 
> am just beginning with survival models and I am curious if a survival 
> model guru might suggest whether lifedxm needs to be in the model or not 
> based on my results here? Would people generally use the AIC in this 
> situation?  Also, I can't run anova() on this models because of the 
> random effect.
> 
> I am happy to provide the data if necessary.
> 
> Please cc me on a reply.
> 
> Thanks,
> Chris
> 
> 

Hi Chris,
Did you get an answer to why the p-value and AIC score disagreed? 
I am getting the same results with my own data. They're not small 
disagreements either. The AIC score is telling me the opposite of 
what the p-value and the parameter coef are saying.  
The p-value and the coef for the predictor variable are in agreement.

I've also noticed in some published papers with tables containing 
p-values and AIC scores that the chisq p-value and AIC score 
are in opposition too. This makes me think I'm missing something 
and that this all actually makes sense.

However given that AIC = − 2 (log L) + 2K (where K is the number of parameters)
and seeing as how the log-likelihood scores below are in the hundreds, shouldn't
the AIC score be in the hundreds as well?? 


--------------------------------------

model0 (intercept only model)
                               NULL Integrated Penalized
Log-likelihood -119.8470  -117.9749 -113.1215

                         Chisq   df        p           AIC    BIC
Integrated loglik  3.74 1.00 0.052989  1.74   0.08
Penalized loglik 13.45 7.06 0.063794 -0.68 -12.43


Random effects
Group   Variable  Std Dev   Variance 
Site    Intercept    0.6399200 0.4094977


_____
model1 (before vs after)
                             NULL Integrated Penalized
Log-likelihood -119.8470  -112.1598 -108.1663

                               Chisq   df          p        AIC   BIC
Integrated loglik 15.37 2.00 0.00045863 11.37  8.05
Penalized loglik 23.36 7.06 0.00153710  9.25 -2.49

Fixed coefficients 
                Coef       exp(coef)  se(coef)         z       p
Time  -1.329997 0.2644779 0.4009229 -3.32 0.00091

Random effects
 Group   Variable  Std Dev   Variance 
 Site    Intercept 0.5625206 0.3164294

______
 model2 (stim1 vs stim2)
                               NULL Integrated Penalized
Log-likelihood -119.8470  -117.8677  -113.167

                              Chisq   df        p      AIC    BIC
Integrated loglik  3.96 2.00 0.138160 -0.04  -3.37
 Penalized loglik 13.36 7.83 0.093314 -2.31 -15.34

Fixed coefficients
             coef          exp(coef)  se(coef)       z       p
stim 0.1621367  1.176021 0.3534788 0.46 0.65

Random effects
 Group  Variable  Std Dev   Variance 
 Site   Intercept   0.6262600 0.3922016
----------------------------------------------

If you didn't get an answer, hopefully, someone that understands all this will
reply for both of us. 

Thanks,
-Teresa



More information about the R-help mailing list