[R] Discrepancy in the regression coefficients for Cox regression - PBC data set
Ravi Varadhan
RVaradhan at jhmi.edu
Fri Nov 21 20:30:26 CET 2008
Peter,
I did check the data in the Appendix of F&H against the data in the "survival"
package. I couldn't find any differences in the "time" and "status"
variables.
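For anyone repeating this check, here is a minimal sketch of a row-by-row comparison. The helper name `diff_rows` and the two toy data frames are hypothetical stand-ins: the F&H appendix data would have to be keyed in or read from a file first, and both data sets are assumed to share an `id` column.

```r
# Hypothetical helper: report rows where two data frames disagree on the
# given columns, after matching on a shared id column.
diff_rows <- function(a, b, cols, id = "id") {
  m <- merge(a[, c(id, cols)], b[, c(id, cols)], by = id,
             suffixes = c(".a", ".b"))
  differs <- rep(FALSE, nrow(m))
  for (v in cols) {
    x <- m[[paste0(v, ".a")]]
    y <- m[[paste0(v, ".b")]]
    # A pair counts as different if exactly one value is NA,
    # or if both are present and unequal
    differs <- differs | (is.na(x) != is.na(y)) |
      (!is.na(x) & !is.na(y) & x != y)
  }
  m[differs, , drop = FALSE]
}

# Toy stand-ins for the book data and the package data (not real PBC values)
book <- data.frame(id = 1:4, time = c(400, 4500, 1012, 1925),
                   status = c(1, 0, 1, 1))
pkg  <- data.frame(id = 1:4, time = c(400, 4500, 1012, 1930),
                   status = c(1, 0, 1, 1))

mismatch <- diff_rows(book, pkg, cols = c("time", "status"))
print(mismatch)   # only id 4 differs, in "time"
```

An empty result from a check like this would only say that the listed columns agree; it says nothing about the covariates.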
Maybe Terry Therneau knows the answer?
Ravi.
-----------------------------------------------------------------------------------
Ravi Varadhan, Ph.D.
Assistant Professor, The Center on Aging and Health
Division of Geriatric Medicine and Gerontology
Johns Hopkins University
Ph: (410) 502-2619
Fax: (410) 614-9625
Email: rvaradhan at jhmi.edu
Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html
------------------------------------------------------------------------------------
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Peter Dalgaard
Sent: Friday, November 21, 2008 1:58 PM
To: Ravi Varadhan
Cc: r-help at r-project.org
Subject: Re: [R] Discrepancy in the regression coefficients for Cox
regression - PBC data set
Ravi Varadhan wrote:
> Hi David,
>
> I did look at Appendix D.3 of T&G, but I am not sure whether the data set
> analyzed in F&H and the one shipped with "survival" are different. Both
> have n=418 (312 from the RCT and 106 observational).
Well, as David implies, if the observation times are longer and a few more
people died, that could easily explain the differences.
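One quick way to see whether two versions of a data set differ in this way is to compare the event count and the total follow-up. A rough sketch, with a hypothetical helper `followup_summary` and made-up numbers rather than the PBC values:

```r
# Hypothetical helper: number of subjects, number of events, and total
# follow-up time for one version of a survival data set.
followup_summary <- function(time, event) {
  c(n = length(time), events = sum(event), total_followup = sum(time))
}

# Made-up example: the "later" version follows the same four subjects
# longer, and one previously censored subject has since died.
early <- data.frame(time = c(100, 250, 300, 300), event = c(1, 0, 0, 1))
later <- data.frame(time = c(100, 400, 520, 300), event = c(1, 1, 0, 1))

s_early <- followup_summary(early$time, early$event)
s_later <- followup_summary(later$time, later$event)
print(rbind(early = s_early, later = s_later))
```

With the package data one might pass `pbc$time` and an event indicator built from `pbc$status`; how status is coded depends on the version of "survival", so it is worth looking at `table(pbc$status)` first.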
Someone borrowed our copy of F&H so I can't check, but presumably you have
one (and it is your problem anyway...).
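On the convergence question raised in the original post: one way to probe it is to refit with a much tighter tolerance and see whether the coefficients move. This is only a sketch, and it assumes a recent "survival" in which the albumin column is named `albumin` and death is coded as `status == 2` (both differ from the 2008 code quoted below). `coxph.control()` and its `eps` and `iter.max` arguments are part of the package.

```r
library(survival)

# Assumption: current pbc coding, where status == 2 means death and the
# albumin column is called "albumin". Adjust for older package versions.
f <- Surv(time, status == 2) ~ log(bili) + log(albumin) + age +
  log(protime) + edema

fit_default <- coxph(f, data = pbc)
fit_tight   <- coxph(f, data = pbc,
                     control = coxph.control(eps = 1e-12, iter.max = 100))

# Largest change in any coefficient when the tolerance is tightened
max_shift <- max(abs(coef(fit_default) - coef(fit_tight)))
print(max_shift)

# Number of Newton-Raphson iterations used by each fit
print(c(default = fit_default$iter, tight = fit_tight$iter))
```

If the shift is negligible, convergence is unlikely to account for a gap as large as 0.66 vs 0.86 in the "edema" coefficient.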
>
> There is a major difference in the coefficient for "edema": 0.66 vs
> 0.86. In any case, the point is not whether the differences in the
> coefficients affect interpretation of the model, but to understand why
> the results differ.
>
> Best,
> Ravi.
>
> -----Original Message-----
> From: David Winsemius [mailto:dwinsemius at comcast.net]
> Sent: Friday, November 21, 2008 12:34 PM
> To: Ravi Varadhan
> Cc: r-help at r-project.org
> Subject: Re: [R] Discrepancy in the regression coefficients for Cox
> regression - PBC data set
>
> There is a discussion in Appendix D.3 of "Modeling Survival Data" by
> Therneau and Grambsch regarding the differences in the datasets,
> including the fact that "there was significantly more follow-up for
> many patients at the time this dataset was assembled". I do not see a
> material difference in the estimates.
>
> --
> David Winsemius, MD
> Heritage Labs
>
> On Nov 21, 2008, at 12:16 PM, Ravi Varadhan wrote:
>
>> Hi,
>>
>> When I run the following Cox proportional hazards model on the Mayo
>> clinic's PBC data set (given in the "survival" package), the
>> regression coefficients do not agree with the results presented in
>> Table 4.6.3 (p. 195) of Fleming & Harrington's book.
>>
>> library(survival)
>>
>> data(pbc)
>>
>> ans.cox <- coxph(Surv(time, status) ~ log(bili) + log(alb) + age +
>> log(protime) + edema)
>>
>> ans.cox
>>
>>> ans.cox <- coxph(Surv(time, status) ~ log(bili) + log(alb) + age +
>> log(protime) + edema)
>>> ans.cox
>> Call:
>> coxph(formula = Surv(time, status) ~ log(bili) + log(alb) + age +
>> log(protime) + edema)
>>
>>
>>                 coef exp(coef) se(coef)     z       p
>> log(bili)     0.8975     2.453  0.08271 10.85 0.0e+00
>> log(alb)     -2.4524     0.086  0.65707 -3.73 1.9e-04
>> age           0.0382     1.039  0.00768  4.97 6.5e-07
>> log(protime)  2.3458    10.442  0.77425  3.03 2.4e-03
>> edema         0.6613     1.937  0.20595  3.21 1.3e-03
>>
>> Likelihood ratio test=234 on 5 df, p=0 n= 418
>>
>> These coefficients, however, differ substantially (i.e. the differences
>> can't be attributed to round-off alone) from those reported in the
>> "Final model" column of Table 4.6.3 of Fleming and Harrington (p. 195).
>> The coefficients reported there are: 0.8707, -2.533, 0.0394, 2.380,
>> 0.8592.
>> Note the big difference for the "edema" variable.
>>
>> It seems that the data set considered in the book and the one available
>> in the "survival" package are the same (with n=418).
>>
>> I also re-ran the Cox PH model with the 2 "data-errors" discussed on
>> p. 188 of F&H, but I still could not match the results in Table 4.6.3.
>>
>> Is it possible that the results could be explained by differences
>> in convergence during maximization of the partial likelihood?
>>
>> Can anyone help me figure out why this discrepancy exists?
>>
>> Thanks very much,
>> Ravi.
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907