[R] How to calculate confidence interval of C statistic by rcorr.cens

Sun May 22 17:56:47 CEST 2011

Thank you for your comment, Prof Harrell.

I changed the function as follows:

CstatisticCI <- function(x)   # x is the object returned by rcorr.cens
   {
     # S.D. is the standard error of Dxy; C = 0.5 + Dxy/2, so SE(C) = S.D./2
     se <- x["S.D."]/2
     Low95 <- x["C Index"] - 1.96*se
     Upper95 <- x["C Index"] + 1.96*se

     cbind(x["C Index"], Low95, Upper95)
   }

 > CstatisticCI(MyModel.lrm.penalized.rcorr)
                       Low95   Upper95
C Index 0.8222785 0.7195828 0.9249742
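
As a check on the arithmetic, here is a minimal sketch that reproduces the
interval by hand. It uses only the numbers printed in the rcorr.cens output
quoted further down; the variable names (c_index, dxy, sd_dxy, se_c) are my
own and are not part of rcorr.cens. It assumes a normal approximation with
the same 1.96 multiplier as the function above.

```r
# Values copied from the printed rcorr.cens output (not recomputed here).
c_index <- 0.8222785
dxy     <- 0.6445570
sd_dxy  <- 0.1047916   # "S.D." is the standard error of Dxy

# C and Dxy are linked by C = 0.5 + Dxy/2, so SE(C) is half of SE(Dxy).
stopifnot(abs((0.5 + dxy / 2) - c_index) < 1e-7)
se_c <- sd_dxy / 2

# Normal-approximation 95% interval for C.
ci <- c(C = c_index,
        Low95   = c_index - 1.96 * se_c,
        Upper95 = c_index + 1.96 * se_c)
print(round(ci, 7))
```

The same interval expressed on the Dxy scale would be Dxy +/- 1.96*S.D.,
which maps back to the C scale through the same relation.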

This CI is wider than the previous, incorrect one.
Regarding your comment on overfitting: this is the sample that was used for
model development. However, I penalized the model using pentrace and lrm in
the rms package, and the CI above is for the penalized model. The
validation results for each model are as follows:

First model
 > validate(MyModel.lrm, bw=F, B=1000)
           index.orig training    test optimism index.corrected    n
Dxy           0.6385   0.6859  0.6198   0.0661          0.5724 1000
R2            0.3745   0.4222  0.3388   0.0834          0.2912 1000
Intercept     0.0000   0.0000 -0.1446   0.1446         -0.1446 1000
Slope         1.0000   1.0000  0.8266   0.1734          0.8266 1000
Emax          0.0000   0.0000  0.0688   0.0688          0.0688 1000
D             0.2784   0.3248  0.2474   0.0774          0.2010 1000
U            -0.0192  -0.0192  0.0200  -0.0392          0.0200 1000
Q             0.2976   0.3440  0.2274   0.1166          0.1810 1000
B             0.1265   0.1180  0.1346  -0.0167          0.1431 1000
g             1.7010   2.0247  1.5763   0.4484          1.2526 1000
gp            0.2414   0.2512  0.2287   0.0225          0.2189 1000

Penalized model
 > validate(MyModel.lrm.penalized, bw=F, B=1000)
           index.orig training    test optimism index.corrected    n
Dxy           0.6446   0.6898  0.6256   0.0642          0.5804 1000
R2            0.3335   0.3691  0.3428   0.0264          0.3072 1000
Intercept     0.0000   0.0000  0.0752  -0.0752          0.0752 1000
Slope         1.0000   1.0000  1.0547  -0.0547          1.0547 1000
Emax          0.0000   0.0000  0.0249   0.0249          0.0249 1000
D             0.2718   0.2744  0.2507   0.0236          0.2481 1000
U            -0.0192  -0.0192 -0.0027  -0.0165         -0.0027 1000
Q             0.2910   0.2936  0.2534   0.0402          0.2508 1000
B             0.1279   0.1192  0.1336  -0.0144          0.1423 1000
g             1.3942   1.5259  1.5799  -0.0540          1.4482 1000
gp            0.2141   0.2188  0.2298  -0.0110          0.2251 1000

The optimism of the intercept and slope changed from 0.1446 and 0.1734 to
-0.0752 and -0.0547, respectively, and Emax improved from 0.0688 to
0.0249. I therefore think overfitting was reduced at least to some
extent, although I am not sure whether this improvement is enough.
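
To put the validation results on the same scale as the C statistic, the
sketch below converts the Dxy rows of the two validate() tables using
C = 0.5 + Dxy/2. The numbers are copied from the tables above; the object
names (dxy, c_stat) are only for illustration. Note that this gives
optimism-corrected point estimates only, not a confidence interval.

```r
# index.orig and index.corrected for Dxy, copied from the validate() output.
dxy <- rbind(first     = c(orig = 0.6385, corrected = 0.5724),
             penalized = c(orig = 0.6446, corrected = 0.5804))

# Convert Somers' Dxy to the C statistic: C = 0.5 + Dxy/2.
c_stat <- 0.5 + dxy / 2
print(round(c_stat, 4))
```

With these numbers the optimism-corrected C is about 0.79 for both models,
compared with an apparent C of about 0.82.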

--
Kohkichi

(11/05/22 23:27), Frank Harrell wrote:
> S.D. is the standard deviation (standard error) of Dxy.  It already includes
> the effective sample size in its computation so the sqrt(n) term is not
> needed.  The help file for rcorr.cens has an example where the confidence
> interval for C is computed.  Note that you are making the strong assumption
> that there is no overfitting in the model or that you are evaluating C on a
> sample not used in model development.
> Frank
>
>
> Kohkichi wrote:
>>
>> Hi,
>>
>> I'm trying to calculate the 95% confidence interval of the C statistic
>> of a logistic regression model using rcorr.cens in the rms package. I
>> wrote a brief function for this purpose, as follows:
>>
>> CstatisticCI<- function(x)   # x is object of rcorr.cens.
>>    {
>>      se<- x["S.D."]/sqrt(x["n"])
>>      Low95<- x["C Index"] - 1.96*se
>>      Upper95<- x["C Index"] + 1.96*se
>>      cbind(x["C Index"], Low95, Upper95)
>>    }
>>
>> Then,
>>
>>> MyModel.lrm.rcorr<- rcorr.cens(x=predict(MyModel.lrm), S=df$outcome)
>>> MyModel.lrm.rcorr
>>        C Index            Dxy           S.D.              n        missing     uncensored
>>      0.8222785      0.6445570      0.1047916    104.0000000      0.0000000    104.0000000
>> Relevant Pairs     Concordant      Uncertain
>>    3950.0000000   3248.0000000      0.0000000
>>
>>> CstatisticCI(x5factor_final.lrm.pen.rcorr)
>>                        Low95   Upper95
>> C Index 0.8222785 0.8021382 0.8424188
>>
>> I'm not sure what "S.D." in the object returned by rcorr.cens means. Is
>> it the standard deviation of "C Index" or the standard deviation of "Dxy"?
>> I assumed it is the standard deviation of "C Index", so I wrote the
>> code above. Am I right?
>>
>> Thank you in advance for any help.
>>
>> --
>> Kohkichi Hosoda M.D.
>>
>>      Department of Neurosurgery,
>>      Kobe University Graduate School of Medicine,
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
> -----
> Frank Harrell
> Department of Biostatistics, Vanderbilt University
> --
> View this message in context: http://r.789695.n4.nabble.com/How-to-calculate-confidence-interval-of-C-statistic-by-rcorr-cens-tp3541709p3542163.html
> Sent from the R help mailing list archive at Nabble.com.