[R] upperbound of C index Conf.int. greater than 1
Frank E Harrell Jr
f.harrell at vanderbilt.edu
Wed May 14 15:05:50 CEST 2008
DAVID ARTETA GARCIA wrote:
>
>
> Dear Frank
>
> Frank E Harrell Jr <f.harrell at vanderbilt.edu> wrote:
>
>>
>> A few observations.
>>
>> 1. With minimal overfitting, rcorr.cens(predict(fit), Y) gives a good
>> standard error for Dxy = 2*(C - .5), and bootstrapping isn't really
>> necessary.
>>
>> 2. If you bootstrap use the nonparametric bootstrap percentile method
>> or other methods that constrain the confidence interval to be in [0,1].
>>
>> 3. I don't know why the model would be linear on the two predictors you
>> are using.
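[A minimal sketch of points 1 and 2 above, on simulated data; the
variable names x1, x2, y and the sample size are placeholders, and the
Hmisc and boot packages are assumed to be installed. Bootstrapping only
the linear predictor, as here, ignores model-refitting uncertainty;
refitting inside each resample would be more rigorous.]

```r
library(Hmisc)   # rcorr.cens
library(boot)

set.seed(1)
n  <- 200
x1 <- rnorm(n); x2 <- rnorm(n)                  # hypothetical predictors
y  <- rbinom(n, 1, plogis(x1 + 0.5 * x2))       # hypothetical binary outcome
lp <- glm(y ~ x1 + x2, family = binomial)$linear.predictors

## Point 1: rcorr.cens returns Dxy = 2*(C - .5) with its standard error
r <- rcorr.cens(lp, y)
r["Dxy"]; r["S.D."]                             # S.D. is the SE of Dxy

## Point 2: nonparametric percentile bootstrap; since each resampled
## C index lies in [0, 1], the percentile interval is constrained too
cstat <- function(d, i) rcorr.cens(d$lp[i], d$y[i])["C Index"]
b <- boot(data.frame(lp = lp, y = y), cstat, R = 500)
boot.ci(b, type = "perc")
```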
>
> Do you mean fitting these predictors with spline functions? I have
> read about them in your "Regression Modeling Strategies", but I am not
> sure I understand how to use them. I will read through it again.
Yes, or other types of splines. In general I don't expect things to be
linear. If you have enough data you can always allow for nonlinearity.
The book has a strategy for allocating degrees of freedom based on the
predictive potential of each variable, and the following strategy also
works:
f <- lrm(y ~ rcs(x1,5) + rcs(x2,5))
plot(anova(f))
The plot shows each predictor's partial Wald chi-square minus its
degrees of freedom. It masks the separate contribution of the nonlinear
terms, so it does not bias you as long as you agree to devote at least
one d.f. (a linear fit) to each predictor. You can then reduce the d.f.
or force linearity for the variables with lower overall partial
chi-squares. If you have already used y to screen predictors down to x1
and x2, all bets are off.
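[A sketch of the strategy above on simulated data; x1, x2, y, the knot
counts, and the decision to linearize x1 are all illustrative, and the
rms package is assumed.]

```r
library(rms)

set.seed(2)
n  <- 300
x1 <- rnorm(n); x2 <- rnorm(n)                  # hypothetical predictors
y  <- rbinom(n, 1, plogis(x1 + x2^2 - 1))       # nonlinear in x2 by design

dd <- datadist(x1, x2); options(datadist = "dd")

## Fit both predictors with 5-knot restricted cubic splines (4 d.f. each)
f <- lrm(y ~ rcs(x1, 5) + rcs(x2, 5))

## Dot chart of partial Wald chi-square minus d.f., one row per predictor;
## it pools linear and nonlinear terms, so it does not reveal which
## individual terms are "significant"
plot(anova(f))

## Spend d.f. where the plot ranks predictors highest, e.g. keep the
## spline for x2 and force linearity for x1:
g <- lrm(y ~ x1 + rcs(x2, 5))
```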
Frank
>
> David
>
>>
>> Frank
>>
>> --
>> Frank E Harrell Jr Professor and Chair School of Medicine
>> Department of Biostatistics Vanderbilt University
>