[R] Question about validating predicted probabilities
Frank E Harrell Jr
f.harrell at vanderbilt.edu
Sat Aug 22 00:17:20 CEST 2009
Noah Silverman wrote:
> Thanks Frank,
> Two quick questions:
> 1) I see you calculating datadist, but then not using it in the
> subsequent entries. Is that for a different application?
That's just to set default plotting limits.
> 2) I'm less concerned with plotting than the values that were plotted.
> As mentioned in my original message, the line plotted from the fitted
> logistic looked great. I want those values. Perhaps all I need is the
> "lrm" line of your example?
The fitted logistic model in your plot is forced to be linear in the
logit (log odds of pred. prob.). The spline function relaxes that,
making it halfway between the linear one and the loess one.
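
To get the plotted values rather than the plot, something along these
lines may work. It is only a sketch in base R, not the code inside
val.prob: the data are simulated, glm() stands in for lrm(), and
splines::ns() with df = 2 is assumed as a rough stand-in for
rcs(..., 3). The names lp, f_lin, f_spl, grid and calib are made up for
the illustration; predprob and event follow the names used above.

```r
library(splines)

# Simulated data (assumption): well-calibrated predictions by construction.
set.seed(1)
n <- 500
predprob <- runif(n, 0.05, 0.95)
event <- rbinom(n, 1, predprob)

# Logit (log odds) of the predicted probabilities.
lp <- qlogis(predprob)

# Calibration curve forced to be linear in the logit.
f_lin <- glm(event ~ lp, family = binomial)

# Natural cubic spline in the logit relaxes the linearity assumption.
f_spl <- glm(event ~ ns(lp, df = 2), family = binomial)

# Calibrated probabilities on a grid of predicted probabilities.
grid <- data.frame(lp = qlogis(seq(0.05, 0.95, by = 0.05)))
calib <- data.frame(
  predprob = plogis(grid$lp),
  linear   = predict(f_lin, newdata = grid, type = "response"),
  spline   = predict(f_spl, newdata = grid, type = "response")
)
print(head(calib))
```

The linear and spline columns hold the calibrated probabilities for each
predicted probability in the grid. Replacing newdata = grid with the
original lp gives calibrated values for the observations themselves.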
> 3) Your Design library rocks. Thank you so much for making it available
> to the R community!!
> On 8/21/09 3:00 PM, Frank E Harrell Jr wrote:
>> A parametric version is:
>> dd <- datadist(predprob); options(datadist='dd')
>> f <- lrm(event ~ rcs(qlogis(predprob), 3))
>> plot(f, predprob=NA, fun=plogis)
>> Noah Silverman wrote:
>>> Frank was nice enough to point me to the val.prob function of the
>>> Design library.
>>> It creates a beautiful graph that really helps me visualize how well
>>> my model is predicting probabilities.
>>> By default, there are two lines on the graph
>>> 1) fitted logistic calibration curve
>>> 2) nonparametric fit using lowess
>>> Right now, the nonparametric line doesn't look very good.
>>> The "fitted logistic" line looks great. It is right next to the
>>> "ideal" line!!
>>> If I am understanding the graph correctly, whatever transformation
>>> the val.prob is doing to my predicted probability is making it really
>>> Is there some standard function in R that will let me do the same
>>> transformation? (I guess the long way around would be to tear into
>>> the actual val.prob function and try to reverse engineer what he's
>>> doing. But there must be something easier.)
>>> Anybody have any suggestions?
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University