[R] Question about validating predicted probabilities

Sat Aug 22 00:17:20 CEST 2009

Noah Silverman wrote:
> Thanks Frank,
> 
> Two quick questions:
> 
> 1) I see you calculating datadist, but then not using it in the 
> subsequent entries.  Is that for a different application?

That's just to set default plotting limits; plot() picks it up through 
options(datadist='dd').

> 
> 2) I'm less concerned with plotting than the values that were plotted.  
> As mentioned in my original message, the line plotted from the fitted 
> logistic looked great.  I want those values.  Perhaps all I need is the 
> "lrm" line of your example?

The fitted logistic model in your plot is forced to be linear in the 
logit (log odds of the predicted probability).  The spline function 
relaxes that constraint, making it halfway between the linear fit and 
the lowess one.
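
To get the plotted values themselves rather than the graph, something 
along these lines may work.  This is only a sketch in base R (glm plus 
the splines package) rather than Design, and it makes some assumptions: 
the data are simulated stand-ins for your predprob and event, and 
ns(lp, df = 2) is a natural cubic spline swapped in for rcs(..., 3), so 
the knots are not placed where rcs would put them and the curve will 
differ slightly.

```r
library(splines)

# Simulated stand-ins for the poster's data (hypothetical, not from the thread)
set.seed(1)
n        <- 500
predprob <- runif(n, 0.05, 0.95)                          # predicted probabilities
event    <- rbinom(n, 1, plogis(0.8 * qlogis(predprob)))  # observed 0/1 outcomes

lp <- qlogis(predprob)   # logit (log odds) of the predictions

# Calibration fit that is forced to be linear in the logit
fit_lin <- glm(event ~ lp, family = binomial)
cal_lin <- fitted(fit_lin)   # recalibrated probabilities, one per observation

# Relaxed version: natural cubic spline in the logit (assumed stand-in for rcs)
fit_spl <- glm(event ~ ns(lp, df = 2), family = binomial)
cal_spl <- fitted(fit_spl)

# Original predictions next to the two sets of recalibrated values
head(cbind(predprob, cal_lin, cal_spl))
```

For new predictions, predict(fit_lin, newdata = data.frame(lp = 
qlogis(newp)), type = "response") should return values on the same 
scale, where newp is a hypothetical vector of new predicted 
probabilities.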

> 
> 3) Your Design library rocks.  Thank you so much for making it available 
> to the R community!!

Thanks
Frank

> 
> -N
> 
> On 8/21/09 3:00 PM, Frank E Harrell Jr wrote:
>> A parametric version is:
>>
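>> require(Design)
>> dd <- datadist(predprob)   # summarize predprob to set plotting defaults
>> options(datadist='dd')
>> f <- lrm(event ~ rcs(qlogis(predprob), 3))   # 3-knot spline in the logit
>> plot(f, predprob=NA, fun=plogis)   # plot on the probability scale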
>>
>> Frank
>>
>>
>> Noah Silverman wrote:
>>> Hello,
>>>
>>> Frank was nice enough to point me to the val.prob function of the 
>>> Design library.
>>>
>>> It creates a beautiful graph that really helps me visualize how well 
>>> my model is predicting probabilities.
>>>
>>> By default, there are two lines on the graph
>>>     1) fitted logistic calibration curve
>>>     2) nonparametric fit using lowess
>>>
>>> Right now, the nonparametric line doesn't look very good.
>>>
>>> The "fitted logistic" line looks great.  It is right next to the 
>>> "ideal" line!!
>>>
>>> If I am understanding the graph correctly, whatever transformation 
>>> the val.prob is doing to my predicted probability is making it really 
>>> accurate.
>>>
>>> Is there some standard function in R that will let me do the same 
>>> transformation?  (I guess the long way around would be to tear into 
>>> the actual val.prob function and try to reverse engineer what he's 
>>> doing.  But there must be something easier.)
>>>
>>> Does anybody have any suggestions?
>>>
>>> Thanks!
>>>
>>> -N
>>>
>>
>>

-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University