[R] Best performance measure?
Frank E Harrell Jr
f.harrell at vanderbilt.edu
Wed Aug 19 21:11:51 CEST 2009
Noah Silverman wrote:
> Frank,
>
> That makes sense.
>
> I just had a look at the actual algorithm calculating the Brier score.
>
> One thing that confuses me is how the score is calculated.
>
>
>
> If I understand the code correctly, it is just: sum((p - y)^2)/n
>
> If I have an example with a label of 1 and a probability prediction of
> .4, it is (.4 - 1)^2
> (I know it is the average of these values across all the examples)
Yes, and I seem to remember the original score is 1 minus that.
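
For concreteness, a minimal R sketch of that calculation; the data below are made up purely for illustration:

## Brier score as described above: the mean squared difference between
## predicted probabilities and observed 0/1 labels.
set.seed(1)
p <- runif(20)                       # hypothetical predicted probabilities
y <- rbinom(20, size = 1, prob = p)  # hypothetical observed 0/1 labels
mean((p - y)^2)                      # i.e. sum((p - y)^2)/length(y)

## Noah's single-observation example: label 1, predicted probability 0.4
(0.4 - 1)^2   # 0.36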
>
> Wouldn't it make more sense to stratify the probabilities and then check
> the accuracy within each level?
The stratification will bring a great deal of noise into the problem.
Better: loess calibration curves or decomposition of the Brier score
into discrimination and calibration components (which is not in the
software).
Frank
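
A rough sketch of the loess-style calibration curve Frank suggests, reusing the illustrative p and y from the sketch above (this is not the Design code, just base R's lowess()):

## Smooth estimate of P(y = 1) as a function of the predicted probability.
## iter = 0 skips lowess's robustness iterations, which misbehave with 0/1 y.
plot(lowess(p, y, iter = 0), type = "l",
     xlab = "Predicted probability", ylab = "Observed proportion",
     xlim = c(0, 1), ylim = c(0, 1))
abline(0, 1, lty = 2)  # the 45-degree line = perfect calibration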
>
> i.e., for predicted probabilities of .10 to .20, the data was actually
> labeled true 18% of the time, i.e. mean(label) = 0.18.
>
>
>
>
>
>
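
For reference, a sketch of the stratified check Noah describes, together with a bin-based (Murphy-style) decomposition of the Brier score into calibration and discrimination terms of the kind Frank mentions. The 0.1-wide bins are an arbitrary choice, and sparse bins illustrate the noise problem Frank raises:

## Observed frequency of y = 1 within bands of predicted probability.
bins <- cut(p, breaks = seq(0, 1, by = 0.1), include.lowest = TRUE)
n_k  <- tapply(y, bins, length)   # observations per bin
o_k  <- tapply(y, bins, mean)     # observed proportion of 1s per bin
f_k  <- tapply(p, bins, mean)     # mean predicted probability per bin
cbind(n = n_k, observed = o_k, predicted = f_k)

## Murphy-style decomposition: Brier ~ reliability - resolution + uncertainty
## (exact only when forecasts are constant within bins).
obar        <- mean(y)
reliability <- sum(n_k * (f_k - o_k)^2, na.rm = TRUE) / length(y)  # calibration
resolution  <- sum(n_k * (o_k - obar)^2, na.rm = TRUE) / length(y) # discrimination
uncertainty <- obar * (1 - obar)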
> On 8/19/09 11:51 AM, Frank E Harrell Jr wrote:
>> Noah Silverman wrote:
>>> Thanks for the suggestion.
>>>
>>> You explained that the Brier score combines both accuracy and discrimination
>>> ability. If I understand you right, that is in relation to binary
>>> classification.
>>>
>>> I'm not concerned with binary classification, but the accuracy of the
>>> probability predictions.
>>>
>>> Is there some kind of score that measures just the accuracy?
>>>
>>> Thanks!
>>>
>>> -N
>>
>> The Brier score has nothing to do with classification. It is a
>> probability accuracy score.
>>
>> Frank
>>
>>>
>>> On 8/19/09 10:42 AM, Frank E Harrell Jr wrote:
>>>> Noah Silverman wrote:
>>>>> Hello,
>>>>>
>>>>> I'm working on a model to predict probabilities.
>>>>>
>>>>> I don't really care about binary prediction accuracy.
>>>>>
>>>>> I do really care about the accuracy of my probability predictions.
>>>>>
>>>>> Frank was nice enough to point me to the val.prob function from the
>>>>> Design library. It looks very promising for my needs.
>>>>>
>>>>> I've put together some tests and run the val.prob analysis. It
>>>>> produces some very informative graphs along with a bunch of
>>>>> performance measures.
>>>>>
>>>>> Unfortunately, I'm not sure which measure, if any, is the "best"
>>>>> one. I'm comparing hundreds of different models/parameter
>>>>> combinations/etc. So ideally I'd like a single value or two as the
>>>>> "performance measure" for each one. That way I can pick the
>>>>> "best" model from all my experiments.
>>>>>
>>>>> As mentioned above, I'm mainly interested in the accuracy of my
>>>>> probability predictions.
>>>>>
>>>>> Does anyone have an opinion about which measure I should look at??
>>>>> (I see Dxy, C, R2, D, U, Brier, Emax, Eavg, etc.)
>>>>>
>>>>> Thanks!!
>>>>>
>>>>> -N
>>>>
>>>> It all depends on the goal, i.e., the relative value you place on
>>>> absolute accuracy vs. discrimination ability. The Brier score
>>>> combines both and, other than interpretability, has many advantages.
>>>>
>>>> Frank
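
As a minimal usage sketch of the val.prob call discussed in this thread, assuming the illustrative p and y defined above (the element names follow val.prob's printed output):

library(Design)               # val.prob() also lives in Design's successor, rms
stats <- val.prob(p, y = y)   # calibration plot plus Dxy, C, R2, D, U, Brier, ...
stats["Brier"]                # the probability-accuracy measure discussed above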
>>>>
>>>>>
>>>>> ______________________________________________
>>>>> R-help at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>>
>>>>
>>
>>
--
Frank E Harrell Jr   Professor and Chair           School of Medicine
                     Department of Biostatistics   Vanderbilt University