# [R] ROC optimal threshold

Frank E Harrell Jr f.harrell at vanderbilt.edu
Fri Mar 31 18:20:33 CEST 2006

Michael Kubovy wrote:
> Hi Tim and José,
>
>
>>>Date: Fri, 31 Mar 2006 11:58:14 +0200
>>>Subject: [R] ROC optimal threshold
>>>
>>>I am using the ROC package to evaluate predictive models.
>>>I have successfully plotted the ROC curve; however,
>>>
>>>is there any way to obtain the value of the operating point = optimal
>>>threshold value (i.e. the nearest point of the curve to the top-left
>>>corner of the axes)?
>
>
> On Mar 31, 2006, at 8:01 AM, Tim Howard wrote:
>
>
>>I've struggled a bit with the same question, said another way: "how
>>do you find the value in a ROC curve that minimizes false positives
>>while maximizing true positives"?
>>
>>Here's something I've come up with. I'd be curious to hear from the
>>list whether anyone thinks this code might get stuck in local
>>minima, or if it does find the global minimum each time. (I think
>>it's ok).
>>
>>
>>From your ROC object you need to grab the sensitivity (= true
>>positive rate) and specificity (= 1 - false positive rate) and the
>>cutoff levels.  Then find the value that minimizes abs(sensitivity -
>>specificity), or sqrt((1-sens)^2 + (1-spec)^2), as follows:
>>
>>absMin <- extract[which.min(abs(extract$sens-extract$spec)),];
>>sqrtMin <- extract[which.min(sqrt((1-extract$sens)^2+(1-extract$spec)^2)),];
>>
>>In this example, 'extract' is a dataframe containing three columns:
>>extract$sens = sensitivity values, extract$spec = specificity
>>values, extract$votes = cutoff values. The command subsets the
>>dataframe to a single row containing the desired cutoff and the
>>sens and spec values that are associated with it.
>>
>>Most of the time these two answers (abs or sqrt) are the same,
>>sometimes they differ quite a bit.
>>
>>I do not see this application of ROC curves very often. A question
>>for those much more knowledgeable than I.... is there a problem
>>with using ROC curves in this manner?
>>
>>Tim Howard
>
>
> @BOOK{MacmillanCreelman2005,
>   title = {Detection theory: {A} user's guide},
>   publisher = {Lawrence Erlbaum Associates},
>   year = {2005},
>   address = {Mahwah, NJ, USA},
>   edition = {2nd},
>   author = {Macmillan, Neil A and Creelman, C Douglas},
> }
> on p. 43 shows that the ideal value of the cutoff depends on the
> reward function R that specifies the payoff for each outcome:
> $$
> LR(x) = \beta = \frac{R(\text{true negative}) - R(\text{false positive})}{R(\text{true positive}) - R(\text{false negative})}
>         \cdot \frac{p(\text{noise})}{p(\text{signal})}
> $$
>
> I believe that your attempt to minimize false positives while
> maximizing true positives amounts to maximizing the proportion of
> correct answers. For that you just set $\beta = 0$. Otherwise it
> might be best to explicitly state your costs and benefits by
> specifying the reward function R.
> _____________________________
> Professor Michael Kubovy
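Tim's two selection rules are easy to sanity-check on a small table. Below is a minimal sketch, in Python purely for illustration; the `rows` table, the reward values, and the prior probabilities are all made up, with column names mirroring Tim's `extract` data frame:

```python
import math

# Hypothetical ROC table: one row per cutoff, as in Tim's 'extract'.
rows = [
    {"sens": 0.95, "spec": 0.40, "votes": 0.2},
    {"sens": 0.85, "spec": 0.70, "votes": 0.4},
    {"sens": 0.75, "spec": 0.80, "votes": 0.5},
    {"sens": 0.60, "spec": 0.92, "votes": 0.7},
    {"sens": 0.30, "spec": 0.99, "votes": 0.9},
]

# Criterion 1: minimize |sens - spec| (where the curve crosses the
# descending diagonal of ROC space).
abs_min = min(rows, key=lambda r: abs(r["sens"] - r["spec"]))

# Criterion 2: minimize the Euclidean distance to the top-left corner,
# sqrt((1 - sens)^2 + (1 - spec)^2).
dist_min = min(rows, key=lambda r: math.hypot(1 - r["sens"], 1 - r["spec"]))

print(abs_min["votes"], dist_min["votes"])  # both pick cutoff 0.5 here

# Kubovy's likelihood-ratio criterion, with hypothetical equal rewards
# and equal priors: beta reduces to 1, i.e. operate where the slope of
# the ROC curve (the likelihood ratio) equals 1.
R = {"tp": 1.0, "tn": 1.0, "fp": 0.0, "fn": 0.0}
p_noise, p_signal = 0.5, 0.5
beta = (R["tn"] - R["fp"]) / (R["tp"] - R["fn"]) * (p_noise / p_signal)
```

On this particular table the two geometric criteria happen to agree; as Tim notes, on real curves they can differ.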

Choosing cutoffs is fraught with difficulties: arbitrariness,
inefficiency, and the necessity of a complex adjustment for multiple
comparisons in later analysis steps, unless the dataset used to generate
the cutoff was so large it could be considered infinite.

--
Frank E Harrell Jr
Professor and Chair, Department of Biostatistics
Vanderbilt University School of Medicine