[R] ROC optimal threshold

Michael Kubovy kubovy at virginia.edu
Fri Mar 31 15:54:12 CEST 2006

Hi Tim and José,

>> Date: Fri, 31 Mar 2006 11:58:14 +0200
>> From: "Anadon Herrera, Jose Daniel" <jdanadon at umh.es>
>> Subject: [R] ROC optimal threshold
>>
>> I am using the ROC package to evaluate predictive models.
>> I have successfully plotted the ROC curve; however,
>>
>> is there any way to obtain the value of the operating point (the
>> optimal threshold value, i.e. the point of the curve nearest to the
>> top-left corner of the axes)?

On Mar 31, 2006, at 8:01 AM, Tim Howard wrote:

> I've struggled a bit with the same question, said another way: "how
> do you find the value in a ROC curve that minimizes false positives
> while maximizing true positives"?
>
> Here's something I've come up with. I'd be curious to hear from the
> list whether anyone thinks this code might get stuck in local
> minima, or if it does find the global minimum each time. (I think
> it's ok).
>
> From your ROC object you need to grab the sensitivity (= true
> positive rate) and specificity (= 1 - false positive rate) and the
> cutoff levels. Then find the value that minimizes
> abs(sensitivity - specificity), or
> sqrt((1-sens)^2 + (1-spec)^2), as follows:
>
> absMin <- extract[which.min(abs(extract$sens - extract$spec)), ]
> sqrtMin <- extract[which.min(sqrt((1 - extract$sens)^2 + (1 - extract$spec)^2)), ]
>
> In this example, 'extract' is a dataframe containing three columns:
> extract$sens = sensitivity values, extract$spec = specificity
> values, extract$votes = cutoff values. The command subsets the
> dataframe to a single row containing the desired cutoff and the
> sens and spec values that are associated with it.
>
> Most of the time these two answers (abs or sqrt) are the same,
> sometimes they differ quite a bit.
>
> I do not see this application of ROC curves very often. A question
> for those much more knowledgeable than I.... is there a problem
> with using ROC curves in this manner?
>
> Tim Howard

@BOOK{MacmillanCreelman2005,
  title = {Detection theory: {A} user's guide},
  publisher = {Lawrence Erlbaum Associates},
  year = {2005},
  address = {Mahwah, NJ, USA},
  edition = {2nd},
  author = {Macmillan, Neil A and Creelman, C Douglas},
}

on p. 43 shows that the ideal value of the cutoff depends on the
reward function R that specifies the payoff for each outcome:

$LR(x) = \beta = \frac{R(true negative) - R(false positive)}{R(true positive) - R(false negative)} \frac{p(noise)}{p(signal)}$

I believe that your attempt to minimize false positives while
maximizing true positives amounts to maximizing the proportion of
correct answers. For that the payoffs are symmetric, so you just set
$\beta = \frac{p(noise)}{p(signal)}$, which is $\beta = 1$ (i.e.
$\log \beta = 0$) when signal and noise are equally likely. Otherwise it
might be best to explicitly state your costs and benefits by
specifying the reward function R.
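
As a rough sketch of both approaches, the R code below first applies
Tim's two criteria to a data frame shaped like his 'extract', and then
computes the cutoff implied by a reward function. Everything here is an
illustrative assumption rather than something from this thread: the
scores are simulated, 'dprime' is an arbitrary class separation,
'beta_cutoff' is a hypothetical helper, and the conversion from beta to
a cutoff holds only for the equal-variance Gaussian model, where
log LR(x) = dprime * (x - dprime / 2).

```r
## Sketch only: simulated scores, not data from this thread.
set.seed(1)
n      <- 200
dprime <- 1.5                            # assumed separation of the two classes
truth  <- rep(c(0, 1), each = n)         # 0 = noise, 1 = signal
score  <- rnorm(2 * n, mean = truth * dprime)

## Build a data frame shaped like Tim's 'extract': one row per cutoff.
cuts    <- sort(unique(score))
extract <- data.frame(
  votes = cuts,
  sens  = sapply(cuts, function(k) mean(score[truth == 1] >= k)),
  spec  = sapply(cuts, function(k) mean(score[truth == 0] <  k))
)

## Tim's two criteria: equal sensitivity and specificity, and the point
## of the curve closest to the top-left corner.
absMin  <- extract[which.min(abs(extract$sens - extract$spec)), ]
sqrtMin <- extract[which.min(sqrt((1 - extract$sens)^2 + (1 - extract$spec)^2)), ]

## Hypothetical helper: cutoff implied by a reward function, valid only
## under the equal-variance Gaussian assumption stated above.
beta_cutoff <- function(R_tn, R_fp, R_tp, R_fn, p_signal, dprime) {
  beta <- (R_tn - R_fp) / (R_tp - R_fn) * (1 - p_signal) / p_signal
  dprime / 2 + log(beta) / dprime
}

## Equal payoffs and equal priors give beta = 1, i.e. the midpoint dprime / 2.
symmetric <- beta_cutoff(1, 0, 1, 0, p_signal = 0.5, dprime = dprime)
## Penalizing a false positive (reward -4) moves the cutoff up.
costly_fp <- beta_cutoff(1, -4, 1, 0, p_signal = 0.5, dprime = dprime)

print(rbind(absMin = absMin, sqrtMin = sqrtMin))
print(c(symmetric = symmetric, costly_fp = costly_fp))
```

With symmetric payoffs the three cutoffs should land close to one
another; once the payoffs are asymmetric, the reward-based cutoff moves
away from the two geometric ones, which is the reason for stating the
costs and benefits explicitly.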
_____________________________
Professor Michael Kubovy
University of Virginia
Department of Psychology
USPS:     P.O.Box 400400    Charlottesville, VA 22904-4400
Parcels:    Room 102        Gilmer Hall
McCormick Road    Charlottesville, VA 22903
Office:    B011    +1-434-982-4729
Lab:        B019    +1-434-982-4751
Fax:        +1-434-982-4766
WWW:    http://www.people.virginia.edu/~mk9y/