[R] ROC optimal threshold
Michael Kubovy
kubovy at virginia.edu
Fri Mar 31 15:54:12 CEST 2006
Hi Tim and José,
>> Date: Fri, 31 Mar 2006 11:58:14 +0200
>> From: "Anadon Herrera, Jose Daniel" <jdanadon at umh.es>
>> Subject: [R] ROC optimal threshold
>>
>> I am using the ROC package to evaluate predictive models.
>> I have successfully plotted the ROC curve; however,
>>
>> is there any way to obtain the value of the operating point (optimal
>> threshold value, i.e. the nearest point of the curve to the top-left
>> corner of the axes)?
On Mar 31, 2006, at 8:01 AM, Tim Howard wrote:
> I've struggled a bit with the same question, said another way: "how
> do you find the value in a ROC curve that minimizes false positives
> while maximizing true positives"?
>
> Here's something I've come up with. I'd be curious to hear from the
> list whether anyone thinks this code might get stuck in local
> minima, or if it does find the global minimum each time. (I think
> it's ok).
>
> From your ROC object you need to grab the sensitivity (= true
> positive rate), the specificity (= 1 - false positive rate) and the
> cutoff levels. Then find the value that minimizes
> abs(sensitivity - specificity), or sqrt((1-sens)^2 + (1-spec)^2),
> as follows:
>
> # row where sensitivity and specificity are closest
> absMin <- extract[which.min(abs(extract$sens - extract$spec)), ]
> # row closest to the top-left corner (sens = 1, spec = 1)
> sqrtMin <- extract[which.min(sqrt((1 - extract$sens)^2 + (1 - extract$spec)^2)), ]
>
> In this example, 'extract' is a dataframe containing three columns:
> extract$sens = sensitivity values, extract$spec = specificity
> values, extract$votes = cutoff values. The command subsets the
> dataframe to a single row containing the desired cutoff and the
> sens and spec values that are associated with it.
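A minimal sketch of the two criteria on a toy data frame, in case it helps to see them side by side. The numbers in 'extract' are made up for illustration (they do not come from any ROC object), and 'dist_corner' is just a helper name. Since which.min() scans every row, the result is the global minimum over the cutoffs that are actually in the data frame, so getting stuck in a local minimum should not be an issue.

```r
# Hypothetical toy data with the three columns described above.
extract <- data.frame(
  votes = c(0.1, 0.3, 0.5, 0.7, 0.9),       # cutoff values (made up)
  sens  = c(0.99, 0.95, 0.80, 0.55, 0.20),  # sensitivity at each cutoff
  spec  = c(0.30, 0.70, 0.75, 0.95, 0.99)   # specificity at each cutoff
)

# Criterion 1: cutoff where sensitivity and specificity are closest.
absMin <- extract[which.min(abs(extract$sens - extract$spec)), ]

# Criterion 2: cutoff whose (1 - spec, sens) point is nearest to the
# top-left corner of the ROC plot, i.e. nearest to sens = 1, spec = 1.
dist_corner <- sqrt((1 - extract$sens)^2 + (1 - extract$spec)^2)
sqrtMin <- extract[which.min(dist_corner), ]

print(absMin)
print(sqrtMin)
```

With these made-up values the two criteria select different rows (cutoff 0.5 for the abs criterion, cutoff 0.3 for the distance criterion), which is the kind of disagreement described in the next paragraph.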
>
> Most of the time these two answers (abs or sqrt) are the same,
> sometimes they differ quite a bit.
>
> I do not see this application of ROC curves very often. A question
> for those much more knowledgeable than I.... is there a problem
> with using ROC curves in this manner?
>
> Tim Howard
@BOOK{MacmillanCreelman2005,
title = {Detection theory: {A} user's guide},
publisher = {Lawrence Erlbaum Associates},
year = {2005},
address = {Mahwah, NJ, USA},
edition = {2nd},
author = {Macmillan, Neil A and Creelman, C Douglas},
}
on p. 43 shows that the ideal value of the cutoff depends on the
reward function R that specifies the payoff for each outcome:
\[
LR(x) = \beta = \frac{R(\text{true negative}) - R(\text{false positive})}{R(\text{true positive}) - R(\text{false negative})} \cdot \frac{p(\text{noise})}{p(\text{signal})}
\]
I believe that your attempt to minimize false positives while
maximizing true positives amounts to maximizing the proportion of
correct answers. For that, the two payoff differences are equal, so
$\beta = p(noise)/p(signal)$, which is $\beta = 1$ when signal and
noise are equally likely. Otherwise it might be best to explicitly
state your costs and benefits by specifying the reward function R.
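
To make the last point concrete, here is a minimal sketch in R that picks the cutoff with the highest expected payoff once the reward function is stated explicitly. The function name expected_payoff, its arguments, and the toy numbers are assumptions for illustration; none of them come from the ROC package or from Macmillan and Creelman.

```r
# Expected payoff at each cutoff, given the reward for each of the four
# outcomes and the prior probability of a signal (all hypothetical inputs).
expected_payoff <- function(sens, spec, p_signal,
                            R_tp = 1, R_fn = 0, R_tn = 1, R_fp = 0) {
  p_signal * (sens * R_tp + (1 - sens) * R_fn) +
    (1 - p_signal) * (spec * R_tn + (1 - spec) * R_fp)
}

# Toy data frame in the layout Tim describes (values are made up).
extract <- data.frame(
  votes = c(0.1, 0.3, 0.5, 0.7, 0.9),
  sens  = c(0.99, 0.95, 0.80, 0.55, 0.20),
  spec  = c(0.30, 0.70, 0.75, 0.95, 0.99)
)

# Default rewards (1 for a correct answer, 0 for an error): the expected
# payoff is then simply the expected proportion of correct answers.
payoff_correct <- expected_payoff(extract$sens, extract$spec, p_signal = 0.5)
best_correct <- extract[which.max(payoff_correct), ]

# Same data, but a false positive now carries a penalty of -4.
payoff_costly <- expected_payoff(extract$sens, extract$spec, p_signal = 0.5,
                                 R_fp = -4)
best_costly <- extract[which.max(payoff_costly), ]

print(best_correct)
print(best_costly)
```

On these toy numbers, penalizing false positives moves the selected cutoff from 0.3 to the stricter 0.7, so the "optimal" threshold depends on the rewards you state.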
_____________________________
Professor Michael Kubovy
University of Virginia
Department of Psychology
USPS: P.O.Box 400400 Charlottesville, VA 22904-4400
Parcels: Room 102 Gilmer Hall
McCormick Road Charlottesville, VA 22903
Office: B011 +1-434-982-4729
Lab: B019 +1-434-982-4751
Fax: +1-434-982-4766
WWW: http://www.people.virginia.edu/~mk9y/