[R] Calculate Specificity and Sensitivity for a given threshold value
Frank E Harrell Jr
f.harrell at vanderbilt.edu
Thu Nov 13 20:17:47 CET 2008
Pierre-Jean-EXT.Breton at sanofi-aventis.com wrote:
> Hi Frank,
>
> Thank you for your answer.
> In fact, I don't use this for clinical research practice.
> I am currently testing several scoring methods and I'd like
> to know which one is the most effective and which threshold
> value I should apply to discriminate positives and negatives.
> So, any idea for my problem?
The use of thresholds gets in the way of finding a good solution because
you will have predictor values in the "gray zone". I tend to rank
methods by the most sensitive index available such as the log likelihood
in the binary logistic model. You can extend ordinary logistic models
to allow for nonlinear effects on the log odds scale using regression
splines.
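
A minimal sketch of that ranking idea, under stated assumptions: the data are simulated purely for illustration, the names score_a and score_b are hypothetical stand-ins for two scoring methods, and base R's glm() with natural splines from splines::ns() is swapped in as one possible way to get a spline-expanded logistic fit. It is not the only way to do this.

```r
library(splines)  # ships with R; provides ns() for natural cubic splines

set.seed(1)
n <- 200
score_a <- runif(n)  # hypothetical scoring method A
score_b <- runif(n)  # hypothetical scoring method B (unrelated to outcome)
# Simulated outcome that depends on score_a only (assumption for the demo)
y <- rbinom(n, 1, plogis(-2 + 4 * score_a))

# Binary logistic models allowing a nonlinear effect on the log odds scale
fit_a <- glm(y ~ ns(score_a, df = 3), family = binomial)
fit_b <- glm(y ~ ns(score_b, df = 3), family = binomial)

# Same outcome and same number of parameters, so the log likelihoods
# can be compared directly; higher (closer to zero) is better
loglik <- c(a = as.numeric(logLik(fit_a)), b = as.numeric(logLik(fit_b)))
print(loglik)
```

With real data, y would be the observed diagnosis and each score would get its own fit, so no cutoff has to be chosen to compare methods.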
Frank
>
> Pierre-Jean
>
> -----Original Message-----
> From: Frank E Harrell Jr [mailto:f.harrell at vanderbilt.edu]
> Sent: Thursday, November 13, 2008 5:00 PM
> To: Breton, Pierre-Jean-EXT R&D/FR
> Cc: r-help at r-project.org
> Subject: Re: [R] Calculate Specificity and Sensitivity for a given
> threshold value
>
> Kaliss wrote:
>> Hi list,
>>
>>
>> I'm new to R and I'm currently using the ROCR package.
>> The input data look like this:
>>
>> DIAGNOSIS SCORE
>> 1 0.387945
>> 1 0.50405
>> 1 0.435667
>> 1 0.358057
>> 1 0.583512
>> 1 0.387945
>> 1 0.531795
>> 1 0.527148
>> 0 0.526397
>> 0 0.372935
>> 1 0.861097
>>
>> And I run the following simple code:
>> library(ROCR)
>> d <- read.table("inputFile", header=TRUE)
>> pred <- prediction(d$SCORE, d$DIAGNOSIS)
>> perf <- performance(pred, "tpr", "fpr")
>> plot(perf)
>>
>> So building the curve works easily.
>> My question is: can I have the specificity and the sensitivity for a
>> score threshold = 0.5 (for example)? How do I compute this?
>>
>> Thank you in advance
>
> Beware of the utility/loss function you are implicitly assuming with
> this approach. It is quite oversimplified. In clinical practice the
> cost of a false positive or false negative (which comes from a cost
> function and the simple forward probability of a positive diagnosis,
> e.g., from a basic logistic regression model if you start with a cohort
> study) varies with the type of patient being diagnosed.
>
> Frank
>
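
For the question in the quoted thread, sensitivity and specificity at a single cutoff can be read off directly in base R, with the caveats about thresholds above still applying. This is only a sketch: it uses the eleven rows shown in the quoted message, the helper name sens_spec is hypothetical, and it assumes a score at or above the cutoff is called positive.

```r
# The eleven rows from the quoted message
d <- data.frame(
  DIAGNOSIS = c(1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1),
  SCORE = c(0.387945, 0.50405, 0.435667, 0.358057, 0.583512, 0.387945,
            0.531795, 0.527148, 0.526397, 0.372935, 0.861097)
)

# Hypothetical helper: assumes score >= cutoff means "predicted positive"
sens_spec <- function(score, truth, cutoff) {
  predicted_pos <- score >= cutoff
  c(
    # true positives / all actual positives
    sensitivity = mean(predicted_pos[truth == 1]),
    # true negatives / all actual negatives
    specificity = mean(!predicted_pos[truth == 0])
  )
}

res <- sens_spec(d$SCORE, d$DIAGNOSIS, cutoff = 0.5)
print(res)  # sensitivity 5/9 (about 0.556), specificity 1/2
```

With only two negatives in this sample the specificity estimate is very coarse, which is one more reason to treat a single cutoff with care.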
--
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University