[R] Calculate Specificity and Sensitivity for a given threshold value

Frank E Harrell Jr f.harrell at vanderbilt.edu
Thu Nov 13 20:19:39 CET 2008


N. Lapidus wrote:
> Hi Pierre-Jean,
> 
> ROCR computes sensitivity (Se) and specificity (Sp) at each cutoff stored in
> the x.values slot of the corresponding "performance" object.
> 
> For example, let's generate the performance for Se and Sp:
> sens <- performance(pred,"sens")
> spec <- performance(pred,"spec")
> 
> Now you can access:
> sens@x.values[[1]] # (or spec@x.values[[1]]), the vector of cutoffs
> sens@y.values[[1]] # the corresponding Se
> spec@y.values[[1]] # the corresponding Sp
> 
> You can, for example, summarize this information in a table:
> (SeSp <- cbind(Cutoff=sens@x.values[[1]], Se=sens@y.values[[1]],
>                Sp=spec@y.values[[1]]))
> 
> You can also write a function to give Se and Sp for a specific cutoff, but
> you will have to define what to do for cutoffs not stored in the list. For
> example, the following function keeps the closest stored cutoff to give
> corresponding Se and Sp (but this is not always the best solution, you may
> want to define your own way to interpolate):
> 
> se.sp <- function (cutoff, pred)    {
>     # pred is a ROCR "prediction" object
>     sens <- performance(pred,"sens")
>     spec <- performance(pred,"spec")
>     # index of the stored cutoff closest to the requested one
>     num.cutoff <- which.min(abs(sens@x.values[[1]] - cutoff))
>     return(list(Cutoff=sens@x.values[[1]][num.cutoff],
>                 Sensitivity=sens@y.values[[1]][num.cutoff],
>                 Specificity=spec@y.values[[1]][num.cutoff]))

That is a biased procedure (like how stepwise regression results in 
overfitting).  It also uses a strange loss function.  The bootstrap 
would need to be used to penalize for the uncertainty in the cutoff. 
You are also assuming that a cutoff exists, which is a major assumption.

Frank
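
The bootstrap idea mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the thread: the data are simulated, the helper names youden and best_cutoff are hypothetical, and choosing the cutoff by maximizing Youden's J = Se + Sp - 1 is a stand-in criterion picked for simplicity (it is exactly the kind of simplified loss function criticized above). The sketch re-selects the cutoff in each bootstrap sample to show how unstable the chosen cutoff is and how optimistic its apparent performance is.

```r
set.seed(1)

# Simulated data, purely illustrative (not from the thread)
n <- 200
truth <- rbinom(n, 1, 0.5)
score <- rnorm(n, mean = ifelse(truth == 1, 0.6, 0.4), sd = 0.15)

# Hypothetical helper: Youden's J at a given cutoff
# (assumes score >= cutoff is called positive)
youden <- function(score, truth, cutoff) {
  pred <- score >= cutoff
  mean(pred[truth == 1]) + mean(!pred[truth == 0]) - 1
}

# Hypothetical helper: cutoff among the observed scores that maximizes J
best_cutoff <- function(score, truth) {
  cand <- sort(unique(score))
  j <- vapply(cand, function(k) youden(score, truth, k), numeric(1))
  c(cutoff = cand[which.max(j)], J = max(j))
}

# Apparent (optimistic) result on the full data
apparent <- best_cutoff(score, truth)

# Repeat the whole selection in each bootstrap sample
B <- 200
boot <- t(replicate(B, {
  i <- sample.int(n, replace = TRUE)
  b <- best_cutoff(score[i], truth[i])
  # optimism: J in the bootstrap sample minus J of that same cutoff
  # evaluated on the original data
  c(cutoff = unname(b["cutoff"]),
    optimism = unname(b["J"]) - youden(score, truth, b["cutoff"]))
}))

# Spread of the selected cutoff and optimism-corrected J
cutoff_range <- quantile(boot[, "cutoff"], c(0.025, 0.975))
corrected_J <- unname(apparent["J"]) - mean(boot[, "optimism"])
print(apparent)
print(cutoff_range)
print(corrected_J)
```

A wide cutoff_range suggests the data do not pin down a single threshold well, which is one way to see the uncertainty the comment above refers to.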

> }
> 
> se.sp(.5, pred)
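
For a single fixed threshold, Se and Sp can also be computed directly from the data, without looking up the nearest stored cutoff. Below is a minimal base-R sketch using the eleven rows shown in the original question. The helper name se_sp_at is hypothetical, and treating score >= cutoff as a positive call is an assumption; switch to > if your convention differs.

```r
# Data copied from the original question (DIAGNOSIS: 1 = positive, 0 = negative)
d <- data.frame(
  DIAGNOSIS = c(1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1),
  SCORE = c(0.387945, 0.50405, 0.435667, 0.358057, 0.583512, 0.387945,
            0.531795, 0.527148, 0.526397, 0.372935, 0.861097)
)

# Hypothetical helper (not part of ROCR): Se and Sp at one fixed cutoff.
# Assumption: a score >= cutoff is called positive.
se_sp_at <- function(score, truth, cutoff) {
  predicted <- score >= cutoff
  positive <- truth == 1
  c(Cutoff = cutoff,
    Sensitivity = sum(predicted & positive) / sum(positive),
    Specificity = sum(!predicted & !positive) / sum(!positive))
}

# 5 of the 9 positives score >= 0.5; 1 of the 2 negatives scores below 0.5
res <- se_sp_at(d$SCORE, d$DIAGNOSIS, 0.5)
print(res)
```

With only two negatives in this sample, the specificity estimate can only take the values 0, 0.5 or 1, so it should be read with caution.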
> 
> Hope this helps,
> 
> Nael
> 
> 
> On Thu, Nov 13, 2008 at 5:59 PM,
> <Pierre-Jean-EXT.Breton at sanofi-aventis.com> wrote:
> 
>> Hi Frank,
>>
>> Thank you for your answer.
>> In fact, I don't use this for clinical research practice.
>> I am currently testing several scoring methods and I'd like
>> to know which one is the most effective and which threshold
>> value I should apply to discriminate positives and negatives.
>> So, any ideas for my problem?
>>
>> Pierre-Jean
>>
>> -----Original Message-----
>> From: Frank E Harrell Jr [mailto:f.harrell at vanderbilt.edu]
>> Sent: Thursday, November 13, 2008 5:00 PM
>> To: Breton, Pierre-Jean-EXT R&D/FR
>> Cc: r-help at r-project.org
>> Subject: Re: [R] Calculate Specificity and Sensitivity for a given
>> threshold value
>>
>> Kaliss wrote:
>>> Hi list,
>>>
>>>
>>> I'm new to R and I'm currently using the ROCR package.
>>> My input data look like this:
>>>
>>> DIAGNOSIS     SCORE
>>> 1     0.387945
>>> 1     0.50405
>>> 1     0.435667
>>> 1     0.358057
>>> 1     0.583512
>>> 1     0.387945
>>> 1     0.531795
>>> 1     0.527148
>>> 0     0.526397
>>> 0     0.372935
>>> 1     0.861097
>>>
>>> And I run the following simple code:
>>> library(ROCR)
>>> d <- read.table("inputFile", header=TRUE)
>>> pred <- prediction(d$SCORE, d$DIAGNOSIS)
>>> perf <- performance(pred, "tpr", "fpr")
>>> plot(perf)
>>>
>>> So building the curve works easily.
>>> My question is: can I get the specificity and the sensitivity for a
>>> score threshold of 0.5 (for example)? How do I compute this?
>>>
>>> Thank you in advance
>> Beware of the utility/loss function you are implicitly assuming with
>> this approach.  It is quite oversimplified.  In clinical practice the
>> cost of a false positive or false negative (which comes from a cost
>> function and the simple forward probability of a positive diagnosis,
>> e.g., from a basic logistic regression model if you start with a cohort
>> study) varies with the type of patient being diagnosed.
>>
>> Frank
>>
>> --
>> Frank E Harrell Jr   Professor and Chair           School of Medicine
>>                      Department of Biostatistics   Vanderbilt University
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
> 


-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University


