[R] Calculate Specificity and Sensitivity for a given threshold value
Frank E Harrell Jr
f.harrell at vanderbilt.edu
Thu Nov 13 20:19:39 CET 2008
N. Lapidus wrote:
> Hi Pierre-Jean,
>
> Sensitivity (Se) and specificity (Sp) are calculated at each cutoff stored
> in the x.values slot of the corresponding "performance" object.
>
> For example, let's generate the performance for Se and Sp:
> sens <- performance(pred,"sens")
> spec <- performance(pred,"spec")
>
> Now, you can have access to:
> sens@x.values[[1]] # (or spec@x.values[[1]]), which is the list of cutoffs
> sens@y.values[[1]] # for the corresponding Se
> spec@y.values[[1]] # for the corresponding Sp
>
> You can for example sum up this information in a table:
> (SeSp <- cbind(Cutoff=sens@x.values[[1]], Se=sens@y.values[[1]],
>                Sp=spec@y.values[[1]]))
>
> You can also write a function to give Se and Sp for a specific cutoff, but
> you will have to define what to do for cutoffs not stored in the list. For
> example, the following function keeps the closest stored cutoff to give
> corresponding Se and Sp (but this is not always the best solution, you may
> want to define your own way to interpolate):
>
> se.sp <- function (cutoff, pred) {
>   # pred is the ROCR prediction object
>   sens <- performance(pred, "sens")
>   spec <- performance(pred, "spec")
>   # index of the stored cutoff closest to the requested one
>   num.cutoff <- which.min(abs(sens@x.values[[1]] - cutoff))
>   return(list(Cutoff=sens@x.values[[1]][num.cutoff],
>               Sensitivity=sens@y.values[[1]][num.cutoff],
>               Specificity=spec@y.values[[1]][num.cutoff]))
> }
>
> se.sp(.5, pred)

That is a biased procedure (like how stepwise regression results in
overfitting). It also uses a strange loss function. The bootstrap
would need to be used to penalize for the uncertainty in the cutoff.
You are also assuming that a cutoff exists, which is a major assumption.

Frank
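
One possible reading of the bootstrap remark above, as a rough sketch
only: choose the cutoff from the data, repeat that choice on bootstrap
resamples, and look at how much the chosen cutoff moves and how
optimistic the apparent performance is. Everything here is an assumption
for illustration and not from the thread: the simulated data, the helper
names youden() and best_cutoff(), and the use of the Youden index
(Se + Sp - 1) as the criterion for picking a cutoff.

```r
set.seed(1)

# Hypothetical simulated data, NOT the data from the thread
n <- 200
truth <- rbinom(n, 1, 0.5)
score <- rnorm(n, mean = ifelse(truth == 1, 0.6, 0.4), sd = 0.15)

# Youden index (Se + Sp - 1) at one cutoff; a case is positive if score >= cutoff
youden <- function(score, truth, cutoff) {
  se <- mean(score[truth == 1] >= cutoff)
  sp <- mean(score[truth == 0] < cutoff)
  se + sp - 1
}

# Data-driven cutoff: the observed score that maximizes the Youden index
best_cutoff <- function(score, truth) {
  candidates <- sort(unique(score))
  j <- vapply(candidates, function(k) youden(score, truth, k), numeric(1))
  candidates[which.max(j)]
}

apparent_cutoff <- best_cutoff(score, truth)
apparent_j <- youden(score, truth, apparent_cutoff)

# Repeat the whole selection on each resample; optimism is the performance
# in the resample minus the performance of the same cutoff in the original data
B <- 200
boot <- t(replicate(B, {
  i <- sample(n, replace = TRUE)
  k <- best_cutoff(score[i], truth[i])
  c(cutoff = k,
    optimism = youden(score[i], truth[i], k) - youden(score, truth, k))
}))

cutoff_interval <- quantile(boot[, "cutoff"], c(0.025, 0.975))
corrected_j <- apparent_j - mean(boot[, "optimism"])

print(apparent_cutoff)
print(cutoff_interval)
print(c(apparent = apparent_j, corrected = corrected_j))
```

A wide cutoff_interval suggests the data do not pin down a single
threshold. With a small sample, a resample may contain only one class;
this sketch does not guard against that.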
>
> Hope this helps,
>
> Nael
>
>
> On Thu, Nov 13, 2008 at 5:59 PM,
> <Pierre-Jean-EXT.Breton at sanofi-aventis.com> wrote:
>
>> Hi Frank,
>>
>> Thank you for your answer.
>> In fact, I don't use this for clinical research practice.
>> I am currently testing several scoring methods and I'd like
>> to know which one is the most effective and which threshold
>> value I should apply to discriminate positives and negatives.
>> So, any ideas for my problem?
>>
>> Pierre-Jean
>>
>> -----Original Message-----
>> From: Frank E Harrell Jr [mailto:f.harrell at vanderbilt.edu]
>> Sent: Thursday, November 13, 2008 5:00 PM
>> To: Breton, Pierre-Jean-EXT R&D/FR
>> Cc: r-help at r-project.org
>> Subject: Re: [R] Calculate Specificity and Sensitivity for a given
>> threshold value
>>
>> Kaliss wrote:
>>> Hi list,
>>>
>>>
>>> I'm new to R and I'm currently using ROCR package.
>>> Data in input look like this:
>>>
>>> DIAGNOSIS SCORE
>>> 1 0.387945
>>> 1 0.50405
>>> 1 0.435667
>>> 1 0.358057
>>> 1 0.583512
>>> 1 0.387945
>>> 1 0.531795
>>> 1 0.527148
>>> 0 0.526397
>>> 0 0.372935
>>> 1 0.861097
>>>
>>> And I run the following simple code:
>>> library(ROCR)
>>> d <- read.table("inputFile", header=TRUE)
>>> pred <- prediction(d$SCORE, d$DIAGNOSIS)
>>> perf <- performance(pred, "tpr", "fpr")
>>> plot(perf)
>>>
>>> So building the curve works easily.
>>> My question is: can I get the specificity and the sensitivity for a
>>> score threshold = 0.5 (for example)? How do I compute this?
>>>
>>> Thank you in advance
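
For the narrow question of Se and Sp at one fixed threshold, a minimal
base R sketch on the eleven rows posted above might look like this. The
helper name se_sp_at() is hypothetical, and the rule that a case counts
as positive when SCORE >= threshold is an assumption; check it against
the convention you actually use.

```r
# The eleven rows posted in the thread
d <- data.frame(
  DIAGNOSIS = c(1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1),
  SCORE = c(0.387945, 0.50405, 0.435667, 0.358057, 0.583512, 0.387945,
            0.531795, 0.527148, 0.526397, 0.372935, 0.861097)
)

# Assumption: a case is called positive when its score is >= threshold
se_sp_at <- function(score, truth, threshold) {
  called_pos <- score >= threshold
  c(Sensitivity = sum(called_pos & truth == 1) / sum(truth == 1),
    Specificity = sum(!called_pos & truth == 0) / sum(truth == 0))
}

res <- se_sp_at(d$SCORE, d$DIAGNOSIS, 0.5)
print(res)
```

With only two negatives in this excerpt, the specificity estimate is
very unstable.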
>> Beware of the utility/loss function you are implicitly assuming with
>> this approach. It is quite oversimplified. In clinical practice the
>> costs of a false positive and a false negative (which come from a cost
>> function and the simple forward probability of a positive diagnosis,
>> e.g., from a basic logistic regression model if you start with a cohort
>> study) vary with the type of patient being diagnosed.
>>
>> Frank
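
To make the point about costs concrete, here is a small sketch of a
decision based on expected cost rather than a fixed score cutoff. The
function decide(), the cost values, and the probabilities are all
hypothetical, and correct decisions are assumed to cost nothing. Under
those assumptions, calling a case positive is cheaper whenever
p > cost_fp / (cost_fp + cost_fn), so the implied threshold moves as the
costs change.

```r
# Hypothetical costs and predicted probabilities, for illustration only
decide <- function(p, cost_fp, cost_fn) {
  expected_cost_pos <- (1 - p) * cost_fp   # expected cost if we call positive
  expected_cost_neg <- p * cost_fn         # expected cost if we call negative
  ifelse(expected_cost_pos < expected_cost_neg, "positive", "negative")
}

p <- c(0.10, 0.30, 0.60)   # forward probabilities of a positive diagnosis

same_costs <- decide(p, cost_fp = 1, cost_fn = 1)  # implied threshold 0.5
fn_costly  <- decide(p, cost_fp = 1, cost_fn = 4)  # implied threshold 0.2

print(rbind(p = p, same_costs = same_costs, fn_costly = fn_costly))
```

Because cost_fp and cost_fn can be vectors, the same call can use
different costs for each patient.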
>>
>> --
>> Frank E Harrell Jr Professor and Chair School of Medicine
>> Department of Biostatistics Vanderbilt University
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
--
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University