[R] Fw: Logistic regression - Interpreting (SENS) and (SPEC)

Greg Snow Greg.Snow at imail.org
Tue Oct 14 04:26:19 CEST 2008


Actually, somewhat counterintuitively, PPV tends to be more affected by specificity and NPV by sensitivity.  You can see this with the function SensSpec.demo in the TeachingDemos package (see also the corresponding example on the help page for tkexamp in the same package).
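
For concreteness, here is a small base-R sketch (just an illustration via Bayes' theorem, not the TeachingDemos code itself; ppv_npv is a made-up helper name) showing how PPV and NPV respond to sensitivity and specificity at a fixed prevalence:

## PPV and NPV from sensitivity, specificity, and prevalence (Bayes' theorem)
ppv_npv <- function(sens, spec, prev) {
  ppv <- sens * prev / (sens * prev + (1 - spec) * (1 - prev))
  npv <- spec * (1 - prev) / (spec * (1 - prev) + (1 - sens) * prev)
  c(PPV = ppv, NPV = npv)
}

## Hold sensitivity at 0.9 and vary specificity: PPV moves a great deal
sapply(c(0.70, 0.80, 0.90, 0.99), function(sp) ppv_npv(sens = 0.9, spec = sp, prev = 0.1))

## Hold specificity at 0.9 and vary sensitivity: PPV changes much less, NPV responds instead
sapply(c(0.70, 0.80, 0.90, 0.99), function(se) ppv_npv(sens = se, spec = 0.9, prev = 0.1))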

I don't think that Frank is saying that sensitivity and specificity are unrelated to PPV and NPV, or that they don't measure what is claimed.  I think his argument is based more on the fact that sens, spec, PPV, and NPV are all computed after grouping the predicted values from the logistic regression into 2 groups, and any time you categorize something that is not categorical, you lose information.

Imagine 2 models (model 1 and model 2) for which we use 50% as the cut-off.  Model 1 produces predicted probabilities in the range 0.47 - 0.54, and model 2 produces most of its predicted probabilities in either 0.9-0.98 or 0.01-0.09 with only a few between those 2 ranges, but the two models agree on which side of 0.5 each predicted probability falls.  Based on sensitivity and specificity these 2 models are equivalent, but I certainly want my doctor using model 2.
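
A quick simulation sketch of that situation (the numbers and the Brier-score comparison are my own illustration, not something from the thread):

set.seed(42)
n <- 1000
truth <- rbinom(n, 1, 0.5)                  # true disease status (0/1)

## Model 1: predictions huddle around 0.5, but on the correct side of it
p1 <- ifelse(truth == 1, runif(n, 0.50, 0.54), runif(n, 0.47, 0.50))
## Model 2: predictions sit near 0 or 1, on the same side of 0.5 as model 1
p2 <- ifelse(truth == 1, runif(n, 0.90, 0.98), runif(n, 0.01, 0.09))

## Identical confusion tables at the 0.5 cut-off, so identical sens and spec
table(pred1 = p1 > 0.5, truth)
table(pred2 = p2 > 0.5, truth)

## But very different Brier scores (mean squared error of the probabilities)
mean((p1 - truth)^2)
mean((p2 - truth)^2)

In this toy version both models classify perfectly at the 0.5 cut-off, so sensitivity and specificity cannot separate them, while a score based on the probabilities themselves can.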

Or imagine that you go to the doctor and are given a set of tests.  The test results are put into the logistic model and the doctor sees that you have a 51% chance of having disease A.  Would you want the doctor to treat you the same (without further testing) as a patient with a predicted value of 97%?  And what if you had taken a couple more deep breaths just before having your blood pressure measured and instead had a predicted probability of 49%?  Would you want to be treated the same as a patient with a predicted probability of 1%?

I would hope that a doctor seeing a predicted probability of 51% or 49% would do additional testing.  But if we focus only on sens and spec, then 51% is the same as 100% and 49% is the same as 0%.  I think Frank's issue is with throwing out the information contained in the actual predicted probabilities.  Judge the model on how well the predicted probabilities match what is observed, rather than on a dichotomization of them.
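
One rough way to do that comparison by hand (calib below is just my own sketch; dedicated tools such as val.prob in Frank's Design/rms packages do this much more carefully) is to compare the mean predicted probability with the observed event rate within groups of the predictions:

## Crude calibration check: mean predicted probability vs. observed event
## rate within deciles of the predictions (p = predicted probabilities,
## y = 0/1 outcomes, both from a validation sample)
calib <- function(p, y, bins = 10) {
  grp <- cut(p, quantile(p, probs = seq(0, 1, length.out = bins + 1)),
             include.lowest = TRUE)
  data.frame(mean_predicted = tapply(p, grp, mean),
             observed_rate  = tapply(y, grp, mean),
             n              = as.vector(table(grp)))
}
calib(p2, truth)   # using the simulated objects from the sketch above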

One more analogy: if you have a coin that comes up heads 60% of the time, the prediction rule that best classifies future tosses is to predict heads 100% of the time, but that does not describe the true state of nature, which is 60%.  Some of the common measures are simply not designed to describe the true state of nature; if you are more interested in the true state of nature than in some other question, don't use those measures.
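
A tiny numerical version of the coin analogy (again just my own sketch):

## A 60% heads coin: always calling "heads" maximizes the proportion of
## correct calls, but the implied probability of 1.0 is a poor description
## of the true 0.6
set.seed(1)
tosses <- rbinom(10000, 1, 0.6)             # 1 = heads

mean(tosses == 1)                           # "always heads" is right about 60% of the time
mean(tosses == rbinom(10000, 1, 0.6))       # calling heads randomly 60% of the time does worse (~52%)

## A proper scoring rule (here the Brier score, lower is better) prefers
## the honest report of 0.6
mean((1.0 - tosses)^2)                      # always report probability 1.0 -> about 0.40
mean((0.6 - tosses)^2)                      # always report probability 0.6 -> about 0.24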

________________________________________
From: r-help-bounces at r-project.org [r-help-bounces at r-project.org] On Behalf Of John Sorkin [jsorkin at grecc.umaryland.edu]
Sent: Monday, October 13, 2008 4:14 PM
To: Ph.D. Robert W. Baer; Frank E Harrell Jr
Cc: r-help at r-project.org; dieter.menne at menne-biomed.de; p.dalgaard at biostat.ku.dk
Subject: Re: [R] Fw: Logistic regression - Interpreting (SENS) and (SPEC)

Of course Prof Baer is correct: the positive predictive value (PPV) and the negative predictive value (NPV) serve the function of providing conditional post-test probabilities:
PPV: Post-test probability of disease given a positive test
NPV: Post-test probability of no disease given a negative test.

Further, PPV is a function of sensitivity (for a given specificity in a population with a given disease prevalence): the higher the sensitivity, almost always the greater the PPV (it can be unchanged, but I don't believe it can be lower).  Likewise, NPV is a function of specificity (for a given sensitivity in a population with a given disease prevalence): the higher the specificity, almost always the greater the NPV (again, it can be unchanged, but I don't believe it can be lower).
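
For reference, the standard Bayes'-theorem expressions behind those statements, plus a quick numerical check that raising sensitivity (or specificity) cannot lower PPV (or NPV); the specific numbers are arbitrary:

## PPV = sens*prev / (sens*prev + (1-spec)*(1-prev))
## NPV = spec*(1-prev) / (spec*(1-prev) + (1-sens)*prev)
prev <- 0.2

## Raising sensitivity (specificity fixed at 0.8) never lowers PPV
sens <- seq(0.5, 1, by = 0.1); spec <- 0.8
cbind(sens, PPV = sens * prev / (sens * prev + (1 - spec) * (1 - prev)))

## Raising specificity (sensitivity fixed at 0.8) never lowers NPV
spec <- seq(0.5, 1, by = 0.1); sens <- 0.8
cbind(spec, NPV = spec * (1 - prev) / (spec * (1 - prev) + (1 - sens) * prev))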

Thus, following Prof Harrell's suggestion to use the test that moves a pre-test probability a great deal in one or both directions, the test to choose is the one with the largest sensitivity and/or specificity, and so sensitivity and specificity are, I believe, good summary measures of the "quality" of a clinical test.

Finally, I think Prof Harrell's observation that sensitivity and specificity change quite a bit, and mathematically must change if the disease is not all-or-nothing, while true, describes a degenerate case of little practical importance.


John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

>>> "Robert W. Baer, Ph.D." <rbaer at atsu.edu> 10/13/2008 4:41 PM >>>

----- Original Message -----
From: "Frank E Harrell Jr" <f.harrell at vanderbilt.edu>
To: "John Sorkin" <jsorkin at grecc.umaryland.edu>
Cc: <r-help at r-project.org>; <dieter.menne at menne-biomed.de>;
<p.dalgaard at biostat.ku.dk>
Sent: Monday, October 13, 2008 2:09 PM
Subject: Re: [R] Fw: Logistic regression - Interpreting (SENS) and (SPEC)


> John Sorkin wrote:
>> Frank,
>> Perhaps I was not clear in my previous Email message. Sensitivity and
>> specificity do tell us about the quality of a test in that, given two
>> tests, the one with higher sensitivity will be better at identifying
>> subjects who have a disease in a pool of people who have a disease, and
>> the more specific test will be better at identifying subjects who do not
>> have a disease in a pool of people who do not have a disease. It is true that
>> positive predictive and negative predictive values are of greater utility
>> to a clinician, but as you know these two measures are functions of
>> sensitivity, specificity and disease prevalence. All other things being
>> equal, given two tests one would select the one with greater sensitivity
>> and specificity, so in a sense they do measure the "quality" of a clinical
>> test - but not, as I tried to explain, the quality of a statistical model.
>
> That is not very relevant, John.  It is a function of all those things
> because those quantities are all deficient.
>
> I would select the test that can move the pre-test probability a great
> deal in one or both directions.

Of course, this quantity is known as a likelihood ratio and is a function of
sensitivity and specificity.  For 2 x 2 data one often speaks of the positive
likelihood ratio and the negative likelihood ratio, but for a multi-row
contingency table one can define likelihood ratios for a series of cut-off
points.  This has become a popular approach in evidence-based medicine when
diagnostic tests have continuous rather than binary outputs.
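
In the 2 x 2 case the two likelihood ratios are LR+ = sens/(1 - spec) and LR- = (1 - sens)/spec, and they carry a pre-test probability to a post-test probability through the odds.  A minimal sketch, using numbers loosely matching the SENS and SPEC the original poster reported and an assumed pre-test probability:

## Likelihood ratios from sensitivity and specificity, and how they move a
## pre-test probability (via odds) to a post-test probability
sens <- 0.90; spec <- 0.74
LR_pos <- sens / (1 - spec)          # positive likelihood ratio
LR_neg <- (1 - sens) / spec          # negative likelihood ratio

pretest <- 0.10                      # assumed pre-test probability
odds    <- pretest / (1 - pretest)

post_pos <- (odds * LR_pos) / (1 + odds * LR_pos)   # after a positive test
post_neg <- (odds * LR_neg) / (1 + odds * LR_neg)   # after a negative test
c(LR_pos = LR_pos, LR_neg = LR_neg, post_pos = post_pos, post_neg = post_neg)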

>> You are of course correct that sensitivity and specificity are not truly
>> "inherent" characteristics of a test as their values may change from
>> population-to-population, but practically speaking, they don't change all
>> that much, certainly not as much as positive and negative predictive
>> values.
>
> They change quite a bit, and mathematically must change if the disease is
> not all-or-nothing.
>
>>
>
>> I guess we will disagree about the utility of sensitivity and specificity
>> as simplifying concepts.
>>
>> Thank you as always for your clear thoughts and stimulating comments.
>
> And thanks for yours John.
> Frank
>
>> John
>>
>>
>>
>>
>> John David Sorkin M.D., Ph.D.
>> Chief, Biostatistics and Informatics
>> University of Maryland School of Medicine Division of Gerontology
>> Baltimore VA Medical Center
>> 10 North Greene Street
>> GRECC (BT/18/GR)
>> Baltimore, MD 21201-1524
>> (Phone) 410-605-7119
>> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
>>
>>>>> Frank E Harrell Jr <f.harrell at vanderbilt.edu> 10/13/2008 2:35 PM >>>
>> John Sorkin wrote:
>>> Jumping into a thread can be like jumping into a den of lions but here
>>> goes . . .
>>> Sensitivity and specificity are not designed to determine the quality of
>>> a fit (i.e. if your model is good), but rather are characteristics of a
>>> test. A test that has high sensitivity will properly identify a large
>>> proportion of people with a disease (or a characteristic) of interest. A
>>> test with high specificity will properly identify a large proportion of
>>> people without a disease (or characteristic) of interest. Sensitivity
>>> and specificity inform the end user about the "quality" of a test. Other
>>> metrics have been designed to determine the quality of the fit, none
>>> that I know of are completely satisfactory. The pseudo R squared is one
>>> such measure.
>>> For a given diagnostic test (or classification scheme), different
>>> cut-off points for identifying subjects who have the disease can be examined
>>> to see how they influence sensitivity and 1-specificity using ROC
>>> curves.
>>> I await the flames that will surely come my way
>>>
>>> John
>>
>> John this has been much debated but I fail to see how backwards
>> probabilities are that helpful in judging the usefulness of a test.  Why
>> not condition on what we know (the test result and other baseline
>> variables) and quit conditioning on what we are trying to find out
>> (disease status)?  The data collected in most studies (other than
>> case-control) allow one to use logistic modeling with the correct time
>> order.
>>
>> Furthermore, sensitivity and specificity are not constants but vary with
>> subjects' characteristics.  So they are not even useful as simplifying
>> concepts.
>>
>> Frank
>>>
>>>
>>>
>>> John David Sorkin M.D., Ph.D.
>>> Chief, Biostatistics and Informatics
>>> University of Maryland School of Medicine Division of Gerontology
>>> Baltimore VA Medical Center
>>> 10 North Greene Street
>>> GRECC (BT/18/GR)
>>> Baltimore, MD 21201-1524
>>> (Phone) 410-605-7119
>>> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
>>>
>>>>>> Frank E Harrell Jr <f.harrell at vanderbilt.edu> 10/13/2008 12:27 PM >>>
>>> Maithili Shiva wrote:
>>>> Dear Mr Peter Dalgaard and Mr Dieter Menne,
>>>>
>>>> I sincerely thank you for helping me out with my problem. The thing is
>>>> that I already have calculated SENS = Gg / (Gg + Bg) = 89.97%
>>>> and SPEC = Bb / (Bb + Gb) = 74.38%.
>>>>
>>>> Now I have values of SENS and SPEC, which are absolute in nature. My
>>>> question was how do I interpret these absolute values. How do these
>>>> values help me to find out whether my model is good?
>>>>
>>>> With regards
>>>>
>>>> Ms Maithili Shiva
>>> I can't understand why you are interested in probabilities that are in
>>> backwards time order.
>>>
>>> Frank
>>>
>>>> ________________________________________________________________________
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>> Subject: [R] Logistic regression - Interpreting (SENS) and (SPEC)
>>>>> To: r-help at r-project.org Date: Friday, October 10, 2008, 5:54 AM
>>>>> Hi
>>>>>
>>>>> I am working on a credit scoring model using logistic
>>>>> regression. I have a main sample of 42500 clients and, based on
>>>>> their status as defaulted / non-defaulted, I
>>>>> have generated the probability of default.
>>>>>
>>>>> I have a hold out sample of 5000 clients. I have calculated
>>>>> (1) number of correctly classified goods (Gg), (2) number of correctly
>>>>> classified bads (Bb), and also (3) number of wrongly classified
>>>>> bads (Gb) and (4) number of wrongly classified goods (Bg).
>>>>>
>>>>> My problem is how to interpret these results. What I have
>>>>> arrived at are the absolute figures.
>>>>>
>>
>>
>
>
> --
> Frank E Harrell Jr   Professor and Chair           School of Medicine
>                      Department of Biostatistics   Vanderbilt University
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>




