[R] Logistic Regression - Interpreting SENS (Sensitivity) and SPEC (Specificity)
Prof Brian Ripley
ripley at stats.ox.ac.uk
Mon Oct 13 09:52:30 CEST 2008
On Mon, 13 Oct 2008, Peter Dalgaard wrote:
> Dieter Menne wrote:
>> Maithili Shiva <maithili_shiva <at> yahoo.com> writes:
>>
>>> I have a main sample of 42500 clients and
>>> based on their status (defaulted / non-defaulted), I have
>> generated the probability of default.
>>> I have a hold out sample of 5000 clients. I have calculated (1) No of
>> correctly classified goods Gg, (2) No of
>>> correctly classified bads Bb and also (3) number of wrongly classified bads
>> (Gb) and (4) number of wrongly
>>> classified goods (Bg).
>>
>> The simple and wrong answer is to use these data directly to compute
>> sensitivity
>> (fraction of hits). This measure is useless, but I encounter it often in
>> medical
>> publications.
>>
>> You can get a more reasonable answer by using cross-validation. Check, for
>> example, Frank Harrell's
>> http://biostat.mc.vanderbilt.edu/twiki/pub/Main/RmS/logistic.val.pdf
>
> But if he has a "hold out sample", isn't he already cross-validating?? I
> wonder if you're answering the right question there. Could he just be looking
> for Sp=Gg/(Gg+Bg), Se=Bb/(Gb+Bb)? (If I got the notation right.)
Strictly no, she is 'validating' (no cross- involved). Cross-validation
would be a better idea for much smaller sample sizes (we don't know how
many regressors are involved, so say hundreds unless there are more than
ten regressors).
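
For what it's worth, here is a minimal sketch in R of the hold-out calculation using Peter's formulas. The counts are made up purely for illustration (they are not the poster's data), and the object names (actual, predicted, tab) are hypothetical. It assumes the first letter of each count is the predicted class and the second the actual class, so Gb is a bad client classified as good.

```r
# Hypothetical hold-out sample of 5000 clients; counts invented for illustration.
lev <- c("good", "bad")
actual <- factor(rep(lev, times = c(4500, 500)), levels = lev)

# Assumed classifications: 4050 goods and 350 bads classified correctly.
predicted <- factor(c(rep(lev, times = c(4050, 450)),   # actual goods
                      rep(lev, times = c(150, 350))),   # actual bads
                    levels = lev)

# Cross-tabulate predicted class (rows) against actual class (columns).
tab <- table(predicted, actual)

Gg <- tab["good", "good"]  # goods classified as good
Bg <- tab["bad",  "good"]  # goods wrongly classified as bad
Gb <- tab["good", "bad"]   # bads wrongly classified as good
Bb <- tab["bad",  "bad"]   # bads classified as bad

# Treating "bad" (default) as the event of interest:
Sp <- as.numeric(Gg / (Gg + Bg))  # specificity: fraction of goods recognised
Se <- as.numeric(Bb / (Gb + Bb))  # sensitivity: fraction of bads recognised

print(tab)
cat("Specificity:", Sp, " Sensitivity:", Se, "\n")
```

Both figures depend on the cutoff applied to the predicted probability of default, so they describe one classification rule rather than the fitted model as a whole.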
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595