[R] How to assess the accuracy of fitted logistic regression using glm

Uwe Ligges ligges at statistik.tu-dortmund.de
Fri Jun 10 14:41:53 CEST 2011



On 10.06.2011 08:54, Xiaobo Gu wrote:
> Hi Professor Brian,
>
> Thanks for your reply.
>
> I think there are many statisticians here, and the question is
> somewhat R-related, so I am hoping someone can
> help me.
>
> I have done a simple test using a sample CSV data set, which I can post if needed.
>
> donut <- read.csv(file = "D:/donut.csv", header = TRUE)
> donut[["color"]] <- as.factor(donut[["color"]])
> donut[["shape"]] <- as.factor(donut[["shape"]])
> donut[["k"]] <- as.factor(donut[["k"]])
> donut[["k0"]] <- as.factor(donut[["k0"]])
> donut[["bias"]] <- as.factor(donut[["bias"]])
>
> lr <- glm(color ~ shape + x + y, family = binomial, data = donut)
> summary(lr)
>
> Call:
> glm(formula = color ~ shape + x + y, family = binomial, data = donut)
>
> Deviance Residuals:
>      Min       1Q   Median       3Q      Max
> -2.1079  -0.9476   0.5086   0.7518   1.4079
>
> Coefficients:
>              Estimate Std. Error z value Pr(>|z|)
> (Intercept)  2.53010    1.65500   1.529   0.1263
> shape22      0.05628    1.54990   0.036   0.9710
> shape23     -0.74568    1.44813  -0.515   0.6066
> shape24     -2.61896    1.38016  -1.898   0.0578 .
> shape25     -2.07648    1.32818  -1.563   0.1180
> x           -0.45885    1.52863  -0.300   0.7640
> y           -0.59311    1.46999  -0.403   0.6866
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> (Dispersion parameter for binomial family taken to be 1)
>
>      Null deviance: 50.446  on 39  degrees of freedom
> Residual deviance: 42.473  on 33  degrees of freedom
> AIC: 56.473
>
> Number of Fisher Scoring iterations: 4
>
> In the Coefficients section, is Pr(>|z|) the p-value for that
> variable? I also have
> a few other questions:
> 1. How do I determine the predictive power of each variable?
> 2. How do I determine the overall performance of the fitted model, and what is the
> difference between "Deviance Residuals" and "Residual deviance"?
> 3. How do I compare "Null deviance" and "Residual deviance"?
> 4. What does AIC mean, and how is this measure used?
> 5. What does the Signif. codes section mean?


To answer your questions, we'd need to write half a book, at least. This
cannot be answered in an e-mail message. Hence please re-read Brian
Ripley's advice and try to get statistical advice from a local
consultant or read elementary textbooks on the subject.
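
As a pointer to the mechanics only, and not a substitute for statistical
advice, the sketch below shows how the deviance and AIC figures in a
summary() printout relate to one another. It uses simulated data, because
donut.csv was not posted; the variable names and coefficients in the
simulation are assumptions made up for illustration. The same arithmetic
can be checked against the quoted output: 50.446 - 42.473 = 7.973 on
39 - 33 = 6 degrees of freedom, and 42.473 + 2 * 7 = 56.473, the reported
AIC.

```r
# Simulated stand-in data (assumption): donut.csv was not posted, so none of
# these numbers correspond to the output quoted in this thread.
set.seed(1)
n <- 200
d <- data.frame(x = rnorm(n), y = rnorm(n))
d$color <- rbinom(n, size = 1, prob = plogis(0.5 + 1.2 * d$x - 0.8 * d$y))

fit <- glm(color ~ x + y, family = binomial, data = d)

# "Residual deviance" is a single number for the whole model; for 0/1
# responses it equals -2 * log-likelihood. "Deviance residuals" are
# per-observation quantities whose squares sum to the residual deviance.
dev_res <- residuals(fit, type = "deviance")
all.equal(sum(dev_res^2), deviance(fit))
all.equal(deviance(fit), -2 * as.numeric(logLik(fit)))

# Null vs. residual deviance: the drop is a likelihood-ratio statistic for
# "all slopes are zero", referred to a chi-squared distribution.
lr_stat <- fit$null.deviance - deviance(fit)
lr_df   <- fit$df.null - fit$df.residual
lr_p    <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# AIC = -2 * log-likelihood + 2 * (number of estimated coefficients).
# It is only meaningful relative to other models fitted to the same data.
aic_by_hand <- -2 * as.numeric(logLik(fit)) + 2 * length(coef(fit))
all.equal(aic_by_hand, AIC(fit))

# Per-term likelihood-ratio tests, an alternative to the Wald z tests
# that summary() prints in the Pr(>|z|) column.
drop1(fit, test = "Chisq")
```

Whether any of these numbers is a sensible measure of performance still
depends on why the model was fitted.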

Uwe Ligges



> Regards,
>
> Xiaobo Gu
>
>
>
> On Mon, Jun 6, 2011 at 9:59 PM, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
>> On Mon, 6 Jun 2011, Xiaobo Gu wrote:
>>
>>> Hi,
>>>
>>> I am using glm with family = binomial to fit a binary logistic
>>> regression, but how can I assess the accuracy of the fitted model? The
>>> summary method prints a lot of information about the returned
>>> object, such as the coefficients. Statistics is not my speciality,
>>> so could you share some rules of thumb for examining the fitted model from a
>>> practical perspective?
>>
>> It depends entirely on why you did the fit.  People have written whole books
>> on assessing the performance of classification procedures such as binary
>> logistic regression.  For example, the residual deviance is closely related
>> to log-probability scoring: for some purposes that is a good performance
>> measure, for others (e.g. when you are going to threshold the predicted
>> probabilities) it can be very misleading.
>>
>> In short, you need statistical advice, not R advice (the purpose of this
>> list).
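
The contrast between log-probability scoring and thresholding can be
illustrated with a small sketch. The data are simulated and all names
(d, p, p_sharp, the factor 3) are assumptions chosen for illustration,
not anything from the original posts. It shows that the residual deviance
is a rescaled log score, and that two sets of predicted probabilities can
give identical classifications at a 0.5 threshold while having different
log scores.

```r
# Simulated data (assumption): one predictor, binary response.
set.seed(2)
n <- 500
d <- data.frame(x = rnorm(n))
d$y <- rbinom(n, size = 1, prob = plogis(-0.3 + 1.5 * d$x))

fit <- glm(y ~ x, family = binomial, data = d)
p   <- fitted(fit)  # predicted probabilities on the training data

# Mean log-probability score (higher is better). For 0/1 responses the
# residual deviance is -2 * n times this quantity.
log_score <- mean(d$y * log(p) + (1 - d$y) * log(1 - p))
all.equal(-2 * n * log_score, deviance(fit))

# Thresholding the probabilities at 0.5 gives hard classifications.
pred     <- as.integer(p > 0.5)
accuracy <- mean(pred == d$y)
table(predicted = pred, observed = d$y)

# Make the probabilities more extreme by tripling the log-odds. Every case
# stays on the same side of 0.5, so the classifications and the accuracy
# are unchanged, but the log score changes.
p_sharp         <- plogis(3 * qlogis(p))
pred_sharp      <- as.integer(p_sharp > 0.5)
log_score_sharp <- mean(d$y * log(p_sharp) + (1 - d$y) * log(1 - p_sharp))

identical(pred, pred_sharp)
c(fitted = log_score, sharpened = log_score_sharp)
```

Both scores here are computed on the data used for fitting, so they are
optimistic; which measure matters depends on how the predictions will be
used.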
>>
>>>
>>> Regards,
>>>
>>> Xiaobo Gu
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>> --
>> Brian D. Ripley,                  ripley at stats.ox.ac.uk
>> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>> University of Oxford,             Tel:  +44 1865 272861 (self)
>> 1 South Parks Road,                     +44 1865 272866 (PA)
>> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>>
>


