[R] How to assess the accuracy of fitted logistic regression using glm
Uwe Ligges
ligges at statistik.tu-dortmund.de
Fri Jun 10 14:41:53 CEST 2011
On 10.06.2011 08:54, Xiaobo Gu wrote:
> Hi Professor Brian,
>
> Thanks for your reply.
>
> I think there are many statisticians here, and the question is
> R-related, so I hope someone can
> help me.
>
> I have done a simple test using sample CSV data, which I can post if needed.
>
> donut <- read.csv(file = "D:/donut.csv", header = TRUE)
> donut[["color"]] <- as.factor(donut[["color"]])
> donut[["shape"]] <- as.factor(donut[["shape"]])
> donut[["k"]] <- as.factor(donut[["k"]])
> donut[["k0"]] <- as.factor(donut[["k0"]])
> donut[["bias"]] <- as.factor(donut[["bias"]])
>
> lr <- glm(color ~ shape + x + y, family = binomial, data = donut)
> summary(lr)
>
> Call:
> glm(formula = color ~ shape + x + y, family = binomial, data = donut)
>
> Deviance Residuals:
>     Min       1Q   Median       3Q      Max
> -2.1079  -0.9476   0.5086   0.7518   1.4079
>
> Coefficients:
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept)  2.53010    1.65500   1.529   0.1263
> shape22      0.05628    1.54990   0.036   0.9710
> shape23     -0.74568    1.44813  -0.515   0.6066
> shape24     -2.61896    1.38016  -1.898   0.0578 .
> shape25     -2.07648    1.32818  -1.563   0.1180
> x           -0.45885    1.52863  -0.300   0.7640
> y           -0.59311    1.46999  -0.403   0.6866
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> (Dispersion parameter for binomial family taken to be 1)
>
> Null deviance: 50.446 on 39 degrees of freedom
> Residual deviance: 42.473 on 33 degrees of freedom
> AIC: 56.473
>
> Number of Fisher Scoring iterations: 4
>
> In the Coefficients section, is Pr(>|z|) the p-value for that
> variable? I also have
> a few other questions:
> 1. How do I determine the predictive power of each variable?
> 2. How do I determine the overall performance of the fitted model, and what is the
> difference between "Deviance Residuals" and "Residual deviance"?
> 3. How do I compare "Null deviance" and "Residual deviance"?
> 4. What does AIC mean, and how do I use this measure?
> 5. What does the "Signif. codes" section mean?
To answer your question, we'd need to write half a book, at least. This
cannot be answered in an e-mail message. Hence please re-read Brian
Ripley's advice and try to get statistical advice from a local
consultant or read elementary textbooks on the subject.
Uwe Ligges
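
As a rough illustration only (it is no substitute for the statistical
advice recommended above), the arithmetic behind some of the quantities
in the quoted summary can be checked by hand. The sketch below copies the
deviances and degrees of freedom from that output; the variable names are
made up for this example, and the likelihood-ratio comparison relies on
the usual large-sample chi-squared approximation, which may be poor with
only 40 observations.

```r
# Values copied from the summary(lr) output quoted above
null_dev  <- 50.446; null_df  <- 39   # intercept-only model
resid_dev <- 42.473; resid_df <- 33   # model with shape + x + y

# Number of estimated coefficients: intercept, 4 shape dummies, x, y
k <- null_df - resid_df + 1           # 7

# For an ungrouped 0/1 response the deviance equals -2 * log-likelihood,
# so AIC = residual deviance + 2 * k
aic <- resid_dev + 2 * k              # 56.473, as reported by summary()

# Comparing null and residual deviance: likelihood-ratio test of
# "all slopes are zero" (assumes the chi-squared approximation holds)
lr_stat <- null_dev - resid_dev       # drop in deviance: 7.973
lr_df   <- null_df - resid_df         # 6 extra parameters
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)  # about 0.24

print(c(AIC = aic, LR = lr_stat, df = lr_df, p = p_value))
```

Read this way, the drop in deviance is modest for the 6 degrees of
freedom spent, which is in line with none of the individual coefficients
being clearly significant. AIC itself is only meaningful when compared
between models fitted to the same data (smaller is better).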
> Regards,
>
> Xiaobo Gu
>
>
>
> On Mon, Jun 6, 2011 at 9:59 PM, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
>> On Mon, 6 Jun 2011, Xiaobo Gu wrote:
>>
>>> Hi,
>>>
>>> I am trying glm with family = binomial to do binary logistic
>>> regression, but how can I assess the accuracy of the fitted model? The
>>> summary method prints a lot of information about the returned
>>> object, such as the coefficients. Statistics is not my speciality,
>>> so can you share some rules of thumb for examining the fitted model from a
>>> practical perspective?
>>
>> It depends entirely on why you did the fit. People have written whole books
>> on assessing the performance of classification procedures such as binary
>> logistic regression. For example, the residual deviance is closely related
>> to log-probability scoring: for some purposes that is a good performance
>> measure, for others (e.g. when you are going to threshold the predicted
>> probabilities) it can be very misleading.
>>
>> In short, you need statistical advice, not R advice (the purpose of this
>> list).
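
The link between residual deviance and log-probability scoring mentioned
above can be sketched in R. This is only an illustration: it uses the
built-in mtcars data rather than the poster's donut.csv, the 0.5 cutoff
is an arbitrary assumption, and both measures are computed on the
training data, so they are optimistic.

```r
# Illustrative fit on built-in data (not the donut data from this thread)
fit <- glm(am ~ wt, family = binomial, data = mtcars)

y <- mtcars$am       # observed 0/1 response
p <- fitted(fit)     # fitted probabilities P(am = 1)

# Log-probability score: minus the summed log of the probability
# that the model assigned to the outcome actually observed
log_score <- -sum(y * log(p) + (1 - y) * log(1 - p))

# For an ungrouped 0/1 response, residual deviance = 2 * log score
stopifnot(isTRUE(all.equal(deviance(fit), 2 * log_score)))

# Residual deviance is also the sum of squared deviance residuals
stopifnot(isTRUE(all.equal(deviance(fit),
                           sum(residuals(fit, type = "deviance")^2))))

# A different performance measure: misclassification rate after
# thresholding the fitted probabilities at an assumed cutoff of 0.5
pred_class <- as.integer(p > 0.5)
misclass   <- mean(pred_class != y)

print(c(deviance = deviance(fit), log_score = log_score, misclass = misclass))
```

The two measures answer different questions: the log score rewards
well-calibrated probabilities, while the misclassification rate only
looks at which side of the cutoff each prediction falls on.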
>>
>>>
>>> Regards,
>>>
>>> Xiaobo Gu
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>> --
>> Brian D. Ripley, ripley at stats.ox.ac.uk
>> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
>> University of Oxford, Tel: +44 1865 272861 (self)
>> 1 South Parks Road, +44 1865 272866 (PA)
>> Oxford OX1 3TG, UK Fax: +44 1865 272595
>>
>