[R] Hosmer- Lemeshow test
Frank E Harrell Jr
f.harrell at vanderbilt.edu
Wed Sep 17 14:03:43 CEST 2008
saggak wrote:
>
> Dear Mr Frank,
>
> I thank you for your prompt reply. However, I am not able to understand
> (maybe because R is a new venture for me) the contents of your reply. If
> it's a book you are referring to, I don't have access to it. How do I get
> @ARTICLE{hos97com and how do I run it in R?
>
> Thanking you in advance
>
> With regards
>
> Saggak
The article is in the journal Statistics in Medicine. I mentioned how
to run it. Here is an example:
library(Design) # also requires Hmisc package
f <- lrm(y ~ x1+x2*pol(x3,2), x=TRUE, y=TRUE)
resid(f, 'gof')
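A note on the error quoted further down in this thread: it is consistent with y being passed as a data frame rather than a numeric 0/1 vector, since cbind(1 - y, y) of a data frame is itself a data frame (a list). The sketch below uses simulated data in base R; the names dat, x and fit are assumptions and are not from the original example. It only illustrates the likely cause of the error and does not address the concerns about the test itself.

```r
# Simulated stand-in data (assumption: not the 'diseaseoutbreak' data)
set.seed(1)
n <- 200
x <- rnorm(n)
dat <- data.frame(disease = rbinom(n, 1, plogis(-0.5 + x)), x = x)

# Ordinary logistic regression fit with base R
fit <- glm(disease ~ x, family = binomial, data = dat)

# The function quoted in this thread, with T written out as TRUE
hosmerlem <- function(y, yhat, g = 10) {
  cutyhat <- cut(yhat, breaks = quantile(yhat, probs = seq(0, 1, 1/g)),
                 include.lowest = TRUE)
  obs <- xtabs(cbind(1 - y, y) ~ cutyhat)
  expect <- xtabs(cbind(1 - yhat, yhat) ~ cutyhat)
  chisq <- sum((obs - expect)^2 / expect)
  P <- 1 - pchisq(chisq, g - 2)
  c("X^2" = chisq, Df = g - 2, "P(>Chi)" = P)
}

# Passing a one-column data frame as y fails inside model.frame()
res_bad <- try(hosmerlem(dat["disease"], fitted(fit)), silent = TRUE)
inherits(res_bad, "try-error")

# Passing the numeric 0/1 response vector runs
res_ok <- hosmerlem(dat$disease, fitted(fit))
res_ok
```

If the response sits in a data frame, extracting the column (for example with $) before calling the function is one way to avoid the error.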
Frank
>
>
>
> --- On *Tue, 16/9/08, Frank E Harrell Jr /<f.harrell at vanderbilt.edu>/*
> wrote:
>
> From: Frank E Harrell Jr <f.harrell at vanderbilt.edu>
> Subject: Re: [R] Hosmer- Lemeshow test
> To: saggak1908 at yahoo.co.in
> Cc: "R list" <r-help at stat.math.ethz.ch>
> Date: Tuesday, 16 September, 2008, 4:38 PM
>
> saggak wrote:
> > Dear R - help,
> >
> > I am working on a credit scorecard model. I am using logistic
> > regression to arrive at the regression coefficients of the model.
> >
> > I want to use the Hosmer-Lemeshow test.
> >
> > In order to understand the use of the R language, I referred to the
> > following URL
> >
> >     http://www.stat.sc.edu/~hitchcock/diseaseoutbreakRexample704.txt
> >
> > The related data 'diseaseoutbreak' is available at the following
> > URL
> >
> >     http://www.stat.sc.edu/~hitchcock/diseaseoutbreakdata.txt
> >
> > The R code as mentioned therein is
> >
> > ####
> > # A function to do the Hosmer-Lemeshow test in R.
> > # R Function is due to Peter D. M. Macdonald, McMaster University.
> > #
> > hosmerlem <-
> > function (y, yhat, g = 10)
> > {
> > cutyhat <- cut(yhat, breaks = quantile(yhat, probs = seq(0,
> > 1, 1/g)), include.lowest = TRUE)
> > obs <- xtabs(cbind(1 - y, y) ~ cutyhat)
> > expect <- xtabs(cbind(1 - yhat, yhat) ~ cutyhat)
> > chisq <- sum((obs - expect)^2/expect)
> > P <- 1 - pchisq(chisq, g - 2)
> > c("X^2" = chisq, Df = g - 2, "P(>Chi)" = P)
> > }
> > #
> > ######
> >
> > # Doing the Hosmer-Lemeshow test
> > # (after copying the above function into R):
> >
> > hosmerlem(disease, fitted(disease.logit))
> >
> > However, when I ran these commands in R, I got the following error:
> >
> > Error in model.frame.default(formula = cbind(1 - y, y) ~ cutyhat) :
> >   invalid type (list) for variable 'cbind(1 - y, y)'
> >
> > Can anyone please guide me on how to run the Hosmer-Lemeshow test, and
> > also how to find the other usual logistic regression quantities, such as
> > log-likelihood, AIC, pseudo R, etc.?
> >
> > Thanking you all in advance
> >
> > Saggak
>
> That test is too dependent on cutpoints and does not have adequate
> power. I recommend replacing it with
>
> @ARTICLE{hos97com,
>   author = {Hosmer, D. W. and Hosmer, T. and {le Cessie}, S. and
>             Lemeshow, S.},
>   year = 1997,
>   title = {A comparison of goodness-of-fit tests for the logistic
>            regression model},
>   journal = {Statistics in Medicine},
>   volume = 16,
>   pages = {965-980},
>   annote = {goodness-of-fit for binary logistic model; difficulty with
>             Hosmer-Lemeshow statistic being dependent on how groups are
>             defined; sum of squares test; cumulative sum test; invalidity
>             of naive test based on deviance; goodness-of-link function;
>             simulation setup}
> }
>
> which is implemented in the residuals.lrm function in the Design package.
>
>
> --
> Frank E Harrell Jr Professor and Chair School of Medicine
> Department of Biostatistics Vanderbilt University
>
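On the question of log-likelihood, AIC and a pseudo R-squared raised in the original post: a minimal sketch in base R, assuming a binomial glm fit on simulated data. The variable names are made up for illustration, and McFadden's pseudo R-squared is used as one common choice; the post does not say which measure was meant.

```r
# Simulated data (assumption: stands in for the poster's scorecard data)
set.seed(2)
n <- 200
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + x))

fit  <- glm(y ~ x, family = binomial)   # model of interest
null <- glm(y ~ 1, family = binomial)   # intercept-only model

ll  <- as.numeric(logLik(fit))          # maximized log-likelihood
aic <- AIC(fit)                         # -2 * logLik + 2 * (number of parameters)

# McFadden's pseudo R^2: 1 - logLik(model) / logLik(intercept-only model)
r2_mcfadden <- 1 - ll / as.numeric(logLik(null))

c(logLik = ll, AIC = aic, McFadden_R2 = r2_mcfadden)
```

The same three quantities can be read off any glm fit; only the intercept-only refit is extra work.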