[R] Hosmer-Lemeshow 'goodness of fit'
Frank Harrell
f.harrell at vanderbilt.edu
Mon May 9 02:38:34 CEST 2011
Please read the documentation carefully, and replace the Design package with
the newer rms package.
The older Hosmer-Lemeshow test requires binning and has lower power. It
also does not penalize for overfitting. The newer goodness of fit test in
rms/Design should not agree with Hosmer-Lemeshow.
Frank
viostorm wrote:
>
> I'm trying to do a Hosmer-Lemeshow 'goodness of fit' test on my logistic
> regression model.
>
> I found some code here:
>
> http://sas-and-r.blogspot.com/2010/09/example-87-hosmer-and-lemeshow-goodness.html
>
> The R code is above is a little complicated for me but I'm having trouble
> with my answer:
>
> Hosmer-Lemeshow: p=0.6163585
> le Cessie and Houwelingen test (Design library): p=0.2843620
>
> The above link indicated they should be approximately equal which in my
> case they are not, any suggestions or is there a package function people
> would recommend in R for use with a logistic regression model?
>
> Thanks in advance,
>
> -Rob Schutt
>
> --------------------------------
> Robert Schutt, MD, MCS
> Resident - Department of Internal Medicine
> University of Virginia, Charlottesville, Virginia
>
>
> ------------------
>
>
> ########################################################
> # Compute the Hosmer-Lemeshow 'goodness-of-fit' test
>
> cd.full_model = glm(formula = Collaterals ~ CHF + Age + CABG
> + relevel (as.factor (num.obst.vessels),"one")
> + Current.smoker + DM + HTN + ace.inhibitor
> + MI, family = binomial(link = "logit"))
>
> hosmerlem = function(y, yhat, g=10) {
> cutyhat = cut(yhat,
> breaks = quantile(yhat, probs=seq(0,
> 1, 1/g)), include.lowest=TRUE)
> obs = xtabs(cbind(1 - y, y) ~ cutyhat)
> expect = xtabs(cbind(1 - yhat, yhat) ~ cutyhat)
> chisq = sum((obs - expect)^2/expect)
> P = 1 - pchisq(chisq, g - 2)
> return(list(chisq=chisq,p.value=P))
> }
>
> hosmerlem(y=Collaterals, yhat=fitted(cd.full_model))
>
> ########################################################33
> # Compute the le Cessie and Houwelingen test
>
> f <- lrm(Collaterals ~ CHF + Age + CABG
> + relevel (as.factor (num.obst.vessels),"one")
> + Current.smoker + DM + HTN + ace.inhibitor
> + MI, x=TRUE, y=TRUE)
>
> library(Design)
> resid(f, 'gof')
>
> Output:
>
>> hosmerlem(y=Collaterals, yhat=fitted(cd.full_model))
> $chisq
> [1] 6.275889
>
> $p.value
> [1] 0.6163585
>
>> resid(f, 'gof')
> Sum of squared errors Expected value|H0 SD
> 118.5308396 118.3771115 0.1435944
> Z P
> 1.0705717 0.2843620
>
-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: http://r.789695.n4.nabble.com/Hosmer-Lemeshow-goodness-of-fit-tp3508127p3508185.html
Sent from the R help mailing list archive at Nabble.com.
More information about the R-help
mailing list