[R] Hosmer-Lemeshow 'goodness of fit'

Frank Harrell f.harrell at vanderbilt.edu
Mon May 9 02:38:34 CEST 2011


Please read the documentation carefully, and replace the Design package with
the newer rms package.
The older Hosmer-Lemeshow test requires binning and has lower power.  It
also does not penalize for overfitting.  The newer goodness of fit test in
rms/Design should not agree with Hosmer-Lemeshow.

Frank

viostorm wrote:
> 
> I'm trying to do a Hosmer-Lemeshow 'goodness of fit' test on my logistic
> regression model.
> 
> I found some code here:
> 
> http://sas-and-r.blogspot.com/2010/09/example-87-hosmer-and-lemeshow-goodness.html
> 
> The R code is above is a little complicated for me but I'm having trouble
> with my answer:
> 
> Hosmer-Lemeshow: p=0.6163585
> le Cessie and Houwelingen test (Design library): p=0.2843620
> 
> The above link indicated they should be approximately equal which in my
> case they are not, any suggestions or is there a package function people
> would recommend in R for use with a logistic regression model?
> 
> Thanks in advance,
> 
> -Rob Schutt
> 
> --------------------------------
> Robert Schutt, MD, MCS 
> Resident - Department of Internal Medicine
> University of Virginia, Charlottesville, Virginia 
> 
> 
> ------------------
> 
> 
> ########################################################
> # Compute the Hosmer-Lemeshow 'goodness-of-fit' test
> 
> cd.full_model = glm(formula = Collaterals ~ CHF + Age + CABG 
> 	+ relevel (as.factor (num.obst.vessels),"one") 
> 	+ Current.smoker + DM + HTN + ace.inhibitor 
> 	+ MI, family = binomial(link = "logit"))
> 
> hosmerlem = function(y, yhat, g=10) {
>   cutyhat = cut(yhat,
>      breaks = quantile(yhat, probs=seq(0,
>        1, 1/g)), include.lowest=TRUE)
>   obs = xtabs(cbind(1 - y, y) ~ cutyhat)
>   expect = xtabs(cbind(1 - yhat, yhat) ~ cutyhat)
>   chisq = sum((obs - expect)^2/expect)
>   P = 1 - pchisq(chisq, g - 2)
>   return(list(chisq=chisq,p.value=P))
> }
> 
> hosmerlem(y=Collaterals, yhat=fitted(cd.full_model))
> 
> ########################################################33
> # Compute the le Cessie and Houwelingen test 
> 
> f <- lrm(Collaterals ~ CHF + Age + CABG 
> 	+ relevel (as.factor (num.obst.vessels),"one") 
> 	+ Current.smoker + DM + HTN + ace.inhibitor 
> 	+ MI, x=TRUE, y=TRUE)
> 
> library(Design)  
> resid(f, 'gof')
> 
> Output:
> 
>> hosmerlem(y=Collaterals, yhat=fitted(cd.full_model))
> $chisq
> [1] 6.275889
> 
> $p.value
> [1] 0.6163585
> 
>> resid(f, 'gof')
> Sum of squared errors     Expected value|H0                    SD 
>           118.5308396           118.3771115             0.1435944 
>                     Z                     P 
>             1.0705717             0.2843620
> 


-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: http://r.789695.n4.nabble.com/Hosmer-Lemeshow-goodness-of-fit-tp3508127p3508185.html
Sent from the R help mailing list archive at Nabble.com.



More information about the R-help mailing list