[R] Hosmer Lemeshow test

Frank Harrell f.harrell at vanderbilt.edu
Tue Apr 23 21:58:53 CEST 2013


The H-L test is now considered to be obsolete by many.  One replacement is
the Hosmer - le Cessie test as implemented in the rms package residuals.lrm
function.  This is a one degree of freedom test.

One problem with H-L is its use of arbitrary binning and suboptimal power. 
The new test requires no binning at all.  Probably better still are directed
tests such as tests of linearity and additivity.
Frank

David Carlson wrote
> The warning is commented out or it would have been printed:
>    #   warning("Some expected counts are less than 5. Use smaller number
> of
> groups")
> 
> For 23 data points, the default of 10 bins is too many since one of the
> bins
> is 0. Eight bins gives the warning, but prints results. You didn't
> indicate
> how many bins SPSS used. 
> 
> -------------------------------------
> David L Carlson
> Associate Professor of Anthropology
> Texas A&M University
> College Station, TX 77840-4352
> 
> -----Original Message-----
> From: 

> r-help-bounces@

>  [mailto:

> r-help-bounces@

> ] On
> Behalf Of Endy BlackEndy
> Sent: Tuesday, April 23, 2013 10:31 AM
> To: 

> r-help@

> Subject: [R] Hosmer Lemeshow test
> 
> Hi to everybody. I use the following routine (i found it in the internet)
> to
> compute the Hosmer-Lemeshow test in the framework of logistic regression.
> 
> hosmerlemeshow = function(obj, g=10) {
> # first, check to see if we fed in the right kind of object
>    stopifnot(family(obj)$family=="binomial" && family(obj)$link=="logit")
>    y = obj$model[[1]]
>    # the double bracket (above) gets the index of items within an object
>    if (is.factor(y))
>       y = as.numeric(y)==2
>    yhat = obj$fitted.values
>    cutyhat=cut(yhat,quantile(yhat,0:g/g),include.lowest=TRUE)
> 
>    obs = xtabs(cbind(1 - y, y) ~ cutyhat)
>    expect = xtabs(cbind(1 - yhat, yhat) ~ cutyhat)
>    if (any(expect < 5))
>    #   warning("Some expected counts are less than 5. Use smaller number
> of
> groups")
>    chisq = sum((obs - expect)^2/expect)
>    P = 1 - pchisq(chisq, g - 2)
>    cat("H-L2 statistic",round(chisq,4)," and its p value",P,"\n")
>    # by returning an object of class "htest", the function will perform
> like
> the
>    # built-in hypothesis tests
>    return(structure(list(
>       method = c(paste("Hosmer and Lemeshow goodness-of-fit test with", g,
> "bins", sep=" ")),
>       data.name = deparse(substitute(obj)),
>       statistic = c(X2=chisq),
>       parameter = c(df=g-2),
>       p.value = P
>    ), class='htest'))
>    return(list(chisq=chisq,p.value=P))
> }
> 
> However when i run it with the data set below i get the value NaN. Using
> the
> same data set with SPSS i get a specific value.
> 
> FlightNo Temp ThermalDisast
> 1 66 0
> 2 70 1
> 3 69 0
> 4 68 0
> 5 67 0
> 6 72 0
> 7 73 0
> 8 70 0
> 9 57 1
> 10 63 1
> 11 70 1
> 12 78 0
> 13 67 0
> 14 53 1
> 15 67 0
> 16 75 0
> 17 70 0
> 18 81 0
> 19 76 0
> 20 79 0
> 21 75 1
> 22 76 0
> 23 58 1
> 
> Any suggestions will greatly appreciated.
> With regards
> Endy
> 
> 	[[alternative HTML version deleted]]
> 
> ______________________________________________

> R-help@

>  mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> ______________________________________________

> R-help@

>  mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.





-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: http://r.789695.n4.nabble.com/Hosmer-Lemeshow-test-tp4665115p4665154.html
Sent from the R help mailing list archive at Nabble.com.



More information about the R-help mailing list