[R] Hosmer Lemeshow test

David Carlson dcarlson at tamu.edu
Tue Apr 23 19:02:30 CEST 2013


The warning is commented out or it would have been printed:
   #   warning("Some expected counts are less than 5. Use smaller number of
groups")

For 23 data points, the default of 10 bins is too many since one of the bins
is 0. Eight bins gives the warning, but prints results. You didn't indicate
how many bins SPSS used. 

-------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Endy BlackEndy
Sent: Tuesday, April 23, 2013 10:31 AM
To: r-help at r-project.org
Subject: [R] Hosmer Lemeshow test

Hi to everybody. I use the following routine (i found it in the internet) to
compute the Hosmer-Lemeshow test in the framework of logistic regression.

hosmerlemeshow = function(obj, g=10) {
# first, check to see if we fed in the right kind of object
   stopifnot(family(obj)$family=="binomial" && family(obj)$link=="logit")
   y = obj$model[[1]]
   # the double bracket (above) gets the index of items within an object
   if (is.factor(y))
      y = as.numeric(y)==2
   yhat = obj$fitted.values
   cutyhat=cut(yhat,quantile(yhat,0:g/g),include.lowest=TRUE)

   obs = xtabs(cbind(1 - y, y) ~ cutyhat)
   expect = xtabs(cbind(1 - yhat, yhat) ~ cutyhat)
   if (any(expect < 5))
   #   warning("Some expected counts are less than 5. Use smaller number of
groups")
   chisq = sum((obs - expect)^2/expect)
   P = 1 - pchisq(chisq, g - 2)
   cat("H-L2 statistic",round(chisq,4)," and its p value",P,"\n")
   # by returning an object of class "htest", the function will perform like
the
   # built-in hypothesis tests
   return(structure(list(
      method = c(paste("Hosmer and Lemeshow goodness-of-fit test with", g,
"bins", sep=" ")),
      data.name = deparse(substitute(obj)),
      statistic = c(X2=chisq),
      parameter = c(df=g-2),
      p.value = P
   ), class='htest'))
   return(list(chisq=chisq,p.value=P))
}

However when i run it with the data set below i get the value NaN. Using the
same data set with SPSS i get a specific value.

FlightNo Temp ThermalDisast
1 66 0
2 70 1
3 69 0
4 68 0
5 67 0
6 72 0
7 73 0
8 70 0
9 57 1
10 63 1
11 70 1
12 78 0
13 67 0
14 53 1
15 67 0
16 75 0
17 70 0
18 81 0
19 76 0
20 79 0
21 75 1
22 76 0
23 58 1

Any suggestions will greatly appreciated.
With regards
Endy

	[[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list