[R] Degrees of freedom in binomial glm
Giovanni Petris
GPetris at uark.edu
Thu Apr 10 22:59:13 CEST 2008
Hello,
I am looking at the job satisfaction data below, from a problem in
Agresti's book, and I am not sure where the degrees of freedom come
from. The way I am fitting the binomial model, I have 168 observations,
so as I understand it that should also be the number of fitted
parameters in the saturated model. Since the null model has a single
intercept parameter, I expected 167 df for the null model, but
R tells me it's 165. Where does this number come from?
Thanks in advance,
Giovanni
> ### Agresti, Problem 5.23
> race <- c("White", "Other")
> gender <- c("M", "F")
> age <- c("<35", "35-44", ">44")
> loc <- c("NE", "MidAtl", "S", "MidW", "NW", "SW", "Pac")
> sat <- factor(c("Yes", "No"), levels = c("No", "Yes"))
> Freq <- c(288, 60, 224, 35, 337, 70, 38, 19, 32, 22, 21, 15,
+ 177, 57, 166, 19, 172, 30, 33, 35, 11, 20, 8, 10,
+ 90, 19, 96, 12, 124, 17, 18, 13, 7, 0, 9, 1,
+ 45, 12, 42, 5, 39, 2, 6, 7, 2, 3, 2, 1,
+ 226, 88, 189, 44, 156, 70, 45, 47, 18, 13, 11, 9,
+ 128, 57, 117, 34, 73, 25, 31, 35, 3, 7, 2, 2,
+ 285, 110, 225, 53, 324, 60, 40, 66, 19, 25, 22, 11,
+ 179, 93, 141, 24, 140, 47, 25, 56, 11, 19, 2, 12,
+ 270, 176, 215, 80, 269, 110, 36, 25, 9, 11, 16, 4,
+ 180, 151, 108, 40, 136, 40, 20, 16, 7, 5, 3, 5,
+ 252, 97, 162, 47, 199, 62, 69, 45, 14, 8, 14, 2,
+ 126, 61, 72, 27, 93, 24, 27, 36, 7, 4, 5, 0,
+ 119, 62, 66, 20, 67, 25, 45, 22, 15, 10, 8, 6,
+ 58, 33, 20, 10, 21, 10, 16, 15, 10, 8, 6, 2)
> satdata <- data.frame(Freq, expand.grid(gender=gender, age=age,
+ race=race, sat=sat, loc=loc))
> sat.glm0 <- glm(sat ~ gender + age + race + loc, weights = Freq,
+ family = binomial, data = satdata)
> summary(sat.glm0)
Call:
glm(formula = sat ~ gender + age + race + loc, family = binomial,
data = satdata, weights = Freq)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-19.456   -6.839    0.000    6.309   17.635
Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.334265   0.056491   5.917 3.28e-09 ***
genderF     -0.180480   0.047575  -3.794 0.000149 ***
age35-44     0.122422   0.051836   2.362 0.018191 *
age>44       0.361610   0.051576   7.011 2.36e-12 ***
raceOther   -0.005883   0.061605  -0.095 0.923919
locMidAtl    0.437342   0.103821   4.212 2.53e-05 ***
locS         0.178574   0.073033   2.445 0.014481 *
locMidW      0.083189   0.066427   1.252 0.210449
locNW        0.134337   0.067498   1.990 0.046563 *
locSW        0.295874   0.073488   4.026 5.67e-05 ***
locPac       0.425480   0.096561   4.406 1.05e-05 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 12987 on 165 degrees of freedom
Residual deviance: 12880 on 155 degrees of freedom
AIC: 12902
Number of Fisher Scoring iterations: 4
> str(satdata)
'data.frame': 168 obs. of 6 variables:
$ Freq : num 288 60 224 35 337 70 38 19 32 22 ...
$ gender: Factor w/ 2 levels "M","F": 1 2 1 2 1 2 1 2 1 2 ...
$ age : Factor w/ 3 levels "<35","35-44",..: 1 1 2 2 3 3 1 1 2 2 ...
$ race : Factor w/ 2 levels "White","Other": 1 1 1 1 1 1 2 2 2 2 ...
$ sat : Factor w/ 2 levels "No","Yes": 2 2 2 2 2 2 2 2 2 2 ...
$ loc : Factor w/ 7 levels "NE","MidAtl",..: 1 1 1 1 1 1 1 1 1 1 ...
> sessionInfo()
R version 2.6.2 (2008-02-08)
i686-pc-linux-gnu
locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] tools_2.6.2
>
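One thing that may be worth checking, sketched here on toy data rather than on the session above: glm.fit() does not count rows whose prior weight is zero when it computes degrees of freedom, so the null df is (number of rows with nonzero weight) - 1 rather than nrow - 1. The data frame `toy`, its columns `y`, `x`, `w`, and the helper name `n_used` are hypothetical and exist only for this illustration.

```r
# Toy data (hypothetical, not from the post): 8 rows, one of which has weight 0
toy <- data.frame(
  y = factor(c("No", "Yes", "No", "Yes", "No", "Yes", "No", "Yes"),
             levels = c("No", "Yes")),
  x = factor(c("a", "a", "a", "a", "b", "b", "b", "b")),
  w = c(5, 3, 0, 4, 2, 6, 1, 2)
)

# Same style of fit as in the post: binary response with frequency weights
fit <- glm(y ~ x, weights = w, family = binomial, data = toy)

# Rows that actually contribute to the fit (nonzero weight)
n_used <- sum(toy$w != 0)

nrow(toy)        # total rows in the data frame
n_used           # rows with nonzero weight
fit$df.null      # n_used - 1 (intercept only)
fit$df.residual  # n_used - number of estimated coefficients
```

Applied to the session above, `sum(satdata$Freq == 0)` would give the number of zero-weight rows. The Freq vector as printed appears to contain two zero counts, which would be consistent with 168 - 2 - 1 = 165 for the null model and 165 - 10 = 155 for the residual df.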
--
Giovanni Petris <GPetris at uark.edu>
Associate Professor
Department of Mathematical Sciences
University of Arkansas - Fayetteville, AR 72701
Ph: (479) 575-6324, 575-8630 (fax)
http://definetti.uark.edu/~gpetris/