[R] Degrees of freedom in binomial glm

Prof Brian Ripley ripley at stats.ox.ac.uk
Thu Apr 10 23:20:34 CEST 2008


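You don't have 168 observations - 2 of them have no data (Freq = 0).
glm() does not count zero-weight cases, so the null model has
166 - 1 = 165 df and your 11-coefficient model has 166 - 11 = 155.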
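
The counting can be checked on a small made-up example. This is only a
sketch: the data frame `d` and its values are hypothetical (not from the
post), chosen so that one of its six rows has Freq = 0. `df.null` and
`df.residual` are the usual components of a fitted glm object.

```r
# Hypothetical toy data: 6 rows, one of them with zero weight
d <- data.frame(
  y    = factor(c("No", "Yes", "No", "Yes", "No", "Yes"),
                levels = c("No", "Yes")),
  x    = c(1, 1, 2, 2, 3, 3),
  Freq = c(10, 20, 15, 0, 5, 8)
)

fit <- glm(y ~ x, weights = Freq, family = binomial, data = d)

# Rows that actually carry data
n_ok <- sum(d$Freq > 0)

# Degrees of freedom reported by glm versus the hand count
c(null = fit$df.null,     by_hand = n_ok - 1)
c(resid = fit$df.residual, by_hand = n_ok - length(coef(fit)))

# Dropping the empty row by hand should give the same fit
fit2 <- glm(y ~ x, weights = Freq, family = binomial,
            data = subset(d, Freq > 0))
all.equal(coef(fit), coef(fit2))
```

On the data in the original post the same count would be
sum(Freq > 0) = 166, which gives the 165 and 155 in the summary.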

On Thu, 10 Apr 2008, Giovanni Petris wrote:

>
> Hello,
>
> I am looking at the job satisfaction data below, from a problem in
> Agresti's book, and I am not sure where the degrees of freedom come
> from. The way I am fitting a binomial model, I have 168 observations,
> so in my understanding that should also be the number of fitted
> parameters in the saturated model. Since I have one intercept
> parameter, I was thinking to get 167 df for the Null model, but
> R tells me it's 165. Where does this number come from?
>
> Thanks in advance,
> Giovanni
>
>
>> ### Agresti, Problem 5.23
>> race <- c("White", "Other")
>> gender <- c("M", "F")
>> age <- c("<35", "35-44", ">44")
>> loc <- c("NE", "MidAtl", "S", "MidW", "NW", "SW", "Pac")
>> sat <- factor(c("Yes", "No"), levels = c("No", "Yes"))
>> Freq <- c(288,  60, 224, 35, 337,  70,   38, 19, 32, 22, 21, 15,
> +           177,  57, 166, 19, 172,  30,   33, 35, 11, 20,  8, 10,
> +            90,  19,  96, 12, 124,  17,   18, 13,  7,  0,  9,  1,
> +            45,  12,  42,  5,  39,   2,    6,  7,  2,  3,  2,  1,
> +           226,  88, 189, 44, 156,  70,   45, 47, 18, 13, 11,  9,
> +           128,  57, 117, 34,  73,  25,   31, 35,  3,  7,  2,  2,
> +           285, 110, 225, 53, 324,  60,   40, 66, 19, 25, 22, 11,
> +           179,  93, 141, 24, 140,  47,   25, 56, 11, 19,  2, 12,
> +           270, 176, 215, 80, 269, 110,   36, 25,  9, 11, 16,  4,
> +           180, 151, 108, 40, 136,  40,   20, 16,  7,  5,  3,  5,
> +           252,  97, 162, 47, 199,  62,   69, 45, 14,  8, 14,  2,
> +           126,  61,  72, 27,  93,  24,   27, 36,  7,  4,  5,  0,
> +           119,  62,  66, 20,  67,  25,   45, 22, 15, 10,  8,  6,
> +            58,  33,  20, 10,  21,  10,   16, 15, 10,  8,  6,  2)
>> satdata <- data.frame(Freq, expand.grid(gender=gender, age=age,
> +                                         race=race, sat=sat, loc=loc))
>> sat.glm0 <- glm(sat ~ gender + age + race + loc, weights = Freq,
> +                 family = binomial, data = satdata)
>> summary(sat.glm0)
>
> Call:
> glm(formula = sat ~ gender + age + race + loc, family = binomial,
>    data = satdata, weights = Freq)
>
> Deviance Residuals:
>    Min       1Q   Median       3Q      Max
> -19.456   -6.839    0.000    6.309   17.635
>
> Coefficients:
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept)  0.334265   0.056491   5.917 3.28e-09 ***
> genderF     -0.180480   0.047575  -3.794 0.000149 ***
> age35-44     0.122422   0.051836   2.362 0.018191 *
> age>44       0.361610   0.051576   7.011 2.36e-12 ***
> raceOther   -0.005883   0.061605  -0.095 0.923919
> locMidAtl    0.437342   0.103821   4.212 2.53e-05 ***
> locS         0.178574   0.073033   2.445 0.014481 *
> locMidW      0.083189   0.066427   1.252 0.210449
> locNW        0.134337   0.067498   1.990 0.046563 *
> locSW        0.295874   0.073488   4.026 5.67e-05 ***
> locPac       0.425480   0.096561   4.406 1.05e-05 ***
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> (Dispersion parameter for binomial family taken to be 1)
>
>    Null deviance: 12987  on 165  degrees of freedom
> Residual deviance: 12880  on 155  degrees of freedom
> AIC: 12902
>
> Number of Fisher Scoring iterations: 4
>
>> str(satdata)
> 'data.frame':	168 obs. of  6 variables:
> $ Freq  : num  288 60 224 35 337 70 38 19 32 22 ...
> $ gender: Factor w/ 2 levels "M","F": 1 2 1 2 1 2 1 2 1 2 ...
> $ age   : Factor w/ 3 levels "<35","35-44",..: 1 1 2 2 3 3 1 1 2 2 ...
> $ race  : Factor w/ 2 levels "White","Other": 1 1 1 1 1 1 2 2 2 2 ...
> $ sat   : Factor w/ 2 levels "No","Yes": 2 2 2 2 2 2 2 2 2 2 ...
> $ loc   : Factor w/ 7 levels "NE","MidAtl",..: 1 1 1 1 1 1 1 1 1 1 ...
>> sessionInfo()
> R version 2.6.2 (2008-02-08)
> i686-pc-linux-gnu
>
> locale:
> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> loaded via a namespace (and not attached):
> [1] tools_2.6.2
>>
>
> --
>
> Giovanni Petris  <GPetris at uark.edu>
> Associate Professor
> Department of Mathematical Sciences
> University of Arkansas - Fayetteville, AR 72701
> Ph: (479) 575-6324, 575-8630 (fax)
> http://definetti.uark.edu/~gpetris/
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

