[R] Fwd: add1() and anova() with glm with dispersion
Chad R. Bhatti
bhatticr at yahoo.com
Wed Jun 28 22:34:04 CEST 2006
> Hello,
>
> I have a question about a discrepancy between the
> reported F statistics using anova() and add1() from
> adding an additional term to form nested models.
>
> I found and old posting related to anova() and
> drop1() regarding a glm with a dispersion parameter.
>
> The posting is very old (May 2000, R 1.1.0).
> The old posting is located here.
>
>
https://stat.ethz.ch/pipermail/r-devel/2000-May/020720.html
>
>
> I first noticed the discrepancy in the PC version of
> R
> 2.2.1. I have upgraded to PC R 2.3.1 and the
> discrepancy remains.
>
> To be clear I will be as exact as possible and show
> the model output below.
> I am fitting nested models from the quasi-Poisson
> family using glm(). Here, model.0 is contained in
> model.1 where model.1 contains one additional
> covariate not in model.0.
>
> To be specific I will show you model.0 and model.1.
> > > summary(model.0)
> >
> > Call:
> > glm(formula = as.formula(formula.0), family =
> > quasipoisson, data = data.1)
> >
> > Deviance Residuals:
> > Min 1Q Median 3Q Max
> > -4.0256 -1.0080 -0.1604 0.7372 4.8545
> >
> > Coefficients:
> > Estimate Std. Error t value Pr(>|t|)
>
> >
> > (Intercept) 9.440e-01 8.968e-02 10.526 < 2e-16
> > ***
> > I11 -6.857e-02 3.693e-02 -1.857 0.063468
> .
> >
> > I12 -7.338e-02 4.142e-02 -1.771 0.076598
> .
> >
> > I13 3.329e-02 4.130e-02 0.806 0.420208
>
> >
> > I14 1.830e-01 3.717e-02 4.924 9.00e-07
> > ***
> > I15 2.015e-01 3.480e-02 5.789 7.93e-09
> > ***
> > trades1 1.775e-01 2.114e-02 8.398 < 2e-16
> > ***
> > trades2 3.325e-02 1.857e-02 1.790 0.073493
> .
> >
> > trades3 1.050e-01 1.873e-02 5.604 2.31e-08
> > ***
> > trades4 4.615e-02 1.853e-02 2.490 0.012827
> *
> >
> > vol1 1.932e-04 5.116e-05 3.777 0.000162
> > ***
> > sVol1 1.211e-04 4.536e-05 2.670 0.007643
> > **
> > trades5 1.835e-02 1.850e-02 0.992 0.321297
>
> >
> > trades6 7.797e-03 1.837e-02 0.425 0.671213
>
> >
> > trades7 3.802e-02 1.831e-02 2.077 0.037916
> *
> >
> > trades8 5.904e-02 1.831e-02 3.224 0.001280
> > **
> > trades9 4.581e-02 1.830e-02 2.503 0.012383
> *
> >
> > trades10 5.521e-02 1.802e-02 3.063 0.002212
> > **
> > ---
> > Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05
> '.'
> > 0.1 ' ' 1
> >
> > (Dispersion parameter for quasipoisson family
> taken
> > to
> > be 1.832780)
> >
> > Null deviance: 7020.8 on 2623 degrees of
> > freedom
> > Residual deviance: 4697.2 on 2606 degrees of
> > freedom
> > AIC: NA
> >
> > Number of Fisher Scoring iterations: 4
> >
> >
> >
> >
> > > summary(model.1)
> >
> > Call:
> > glm(formula = as.formula(formula.1), family =
> > quasipoisson, data = data.1)
> >
> > Deviance Residuals:
> > Min 1Q Median 3Q Max
> > -4.0807 -1.0061 -0.1659 0.7461 4.9228
> >
> > Coefficients:
> > Estimate Std. Error t value Pr(>|t|)
>
> >
> > (Intercept) 9.627e-01 9.019e-02 10.675 < 2e-16
> > ***
> > I11 -6.631e-02 3.694e-02 -1.795 0.072713
> .
> >
> > I12 -7.383e-02 4.141e-02 -1.783 0.074717
> .
> >
> > I13 3.363e-02 4.129e-02 0.814 0.415467
>
> >
> > I14 1.844e-01 3.717e-02 4.960 7.50e-07
> > ***
> > I15 2.017e-01 3.479e-02 5.798 7.51e-09
> > ***
> > trades1 1.840e-01 2.140e-02 8.600 < 2e-16
> > ***
> > trades2 3.585e-02 1.861e-02 1.927 0.054135
> .
> >
> > trades3 1.050e-01 1.872e-02 5.607 2.27e-08
> > ***
> > trades4 4.647e-02 1.852e-02 2.509 0.012166
> *
> >
> > vol1 1.957e-04 5.111e-05 3.828 0.000132
> > ***
> > sVol1 1.238e-04 4.535e-05 2.730 0.006374
> > **
> > trades5 1.883e-02 1.849e-02 1.019 0.308506
>
> >
> > trades6 7.358e-03 1.836e-02 0.401 0.688663
>
> >
> > trades7 3.840e-02 1.829e-02 2.099 0.035928
> *
> >
> > trades8 5.907e-02 1.831e-02 3.226 0.001272
> > **
> > trades9 4.527e-02 1.830e-02 2.473 0.013447
> *
> >
> > trades10 5.582e-02 1.802e-02 3.098 0.001970
> > **
> > aveSpread1 -2.386e-01 1.238e-01 -1.928 0.054022
> .
> >
> > ---
> > Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05
> '.'
> > 0.1 ' ' 1
> >
> > (Dispersion parameter for quasipoisson family
> taken
> > to
> > be 1.831032)
> >
> > Null deviance: 7020.8 on 2623 degrees of
> > freedom
> > Residual deviance: 4690.4 on 2605 degrees of
> > freedom
> > AIC: NA
> >
> > Number of Fisher Scoring iterations: 4
>
> So model.1 is model.0 + aveSpread1.
>
>
> The R anova() function is supposed to account for a
> dispersion/scale parameter of a glm object and has
> the
> following output. The anova() function adds the
> terms
> sequentially and computes a test statistic.
>
> > > anova(model.1, test="F")
> > Analysis of Deviance Table
> >
> > Model: quasipoisson, link: log
> >
> > Response: trades
> >
> > Terms added sequentially (first to last)
> >
> >
> > Df Deviance Resid. Df Resid. Dev
>
> > F
> > Pr(>F)
> > NULL 2623 7020.8
>
> >
> >
> > I11 1 91.5 2622 6929.3
> > 49.9706
> > 1.996e-12 ***
> > I12 1 622.3 2621 6307.0
> > 339.8717
> > < 2.2e-16 ***
> > I13 1 614.4 2620 5692.6
> > 335.5561
> > < 2.2e-16 ***
> > I14 1 53.3 2619 5639.4
> > 29.0903
> > 7.529e-08 ***
> > I15 1 64.6 2618 5574.8
> > 35.2590
> > 3.270e-09 ***
> > trades1 1 535.2 2617 5039.6
> > 292.3131
> > < 2.2e-16 ***
> > trades2 1 49.0 2616 4990.5
> > 26.7798
> > 2.454e-07 ***
> > trades3 1 113.6 2615 4876.9
> > 62.0682
> > 4.830e-15 ***
> > trades4 1 31.3 2614 4845.6
> > 17.0783
> > 3.700e-05 ***
> > vol1 1 20.8 2613 4824.8
> > 11.3810
> > 0.0007528 ***
> > sVol1 1 13.9 2612 4810.9
> > 7.5932
> > 0.0058997 **
> > trades5 1 10.6 2611 4800.3
> > 5.7901
> > 0.0161861 *
> > trades6 1 8.1 2610 4792.2
> > 4.4175
> > 0.0356685 *
> > trades7 1 24.7 2609 4767.5
> > 13.4770
> > 0.0002464 ***
> > trades8 1 33.5 2608 4734.0
> > 18.2926
> > 1.963e-05 ***
> > trades9 1 19.5 2607 4714.5
> > 10.6649
> > 0.0011060 **
> > trades10 1 17.3 2606 4697.2
> > 9.4474
> > 0.0021364 **
> > aveSpread1 1 6.8 2605 4690.4
> > 3.7240
> > 0.0537449 .
> > ---
>
>
> Now consider the output from add1() when adding
> aveSpread1 to model.0.
>
> > > add1(model.0, ~ . + aveSpread1, test = "F")
> > Single term additions
> >
> > Model:
> > trades ~ I11 + I12 + I13 + I14 + I15 + trades1 +
> > trades2 + trades3 +
> > trades4 + vol1 + sVol1 + trades5 + trades6 +
> > trades7 + trades8 +
> > trades9 + trades10
> > Df Deviance F value Pr(F)
> > <none> 4697.2
> > aveSpread1 1 4690.4 3.7871 0.05176 .
> >
>
>
> Here is the discrepancy. Compare F statistics
> for aveSpread1 3.7240 from anova() to 3.7871 from
> add1().
>
>
> I built my own anova-type table that did not account
> for the dispersion parameter and my last term
> (rounded) matches that reported by add1().
>
> > source('main.R')
> [,1]
>
>
> model.table "Variable Deviance Resid Dev
>
> F p-value "
> add.term "trades1 535.2 5039.6
>
> 277.84 0.0000 "
> add.term "trades2 49.0 4990.5
>
> 25.69 0.0000 "
> add.term "trades3 113.6 4876.9
>
> 60.92 0.0000 "
> add.term "trades4 31.3 4845.6
>
> 16.86 0.0000 "
> add.term "vol1 20.8 4824.8
>
> 11.28 0.0008 "
> add.term "sVol1 13.9 4810.9
>
> 7.55 0.0061 "
> add.term "trades5 10.6 4800.3
>
> 5.76 0.0164 "
> add.term "trades6 8.1 4792.2
>
> 4.40 0.0360 "
> add.term "trades7 24.7 4767.5
>
> 13.50 0.0002 "
> add.term "trades8 33.5 4734.0
>
> 18.45 0.0000 "
> add.term "trades9 19.5 4714.5
>
> 10.79 0.0010 "
> add.term "trades10 17.3 4697.2
>
> 9.59 0.0020 "
> add.term "aveSpread1 6.8 4690.4
>
> 3.79 0.0518 "
>
>
> McCullagh and Nelder (2nd ed) p207 #4 show an
> adjustment for the dispersion parameter in the
> F-statistic. It does not seem that add1() is making
> this adjustment though the anova() help page
> suggests
> that anova() is. The estimated dispersion
> parameters
> are s0 = 1.832780 and s1 = 1.831032. I have tried
> scaling the F statistic from add1() by s1/s0, but it
> still is not equal to the F statistic from anova().
>
> I am not experienced with the R source code. I
> would
> like to know how add1() computes the F statistic
> with
> dispersion and how anova() computes the F statistic
> with dispersion and why they are different.
>
> I apologize if I have missed something obvious, but
> add1() and anova() do not show the code in the PC
> version. In the PC version if I type lm, glm,
> glm.fit, I see code and I do not for add1 and anova
> so
> I do not know how to examine these functions.
>
> Thanks for all of your assistance.
>
> Chad R. Bhatti
>
>
>
>
> __________________________________________________
> Do You Yahoo!?
> protection around
> http://mail.yahoo.com
>
More information about the R-help
mailing list