[R-sig-eco] aov vs. glm

Gavin Simpson gavin.simpson at ucl.ac.uk
Fri Nov 11 11:00:25 CET 2011


On Fri, 2011-11-11 at 13:01 +1100, Scott Foster wrote:
> Hi Lara,
> 
> Thanks for posting your results.  It makes things clearer.
> 
> Now, I'm not sure about how to fix this easily but I do know what the 
> problem is (I think).  It is to do with different types of sums of 
> squares.  There is also a smaller issue of types of test as well...
> 
> The summary( glm( ...)) is giving t-tests for each of the coefficients.  
> If you had more than two levels of each of treatment and species then 
> this would be very obvious.  The summary( aov(...)) is giving you a 
> complete anova table, an F-test.

anova(mod, test = "F")

will give the sequential sums of squares for a GLM mod ( mod <-
glm(....) ) fitted with family = "gaussian", which is the default in
glm().
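
A minimal sketch of that call, assuming simulated stand-in data (the
variable names mirror the thread; the data frame `dat` and its values are
invented for illustration and are not Lara's data):

```r
# Simulated stand-in data: names follow the thread, values are made up.
set.seed(1)
n <- 120
dat <- data.frame(
  treatment = factor(sample(c("control", "treated"), n, replace = TRUE)),
  species   = factor(sample(c("A", "B"), n, replace = TRUE))
)
dat$clutchsize <- 6 + 0.8 * (dat$treatment == "treated") -
  2 * (dat$species == "B") + rnorm(n, sd = 2)

# family = gaussian is the glm() default, so it is not spelled out here.
mod <- glm(clutchsize ~ treatment * species, data = dat)

# Sequential table: a NULL row, then one row per term in formula order.
tab <- anova(mod, test = "F")
print(tab)
```

For a Gaussian GLM the "Deviance" column of this table holds the
sequential sums of squares.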

Lara:

All three ( glm(), lm() and aov() ) fit the same model *and* report the
same sums of squares in the "ANOVA" table if one asks for the "ANOVA" in
the correct way. summary(glm(....)) is not an "ANOVA" table.

That aov() does things differently - i.e. its summary() method prints
the ANOVA table - is covered in ?aov. The main purpose of aov() is to
present results in a manner that may be more familiar from textbooks and
other stats software.
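
To check the equivalence directly, here is a rough sketch that pulls the
sums of squares out of all three fits. The data are simulated and the
object names (`dat`, `form`, `ss_aov`, `ss_lm`, `ss_glm`) are
illustrative assumptions, not anything from the original analysis:

```r
# Simulated stand-in data; only the variable names come from the thread.
set.seed(42)
n <- 100
dat <- data.frame(
  treatment = factor(sample(c("control", "treated"), n, replace = TRUE)),
  species   = factor(sample(c("A", "B"), n, replace = TRUE))
)
dat$clutchsize <- rnorm(n, mean = 6, sd = 2)

form <- clutchsize ~ treatment * species

# aov(): summary() prints the ANOVA table; element 1 holds it as a data frame.
ss_aov <- summary(aov(form, data = dat))[[1]][["Sum Sq"]]

# lm(): anova() gives the same sequential table.
ss_lm <- anova(lm(form, data = dat))[["Sum Sq"]]

# glm(): term deviances (dropping the NULL row) plus the final residual deviance.
tab_glm <- anova(glm(form, data = dat), test = "F")
ss_glm <- c(tab_glm$Deviance[-1], tail(tab_glm[["Resid. Dev"]], 1))

# Both comparisons should agree up to numerical tolerance.
print(all.equal(ss_aov, ss_lm))
print(all.equal(ss_aov, ss_glm))
```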

HTH

G

> The two types of test (t and F) should agree BUT the anova table is 
> (likely to be) giving type 1 sums of squares.  If you wanted the 
> analyses to agree you will want type 3 sums of squares.  I'm making no 
> comment about which sums of squares you want...  I also don't know how 
> to get them easily (just do a net search to get solutions using drop1, 
> lme and others). Be careful of marginality though.  I would be tempted 
> to calculate the F-statistics by hand -- all the mean squares are there 
> (I'm probably unique here though)
> 
> Note that the test for the last term in the anova table 
> (treatment:species) gives agreement between the model types. This is 
> because the type 1 and type 3 sums of squares agree for the last term.
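
The agreement for the last term can be checked from the numbers Lara
posted: with 1 degree of freedom, the squared t statistic equals the F
statistic. A small sketch using the printed (rounded) values, so the
match is only approximate; the object names are illustrative:

```r
# Values copied from the two outputs quoted further down in this message.
t_interaction <- 1.395    # t value, treatment:species, summary(glm(...))
F_interaction <- 1.9464   # F value, treatment:species, summary(aov(...))
resid_df      <- 163      # residual degrees of freedom in both outputs

# Squared t against F; they differ only through rounding of the printout.
t_squared <- t_interaction^2
print(c(t_squared = t_squared, F = F_interaction))

# p-value for the F statistic on 1 and 163 df, to compare with the
# Pr(>|t|) and Pr(>F) entries for the interaction row.
p_from_F <- pf(F_interaction, df1 = 1, df2 = resid_df, lower.tail = FALSE)
print(p_from_F)
```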
> 
> There is *lots* of information and opinion about which type of sums of 
> squares to use.  It is a debate that has been raging for decades.  I 
> will point you to Bill Venables' contribution (Section 5 of Exegeses on 
> Linear Models -- sorry I don't have a url).
> 
> I hope that this helps.  It should at least give you the right keywords 
> to Google.
> 
> Good luck,
> 
> Scott
> 
> On 11/11/11 12:23, Lara R. Appleby 04 wrote:
> > Below are the results obtained from doing a two way anova using glm (Method 1)
> > and aov (Method 2).
> >
> > ##Method 1
> >>   summary(glm(clutchsize~treatment*species))
> > Call:
> > glm(formula = clutchsize ~ treatment * species)
> >
> > Deviance Residuals:
> >       Min       1Q   Median       3Q      Max
> > -4.6223  -1.3917   0.0171   1.3777   6.6974
> >
> > Coefficients:
> >                     Estimate Std. Error t value Pr(>|t|)
> > (Intercept)         7.7095     1.2679   6.080 8.23e-09 ***
> > treatment          -0.2480     0.5845  -0.424 0.671891
> > species            -3.0463     0.8837  -3.447 0.000721 ***
> > treatment:species   0.5677     0.4069   1.395 0.164874
> > ---
> > Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> >
> > (Dispersion parameter for gaussian family taken to be 4.166879)
> >
> >       Null deviance: 854.72  on 166  degrees of freedom
> > Residual deviance: 679.20  on 163  degrees of freedom
> > AIC: 718.21
> >
> > Number of Fisher Scoring iterations: 2
> >
> > ##Method 2
> >>   summary(aov(clutchsize~treatment*species))
> >                      Df Sum Sq Mean Sq F value   Pr(>F)
> > treatment           1  29.26  29.264  7.0230  0.00884 **
> > species             1 138.14 138.143 33.1526 4.13e-08 ***
> > treatment:species   1   8.11   8.110  1.9464  0.16487
> > Residuals         163 679.20   4.167
> > ---
> > Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> >
> > --- Chris Howden<chris at trickysolutions.com.au>  wrote:
> > It would help if u posted the results so we know how different.
> >
> > But have a look at the defaults for a call to glm. Are they the same as for lm?
> >
> > There are some differences in the output for glm and lm objects when
> > using summary.
> >
> > Chris Howden
> > Founding Partner
> > Tricky Solutions
> > Tricky Solutions 4 Tricky Problems
> > Evidence Based Strategic Development, IP Commercialisation and
> > Innovation, Data Analysis, Modelling and Training
> >
> > (mobile) 0410 689 945
> > (fax / office)
> > chris at trickysolutions.com.au
> >
> > On 11/11/2011, at 10:25, "Lara R. Appleby 04"
> > <Lara.R.Appleby.04 at alum.dartmouth.org>  wrote:
> >
> >>   I'm trying to basically do a two way ANOVA on the dependent variable (clutchsize)
> >> with the two independent variables (treatment and species). It seems that there
> >> are three ways I can say this in R:
> >>
> >>   1. glm(clutchsize~treatment*species)
> >>   2. aov(clutchsize~treatment*species)
> >>   3. anova(lm(clutchsize~treatment*species))
> >>
> >>   Methods 2 and 3 yield equivalent results, but Method 1 is completely different!
> >>
> >>   Any idea why?
> >>
> >>   Lara Appleby
> >>
> >>   _______________________________________________
> >>   R-sig-ecology mailing list
> >>   R-sig-ecology at r-project.org
> >>   https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
> > --- end of quote ---
> >
> 

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


