[R-sig-eco] ANOVA Output
Kingsford Jones
kingsfordjones at gmail.com
Thu Nov 13 21:16:24 CET 2008
The types-of-sums-of-squares issue is R FAQ 7.18, and you can find a
great deal of discussion in the R help list archives. You only need to
choose a 'type' if for some reason you need to efficiently produce a table
with the results of multiple hypothesis tests. In general I think it
is better to think hard about exactly which hypotheses are of interest
and then compare appropriately nested models to conduct the test
(e.g., via a likelihood ratio test). This is covered in many stats books,
including many with an R focus. See the books by Venables and Ripley,
Harrell, Faraway, Maindonald, etc.
Kingsford Jones
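
A minimal sketch of the nested-model approach described above. The data are simulated, and the names `d`, `A`, `B`, `y`, `full`, and `reduced` are hypothetical, not taken from the thread. It tests one hypothesis (no A:B interaction) by comparing two nested fits, first with an F test and then with a likelihood ratio test computed by hand.

```r
# Simulated, deliberately unbalanced two-factor layout (illustrative only)
set.seed(1)
cells <- expand.grid(A = c("a1", "a2"), B = c("b1", "b2", "b3"))
n_per_cell <- c(5, 12, 9, 7, 14, 6)
d <- cells[rep(seq_len(nrow(cells)), times = n_per_cell), ]
d$y <- rnorm(nrow(d), mean = 10 + 2 * (d$A == "a2"))

full    <- lm(y ~ A + B + A:B, data = d)
reduced <- lm(y ~ A + B, data = d)   # same model without the interaction

# F test for the interaction via nested-model comparison
f_test <- anova(reduced, full)
print(f_test)

# Likelihood ratio test of the same hypothesis, computed by hand
lr_stat <- as.numeric(2 * (logLik(full) - logLik(reduced)))
lr_df   <- attr(logLik(full), "df") - attr(logLik(reduced), "df")
lr_p    <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)
print(c(statistic = lr_stat, df = lr_df, p.value = lr_p))
```

Because the two models are written out explicitly, the hypothesis being tested does not depend on the order in which terms appear in a formula.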
On Thu, Nov 13, 2008 at 9:33 AM, Landis, R Matthew
<rlandis at middlebury.edu> wrote:
> That's a great point Tyler. It raises the question of what IS a good reference for statistics that treats them the way R does. There has been some discussion of that already, but one book that hasn't been mentioned is that of John Fox, the author of the car package.
>
> Fox, John. 1997. Applied regression analysis, linear models, and related methods. Sage Publications.
>
> http://books.google.com/books?id=pr2mKvAxXeYC&printsec=frontcover&lr=
>
> Although mainly aimed at the social sciences, I found this to be pretty readable, and much more detailed than Crawley's books (admittedly aimed at a higher level). And as for R code, Fox also has "An R and S-Plus Companion to Applied Regression". http://books.google.com/books?id=xWS8kgRjGcAC&printsec=frontcover&lr=
>
> If you want to get a detailed understanding of Anova and regression the way R sees them, I think this pair of books is nearly as good as it gets.
>
> Matt
>
> -----Original Message-----
> From: r-sig-ecology-bounces at r-project.org [mailto:r-sig-ecology-bounces at r-project.org] On Behalf Of tyler
> Sent: Thursday, November 13, 2008 8:52 AM
> To: r-sig-ecology at r-project.org
> Subject: Re: [R-sig-eco] ANOVA Output
>
> Apologies if I'm beating a dead horse here, but this is exactly the
> problem I raised in the thread on classical statistics in R. If Katrina
> is using a textbook like Sokal and Rohlf, it is indeed completely
> unexpected to find that changing the order of explanatory variables in
> an anova will produce different results. Thierry points out that this is
> because R produces Type I SS by default. Unfortunately, nowhere in S&R
> is this distinction explained, so for this problem a book widely
> regarded as a comprehensive reference for biologists provides absolutely
> no help.
>
> These questions come up all the time on the r-help list, and I think
> it's a sign of a real disconnect between the presentation of classical
> statistics in many undergrad programs and the way the tests are actually
> implemented in R.
>
> Anyways, that's a bigger issue. It may be helpful to know that the 'car'
> package includes a function Anova (not to be confused with the anova
> function) that allows you to calculate type II or type III sums of
> squares.
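
A hedged sketch of the `Anova` call mentioned above, on simulated data (the names `d`, `A`, `B`, `y`, and `fit` are hypothetical). It assumes the 'car' package is installed and skips the call otherwise. The sum-to-zero contrasts for the Type III fit are an assumption of this sketch, chosen because Type III tests are generally hard to interpret under R's default treatment contrasts.

```r
# Simulated, deliberately unbalanced two-factor layout (illustrative only)
set.seed(1)
cells <- expand.grid(A = c("a1", "a2"), B = c("b1", "b2", "b3"))
n_per_cell <- c(5, 12, 9, 7, 14, 6)
d <- cells[rep(seq_len(nrow(cells)), times = n_per_cell), ]
d$y <- rnorm(nrow(d), mean = 10 + 2 * (d$A == "a2"))

fit <- lm(y ~ A * B, data = d)

if (requireNamespace("car", quietly = TRUE)) {
  # Type II: each main effect is adjusted for the other main effect,
  # but not for the A:B interaction
  type2 <- car::Anova(fit, type = 2)
  print(type2)

  # Type III: refit with sum-to-zero contrasts before requesting the table
  fit_sum <- lm(y ~ A * B, data = d,
                contrasts = list(A = "contr.sum", B = "contr.sum"))
  type3 <- car::Anova(fit_sum, type = 3)
  print(type3)
} else {
  message("Package 'car' is not installed; skipping the Anova() calls.")
}
```

Unlike the sequential table from `anova` or `summary(aov(...))`, the Type II table is the same whether the model is written as `A * B` or `B * A`.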
>
> Cheers,
>
> Tyler
>
> "ONKELINX, Thierry" <Thierry.ONKELINX at inbo.be>
> writes:
>
>> Dear Katrina,
>>
>> The F-values are different because you are testing different hypotheses:
>> anova yields Type I (sequential) SS. It looks like you expected Type III SS.
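
To make the point about sequential SS concrete, here is a small sketch on simulated unbalanced data (the names `d`, `A`, `B`, and `y` are hypothetical; this is not the poster's data). It fits the same model with the factors in two orders and shows that the sum of squares for `A` changes, because in one case it is SS(A) and in the other SS(A | B).

```r
# Simulated, deliberately unbalanced two-factor layout (illustrative only)
set.seed(1)
cells <- expand.grid(A = c("a1", "a2"), B = c("b1", "b2", "b3"))
n_per_cell <- c(5, 12, 9, 7, 14, 6)
d <- cells[rep(seq_len(nrow(cells)), times = n_per_cell), ]
d$y <- rnorm(nrow(d), mean = 10 + 2 * (d$A == "a2"))

# Sequential (Type I) tables for the same model, terms in different order
tab_AB <- anova(lm(y ~ A * B, data = d))   # A entered first
tab_BA <- anova(lm(y ~ B * A, data = d))   # B entered first
print(tab_AB)
print(tab_BA)

ss_A_first  <- tab_AB["A", "Sum Sq"]   # SS(A), ignoring B
ss_A_second <- tab_BA["A", "Sum Sq"]   # SS(A | B), adjusted for B

# The adjusted SS is the drop in residual SS between two nested fits
ss_A_given_B <- deviance(lm(y ~ B, data = d)) -
  deviance(lm(y ~ A + B, data = d))
print(c(first = ss_A_first, second = ss_A_second, nested = ss_A_given_B))
```

The highest-order interaction and the residual line do not depend on the order, which matches the output quoted in this thread: only the lower-order rows move.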
>>
>> HTH,
>>
>> Thierry
>>
>>
>> ----------------------------------------------------------------------------
>> ir. Thierry Onkelinx
>> Instituut voor natuur- en bosonderzoek / Research Institute for Nature
>> and Forest
>> Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
>> methodology and quality assurance
>> Gaverstraat 4
>> 9500 Geraardsbergen
>> Belgium
>> tel. + 32 54/436 185
>> Thierry.Onkelinx at inbo.be
>> www.inbo.be
>>
>> To call in the statistician after the experiment is done may be no more
>> than asking him to perform a post-mortem examination: he may be able to
>> say what the experiment died of.
>> ~ Sir Ronald Aylmer Fisher
>>
>> The plural of anecdote is not data.
>> ~ Roger Brinner
>>
>> The combination of some data and an aching desire for an answer does not
>> ensure that a reasonable answer can be extracted from a given body of
>> data.
>> ~ John Tukey
>>
>> -----Original Message-----
>> From: r-sig-ecology-bounces at r-project.org
>> [mailto:r-sig-ecology-bounces at r-project.org] On Behalf Of Katrina W. Chu
>> Sent: Wednesday, November 12, 2008 22:27
>> To: r-sig-ecology at r-project.org
>> Subject: [R-sig-eco] ANOVA Output
>>
>> I have a question about my R output when I run a three-way ANOVA. I
>> just plugged the interaction term into the formula and presto! ANOVA!
>> But I noticed that if I change the order of the terms in the formula
>> (or in the interaction term), I get slightly different ANOVA outputs.
>> I've pasted the output at the bottom of this message. I didn't think
>> that this should happen, so I would appreciate any feedback on this
>> problem.
>>
>> Thanks in advance, Kat.
>>
>>> ANOVA <- aov(Chlorophyll.a~Treatment*SamplingPeriod*Site)
>>> summary(ANOVA)
>>                                Df  Sum Sq Mean Sq F value    Pr(>F)
>> Treatment                       3   356.5   118.8  4.2878  0.005276 **
>> SamplingPeriod                  3   374.7   124.9  4.5069  0.003911 **
>> Site                            1  1016.5  1016.5 36.6791 2.629e-09 ***
>> Treatment:SamplingPeriod        9   467.6    52.0  1.8747  0.053284 .
>> Treatment:Site                  3   167.8    55.9  2.0176  0.110424
>> SamplingPeriod:Site             3  1670.2   556.7 20.0884 2.383e-12 ***
>> Treatment:SamplingPeriod:Site   9   277.2    30.8  1.1115  0.352455
>> Residuals                     534 14799.5    27.7
>> ---
>> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>>
>>> ANOVA <- aov(Chlorophyll.a~SamplingPeriod*Treatment*Site)
>>> summary(ANOVA)
>>                                Df  Sum Sq Mean Sq F value    Pr(>F)
>> SamplingPeriod                  3   369.5   123.2  4.4437  0.004264 **
>> Treatment                       3   361.8   120.6  4.3510  0.004840 **
>> Site                            1  1016.5  1016.5 36.6791 2.629e-09 ***
>> SamplingPeriod:Treatment        9   467.6    52.0  1.8747  0.053284 .
>> SamplingPeriod:Site             3  1662.0   554.0 19.9894 2.718e-12 ***
>> Treatment:Site                  3   176.0    58.7  2.1166  0.097111 .
>> SamplingPeriod:Treatment:Site   9   277.2    30.8  1.1115  0.352455
>> Residuals                     534 14799.5    27.7
>> ---
>> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>>
>>> ANOVA <- aov(Chlorophyll.a~Site*SamplingPeriod*Treatment)
>>> summary(ANOVA)
>>                                Df  Sum Sq Mean Sq F value    Pr(>F)
>> Site                            1  1008.9  1008.9 36.4050 2.998e-09 ***
>> SamplingPeriod                  3   374.1   124.7  4.4990  0.003953 **
>> Treatment                       3   364.8   121.6  4.3871  0.004607 **
>> Site:SamplingPeriod             3  1654.8   551.6 19.9026 3.050e-12 ***
>> Site:Treatment                  3   172.6    57.5  2.0761  0.102364
>> SamplingPeriod:Treatment        9   478.2    53.1  1.9172  0.047282 *
>> Site:SamplingPeriod:Treatment   9   277.2    30.8  1.1115  0.352455
>> Residuals                     534 14799.5    27.7
>> ---
>> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> --
> What is wanted is not the will to believe, but the will to find out, which is
> the exact opposite. --Bertrand Russell
>
> _______________________________________________
> R-sig-ecology mailing list
> R-sig-ecology at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology