[R-sig-eco] stratifying variables in Adonis

Duncan Mackay duncan.mackay at flinders.edu.au
Wed Mar 16 23:39:17 CET 2011


Aha! Many thanks for your most helpful response. Duncan

-----Original Message-----
From: Jari Oksanen [mailto:jari.oksanen at oulu.fi] 
Sent: Thursday, 17 March 2011 12:59 AM
To: Duncan Mackay; r-sig-ecology at r-project.org
Subject: Re: [R-sig-eco] stratifying variables in Adonis

On 16/03/11 13:40 PM, "Duncan Mackay" <duncan.mackay at flinders.edu.au> wrote:

> Hello all,
> In the documentation for Adonis in the vegan package, there is an 
> example of a replicated random block design with Nitrate levels as a 
> treatment factor and field as a blocking factor.
... clip ...
>> adonis(Y1 ~ NO3 + field,strata=dat1$field, data=dat1,perm=1000)
> 
> Call:
> adonis(formula = Y1 ~ NO3 + field, data = dat1, permutations = 1000, 
> strata = dat1$field)
> 
>           Df SumsOfSqs  MeanSqs F.Model      R2 Pr(>F)
> NO3        1  0.022964 0.022964  1.6959 0.18069 0.2358
> field      2  0.077045 0.038523  2.8449 0.60622 0.2358
> Residuals  2  0.027082 0.013541         0.21309
> Total      5  0.127091                  1.00000
> 
> However, my second question is :- How has Adonis come up with a 
> P-value here for field?  I thought that all shuffling of data was 
> occurring within fields and therefore that the field totals (and 
> F-ratios) would be the same for each permutation?
> 
Duncan,

Your first conjecture is correct: the field totals are constant. The corollary, however, is not true: the F-ratios are not equal. The "SumsOfSqs" and "MeanSqs" of "field" are constant, and the "SumsOfSqs" of "Total" is constant. However, the "SumsOfSqs" of "NO3" is not constant, and hence the "SumsOfSqs" and "MeanSqs" of "Residuals" are not constant either. Therefore the ratio of the constant "field" "MeanSqs" to the variable "Residuals" "MeanSqs" is not constant.
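
The same bookkeeping can be checked on a toy example. The following is only a sketch: the data are made up (they are not your dat1), and a single response with anova(lm()) from base R stands in for adonis, on the assumption of a balanced design with one plot per nitrate level in each field.

```r
# Toy balanced block design (hypothetical data, not the poster's dat1):
# three fields, each with one plot at each nitrate level.
dat <- data.frame(
  field = gl(3, 2),
  NO3   = factor(rep(c(0, 10), times = 3))
)
y <- c(4.1, 5.3, 5.0, 6.9, 6.2, 7.0)  # made-up univariate response

# Sequential sums of squares for resp ~ NO3 + field, as a named vector
ss_table <- function(resp) {
  tab <- anova(lm(resp ~ NO3 + field, data = dat))
  setNames(tab[["Sum Sq"]], rownames(tab))
}

ss_obs  <- ss_table(y)
# Swap the two observations inside field 1 only (a within-stratum permutation)
ss_perm <- ss_table(y[c(2, 1, 3, 4, 5, 6)])

# "field" and the total stay put; "NO3" and "Residuals" move
rbind(observed = ss_obs, permuted = ss_perm)
```

Since "Residuals" changes while "field" does not, the F-ratio for "field" changes between the two rows.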

With your reduced data set of six observations and three strata, you can only have eight permutations (2^3: the two observations within each field are either swapped or left in place). In addition to the unpermuted one these are:

    [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    2    3    4    6    5
[2,]    1    2    4    3    5    6
[3,]    1    2    4    3    6    5
[4,]    2    1    3    4    5    6
[5,]    2    1    3    4    6    5
[6,]    2    1    4    3    5    6
[7,]    2    1    4    3    6    5
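
For reference, the rows above (plus the unpermuted order) can be generated in base R. This is a minimal sketch which assumes that observations 1-2, 3-4 and 5-6 form the three fields, as the matrix suggests. The object names are made up for illustration, and this is not meant to show how vegan builds its permutations internally.

```r
# Assumed layout: observations 1-2, 3-4 and 5-6 belong to fields 1, 2 and 3
strata <- rep(1:3, each = 2)

# Every combination of "swap (1) or keep (0)" for the three fields;
# expand.grid varies its first column fastest, so field 3 changes first
swap_grid <- as.matrix(expand.grid(f3 = 0:1, f2 = 0:1, f1 = 0:1))

perms <- t(apply(swap_grid, 1, function(s) {
  idx <- 1:6
  if (s["f1"] == 1) idx[1:2] <- idx[2:1]
  if (s["f2"] == 1) idx[3:4] <- idx[4:3]
  if (s["f3"] == 1) idx[5:6] <- idx[6:5]
  idx
}))

# Row 1 is the unpermuted order; the remaining rows are the permuted ones
perms[-1, ]
```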

These give the following Mean Squares:

            [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]
NO3       0.0574 0.0125 0.0143 0.0143 0.0125 0.0574 0.0573
field     0.0528 0.0528 0.0528 0.0528 0.0528 0.0528 0.0528
Residuals 0.0244 0.0469 0.0459 0.0459 0.0469 0.0244 0.0244

Please note the constant "field" and variable "NO3" and hence "Residuals".
These give the ratios (F-statistics) for "field":

[1] 2.1660 1.1272 1.1498 1.1498 1.1272 2.1660 2.1605

(all rounded here in this message, but with usual accuracy in R)
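
As a rough check of the arithmetic, the ratios can be recomputed from the rounded mean squares in the table. The sketch below only uses the numbers printed in this message, so it will not reproduce the F values exactly; the variable names are illustrative.

```r
# Mean squares copied from the rounded table in this message
ms_field <- rep(0.0528, 7)
ms_resid <- c(0.0244, 0.0469, 0.0459, 0.0459, 0.0469, 0.0244, 0.0244)

# F for "field" = constant field mean square / varying residual mean square
f_field <- ms_field / ms_resid
round(f_field, 4)

# F values as printed in the message (computed there from unrounded inputs)
f_printed <- c(2.1660, 1.1272, 1.1498, 1.1498, 1.1272, 2.1660, 2.1605)

# Largest discrepancy; small, and attributable to rounding of the inputs
max(abs(f_field - f_printed))
```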

So the problem is that we use F-ratios, and these become inadequate for variables used in stratified sampling.

So the answer to question one: don't use the same variable in the model that you have as a stratum.
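
To illustrate the idea with something runnable: the sketch below tests only NO3 and uses the field solely to restrict the permutations. It is a base R stand-in under stated assumptions (made-up univariate data, one plot per nitrate level in each field, all eight within-field orderings enumerated), not the adonis code itself.

```r
# Hypothetical toy data: three fields, one plot per nitrate level in each
field <- gl(3, 2)
NO3   <- factor(rep(c(0, 10), times = 3))
y     <- c(4.1, 5.3, 5.0, 6.9, 6.2, 7.0)

# F statistic for NO3 only; field is NOT a model term here
f_no3 <- function(resp) anova(lm(resp ~ NO3))["NO3", "F value"]

# Row indices of each field, and all 2^3 swap/keep combinations
blocks <- split(seq_along(y), field)
swaps  <- as.matrix(expand.grid(f1 = c(FALSE, TRUE),
                                f2 = c(FALSE, TRUE),
                                f3 = c(FALSE, TRUE)))

# F for NO3 under every within-field ordering (first row = unpermuted)
f_all <- apply(swaps, 1, function(s) {
  idx <- seq_along(y)
  for (k in seq_along(blocks)) {
    pair <- blocks[[k]]
    if (s[k]) idx[pair] <- rev(pair)
  }
  f_no3(y[idx])
})

# Exact permutation P: share of orderings with F at least the observed F
# (small relative tolerance guards against floating-point ties)
f_obs   <- f_all[1]
p_exact <- mean(f_all >= f_obs * (1 - 1e-8))
p_exact
```

With only eight possible orderings the P-value can never fall below 1/8, whatever the data look like.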

Cheers, Jari Oksanen




