[R-sig-eco] stratifying variables in Adonis
Jari Oksanen
jari.oksanen at oulu.fi
Wed Mar 16 15:29:07 CET 2011
On 16/03/11 13:40 PM, "Duncan Mackay" <duncan.mackay at flinders.edu.au> wrote:
> Hello all,
> In the documentation for Adonis in the vegan package, there is an example of a
> replicated random block design with Nitrate levels as a treatment factor and
> field as a blocking factor.
... clip ...
>> adonis(Y1 ~ NO3 + field,strata=dat1$field, data=dat1,perm=1000)
>
> Call:
> adonis(formula = Y1 ~ NO3 + field, data = dat1, permutations = 1000,
> strata = dat1$field)
>
>           Df SumsOfSqs  MeanSqs F.Model      R2 Pr(>F)
> NO3        1  0.022964 0.022964  1.6959 0.18069 0.2358
> field      2  0.077045 0.038523  2.8449 0.60622 0.2358
> Residuals  2  0.027082 0.013541         0.21309
> Total      5  0.127091                  1.00000
>
> However, my second question is: How has Adonis come up with a P-value
> here for field? I thought that all shuffling of data was occurring within
> fields and therefore that the field totals (and F-ratios) would be the same
> for each permutation?
>
Duncan,
Your first conjecture is correct: the field totals are constant. But the
corollary is not true: the F-ratios are not equal. The "SumsOfSqs" and
"MeanSqs" of "field" are constant, and the "SumsOfSqs" of "Total" is
constant. However, the "SumsOfSqs" of "NO3" is not constant, and hence the
"SumsOfSqs" and "MeanSqs" of "Residuals" are not constant. Therefore the
ratio of the constant "field" "MeanSqs" to the variable "Residuals"
"MeanSqs" is not constant.
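
The arithmetic can be checked against the quoted adonis table. This is only a
sketch in base R: the sums of squares are copied from the output above, the
helper name f_field_given is made up for illustration, and the value 0.010
passed to it is a hypothetical permuted "NO3" sum of squares, not a result
from the data.

```r
# Sums of squares and degrees of freedom copied from the quoted adonis table
ss_total <- 0.127091
ss_field <- 0.077045
ss_no3   <- 0.022964
df_field <- 2
df_resid <- 2

# Residual SS is whatever is left after NO3 and field
ss_resid <- ss_total - ss_field - ss_no3

# F for field when the NO3 sum of squares takes a given value;
# ss_total and ss_field stay fixed under within-field permutation
f_field_given <- function(ss_no3_perm) {
  ms_field <- ss_field / df_field
  ms_resid <- (ss_total - ss_field - ss_no3_perm) / df_resid
  ms_field / ms_resid
}

f_observed <- f_field_given(ss_no3)  # observed NO3 sum of squares
f_other    <- f_field_given(0.010)   # hypothetical permuted value
```

With the observed "NO3" value this reproduces the F.Model for "field" in the
table up to rounding, and any other "NO3" value gives a different F even
though the "field" mean square has not moved.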
With your reduced data set of six observations and three strata, you can
have only eight distinct permutations (2 x 2 x 2: the two observations in
each field are either swapped or not). In addition to the unpermuted one,
these are:
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    2    3    4    6    5
[2,]    1    2    4    3    5    6
[3,]    1    2    4    3    6    5
[4,]    2    1    3    4    5    6
[5,]    2    1    3    4    6    5
[6,]    2    1    4    3    5    6
[7,]    2    1    4    3    6    5
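
The listing can be reproduced by brute force. This is a minimal base R sketch,
assuming the observations are ordered so that rows 1-2, 3-4 and 5-6 belong to
the three fields; the object names (strata, swaps, perms) are illustrative and
this is not how vegan generates its permutations internally.

```r
# Three strata (fields) with two observations each
strata <- rep(1:3, each = 2)

# Within a stratum of size two there are only two choices: keep or swap.
# expand.grid varies its first column fastest, so s3 changes first.
swaps <- as.matrix(expand.grid(s3 = c(FALSE, TRUE),
                               s2 = c(FALSE, TRUE),
                               s1 = c(FALSE, TRUE)))

# Build one row of observation indices per keep/swap combination
perms <- t(apply(swaps, 1, function(sw) {
  idx <- 1:6
  if (sw[["s1"]]) idx[1:2] <- 2:1
  if (sw[["s2"]]) idx[3:4] <- 4:3
  if (sw[["s3"]]) idx[5:6] <- 6:5
  idx
}))

# Row 1 is the unpermuted order; the remaining rows are the permuted ones
permuted_only <- perms[-1, ]
```

Every row keeps each observation inside its own field, which is why the
"field" sum of squares cannot change.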
These give the following Mean Squares:
            [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]
NO3       0.0574 0.0125 0.0143 0.0143 0.0125 0.0574 0.0573
field     0.0528 0.0528 0.0528 0.0528 0.0528 0.0528 0.0528
Residuals 0.0244 0.0469 0.0459 0.0459 0.0469 0.0244 0.0244
Please note the constant "field" and variable "NO3" and hence "Residuals".
These give the ratios (F-statistics) for "field":
[1] 2.1660 1.1272 1.1498 1.1498 1.1272 2.1660 2.1605
(all rounded here in this message, but with usual accuracy in R)
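
As a rough check, the ratios can be recomputed from the rounded mean squares
printed above. This is a sketch in base R; because it starts from the
four-decimal values in this message rather than the full-precision ones, the
results agree with the posted F-statistics only approximately.

```r
# Mean squares as printed above (already rounded to four decimals)
ms_field <- rep(0.0528, 7)
ms_resid <- c(0.0244, 0.0469, 0.0459, 0.0459, 0.0469, 0.0244, 0.0244)

# F-statistic for field in each permutation: constant numerator,
# varying denominator
f_field <- ms_field / ms_resid

# F-statistics as posted, computed from unrounded values
f_posted <- c(2.1660, 1.1272, 1.1498, 1.1498, 1.1272, 2.1660, 2.1605)

# Largest discrepancy caused by starting from rounded mean squares
max_diff <- max(abs(f_field - f_posted))
```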
So the problem is that we use F-ratios as the test statistic, and these are
not adequate for a variable that is also used to define the strata.
So the answer to question one: don't use the same variable in the model that
you have as a stratum.
Cheers, Jari Oksanen