[R-sig-eco] negative F value in adonis()

Wed Aug 11 15:17:56 CEST 2010

On 11/08/10 14:38 PM, "Adriano Melo" <grumicha at yahoo.com.br> wrote:

> Jari:
> I would be glad in case you discover and inform what is going on with this
> index.

Adriano,

I was able to reproduce the behaviour (negative F values in adonis()) with
another data set (sipoo of vegan) and with two dissimilarity indices:
betadiver(sipoo, "sim") and vegdist(sipoo, "mountford"). I checked the
calculation in alternative ways, and it seems to me that adonis() works quite
correctly. The "problem" is with the index.

Function adonis() works in the same framework as metric scaling, a.k.a.
principal coordinates analysis. If you use dissimilarities other than
Euclidean distances, you will usually get some negative eigenvalues. For
reconstitution of the original dissimilarities, you must scale the
eigenvectors so that the sum of their squares is proportional to the
eigenvalue. If your eigenvalues are negative, the corresponding eigenvectors
must be complex so that the sum of their squares can be negative. If your
explanatory variable in adonis() has a better correspondence to these
complex vectors (negative eigenvalues) than to real vectors (positive
eigenvalues), the reconstituted distance in adonis() will be negative. This
is most likely if your data is strongly non-Euclidean, but can happen with
any dissimilarity index except Euclidean distances of real data. The
betadiver(..., "sim") index you tried is one of the most non-Euclidean
cases, and its complex component is large relative to the real component in
metric scaling. 
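
The negative eigenvalues are easy to see in a small case. The following is
only a minimal base-R sketch, not the adonis() code: the 3 x 3 matrix is a
made-up toy example (not the sipoo data), chosen so that it breaks the
triangle inequality, and the centring step is the usual Gower
double-centring of metric scaling.

```r
## Hypothetical toy dissimilarities among three points: d[2,3] = 3 is
## larger than d[1,2] + d[1,3] = 2, so no Euclidean configuration exists.
d <- matrix(c(0, 1, 1,
              1, 0, 3,
              1, 3, 0), nrow = 3, byrow = TRUE)
n <- nrow(d)
## Gower double-centring of -d^2/2, as in metric scaling
J <- diag(n) - matrix(1/n, n, n)
B <- J %*% (-0.5 * d^2) %*% J
ev <- eigen(B, symmetric = TRUE)$values
round(ev, 4)  ## one positive, one (numerically) zero, one negative
## The eigenvalues still sum to the total sum of squares, sum(d^2)/n
all.equal(sum(ev), sum(d[lower.tri(d)]^2) / n)
```

The axis with the negative eigenvalue would need imaginary coordinates to
reproduce the original dissimilarities.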

Function adonis() does not run explicit metric scaling, but the same algebra
is built in its equations. It seems that it quite correctly analyses the
data and gives you the correct negative distances as a result. If you want
to use non-metric indices, you should be prepared to tolerate negative
distances. Actually, we ran into this in function betadisper(), which is a
sister function of betadiver() and sister-in-law to adonis(). There we
solved the problem by taking absolute distances (this was done two years and
three months ago, but we didn't touch adonis() then).

You can analyse the issue yourself by using function capscale() in vegan.
The function returns a summary table with decomposition of inertia to real
and imaginary components. If you save the results, you can extract the
distance matrices corresponding to the real and imaginary components using
function fitted(, model = c("CCA", "CA", "Imaginary")). Here is an example
with the 'sipoo' data:

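library(vegan)
data(sipoo)
d <- betadiver(sipoo, "sim")
m <- capscale(d ~ 1)   ## unconstrained model: metric scaling only
m   ## to see the decomposition of inertia
d.real <- fitted(m, model="CA")          ## distances from the real axes
d.ima <- fitted(m, model="Imaginary")    ## distances from the imaginary axes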

Now you can run the adonis() separately for real and imaginary components:

example(sipoo) ## to get the explanatory variable sipoo.area
adonis(d ~ sipoo.area) ## Negative SumsOfSqs and F value
adonis(d.real ~ sipoo.area)
adonis(d.ima ~ sipoo.area)

The SumsOfSqs of the first model = SumsOfSqs of the second model (d.real)
minus SumsOfSqs of the third model (d.ima). QED.
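
The same subtraction can be checked by hand for the total sum of squares. The
following is only a base-R sketch under stated assumptions: the toy matrix is
made up (not the sipoo data), the names u.real, u.ima and ss() are
illustrative, and the eigenvectors are scaled by the square roots of the
absolute eigenvalues, which I assume mirrors what the real and imaginary
components of capscale() represent. It does not call adonis() and says
nothing about individual model terms.

```r
## Hypothetical toy dissimilarities that violate the triangle inequality
d <- matrix(c(0, 1, 1,
              1, 0, 3,
              1, 3, 0), nrow = 3, byrow = TRUE)
n <- nrow(d)
J <- diag(n) - matrix(1/n, n, n)
e <- eigen(J %*% (-0.5 * d^2) %*% J, symmetric = TRUE)
tol <- 1e-8
pos <- e$values > tol    ## real axes
neg <- e$values < -tol   ## imaginary axes
## Scale eigenvectors by sqrt(|eigenvalue|), then take ordinary distances
u.real <- sweep(e$vectors[, pos, drop = FALSE], 2, sqrt(e$values[pos]), "*")
u.ima <- sweep(e$vectors[, neg, drop = FALSE], 2, sqrt(-e$values[neg]), "*")
d.real <- as.matrix(dist(u.real))
d.ima <- as.matrix(dist(u.ima))
## Squared dissimilarities = real part minus imaginary part
all.equal(d^2, d.real^2 - d.ima^2, check.attributes = FALSE)
## The total sums of squares split in the same way
ss <- function(x) sum(x[lower.tri(x)]^2) / n
c(total = ss(d), real = ss(d.real), imaginary = ss(d.ima))
```

Here the total equals the real component minus the imaginary component, which
is the same bookkeeping as in the three adonis() tables.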

A reference that I have suggested here before for understanding negative
eigenvalues of non-Euclidean dissimilarities is:

Gower, J.C. (1985). Properties of Euclidean and non-Euclidean
     distance matrices. _Linear Algebra and its Applications_ 67,
     81-97.

I suggest you just accept the idea of having negative distances or
alternatively use indices that are less likely to give you negative
distances. 

Cheers, Jari Oksanen