[R] question about capscale (vegan)

Mon Nov 27 16:11:56 CET 2006

On Mon, 2006-11-27 at 15:37 +0100, Alicia Amadoz wrote:
> Hi Gavin,
> 
> I have been analyzing real data (sorry but I am not allowed to post
> these data here) and what I got was this,
> 
> mydistmat_f.cap <- capscale(distmat_f ~ F + L + F:L, mfactors_frame)

I believe you can write that formula as: distmat_f ~ F * L
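
A quick way to check this, as a minimal sketch in base R (it only expands the formulas and needs neither vegan nor your data):

```r
# terms() expands a formula without evaluating the variables in it,
# so no data are needed to compare the two spellings.
long  <- attr(terms(distmat_f ~ F + L + F:L), "term.labels")
short <- attr(terms(distmat_f ~ F * L), "term.labels")

print(short)            # the terms that F * L expands to
identical(long, short)  # TRUE if both formulas specify the same model terms
```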

> 
> Warning messages:
> 1: some of the first 30 eigenvalues are < 0 in: cmdscale(X, k = k, eig =
> TRUE, add = add)
> 2: NaNs produced in: sqrt(ev)

Sorry, I don't know enough about this method to say whether this is a
problem or not; you should read up on the method some more to decide
whether the first warning matters for your data. IIRC, negative
eigenvalues are to be expected with this method and are handled
explicitly by capscale(). As the warning comes from cmdscale(), I
suspect it is just that function being helpful, and not something you
need to worry about when cmdscale() is called from within capscale().
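
To see where warnings like these can come from, here is a small sketch in base R. The dissimilarity matrix is made up for illustration (it is not your data): it violates the triangle inequality, so the principal coordinates analysis gives a negative eigenvalue, and taking sqrt() of that gives a NaN.

```r
# Hypothetical 3-object dissimilarity matrix that violates the
# triangle inequality: d(1,3) = 3 > d(1,2) + d(2,3) = 2.
d <- as.dist(matrix(c(0, 1, 3,
                      1, 0, 1,
                      3, 1, 0), nrow = 3))

# Classical MDS (principal coordinates); eig = TRUE returns the eigenvalues.
pcoa <- cmdscale(d, k = 1, eig = TRUE)
ev <- pcoa$eig
print(ev)

# At least one eigenvalue is clearly negative ...
any(ev < -1e-8)

# ... so its square root is NaN (warning suppressed for the demo).
suppressWarnings(sqrt(ev))
```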

> 
> > mydistmat_f.cap
> 
> Call:
> capscale(formula = distmat_f ~ F + L + F:L, data = mfactors_frame)
> 
>               Inertia Rank
> Total          0.3758
> Constrained    0.2110    4
> Unconstrained  0.1648    4
> Inertia is squared  distance
> Some constraints were aliased because they were collinear (redundant)
> 
> Eigenvalues for constrained axes:
>      CAP1      CAP2      CAP3      CAP4
> 1.679e-01 2.954e-02 1.349e-02 1.233e-05
> 
> Eigenvalues for unconstrained axes:
>      MDS1      MDS2      MDS3      MDS4
> 1.388e-01 2.601e-02 4.076e-05 2.064e-07
> 
> So, by these results I can tell that there are 4 axes that explain
> 0.1648 of the total variance and another 4 axes that explain 0.2110 of
> the total variance. But I don't understand the difference between
> constrained and unconstrained.

The constrained axes are linear combinations of your explanatory
variables (F, L and F:L), so this is the part of the variance in your
genomic data that is explained by those factors. The unconstrained part
is the remaining, unexplained variance, and its axes are ordinary MDS
(principal coordinates) axes.

So you can explain c. 56% of the variance in your genomic data with F,
L, and F:L.
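
The arithmetic behind that figure, using the inertia values from your printed output (the variable names are just for illustration):

```r
# Inertia values copied from the capscale printout above.
total         <- 0.3758
constrained   <- 0.2110
unconstrained <- 0.1648

# The two parts add up to the total inertia.
all.equal(constrained + unconstrained, total)

# Percentage of total inertia explained by the constraints.
pct_constrained <- round(100 * constrained / total, 1)
pct_constrained   # 56.1
```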

Note the warning about aliased constraints - this means that at least
one term in the model (including interactions) is completely collinear
with another term (or combination of terms?) and so is redundant.

Type alias(mydistmat_f.cap) to see which coefficients are aliased
and ?alias to see what this means.
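
If it helps, here is what aliasing looks like in an ordinary linear model. This is a sketch with made-up numbers, using lm() rather than capscale(), so treat it only as an illustration of the idea:

```r
# Made-up predictors: x3 is an exact linear combination of x1 and x2,
# so it carries no information that the other two do not.
x1 <- c(1, 2, 3, 4, 5, 6)
x2 <- c(2, 1, 4, 3, 6, 5)
x3 <- x1 + x2
y  <- c(1.2, 0.7, 3.1, 2.8, 5.3, 4.9)

fit <- lm(y ~ x1 + x2 + x3)

# The redundant term cannot be estimated, so its coefficient is NA.
coef(fit)

# alias() reports which term is aliased and how it depends on the others.
al <- alias(fit)
al$Complete
```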

> 
> > anova(mydistmat_f.cap)
> 
> Permutation test for capscale under direct model
> 
> Model: capscale(formula = distmat_f ~ F + L + F:L, data = mfactors_frame)
>          Df    Var      F N.Perm Pr(>F)
> Model     4   0.21 1.2798 400.00 0.0875 .
> Residual  4   0.16
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> 
> > summary(anova(mydistmat_f.cap))
>        Df         Var               F             N.Perm        Pr(>F)
>  Min.   :4   Min.   :0.1648   Min.   :1.280   Min.   :200   Min.   :0.12
>  1st Qu.:4   1st Qu.:0.1764   1st Qu.:1.280   1st Qu.:200   1st Qu.:0.12
>  Median :4   Median :0.1879   Median :1.280   Median :200   Median :0.12
>  Mean   :4   Mean   :0.1879   Mean   :1.280   Mean   :200   Mean   :0.12
>  3rd Qu.:4   3rd Qu.:0.1994   3rd Qu.:1.280   3rd Qu.:200   3rd Qu.:0.12
>  Max.   :4   Max.   :0.2110   Max.   :1.280   Max.   :200   Max.   :0.12
>                               NA's   :1.000   NA's   :  1   NA's   :1.00
> 
> Then, I want to know the sum of squares of anova to check with other
> analysis that we performed but I can't see them by the output of anova.
> Besides, I am wondering if there is any manner to identify the main
> effects, factor effects and interaction in this anova analysis. I would
> be very grateful if you could help me to understand these results.

There isn't a summary method for anova.cca, so summary() is just
summarising the columns of the anova table, which isn't useful. Anyway,
this anova isn't working on sums of squares, but on inertia, which here
is squared distance (see the printed output above). It is a
permutation test, and simply works out by brute force how likely you
are to get a model explaining 56% of the total variance, given your
sample size and model complexity, under a null/random model.
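
The general idea of a permutation test can be sketched in base R as follows. This is not how vegan implements anova.cca; it uses simulated data, an ordinary lm() F statistic, and a hypothetical helper f_stat(), purely to show the brute-force logic:

```r
set.seed(1)

# Simulated data: two groups of 10, with a shift in the second group.
n     <- 20
group <- gl(2, n / 2)
y     <- rnorm(n) + as.numeric(group)

# Hypothetical helper: F statistic for the group effect in a linear model.
f_stat <- function(y, g) anova(lm(y ~ g))[["F value"]][1]

# Observed statistic.
obs <- f_stat(y, group)

# Null distribution: shuffle the response, breaking any link with the
# grouping, and recompute the statistic each time.
n_perm <- 199
perm   <- replicate(n_perm, f_stat(sample(y), group))

# Proportion of statistics (observed one included) at least as large
# as the observed one.
p_value <- (sum(perm >= obs) + 1) / (n_perm + 1)
p_value
```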

It sounds like you haven't grasped fully the fundamentals of the methods
you are employing, and I would strongly advise you to do some more
reading up on these methods. I can, at best, only guide you as I am not
that familiar with the technique myself.

A good start would be the refs in ?capscale and then search for papers
that cite Anderson & Willis and that use the methodology.

> 
> Thank you very much,
> Alicia

HTH

G

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Gavin Simpson                 [t] +44 (0)20 7679 0522
 ECRC & ENSIS, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%