[R-sig-eco] capscale() for PCoA-CDA

gabriel singer gabriel.singer at univie.ac.at
Fri Dec 4 14:02:22 CET 2009


Dear Jari and others,
>> Hi everybody,
>>
>> Has anybody used capscale() in package vegan to compute a PCoA-CDA as
>> suggested by Anderson and Willis 2003 (Ecology 84: 511 ff) using one or
>> more factors as "predictors"?
>>
>> Then I wonder about:
>>
>> *) How to interpret interactions of factors? Why are interactions
>> (specified as "~factor1*factor2" in the function call) shown as
>> continuous predictors (using arrows) in the plot function? Wouldn't
>> centroids for all cells in the design be more appropriate? Aren't
>> factorial interactions in a CDA setting more or less meaningless?
>>     
>
> Internally capscale() uses contrasts of variables, and they are treated as
> continuous variables and shown as arrows in plots. However, if the
> contrasts correspond to simple factors, they are not drawn but their
> centroids are shown. For ordered factors you get both centroids and the
> arrows. The interactions of contrasts cannot be shown as simple class means
> and therefore they are drawn as arrows. The simple centroids are not
> appropriate, but you should have centroids of all combinations of class
> levels of interacting factors.
>
> If you think that factorial interactions in *** (what is CDA?) are
> meaningless, why do you want to use them?
>
> I wouldn't say they are meaningless, because that depends on your meaning.
> Often they are difficult to interpret, but that's another issue.
>   
I understand the arrows for interactions now, thanks.

I used CDA in the sense of Anderson and Willis 2003 (and others) to mean
Canonical Discriminant Analysis; as such it is, at least to my
understanding, equivalent to Discriminant Function Analysis (DFA).
When CDA aka DFA is used with 2 interacting factors, it will try to best
separate groups, and that is *any groups*: I can't see why (or how)
preference should be given to any grouping criterion (factor 1, factor 2
or both)... In the end a 4-level factor should be as good as a 2*2
factorial combination. In this sense I used the word "meaningless".

In fact, capscale() results for a 1*4 constraint (1 factor, 4 levels)
are identical to those for a 2*2 constraint. However, the centroids are
at different positions (!); in fact, the centroids of all combinations
of class levels are at weird (wrong, I think) positions in the 2*2 case!?
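
One way to see why the two constraints give the same fit is a minimal
sketch in plain Python (not vegan code). It assumes treatment (dummy)
coding for the 2*2 model, i.e. the columns intercept, a, b and a:b; the
function name determinant() and the variable names are made up for
illustration. If the four cell rows of that design have a nonzero
determinant, the 2*2 columns span the same 4-dimensional space as four
one-way group indicators, so any constrained fit on them must agree.

```python
from fractions import Fraction
from itertools import product


def determinant(matrix):
    """Determinant via Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # singular matrix
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


# One design row per cell of the 2*2 layout: intercept, a, b, a:b
cells = list(product((0, 1), repeat=2))
factorial_rows = [[1, a, b, a * b] for a, b in cells]

# One design row per level of the 4-level factor: plain group indicators
one_way_rows = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

det_factorial = determinant(factorial_rows)
det_one_way = determinant(one_way_rows)

# Both are nonzero, so both codings span the same column space.
print(det_factorial != 0, det_one_way != 0)
```

This only speaks to the fitted (constrained) part being the same; it
says nothing about where a plotting routine places centroids for
interaction terms.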

Still, "interactions" finally make sense when interpreting the plot,
that's quite true.
>   
>> *) How to get classification statistics? And how to efficiently run a
>> "leave 1 out" classification analysis? I thought of manually writing
>> code that checks for the closest centroid. Would it be appropriate to
>> use Euclidean distance as a criterion for this since it happens in PCo
>> space? Probably there are more efficient functions which I do not know
>> of, yet,... for example a function that allows extraction of distances
>> of all objects to all centroids?
>>
>>     
> There is no such thing. Contributed code will be reviewed for inclusion into
> vegan.
>  
>   
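
The "closest centroid" idea from the question can be sketched in a few
lines. This is a rough illustration in plain Python, not vegan code. It
assumes the ordination coordinates are already available as numeric
tuples, and it only recomputes the group centroids with each object left
out; a stricter leave-one-out would also redo the ordination without
that object. The names centroid() and nearest_centroid_loo() and the toy
data are hypothetical.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a list of equal-length points."""
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(len(points[0]))]


def nearest_centroid_loo(coords, labels):
    """Leave-one-out prediction: assign each object to the group whose
    centroid (computed without that object) is closest in Euclidean distance."""
    predictions = []
    for i, point in enumerate(coords):
        groups = {}
        for j, (other, lab) in enumerate(zip(coords, labels)):
            if j != i:  # hold out object i
                groups.setdefault(lab, []).append(other)
        centroids = {lab: centroid(pts) for lab, pts in groups.items()}
        best = min(centroids, key=lambda lab: math.dist(point, centroids[lab]))
        predictions.append(best)
    return predictions


# Toy coordinates on two ordination axes (made up for illustration)
coords = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (2.0, 2.1), (2.2, 1.9), (1.9, 2.0)]
labels = ["A", "A", "A", "B", "B", "B"]

predictions = nearest_centroid_loo(coords, labels)
accuracy = sum(p == t for p, t in zip(predictions, labels)) / len(labels)
print(predictions, accuracy)
```

Euclidean distance is used here only because the coordinates are assumed
to come from a PCoA-type space; whether that is the right criterion is
exactly the open question above.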
>> *) Is the application of capscale on a Euclidean distance matrix
>> equivalent to a classical DFA aka CDA on the original data - or am I
>> completely wrong with this idea?
>>
>>     
> No, it isn't equal to "DFA aka CDA". Perhaps... Depends on what are DFA and
> CDA. With Euclidean distances, capscale() is equivalent to redundancy
> analysis (RDA). Guessing that "DFA aka CDA" are discriminant analysis, RDA
> is not equal to them. The major difference is that RDA uses no information
> about scatter of points with respect to the class centroids, but it only
> uses class centroids. The RDA tries to maximize the distances among class
> centroids, but it doesn't try to maximize the separation of points of
> different classes. The methods are very different although the results may
> have some similarities.
>
> This is connected to the previous question: because RDA (that is in the
> heart of capscale()) does not try to optimize in classification, there is no
> classification statistic to be optimized. That should be estimated
> independently of the analysis and after the analysis, and there are no
> functions for the purpose in vegan.
>  
>   
Slightly confused now... Anderson and Willis (2003) describe PCoA on a
dissimilarity structure, followed by CDA or CCorA, and call the
procedure CAP (Canonical Analysis of Principal Coordinates). I will call
the latter two approaches PCoA-CDA and PCoA-CCorA. Now, I get that CCorA
differs from RDA mainly conceptually, so there is not much (any?)
difference between PCoA-CCorA and PCoA-RDA = capscale().
Now, is PCoA-CDA really equivalent to db-RDA (in the sense of Legendre
and Anderson 1999)? I initially thought this would be the case: they
both use a set of dummy variables to code for the factor and treat these
as continuous predictors. A second thought tells me they can't be the
same. Then maybe what's left is only the term capscale(), which is not
the same as CAP in the case of PCoA-CDA...
Seems I am getting lost in the panoply of acronyms, sorry...
>> *) Given only one factor as a "predictor", I guess using permutest() or
>> anova() on an object resulting from capscale is completely equivalent to
>> a direct application of adonis()? Correct?
>>
>>     
> Have you tried this? After trying, you could tell us if this is true. I
> wouldn't expect this. The results may not be completely different, but
> internally the methods are pretty different, and when I tried with the same
> random number seed and hence same permutations, the results were not identical.
>   
Well, the question was sort of aimed at what's happening in the
background; obviously that's not the same (though I don't get how
exactly the two permutation tests differ; I thought that, at least in
the simple 1-factor case, it's basically permuting raw data and building
a pseudo-F distribution). In my trials I got very similar results (also
the same pseudo-F, so I thought the test statistic has to be the same)
and interpreted any differences in the P-values as due to differences in
the permutations.
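
For reference, the "permute and build a pseudo-F distribution" idea for
one factor can be sketched as below. This is a generic label-permutation
scheme in plain Python, written under my own assumptions; it makes no
claim about what adonis() or anova() on a capscale object do
internally. The pseudo-F is computed from the distance matrix by
partitioning sums of squared distances into total and within-group
parts. The function names and the toy data are hypothetical.

```python
import random
from itertools import combinations


def pseudo_f(dist, labels):
    """Pseudo-F for one factor from a symmetric distance matrix."""
    n = len(labels)
    groups = sorted(set(labels))
    # Total SS: sum of squared distances over all pairs, divided by N
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within SS: the same quantity per group, divided by the group size
    ss_within = 0.0
    for g in groups:
        members = [i for i in range(n) if labels[i] == g]
        ss_within += sum(dist[i][j] ** 2
                         for i, j in combinations(members, 2)) / len(members)
    ss_among = ss_total - ss_within
    return (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))


def permutation_p(dist, labels, n_perm=999, seed=1):
    """Observed pseudo-F and its permutation P-value (labels are shuffled)."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    # The observed arrangement counts as one of the permutations
    return observed, (hits + 1) / (n_perm + 1)


# Toy one-dimensional data (made up); distances are absolute differences
values = [0.0, 0.3, 0.5, 3.0, 3.4, 3.1]
labels = ["A", "A", "A", "B", "B", "B"]
dist = [[abs(x - y) for y in values] for x in values]

observed_f, p_value = permutation_p(dist, labels)
print(observed_f, p_value)
```

With a fixed seed the P-value is reproducible; two implementations that
share the statistic but draw their permutations differently would give
the same pseudo-F and slightly different P-values.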

Jari, thanks for the discussion!

Cheers, Gabriel


