[R-sig-eco] ordination and clustering with continuous and categorical variables

Gavin Simpson gavin.simpson at ucl.ac.uk
Wed Dec 22 10:49:17 CET 2010


On Sun, 2010-12-19 at 21:07 +0100, Mario Brusadin wrote:
> Dear all,
> 
> Apologies in advance for the total beginner's question. I would like
> to perform a simple ordination on a dataset of species traits
> containing both continuous and categorical variables, as well as a
> cluster analysis on the same dataset. My aim is to identify species
> niche breadth using the ordination and to identify any clusters of
> similar species with the cluster analysis. If possible, I would also
> like to overlay the results of the cluster analysis onto an ordination
> plot.
> 
> I have seen from previous discussions on this mailing list that it is
> possible to apply hierarchical clustering to a distance matrix created
> with the function daisy from the cluster package. This function can be
> set to use the Gower dissimilarity coefficient (1971) to create a
> distance matrix from a dataset that contains both continuous and
> categorical variables. Is this the best option available?

Etienne Laliberte's FD package contains gowdis(), which includes several
extensions to Gower's coefficient motivated from an ecological
viewpoint. See it and the references cited for more information.

On the other hand, daisy() is robust and well tested.

Which to use will depend on whether you need Podani's extensions.
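
As a rough illustration of the daisy() route, here is a minimal sketch. The
trait table, its column names and the choice of average linkage with three
groups are all made up for the example; they are not from the original
question.

```r
library(cluster)  # recommended package, normally installed with R

# Hypothetical trait table: two continuous traits and one categorical trait
traits <- data.frame(
  body_mass = c(12.1, 15.3, 8.7, 30.2, 28.9, 9.4),
  wing_len  = c(5.2, 5.9, 4.1, 9.8, 9.1, 4.4),
  diet      = factor(c("seed", "seed", "insect", "fruit", "fruit", "insect")),
  row.names = paste0("sp", 1:6)
)

# Gower dissimilarity copes with the mix of numeric and factor columns
d <- daisy(traits, metric = "gower")

# Hierarchical clustering on the dissimilarities (average linkage is an
# assumption here, not a recommendation), cut into three groups
hc <- hclust(d, method = "average")
groups <- cutree(hc, k = 3)
```

If FD is installed, FD::gowdis(traits) should be usable in place of the
daisy() call; check ?gowdis for how ordinal variables and weights are handled.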

> As for the
> ordination, I am not entirely sure which ordination method I should
> use. Is Correspondence Analysis suitable or would it be
> better to use Principal Coordinates Analysis? Any suggestions/help
> would be greatly appreciated!

I wouldn't use CA for this. Principal Coordinates (PCoA) would be a
starting point, but nMDS (metaMDS() in package vegan) would be my
preferred method if ordinating sites.
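
A sketch of both options on a Gower dissimilarity matrix, with cluster
membership overlaid as point colours. The trait table is hypothetical, and
cmdscale() from base R stands in for PCoA; the nMDS part only runs if vegan
is installed.

```r
library(cluster)

# Hypothetical trait table (same made-up layout as a mixed-type trait dataset)
traits <- data.frame(
  body_mass = c(12.1, 15.3, 8.7, 30.2, 28.9, 9.4),
  wing_len  = c(5.2, 5.9, 4.1, 9.8, 9.1, 4.4),
  diet      = factor(c("seed", "seed", "insect", "fruit", "fruit", "insect")),
  row.names = paste0("sp", 1:6)
)
d <- daisy(traits, metric = "gower")

# Cluster membership to overlay on the ordination (k = 3 is an assumption)
groups <- cutree(hclust(d, method = "average"), k = 3)

# PCoA (classical scaling) in two dimensions
pcoa <- cmdscale(d, k = 2, eig = TRUE)
plot(pcoa$points, col = groups, pch = 19, xlab = "PCoA 1", ylab = "PCoA 2")
text(pcoa$points, labels = rownames(traits), pos = 3)

# nMDS on the same dissimilarities, only if vegan is available
if (requireNamespace("vegan", quietly = TRUE)) {
  nmds <- vegan::metaMDS(d, k = 2, trace = FALSE)
  plot(vegan::scores(nmds, display = "sites"), col = groups, pch = 19)
}
```

With so few objects nMDS may warn that stress is nearly zero; a real trait
dataset would normally have many more rows.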

One problem, or rather issue, I foresee is that neither of these
techniques uses the original species information - it is effectively lost
when we convert to dissimilarities. Species scores can still be located in
the ordination space afterwards, where they are the weighted averages of
the site scores.
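
To make the weighted-average idea concrete, a small worked sketch in base R.
The site scores and abundances are invented for the example.

```r
# Hypothetical site scores on two ordination axes (4 sites)
site_scores <- matrix(c(-1.0,  0.5,
                        -0.2, -0.8,
                         0.4,  0.9,
                         1.1, -0.3),
                      ncol = 2, byrow = TRUE,
                      dimnames = list(paste0("site", 1:4),
                                      c("Axis1", "Axis2")))

# Hypothetical abundances: sites in rows, species in columns
comm <- matrix(c(10, 0,
                  5, 1,
                  0, 4,
                  0, 8),
               ncol = 2, byrow = TRUE,
               dimnames = list(paste0("site", 1:4), c("spA", "spB")))

# Each species score is the abundance-weighted mean of the site scores:
# sum over sites of (abundance * site score) / total abundance of the species
sp_scores <- t(comm) %*% site_scores / colSums(comm)
sp_scores
```

In vegan, wascores() performs this kind of calculation, and metaMDS() adds
species scores in this way when it is given the community data rather than
a dissimilarity matrix.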

How were you planning on investigating niche breadth from the
ordination? What would niche-breadth relate to in terms of traits?

With nMDS you can't treat the "axes" separately - there aren't two
independent gradients in a 2-d nMDS solution, so you have to work with the
configuration in 2-d space. You can fit a model to this configuration
using ordisurf() (or do it by hand using gam()), but then extracting
niche widths from a smoother-based model is problematic - but see
Heegaard's paper on "borders":

Heegaard E. 2002. The outer border and central border for
species-environmental relationships estimated by non-parametric
generalised additive models. Ecological Modelling 157: 131-139.

although I'm not aware of a generally-available R implementation.
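
For the "by hand" route, a minimal sketch of fitting a smooth surface over a
2-d configuration with gam() from mgcv. The configuration and the response
are simulated, and the Poisson family, k = 10 and REML are assumptions made
for the example. It does not attempt Heegaard's border estimation.

```r
library(mgcv)  # recommended package, normally installed with R

set.seed(1)
# Hypothetical 2-d nMDS configuration for 60 sites
conf <- data.frame(MDS1 = runif(60, -1, 1),
                   MDS2 = runif(60, -1, 1))

# Simulated abundance with a single peak near the centre of the configuration
conf$abund <- rpois(60, lambda = 20 * exp(-3 * (conf$MDS1^2 + conf$MDS2^2)))

# One bivariate smooth of both "axes" together, not two separate terms
fit <- gam(abund ~ s(MDS1, MDS2, k = 10), data = conf,
           family = poisson, method = "REML")

# Predict the fitted surface on a regular grid for contouring
grid <- expand.grid(MDS1 = seq(-1, 1, length.out = 40),
                    MDS2 = seq(-1, 1, length.out = 40))
grid$fitted <- as.numeric(predict(fit, newdata = grid, type = "response"))
```

With a vegan ordination object, ordisurf() wraps much the same kind of model
and draws the contours directly on the ordination plot.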

> Gower, J. C. (1971) A general coefficient of similarity and some of
> its properties, Biometrics 27, 857–874.

HTH

G

> 
> Cheers
> 
> Mario
> 
> --
> Padova
> Italy

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


