[R-sig-eco] ordination and clustering with continuous and categorical variables

Wed Dec 22 01:14:50 CET 2010

Dear Mario,

For the Gower dissimilarity you may want to check

library(FD)
?gowdis

which allows weights to be assigned to traits/variables (sometimes useful),
and implements Podani's 1999 approach for ordinal traits/variables.
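As a rough sketch of how that might look (the trait table and the weights below are made up purely for illustration, and the FD package is assumed to be installed):

```r
library(FD)

# Hypothetical trait table: one row per species, invented values
traits <- data.frame(
  body_mass = c(12.1, 30.5, 8.2, 55.0, 21.3),             # continuous
  diet      = factor(c("herb", "carn", "herb", "omni", "carn")),  # nominal
  shade_tol = factor(c("low", "high", "mid", "mid", "low"),
                     levels = c("low", "mid", "high"),
                     ordered = TRUE),                      # ordinal
  row.names = paste0("sp", 1:5)
)

# Equal weights; ordered factors handled with Podani's (1999) method
d_equal <- gowdis(traits, ord = "podani")

# Same, but down-weighting the nominal trait (weights are illustrative only)
d_weighted <- gowdis(traits, w = c(1, 0.5, 1), ord = "podani")

d_equal
```

Both calls return an object of class "dist", so the result can be passed directly to hclust(), cmdscale(), and similar functions.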

Cheers

Etienne

-----Original Message-----
From: r-sig-ecology-bounces at r-project.org
[mailto:r-sig-ecology-bounces at r-project.org] On Behalf Of Tyler Smith
Sent: Wednesday, 22 December 2010 1:31 AM
To: r-sig-ecology at r-project.org
Subject: Re: [R-sig-eco] ordination and clustering with continuous and
categorical variables

Mario Brusadin <mario.brusadin at gmail.com>
writes:

> I would like to perform a simple ordination on a dataset of species
> traits, containing both continuous and categorical variables, as well as
> a cluster analysis on the same dataset. ... I would possibly like to
> overlay the results of the cluster analysis onto an ordination
> plot.
>
> ... This function can be set to use the Gower dissimilarity
> coefficient (1971), to create a distance matrix, from a dataset that
> contains both continuous and categorical variables. Is this the best
> option available?

I'm not sure if it's the best option, but it is certainly reasonable.

> As for the ordination, I am not entirely sure which ordination
> method I should use. Is Correspondence Analysis suitable or
> would it be better to use Principal coordinates analysis?

Correspondence analysis implicitly preserves the Chi-Square distance
between individuals, which I don't think is what you want. Since you'll
be using the Gower distance to create a distance matrix, you can use
either principal coordinates analysis (via the cmdscale() function), or
non-metric multidimensional scaling (NMDS, available in the metaMDS()
function in the vegan package).
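A minimal sketch of both routes, under some assumptions: the trait data are simulated, and daisy() from the cluster package is used here only as one convenient way to get a Gower distance matrix (it may not be the function you are using). The NMDS step runs only if vegan is installed.

```r
library(cluster)  # daisy(); a recommended package shipped with R

# Simulated trait table: 20 hypothetical species, mixed variable types
set.seed(1)
traits <- data.frame(
  height = c(rnorm(10, mean = 20, sd = 3), rnorm(10, mean = 60, sd = 5)),
  habit  = factor(rep(c("shrub", "tree"), each = 10)),
  leaf_n = runif(20, min = 1, max = 4)
)
rownames(traits) <- paste0("sp", 1:20)

# Gower dissimilarity, converted to a plain "dist" object
d <- as.dist(daisy(traits, metric = "gower"))

# Principal coordinates analysis (classical metric scaling), two axes
pcoa <- cmdscale(d, k = 2, eig = TRUE)
head(pcoa$points)

# Non-metric multidimensional scaling on the same distances
if (requireNamespace("vegan", quietly = TRUE)) {
  nmds <- vegan::metaMDS(d, k = 2, trace = FALSE)
  print(nmds$stress)
}
```

When metaMDS() is given a distance matrix rather than a community table, it uses those distances as they are, so the Gower distances are what get ordinated.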

If you want the 'best' possible two-dimensional representation of your
distance matrix to plot on paper, then NMDS is probably better. The
vegan functions envfit and ordisurf are very useful for plotting the
original variables onto your ordination plots. You can use the results
of the clustering analysis to alter the shape/color of the plotted
points (using the col or pch arguments to plot()) in order to 'overlay'
the clusters on the ordination.
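Something along these lines might work as a starting point. The data are simulated, and the choices of daisy() for the Gower distances, average linkage, and three clusters are arbitrary assumptions made for the example; the envfit step runs only if vegan is installed.

```r
library(cluster)  # daisy(); a recommended package shipped with R

# Simulated trait table: 30 hypothetical species in three loose groups
set.seed(42)
traits <- data.frame(
  height = c(rnorm(10, 10, 2), rnorm(10, 30, 3), rnorm(10, 60, 4)),
  habit  = factor(rep(c("herb", "shrub", "tree"), each = 10)),
  sla    = runif(30, min = 5, max = 25)
)
rownames(traits) <- paste0("sp", 1:30)

d   <- as.dist(daisy(traits, metric = "gower"))
ord <- cmdscale(d, k = 2)               # PCoA scores, one row per species

# Cluster on the same distances and cut the tree into 3 groups
hc  <- hclust(d, method = "average")
grp <- cutree(hc, k = 3)                # integer cluster labels 1..3

# Colour and plotting symbol both follow cluster membership
plot(ord, col = grp, pch = grp, xlab = "Axis 1", ylab = "Axis 2")
legend("topright", legend = paste("cluster", sort(unique(grp))),
       col = sort(unique(grp)), pch = sort(unique(grp)))

# Overlay the original variables: arrows for numeric traits,
# centroids for factor levels
if (requireNamespace("vegan", quietly = TRUE)) {
  fit <- vegan::envfit(ord, traits, permutations = 199)
  plot(fit)
}
```

The same col/pch idea applies to an NMDS result; you would plot the site scores from the metaMDS object instead of the cmdscale matrix.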

> My aim is to identify species niche-breadth using the ordination and
> identify any clusters of similar species with the cluster analysis.

The above suggestions will give you a potentially useful graphical
summary of your data. I'm not sure how you'd extract niche-breadth from
this. Do you have multiple measurements per species? If you've got a
single summary value for each trait per species, what you'll be plotting
is the relative niche position of each species, rather than their niche
breadths.

Regards,

Tyler

_______________________________________________
R-sig-ecology mailing list
R-sig-ecology at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology