[R-sig-eco] ordination and clustering with continuous and categorical variables

Tyler Smith tyler.smith at eku.edu
Tue Dec 21 18:30:48 CET 2010


Mario Brusadin <mario.brusadin at gmail.com>
writes:

> I would like to perform a simple ordination on a dataset of species
> traits, containing both continuous and categorical variables, as well as
> a cluster analysis on the same dataset. ... I would possibly like to
> overlay the results of the cluster analysis onto an ordination
> plot.
>
> ... This function can be set to use the Gower dissimilarity
> coefficient (1971) to create a distance matrix from a dataset that
> contains both continuous and categorical variables. Is this the best
> option available?

I'm not sure if it's the best option, but it is certainly reasonable.
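
For illustration only, here is a minimal sketch of computing Gower
dissimilarities and clustering on them. It assumes daisy() from the
cluster package (the function you are using may differ), a made-up trait
table, and average linkage cut into three groups; none of those choices
come from your message.

```r
library(cluster)  # recommended package shipped with R; provides daisy()

# Hypothetical trait table: two continuous traits and one categorical trait
traits <- data.frame(
  body_mass = c(12.1, 15.4, 3.2, 2.9, 40.0, 38.5),
  wing_len  = c(5.0, 5.5, 2.1, 2.0, 9.8, 9.1),
  diet      = factor(c("seed", "seed", "insect", "insect", "fruit", "fruit"))
)
rownames(traits) <- paste0("sp", seq_len(nrow(traits)))

# Gower dissimilarity handles numeric and factor columns together
gower_d <- daisy(traits, metric = "gower")

# Average-linkage clustering on the Gower distances, cut into 3 groups
# (both the linkage method and k = 3 are arbitrary choices for this sketch)
hc <- hclust(as.dist(gower_d), method = "average")
groups <- cutree(hc, k = 3)
print(groups)
```

Other linkage methods or a different number of groups may suit real
data better; plotting the dendrogram with plot(hc) is a quick check.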

> As for the ordination, I am not entirely sure which ordination
> method I should use. Is correspondence analysis suitable, or
> would it be better to use principal coordinates analysis?

Correspondence analysis implicitly preserves the Chi-Square distance
between individuals, which I don't think is what you want. Since you'll
be using the Gower distance to create a distance matrix, you can use
either principal coordinates analysis (via the cmdscale() function), or
non-metric multidimensional scaling (NMDS, available in the metaMDS()
function in the vegan package).
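
A rough sketch of both options on a Gower distance matrix follows. The
trait data are simulated, daisy() from the cluster package is assumed
for the Gower step, and the NMDS part only runs if vegan is installed.

```r
library(cluster)  # provides daisy() for the Gower dissimilarity

# Simulated trait table (hypothetical names and values)
set.seed(1)
n <- 20
traits <- data.frame(
  height    = runif(n, 0.1, 3),
  seed_mass = rlnorm(n),
  growth    = factor(sample(c("annual", "perennial", "shrub"), n,
                            replace = TRUE))
)
rownames(traits) <- paste0("sp", seq_len(n))

gower_d <- daisy(traits, metric = "gower")

# Principal coordinates analysis: cmdscale() is in the stats package
pcoa <- cmdscale(gower_d, k = 2, eig = TRUE)
head(pcoa$points)  # species coordinates on the first two axes

# NMDS on the same distances, only if vegan is available
if (requireNamespace("vegan", quietly = TRUE)) {
  nmds <- vegan::metaMDS(gower_d, k = 2, trace = FALSE)
  head(nmds$points)
}
```

Gower distances are not guaranteed to be Euclidean, so it may be worth
looking at pcoa$eig for negative eigenvalues before interpreting the
PCoA axes.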

If you want the 'best' possible two-dimensional representation of your
distance matrix to plot on paper, then NMDS is probably better, since it
fits the rank order of the distances in the number of dimensions you
choose. The vegan functions envfit() and ordisurf() are very useful for
plotting the original variables onto your ordination plots. You can use
the results of the cluster analysis to alter the shape/color of the
plotted points (using the pch or col arguments to plot()) in order to
'overlay' the clusters on the ordination.
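
As a hedged example of that overlay, the sketch below colors and shapes
the points of a PCoA plot by cluster membership. The data are simulated,
and daisy(), average linkage, and three clusters are assumptions made
only for the illustration.

```r
library(cluster)  # provides daisy() for the Gower dissimilarity

# Simulated trait table (hypothetical names and values)
set.seed(42)
n <- 20
traits <- data.frame(
  height    = runif(n, 0.1, 3),
  seed_mass = rlnorm(n),
  growth    = factor(sample(c("annual", "perennial", "shrub"), n,
                            replace = TRUE))
)
rownames(traits) <- paste0("sp", seq_len(n))

gower_d <- daisy(traits, metric = "gower")

# Cluster membership (integers 1..3) and a two-dimensional ordination
groups <- cutree(hclust(gower_d, method = "average"), k = 3)
ord <- cmdscale(gower_d, k = 2)

# One color and one plotting symbol per cluster
symbols <- c(16, 17, 15)
plot(ord, col = groups, pch = symbols[groups],
     xlab = "PCoA 1", ylab = "PCoA 2")
legend("topright", legend = paste("cluster", 1:3),
       col = 1:3, pch = symbols)
```

The same col and pch vectors can be passed to points() when plotting
the scores from an NMDS instead.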

> My aim is to identify species niche breadth using the ordination, and
> to identify any clusters of similar species with the cluster analysis.

The above suggestions will give you a potentially useful graphical
summary of your data. I'm not sure how you'd extract niche-breadth from
this. Do you have multiple measurements per species? If you've got a
single summary value for each trait per species, what you'll be plotting
is the relative niche position of each species, rather than their niche
breadths.

Regards,

Tyler


