[R-sig-eco] IndVal groups & Plotting an RDA with labels

Tue Jul 3 17:09:45 CEST 2012

Caitlin,

     Depending on the size of your data set one way to identify clusters 
is to label the dendrogram with cluster number.  Lets say your Ward's 
result is called 'ward.hcl', and your cluster membership (perhaps 
returned from cutree) is in a vector called 'clustid'

 > plot(ward.hcl,labels=clustid)

will produce the dendrogram with cluster number identified on the 
bottom.  If the number of plots is too high they overwrite and it gets 
hard to read, but usually you can make it out if you stretch the plot.

    Alternatively, size is sometimes indicative, and you can simply

 > table(clustid)

and see if cluster size is unique.

Dave Roberts

On 07/03/2012 07:12 AM, Caitlin Porter wrote:
> Dear all,
>
>
> I am working with a large data set to examine plant community structure on
> barrens habitats and have a couple of questions regarding Indval (labdsv
> package) on Wards Clusters (hclust function in stats package) and plotting
> axes of an RDA result with plot, plot.cca or esqplot functions (vegan,
> MASS, graphics packages)
>
> 1.      I have run IndVal (package labdsv) on a Ward’s Cluster Analysis to
> determine indicator species for groups. I selected k=5 groups as my most
> meaningful and conservative (sample size, etc) classification. However,
> according to my average silhouette width, k=12 is the most optimal number
> of groups.  I ran an IndVal on k=12 out of curiosity and notice some
> interesting patterns I would like to explore further but I am having
> difficulty understanding the output. The groups are labeled in the
> indicator species output as “group 1,2,3,4….12”. I do not understand how to
> determine which group is associated with which cluster (e.g.. in my Ward’s
> dendrogram – which is group 11?). It is somewhat obvious when k is
> relatively small because of the order the groups are clustered in, branch
> height in the dendrogram, and species frequency and abundances, however I’d
> like to know for larger groups. It would be ideal if I could even label the
> dendrogram with these groups. I’ve seen examples of these in a couple of
> papers with color coded boxes, but I can’t  seem to figure out how to code
> it myself.
>
>   2.      My second question relates to plotting an RDA. I have been able to
> run an RDA in vegan package successfully but unable to plot it in a way
> that I can interpret. I need to label sites, species (response matrix) and
> environmental variables/PCA axes (explanatory matrix).
>
>   So far, I’ve only been able to label either the response matrix or the
> explanatory matrix in my graphs, but not all 3 sets of points. I’ve tried
> modifying plot function and code from Borcard et al 2011, (Numerical
> ecology with R), esqplot code for MASS package and plot.cca in Vegan
> package.  I would prefer to use esqplot since I understand  already how to
> better customize it, but I’m just looking to get any graph I can read at
> this point.  When I use the plot function from Borcard et al. I see PC axes
> names only. When I use esqplot, I see species names only.  I also tried
> plot.cca in vegan package but wasn’t able to call up a graph. This code
> looks like a great way to do it, but I’m not sure what I’m doing, *e.g*.
> what to put in for const. or what the ‘unexpected symbol’ error means.
>
>   This old thread asks a similar question (
> https://stat.ethz.ch/pipermail/r-help/2009-February/188282.html), but I’m
> not sure I understand its solution and have approximately 300 species so
> providing a separate name for each individually might not be feasible. This
> other thread asks another similar question (
> http://r.789695.n4.nabble.com/RDA-Triplot-td3055474.html) but the author
> finds an error is generated specifying that biplot is not an appropriate
> method.
>
> I have included my code for question 2 below.Any help would be very much
> appreciated!
>
>
> Sincere thanks,
>
>
> Caitlin
>
>
> *#esqplot (MASS package) *
>
> library(MASS)
>
> #subset species and sites scores from the rda for first 10 RDA axes
>
> sr<- scores(c1.rda, display = c("sites", "species"), choices =
> c(1,2,3,4,5,6), scaling = 2)
>
> sites.only<- as.data.frame(sr$sites)
>
> srsp<- as.data.frame(sr$species) # data frame with just the species in it
>
> c1.site<- c1[,1] # object with just the site names from the original data
> set
>
> cp.m<- merge(c1.site, sites.only, by=0, sort=FALSE) # merged site object
> with site names
>
> eqscplot(cp.m$RDA1, cp.m$RDA2, xlim=c(-1, 1), ylim=c(-1, 1), col="blue",
> xlab="RDA Axis 1", ylab="RDA Axis 2", cex=0.3) # defining the variables&
> limits of plot and what the symbols look like
>
> text(cp.m$RDA1, cp.m$RDA2, labels=cp.m$x, col="black", cex=0.3) #adding
> names on plots
>
> text(srsp$RDA1, srsp$RDA2, labels=rownames(srsp),col= "red", cex=0.3) #adding
> names/species
>
> # Error in text.default(cp.m$RDA1, cp.m$RDA2, labels = cp.m$x, col =
> "black",  :   zero length 'labels'
>
>
>
> *#Borcard et al. 2011 plot function*
>
> plot(c1.rda, main= "Triplot RDA - scaling 2 - wa scores")
>
> spe.c1<- scores(c1.rda, choices=1:6, scaling=2, display="sp")
>
> arrows(0,0,spe.c1[,1], spe.c1[,4], length=0, lty=1, col="red")
>
>
>
> c1.rda.species<- as.data.frame(c1.rda$CCA$v)
>
> c1.species1<- c1.rda.species[abs(c1.rda.species$RDA1)>0.15 |
> abs(c1.rda.species$RDA2)>0.15,]
>
> text(c1.species1$RDA1, c1.species1$RDA2, labels=rownames(c1.species1),
> cex=0.7, col="red")
>
>
>
> *#plot.cca (Vegan package) *
>
> plot(c1.rda, choices = c(1,2), display =c("sp", "wa", "cn"), scaling =2,
> type ="text", xlim= c(-1, 1), ylim = c(-1, 1), xlab = RDA Axis 1", ylab=
> "RDA Axis 2", cex=0.3)
>
>
>
> *# **Error: unexpected symbol in "plot(c1.rda, choices = c(1,2), display
> =c("sp", #"wa", "cn"), scaling =2, type ="text", xlim= c(-1, 1), ylim =
> c(-1, 1), xlab #= RDA Axis"*
>
>
>
> ## S3 method for class 'cca':
>
> text(c1.rda, display = "sites", labels, choices = c(1, 2), scaling = 2,
>
>      arrow.mul, head.arrow = 0.05, select, const)
>
>
>
> # would I put: scores(c1.rda, choices=1:2, scaling=2, display="sp") in for
> const
>
>
>
> ## S3 method for class 'cca':
>
> points(c1.rda, display = "sites", choices = c(1, 2), scaling = 2,
>
>      arrow.mul, head.arrow = 0.05, select, const, ...)
>
>
>
> ## S3 method for class 'cca':
>
> scores(x, choices=c(1,2), display=c("sp","wa","cn"), scaling=2, ...)
>
>
>
> ## S3 method for class 'rda':
>
> scores(x, choices=c(1,2), display=c("sp","wa","cn"), scaling=2,
>
>      const, ...)
>
>
>
> ## S3 method for class 'cca':
>
> summary(object, scaling = 2, axes = 6, display = c("sp", "wa",
>
>      "lc", "bp", "cn"), digits = max(3, getOption("digits") - 3), ...)
>
>
>
> ## S3 method for class 'summary.cca':
>
> print(x, digits = x$digits, head = NA, tail = head, ...)
>
>
>
> ## S3 method for class 'summary.cca':
>
> head(x, n = 6, tail = 0, ...)
>
>
>
> ## S3 method for class 'summary.cca':
>
> tail(x, n = 6, head = 0, ...)
>
> * *
>
> * *
>
>
>
> _______________________________________________
> R-sig-ecology mailing list
> R-sig-ecology at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

--