[R] cluster analyses

White.Denis@epamail.epa.gov White.Denis at epamail.epa.gov
Tue Apr 30 02:49:27 CEST 2002


> I'm clustering rather large data sets and would like to cut the dendrograms
> to get a better view of specific components.  I calculate the dissimilarity
> matrix using daisy() because I have a mixture of variable types: factors,
> ordered factors and numerical variables.  If I want one dendrogram, I use
> agnes() for the agglomerative nesting and pltree() to draw the dendrogram.
> That way, I get the row names as labels, but I can't cut the tree.
>
> Alternatively, I use hclust() on the dissimilarity matrix from daisy().
> This allows me to cut the dendrogram with cutree(), but I lose the labels,
> so that isn't much use.  I can change the output from hclust() to class
> dendrogram with as.dendrogram().  This has a rather neat way of cutting the
> dendrogram with cut.dendrogram(), which allows you to show specific lower
> sections of the dendrogram with plot.dendrogram(object$lower[[1]]).  Again, I
> lose the labels.
>
> Does anyone know how to keep the row names as labels when starting with
> daisy() and ending with plot.dendrogram()?  A couple of months ago, I had a
> look at the code for as.hclust() and managed to change it so that I could
> keep the labels, but now I don't remember how I got to see the code.  When I
> type as.hclust, I get "function(x,...) UseMethod("as.hclust")".
>
> Also, does anyone know how to get a horizontal dendrogram so that the labels
> are readable? Ideally with the labels to the right??
> Any help would be greatly appreciated.
>
> Best wishes,
> Mikkel

If your data are "spatial", that is, they can be identified through two
dimensional coordinates, you could try the mapping techniques in package
maptree.  Groups of observations (rows) in higher level clusters can be
given the same symbol/color to show cluster patterns.  But the labels of
individual observations are not preserved through this process either.
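
For the dendrogram route described in the question, one possible way to
keep the row names is to set the labels on the hclust object explicitly
before converting it.  The following is only a sketch, not a tested recipe
for your data: the data frame `df` and its row names are made up for
illustration, and it assumes an R version in which plot() for dendrograms
accepts horiz = TRUE.

```r
library(cluster)   # daisy()

# Hypothetical mixed-type data with row names (illustration only)
set.seed(1)
df <- data.frame(
  size  = rnorm(12),
  kind  = factor(sample(c("a", "b", "c"), 12, replace = TRUE)),
  grade = factor(sample(c("low", "mid", "high"), 12, replace = TRUE),
                 levels = c("low", "mid", "high"), ordered = TRUE),
  row.names = paste0("site", 1:12)
)

d  <- daisy(df)                        # dissimilarities for mixed variable types
hc <- hclust(d, method = "average")
hc$labels <- rownames(df)              # set the labels explicitly in case they were dropped

groups <- cutree(hc, k = 3)            # cluster membership, named by row name

dend  <- as.dendrogram(hc)
parts <- cut(dend, h = median(hc$height))   # list with $upper and $lower

# Pick the largest lower branch and draw it horizontally
sizes <- sapply(parts$lower, attr, "members")
big   <- parts$lower[[which.max(sizes)]]

op <- par(mar = c(4, 2, 1, 6))         # extra right margin for the labels
plot(big, horiz = TRUE)                # leaves, and so labels, end up on the right
par(op)
```

To look at the code behind a generic such as as.hclust, something like
getS3method("as.hclust", "twins") should print the method used for agnes
objects.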

There is a new function in that package, kgs(), that calculates, using a
penalty function, an optimal size to which to prune a dendrogram.
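
A short sketch of how kgs() might be used, following the interface as I
understand it from the package documentation (a clustering object, its
dissimilarities, and an optional maxclust).  The votes.repub data from
package cluster stand in for real data here, and the choice of
maxclust = 20 is arbitrary.

```r
library(cluster)   # agnes(), votes.repub
library(maptree)   # kgs()

data(votes.repub)
a <- agnes(votes.repub, method = "ward")

# Penalty score for each candidate number of clusters;
# the names of the result are the cluster counts
b <- kgs(a, a$diss, maxclust = 20)
plot(names(b), b, xlab = "# clusters", ylab = "penalty")

# The size with the smallest penalty is the suggested pruning size
best_k <- as.integer(names(b)[which.min(b)])
groups <- cutree(as.hclust(a), k = best_k)
```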

