[R] Get Details About Clusters

Peter Langfelder peter.langfelder at gmail.com
Thu Mar 15 22:00:46 CET 2012


On Thu, Mar 15, 2012 at 1:48 PM, A J <anxusgo at hotmail.com> wrote:
>
> Hi everybody!
> Does anybody know how I can get detailed information about clusters after using hclust?
> The issue is that if I have some items in different clusters, I would like to get the cluster where each item is placed.
> Since my data set is too large, a dendrogram or other graphic is not useful; what I really need is something like a simple table with item label and cluster name, for instance.
> Is it possible to do this in any way in R?
>
> Here is the code example I am starting from:
>
> a<-replicate(2000, rnorm(2000))
> b<-hclust(as.dist(a), method="ward", members=NULL)
>
> And this is the information that I get:
>
> structure(list(merge = structure(c(-6L, -5L, -7L, -3L, -1L, -2L, 3L, 4L, 5L, -10L, -9L, -8L, 1L, -4L, 2L, 6L, 7L, 8L), .Dim = c(9L, 2L)), height = c(-2.16431780288644, -1.77785380974643, -1.72883152083299, -1.02930929735342, -0.957628473035096, -0.687733358846453, 1.62427849392232, 2.78818645913762, 3.01723103257677), order = c(1L, 4L, 3L, 6L, 10L, 7L, 8L, 2L, 5L, 9L), labels = NULL, method = "ward", call = quote(hclust(d = as.dist(a),     method = "ward", members = NULL)), dist.method = NULL), .Names = c("merge", "height", "order", "labels", "method", "call", "dist.method"), class = "hclust")
>
> I just need every item with its corresponding cluster in a more or less organized way. Of course, there is no problem in using different functions or libraries (so far I have not found anything suited to my needs). Advice or pointers are welcome and appreciated!

hclust by itself does not generate clusters; rather, it generates a
clustering tree. You need to identify branches (clusters) in the tree
using a "branch cutting" method. This typically entails choosing one
or more parameters that specify how sensitive the cut method should be
to branch splits.

You can do that in several ways. Simple tree cut is implemented in the
function cutree (package stats). You can specify the number of
clusters or the cut height. More advanced methods are implemented in
the function cutreeDynamic in the dynamicTreeCut package (shameless
plug alert - I'm the maintainer). Examples of use and results from the
dynamicTreeCut package can be seen at

http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/BranchCutting/

Our group has used the dynamicTreeCut methods extensively in
clustering gene expression data.
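
As a minimal sketch of the simple tree cut described above, the code
below builds the kind of item/cluster table asked for in the question.
The toy data, the item labels, the "average" linkage, the choice of
k = 4 and the median cut height are all assumptions made for
illustration and are not from the original post (the "ward" method
used there was later renamed "ward.D" in R). The cutreeDynamic call at
the end runs only if the dynamicTreeCut package is installed, and its
minClusterSize value is likewise an arbitrary choice.

```r
set.seed(1)

# Hypothetical toy data: 20 labelled items with 5 measurements each
x <- matrix(rnorm(20 * 5), nrow = 20,
            dimnames = list(paste0("item", 1:20), NULL))

# Distance matrix and clustering tree (no clusters yet, only a tree)
d    <- dist(x)
tree <- hclust(d, method = "average")

# Simple tree cut: either ask for a number of clusters ...
cl_k <- cutree(tree, k = 4)
# ... or cut at a given height (here the median merge height)
cl_h <- cutree(tree, h = median(tree$height))

# A plain table with one row per item: label and cluster number
clusters <- data.frame(item    = names(cl_k),
                       cluster = unname(cl_k),
                       stringsAsFactors = FALSE)
print(head(clusters))
print(table(clusters$cluster))   # cluster sizes

# Optional: dynamic branch cutting, only if the package is available.
# The "hybrid" method needs the distance matrix in addition to the tree;
# label 0 in its output marks items left unassigned.
if (requireNamespace("dynamicTreeCut", quietly = TRUE)) {
  cl_dyn <- dynamicTreeCut::cutreeDynamic(dendro = tree,
                                          distM = as.matrix(d),
                                          method = "hybrid",
                                          minClusterSize = 3)
  print(table(cl_dyn))
}
```

For a large data set, the clusters data frame can be written to a file
with write.csv(clusters, "clusters.csv", row.names = FALSE) instead of
being printed.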

HTH,

Peter


