[R] clustering question ... hclust & kmeans

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Aug 1 17:14:27 CEST 2001


On Wed, 1 Aug 2001, Mark Robinson wrote:

> I am using R 1.3.0 on Windows 2000.
>
> For an experiment, I want to find the most diverse 400 items to
> study among a possible 3200 items.  Diversity here is based on a few
> hundred attributes.  For this, I would like to do a clustering analysis
> and find 400 clusters (i.e. different from each other in some way
> hopefully).  From each of these 400 clusters, I will pick a
> representative.  I expect many of these clusters will have just one
> item.  I am planning to do this using a variety of different clustering
> methods.
>
> What I am wondering is if there is any way to retrieve from hclust the
> cluster membership after I cut off the tree.  That is, if I cut the tree
> to segregate into my 400 clusters, is there any way to find which item
> goes into which cluster ... similar to the way kmeans returns "cluster"?

Have you tried this?  Hierarchical clustering on 3200 items takes quite a
lot of memory, so I hope you have lots.
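
As a rough sketch of where the memory goes: hclust works from a dist
object, which stores one double per pair of items. The figures below count
only that one object and assume 8 bytes per double; working copies made
during clustering come on top of it.

```r
# Size of the pairwise dissimilarity object for n items
n <- 3200
n_dist <- n * (n - 1) / 2          # number of pairwise distances
bytes  <- n_dist * 8               # assuming 8 bytes per double
mib    <- bytes / 2^20             # about 39 MiB for n = 3200
cat(n_dist, "distances, about", round(mib), "MiB\n")
```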

cutree will cut a tree into k (= 400) clusters and return a vector of group
memberships, just like the "cluster" component of kmeans.
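
A minimal sketch of that, on toy data rather than the 3200 items in
question. It assumes a current R, where hclust, cutree and kmeans are in the
default stats package; the data, the linkage method, k = 8 and the "first
member" rule for picking a representative are all illustrative choices, not
recommendations.

```r
# Toy data: 40 items with 5 attributes (a stand-in for the real problem)
set.seed(1)
x <- matrix(rnorm(40 * 5), nrow = 40)
rownames(x) <- paste0("item", 1:40)

# Build the tree from the pairwise distances
hc <- hclust(dist(x), method = "complete")

# Cut the tree into k groups: an integer vector with one entry per item
k <- 8
membership <- cutree(hc, k = k)

# kmeans returns the same kind of vector in its "cluster" component
km <- kmeans(x, centers = k)
same_shape <- length(membership) == length(km$cluster)

# List the items in each cluster, then take the first member of each
# as a crude representative (hypothetical selection rule)
groups <- split(names(membership), membership)
representatives <- sapply(groups, `[`, 1)
```

table(membership) then shows how many items fall into each cluster,
including any singletons.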


-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
