[R] finding centroids of clusters created with hclust

Moritz Lennert mlennert at club.worldonline.be
Wed May 10 18:59:35 CEST 2006


Replying to myself for the record:

Moritz Lennert wrote:
> Hello,
> 
> Can someone point me to documentation or ideas on how to calculate the 
> centroids of clusters identified with hclust?
> 
> I would like to be able to choose the number of clusters (in the style of 
> cutree) and then get the centroids of these clusters.
> 
> This seems like a quite obvious task to me, but I haven't been able to 
> put my hands on a relevant command.

Here's a simple function that does the job for me:

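Variables:

data: matrix of the original (absolute value) data used for the hclust or 
HierClust call; it needs row names, because rows are looked up by the 
names of clust
clust: result of a 'cutree' call on the result of the hclust or 
HierClust call

Value:

a matrix with one row per type (cluster) and one column per variable, 
holding the mean of each variable over the members of that type, i.e. 
the centroid
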

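function (data, clust) {
   # data: matrix with row names; clust: named vector returned by cutree
   nvars <- ncol(data)
   ntypes <- max(clust)
   centroids <- matrix(0, ncol = nvars, nrow = ntypes)
   for (i in seq_len(ntypes)) {
      total <- rep(0, nvars)   # running sum of the rows in cluster i
      n <- 0                   # number of observations in cluster i
      for (j in names(clust[clust == i])) {
         n <- n + 1
         total <- total + data[j, ]
      }
      centroids[i, ] <- total / n   # mean of each variable in cluster i
   }
   rownames(centroids) <- seq_len(ntypes)
   colnames(centroids) <- colnames(data)
   centroids
}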
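
The same per-cluster means can probably be obtained without explicit loops. The following is only a sketch: the name centroids_by_cluster is hypothetical, and it assumes that the rows of data are in the same order as the elements of clust (as they are when clust comes from cutree on a clustering of that same matrix), rather than matching them by name:

```r
# Hypothetical helper (name not from the original post).
# Assumes row i of `data` belongs to cluster clust[i].
centroids_by_cluster <- function(data, clust) {
  sums <- rowsum(data, group = clust)   # per-cluster column sums, rows sorted by cluster
  sizes <- as.vector(table(clust))      # per-cluster counts, in the same sorted order
  sums / sizes                          # row i is divided by the size of cluster i
}

# Small hand-made example: two variables, two clusters
m <- matrix(c(1, 3, 10, 20,
              2, 4, 30, 50),
            ncol = 2, dimnames = list(letters[1:4], c("x", "y")))
cl <- c(a = 1, b = 1, c = 2, d = 2)
centroids_by_cluster(m, cl)
```

Here cluster 1 contains rows a and b and cluster 2 contains rows c and d, so each row of the result is the column-wise mean of the corresponding pair of rows.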

Moritz



