[R] indexing into and modifying dendrograms

Jenny Bryan jenny at stat.ubc.ca
Mon Jul 11 20:48:32 CEST 2005


I would like to exert certain types of control over the plotting of
dendrograms (representing hierarchical clusterings) that I think are
best achieved by modifying the dendrogram object prior to plotting.
I am using the "dendrogram" class and associated methods.

Define the cluster number of each cluster as the row of the merge
matrix (the "merge" component of the hclust result) that describes its
formation.  So, if you are clustering m objects, the cluster numbers
range from 1 to m-1, with cluster m-1 containing all m objects, by
definition.  I basically want a way to index into an object of class
dendrogram using the above definition of cluster number and/or to act
on a dendrogram, where I specify the target node using cluster number.
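
As a concrete illustration of this numbering, here is a minimal sketch
using the same five points as the example further down (the name "hc"
is just chosen here for illustration).  Per the hclust documentation,
negative entries in the merge matrix denote single objects and positive
entries refer to clusters formed at earlier rows.

```r
## five objects with 2-dimensional features
pts <- rbind(c(2, 1.6), c(1.8, 2.4), c(2.1, 2.7), c(5, 2.6), c(4.7, 3.1))
hc <- hclust(dist(pts), method = "single")

## one row per merge step; row k defines what is called
## "cluster number k" in this post
hc$merge

## m objects give m - 1 rows, so cluster numbers run from 1 to m - 1
nrow(hc$merge)
```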

The first application would be to 'flip' the two elements in the target
node of the dendrogram (made clear in the small example below). (The
setting is genomics and I have applications where I want to man-handle
my dendrograms to make certain features of the clustering more obvious
to the naked eye.)  I could imagine other, related actions that would
be useful in decorating dendrograms.

I think I need a function that takes a dendrogram and cluster
number(s) as input and returns the relevant part(s) of the dendrogram
object -- but in a form that makes it easy to then, say, set certain
attributes (perhaps recursively) for the target node(s) (and perhaps
the nodes they contain).  I'm including a small example below that
hopefully illustrates this (it looks long, but it's mostly comments!).
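
One way such a lookup might be sketched, without modifying
as.dendrogram.hclust: recover the members of each cluster from the
merge matrix, then walk the dendrogram to find the node with the same
members.  The function names clusterMembers and findNode are
hypothetical, and returning an index path (to be used with "[[") is
only one possible design, not an established API.

```r
## members of the cluster formed at each merge step (row of hc$merge)
clusterMembers <- function(hc) {
  members <- vector("list", nrow(hc$merge))
  for (k in seq_along(members)) {
    ## negative entry = single object, positive entry = earlier cluster
    parts <- lapply(hc$merge[k, ],
                    function(j) if (j < 0) -j else members[[j]])
    members[[k]] <- sort(as.integer(unlist(parts)))
  }
  members
}

## index path (e.g. c(2, 2)) to the node whose leaves equal 'target';
## integer(0) means the root itself, NULL means no such node
findNode <- function(dend, target, path = integer(0)) {
  if (is.leaf(dend)) return(NULL)
  if (identical(sort(as.integer(unlist(dend))), target)) return(path)
  for (i in seq_along(dend)) {
    hit <- findNode(dend[[i]], target, c(path, i))
    if (!is.null(hit)) return(hit)
  }
  NULL
}

pts <- rbind(c(2, 1.6), c(1.8, 2.4), c(2.1, 2.7), c(5, 2.6), c(4.7, 3.1))
hc <- hclust(dist(pts), method = "single")
dend <- as.dendrogram(hc)

path <- findNode(dend, clusterMembers(hc)[[3]])  # locate cluster 3
node <- dend[[path]]                             # read the subtree
attr(dend[[path]], "edgetext") <- "3"            # or modify it in place
```

The root would need separate handling, since an empty path cannot be
used with "[[".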

Any help would be appreciated.

Jenny Bryan

## get ready for step-by-step figures
par(mfrow = c(2,2))

## get 5 objects, with 2-dimensional features
pts <- rbind(c(2,1.6),
              c(1.8,2.4),
              c(2.1, 2.7),
              c(5,2.6),
              c(4.7,3.1))
plot(pts, xlim = c(0,6), ylim = c(0,4),type = "n",
      xlab = "Feature 1", ylab = "Feature 2")
points(pts,pch = as.character(1:5))

## build a hierarchical tree, store as a dendrogram
aggTree <- hclust(dist(pts), method = "single")
(dend1 <- JB.as.dendrogram.hclust(aggTree))
## NOTE: only thing I added to official version of
## as.dendrogram.hclust:
## each node has an attribute cNum, which gives
## the merge step at which it was formed,
## i.e. gives the row of the merge object which
## describes the formation of that node
## one new line near end of nMerge loop:
## ***************
## *** 51,56 ****
## --- 51,60 ----
##   				     attr(z[[x[2]]], "midpoint"))/2
##   	}
##  	 attr(zk, "height") <- oHgt[k]
## +
## +         ## JB added July 6 2005
## +         attr(zk, "cNum") <- k
## +
##   	z[[k <- as.character(k)]] <- zk
##       }
##       z <- z[[k]]
attributes(dend1)
attributes(dend1[[1]])
## here's a table relating dend1 and the cNum attribute
## dend1               cNum
## -------------------------
## dend1                4
## dend1[[1]]           2
## dend1[[2]]           3
## dend1[[2]][[1]]  <not set>
## dend1[[2]][[2]]      1

## use cNum attribute in "edgetext"
## following example in dendrogram documentation
## would really rather associate with the node than the edge
## but current plotting function has no notion of nodetext
addE <- function(n) {
   if(!is.leaf(n)) {
     attr(n, "edgePar") <- list(p.col="plum")
     attr(n, "edgetext") <- attr(n,"cNum")
   }
   n
}
dend2 <- dendrapply(dend1, addE)
## overlays the cNum ("cluster number") attribute on dendrogram
plot(dend2, main = "dend2")
## why does no plum polygon appear around the '4' for the root
## edge?

## swap order of clusters 2 and 3,
## i.e. 'flip' cluster 4
dend3 <- dend2
dend3[[1]] <- dend2[[2]]
dend3[[2]] <- dend2[[1]]
plot(dend3, main = "dend3")
## wish I could achieve with 'dend3 <- flip(dend2, cNum = 4)'

## swap order of cluster 1 and object 1,
## i.e. 'flip' cluster 3
dend4 <- dend2
dend4[[2]][[1]] <- dend2[[2]][[2]]
dend4[[2]][[2]] <- dend2[[2]][[1]]
plot(dend4, main = "dend4")
## wish I could achieve with 'dend4 <- flip(dend2, cNum = 3)'

## finally, it's clear that the midpoint attribute would also
## need to be modified by 'flip'
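
The wished-for flip() could be sketched roughly as follows.  This is
only an illustration under stated assumptions: it expects each internal
node to carry the "cNum" attribute described above, and for a
self-contained example it sets that attribute by matching node heights
against hc$height, which only works when all merge heights are
distinct (as they are for these five points).  The midpoint is
recomputed as the average of the positions of a node's first and last
child, which is my reading of how as.dendrogram.hclust sets it.

```r
## reverse the children of the node whose "cNum" equals cNum and
## recompute "midpoint" for every internal node on the way back up
flip <- function(dend, cNum) {
  if (is.leaf(dend)) return(dend)
  k <- length(dend)
  kids <- lapply(seq_len(k), function(i) flip(dend[[i]], cNum))
  if (isTRUE(attr(dend, "cNum") == cNum)) kids <- rev(kids)
  for (i in seq_len(k)) dend[[i]] <- kids[[i]]
  ## helpers: a leaf has midpoint 0 and counts as 1 member
  mid <- function(n) if (is.leaf(n)) 0 else attr(n, "midpoint")
  mem <- function(n) if (is.leaf(n)) 1 else attr(n, "members")
  ## horizontal offset of the last child = members in all earlier children
  offset <- sum(vapply(kids[-k], mem, numeric(1)))
  attr(dend, "midpoint") <- (mid(kids[[1]]) + offset + mid(kids[[k]])) / 2
  dend
}

pts <- rbind(c(2, 1.6), c(1.8, 2.4), c(2.1, 2.7), c(5, 2.6), c(4.7, 3.1))
hc <- hclust(dist(pts), method = "single")

## assumption: merge heights are distinct, so height identifies the step
dend <- dendrapply(as.dendrogram(hc), function(n) {
  if (!is.leaf(n)) attr(n, "cNum") <- match(attr(n, "height"), hc$height)
  n
})

dend3 <- flip(dend, cNum = 4)   # swap clusters 2 and 3
dend4 <- flip(dend, cNum = 3)   # swap cluster 1 and object 1
order.dendrogram(dend)
order.dendrogram(dend4)
```

The results can be plotted as before, e.g. plot(dend4, main = "dend4").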



