[BioC] Extracting dendogram information from Heatmaps

alison waller alison.waller at utoronto.ca
Thu Dec 13 18:50:43 CET 2007


Thanks everyone, these are great suggestions.

I had trouble with identify(): the plot moved when I clicked the mouse,
and I got error messages.

cutree() worked well. However, I get a matrix of values corresponding to
clusters, and I can't tell whether cluster one is the leftmost or the
rightmost cluster, i.e. how are the clusters ordered?
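
On the ordering question: as far as I can tell, cutree() numbers clusters by
the order in which observations first appear in the original data (so row 1
always lands in cluster 1), not from left to right across the dendrogram. A
minimal sketch on made-up data (the matrix m and all object names here are
illustrative, not taken from the thread) that recovers the left-to-right
order by indexing the assignments with the tree order:

```r
set.seed(42)
# Illustrative data only: 10 "genes" by 5 "arrays"
m <- matrix(rnorm(50), 10, 5,
            dimnames = list(paste0("g", 1:10), paste0("t", 1:5)))
hr <- hclust(dist(m))
mycl <- cutree(hr, k = 3)

# Cluster ids in the order they appear from left to right in plot(hr)
left_to_right <- unique(mycl[hr$order])

# Members of each cluster, clusters and members both in dendrogram order
members <- split(hr$labels[hr$order],
                 factor(mycl[hr$order], levels = left_to_right))
print(left_to_right)
print(members)
```

The first element of left_to_right is the id of the leftmost cluster in the
plotted tree, the last element the id of the rightmost one.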

The $labels method seems the best, but my matrix doesn't seem to have labels.
I made my matrix from the M values of an MAList; is there a way to carry the
gene names through?

Myclust <- hclust(dist(MA$M[fitp,]))
Myclust$labels   # gives NULL
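
hclust() takes its labels from the row names of the matrix passed to dist(),
so a NULL here suggests the M matrix has no row names. A minimal sketch with a
plain matrix standing in for MA$M[fitp,]; the gene_ids vector is hypothetical,
and with a real MAList the identifiers would have to come from whichever
column of MA$genes holds them (an assumption, since that depends on the
platform and how the data were read):

```r
set.seed(1)
M <- matrix(rnorm(30), 6, 5)        # stand-in for MA$M[fitp, ], no row names
gene_ids <- paste0("gene", 1:6)     # hypothetical identifiers

no_names <- hclust(dist(M))
print(no_names$labels)              # NULL: there are no names to carry through

rownames(M) <- gene_ids             # attach the names before clustering
with_names <- hclust(dist(M))
print(with_names$labels[with_names$order])   # names in dendrogram order
```

Setting the row names once on the matrix also makes them show up as row
labels in heatmap() and heatmap.2().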

Thanks again,

alison


-----Original Message-----
From: Thomas Girke [mailto:thomas.girke at ucr.edu] 
Sent: Thursday, December 13, 2007 12:00 PM
To: James W. MacDonald
Cc: alison waller; bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] Extracting dendogram information from Heatmaps

Alison,

In addition to James' suggestions, you may want to get familiar with how to
access the different data components of the resulting hclust object (e.g.
labels, order) and with the cutree() function. If you can't read the labels
in the plots, you can always extract them as plain text in the corresponding
tree order from the hclust object (see hr$labels[hr$order] below).

Here is a short example to illustrate a possible hclust-heatmap/heatmap.2
routine:

# Generate a sample matrix
y <- matrix(rnorm(50), 10, 5, dimnames=list(paste("g", 1:10, sep=""),
paste("t", 1:5, sep=""))) 

# Cluster rows and columns by correlation distance
hr <- hclust(as.dist(1-cor(t(y), method="pearson"))) 
hc <- hclust(as.dist(1-cor(y, method="spearman"))) 

# Obtain discrete clusters with cutree
mycl <- cutree(hr, h=max(hr$height)/1.5)

# Print the row labels in the order they appear in the tree
hr$labels[hr$order]
# Prints the row labels and cluster assignments
sort(mycl) 

# Some color selection steps
mycolhc <- sample(rainbow(256))
mycolhc <- mycolhc[as.vector(mycl)]

# Plot the data matrix as a heatmap and the cluster results as dendrograms
# with heatmap or heatmap.2, and show the cutree() results in a color bar.
heatmap(y, Rowv=as.dendrogram(hr), Colv=as.dendrogram(hc), scale="row",
RowSideColors=mycolhc) 

library("gplots") 
heatmap.2(y, Rowv=as.dendrogram(hr), Colv=as.dendrogram(hc),
col=redgreen(75), scale="row", 
ColSideColors=heat.colors(length(hc$labels)), RowSideColors=mycolhc,
trace="none", key=TRUE, cellnote=round(t(scale(t(y))),1))
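
Since the original question also asked about saving the clusters to a file,
here is one possible sketch. It regenerates the same kind of objects as the
example above so that it runs on its own; the output file name and location
are arbitrary choices, not anything prescribed by hclust or cutree.

```r
set.seed(123)
y <- matrix(rnorm(50), 10, 5,
            dimnames = list(paste0("g", 1:10), paste0("t", 1:5)))
hr <- hclust(as.dist(1 - cor(t(y), method = "pearson")))
mycl <- cutree(hr, h = max(hr$height) / 1.5)

# One row per gene, in the order the genes appear in the row dendrogram
clusters <- data.frame(gene = hr$labels[hr$order],
                       cluster = mycl[hr$order],
                       stringsAsFactors = FALSE)

# Write a tab-delimited table; the file name is an arbitrary example
outfile <- file.path(tempdir(), "row_clusters.txt")
write.table(clusters, file = outfile, sep = "\t",
            quote = FALSE, row.names = FALSE)
```

The resulting file can be opened in a spreadsheet or read back with
read.delim().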


Best, 
Thomas

On Thu 12/13/07 09:58, James W. MacDonald wrote:
> Hi Alison,
> 
> alison waller wrote:
> > Hello Everyone,
> > 
> >  
> > 
> > I've been using heatmap and heatmap.2 to draw heatmaps for my
> > experiments.
> > 
> > I have a heatmap of the M values of 6 arrays for the spots whose
> > p-values were <0.005 (from eBayes).
> > 
> > However, I would like to see which spots it has grouped together in
> > the row dendrogram.  Is there a way I can extract the information
> > about the spots that are clustered together?  I cannot read the row
> > names, and even if I could I was hoping there would be some way to
> > list the clusters and save them to a file.
> 
> There are two ways to do this that I know of. And either can be a pain, 
> depending on how big the dendrogram is.
> 
> Both methods require you to construct your dendrogram first. You can 
> then choose the clusters with the mouse. This might be more difficult if 
> you have some gigantic dendrogram and have ingested too much coffee ;-D.
> 
> Normally, one would simply do
> 
> heatmap(mymatrix, otherargs)
> 
> and accept the default clustering method. However, you can always 
> pre-construct the dendrograms and then feed those to heatmap().
> 
> Rowv <- as.dendrogram(hclust(dist(mymatrix)))
> Colv <- as.dendrogram(hclust(dist(t(mymatrix))))
> 
> heatmap(mymatrix, Rowv=Rowv, Colv=Colv, otherargs)
> 
> Now if you do something like that, then you can try
> 
> hr <- hclust(dist(mymatrix))  # identify() works on the hclust object
> plot(hr)
> a.cluster <- identify(hr)
> 
> and then use your mouse to choose the upper left corner of a rectangle 
> that encompasses the cluster you are interested in. Here is where the 
> size of the dendrogram and the amount of coffee comes in. If the 
> dendrogram is really large then identify() may not be able to figure out 
> what you are trying to select, or may decide you are choosing the upper 
> right corner.
> 
> You can choose as many clusters as you want, and they will be in the 
> list a.cluster, in the order you selected.
> 
> A more programmatic method is to use rect.hclust() and either choose the 
> height at which to make the cuts, or the number of clusters, etc. Again, 
> depending on the size of your dendrogram, this may work well or it may 
> be painful.
> 
> Best,
> 
> Jim
> 
> 
> > 
> > Thanks,
> > 
> > Alison
> > 
> > ******************************************
> > Alison S. Waller  M.A.Sc.
> > Doctoral Candidate
> > awaller at chem-eng.utoronto.ca
> > 416-978-4222 (lab)
> > Department of Chemical Engineering
> > Wallberg Building
> > 200 College st.
> > Toronto, ON
> > M5S 3E5
> > 
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> 
> -- 
> James W. MacDonald, M.S.
> Biostatistician
> Affymetrix and cDNA Microarray Core
> University of Michigan Cancer Center
> 1500 E. Medical Center Drive
> 7410 CCGC
> Ann Arbor MI 48109
> 734-647-5623
> 

-- 
Thomas Girke
Assistant Professor of Bioinformatics
Director, IIGB Bioinformatic Facility
Center for Plant Cell Biology (CEPCEB)
Institute for Integrative Genome Biology (IIGB)
Department of Botany and Plant Sciences
1008 Noel T. Keen Hall
University of California
Riverside, CA 92521

E-mail: thomas.girke at ucr.edu
Website: http://faculty.ucr.edu/~tgirke
Ph: 951-827-2469
Fax: 951-827-4437


