[BioC] Extracting dendogram information from Heatmaps
James W. MacDonald
jmacdon at med.umich.edu
Fri Dec 14 21:41:42 CET 2007
Are you sure that your Rowv is a dendrogram? This works for me:
> dat <- matrix(rnorm(1000), ncol=10)
> Rowv <- as.dendrogram(hclust(dist(dat)))
> Colv <- as.dendrogram(hclust(dist(t(dat))))
> heatmap.2(dat, Rowv=Rowv, Colv=Colv)
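One way to answer that question directly is to test the class of the objects before plotting. The sketch below is a minimal check on simulated data, not a diagnosis of your session; row_ok, col_ok and n_row_leaves are illustrative names that do not come from your code. If inherits() returns FALSE for your own Rowv, then heatmap.2 is not receiving a dendrogram.

```r
# Simulated data, same shape as the example above (assumption: 100 x 10)
set.seed(1)
dat <- matrix(rnorm(1000), ncol = 10)

# Pre-compute the row and column trees once
Rowv <- as.dendrogram(hclust(dist(dat)))
Colv <- as.dendrogram(hclust(dist(t(dat))))

# Both objects should carry the "dendrogram" class
row_ok <- inherits(Rowv, "dendrogram")
col_ok <- inherits(Colv, "dendrogram")

# The root node records the number of leaves; it should match the
# number of rows of the matrix being plotted
n_row_leaves <- attr(Rowv, "members")

print(c(row_ok = row_ok, col_ok = col_ok))
print(n_row_leaves == nrow(dat))
```

When both checks pass, heatmap.2(dat, Rowv=Rowv, Colv=Colv) from gplots should reuse the supplied trees rather than recomputing them.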
Best,
Jim
alison waller wrote:
> No, it doesn't appear as if the matrix is mixed up; the rows are equal in
> length.
>
> Interestingly, I can use Rowv=Rowv and Colv=Colv with the heatmap command
> successfully, but not with the heatmap.2 command.
>
>
> -----Original Message-----
> From: 'Thomas Girke' [mailto:thomas.girke at ucr.edu]
> Sent: Friday, December 14, 2007 12:52 AM
> To: alison waller
> Cc: 'James W. MacDonald'; bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] Extracting dendogram information from Heatmaps
>
> You may have mixed up the orientation of your matrix. Does
> length(myclust$labels) correspond to the number of rows in
> myma?
>
> Thomas
>
> On Fri 12/14/07 00:05, alison waller wrote:
>> Great, I passed genes names to the matrix through rownames(myma).
>>
>> Your example below is great!
>>
>> However, I would like to use heatmap.2 and am having trouble specifying the
>> row clustering. I made Rowv and Colv,
>> Rowv=as.dendrogram(myclust),Colv<-as.dendrogram(hclust(dist(t(myma)))), and
>> thought I could just create the heatmap as below.
>> But it won't let me pass it Rowv or Colv, so the computer has to recalculate
>> the clusters every time I want to change colors or something little (this
>> takes a while). Is there something wrong with my syntax?
>>
>>
>>
>> heatmap.2(myma,Rowv=Rowv,Colv=Colv,col=topo.colors(75),
>>           RowSideColors=mycolhc42,trace='none',labRow=FALSE,key=T)
>> Warning messages:
>> 1: gamma cannot be modified on this device
>> 2: Discrepancy: Rowv is FALSE, while dendrogram is `both'. Omitting row
>> dendogram. in: heatmap.2(myma, Rowv = Rowv, Colv = Colv, col =
>> topo.colors(75),
>> 3: Discrepancy: Colv is FALSE, while dendrogram is `none'. Omitting column
>> dendogram. in: heatmap.2(myma, Rowv = Rowv, Colv = Colv, col =
>> topo.colors(75),
>>
>> heatmap.2(myma,Rowv=Rowv,col=topo.colors(75),RowSideColors=mycolhc42,
>>           trace='none',labRow=FALSE,key=T)
>> Warning message:
>> Discrepancy: Rowv is FALSE, while dendrogram is `column'. Omitting row
>> dendogram. in: heatmap.2(myma, Rowv = Rowv, col = topo.colors(75),
>> RowSideColors = mycolhc42,
>>> ?heatmap.2
>> -----Original Message-----
>> From: 'Thomas Girke' [mailto:thomas.girke at ucr.edu]
>> Sent: Thursday, December 13, 2007 1:21 PM
>> To: alison waller
>> Cc: 'James W. MacDonald'; bioconductor at stat.math.ethz.ch
>> Subject: Re: [BioC] Extracting dendogram information from Heatmaps
>>
>> The best way to answer these questions is to subset your data set
>> to a test matrix with only a few rows. This way you can see the labels
>> in the plot and things become intuitive.
>> For example:
>> myma <- MA$M[fitp,]
>> myma <- myma[1:20,]
>>
>> Row names can always be assigned by you with
>> rownames(myma) <- mynames
>>
>> If your data set has a label column, then it would be
>> rownames(myma) <- myname$label
>>
>> To be sure your data set is a matrix, you do:
>> myma <- as.matrix(myma)
>>
>> Continue with hclust ...
>>
>> Thomas
>>
>> On Thu 12/13/07 12:50, alison waller wrote:
>>> Thanks everyone, these are great suggestions.
>>>
>>> I had trouble with identify(), as the plot moved when I clicked the mouse
>>> and I got error messages.
>>>
>>> The cutree worked well - however, I see a matrix which has values
>>> corresponding to clusters, but is cluster one the leftmost or rightmost
>>> cluster? I.e., how are they ordered?
>>>
>>> The $labels method seems the best but my matrix doesn't seem to have labels.
>>> I made my matrix from the M values from an MAList; is there a way to carry
>>> through the gene names?
>>>
>>> Myclust<-hclust(dist(MA$M[fitp,]))
>>> Myclust$labels gives NULL
>>>
>>> Thanks again,
>>>
>>> alison
>>>
>>>
>>> -----Original Message-----
>>> From: Thomas Girke [mailto:thomas.girke at ucr.edu]
>>> Sent: Thursday, December 13, 2007 12:00 PM
>>> To: James W. MacDonald
>>> Cc: alison waller; bioconductor at stat.math.ethz.ch
>>> Subject: Re: [BioC] Extracting dendogram information from Heatmaps
>>>
>>> Alison,
>>>
>>> In addition to James' suggestions, you may want to get familiar with how to
>>> access the different data components of the resulting hclust object (e.g.
>>> labels, order) and the cutree() function. If you can't read the labels in
>>> the plots, then you can always extract them as clean text in the
>>> corresponding tree order (see below: hr$labels[hr$order]) from the hclust
>>> objects.
>>>
>>> Here is a short example to illustrate a possible hclust-heatmap/heatmap.2
>>> routine:
>>>
>>> # Generate a sample matrix
>>> y <- matrix(rnorm(50), 10, 5, dimnames=list(paste("g", 1:10, sep=""),
>>> paste("t", 1:5, sep="")))
>>>
>>> # Cluster rows and columns by correlation distance
>>> hr <- hclust(as.dist(1-cor(t(y), method="pearson")))
>>> hc <- hclust(as.dist(1-cor(y, method="spearman")))
>>>
>>> # Obtain discrete clusters with cutree
>>> mycl <- cutree(hr, h=max(hr$height)/1.5)
>>>
>>> # Prints the row labels in the order they appear in the tree.
>>> hr$labels[hr$order]
>>> # Prints the row labels and cluster assignments
>>> sort(mycl)
>>>
>>> # Some color selection steps
>>> mycolhc <- sample(rainbow(256))
>>> mycolhc <- mycolhc[as.vector(mycl)]
>>>
>>> # Plot the data matrix as heatmap and the cluster results as dendrograms
>>> # with heatmap or heatmap.2
>>> # and show the cutree() results in color bar.
>>> heatmap(y, Rowv=as.dendrogram(hr), Colv=as.dendrogram(hc), scale="row",
>>> RowSideColors=mycolhc)
>>>
>>> library("gplots")
>>> heatmap.2(y, Rowv=as.dendrogram(hr), Colv=as.dendrogram(hc),
>>> col=redgreen(75), scale="row",
>>> ColSideColors=heat.colors(length(hc$labels)), RowSideColors=mycolhc,
>>> trace="none", key=T, cellnote=round(t(scale(t(y))),1))
>>>
>>>
>>> Best,
>>> Thomas
>>>
>>> On Thu 12/13/07 09:58, James W. MacDonald wrote:
>>>> Hi Alison,
>>>>
>>>> alison waller wrote:
>>>>> Hello Everyone,
>>>>>
>>>>>
>>>>>
>>>>> I've been using heatmap and heatmap.2 to draw heatmaps for my experiments.
>>>>>
>>>>> I have a heatmap of the M values of 6 arrays for the spots whose p-values
>>>>> were <0.005 (from eBayes).
>>>>>
>>>>> However, I would like to see which spots it has grouped together in the row
>>>>> dendrogram. Is there a way I can extract the information about the spots
>>>>> that are clustered together? I cannot read the row names, and even if I
>>>>> could I was hoping there would be some way to list the clusters and save
>>>>> them to a file.
>>>> There are two ways to do this that I know of, and either can be a pain,
>>>> depending on how big the dendrogram is.
>>>>
>>>> Both methods require you to construct your dendrogram first. You can
>>>> then choose the clusters with the mouse. This might be more difficult if
>>>> you have some gigantic dendrogram and have ingested too much coffee ;-D.
>>>> Normally, one would simply do
>>>>
>>>> heatmap(mymatrix, otherargs)
>>>>
>>>> and accept the default clustering method. However, you can always
>>>> pre-construct the dendrograms and then feed those to heatmap().
>>>>
>>>> Rowv <- as.dendrogram(hclust(dist(mymatrix)))
>>>> Colv <- as.dendrogram(hclust(dist(t(mymatrix))))
>>>>
>>>> heatmap(mymatrix, Rowv=Rowv, Colv=Colv, otherargs)
>>>>
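>>>> Now if you do something like that, then you can try the following. Note
>>>> that identify() has a method for hclust objects rather than dendrograms,
>>>> so plot and query the hclust result itself:
>>>>
>>>> hr <- hclust(dist(mymatrix))
>>>> plot(hr)
>>>> a.cluster <- identify(hr)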
>>>>
>>>> and then use your mouse to choose the upper left corner of a rectangle
>>>> that encompasses the cluster you are interested in. Here is where the
>>>> size of the dendrogram and the amount of coffee come in. If the
>>>> dendrogram is really large then identify() may not be able to figure out
>>>> what you are trying to select, or may decide you are choosing the upper
>>>> right corner.
>>>>
>>>> You can choose as many clusters as you want, and they will be in the
>>>> list a.cluster, in the order you selected.
>>>>
>>>> A more programmatic method is to use rect.hclust() and either choose the
>>>> height at which to make the cuts, or the number of clusters, etc. Again,
>>>> depending on the size of your dendrogram, this may work well or it may
>>>> be painful.
>>>>
>>>> Best,
>>>>
>>>> Jim
>>>>
>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>>
>>>>>
>>>>> Alison
>>>>>
>>>>>
>>>>>
>>>>> ******************************************
>>>>> Alison S. Waller M.A.Sc.
>>>>> Doctoral Candidate
>>>>> awaller at chem-eng.utoronto.ca
>>>>> 416-978-4222 (lab)
>>>>> Department of Chemical Engineering
>>>>> Wallberg Building
>>>>> 200 College st.
>>>>> Toronto, ON
>>>>> M5S 3E5
>>>>>
>>>>> _______________________________________________
>>>>> Bioconductor mailing list
>>>>> Bioconductor at stat.math.ethz.ch
>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>> --
>>>> James W. MacDonald, M.S.
>>>> Biostatistician
>>>> Affymetrix and cDNA Microarray Core
>>>> University of Michigan Cancer Center
>>>> 1500 E. Medical Center Drive
>>>> 7410 CCGC
>>>> Ann Arbor MI 48109
>>>> 734-647-5623
>>> --
>>> Thomas Girke
>>> Assistant Professor of Bioinformatics
>>> Director, IIGB Bioinformatic Facility
>>> Center for Plant Cell Biology (CEPCEB)
>>> Institute for Integrative Genome Biology (IIGB)
>>> Department of Botany and Plant Sciences
>>> 1008 Noel T. Keen Hall
>>> University of California
>>> Riverside, CA 92521
>>>
>>> E-mail: thomas.girke at ucr.edu
>>> Website: http://faculty.ucr.edu/~tgirke
>>> Ph: 951-827-2469
>>> Fax: 951-827-4437
>
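On the earlier questions in this thread (how cutree() numbers its clusters, and how to list the clusters and save them to a file), here is a minimal sketch using the simulated matrix from Thomas's example. The choice of k = 3, the names clusters and outfile, and the tab-delimited temporary file are assumptions for illustration only. In my understanding, cutree() numbers clusters by first appearance in the original row order, so cluster 1 is the one containing the first row and not necessarily the leftmost branch of the plot.

```r
# Simulated matrix with gene (row) and treatment (column) names
set.seed(1)
y <- matrix(rnorm(50), 10, 5,
            dimnames = list(paste("g", 1:10, sep = ""),
                            paste("t", 1:5, sep = "")))

# Cluster rows by Pearson correlation distance
hr <- hclust(as.dist(1 - cor(t(y), method = "pearson")))

# Cut the tree into a fixed number of clusters (k = 3 is arbitrary here)
mycl <- cutree(hr, k = 3)

# List genes in the left-to-right order of the dendrogram,
# together with the cluster each one was assigned to
clusters <- data.frame(gene = hr$labels[hr$order],
                       cluster = mycl[hr$order],
                       stringsAsFactors = FALSE)
print(clusters)

# Save the listing as a tab-delimited text file (temporary path here)
outfile <- tempfile(fileext = ".txt")
write.table(clusters, file = outfile, sep = "\t",
            quote = FALSE, row.names = FALSE)
```

Reading down the cluster column then shows which cluster number sits at which position in the tree, which may be easier than trying to read labels off a large plot.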
--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623