[BioC] dendrograms on heatmap.2 (gplots)
Gavin Koh
gavin.koh at gmail.com
Sun May 29 07:10:38 CEST 2011
Dear Steve, Just for the record, I think I have found a function that
allows drawing of a dendrogram with the leaf order specified. It is
draw.dendrogram {NeatMap}. I cannot see a way of directly drawing the
reordered dendrogram in the heatmap, though, so I still think that
your solution is better :-) Gavin.
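
For the archive, here is a minimal sketch of that solution: k-means on the samples, then ColSideColors to colour the columns. It assumes 10 controls followed by 10 patients, as in Steve's example. The simulated matrix and the names expr, km, known and side.cols are made up for illustration, and the heatmap.2 call only runs if gplots is installed.

```r
# Simulated stand-in for an expression matrix: 50 genes (rows) by
# 20 samples (columns). The first 10 columns play the controls and the
# last 10 the patients; the data and all names here are hypothetical.
set.seed(1)
expr <- matrix(rnorm(50 * 20), nrow = 50,
               dimnames = list(paste0("gene", 1:50), paste0("s", 1:20)))
expr[1:10, 11:20] <- expr[1:10, 11:20] + 3  # shift 10 genes in the patients

# k-means on the samples; kmeans() clusters rows, hence the transpose.
km <- kmeans(t(expr), centers = 2, nstart = 10)

# One colour per column, taken from the k-means cluster of that sample.
side.cols <- c("blue", "red")[km$cluster]

# Cross-tabulate known group against k-means cluster to spot any
# samples that k-means places with the other group.
known <- rep(c("control", "patient"), each = 10)
print(table(known, km$cluster))

# Draw the heatmap if gplots is available. The column dendrogram still
# comes from the default clustering, so no branches cross.
if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(expr, ColSideColors = side.cols, trace = "none")
}
```

To colour the columns by known group rather than by k-means cluster, as in Steve's original example, ifelse(known == "control", "blue", "red") could be passed as ColSideColors instead.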
On 28 May 2011 19:53, Gavin Koh <gavin.koh at gmail.com> wrote:
> Dear Steve,
>
> Yes, I expect that in preserving the order in which I have the samples
> currently, the branches will cross.
> You are right: it will be clearer to cluster by k-means then use
> ColSideColors to colour the leaves than to try to draw a dendrogram
> with criss-crossing branches. Thanks for helping me think this
> through.
>
> Gavin.
>
> On 28 May 2011 18:20, Steve Lianoglou <mailinglist.honeypot at gmail.com> wrote:
>> Hi Gavin,
>>
>> On Sat, May 28, 2011 at 11:06 AM, Gavin Koh <gavin.koh at gmail.com> wrote:
>>> Dear Steve, I have healthy controls and patients, so two groups.
>>> k-means misclassifies a few study subjects, but by and large,
>>> redrawing the dendrogram while preserving the ordering is not going to
>>> seriously mess things up.
>>
>> Sorry if my post came across in the wrong way -- I'm not trying to
>> imply that you are trying to show something that isn't true, or
>> something ... I'm actually not sure how you interpreted my email,
>> because I'm not sure what you're trying to say in your reply, so let
>> me try another way :-)
>>
>> I guess my point is that: yes, you have two groups when you condition
>> group assignment based on a state we call "healthy" and "affected" (or
>> whatever you call them here).
>>
>> If you are asking to group your patients in a different way -- this
>> time using your gene expression profiles -- it's not totally unusual
>> for things to change a bit.
>>
>> So, again, I'm not trying to lecture here, but this is the way I
>> understand it. If I'm wrong, feel free to correct me:
>>
>> The distances we "walk along" the arms/branches of the dendrogram say
>> something about the distance between the "things" they are connecting.
>> If you didn't change any params in your heatmap call, the distance
>> between your vectors defaults to Euclidean distance, and that just
>> is what it is. The dendrogram is then drawn to
>> respect those distances. If you move things around, then you are
>> saying something different about those distances, right?
>>
>> In this context, I'm confused about your point when you say "redrawing
>> the dendrogram while preserving the ordering is not going to seriously
>> mess things up" -- what ordering do you expect to be preserved ... is
>> it the columns of the matrix that you passed in? If you don't want to
>> move those columns around, then do you want the branches of the tree
>> to criss-cross or something?
>>
>> The way I see it, you are kind of stuck if you intend to draw a
>> dendrogram at all.
>>
>> So -- how can we move things around in a natural way?
>>
>> Maybe you can choose a different distance measure?
>> Maybe you can normalize your data in a different way?
>> Maybe you can plot a subset of genes -- maybe those with the highest
>> variance across all your data, which might result in new distances
>> calculated, and a different drawing of the branches on the tree.
>>
>> You could always pass in your own dendrogram structure to the heatmap
>> and "arbitrarily" calculate distances so that the tree draws as you
>> want, but I don't think that's something you'd want to do anyway.
>>
>> Another approach to show "likeness" between expression profiles is to
>> not focus on the dendrogram lining up "just so", but to rather add a
>> list of colors to the examples (columns) of your data by using the
>> "ColSideColors" parameter. Say the first 10 columns of your matrix are
>> from the 10 controls, and the last 10 are from the affecteds. You can
>> do:
>>
>> R> heatmap.2(my.data, ..., ColSideColors=c(rep('blue', 10), rep('red', 10)))
>>
>> If, as you say, the expression profiles are *mostly* similar, you'll
>> see that, by and large, the blue experiments will be "chunked" w/
>> blue, and the red expts are chunked with the red, which might show the
>> same point you're trying to make with the dendrogram.
>>
>> HTH,
>> -steve
>>
>> --
>> Steve Lianoglou
>> Graduate Student: Computational Systems Biology
>> | Memorial Sloan-Kettering Cancer Center
>> | Weill Medical College of Cornell University
>> Contact Info: http://cbio.mskcc.org/~lianos/contact
>>
>
>
>
> --
> Hofstadter's Law: It always takes longer than you expect, even when
> you take into account Hofstadter's Law.
> —Douglas Hofstadter (in Gödel, Escher, Bach, 1979)
>