[BioC] clustering question
Kimpel, Mark William
mkimpel at iupui.edu
Mon Feb 20 05:23:05 CET 2006
I have a general question about clustering of genomic data. The heatmaps
that are generated are usually scaled row-wise so that variations are
apparent within rows but not between rows. In looking at the
documentation of heatmap and hclust, however, it appears that this
scaling is done after the actual clustering is performed. If heatmap is
performed on the hclust object with scale="none", it is apparent that
most of the row clustering is based on overall gene expression levels,
not on similar column-wise behavior between rows.
Wouldn't it make sense to scale row-wise before clustering so that the
row clusters are based more on the correlation of the behavior of rows
between columns, i.e. two genes would be near each other if the genes
behaved similarly across samples? I realize that some of this effect may
be achieved with unscaled data, but it seems to me that the large
overall expression differences may minimize that.
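
To illustrate the concern, here is a minimal sketch in plain Python (not
R, and not what heatmap or hclust do internally). The three genes and
their values are made up, and the helper names euclidean, scale_row, and
nearest are hypothetical. It only shows that Euclidean distance on raw
rows can pair genes by overall level, while the same distance on
row-scaled data pairs them by pattern across samples.

```python
import math
from statistics import mean, pstdev

def euclidean(a, b):
    """Euclidean distance between two equal-length rows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def scale_row(row):
    """Center a row to mean 0 and scale it to unit (population) SD."""
    m = mean(row)
    s = pstdev(row)
    return [(x - m) / s for x in row]

def nearest(name, rows):
    """Name of the row closest to `name` by Euclidean distance."""
    others = (other for other in rows if other != name)
    return min(others, key=lambda other: euclidean(rows[name], rows[other]))

# Hypothetical expression values across four samples (assumed data).
genes = {
    "geneA": [10.0, 12.0, 10.0, 12.0],  # high level, up in samples 2 and 4
    "geneB": [1.0, 3.0, 1.0, 3.0],      # low level, same pattern as geneA
    "geneC": [12.0, 10.0, 12.0, 10.0],  # high level, opposite pattern
}

# Unscaled: geneA is closest to the gene with a similar overall level.
raw_nearest = nearest("geneA", genes)

# Row-scaled before computing distances: geneA is closest to the gene
# that behaves the same way across samples.
scaled = {name: scale_row(row) for name, row in genes.items()}
scaled_nearest = nearest("geneA", scaled)

print("nearest to geneA, unscaled:  ", raw_nearest)
print("nearest to geneA, row-scaled:", scaled_nearest)
```

With these made-up values, geneA pairs with geneC on the raw data and
with geneB after row scaling, which is the difference the question is
about.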
I hope this makes sense; I may not have used all of the correct
nomenclature.
Thanks,
Mark
Mark W. Kimpel MD
Department of Psychiatry
Indiana University School of Medicine
Biotechnology, Research, & Training Center
1345 W. 16th Street
Indianapolis, IN 46202