[BioC] significance of "wrong" clustering of differential genes
Naomi Altman
naomi at stat.psu.edu
Mon Nov 13 22:02:45 CET 2006
The heatmap did not come through (to me). However, clustering is
highly dependent on the choice of distance measure.
--Naomi
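To illustrate the point about the distance measure, here is a minimal sketch in base R. The data are made up for illustration (they are not from the poster's experiment), and the names `x`, `d_euc`, `d_cor`, `cl_euc` and `cl_cor` are hypothetical. It clusters the same four samples with Euclidean distance and with one minus the Pearson correlation, and the two groupings disagree.

```r
# Toy data: 3 genes (rows) x 4 samples (columns); values are invented.
# a1/a2 share a rising profile, b1/b2 a falling one, but a1 and b1 sit
# at a low overall level while a2 and b2 sit at a high one.
x <- cbind(
  a1 = c(1, 2, 3),
  a2 = c(11, 12, 13),
  b1 = c(3, 2, 1),
  b2 = c(13, 12, 11)
)

# Distances between samples (columns), computed two ways
d_euc <- dist(t(x))            # Euclidean: dominated by overall level
d_cor <- as.dist(1 - cor(x))   # 1 - Pearson correlation: profile shape only

# Average-linkage clustering, cut into two groups
cl_euc <- cutree(hclust(d_euc, method = "average"), k = 2)
cl_cor <- cutree(hclust(d_cor, method = "average"), k = 2)

print(cl_euc)  # a1 groups with b1, a2 with b2
print(cl_cor)  # a1 groups with a2, b1 with b2
```

By default `heatmap()` uses `dist()` (Euclidean) together with `hclust()`, so a sample whose overall level differs from the rest of its group can end up on the "wrong" branch even when the selected genes separate the groups well. A different measure can be supplied through the `distfun` argument.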
At 09:57 AM 11/13/2006, Benjamin Otto wrote:
>Hi,
>
>Please imagine the following situation:
>
>For two sample sets (set1, set2), the most differentially expressed genes are
>identified by limma, with the p-values adjusted by the "holm" method. Afterwards a
>heatmap is drawn for these genes. The procedure would look like:
>
> > f <- factor(as.character(pheno[, marker]))
> > design <- model.matrix(~f)
> > fit <- eBayes(lmFit(eSet, design))
> > tab <- topTable(fit, coef=2, number=nrow(eSet), adjust.method="holm")
> > selected <- tab$adj.P.Val < 0.01 & abs(tab$M) >= 1
> > ## print a heatmap for eSet[selected,]
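One thing worth checking in the procedure quoted above: `topTable()` returns its rows ranked by significance, so a logical vector built from `tab` is aligned with the rows of `tab`, not with the rows of `eSet`. Whether this played a role here is only a guess, since it depends on how the heatmap was actually drawn. The sketch below mimics the situation in base R with invented numbers; `expr`, `res`, `wrong` and `right` are hypothetical names, and the sorting step stands in for what `topTable()` does.

```r
# Invented example: 5 genes x 2 samples, rows in the original order
genes <- paste0("g", 1:5)
expr <- matrix(seq_len(10), nrow = 5,
               dimnames = list(genes, c("s1", "s2")))

# Hypothetical per-gene results, in the same row order as expr
res <- data.frame(
  adj.P.Val = c(0.50, 0.001, 0.20, 0.005, 0.90),
  M         = c(0.1, 2.0, 0.3, -1.5, 0.0),
  row.names = genes
)

# Stand-in for topTable(): rows come back sorted by significance
tab <- res[order(res$adj.P.Val), ]

# Logical selection, aligned with the rows of tab
selected <- tab$adj.P.Val < 0.01 & abs(tab$M) >= 1

# Indexing the unsorted matrix with it picks the first rows, not the hits
wrong <- rownames(expr[selected, , drop = FALSE])

# Indexing by the identifiers carried in the table picks the intended genes
right <- rownames(tab)[selected]

print(wrong)  # "g1" "g2"
print(right)  # "g2" "g4"
```

If that is what happened, the heatmap would show arbitrary genes rather than the selected ones. Selecting rows by the identifiers in the table, or asking for an unsorted table (recent limma versions accept `sort.by="none"`), would avoid the mismatch.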
>
>What can lead to a misclassification in the clustering, say one sample of
>set1 being clustered together with set2? After all, according to the workflow I
>have explicitly been searching for the genes which should discriminate
>between the two sets! However, the expression values displayed in the heatmap
>suggest that this sample IS more similar to the "wrong" set than to the true
>one (have a look at the jpg).
>
>Is it possible that this sample is always treated as an outlier in the
>significance calculations?
>
>And if so: is it sensible to take such a misclassification as
>a kind of significance?
>
>Regards
>
>Benjamin
>
>--
>Benjamin Otto
>Universitaetsklinikum Eppendorf Hamburg
>Institut fuer Klinische Chemie
>Martinistrasse 52
>20246 Hamburg
Naomi S. Altman 814-865-3791 (voice)
Associate Professor
Dept. of Statistics 814-863-7114 (fax)
Penn State University 814-865-1348 (Statistics)
University Park, PA 16802-2111