[BioC] Identifying signatures across many samples?
John.Luckey at joslin.harvard.edu
Mon May 10 17:13:27 CEST 2004
I am looking to identify common expression profiles (signatures) across many different pairwise comparisons. Essentially, I have many diverse Affymetrix data sets from different tissues, and for each tissue type I have one sample that expresses my phenotype of interest and one or more others that do not. I am interested in identifying which mRNA transcripts are selectively up- or down-regulated for that phenotype. (Obviously, this is a broadly defined phenotype, since it is observed in several different tissue types. While many genes will be tissue specific, I am hoping that the similar tissues that don't express the phenotype will control for this.)
So far, I have simply used the affy package from Bioconductor to pre-process and summarize the data, identified those genes whose fold change within a tissue-type comparison reaches a given threshold for my phenotype of interest, and then asked for which genes this holds across all tissue types. (Many samples have only 2 or 3 replicates, so my reading of the literature is that p-values are not very useful here.)
It seems to me there must be a more statistically valid approach, or one that somehow weighs the degree of correlation across all comparisons and doesn't necessarily exclude a gene that is strongly correlated in all but one comparison (where it might fall just below a given threshold, for example).
Any advice or pointers to relevant papers or approaches would be greatly appreciated.
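To make the procedure above concrete, here is a minimal sketch of the threshold-and-intersect step. It is written in plain Python rather than R, and it is an illustration only, not the actual analysis code. The tissue names, gene names, expression values, and the 2-fold cutoff (1.0 on the log2 scale) are all made up, and the function names are hypothetical.

```python
import math

# Hypothetical summarized expression values (linear scale), per tissue:
# gene -> (phenotype sample, mean of the non-phenotype samples).
# All names and numbers are invented for illustration.
expression = {
    "tissueA": {"gene1": (800.0, 100.0), "gene2": (120.0, 100.0), "gene3": (50.0, 400.0)},
    "tissueB": {"gene1": (600.0, 150.0), "gene2": (500.0, 100.0), "gene3": (40.0, 90.0)},
}

def log2_fold_changes(tissue):
    """log2(phenotype / control) for every gene in one tissue comparison."""
    return {gene: math.log2(pheno / ctrl) for gene, (pheno, ctrl) in tissue.items()}

def signature_all_tissues(expression, threshold=1.0):
    """Genes whose log2 fold change passes the threshold, in the same
    direction, in every tissue comparison. Returns (up, down) gene sets."""
    up_sets, down_sets = [], []
    for tissue in expression.values():
        log_fc = log2_fold_changes(tissue)
        up_sets.append({g for g, v in log_fc.items() if v >= threshold})
        down_sets.append({g for g, v in log_fc.items() if v <= -threshold})
    return set.intersection(*up_sets), set.intersection(*down_sets)

up, down = signature_all_tissues(expression)
print("up in all tissues:", sorted(up))
print("down in all tissues:", sorted(down))
```

In this toy data gene2 passes the cutoff in only one tissue, so the strict intersection drops it.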
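One simple way to relax the all-or-nothing rule, sketched here purely as an illustration, is to count in how many comparisons a gene passes the threshold in a consistent direction, tolerate a fixed number of misses, and rank the survivors by mean log2 fold change. This is a vote-counting heuristic, not a statistically rigorous method. The gene names, values, and the `max_misses` parameter are assumptions made up for the example.

```python
from statistics import mean

# Hypothetical log2 fold changes, one value per tissue comparison.
log_fc = {
    "gene1": [3.0, 2.0, 2.5, 0.9],     # just misses the cutoff in one comparison
    "gene2": [0.3, 2.3, -0.1, 0.2],    # passes in only one comparison
    "gene3": [-3.0, -1.2, -2.0, -1.5], # consistently down
}

def consistency(values, threshold=1.0):
    """Number of comparisons passing the threshold in the majority
    direction, together with the mean log2 fold change."""
    up = sum(v >= threshold for v in values)
    down = sum(v <= -threshold for v in values)
    return max(up, down), mean(values)

def relaxed_signature(log_fc, threshold=1.0, max_misses=1):
    """Keep genes that pass in all but at most `max_misses` comparisons,
    ranked by the absolute mean log2 fold change (largest first)."""
    kept = []
    for gene, values in log_fc.items():
        votes, avg = consistency(values, threshold)
        if votes >= len(values) - max_misses:
            kept.append((gene, votes, avg))
    return sorted(kept, key=lambda row: abs(row[2]), reverse=True)

for gene, votes, avg in relaxed_signature(log_fc):
    print(f"{gene}: passes in {votes} comparisons, mean log2 FC {avg:.2f}")
```

With `max_misses=0` this reduces to the strict intersection, so gene1 would be dropped again.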
C John Luckey, MD PhD
Resident - Clinical Pathology - Brigham and Women’s Hospital
Post Doctoral Fellow – Diane Mathis/Christophe Benoist Lab - Joslin Diabetes Center
One Joslin Place, Rm. 474
Boston, MA 02215
phone: (617) 264-2783
fax: (617) 264-2744
e-mail: john.luckey at joslin.harvard.edu