[BioC] Theoretical Question

Wed May 19 20:10:09 CEST 2004

I posted a similar question last week and received some help with this problem, but I am still a bit unclear on the best way to proceed. Any insights would be greatly appreciated.

I want to identify a set of genes that are co-regulated with a given phenotype that is observed across various tissue types, i.e. to identify the 'signature' that corresponds to the phenotype regardless of tissue.

Here is the simplest setup (all data are Affymetrix and have been pre-processed/normalized by RMA):

Tissue type A has 3 conditions: 1A, 2A, 3A

Type B has 4 conditions: 1B, 2B, 3B, 4B

My phenotype of interest is observed only in 1A and 1B.

I am interested in knowing what is common (both up and down regulated) between 1A (relative only to 2A and 3A) and 1B (relative to 2B, 3B, and 4B).  I have varying numbers of replicates per condition (2-5).

I have done unsupervised clustering using all genes, and 1A and 1B don't cluster together (not really surprising, since they are quite different in many respects; I am interested only in their overlapping phenotypes). I am not entirely sure how best to proceed.

I have used straight fold change to identify genes that distinguish 1A in the 1A vs 2A and 1A vs 3A comparisons, and then selected the genes that are up (or down) in 1A in both comparisons. I then looked at how these ‘1A-specific’ genes are expressed in 1B vs all other B's. There is a general positive skewing, but my concern is where to draw cutoffs, how to estimate FDR, etc. in such a comparison. Basically, how does one go about showing that the skewing of a subset of genes in a different comparison is significant?
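
To make the procedure concrete, here is a minimal Python sketch of the selection step, plus one naive way the skewing might be scored. Everything in it is an assumption for illustration only: the function names, the data layout (each condition is a list of arrays, each array a list of log2 values with one entry per gene), the cutoff of 1 (a 2-fold change on the log2 scale that RMA reports), and especially the comparison against random gene sets of the same size. That comparison ignores correlation between genes and replicate variability, so it is a way of stating the question rather than a validated answer.

```python
import random
from statistics import mean


def condition_means(replicates):
    """Per-gene mean over the replicate arrays of one condition."""
    n_genes = len(replicates[0])
    return [mean(rep[g] for rep in replicates) for g in range(n_genes)]


def log_ratio(target_reps, reference_reps):
    """Per-gene log2 ratio: mean(target) - mean(reference) on the log2 scale."""
    t = condition_means(target_reps)
    r = condition_means(reference_reps)
    return [a - b for a, b in zip(t, r)]


def specific_genes(expr, target, references, cutoff=1.0, sign=1):
    """Gene indices changed by at least `cutoff` in `target` versus EVERY
    reference condition. sign=1 selects up-regulated genes, sign=-1 down."""
    ratios = [log_ratio(expr[target], expr[ref]) for ref in references]
    n_genes = len(ratios[0])
    return [g for g in range(n_genes)
            if all(sign * r[g] >= cutoff for r in ratios)]


def skew_p_value(scores, gene_set, n_perm=1000, seed=0):
    """Compare the mean score of `gene_set` with random gene sets of the
    same size. Returns (observed mean, one-sided empirical p-value)."""
    if not gene_set:
        raise ValueError("gene_set is empty")
    rng = random.Random(seed)
    observed = mean(scores[g] for g in gene_set)
    all_genes = range(len(scores))
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(all_genes, len(gene_set))
        if mean(scores[g] for g in sample) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)
```

With a dictionary `expr` keyed by condition name, the steps would be `genes = specific_genes(expr, "1A", ["2A", "3A"])`, then `scores = log_ratio(expr["1B"], expr["2B"] + expr["3B"] + expr["4B"])`, then `skew_p_value(scores, genes)`. Pooling the other B arrays this way weights conditions by their number of replicates, which is itself a choice I am unsure about.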

Any insights you might have would be appreciated.

Thx

John Luckey, MD PhD

Clinical Pathology Resident - Brigham and Women's Hospital

Postdoctoral Fellow - Mathis-Benoist Lab

Joslin Diabetes Center

One Joslin Place, Rm. 474

Boston, MA  02215