[BioC] select one Affy probeset for one gene

Sean Davis sdavis2 at mail.nih.gov
Mon Mar 13 22:34:05 CET 2006




On 3/13/06 4:15 PM, "Glazko, Galina" <Galina_Glazko at URMC.Rochester.edu>
wrote:

> Dear Sean,
> 
> Thank you for the answer.
> This sounds good but what if I do multiple testing?
> Then my adjusted p-values are based on the entire array, and I will not
> be able to see differentially expressed genes because I am testing say
> 40,000 hypotheses, while there are actually as many hypotheses as there
> are genes.

Galina,

I agree here, but this general concept is slightly different from trying to
choose the "best" probeset for a given gene.
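
To make the multiple-testing concern concrete, here is a minimal Python
sketch. It assumes a Benjamini-Hochberg adjustment (the thread does not say
which method is in use), and the function name bh_adjust and the p-values
are made up for illustration. It shows that the same three small p-values
get very different adjusted values depending on how many hypotheses are
tested alongside them.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (illustrative helper, not from the thread)."""
    n = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values: three "interesting" tests plus uninformative ones.
interesting = [0.0001, 0.0004, 0.002]
small = bh_adjust(interesting + [0.5] * 97)     # 100 hypotheses in total
large = bh_adjust(interesting + [0.5] * 9997)   # 10,000 hypotheses in total

print(small[:3])
print(large[:3])
```

In this toy input the three small p-values stay well below 0.1 after
adjustment among 100 tests, but are pushed up to 0.5 among 10,000 tests.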

To reduce the data dimensionality, you want to choose probesets that:

 1) Are measuring something.
 2) Are showing some variation between samples.

To determine 1, you can use multiple lines of evidence, such as the level of
expression or the Affy present/absent calls.  To determine 2, you can
calculate a CV (coefficient of variation, the standard deviation divided by
the mean) or a similar measure.  Notice that this doesn't involve determining
which probesets represent which gene, but only determining the "quality" of
the data.
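
As a rough illustration of the two criteria above, here is a minimal Python
sketch using only the standard library. The function name filter_probesets,
the probeset IDs, the expression values, and both thresholds are assumptions
made up for this example, not recommended settings. It also assumes
expression values on a positive scale, so that the CV is well defined.

```python
from statistics import mean, stdev

def filter_probesets(expr, min_mean=7.0, min_cv=0.05):
    """Keep probesets that are expressed and that vary between samples.

    expr maps a probeset ID to its expression values across samples.
    Both thresholds are illustrative assumptions, not recommended defaults.
    """
    kept = {}
    for probeset, values in expr.items():
        m = mean(values)
        if m < min_mean:
            continue  # criterion 1 fails: not clearly measuring anything
        cv = stdev(values) / m  # coefficient of variation = sd / mean
        if cv >= min_cv:
            kept[probeset] = cv  # criterion 2 holds: varies between samples
    return kept

# Hypothetical data: three probesets measured in four samples.
expr = {
    "probe_a": [9.0, 11.0, 10.0, 12.0],  # expressed and variable
    "probe_b": [3.0, 3.5, 2.5, 3.0],     # low expression
    "probe_c": [10.0, 10.0, 10.1, 9.9],  # expressed but nearly constant
}
selected = filter_probesets(expr)
print(sorted(selected))
```

Only probe_a passes both filters here. Note that the filter never looks at
which gene a probeset belongs to, and it never looks at the sample groups,
so it can be applied before any testing is done.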

You can look at the genefilter package for some hints about how to do this.

Hope this clarifies a bit.

Sean


