[BioC] how to test for genes of interest?

Jenny Drnevich drnevich at illinois.edu
Thu Jul 24 18:00:52 CEST 2008


Hi Glyn,

"...mine it for biological significance" is a very vague, and in my 
experience, very subjective sort of analysis. I do agree that with a 
particular list, in this case immune genes, doing something like GSEA 
could be appropriate. However, GSEA gives an answer along the lines 
of "yes, immune genes appear to be important" and not "which immune 
genes are changing, and which are not?" Besides, data mining is not 
included in my basic statistical analysis service. :)  I was just
wondering: if one were going to do the analysis I described, what is
the proper way to do it?
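
For concreteness, here is a minimal sketch of the mechanics in question: p-values come from a model fitted on all genes, and are then split into the genes of interest and the rest, with a Benjamini-Hochberg FDR adjustment applied to each group on its own. It is plain Python for illustration only. The function names (bh_adjust, adjust_by_group), the gene IDs and the p-values are all hypothetical and made up, not output from limma. In R the adjustment step itself would normally be p.adjust(p, method = "BH") on each subset. The sketch shows how the split works; it does not settle whether the approach is statistically proper.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Walk from the largest p-value down so a running minimum can
    # enforce monotonicity of the adjusted values.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for steps_from_top, i in enumerate(order):
        rank = m - steps_from_top  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def adjust_by_group(pvalues, genes_of_interest):
    """Adjust the genes of interest and the remaining genes separately.

    pvalues: dict mapping gene ID -> raw p-value (from the full model).
    genes_of_interest: set of gene IDs chosen a priori.
    """
    result = {}
    groups = (
        [g for g in pvalues if g in genes_of_interest],
        [g for g in pvalues if g not in genes_of_interest],
    )
    for group in groups:
        for gene, adj in zip(group, bh_adjust([pvalues[g] for g in group])):
            result[gene] = adj
    return result


# Made-up p-values, assumed to come from a fit on ALL genes.
raw = {"g1": 0.01, "g2": 0.04, "g3": 0.03, "g4": 0.005}
separate = adjust_by_group(raw, {"g1", "g4"})
together = dict(zip(raw, bh_adjust(list(raw.values()))))
```

Comparing `separate` with `together` shows how much the choice of grouping changes the adjusted values for the same raw p-values.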

Thanks,
Jenny

At 10:48 AM 7/24/2008, Glyn Bradley wrote:
>Hi Jenny
>I may get shot down horribly for saying this on this list, but isn't
>there a large school of thought which says don't do FDR at all, just
>take the large list of genes out and mine it for biological
>significance.
>Certainly a large pharma I've a little experience of takes that
>approach. Stats are just stats after all. (And I'm sure you're going to
>validate the results with some other wet lab technique anyway.)
>
>
>Glyn PhD
>Bioinf and systems modelling
>mycib.ac.uk
>
>On Thu, Jul 24, 2008 at 4:14 PM, Jenny Drnevich <drnevich at illinois.edu> wrote:
> > Hi everyone,
> >
> > I've always heard that one of the ways "around" the multiple testing problem
> > of microarrays is for you to a priori identify a particular list of genes
> > you're interested in, and then you only have to do the multiple test
> > correction for this smaller list. I've never done this in practice, and I'm
> > not sure at what point in the analysis it's proper to pull out just the
> > smaller list. Obviously, all the data preprocessing and normalization will
> > be done with all the genes, but should I pull out the genes before fitting
> > the model, or after fitting the model right before the multiple test
> > adjustment? I'm using the eBayes() shrinkage in limma, so which genes are in
> > the model will make a big difference in the outcome.
> >
> > I'm thinking it would be best to keep all the genes in the model, and then
> > split them out into two groups (genes of interest and all the rest) and do an
> > FDR correction separately for each group. What do you think?
> >
> > Thanks,
> > Jenny
> >
> > Jenny Drnevich, Ph.D.
> >
> > Functional Genomics Bioinformatics Specialist
> > W.M. Keck Center for Comparative and Functional Genomics
> > Roy J. Carver Biotechnology Center
> > University of Illinois, Urbana-Champaign
> >
> > 330 ERML
> > 1201 W. Gregory Dr.
> > Urbana, IL 61801
> > USA
> >
> > ph: 217-244-7355
> > fax: 217-265-5066
> > e-mail: drnevich at illinois.edu

