[BioC] GOstats - defining the gene universe
Rachael McBride
rachel.mcbride at ucd.ie
Mon Oct 8 09:51:42 CEST 2007
James W. MacDonald wrote:
> Hi Rachael,
>
> Rachael McBride wrote:
>> Hi,
>>
>> I have a quick question that I can't seem to find an answer to by
>> searching the BioC lists. I want to use GOstats on a gene list. I've
>> read the vignette and understand that defining the gene universe is an
>> important step. The vignette outlines various non-specific filtering
>> steps that can be done on an expression set in order to define the
>> gene universe. My question is: are the non-specific filtering steps
>> done on a normalized or an un-normalized expression set?
>
> You would almost always want to use normalized expression data.
>
> The vignette actually includes some steps that by all rights would have
> occurred earlier in the analysis (namely the part where low-variance
> genes are removed).
>
> Usually the analysis proceeds something like this:
>
> 1. Preprocess: normalize, background correct, etc.
> 2. Filter 'uninteresting' genes to reduce multiplicity.
> 3. Make comparisons.
> 4. Run the hypergeometric test on the gene sets from the comparison step.
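
[Inline note: the last step above can be illustrated with a small, generic
sketch of the one-sided hypergeometric test. This is not the GOstats
implementation, only the textbook upper-tail calculation written in plain
Python. The function name, argument names, and the numbers in the usage
line are made up for illustration.]

```python
from math import comb

def hypergeom_pvalue(universe_size, category_size, selected_size, overlap):
    """Upper-tail probability P(X >= overlap).

    X is the number of genes annotated to a term among `selected_size`
    genes drawn without replacement from a universe of `universe_size`
    genes, of which `category_size` carry the annotation.
    """
    total = comb(universe_size, selected_size)
    upper = min(category_size, selected_size)
    tail = sum(
        comb(category_size, k)
        * comb(universe_size - category_size, selected_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical numbers: 1000 genes in the universe, 40 annotated to the
# term, 50 genes selected by the comparison, 8 of them annotated.
p = hypergeom_pvalue(universe_size=1000, category_size=40,
                     selected_size=50, overlap=8)
print(p)
```

Note that `universe_size` appears in every term, which is why the choice of
universe changes the resulting p-value.
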
>
> In this case the universe you would start with would be the data you
> used to make the comparisons, which already lacks the genes you filtered
> out because they were uninteresting by some measure. At this point you
> simply want to remove any duplicates, genes lacking Entrez Gene IDs, and
> genes lacking GO terms.
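
[Inline note: a minimal sketch of the clean-up described above, assuming the
already-filtered probe IDs and two lookup tables are available as plain
Python objects. The function name, the dictionaries, and the example IDs are
hypothetical. Keeping the first probe seen for each gene is a simplification
of this sketch; other rules for resolving duplicates are possible.]

```python
def build_universe(probe_ids, probe_to_entrez, entrez_to_go):
    """Return the Entrez Gene IDs usable as the universe, in first-seen order.

    probe_ids       -- probes left after non-specific filtering
    probe_to_entrez -- dict: probe ID -> Entrez Gene ID (missing if unmapped)
    entrez_to_go    -- dict: Entrez Gene ID -> list of GO term IDs
    """
    universe = []
    seen = set()
    for probe in probe_ids:
        entrez = probe_to_entrez.get(probe)
        if entrez is None:
            continue  # probe has no Entrez Gene ID
        if not entrez_to_go.get(entrez):
            continue  # gene has no GO annotation
        if entrez in seen:
            continue  # duplicate: another probe already maps to this gene
        seen.add(entrez)
        universe.append(entrez)
    return universe

# Made-up example: p1 and p2 map to the same gene, p3 has no GO terms,
# and p4 has no Entrez Gene ID.
example = build_universe(
    ["p1", "p2", "p3", "p4"],
    {"p1": "10", "p2": "10", "p3": "20"},
    {"10": ["GO:0008150"], "20": []},
)
print(example)
```
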
>
> Best,
>
> Jim
>
>
>
>>
>> Thanks,
>> Rachael
>
Hi Jim,
Thanks for the clarification,
Rachael.