[BioC] GO enrichment for genome-wide analysis
Sean Davis
sdavis2 at mail.nih.gov
Tue Jan 10 18:42:59 CET 2006
On 1/10/06 12:39 PM, "burak kutlu" <burak_kutlu at yahoo.com> wrote:
> Thanks a lot for the answer!
> In that case, most of the software cited on the GO site is probably flawed.
> -burak
>
> Sean Davis <sdavis2 at mail.nih.gov> wrote:
>
>
> On 1/10/06 8:42 AM, "James W. MacDonald" wrote:
>
>> burak kutlu wrote:
>>> Hi, My understanding is that GOstats implements the GO term
>>> enrichment analysis for studies using microarrays (where the lib
>>> argument is passed to "GOHyperG" to define the microarray-specific
>>> environment, therefore the gene universe). I was wondering if there
>>> is a readily available function for determining GO term significance
>>> that uses the GO terms from the whole genome rather than the GO terms
>>> from the genes represented on an array.
>>
>> I don't think you will find this anywhere, because it doesn't make
>> sense. The idea behind GOHyperG is similar to the canonical 'ball and
>> urn' scenario used in basic stats to explain the Hypergeometric
>> distribution.
>>
>> The goal is to determine the probability of reaching into an urn
>> containing a certain number of black and white balls, removing x balls
>> and having n of those balls be white.
>>
>> Your question is akin to asking the probability of reaching into an urn
>> containing black and white balls, removing x balls and having n of those
>> balls be white, but based on the relative proportion of black and white
>> balls in the world, instead of the proportion of black and white balls
>> in the urn. Since the proportion of black and white balls may be quite
>> different in the urn as compared to the world, you cannot generalize
>> like that.
>
> As Jim points out, it's a bad idea to do what you propose, but if you really
> want to, there ARE online and standalone applications that will allow you to
> do this. One example is at:
>
> http://david.niaid.nih.gov/david/ease.htm
Note that the tool at the above link will typically ask for a "background"
set; this is meant to prompt the user to think about what the actual
background set should be. I realize that I may have given the wrong
impression that these online tools "do it wrong". They do not; they simply
allow the user to do it wrong if he or she so desires.
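
To make the urn argument quoted above concrete, here is a minimal sketch in
Python (standard library only, not GOstats or EASE). The helper name
hypergeom_upper_tail and every count below are made up for illustration; the
only point is that the same gene list gives a different p-value depending on
which background (the array or the whole genome) is treated as the urn.

```python
from math import comb


def hypergeom_upper_tail(universe, annotated, drawn, hits):
    """Return P(X >= hits) for a hypergeometric draw.

    universe  -- number of genes in the background (balls in the urn)
    annotated -- background genes carrying the GO term (white balls)
    drawn     -- size of the selected gene list (balls removed)
    hits      -- selected genes carrying the GO term (white balls removed)
    """
    max_hits = min(annotated, drawn)
    total = comb(universe, drawn)
    favourable = sum(
        comb(annotated, k) * comb(universe - annotated, drawn - k)
        for k in range(hits, max_hits + 1)
    )
    return favourable / total


# Hypothetical counts, chosen only for illustration.
drawn, hits = 200, 10  # selected genes, and how many carry the term

# Background = genes actually represented on the array.
p_array = hypergeom_upper_tail(universe=5000, annotated=100,
                               drawn=drawn, hits=hits)

# Background = whole genome, although only array genes could be selected.
p_genome = hypergeom_upper_tail(universe=20000, annotated=150,
                                drawn=drawn, hits=hits)

print("array background: ", p_array)
print("genome background:", p_genome)
```

With these invented counts the term is relatively rarer in the genome than on
the array, so the genome-wide background makes the same 10 hits look more
surprising than they are relative to what could actually have been selected.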
Sean