[BioC] Problem with the function hyperGtest from GOstats package

Arne.Muller at sanofi-aventis.com Arne.Muller at sanofi-aventis.com
Fri Mar 16 10:22:25 CET 2007


Hello,

Whether genes not annotated to the subset of GO terms should be removed
from the universe depends on the question one asks (the hypothesis).
If the subset of GO terms represents what you're interested in, but you
want to know the chance of observing these terms in the context of the
entire GO BP tree, you need to leave the un-annotated genes in the
universe. This would be the same as testing all GO BP terms and
extracting the subset of terms afterwards ... (but it's less elegant,
I think ;-).
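
To make the effect of the universe choice concrete, here is a minimal
sketch in plain Python (not GOstats code; the function name and all gene
counts are made up for illustration). It computes the upper tail of the
hypergeometric distribution for one GO term, once with a universe
restricted to annotated genes and once with the un-annotated genes left
in. With the term size, the selected gene list and the overlap held
fixed, the larger universe gives the smaller p-value.

```python
from math import comb


def hypergeom_upper_tail(universe_size, annotated, selected, overlap):
    """Return P(X >= overlap) for a hypergeometric draw.

    universe_size: number of genes in the universe
    annotated:     genes in the universe annotated to the GO term
    selected:      genes in the list of interest (drawn without replacement)
    overlap:       selected genes that are annotated to the GO term
    """
    max_overlap = min(annotated, selected)
    total = comb(universe_size, selected)
    # comb(n, k) is 0 when k > n, so impossible splits contribute nothing.
    favourable = sum(
        comb(annotated, k) * comb(universe_size - annotated, selected - k)
        for k in range(overlap, max_overlap + 1)
    )
    return favourable / total


# Hypothetical counts: 40 genes carry the term, 100 genes are selected,
# and 5 of the selected genes carry the term.
p_annotated_only = hypergeom_upper_tail(5000, 40, 100, 5)   # reduced universe
p_all_genes = hypergeom_upper_tail(10000, 40, 100, 5)       # un-annotated genes kept

print("annotated-only universe:", p_annotated_only)
print("full universe:          ", p_all_genes)
```

Which of the two p-values is the appropriate one is exactly the
hypothesis question above; the sketch only shows that the two choices
give different answers.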

I suggest making this an option in "cateogrySubsetIds".

  Kind regards,

  Arne

>-----Original Message-----
>From: bioconductor-bounces at stat.math.ethz.ch 
>[mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of 
>Seth Falcon
>Sent: Thursday, March 15, 2007 5:30 PM
>To: James W. MacDonald
>Cc: Biton, Anne PH/FR; bioconductor at stat.math.ethz.ch
>Subject: Re: [BioC] Problem with the function hyperGtest from 
>GOstats package
>
>"James W. MacDonald" <jmacdon at med.umich.edu> writes:
>> As you already noted, the man page states
>>
>> 'cateogrySubsetIds': Object of class '"ANY"': If the test method
>>            supports it, can be used to specify a subset of category ids
>>            to include in the test instead of all possible category ids.
>>
>> I don't know which test method supports this argument, but apparently
>> hyperGTest() doesn't.
>
>Unfortunately, the "cateogrySubsetIds" is a half-implemented 
>feature and hyperGTest ignores it.  I will add it to my list, 
>just after the "spell check code" item for the next release ;-)
>
>The reason that you can't simply test all of the GO IDs and 
>then subset after testing is that in the current 
>implementation, the universe of gene IDs is determined in part 
>by requiring that each gene have at least one annotation in 
>the set of GO IDs.  Hence, reducing the set of GO IDs tested 
>could remove some gene IDs from the universe and that will 
>change the results for all tests.
>
>Now whether removing gene IDs from the universe that have no 
>GO annotation is the right thing to do could be up for 
>discussion.  My argument is that removal is good because it 
>makes the test more conservative.  If you leave them in, all 
>you do is increase the size of the gene universe and this 
>tends to make any over-represented GO IDs look all the more impressive.
>
>So, sorry for the teaser w.r.t. a method for subsetting the 
>category.  I hope to have code that can handle that for the 
>next release.
>
>Best,
>
>+ seth
>
>--
>Seth Falcon | Computational Biology | Fred Hutchinson Cancer 
>Research Center http://bioconductor.org
>


