[BioC] GOstats gene set size selection

Robert Gentleman rgentlem at fhcrc.org
Fri Apr 18 15:52:03 CEST 2008


Hi,

alex lam (RI) wrote:
> Dear colleagues,
> 
> I have been following the GOstats vignette to test GO term association.
> I would like to know whether it is possible to set limits on the number
> of selected genes in a GO term and on the size of that term on my affy chip?
> 
> For example, can I tell hyperGTest to skip testing a GO term if the
> number of significant genes in that term is under, say, 3, or if there
> are more than 400 genes of that GO term on the chip? 

   It is not possible to skip the testing, but you can skip the 
reporting. That works only for small gene sets; there is no upper 
limit, although I may have time to add one.  The reporting step also 
lets you filter out GO categories on p-value.

Please have a look at the vignette, which does discuss this in some detail.
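
A minimal sketch of this kind of post-hoc filtering, written in Python
purely to show the logic and not taken from GOstats itself. It assumes the
results have already been exported as rows carrying the Pvalue, Count, Size
and Term columns that the summary table reports. The class GoRow, the
function filter_rows, the thresholds and the toy rows are all hypothetical.
As far as I know, the pvalue and categorySize arguments of summary() cover
the p-value and minimum-size cases; the minimum Count and the upper bound on
Size are additions that mirror the question, not package features.

```python
from dataclasses import dataclass


@dataclass
class GoRow:
    """One row of an exported result table (hypothetical container)."""
    go_id: str
    pvalue: float
    count: int  # selected genes annotated to the term
    size: int   # genes on the chip annotated to the term
    term: str


def filter_rows(rows, min_count=3, max_size=400, max_pvalue=0.01):
    """Keep rows that pass all three cutoffs, sorted by ascending p-value.

    Every term has already been tested; this only decides what to report.
    """
    kept = [
        r for r in rows
        if r.count >= min_count and r.size <= max_size and r.pvalue < max_pvalue
    ]
    return sorted(kept, key=lambda r: r.pvalue)


# Toy rows invented for illustration; they are not real GO results.
rows = [
    GoRow("GO:toy1", 0.004, 5, 120, "toy term A"),
    GoRow("GO:toy2", 0.0001, 2, 15, "toy term B"),   # too few selected genes
    GoRow("GO:toy3", 0.002, 40, 950, "toy term C"),  # too large on the chip
    GoRow("GO:toy4", 0.2, 6, 80, "toy term D"),      # p-value above cutoff
    GoRow("GO:toy5", 0.0005, 7, 60, "toy term E"),
]

for r in filter_rows(rows):
    print(r.go_id, r.pvalue, r.count, r.size, r.term)
```

Because the filter runs after testing, the p-values themselves are
unchanged; only the reported list gets shorter.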

> 
> Currently I find that many of my significant GO terms are not very
> specific. As I am trying to incorporate GOstats into an expression QTL
> (eQTL) genome scan, I get a lot of output. Therefore, ideally I would
> like to filter out these terms before testing rather than screening the
> results after testing. Is there such an option with hyperGTest?

   The vignette and the code for summary should give you some reasonable 
options for filtering the results.

   best wishes
     Robert

> 
> Many thanks for your advice,
> Alex
> 
>    > sessionInfo()
> R version 2.6.2 Patched (2008-03-24 r44882)
> x86_64-unknown-linux-gnu
> 
> locale:
> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
> 
> attached base packages:
> [1] splines   tools     stats     graphics  grDevices utils     datasets
> [8] methods   base
> 
> other attached packages:
>  [1] GOstats_2.4.0       Category_2.4.0      genefilter_1.16.0
>  [4] survival_2.34       RBGL_1.14.0         annotate_1.16.1
>  [7] xtable_1.5-2        GO.db_2.0.2         AnnotationDbi_1.0.6
> [10] RSQLite_0.6-8       DBI_0.2-4           Biobase_1.16.3
> [13] graph_1.16.1
> 
> loaded via a namespace (and not attached):
> [1] cluster_1.11.10
> 
> --------------------------------------------
> Alex C. Lam
> Roslin Institute (Edinburgh)
> Midlothian
> EH25 9PS
> United Kingdom
> Tel: +44 131 5274471
> 
> Former email address: alex.lam at bbsrc.ac.uk
> New email address: alex.lam at roslin.ed.ac.uk
> Both addresses are functional

-- 
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org


