[BioC] GOstats gene set size selection
Sean MacEachern
sean.maceach at gmail.com
Thu Apr 17 18:06:49 CEST 2008
Hi Alex,
I'm not sure whether this helps with your question, but I'll put in my two
cents. I am working with chickens and trying to create a large list of genes
for an eQTL study from an initial simple microarray design that compares
resistant vs susceptible birds. Because I found only a small number of
differentially expressed genes, I have tried to enlarge my list by
examining significant GO terms. Most of the GO terms I have pulled out
using hyperGTest are not very helpful because they are too broad.
I have found the Category package a little more helpful. KEGG pathways are
a little more specific, and you can create an adjacency matrix and use
rowSums() to filter your dataset. I think you can also treat GO terms as
categories if you need to. It might be a little off topic, but it could be
worth looking at.
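
To illustrate the adjacency matrix and rowSums() idea, here is a minimal
sketch in base R. It assumes a made-up list mapping categories (KEGG
pathways or GO terms) to gene IDs; the names cat2genes, sig_genes, max_size
and min_sig are hypothetical, the thresholds are toy values, and it does not
use the Category API itself.

```r
# Toy mapping from category (e.g. KEGG pathway or GO term) to gene IDs.
# All identifiers are invented for illustration only.
cat2genes <- list(
  path_A = c("g1", "g2", "g3", "g4"),
  path_B = c("g2", "g5"),
  path_C = c("g1", "g2", "g3", "g4", "g5", "g6")
)

# Hypothetical set of genes called significant in the experiment.
sig_genes <- c("g1", "g2", "g3")

genes <- sort(unique(unlist(cat2genes)))

# Adjacency matrix: rows are categories, columns are genes,
# with a 1 where the gene belongs to the category.
adj <- t(sapply(cat2genes, function(g) as.integer(genes %in% g)))
colnames(adj) <- genes

# Number of genes per category, and number of significant genes per category.
cat_size <- rowSums(adj)
n_sig <- rowSums(adj[, sig_genes, drop = FALSE])

# Toy limits: drop categories that are too large or have too few hits.
max_size <- 5
min_sig <- 3
keep <- cat_size <= max_size & n_sig >= min_sig

adj_filtered <- adj[keep, , drop = FALSE]
rownames(adj_filtered)
```

With real data the same two rowSums() calls would be applied to the
adjacency matrix built from your chip annotation, and the limits set to
whatever sizes suit the analysis (for example 3 and 400 as in the question).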
Cheers,
Sean
On 4/17/08 7:28 AM, "alex lam (RI)" <alex.lam at roslin.ed.ac.uk> wrote:
> Dear colleagues,
>
> I have been following the GOstats vignette to test GO terms association.
> I would like to know whether it is possible to set limits on the number
> of selected genes in GO term and the size of that term on my affy chip?
>
> For example, can I tell hyperGTest to skip testing a GO term if the
> number of significant genes in that term is under, say, 3, or if there
> are more than 400 genes of that GO term on the chip?
>
> Currently I found many of my significant GO terms not very specific. As
> I am trying to incorporate GOstats to an expression QTL (eQTL) genome
> scan, I get a lot of output. Therefore, ideally I would like to filter
> out these terms before test rather than screening the results after
> test. Is there such an option with hyperGTest?
>
> Many thanks for your advice,
> Alex
>
>> sessionInfo()
> R version 2.6.2 Patched (2008-03-24 r44882)
> x86_64-unknown-linux-gnu
>
> locale:
> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>
> attached base packages:
> [1] splines tools stats graphics grDevices utils datasets
> [8] methods base
>
> other attached packages:
> [1] GOstats_2.4.0 Category_2.4.0 genefilter_1.16.0
> [4] survival_2.34 RBGL_1.14.0 annotate_1.16.1
> [7] xtable_1.5-2 GO.db_2.0.2 AnnotationDbi_1.0.6
> [10] RSQLite_0.6-8 DBI_0.2-4 Biobase_1.16.3
> [13] graph_1.16.1
>
> loaded via a namespace (and not attached):
> [1] cluster_1.11.10
>>
>
> --------------------------------------------
> Alex C. Lam
> Roslin Institute (Edinburgh)
> Midlothian
> EH25 9PS
> United Kingdom
> Tel: +44 131 5274471
>
> Former email address: alex.lam at bbsrc.ac.uk
> New email address: alex.lam at roslin.ed.ac.uk
> Both addresses are functional
>
> Roslin Institute is a company limited by guarantee, registered in
> Scotland (registered number SC157100) and a Scottish Charity (registered
> number SC023592). Our registered office is at Roslin, Midlothian, EH25
> 9PS. VAT registration number 847380013.
>
> The information contained in this e-mail (including any attachments) is
> confidential and is intended for the use of the addressee only. The
> opinions expressed within this e-mail (including any attachments) are
> the opinions of the sender and do not necessarily constitute those of
> Roslin Institute (Edinburgh) ("the Institute") unless specifically
> stated by a sender who is duly authorised to do so on behalf of the
> Institute
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor