[BioC] about column "Size" in output of hyperGTest (GOstats package)
Marc Carlson
mcarlson at fhcrc.org
Fri May 29 00:58:22 CEST 2009
Hi Greg,
I am a little confused by your description of what you did. After
peering at your explanation, I am still not completely certain that I
understand what your question is. But, there is a nice description of
how the gene universe can affect the number of things you find in the
GOstats vignette titled "Hypergeometric Tests Using GOstats". Perhaps
this can help you?
http://www.bioconductor.org/packages/devel/bioc/html/GOstats.html
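
In case it helps, here is a minimal base-R sketch of how the choice of
gene universe alone can change "Size" (and with it ExpCount and the
p-value) for one and the same term. It is not taken from the vignette
and it does not call GOstats: the gene IDs, the two universes and the
helper go_row() are all made up for illustration, and I am assuming the
usual over-representation setup, where ExpCount is (number of selected
genes) * Size / (universe size) and the p-value is the upper tail of
the hypergeometric distribution.

```r
# Toy data: hypothetical gene IDs, not real annotation.
term_genes <- paste0("g", 1:40)     # genes annotated at one GO term
universe_a <- paste0("g", 1:100)    # one choice of gene universe
universe_b <- paste0("g", 21:300)   # a different, larger universe
selected   <- paste0("g", 25:44)    # selected genes, present in both universes

# Hypothetical helper: one row of a summary-like table for a single term.
go_row <- function(selected, universe, term_genes) {
  selected <- intersect(selected, universe)            # drop genes outside the universe
  size     <- length(intersect(term_genes, universe))  # "Size": term genes within the universe
  count    <- length(intersect(selected, term_genes))  # "Count": selected genes at the term
  n_sel    <- length(selected)
  n_univ   <- length(universe)
  # Upper-tail hypergeometric probability P(X >= count)
  pvalue <- phyper(count - 1, size, n_univ - size, n_sel, lower.tail = FALSE)
  data.frame(Pvalue = pvalue, ExpCount = n_sel * size / n_univ,
             Count = count, Size = size)
}

row_a <- go_row(selected, universe_a, term_genes)
row_b <- go_row(selected, universe_b, term_genes)
print(rbind(A = row_a, B = row_b))
```

With these toy inputs the term has Size 40 in the first universe and
Size 20 in the second, although the term and the selected genes are
identical. So a Size that differs between two runs would suggest that
the two runs did not use the same universe.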
Marc
gregory voisin wrote:
> Hi,
>
> I need a clarification about the "Size" column in the output of hyperGTest (GOstats package).
>
> In https://stat.ethz.ch/pipermail/bioconductor/2006-December/015346.html
> we can read: "The "Size" column is the number of genes annotated at the given GO
> term (where genes are restricted to the defined gene universe)"
> Hence, for a given term and a given platform, this number should be constant.
>
> To explain:
> The first data set, A, contains 687 probesets.
> I ran a hyperGTest on it.
> This is an extract from the result:
>    GOBPID     Pvalue       OddsRatio ExpCount   Count Size Term
> 36 GO:0008283 0.0180640706 1.913970  8.20236088 15    266  cell proliferation
>
> If I understand correctly, 266 probesets on the Affy HG-U133 Plus 2 array are annotated "cell proliferation".
>
>
> Then,
>
> I ran the same analysis on a second set (B), inclusive of A: 414 probesets.
>
> Result:
>    GOBPID     Pvalue      OddsRatio ExpCount    Count Size Term
> 20 GO:0008283 0.008295992 1.765957  14.97834828 25    745  cell proliferation
>
> Here, this means that 745 probesets are annotated "cell proliferation".
>
>
>
> Why is the Size for the same term not the same in the two analyses?
>
> Moreover, since B is inclusive of A, the 25 probesets annotated "cell proliferation" found in the B analysis are reduced to 15 probesets in the A analysis. Normally, in the A analysis, I should have at least 25 probesets annotated "cell proliferation".
>
> Why didn't I find at least 25 probesets in the A analysis?
>
>
>
>
> Thanks
> Greg
>
>
>
>> sessionInfo()
>>
> R version 2.8.1 (2008-12-22)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=French_Canada.1252;LC_CTYPE=French_Canada.1252;LC_MONETARY=French_Canada.1252;LC_NUMERIC=C;LC_TIME=French_Canada.1252
>
> attached base packages:
> [1] splines tools stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] GOstats_2.8.0 Category_2.8.4 genefilter_1.22.0 survival_2.34-1 RBGL_1.18.0 annotate_1.20.1 xtable_1.5-4
> [8] graph_1.20.0 GO.db_2.2.5 hgu133plus2.db_2.2.5 RSQLite_0.7-1 DBI_0.2-4 AnnotationDbi_1.4.3 Biobase_2.2.2
>
> loaded via a namespace (and not attached):
> [1] cluster_1.11.11 GSEABase_1.4.0 XML_1.99-0
>
>
>
>