Hi Christine,

So from just looking at the code for Category, it looks like you might 
want to consult with the accessors for universeCounts(), 
expectedCounts() and geneCounts() etc. to get at some of your curiosity 
for these results.

IOW, when you pass your hyperGResult object to them like (for example):

universeCounts(OverCC)
expectedCounts(OverCC)
geneCounts(OverCC)

They should each tell you something about your results.  Anyhow, you can 
see the manual page for this whole family of accessors like this:

?universeCounts

I hope this helps you get sorted out what you are curious about,

  Marc



On 03/19/2014 05:07 AM, Christine Gläßer wrote:
> Dear all,
>
> I'm analyzing a set of genes for GO enrichment, given a customized
> annotation. In that annotation, a specific GO term occurs only once; in
> my set of genes, the gene attached to it is also represented.
>
> I would have expected to find the numbers -- genes having a specific GO
> term in my list being "1" and genes having the term in total, in the
> annotation, being "1" as well -- in "count" and "size", however, 13 and
> 62 genes are denoted. Please have a look at my script and the session info:
>
>
>
> SessionInfo()
>
> R version 3.0.2 (2013-09-25)
>
> Platform: x86_64-redhat-linux-gnu (64-bit)
>
> locale:
>
> [1] LC_CTYPE=de_DE.utf8 LC_NUMERIC=C
>
> [3] LC_TIME=de_DE.utf8 LC_COLLATE=de_DE.utf8
>
> [5] LC_MONETARY=de_DE.utf8 LC_MESSAGES=de_DE.utf8
>
> [7] LC_PAPER=de_DE.utf8 LC_NAME=C
>
> [9] LC_ADDRESS=C LC_TELEPHONE=C
>
> [11] LC_MEASUREMENT=de_DE.utf8 LC_IDENTIFICATION=C
>
> attached base packages:
>
> [1] parallel stats graphics grDevices utils datasets methods
>
> [8] base
>
> other attached packages:
>
> [1] GO.db_2.9.0 GSEABase_1.22.0 annotate_1.38.0
>
> [4] GOstats_2.26.0 RSQLite_0.11.4 DBI_0.2-7
>
> [7] graph_1.38.3 Category_2.26.0 AnnotationDbi_1.22.6
>
> [10] Biobase_2.20.1 BiocGenerics_0.6.0
>
> loaded via a namespace (and not attached):
>
> [1] AnnotationForge_1.2.2 genefilter_1.42.0 IRanges_1.18.4
>
> [4] RBGL_1.36.2 splines_3.0.2 stats4_3.0.2
>
> [7] survival_2.37-7 tcltk_3.0.2 tools_3.0.2
>
> [10] XML_3.98-1.1 xtable_1.7-1
>
>
> ## Script
>
> library(GOstats)
>
> library(GSEABase)
>
> library(GO.db)
>
>
> ANNO_FILE <- "genes_anno_forgostats.txt"
>
> UNIVERSE_FILE <-"universe.txt"
>
> ORG <- "MY_Organism"
>
> GENE_FILE<-"targets.txt"
>
> OUT_FILE<-"results.txt"
>
>
> goframeData <- read.table(ANNO_FILE,header=T)
>
> universe <- scan(file=UNIVERSE_FILE,what="character")
>
> goFrame = GOFrame(goframeData, organism = ORG)
>
> goAllFrame = GOAllFrame(goFrame)
>
> gsc <- GeneSetCollection(goAllFrame, setType = GOCollection())
>
>
> genes <- scan(file=GENE_FILE,what="character")
>
>
> NAMESPACE="MF"
>
> x = length(NAMESPACE)
>
> params <- GSEAGOHyperGParams(name="custom",geneSetCollection = gsc,
> geneIds = genes, universeGeneIds = universe, ontology = NAMESPACE,
> pvalueCutoff = 0.05, conditional=TRUE, testDirection="over")
>
> OverMF <- hyperGTest(params)
>
>
> NAMESPACE="BP"
>
> x = length(NAMESPACE)
>
> params <- GSEAGOHyperGParams(name="custom",geneSetCollection = gsc,
> geneIds = genes, universeGeneIds = universe, ontology = NAMESPACE,
> pvalueCutoff = 0.05, conditional=TRUE, testDirection="over")
>
> OverBP <- hyperGTest(params)
>
>
> NAMESPACE="CC"
>
> x = length(NAMESPACE)
>
> params <- GSEAGOHyperGParams(name="custom",geneSetCollection = gsc,
> geneIds = genes, universeGeneIds = universe, ontology = NAMESPACE,
> pvalueCutoff = 0.05, conditional=TRUE, testDirection="over")
>
> OverCC <- hyperGTest(params)
>
>
> write.table(summary(OverMF),file=OUTFILE,quote=F,sep='\t', append=T)
>
> write.table(summary(OverBP),file=OUTFILE,quote=F,sep='\t', append=T)
>
> write.table(summary(OverCC),file=OUTFILE,quote=F,sep='\t', append=T)
>
>
>
>
> If I run GOStats unconditionally, other GO terms being parent terms to
> the one I'm looking at also pop up. The numbers (count, size) are the
> same for all terms; my assumption is that genes with child terms
> belonging to a specific parent term are summed up in count/size, but I'm
> absolutely not sure if this is the case... Could you please help me
> figuring out what is denoted by the numbers in "counts" and "size", if
> not the absolute numbers like I would have expected?
>
>
> Kind regards, and many thanks in advance,
>
> Christine Gläßer
>
>
> 	[[alternative HTML version deleted]]
>
>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor@r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor


	[[alternative HTML version deleted]]

