[BioC] GOstats hyperGTest question
Seth Falcon
sfalcon at fhcrc.org
Fri Jan 26 05:59:35 CET 2007
Hi Ivan,
ivan.borozan at utoronto.ca writes:
> I got following results using hyperGTest(params) with a given list of genes
>
>> summary(hgOver)
>       GOBPID       Pvalue   OddsRatio   ExpCount Count Size
> 1 GO:0030185 0.000000e+00  -73.314685 0.02692165     2    1
> 2 GO:0006067 0.000000e+00 -110.746479 0.05384330     3    2
> 3 GO:0006069 0.000000e+00 -110.746479 0.05384330     3    2
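Hmm, that is a suspect result. One would expect Size >= Count, since
Count is the number of selected genes annotated at a term and Size is
the number of genes in the universe annotated at it. A negative odds
ratio should not be possible either. In the current devel version of
Category and GOstats, I have added code to verify that the selected
gene list (geneIds) and the gene universe do not contain any
duplicates. Could you verify that your input does not contain
duplicate IDs, either in the selected list or in the universe?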
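A minimal sketch of such a duplicate check in base R. The vectors
`selected` and `universe` are hypothetical stand-ins for whatever you
passed as the selected gene list and the universe; the IDs in them are
made up for illustration.

```r
# Hypothetical inputs: Entrez IDs as character vectors
selected <- c("3043", "3043", "2157")
universe <- c("3043", "2157", "2157", "7450", "5624")

# anyDuplicated() returns 0 when there are no duplicates,
# otherwise the index of the first duplicated element
anyDuplicated(selected)
anyDuplicated(universe)

# List the IDs that occur more than once
dup_selected <- unique(selected[duplicated(selected)])
dup_universe <- unique(universe[duplicated(universe)])

# Deduplicated versions that could be used as input instead
selected <- unique(selected)
universe <- unique(universe)
```

If either `anyDuplicated()` call returns a non-zero value, the input
contains repeated IDs, which could explain a Count larger than Size.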
> If for example I look at genes that are associated with the first GO
> term (i.e GO:0030185) I get:
>
>
>> probeSetSummary(hgOver)[[1]]
>   EntrezID ProbeSetID selected
> 1     3043     144221        0
> 2     3043     148425        0
> 3     3043    3108408        0
> 4     3043    5708746        0
This is, of course, also surprising, but it is difficult to assess
what is going on without knowing more details of what data you used as
input. Are you sure that all Entrez IDs in geneIds(params) are
represented by at least one probe set on the chip?
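One way to check this, sketched in base R. Here `probe2entrez` is a
hypothetical named vector mapping probe set IDs to Entrez IDs (in
practice it would come from the chip's annotation data), and
`selected` is a hypothetical selected gene list. The first two probe
set IDs are taken from the output above; everything else is made up.

```r
# Hypothetical probe set -> Entrez ID map (names are probe set IDs)
probe2entrez <- c("144221" = "3043",
                  "148425" = "3043",
                  "900001" = "2157")

# Hypothetical selected gene list
selected <- c("3043", "2157", "7450")

# Selected Entrez IDs that have no probe set on the chip
missing_ids <- setdiff(selected, probe2entrez)
missing_ids

# TRUE only if every selected ID is covered by at least one probe set
all_covered <- all(selected %in% probe2entrez)
```

Any ID that shows up in `missing_ids` is in the selected list but
cannot be matched to a probe set.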
> My question is how are Counts (in this case Count = 2) in the above
> summary(hgOver) table obtained ?
The details are in the code, but the intention is that Count is the
number of Entrez IDs in the intersection of the selected gene list
with the Entrez IDs annotated at the given GO term.
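As an illustration of that intention only (this is not the package's
actual code), here is a base R sketch. All three vectors are
hypothetical, and restricting the term's annotations to the universe
is an assumption of the sketch.

```r
# Hypothetical inputs, all unique Entrez IDs
selected  <- c("3043", "2157", "7450")
universe  <- c("3043", "2157", "7450", "5624", "1234")
annotated <- c("3043", "7450", "5624", "9999")  # IDs annotated at one GO term

# Assumption: only annotations that fall inside the universe are counted
annotated_in_universe <- intersect(annotated, universe)

# Size: universe genes annotated at the term
size <- length(annotated_in_universe)

# Count: selected genes annotated at the term
count <- length(intersect(selected, annotated_in_universe))

# With unique IDs and selected drawn from the universe, this must hold
stopifnot(count <= size)
```

With these made-up inputs `size` is 3 and `count` is 2. When the
selected list is a duplicate-free subset of the universe, Count can
never exceed Size.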
> Looking at probeSetSummary(hgOver)[[1]] I can see one EntrezID
> (EntrezID = 3043) and 4 ProbeSetID associated with this particular
> node (i.e GO:0030185).
That just tells you that there are 4 probesets that interrogate Entrez
ID 3043. The count in the hyperGTest result tells you that 2 Entrez
IDs from the selected gene list are in the list of genes annotated at
GO:0030185.
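To make the distinction concrete, here is a small base R sketch that
rebuilds the four rows shown above as a data frame (the name
`probe_summary` is just for illustration) and counts probe sets
versus distinct Entrez IDs.

```r
# The four rows from the probeSetSummary output quoted above
probe_summary <- data.frame(
  EntrezID   = c("3043", "3043", "3043", "3043"),
  ProbeSetID = c("144221", "148425", "3108408", "5708746"),
  selected   = c(0, 0, 0, 0),
  stringsAsFactors = FALSE
)

# Number of probe sets per Entrez ID
probes_per_gene <- table(probe_summary$EntrezID)
probes_per_gene

# Number of distinct Entrez IDs in the table
n_genes <- length(unique(probe_summary$EntrezID))
```

The table has four probe sets but only one distinct Entrez ID, so the
number of rows says nothing about Count.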
I have added a considerable amount of detail to the GOstats vignette
in the current devel repository and I would suggest reading over it:
http://www.bioconductor.org/packages/1.9/bioc/html/GOstats.html
+ seth