[BioC] new topGO results using GO.db very different from old ones using GO
Adrian Alexa
adrian.alexa at gmail.com
Fri May 2 17:07:08 CEST 2008
Hi Joern,
this seems like a bug in topGO. The GO IDs are the same in both cases, but
the names are wrong in the second table. GO:0007610 is not
"reproduction" but "behavior". At first sight it looks like a bug in
the GenTable function, but I need to look more closely. There will be
an update soon.
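The mismatch can be made explicit with a small check on the two tables quoted below. This is only a minimal sketch in base R: the term names are copied by hand from the quoted output, and `compare_terms` is a hypothetical helper written for this illustration, not a topGO function.

```r
# Terms reported per GO ID by the older run (topGO 1.4.0 with GO 2.0.1),
# copied by hand from the first table in the quoted message.
old_terms <- c(
  "GO:0007268" = "synaptic transmission",
  "GO:0007610" = "behavior",
  "GO:0007409" = "axonogenesis",
  "GO:0006887" = "exocytosis",
  "GO:0007420" = "brain development"
)

# Terms reported by the newer run (topGO 1.9.0 with GO.db 2.2.0),
# copied by hand from the second table.
new_terms <- c(
  "GO:0007268" = "mitochondrial genome maintenance",
  "GO:0007610" = "reproduction",
  "GO:0007409" = "single strand break repair",
  "GO:0006887" = "regulation of DNA recombination",
  "GO:0007420" = "regulation of mitotic recombination"
)

# Hypothetical helper: line up two ID -> term vectors by GO ID and
# flag the IDs whose term names agree.
compare_terms <- function(a, b) {
  ids <- intersect(names(a), names(b))
  data.frame(
    GO.ID = ids,
    old   = unname(a[ids]),
    new   = unname(b[ids]),
    same  = unname(a[ids] == b[ids]),
    stringsAsFactors = FALSE
  )
}

cmp <- compare_terms(old_terms, new_terms)
print(cmp)

# With GO.db installed, a single name can also be looked up directly,
# e.g. Term(GOTERM[["GO:0007610"]]), to see which table is right.
```

All five GO IDs appear in both tables, yet none of the term names agree, which points at the ID-to-name lookup rather than at the test itself.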
Thanks for the report, and I apologize for the bug,
Adrian
On Fri, May 2, 2008 at 2:48 PM, Joern Toedling <toedling at ebi.ac.uk> wrote:
> Dear all,
> I would appreciate any suggestion on the following issue. I have noticed a
> major inconsistency between new and older topGO results. For the older ones,
> topGO used the "GO" package, while it uses "GO.db" for the new results. I
> can't figure out whether it is a problem with topGO only or whether there
> are some serious inconsistencies between GO and GO.db.
>
> Here is the source code I used:
>
> library("topGO")
>
> ## load list of genes of interest
>
> load("brainOnlyGenes.RData")
>
> ## load general gene-to-GO mapping and universe of genes to use in analysis:
>
> load("mm9gene2GO.RData")
>
> load("arrayGenesWithGO.RData")
>
> ## then the function to call topGO and to return a nice result table:
>
> sigGOTable <- function(selGenes, GOgenes=arrayGenesWithGO,
> gene2GO=mm9.gene2GO[arrayGenesWithGO], ontology="BP", maxP=0.001)
>
> {
>
> inGenes <- factor(as.integer(GOgenes %in% selGenes))
>
> names(inGenes) <- GOgenes
>
> GOdata <- new("topGOdata", ontology=ontology, allGenes=inGenes,
> annot=annFUN.gene2GO, gene2GO=gene2GO)
>
> myTestStat <- new("elimCount", testStatistic=GOFisherTest,
> name="Fisher test", cutOff=maxP)
>
> mySigGroups <- getSigGroups(GOdata, myTestStat)
>
> sTab <- GenTable(GOdata, mySigGroups, topNodes=length(usedGO(GOdata)))
>
> names(sTab)[length(sTab)] <- "p.value"
>
> return(subset(sTab, as.numeric(p.value) < maxP))
>
> }
>
> ## call it:
>
> (brainRes <- sigGOTable(brainOnlyGenes))
>
> # with topGO_1.4.0 using GO_2.0.1
>
> # this is:
>
> #        GO.ID                  Term Annotated Significant Expected p.value
> # 1 GO:0007268 synaptic transmission       136          44    24.46 3.0e-05
> # 2 GO:0007610              behavior       180          54    32.38 4.4e-05
> # 3 GO:0007409          axonogenesis       119          38    21.41 0.00014
> # 4 GO:0006887            exocytosis        40          17     7.20 0.00026
> # 5 GO:0007420     brain development       136          40    24.46 0.00066
>
>
> # which kind of makes sense as an annotation of a list of interesting genes when investigating brain cells
>
> ## now, unfortunately, using the very same gene list, universe, and gene-to-GO mapping, and the same function as above
>
> ## with topGO_1.9.0 using GO.db_2.2.0, the result is:
>
> #        GO.ID                                Term Annotated Significant Expected p.value
> # 1 GO:0007268    mitochondrial genome maintenance       137          44    24.65 3.7e-05
> # 2 GO:0007610                        reproduction       180          54    32.39 4.4e-05
> # 3 GO:0007409          single strand break repair       119          38    21.41 0.00014
> # 4 GO:0006887     regulation of DNA recombination        40          17     7.20 0.00026
> # 5 GO:0007420 regulation of mitotic recombination       136          40    24.47 0.00066
>
>
> # which is obviously very, very different
>
>
> Does anyone have an educated guess what is going on? Could it be a bug in
> topGO? Or is the information in GO.db really different from the one in GO,
> and in that case, which one is the right one?
>
> Best regards,
> Joern