[BioC] hyperGTest on KEGG and PFAM with org.XX.eg annotations

James F. Reid james.reid at ifom-ieo-campus.it
Fri Nov 21 11:34:19 CET 2008


Dear list,

hyperGTest behaves differently with org.XX.eg.db packages than with 
microarray-based ones, such as hgu95av2.db, when doing a KEGG 
analysis. hyperGTest complains if the annotation string does not end 
with the suffix ".db". It works if you add the suffix, but then 
summary() fails on the result. A quick fix is to re-assign the 
".db"-less string to the annotation slot of the hyperGTest result.
So I am wondering whether I am doing something wrong or whether this is a bug.
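
The quick fix could be wrapped in a small helper. This is only a sketch in base R: the name stripDbSuffix is hypothetical and not part of Category, and it assumes the only difference between the two annotation strings is a trailing ".db".

```r
## Hypothetical helper (not part of Category): drop a trailing ".db"
## from an annotation string and leave any other string unchanged.
stripDbSuffix <- function(annotation) {
    sub("\\.db$", "", annotation)
}

stripDbSuffix("org.Hs.eg.db")   # suffix removed
stripDbSuffix("org.Hs.eg")      # already suffix-free, returned as is
```

With a result object such as hgKEGG from the session further down, the re-assignment before summary() would then read hgKEGG@annotation <- stripDbSuffix(hgKEGG@annotation).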

For the PFAM analysis everything works fine except that in the summary 
output the Term (Description) is just the PFAMID which is not very 
useful for interpretation. I think this could easily be fixed by using 
the same approach as for the KEGG output in the PFAMHyperGResult summary 
method:
## implicit require("PFAM.db")
pfamEnv <- getAnnMap("DE", "PFAM", load=TRUE)
pfamTerms <- unlist(mget(pfamIds, pfamEnv, ifnotfound=NA))
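
To illustrate the intended effect without loading PFAM.db, here is a minimal base R sketch of the lookup. The function lookupTerms, the named vector descMap and the placeholder IDs are all hypothetical stand-ins for the real PFAM description map, and falling back to the ID when no description is found is my assumption, not what the summary method currently does.

```r
## Hypothetical sketch: map IDs to descriptions via a named character
## vector, falling back to the ID itself when no description exists.
lookupTerms <- function(ids, descMap) {
    terms <- unname(descMap[ids])      # NA for IDs missing from descMap
    ifelse(is.na(terms), ids, terms)   # keep the ID when nothing is found
}

## Placeholder data, not real PFAM entries
descMap <- c(PF_A = "description A", PF_B = "description B")
lookupTerms(c("PF_A", "PF_X", "PF_B"), descMap)
```

In the summary method, descMap would be replaced by the result of the mget() call above.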


Many thanks,
James.

Here is a session reporting the problem:

library("Category")
library("org.Hs.eg.db")

set.seed(123)
geneBackground <- Lkeys(org.Hs.egPATH)
geneList <- sample(geneBackground, 500)

params <- new("KEGGHyperGParams",
               geneIds = geneList,
               universeGeneIds = geneBackground,
               annotation = "org.Hs.eg")
hgKEGG <- hyperGTest(params)
#  Error in get(paste(lib, name, sep = "")) :
#    variable "org.Hs.egPATH2PROBE" was not found

params@annotation <- "org.Hs.eg.db"
hgKEGG <- hyperGTest(params)
summary(hgKEGG)
#  Error in get(paste(annotation(object), "ORGANISM", sep = "")) :
#    variable "org.Hs.eg.dbORGANISM" was not found

hgKEGG@annotation <- "org.Hs.eg"
summary(hgKEGG)
#  KEGGID      Pvalue OddsRatio  ExpCount Count Size
#1  05130 0.003282103  7.314332 0.6239536     4   51
#2  05131 0.003282103  7.314332 0.6239536     4   51
#                                          Term
#1 Pathogenic Escherichia coli infection - EHEC
#2 Pathogenic Escherichia coli infection - EPEC


 > sessionInfo()
R version 2.8.0 (2008-10-20)
i486-pc-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] splines   tools     stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
  [1] KEGG.db_2.2.5       org.Hs.eg.db_2.2.6  RSQLite_0.7-1
  [4] DBI_0.2-4           Category_2.8.1      genefilter_1.22.0
  [7] survival_2.34-1     annotate_1.20.1     xtable_1.5-4
[10] AnnotationDbi_1.4.1 graph_1.20.0        Biobase_2.2.1

loaded via a namespace (and not attached):
[1] cluster_1.11.11 GSEABase_1.4.0  RBGL_1.18.0     XML_1.98-1


