[BioC] GOHyperGResult in a data.frame, possible NAMESPACE problem?

Robert Castelo robert.castelo at upf.edu
Mon Jan 11 20:05:19 CET 2010


dear list,

i have the following GO reporting function. it builds a data frame from
the output of summary() on a GOHyperGResult object, obtained by testing
for GO enrichment with the package GOstats, and adds the gene symbols
and entrez ids of the genes that provide enrichment to each category,
all ordered by decreasing odds ratio of enrichment. the third argument
allows one to produce latex code that highlights genes of interest in
bold face; it can be set to NULL to avoid that:

GOreport <- function(goHypGresult, chip, highlightedGenes=NULL) {

  ## entrez ids of the genes behind each significant category
  cats <- sigCategories(goHypGresult)
  reportGenes <- vector()
  for (i in seq_along(cats)) {
    reportGenes <- append(reportGenes,
                          geneIdsByCategory(goHypGresult, cats[i]))
  }
  ## gene symbols per category, sorted, optionally in latex bold face
  reportGeneSyms <- sapply(reportGenes, function(egIDs) {
    syms <- as.vector(unlist(mget(egIDs, getAnnMap(map="SYMBOL", chip=chip,
                                                   type="db"))))
    ## reorder the ids together with the symbols so that egIDs[i] still
    ## corresponds to syms[i] after sorting
    o <- order(syms)
    syms <- syms[o]
    egIDs <- egIDs[o]
    syms <- sapply(seq_along(egIDs),
                   function(i, egIDs) {
                     s <- syms[i]
                     if (!is.null(highlightedGenes) &&
                         !is.na(match(egIDs[i],
                                      highlightedGenes))) {
                       s <- sprintf("{\\bf %s}", s)
                     }
                     s
                   }, egIDs)
    paste(syms, collapse=", ")
  })
  ## comma-separated entrez ids per category
  reportGenes <- sapply(reportGenes, function(x) {
                                       paste(x, collapse=",")
                                     })
  report <- data.frame(summary(goHypGresult), GeneSyms=reportGeneSyms,
                       Genes=reportGenes)
  rownames(report) <- NULL
  ## order the rows by decreasing odds ratio
  report[order(report$OddsRatio, decreasing=TRUE), ]
}
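
to make the highlighting step easier to follow on its own, here is a minimal base R sketch of it. formatSyms is a hypothetical helper written only for this illustration, it is not part of GOreport or of any package, and the symbols and entrez ids are the ones that appear in the output further down. it sorts the symbols, keeps the ids aligned with them, and wraps the symbols of the highlighted ids in latex bold face:

```r
## hypothetical helper, for illustration only
formatSyms <- function(syms, egIDs, highlightedGenes = NULL) {
  ## sort by symbol and keep the entrez ids aligned with the symbols
  o <- order(syms)
  syms <- syms[o]
  egIDs <- egIDs[o]
  ## %in% gives all FALSE when highlightedGenes is NULL
  hit <- egIDs %in% highlightedGenes
  syms[hit] <- sprintf("{\\bf %s}", syms[hit])
  paste(syms, collapse = ", ")
}

syms <- c("TCF15", "AFG3L2", "RAG1")
egIDs <- c("6939", "10939", "5896")

highlighted <- formatSyms(syms, egIDs, highlightedGenes = "5896")
plain <- formatSyms(syms, egIDs, NULL)
```

with highlightedGenes=NULL nothing is wrapped, which is the case used in the call that fails below.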


this function forms part of a package i'm building, and when i use it
as follows it gives the error shown below:

library(annotate)
library(org.Hs.eg.db)
library(GOstats)
library(myPkg) ## this would be the package where GOreport resides

## use the genes from the death GO category as test
deathEGs <- org.Hs.egGO2EG[["GO:0016265"]]
## sample 100 genes randomly and add the death genes as universe
set.seed(123)
universeEGs <- unique(c(sample(mappedLkeys(org.Hs.egGO2EG), size=100),
                        deathEGs))

## test for GO enrichment
goHypGparams <- new("GOHyperGParams",
                    geneIds=deathEGs,
                    universeGeneIds=universeEGs,
                    annotation="org.Hs.eg.db",
                    ontology="BP",
                    pvalueCutoff=0.05,
                    conditional=TRUE,
                    testDirection="over")
goHypGcond <- hyperGTest(goHypGparams)

## call the problematic function
report <- GOreport(goHypGcond, "org.Hs.eg.db", NULL)
Error in do.call("expand.grid", dimnames(x)) : 
  second argument must be a list
> traceback()
9: stop("second argument must be a list")
8: do.call("expand.grid", dimnames(x))
7: data.frame(do.call("expand.grid", dimnames(x)), Freq = c(x), 
       row.names = row.names)
6: eval(expr, envir, enclos)
5: eval(ex)
4: as.data.frame.table(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors)
3: as.data.frame(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors)
2: data.frame(summary(goHypGresult), GeneSyms = reportGeneSyms, 
       Genes = reportGenes)
1: GOreport(goHypGcond, "org.Hs.eg.db", NULL)

so there is something wrong with the instruction that builds the
data.frame at the very end of the function.

however, if i paste the function into the R shell and call it again, it
works smoothly:

GOreport(goHypGcond, "org.Hs.eg.db", NULL)
      GOBPID       Pvalue OddsRatio  ExpCount Count Size
1 GO:0016265 7.966717e-05       Inf 0.4500000     4    9
3 GO:0032501 6.718598e-03       Inf 1.2000000     4   24
6 GO:0050896 3.279279e-02       Inf 0.3733333     2   14
2 GO:0007517 5.598832e-03  75.00000 0.1500000     2    3
4 GO:0007610 1.802312e-02  24.33333 0.2500000     2    5
5 GO:0048731 1.956272e-02  16.00000 0.7500000     3   15
7 GO:0009653 4.784456e-02  11.66667 0.4000000     2    8
                                Term                     GeneSyms
1                              death AFG3L2, RAG1, SLC18A2, TCF15
3   multicellular organismal process AFG3L2, RAG1, SLC18A2, TCF15
6               response to stimulus                 AFG3L2, RAG1
2                 muscle development                AFG3L2, TCF15
4                           behavior               SLC18A2, TCF15
5                 system development          AFG3L2, RAG1, TCF15
7 anatomical structure morphogenesis                AFG3L2, TCF15
                 Genes
1 10939,5896,6571,6939
3 10939,5896,6571,6939
6           10939,5896
2           10939,6939
4            6571,6939
5      10939,5896,6939
7           10939,6939

this is really puzzling to me and i suspect that i'm confronted with
some general problem related to namespaces, so here are the contents of
the NAMESPACE file of the package myPkg that contains the GOreport
function:

exportPattern("^[[:alpha:]]+")
importFrom(annotate, getAnnMap)
importFrom(AnnotationDbi, mget)
importFrom(IRanges, unique)
importMethodsFrom(GOstats)

and these are the contents of the DESCRIPTION file:

Package: myPkg
Type: Package
Title: What the package does (short line)
Version: 1.0
Date: 2010-01-11
Author: Who wrote it
Description: More about what it does (maybe more than one line)
Depends: methods, annotate, Biobase (>= 2.4.1), AnnotationDbi, GOstats,
        IRanges
Imports: methods, annotate, Biobase (>= 2.4.1), AnnotationDbi, GOstats,
        IRanges
Maintainer: Who to complain to <yourfault at somewhere.net>
License: What license is it under?

i guess this is difficult to reproduce, since if you simply paste the
function it is going to work for you and the problem arises only within
the context of a package, but any hint on the possible reason, related
to the namespace or whatever else you think, will be highly appreciated.
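
to illustrate the kind of namespace effect i have in mind, here is a minimal sketch that needs only base R and the methods package. i have not verified that this is what actually happens inside myPkg. the class Res and the functions atPrompt and inPkg are made up for the illustration, and the environment trick only roughly mimics code in a namespace that reaches the base package before the S4 generic for summary():

```r
library(methods)

## toy S4 class with an S4 summary() method; "Res" is a made-up
## stand-in, not a GOstats class
setClass("Res", representation(p = "numeric"))
setMethod("summary", "Res",
          function(object, ...) data.frame(Pvalue = object@p))
res <- new("Res", p = c(0.01, 0.2))

## defined at the prompt: summary() resolves to the S4 generic that
## setMethod() created in the global environment
atPrompt <- function(x) class(summary(x))

## same body, but its enclosing environment reaches the base package
## before the global environment, so summary() resolves to
## base::summary and the S4 method is never dispatched
inPkg <- atPrompt
environment(inPkg) <- new.env(parent = baseenv())

fromPrompt <- atPrompt(res)  # class of the S4 method's result
fromPkg <- inPkg(res)        # class of what base::summary returns
```

in this toy case the first call gets the data frame built by the S4 method, while the second one falls back to the default summary and gets something that is not a data frame.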

thanks!
robert.

sessionInfo()
R version 2.9.1 (2009-06-26) 
x86_64-unknown-linux-gnu 

locale:
C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] GO.db_2.2.11                   myPkg_1.0
 [3] IRanges_1.2.3                  GOstats_2.10.0                
 [5] graph_1.22.3                   Category_2.10.1               
 [7] org.Hs.eg.db_2.2.11            RSQLite_0.7-2                 
 [9] DBI_0.2-4                      annotate_1.22.0               
[11] AnnotationDbi_1.6.1            Biobase_2.4.1                 

loaded via a namespace (and not attached):
[1] GSEABase_1.6.1    RBGL_1.20.0       XML_2.6-0         genefilter_1.24.2
[5] splines_2.9.1     survival_2.35-7   tools_2.9.1       xtable_1.5-6


