[BioC] GOHyperGResult in a data.frame, possible NAMESPACE problem?
Robert Castelo
robert.castelo at upf.edu
Mon Jan 11 20:05:19 CET 2010
dear list,
i have the following GO reporting function that builds a data frame with
the output of the call to summary() on a GOHyperGResult object resulting
from testing for GO enrichment with the package GOstats, plus the gene
symbols and entrez ids of the genes that provide enrichment to each
category, all nicely ordered by odds ratio of enrichment. the third
argument allows one to produce latex code that highlights genes of
interest in bold face; it can be set to NULL to avoid that:
GOreport <- function(goHypGresult, chip, highlightedGenes=NULL) {
  cats <- sigCategories(goHypGresult)

  ## entrez ids of the genes behind each significant category
  reportGenes <- vector()
  for (i in seq_along(cats)) {
    reportGenes <- append(reportGenes,
                          geneIdsByCategory(goHypGresult, cats[i]))
  }

  ## gene symbols per category, sorted, optionally in latex bold face
  reportGeneSyms <- sapply(reportGenes, function(egIDs) {
    syms <- as.vector(unlist(mget(egIDs, getAnnMap(map="SYMBOL",
                                                   chip=chip,
                                                   type="db"))))
    ## sort the symbols but keep the ids aligned with them, so that
    ## the highlighting below is applied to the right gene
    ord <- order(syms)
    syms <- syms[ord]
    egIDs <- egIDs[ord]
    syms <- sapply(seq_along(egIDs),
                   function(i, egIDs) {
                     s <- syms[i]
                     if (!is.null(highlightedGenes) &&
                         !is.na(match(egIDs[i], highlightedGenes))) {
                       s <- sprintf("{\\bf %s}", s)
                     }
                     s
                   }, egIDs)
    paste(syms, collapse=", ")
  })

  ## entrez ids per category as a single comma-separated string
  reportGenes <- sapply(reportGenes, function(x) {
    paste(x, collapse=",")
  })

  report <- data.frame(summary(goHypGresult), GeneSyms=reportGeneSyms,
                       Genes=reportGenes)
  rownames(report) <- NULL

  ## order the rows by decreasing odds ratio
  report[sort(report$"OddsRatio", decreasing=TRUE,
              index.return=TRUE)$ix, ]
}
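as a small illustration of the highlighting step on its own, here is a sketch with toy vectors; the ids and symbols are the ones that appear in the report further down, while the names ids, syms, highlighted and marked are only used here and are not part of the package:

```r
## toy input: entrez ids and their gene symbols, taken from the report output
ids  <- c("10939", "5896", "6571")
syms <- c("AFG3L2", "RAG1", "SLC18A2")

## hypothetical set of genes to highlight
highlighted <- c("5896")

## wrap the symbols of highlighted genes in latex bold face
marked <- ifelse(ids %in% highlighted, sprintf("{\\bf %s}", syms), syms)

## one comma-separated string, as stored in the GeneSyms column
highlightedSyms <- paste(marked, collapse=", ")
cat(highlightedSyms, "\n")
```

with highlighted set to character(0) no symbol is wrapped, which corresponds to passing NULL as the third argument.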
this function forms part of a package i'm building, and when i use it as
follows it gives the error shown below:
library(annotate)
library(org.Hs.eg.db)
library(GOstats)
library(myPkg) ## this would be the package where GOreport resides
## use the genes from the death GO category as test
deathEGs <- org.Hs.egGO2EG[["GO:0016265"]]
## sample 100 genes randomly and add the death genes as universe
set.seed(123)
universeEGs <- unique(c(sample(mappedLkeys(org.Hs.egGO2EG), size=100),
                        deathEGs))
## test for GO enrichment
goHypGparams <- new("GOHyperGParams",
                    geneIds=deathEGs,
                    universeGeneIds=universeEGs,
                    annotation="org.Hs.eg.db",
                    ontology="BP",
                    pvalueCutoff=0.05,
                    conditional=TRUE,
                    testDirection="over")
goHypGcond <- hyperGTest(goHypGparams)
## call the problematic function
report <- GOreport(goHypGcond, "org.Hs.eg.db", NULL)
Error in do.call("expand.grid", dimnames(x)) :
  second argument must be a list
> traceback()
9: stop("second argument must be a list")
8: do.call("expand.grid", dimnames(x))
7: data.frame(do.call("expand.grid", dimnames(x)), Freq = c(x),
       row.names = row.names)
6: eval(expr, envir, enclos)
5: eval(ex)
4: as.data.frame.table(x[[i]], optional = TRUE, stringsAsFactors =
       stringsAsFactors)
3: as.data.frame(x[[i]], optional = TRUE, stringsAsFactors =
       stringsAsFactors)
2: data.frame(summary(goHypGresult), GeneSyms = reportGeneSyms,
       Genes = reportGenes)
1: GOreport(goHypGcond, "org.Hs.eg.db", NULL)
so there is something wrong with the instruction that builds the
data.frame at the very end of the function.
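frame 4 of the traceback reaches as.data.frame.table, which suggests that one of the arguments to data.frame() had class "table". one possibility worth ruling out, stated here as an assumption and not as a confirmed diagnosis, is that summary() does not dispatch to the S4 method in that context and falls back to the default method. the sketch below shows the difference on a toy S4 class; DemoResult, viaS4 and viaDefault are made-up names and have nothing to do with GOstats:

```r
library(methods)

## hypothetical stand-in for a result class; not a GOstats class
setClass("DemoResult", representation(pvalues="numeric"))

## an S4 summary() method that returns a data frame
setGeneric("summary")
setMethod("summary", "DemoResult",
          function(object, ...) data.frame(Pvalue=object@pvalues))

res <- new("DemoResult", pvalues=c(0.01, 0.2))

## S4 dispatch finds the method defined above
viaS4 <- summary(res)

## calling the default S3 method directly mimics code that does not
## see the S4 generic
viaDefault <- summary.default(res)

print(class(viaS4))
print(class(viaDefault))
```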
however, if i paste the function into the R shell and call it again, it
works smoothly:
GOreport(goHypGcond, "org.Hs.eg.db", NULL)
1 GO:0016265 7.966717e-05      Inf 0.4500000 4  9
3 GO:0032501 6.718598e-03      Inf 1.2000000 4 24
6 GO:0050896 3.279279e-02      Inf 0.3733333 2 14
2 GO:0007517 5.598832e-03 75.00000 0.1500000 2  3
4 GO:0007610 1.802312e-02 24.33333 0.2500000 2  5
5 GO:0048731 1.956272e-02 16.00000 0.7500000 3 15
7 GO:0009653 4.784456e-02 11.66667 0.4000000 2  8
                                Term                     GeneSyms
1                              death AFG3L2, RAG1, SLC18A2, TCF15
3   multicellular organismal process AFG3L2, RAG1, SLC18A2, TCF15
6               response to stimulus                 AFG3L2, RAG1
2                 muscle development                AFG3L2, TCF15
4                           behavior               SLC18A2, TCF15
5                 system development          AFG3L2, RAG1, TCF15
7 anatomical structure morphogenesis                AFG3L2, TCF15
                 Genes
1 10939,5896,6571,6939
3 10939,5896,6571,6939
6           10939,5896
2           10939,6939
4            6571,6939
5      10939,5896,6939
7           10939,6939
this is really puzzling to me and i suspect that i'm confronted with
some general problem regarding namespaces, so here are the contents of
the NAMESPACE file of the package myPkg containing the GOreport
function:
exportPattern("^[[:alpha:]]+")
importFrom(annotate, getAnnMap)
importFrom(AnnotationDbi, mget)
importFrom(IRanges, unique)
importMethodsFrom(GOstats)
and these are the contents of the DESCRIPTION file:
Package: myPkg
Type: Package
Title: What the package does (short line)
Version: 1.0
Date: 2010-01-11
Author: Who wrote it
Description: More about what it does (maybe more than one line)
Depends: methods, annotate, Biobase (>= 2.4.1), AnnotationDbi, GOstats,
        IRanges
Imports: methods, annotate, Biobase (>= 2.4.1), AnnotationDbi, GOstats,
        IRanges
Maintainer: Who to complain to <yourfault at somewhere.net>
License: What license is it under?
i guess this is difficult to reproduce, since if you simply paste the
function it is going to work for you and the problem arises only within
the context of a package, but any hint on the possible reason, related
to the namespace or whatever else you think, will be highly appreciated.
thanks!
robert.
sessionInfo()
R version 2.9.1 (2009-06-26)
x86_64-unknown-linux-gnu

locale:
C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] GO.db_2.2.11        myPkg_1.0
 [3] IRanges_1.2.3       GOstats_2.10.0
 [5] graph_1.22.3        Category_2.10.1
 [7] org.Hs.eg.db_2.2.11 RSQLite_0.7-2
 [9] DBI_0.2-4           annotate_1.22.0
[11] AnnotationDbi_1.6.1 Biobase_2.4.1

loaded via a namespace (and not attached):
[1] GSEABase_1.6.1    RBGL_1.20.0       XML_2.6-0         genefilter_1.24.2
[5] splines_2.9.1     survival_2.35-7   tools_2.9.1       xtable_1.5-6