[BioC] GO Term Enrichment Analysis for ath1121501
suganya [guest]
guest at bioconductor.org
Tue Nov 20 09:27:06 CET 2012
I am trying to do an enrichment analysis, working with data from ath1121501 (Arabidopsis) arrays. I have problems defining the universe and the selected genes, even though I have consulted a number of documents.
My data is as follows:
myd # str(myd)
chr [1:22810] "244901_at" "244902_at" "244903_at" "244904_at" "244905_at" "244906_at" ...
wolunilist # set of selected genes (DEG): str(wolunilist)
chr [1:831] "244901_at" "244910_s_at" "245000_at" "245001_at" "245002_at" "245003_at" ...
The code I used was:
locus <- unlist(get(myd, ath1121501ACCNUM)) # in this case it only fetches a single id
head(locus)
[1] "ATMG00640"
selected <- unlist(get(wolunilist, ath1121501ACCNUM)) # in this case it only fetches a single id
head(selected)
[1] "ATMG00640"
params <- new("GOHyperGParams", geneIds = selected, universeGeneIds = locus, annotation = "ath1121501",
              ontology = "MF", pvalueCutoff = 0.5, conditional = FALSE, testDirection = "over")
hgOver <- hyperGTest(params)
And I get the following error:
debugging in: getUniverseHelper(probes, datPkg, entrezIds)
debug: {
univ <- unique(unlist(mget(probes, ID2EntrezID(datPkg))))
if (!missing(entrezIds) && !is.null(entrezIds) && length(entrezIds) >
0)
univ <- intersect(univ, unlist(entrezIds))
if (length(univ) < 1)
stop("After filtering, there are no valid IDs that can be used as the Gene universe.\n Check input values to confirm they are the same type as the central ID used by your annotation package.\n For chip packages, this will still mean the central GENE identifier used by the package (NOT the probe IDs).")
univ
}
I tried several possibilities:
mget(c("244901_at", "244902_at", "244903_at", "244904_at", "244905_at"), ath1121501ACCNUM)
I get:
$`244901_at`
[1] "ATMG00640"
$`244902_at`
[1] "ATMG00650"
$`244903_at`
[1] "ATMG00660"
$`244904_at`
[1] "ATMG00670"
$`244905_at`
[1] "ATMG00680"
sel <- unlist(mget(wolunilist, ath1121501ACCNUM))
Error in .checkKeys(value, Lkeys(x), x@ifnotfound) :
value for "245651_s_at" not found
I am unable to figure out where I am going wrong. If I look up a few IDs directly, it works, but if I pass the IDs stored as a character vector, I get the above error.
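For reference, the behaviour of the vector lookup can be reproduced without the annotation package. The following is only a minimal sketch: it uses a plain R environment (the hypothetical `toy_map`) as a stand-in for `ath1121501ACCNUM`, with one probe ID deliberately left unmapped. It assumes the real map accepts the same `mget(..., ifnotfound = NA)` call, which this sketch does not exercise. It shows `mget()` looking up a whole character vector and returning `NA` for the missing key instead of stopping, after which the unmapped entries are dropped.

```r
# Toy stand-in for the annotation map (hypothetical; not the real ath1121501ACCNUM).
toy_map <- new.env()
assign("244901_at", "ATMG00640", envir = toy_map)
assign("244902_at", "ATMG00650", envir = toy_map)

# Probe IDs to look up; "245651_s_at" is deliberately absent from toy_map.
probes <- c("244901_at", "244902_at", "245651_s_at")

# mget() looks up every element of the vector; ifnotfound = NA yields NA
# for a missing key instead of raising a "not found" error.
hits <- mget(probes, envir = toy_map, ifnotfound = NA)

# Flatten to a character vector and drop the unmapped probes.
ids <- unlist(hits)
ids <- unique(ids[!is.na(ids)])
print(ids)
```

Under that assumption, the same pattern applied to `myd` and `wolunilist` would give the two ID vectors without stopping at the first unmapped probe. Whether those IDs are the type that `GOHyperGParams` expects for this chip is a separate question.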
-- output of sessionInfo():
R version 2.15.2 (2012-10-26)
Platform: i686-redhat-linux-gnu (32-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] annaffy_1.28.0 KEGG.db_2.7.1 GO.db_2.7.1 GOstats_2.22.0 graph_1.34.0
[6] Category_2.22.0 ath1121501.db_2.7.1 org.At.tair.db_2.7.1 RSQLite_0.10.0 DBI_0.2-5
[11] AnnotationDbi_1.18.4 BiocInstaller_1.4.9 limma_3.12.3 affy_1.34.0 Biobase_2.16.0
[16] BiocGenerics_0.2.0
loaded via a namespace (and not attached):
[1] affyio_1.22.0 annotate_1.34.1 genefilter_1.38.0 GSEABase_1.18.0
[5] IRanges_1.14.4 preprocessCore_1.18.0 RBGL_1.32.1 splines_2.15.2
[9] stats4_2.15.2 survival_2.36-14 tools_2.15.2 XML_3.9-4
[13] xtable_1.6-0 zlibbioc_1.2.0
--
Sent via the guest posting facility at bioconductor.org.