[BioC] KEGG overrepresentation loses genes
Anne Kupczok
anne.kupczok at univie.ac.at
Wed Apr 14 17:04:44 CEST 2010
Hello,
I ran into the following problem when using the KEGG annotation with
hyperGTest: it does not seem to count all of the genes. In the
example below, all three genes belong to the category "05020" (according
to mget(genes,envir=org.Hs.egPATH)). In the summary of the hyperGTest
result, however, the category contains only two genes.
Is there an explanation for this behavior?
Thanks in advance!
Anne
> library("Category")
Loading required package: AnnotationDbi
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material. To view, type
'openVignette()'. To cite Bioconductor, see
'citation("Biobase")' and for packages 'citation(pkgname)'.
> library("org.Hs.eg.db")
Loading required package: DBI
> genes=c("1958","3553","3303")
> GoHyp=new("KEGGHyperGParams",geneIds=genes,annotation="org.Hs.eg",pvalueCutoff=1,testDirection="over")
> htest=hyperGTest(GoHyp)
> s=summary(htest)
> s[1,]
  KEGGID       Pvalue OddsRatio    ExpCount Count Size           Term
1  05020 3.810228e-06       Inf 0.003960844     2   35 Prion diseases
>
> p=mget(genes,envir=org.Hs.egPATH,ifnotfound=NA)
> p
$`1958`
[1] "05020"
$`3553`
[1] "04010" "04060" "04210" "04620" "04640" "04940" "05010" "05020" "05332"
$`3303`
[1] "04010" "04144" "04612" "05020"
> geneIdsByCategory(htest,"05020")
$`05020`
[1] "1958" "3553"
> sessionInfo()
R version 2.10.0 (2009-10-26)
x86_64-unknown-linux-gnu
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] KEGG.db_2.3.5 org.Hs.eg.db_2.3.6 RSQLite_0.7-3
[4] DBI_0.2-4 Category_2.12.0 AnnotationDbi_1.8.1
[7] Biobase_2.6.0
loaded via a namespace (and not attached):
[1] annotate_1.24.0 genefilter_1.28.0 graph_1.24.1 GSEABase_1.8.0
[5] RBGL_1.22.0 splines_2.10.0 survival_2.35-7 tools_2.10.0
[9] XML_2.6-0 xtable_1.5-6
>
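The mismatch in the session above can be pinned down by comparing the genes annotated to "05020" with the genes that hyperGTest counted. The following is a minimal base R sketch that uses only the values printed in the session; the variable names (annotated, counted, category, in_category, dropped) are illustrative and not part of the Category package. It identifies which gene is missing, not why it was dropped.

```r
# Pathway annotations as printed by mget(genes, envir=org.Hs.egPATH) above
annotated <- list(
  "1958" = "05020",
  "3553" = c("04010", "04060", "04210", "04620", "04640",
             "04940", "05010", "05020", "05332"),
  "3303" = c("04010", "04144", "04612", "05020")
)

# Genes that geneIdsByCategory(htest, "05020") reported above
counted <- c("1958", "3553")

category <- "05020"

# Input genes whose annotation includes the category
in_category <- names(annotated)[vapply(annotated,
                                       function(ids) category %in% ids,
                                       logical(1))]

# Genes annotated to the category but absent from the test result
dropped <- setdiff(in_category, counted)
print(dropped)
```

For the values shown in this post, dropped is "3303". With the live objects, the same comparison could presumably be made against geneIds(htest) and geneIdUniverse(htest), assuming those Category accessors behave as documented, to check whether the gene was removed from the gene list or from the universe before the test was run.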