[BioC] problem with GO terms

Ina Hoeschele inah at vbi.vt.edu
Tue Nov 22 20:10:16 CET 2011


Thanks again, Jim. To the best of my knowledge I am not using anything from the past. Below is part of my code (omitting everything that should not be relevant).
Thanks, and sorry; there must be an obvious reason ...

library(illuminaHumanv4.db)
library(annotate)
library("GO.db")
library("KEGG.db")
library("PFAM.db")
library("GOstats")

DATA <- read.csv(file=filename2,header=TRUE,sep=",")

topProbeIDs <- DATA$PROBE_ID[1:top]
topProbeIDs <- as.character(topProbeIDs)
universeProbeIDs <- DATA$PROBE_ID
universeProbeIDs <- as.character(universeProbeIDs)
topPvals <- DATA$AGE_pval_expr[1:top]
universePvals <- DATA$AGE_pval_expr

# Map probes to Entrez Gene IDs and drop probes without a mapping.
entrezIDs <- getEG(topProbeIDs, "illuminaHumanv4")
hasEG <- !is.na(entrezIDs)
topProbeIDs1 <- topProbeIDs[hasEG]
entrezIDs1 <- entrezIDs[hasEG]
topPvals1 <- topPvals[hasEG]
universeEntrezIDs <- getEG(universeProbeIDs, "illuminaHumanv4")
universeHasEG <- !is.na(universeEntrezIDs)
universeProbeIDs1 <- universeProbeIDs[universeHasEG]
universeEntrezIDs1 <- universeEntrezIDs[universeHasEG]
universePvals1 <- universePvals[universeHasEG]

# Look up GO annotations and drop probes that have none.
GOannot1 <- getGO(topProbeIDs1, "illuminaHumanv4")
hasGO <- !is.na(GOannot1)
topProbeIDs2 <- topProbeIDs1[hasGO]
entrezIDs2 <- entrezIDs1[hasGO]
topPvals2 <- topPvals1[hasGO]
GOannot2 <- GOannot1[hasGO]
universeGOannot1 <- getGO(universeProbeIDs1, "illuminaHumanv4")
universeHasGO <- !is.na(universeGOannot1)
universeProbeIDs2 <- universeProbeIDs1[universeHasGO]
universeEntrezIDs2 <- universeEntrezIDs1[universeHasGO]
universePvals2 <- universePvals1[universeHasGO]
universeGOannot2 <- universeGOannot1[universeHasGO]

...

params_BP_cond_over <- new("GOHyperGParams", 
	geneIds=entrezIDs_final, universeGeneIds=universeEntrezIDs_final, 
	annotation="illuminaHumanv4", ontology="BP", pvalueCutoff=HGcutoffGO, 
	conditional=TRUE, testDirection="over") 

BP_cond_over <- hyperGTest(params_BP_cond_over)
# Summarize once, then keep only categories larger than minCatSize.
BP_summary <- summary(BP_cond_over)
keepCat <- BP_summary$Size > minCatSize
Pval_BP_cond_over <- BP_summary$Pvalue[keepCat]
GOterm_BP_cond_over <- BP_summary$Term[keepCat]
GOID_BP_cond_over <- BP_summary$GOBPID[keepCat]

...



----- Original Message -----
From: "James W. MacDonald" <jmacdon at med.umich.edu>
To: "Ina Hoeschele" <inah at vbi.vt.edu>
Cc: "Bioconductor mailing list" <bioconductor at r-project.org>
Sent: Tuesday, November 22, 2011 1:52:56 PM
Subject: Re: [BioC] problem with GO terms

Hi Ina,



On 11/22/2011 1:03 PM, Ina Hoeschele wrote:
> thank you, Jim ...
> I did what you show below and I get the same result:
>
>   >  get("GO:0050864", org.Hs.egGO2EG)
> Error in .checkKeys(value, Rkeys(x), x@ifnotfound) :
>     value for "GO:0050864" not found
>
> but why is GOstats giving me this GO term?

Did you use GOstats with this current version of BioC, or are you using 
data you processed sometime in the past?

As far as I can tell, it is impossible for you to be getting that GO 
term if you are using the current version of these packages. I am 
assuming that your data are from the Illumina Human V4 chip.

 > get("GO:0050864", illuminaHumanv4GO2PROBE)
Error in .checkKeys(value, Rkeys(x), x@ifnotfound) :
   value for "GO:0050864" not found

This is with the version of the illuminaHumanv4.db package that you are 
using. Since this isn't even in that package, it is not possible for 
GOstats to be reporting it as being significant.
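
[Editorial note] One possible explanation, not confirmed anywhere in this thread: the lookups above query maps of direct annotations only (org.Hs.egGO2EG, illuminaHumanv4GO2PROBE), whereas the annotation packages also ship maps in which each probe or gene is propagated to every ancestor of its annotated terms (org.Hs.egGO2ALLEGS, illuminaHumanv4GO2ALLPROBES). If an enrichment test works from the propagated map, it can report a term that is absent from the direct map. The base R sketch below illustrates the difference on a toy hierarchy. The term IDs "GO:A", "GO:B", "GO:C" and the probe "p1" are made up for illustration and are not real GO or Illumina identifiers.

```r
# Toy GO fragment (hypothetical IDs): each term maps to its parent term(s).
parents <- list("GO:C" = "GO:B", "GO:B" = "GO:A", "GO:A" = character(0))

# Direct annotations only: probe p1 is annotated to the most specific term.
direct <- list("GO:C" = "p1")

# Return a term together with all of its ancestors.
ancestors <- function(term) {
  out <- term
  for (p in parents[[term]]) out <- union(out, ancestors(p))
  out
}

# Build the propagated map: each probe is added to every ancestor
# of the term it is directly annotated to.
propagated <- list()
for (term in names(direct)) {
  for (a in ancestors(term)) {
    propagated[[a]] <- union(propagated[[a]], direct[[term]])
  }
}

"GO:B" %in% names(direct)      # FALSE: no probe is annotated directly to GO:B
"GO:B" %in% names(propagated)  # TRUE: p1 reaches GO:B through its child GO:C
```

On the real packages, the analogous check would be to look up the term in illuminaHumanv4GO2ALLPROBES as well as in illuminaHumanv4GO2PROBE; whether that accounts for the result reported here is an assumption.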

Best,

Jim
>
> Thanks again, Ina
>
>> sessionInfo()
> R version 2.14.0 (2011-10-31)
> Platform: i386-pc-mingw32/i386 (32-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
>   [1] biomaRt_2.10.0            GOstats_2.20.0
>   [3] graph_1.32.0              Category_2.20.0
>   [5] PFAM.db_2.6.1             KEGG.db_2.6.1
>   [7] GO.db_2.6.1               annotate_1.32.0
>   [9] illuminaHumanv4.db_1.12.1 org.Hs.eg.db_2.6.4
> [11] RSQLite_0.10.0            DBI_0.2-5
> [13] AnnotationDbi_1.16.4      Biobase_2.14.0
> [15] BiocInstaller_1.2.1
>
> loaded via a namespace (and not attached):
>   [1] genefilter_1.36.0 GSEABase_1.16.0   IRanges_1.12.2    RBGL_1.30.1
>   [5] RCurl_1.7-0.1     splines_2.14.0    survival_2.36-10  tools_2.14.0
>   [9] XML_3.4-2.2       xtable_1.6-0

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826



