[BioC] problem with GO terms
James W. MacDonald
jmacdon at med.umich.edu
Tue Nov 22 18:39:18 CET 2011
Hi Ina,
On 11/22/2011 12:19 PM, Ina Hoeschele wrote:
> Hi,
> I have done a simple analysis associating GO terms with a gene list using GOstats. Then, when I try to retrieve all genes belonging to a significant GO category, I get zero genes! I use this code:
> library(biomaRt)
> mart<- useMart("ensembl", dataset="hsapiens_gene_ensembl")
> temp<- getBM(attributes="entrezgene", filters="go", values=GOID[g], mart=mart)
You don't give sessionInfo(), so I have no idea why this is happening
(remember to always supply this in the future!). However, you don't need
to use biomaRt for this.
> library(GO.db)
Loading required package: AnnotationDbi
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material. To view, type
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")' and for packages 'citation("pkgname")'.
Loading required package: DBI
> library(org.Hs.eg.db)
> get("GO:0050864", org.Hs.egGO2EG)
Error in .checkKeys(value, Rkeys(x), x@ifnotfound) :
value for "GO:0050864" not found
So in the current version of GO.db, this GO term no longer exists,
which is probably the problem you are having with biomaRt as well.
However, if you got this GO term from GOstats, then it will exist in the
versions of these packages that you have installed.
As an example of what you should expect:
> get("GO:0007597", org.Hs.egGO2EG)
   TAS    IDA    TAS    TAS    TAS    TAS    TAS    TAS    TAS     IC    TAS
   "2"  "350"  "708"  "710" "2147" "2157" "2158" "2159" "2160" "2161" "2161"
   TAS    TAS    TAS    TAS    TAS    TAS    TAS    TAS
"2811" "2812" "2814" "2815" "3818" "3827" "5547" "7450"
> sessionInfo()
R version 2.14.0 beta (2011-10-17 r57293)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] org.Hs.eg.db_2.6.4 GO.db_2.6.1 RSQLite_0.10.0
[4] DBI_0.2-5 AnnotationDbi_1.16.4 Biobase_2.13.12
loaded via a namespace (and not attached):
[1] IRanges_1.11.32
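If you want to check a whole vector of GO IDs without get() stopping at the first missing key, something like the following might help. It is only a sketch: lookup_go_genes() and toy_map are names made up for illustration, and toy_map is a plain environment standing in for the annotation map so that the code runs without any Bioconductor packages. With org.Hs.eg.db loaded you would pass org.Hs.egGO2EG instead. As far as I know, GOstats counts genes annotated to a term or to any of its child terms, which would correspond to org.Hs.egGO2ALLEGS rather than org.Hs.egGO2EG, but treat that as an assumption to verify against your installed versions.

```r
# Hypothetical helper: return the unique Entrez Gene IDs for each GO ID,
# with an empty vector (instead of an error) for IDs that are not in the map.
lookup_go_genes <- function(go_ids, map) {
  # ifnotfound = NA makes mget() return NA for keys that are missing
  hits <- mget(go_ids, map, ifnotfound = NA)
  lapply(hits, function(g) {
    if (length(g) == 1L && is.na(g)) character(0) else unique(unname(g))
  })
}

# Toy stand-in for the annotation map (an assumption, not real annotation):
# values are Entrez Gene IDs, names are evidence codes.
toy_map <- new.env()
assign("GO:0007597",
       c(TAS = "2", IDA = "350", IC = "2161", TAS = "2161"),
       envir = toy_map)

res <- lookup_go_genes(c("GO:0050864", "GO:0007597"), toy_map)

# Number of genes found per GO ID; zero flags an ID missing from the map
print(sapply(res, length))

# With the real annotation package (not run here):
# library(org.Hs.eg.db)
# res <- lookup_go_genes(GOID, org.Hs.egGO2EG)
```

Any GO ID that comes back with zero genes is one that your installed annotation package does not know about, which is the first thing to compare against the package versions reported by sessionInfo().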
Best,
Jim
>
> length(temp$entrezgene) is zero!!
>
> GOID[g=1] = "GO:0050864", so as long as this is a valid GO ID (as returned from GOstats), length(temp$entrezgene) should not be zero!?
>
> This happens for several of my top 105 GO (BP, CC, MF) categories.
>
> Thanks for any hint ...
>
> Ina
>
--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826