[BioC] unable to find known entrezgene with biomaRt
James W. MacDonald
jmacdon at med.umich.edu
Sat Jan 19 16:01:05 CET 2008
Hi Dick,
I'm not sure I understand your question. When I go to the webpage you
reference, there is AFAICT no mention of this gene being the same as
Entrez Gene 3514 (other than having the same symbol). Nor does Entrez
Gene mention that it is the same as Ensembl Gene ENSG00000211592.
A quick look at the genomic location suggests that these are probably
the same gene, rather than two different genes that happen to share a
symbol (gene symbols are not guaranteed to be unique).
Since the web interface and the programmatic interface agree, this
isn't a matter of inconsistency between the interfaces, so perhaps the
real question is why Entrez Gene and Ensembl do not reference each other.
If so, I think it is simply because two different groups are doing the
annotation, and they do not always cross-reference each other
perfectly.
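As a minimal sketch (not anything biomaRt itself provides), a small
helper can list which of the queried Entrez Gene IDs came back without
an Ensembl mapping, so that they can be checked by hand on both sites.
The name find_unmapped and its id_col argument are made up for
illustration, and the data frame only stands in for the getBM() result
quoted further down in this message.

```r
# Hypothetical helper (not part of biomaRt): report which queried IDs
# are absent from a getBM()-style result.
find_unmapped <- function(queried, result, id_col = "entrezgene") {
  # The quoted session shows NULL when nothing matches, so treat NULL
  # or an empty data frame as "nothing was mapped".
  if (is.null(result) || nrow(result) == 0) {
    return(queried)
  }
  setdiff(queried, result[[id_col]])
}

# Stand-in for the getBM() output reported in this thread:
# only Entrez Gene 3845 (KRAS) was returned.
result <- data.frame(
  entrezgene      = 3845,
  hgnc_symbol     = "KRAS",
  ensembl_gene_id = "ENSG00000133703",
  stringsAsFactors = FALSE
)

unmapped <- find_unmapped(c(3845, 3514), result)
print(unmapped)
```

Any ID reported this way could then be looked up from the other
direction, for example by Ensembl gene ID, assuming the corresponding
filter is available in your mart (listFilters(mart) will show what is).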
Best,
Jim
Dick Beyer wrote:
> Hello,
>
> I am unable to find some Entrez Gene IDs in the Ensembl Homo sapiens database via biomaRt, even though I can find them through the Ensembl web interface.
>
> library(biomaRt)
> mart <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
>
> getBM(attributes = c("entrezgene", "hgnc_symbol", "ensembl_gene_id"),
>       filters = "entrezgene", values = 3845, mart = mart)
>   entrezgene hgnc_symbol ensembl_gene_id
> 1       3845        KRAS ENSG00000133703
>
> getBM(attributes = c("entrezgene", "hgnc_symbol", "ensembl_gene_id"),
>       filters = "entrezgene", values = 3514, mart = mart)
> NULL
>
> The ensembl web interface:
>
> http://www.ensembl.org/Homo_sapiens/geneview?gene=ENSG00000211592
>
> shows Entrez Gene ID 3514 corresponds to ensembl_gene_id ENSG00000211592, IGKC.
>
> I'm curious why my biomaRt session will return good results for some valid Entrez Gene IDs but not for others. I'm not sure what to try next. I'd very much appreciate any help.
>
> sessionInfo()
> R version 2.6.1 (2007-11-26)
> x86_64-redhat-linux-gnu
>
> locale:
> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>
> attached base packages:
> [1] tools stats graphics grDevices utils datasets methods
> [8] base
>
> other attached packages:
> [1] topGO_1.4.0 SparseM_0.75 AnnotationDbi_1.0.6
> [4] RSQLite_0.6-4 DBI_0.2-4 GO_2.0.1
> [7] Biobase_1.16.2 graph_1.16.1 biomaRt_1.12.2
> [10] RCurl_0.8-3
>
> loaded via a namespace (and not attached):
> [1] cluster_1.11.9 rcompgen_0.1-17 XML_1.93-2
>
> Thanks much,
> Dick
> *******************************************************************************
> Richard P. Beyer, Ph.D. University of Washington
> Tel.:(206) 616 7378 Env. & Occ. Health Sci. , Box 354695
> Fax: (206) 685 4696 4225 Roosevelt Way NE, # 100
> Seattle, WA 98105-6099
> http://depts.washington.edu/ceeh/ServiceCores/FC5/FC5.html
> http://staff.washington.edu/~dbeyer