[BioC] org.Hs.eg question

Marc Carlson mcarlson at fhcrc.org
Thu Mar 19 20:15:37 CET 2009


Hi Sim,

I can explain this, and maybe you can even help me to improve things.
The mappings for Ensembl protein and transcript IDs come from Ensembl's
web site, where they are mapped to Ensembl gene IDs.  The mappings from
Entrez Gene IDs to Ensembl gene IDs presently come from NCBI.  So an
Ensembl protein ID can only be connected to an Entrez Gene ID when NCBI
supplies a mapping for its Ensembl gene.

However, the gene-to-gene mappings from NCBI do not seem to be as
complete as whatever Ensembl is using, and I do not have any explanation
from them about why that is.  I also don't have a better source for this
information (yet), as I have been unable to locate it on Ensembl's FTP
sites.  Something must exist somewhere at Ensembl, though, because the
Ensembl web site is presumably based on it.  But whatever they are
using, they do not seem to be sharing that mapping with the world
(although it would be great to find out that I had just missed it
somehow).  If you know of a better source for this kind of information
than what I am currently using, I would be more than happy to consider
it.  But it obviously has to come from a trustworthy and documentable
source (such as NCBI or Ensembl).  Otherwise there would not be much
point in including it.  ;)
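
To make the two-step lookup concrete, here is a minimal sketch in plain
R.  It does not use the annotation package at all: the two tables are
toy named vectors, and every ID in them except ENSP00000373017 is a
made-up placeholder.  Treating the lookup as a simple composition of the
two sources is my simplification for illustration, not a description of
how the package is actually built.

```r
# Toy tables; all IDs except ENSP00000373017 are hypothetical placeholders.
# Ensembl side: protein ID -> Ensembl gene ID.
prot2ensg <- c(ENSP00000373017 = "ENSG_A", ENSP_OTHER = "ENSG_B")
# NCBI side: Ensembl gene ID -> Entrez Gene ID.
# "ENSG_A" is deliberately absent to mimic an incomplete gene-to-gene table.
ensg2eg <- c(ENSG_B = "EG_B")

# Compose the two steps; a miss at either step yields NA.
prot_to_eg <- function(prot) {
  ensg <- prot2ensg[prot]   # step 1: protein -> Ensembl gene
  unname(ensg2eg[ensg])     # step 2: Ensembl gene -> Entrez Gene
}

prot_to_eg("ENSP00000373017")  # NA: step 2 has no entry for "ENSG_A"
prot_to_eg("ENSP_OTHER")       # "EG_B": both steps succeed
```

In this toy case the protein-to-gene step succeeds and the lookup still
comes back NA, because the gene-to-gene table has no entry to finish the
chain.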


  Marc




Sim, Fraser wrote:
> Hi,
>
> I'm using the org.Hs.eg annotation package to convert Ensembl protein
> annotations to Entrez Gene IDs. I can find the correct annotation
> manually via the Ensembl website (EG = 4340), but I don't understand
> why the annotation package is unable to find it.
>
> Here is the code:
>
>> HsENSP
> [1] "ENSP00000373017"
>> require("org.Hs.eg.db")
>> HsEG = as.character(unlist(mget(HsENSP, org.Hs.egENSEMBLPROT2EG, ifnotfound = NA)))
>> HsEG
> [1] NA
>
> Thanks for any input.
>
> Regards,
> Fraser
>
>
>> sessionInfo()
> R version 2.8.1 (2008-12-22) 
> i386-pc-mingw32 
>
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
> States.1252;LC_MONETARY=English_United
> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>
> attached base packages:
> [1] tools     stats     graphics  grDevices utils     datasets  methods
> [8] base
>
> other attached packages:
>  [1] gplots_2.6.0        gdata_2.4.2         gtools_2.5.0       
>  [4] bioDist_1.14.0      RColorBrewer_1.0-2  GEOquery_2.6.0     
>  [7] RCurl_0.94-0        rae230a.db_2.2.5    org.Rn.eg.db_2.2.6 
> [10] hom.Rn.inp.db_2.2.5 org.Hs.eg.db_2.2.6  RSQLite_0.7-1      
> [13] DBI_0.2-4           AnnotationDbi_1.4.2 Biobase_2.2.1      
> [16] rcom_2.0-4          rscproxy_1.0-12    


