[BioC] from using biomaRt and r10kcod
    Weiwei Shi 
    helprhelp at gmail.com
       
    Mon May 14 22:29:49 CEST 2007
    
    
  
Hi there,
I am revisiting the question of mapping CodeLink probe IDs to human
Entrez Gene IDs. Here is an example: in the r10kcod package, probe
"GE16490" maps to "502674", which I assume is a rat Entrez Gene ID.
However, when I used biomaRt to convert all rat Entrez Gene IDs on this
array to human ones, I found the following mappings involving 502674:
         id MappedID rat.count human.count
4167 296197    11034         1           2
7021 502674    11034         1           2
So 296197, 502674 and 11034 are all associated with the protein
"destrin". To be precise, 296197 is a rat protein that is similar to
destrin.
However, according to
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search&DB=gene
the other two (11034 and *502674*) are human IDs (please correct me if
I am wrong here).
So my questions are:
1. Is 502674 a rat Entrez Gene ID or a human one?
2. Is r10kcod wrong, is NCBI wrong, or is my understanding wrong (I
assume the last one :)?
3. I found many many-to-many mappings when converting rat Entrez Gene
IDs to human ones, such as the following:
> t0[t0[,1]== 396527,]
         id MappedID rat.count human.count
6608 396527    54576         9           4
6609 396527    54575         9           4
6610 396527    54600         9           4
6611 396527    54577         9           4
6612 396527    54578         9           4
6613 396527    54579         9           4
6614 396527    54657         9           4
6615 396527    54659         9           4
6616 396527    54658         9           4
> t0[t0[,2]== 54576,]
         id MappedID rat.count human.count
2494 113992    54576         9           4
6608 396527    54576         9           4
6617 396551    54576         9           4
6626 396552    54576         9           4
> t0[t0[,2]== 54577,]
         id MappedID rat.count human.count
2497 113992    54577         9           4
6611 396527    54577         9           4
6620 396551    54577         9           4
6629 396552    54577         9           4
So all of these IDs correspond to different polypeptides of the UDP
glucuronosyltransferase 1 family. Are there other situations that can
cause such many-to-many mappings?
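
To separate the many-to-many pairs from the simpler cases, the counts
can be recomputed and each pair labelled. The following is a minimal
base R sketch on a hand-built subset of the pairs shown above, so the
counts differ from the full table. It assumes, as inferred from the
output, that rat.count is the number of human IDs a rat ID maps to and
human.count is the number of rat IDs mapping to a human ID. The `type`
column and its labels are illustrative names that are not in the
original table.

```r
# Hand-built subset of the rat -> human pairs shown above
# (column names follow the t0 output; the values are copied from it).
t0 <- data.frame(
  id       = c(296197, 502674, 396527, 396527, 113992, 113992),
  MappedID = c(11034,  11034,  54576,  54577,  54576,  54577)
)

# Count each rat/human pair only once
t0 <- unique(t0)

# Per row: number of human IDs the rat ID maps to (assumed meaning of
# rat.count) and number of rat IDs mapping to the human ID (assumed
# meaning of human.count)
t0$rat.count   <- ave(seq_len(nrow(t0)), t0$id,       FUN = length)
t0$human.count <- ave(seq_len(nrow(t0)), t0$MappedID, FUN = length)

# Label each pair, read in the rat -> human direction
# ("type" is an illustrative column name)
t0$type <- ifelse(t0$rat.count == 1 & t0$human.count == 1, "one-to-one",
           ifelse(t0$rat.count == 1, "many-to-one",
           ifelse(t0$human.count == 1, "one-to-many", "many-to-many")))

print(t0)
```

On this subset the two destrin pairs come out as many-to-one (two rat
IDs, one human ID), while the UGT1 pairs come out as many-to-many.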
Sorry for the long questions,
Regards,
-- 
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.
"Did you always know?"
"No, I did not. But I believed..."
---Matrix III
    
    