[BioC] find overlapping genes in ENSEMBL gene ID list and NCBI gene ID list

Marc Carlson mcarlson at fhcrc.org
Wed Mar 9 20:28:08 CET 2011


Hi Fernando,

I see that the list has already provided a lot of very clever
suggestions.  I would like to add one that is slightly less exciting but
which I hope might still be of some use in the event that you just
wanted to do something simple.  If your set of Ensembl IDs were something
like this:

library(org.Bt.eg.db)  # provides the org.Bt.egENSEMBL2EG mapping

ensIds <- c("ENSBTAG00000038843", "ENSBTAG00000009091",
            "ENSBTAG00000033312", "wrongID")


Then we could simply use the appropriate mapping to get the matching
Entrez Gene IDs back as a list (with NA for any ID that is not found),
like so:

mget(ensIds, org.Bt.egENSEMBL2EG, ifnotfound=NA)


Or you might find it easier to work with data.frames, in which case you
could do it more like this:

toTable(org.Bt.egENSEMBL2EG[ Rkeys(org.Bt.egENSEMBL2EG) %in% ensIds ])


Once you have converted all your lists to the same kind of IDs, you
can use something like %in% (similar to how I used it above) to
quickly see which things overlap.  Please let us know if you still have
questions.
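
As a rough sketch of that last step, here is one way the comparison could
look in base R.  Note that the list ensToEg below only stands in for the
kind of result mget() returns, and the Entrez IDs in it (and in ncbiIds)
are made-up placeholders rather than real Bos taurus mappings.  The
variable names are purely illustrative as well.

```r
# Hypothetical stand-in for the list returned by mget(); the Entrez IDs
# are placeholders, NOT real mappings.
ensToEg <- list(
  ENSBTAG00000038843 = "1001",
  ENSBTAG00000009091 = c("1002", "1003"),  # one Ensembl ID may map to several
  ENSBTAG00000033312 = "1004",
  wrongID            = NA                  # what ifnotfound=NA leaves behind
)

# Hypothetical NCBI (Entrez) gene ID list to compare against.
ncbiIds <- c("1002", "1004", "2001")

# Flatten the list and drop the entries that could not be mapped.
egFromEns <- unique(unlist(ensToEg, use.names = FALSE))
egFromEns <- egFromEns[!is.na(egFromEns)]

overlap  <- intersect(egFromEns, ncbiIds)  # genes in both lists
onlyEns  <- setdiff(egFromEns, ncbiIds)    # only in the Ensembl-derived list
onlyNcbi <- setdiff(ncbiIds, egFromEns)    # only in the NCBI list

# Ensembl IDs that had no Entrez mapping at all.
unmapped <- names(ensToEg)[vapply(ensToEg, function(x) all(is.na(x)),
                                  logical(1))]
```

It is probably worth keeping track of the unmapped IDs separately, since
an ID that fails to map is not the same thing as a gene that is missing
from the other list.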


Hope this helps,


  Marc



On 03/08/2011 09:13 AM, Biase, Fernando wrote:
> Hi everyone,
>
> I have a list of ENSEMBL gene IDs and a list of NCBI gene IDs. I need to find which IDs correspond to genes in both lists (overlapping genes) and which genes are in one list but not present in the other (non-overlapping genes).
> Can anyone give me some advice on this task, or point me to some material to read?
> In case it is relevant, the organism is Bos taurus.
>
> Thanks in advance,
> Fernando