[BioC] Human, Mouse and Rat homologs
michael watson (IAH-C)
michael.watson at bbsrc.ac.uk
Thu May 20 20:54:24 CEST 2010
Let's say your data is in a data frame called "d"; then the code might be:
> d
probe_id ensembl_id
1 8039748 ENSG00000121410
2 7960947 ENSG00000175899
3 8144857 ENSG00000171428
4 8144866 ENSG00000156006
5 7976496 ENSG00000196136
6 8083415 ENSG00000114771
>
> library(biomaRt)
>
> mart <- useMart("ensembl", dataset="hsapiens_gene_ensembl")
>
> h2m <- getBM(attributes=c("ensembl_gene_id","mouse_ensembl_gene"), mart=mart)
>
> my.h2m <- merge(d, h2m, by.x="ensembl_id", by.y="ensembl_gene_id", sort=FALSE)
> my.h2m
ensembl_id probe_id mouse_ensembl_gene
1 ENSG00000121410 8039748 ENSMUSG00000022347
2 ENSG00000175899 7960947 ENSMUSG00000030111
3 ENSG00000171428 8144857 ENSMUSG00000025588
4 ENSG00000171428 8144857 ENSMUSG00000051147
5 ENSG00000171428 8144857 ENSMUSG00000056426
6 ENSG00000156006 8144866 ENSMUSG00000051147
7 ENSG00000156006 8144866 ENSMUSG00000056426
8 ENSG00000156006 8144866 ENSMUSG00000025588
9 ENSG00000196136 7976496 ENSMUSG00000066363
10 ENSG00000196136 7976496 ENSMUSG00000041536
11 ENSG00000196136 7976496 ENSMUSG00000066364
12 ENSG00000196136 7976496 ENSMUSG00000058207
13 ENSG00000196136 7976496 ENSMUSG00000079012
14 ENSG00000196136 7976496 ENSMUSG00000079013
15 ENSG00000196136 7976496 ENSMUSG00000021091
16 ENSG00000196136 7976496 ENSMUSG00000066361
17 ENSG00000196136 7976496 ENSMUSG00000041449
18 ENSG00000196136 7976496 ENSMUSG00000041481
19 ENSG00000114771 8083415 ENSMUSG00000027761
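
The merged table above has several rows for some probes, because one human gene can map to more than one mouse gene. As a minimal sketch, assuming you want one row per probe, the homologs can be counted and then collapsed with base R. The toy data frame below copies four rows from the output above; the object names n.homologs and collapsed are made up for illustration.

```r
# Toy subset of the merged table (values copied from the output above).
my.h2m <- data.frame(
  ensembl_id = c("ENSG00000121410", "ENSG00000171428",
                 "ENSG00000171428", "ENSG00000171428"),
  probe_id = c(8039748, 8144857, 8144857, 8144857),
  mouse_ensembl_gene = c("ENSMUSG00000022347", "ENSMUSG00000025588",
                         "ENSMUSG00000051147", "ENSMUSG00000056426"),
  stringsAsFactors = FALSE
)

# Count how many mouse homologs each human gene has.
n.homologs <- table(my.h2m$ensembl_id)

# Collapse to one row per probe/gene pair, joining the homologs with ";".
collapsed <- aggregate(mouse_ensembl_gene ~ probe_id + ensembl_id,
                       data = my.h2m, FUN = paste, collapse = ";")
```

Whether collapsing is appropriate depends on the downstream analysis; keeping the long format may be preferable if each homolog is to be looked up separately.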
________________________________________
From: bioconductor-bounces at stat.math.ethz.ch [bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Wolfgang Huber [whuber at embl.de]
Sent: 20 May 2010 19:40
To: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] Human, Mouse and Rat homologs
Dear David
one of the possible solutions is via the BioMart interface to the
Ensembl databases. Please check the getLDS function in the biomaRt
package, which is described in that package's vignette.
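
A getLDS call along these lines might look like the sketch below. It is an illustration under assumptions, not a tested recipe: the wrapper name human_to_mouse is hypothetical, the dataset and attribute names are the usual Ensembl ones but can differ between releases (listAttributes() and listFilters() show what a given mart offers), and running it needs the biomaRt package and a network connection.

```r
# Sketch: map human Ensembl gene IDs to mouse homologs via linked datasets.
# human_to_mouse is a hypothetical helper name, not part of biomaRt.
human_to_mouse <- function(ids) {
  human <- biomaRt::useMart("ensembl", dataset = "hsapiens_gene_ensembl")
  mouse <- biomaRt::useMart("ensembl", dataset = "mmusculus_gene_ensembl")
  biomaRt::getLDS(
    attributes  = "ensembl_gene_id",  # column reported for the human side
    filters     = "ensembl_gene_id",  # restrict the query to the supplied IDs
    values      = ids,
    mart        = human,
    attributesL = "ensembl_gene_id",  # column reported for the linked mouse side
    martL       = mouse
  )
}

# Example call (needs network access, so it is left commented out):
# human_to_mouse(c("ENSG00000121410", "ENSG00000175899"))
```

For rat, the same pattern should apply with the rat dataset (rnorvegicus_gene_ensembl) as the linked mart.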
Best wishes
Wolfgang
Lyon wrote 20/05/10 04:54:
> If I had a file containing a list of Human:
>
> 1) RefSeq IDs:
>
> "probe_id" "accession"
> "1" "8039748" "NM_130786"
> "2" "8039748" "NP_570602"
> "3" "7960947" "NM_000014"
> "4" "7960947" "NP_000005"
> "5" "8144857" "NM_000662"
> "6" "8144857" "NM_001160170"
>
> Or
>
> 2) Ensembl genes:
>
> "probe_id" "ensembl_id"
> "1" "8039748" "ENSG00000121410"
> "2" "7960947" "ENSG00000175899"
> "3" "8144857" "ENSG00000171428"
> "4" "8144866" "ENSG00000156006"
> "5" "7976496" "ENSG00000196136"
> "6" "8083415" "ENSG00000114771"
>
>
> Which R package converts the list of IDs to their Mouse homologs, and can someone type the exact command?
>
> Thank you for your consideration.
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
--
Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber