[BioC] Annotate - gene name to ENSEMBL
Marc Carlson
mcarlson at fhcrc.org
Tue Nov 5 19:28:48 CET 2013
Or you can use an annotation package (if you know which organism you are
working with).
So for example:
library(Mus.musculus)
## Now if you have a gene name like this:
name <- "pregnancy zone protein"
## You can try to extract it directly (and hope it's an exact match,
## like this):
select(Mus.musculus, keys=name, columns="ENSEMBL", keytype="GENENAME")
## That will work as long as the name matches what is in the database
## exactly.
## But names create a special problem, since they can sometimes be
## written in slightly different ways.
## So instead, you might want to use the keys method to do partial
## matching first, as described in this man page:
help('keys,OrganismDb-method')
## That would mean that you could look up a range of "valid" keys like this:
possible <- keys(Mus.musculus, keytype="GENENAME", pattern="pregnancy")
## And then you could choose the key you want and use it to extract
## whatever you want to know.
select(Mus.musculus, keys=possible[1], columns="ENSEMBL",
keytype="GENENAME")
## OR maybe you are asking a more general question and you just want to
## know which ENSEMBL IDs are matched to any GENENAME that has "pregnancy"
## in the name. For that you could just call keys and use the column
## argument like this:
keys(Mus.musculus, keytype="ENSEMBL", pattern="pregnancy",
column="GENENAME")
## OR you might want to combine a more usual use of keys with select to
## get both kinds of information about any gene that has "pregnancy" in the
## name:
select(Mus.musculus, keys(Mus.musculus, keytype="GENENAME",
pattern="pregnancy"), columns="ENSEMBL", keytype="GENENAME")
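If you want to automate the "choose the key you want" step rather than just taking possible[1], something like the following might do as a rough sketch. Note that pick_gene_name is a made-up helper name (it is not part of AnnotationDbi), and the 'possible' vector below is only a stand-in for whatever keys() returns from your version of the database:

```r
## Hypothetical helper (not part of AnnotationDbi): pick one gene name
## out of the candidates returned by keys(..., pattern=...).
pick_gene_name <- function(candidates, query) {
  ## Compare case-insensitively and ignore leading/trailing whitespace.
  hit <- candidates[tolower(trimws(candidates)) == tolower(trimws(query))]
  if (length(hit) == 1L) {
    return(hit)
  }
  ## No unique exact match: return every candidate so the caller can choose.
  candidates
}

## Stand-in for the result of
## keys(Mus.musculus, keytype="GENENAME", pattern="pregnancy");
## the real vector will depend on the database version.
possible <- c("pregnancy zone protein",
              "pregnancy upregulated non-ubiquitously expressed CaM kinase")

chosen <- pick_gene_name(possible, "Pregnancy Zone Protein")
## With the annotation package loaded, 'chosen' could then be passed on:
## select(Mus.musculus, keys=chosen, columns="ENSEMBL", keytype="GENENAME")
```

The helper only returns a single name when exactly one candidate matches the query after case and whitespace are normalized; otherwise it hands back all candidates so that nothing gets picked silently.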
Hope this helps,
Marc
On 11/05/2013 12:21 AM, Hans-Rudolf Hotz wrote:
> Hi Kripa
>
> Use biomaRt
> see: http://www.bioconductor.org/packages/release/bioc/html/biomaRt.html
>
>
> quick example, assuming you are working with mouse, and want ensembl
> gene ids:
>
> > library(biomaRt)
> > ensembl = useMart("ensembl")
> > mouse.ensembl = useDataset("mmusculus_gene_ensembl",mart=ensembl)
> >
> > getBM(attributes = "ensembl_gene_id", filters = 'mgi_symbol',
> values=c("Papola"),mart=mouse.ensembl)
> ensembl_gene_id
> 1 ENSMUSG00000021111
> >
>
>
> Regards, Hans-Rudolf
>
>
>
>
> On 11/05/2013 02:07 AM, Kripa R wrote:
>> Hi everyone,
>>
>> Does anyone know how to go from gene name to ENSEMBL ID?
>>
>> I'm using lumi to analyze my microarray data, however the names get
>> changed from NuID to gene name when reading in the data.... I'd like
>> to do pathway analysis but require either ENSEMBL or GO id format
>>
>> Any help would be greatly appreciated,
>>
>> .kripa
>>
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>