[BioC] biomaRt error in getGene
Steffen Durinck
durincks at mail.nih.gov
Wed Jul 12 21:06:12 CEST 2006
Hi Georg,
You need the latest version of biomaRt (1.7.3) for getGene to work
with zebrafish (see the development packages):
http://www.bioconductor.org/packages/1.9/bioc/html/biomaRt.html
Also have a look at the getBM, listAttributes and listFilters
functions. These are robust against Ensembl database changes and allow
you to query more than is possible with the simple biomaRt
functions such as getGene.
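
As a rough sketch of the getBM route (not tested against the current
Ensembl release): the helper name probe_annotation and its arguments are
made up for illustration and are not part of biomaRt. It assumes the
attribute and the filter are both called "affy_zebrafish", as your
listing suggests, and that a "description" attribute exists for this
dataset. Please check both with listAttributes and listFilters first.

```r
# Hypothetical helper (not a biomaRt function): query probe annotation
# through getBM directly instead of going through getGene.
# Assumptions: the array name is valid both as an attribute and as a
# filter, and every name in 'extra' is listed by listAttributes(mart).
probe_annotation <- function(probe_ids, mart,
                             array = "affy_zebrafish",
                             extra = "description") {
  biomaRt::getBM(attributes = c(array, extra),  # columns to return
                 filters    = array,            # restrict by probe ID
                 values     = probe_ids,        # the probe IDs to look up
                 mart       = mart)
}

# Example session; needs biomaRt installed and a network connection.
if (interactive()) {
  library(biomaRt)
  mart <- useMart("ensembl", dataset = "drerio_gene_ensembl")
  listAttributes(mart)   # confirm the attribute names
  listFilters(mart)      # confirm the filter names
  probe_annotation("Dr.1.1.S1_at", mart)
}
```

Because the attribute names are passed in explicitly, a renamed or
missing attribute (such as hgnc_symbol for zebrafish) only requires
changing the arguments rather than waiting for a fixed getGene.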
Best,
Steffen
Georg Otto wrote:
> Hi,
>
> I have a problem using biomaRt. I want to retrieve information connected to a probe, and do something like this:
>
>
>> mart<-useMart("ensembl", dataset="drerio_gene_ensembl")
>>
> Checking attributes and filters ... ok
>
>> getAffyArrays(mart)
>>
> [1] "affy_zebrafish"
>
>> getGene(id="Dr.1.1.S1_at", array="affy_zebrafish", mart=mart)
>>
> Error in getBM(attributes = c(attrib, "hgnc_symbol", "description", chrname, :
> attribute: hgnc_symbol not found, please use the function 'listAttributes' to get valid attribute names
>
>
>
>> listAttributes(mart=mart)
>>
>
> <snip>
> [8] "adf_swissprot"
> [9] "affy_zebrafish"
> [10] "affy_zebrafish_primary_db"
> [11] "agilent_g2518a"
> <snip>
>
> So it seems that the attribute "affy_zebrafish" exists.
>
> What is wrong here? I had this problem before, and I got a reply from
> Steffen Durinck (see below) that it has to do with an inconsistency in
> attribute and filter naming, i.e. the attribute for the affyids was
> zebrafish_affy and the filter was called affy_zebrafish. I was told
> that this would be fixed in the next Ensembl release. It seems that the
> difference between the array and the attribute has been repaired, since
> both are now called affy_zebrafish, but the problem still persists.
>
>
>> sessionInfo()
>>
> Version 2.3.1 (2006-06-01)
> powerpc-apple-darwin8.6.0
>
> attached base packages:
> [1] "methods" "stats" "graphics" "grDevices" "utils" "datasets"
> [7] "base"
>
> other attached packages:
>      DBI  biomaRt    RCurl      XML
> "0.1-10"  "1.6.0"  "0.6-2" "0.99-7"
>
> Cheers,
>
> Georg
>
>
>
>
>> Hi,
>>
>>
>>> My understanding is that the BioMart folks are making changes to the
>>> table names for the BioMart database, and this is happening right now
>>> (i.e., right after BioC 1.8 is released). Unfortunately this means that
>>> some of the convenience functions like getGene() are being broken.
>>>
>
>
>
>> This is correct; however, there is also a problem with the zebrafish
>> dataset. Part of the error here is produced by an inconsistency in
>> attribute and filter naming. The attribute for the affyids is
>> zebrafish_affy and the filter is called affy_zebrafish. The getGene
>> function expects these to have the same name and thus generates an error.
>>
>> To improve the BioMart datasets, we can report these inconsistencies to
>> the corresponding BioMart mailing list (in this case the Ensembl helpdesk)
>> so they get fixed in the next database release.
>>
>> Best,
>> Steffen
>>
>>
>>> Hi Georg,
>>>
>>> Georg Otto wrote:
>>>
>>>> Hi,
>>>>
>>>> using biomaRt, I get an error:
>>>>
>>>>
>>>>
>>>>> mart<-useMart("ensembl", dataset= "drerio_gene_ensembl")
>>>>> getAffyArrays(mart)
>>>>>
>>>> [1] "affy_zebrafish"
>>>>
>>>>
>>>>> getGene(id=genes.regulated.mas5, array="affy_zebrafish" , mart=mart)
>>>>>
>>>> Error in getBM(attributes = c(attrib, "hgnc_symbol", "description",
>>>> chrname, :
>>>> attribute: affy_zebrafish not found, please use the function
>>>> 'listAttributes' to get valid attribute names
>>>>
>>>>
>>>> Then I use listAttributes() as requested:
>>>>
>>>>
>>>>
>>>>> listAttributes(mart=mart)
>>>>>
>>>> and get the following output:
>>>>
>>>> <snip>
>>>> [309] "zebrafish_affy"
>>>> [310] "zebrafish_affy_primary_db"
>>>> <snip>
>>>>
>>>> using "zebrafish_affy" instead of "affy_zebrafish" does not help,
>>>> however:
>>>>
>>>>
>>>>
>>>>> getGene(id=genes.regulated.mas5, array="zebrafish_affy" , mart=mart)
>>>>>
>>>> Error in getBM(attributes = c(attrib, "hgnc_symbol", "description",
>>>> chrname, :
>>>> attribute: hgnc_symbol not found, please use the function
>>>> 'listAttributes' to get valid attribute names
>>>>
>>>> Any hint will be appreciated.
>>>>
>>> My understanding is that the BioMart folks are making changes to the
>>> table names for the BioMart database, and this is happening right now
>>> (i.e., right after BioC 1.8 is released). Unfortunately this means that
>>> some of the convenience functions like getGene() are being broken.
>>>
>>> I think your best bet is to use getBM() directly, and query for things
>>> that you see when you do listAttributes(mart).
>>>
>>> HTH,
>>>
>>> Jim
>>>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>
--
Steffen Durinck, Ph.D.
Oncogenomics Section
Pediatric Oncology Branch
National Cancer Institute, National Institutes of Health
URL: http://home.ccr.cancer.gov/oncology/oncogenomics/
Phone: 301-402-8103
Address:
Advanced Technology Center,
8717 Grovemont Circle
Gaithersburg, MD 20877