[BioC] problem getting biotype in biomaRt
Rhoda Kinsella
rhoda at ebi.ac.uk
Tue Jan 13 11:41:24 CET 2009
Hi Steffen and Elizabeth,
I have had a look through the Ensembl mart configuration and have
found an error that may be causing the current gene and transcript
biotype problem.
The pointer attribute for structure_biotype is still pointing to
biotype, so I will change it to point to gene_biotype, which should
solve the issue. I will implement this change for release 53
(approximately mid-February). My apologies for any inconvenience, and
thank you for reporting this problem.
Regards,
Rhoda
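
Separately from the pointer fix above, the first error quoted below ("Attributes from multiple attribute pages are not allowed") suggests that gene_biotype sits on a different attribute page from the exon/structure attributes. One possible workaround is to run one query per page and join the results on ensembl_gene_id. The following is only a minimal sketch of that idea, not a tested recipe against the live mart: combine_pages and fake_fetch are hypothetical names, fake_fetch is an offline stand-in for a getBM call, and the exon IDs and biotype value in it are made-up placeholders.

```r
# Hypothetical helper: fetch attributes from two attribute pages in
# separate queries, then join the two results on the gene ID.
# `fetch` is any function that takes a character vector of attribute
# names and returns a data frame with one column per attribute.
combine_pages <- function(fetch, structure_attrs, feature_attrs,
                          key = "ensembl_gene_id") {
  structure <- fetch(unique(c(key, structure_attrs)))
  features  <- fetch(unique(c(key, feature_attrs)))
  # Keep every structure row; gene-level columns are repeated per exon.
  merge(structure, features, by = key, all.x = TRUE)
}

# Offline stand-in for getBM so the sketch runs without a network
# connection. All values below are placeholders, not real mart output.
fake_fetch <- function(attrs) {
  full <- data.frame(
    ensembl_gene_id = c("ENSG00000187634", "ENSG00000187634"),
    ensembl_exon_id = c("EXON_A", "EXON_B"),
    rank            = c(1L, 2L),
    gene_biotype    = "protein_coding",
    stringsAsFactors = FALSE
  )
  unique(full[, attrs, drop = FALSE])
}

result <- combine_pages(fake_fetch,
                        structure_attrs = c("ensembl_exon_id", "rank"),
                        feature_attrs   = "gene_biotype")
print(result)
```

With a real mart, fetch would be something along the lines of function(attrs) getBM(attrs, filters = "ensembl_gene_id", values = "ENSG00000187634", mart = mart). Whether gene_biotype can be retrieved on its own depends on the mart release.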
On 13 Jan 2009, at 10:08, Rhoda Kinsella wrote:
> Hi Steffen and Elizabeth,
> The biotype attribute was split into gene_biotype and
> transcript_biotype after a user requested that we provide the
> transcript_biotype information. I have carried out the query below on
> the Ensembl mart web interface and no errors are reported. Steffen,
> can you give me some more information about where you think the
> source of the problem is, so that I can help look into this?
> Kind regards,
> Rhoda
>
>
>
>
> On 12 Jan 2009, at 20:48, steffen at stat.berkeley.edu wrote:
>
>> Hi Elizabeth,
>>
>> The biotype attribute seems to have been split into separate
>> gene_biotype and transcript_biotype attributes; these two represent
>> the same info.
>>
>> These two new attributes, however, are indeed currently not
>> retrievable, and I am investigating what causes this. It looks like
>> it is on the BioMart side.
>>
>> Cheers,
>> Steffen
>>
>>
>>
>>> Hi,
>>> I am trying to pull down information from Ensembl using biomaRt and I
>>> can't get the relevant biotype information (for Human). The old
>>> 'biotype' attribute doesn't exist, so what I see is 'gene_biotype' and
>>> 'structure_biotype'. I have no idea what the difference is, but I
>>> can't get either one. The error says it's probably an internal error
>>> to be reported, but I also get this when I try to bring down what I
>>> think are incompatible attributes.
>>> Thanks,
>>> Elizabeth
>>>
>>>> library(biomaRt)
>>>> mart<-useMart("ensembl",dataset= "hsapiens_gene_ensembl")
>>> Checking attributes and filters ... ok
>>>> martAttr<-listAttributes(mart)
>>>> att<-c("ensembl_gene_id",
>>> + "ensembl_transcript_id",
>>> + "ensembl_exon_id",
>>> + "exon_chrom_start",
>>> + "exon_chrom_end",
>>> + "strand",
>>> + "chromosome_name",
>>> + "rank",
>>> + "3_utr_start","3_utr_end",
>>> + "5_utr_start","5_utr_end"
>>> + )
>>>> all(att%in%martAttr[,1]) #valid names for the mart
>>> [1] TRUE
>>> #works fine here
>>>> tempGene <- getBM(att,filter="ensembl_gene_id",value="ENSG00000187634",mart = mart)
>>> #error
>>>> tempGene <- getBM(c(att,"gene_biotype"),filter="ensembl_gene_id",value="ENSG00000187634",mart = mart)
>>>
>>> V1
>>> 1 Query ERROR: caught BioMart::Exception::Usage: Attributes from multiple attribute pages are not allowed
>>> Error in getBM(c(att, "gene_biotype"), filter = "ensembl_gene_id", value = "ENSG00000187634", :
>>> Number of columns in the query result doesn't equal number of attributes in query. This is probably an internal error, please report.
>>> #again an error
>>>> tempGene <- getBM(c(att,"structure_biotype"),filter="ensembl_gene_id",value="ENSG00000187634",mart = mart)
>>>
>>> V1
>>> 1 Query ERROR: caught BioMart::Exception::Usage: Attribute biotype NOT FOUND
>>> Error in getBM(c(att, "structure_biotype"), filter = "ensembl_gene_id", :
>>> Number of columns in the query result doesn't equal number of attributes in query. This is probably an internal error, please report.
>>>> sessionInfo()
>>> R version 2.8.1 (2008-12-22)
>>> i386-pc-mingw32
>>>
>>> locale:
>>> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>>>
>>> attached base packages:
>>> [1] stats graphics grDevices utils datasets methods base
>>>
>>> other attached packages:
>>> [1] biomaRt_1.16.0
>>>
>>> loaded via a namespace (and not attached):
>>> [1] RCurl_0.93-0 tools_2.8.1 XML_1.99-0
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>
> Rhoda Kinsella Ph.D.
> Ensembl Bioinformatician,
> European Bioinformatics Institute (EMBL-EBI),
> Wellcome Trust Genome Campus,
> Hinxton
> Cambridge CB10 1SD,
> UK.
Rhoda Kinsella Ph.D.
Ensembl Bioinformatician,
European Bioinformatics Institute (EMBL-EBI),
Wellcome Trust Genome Campus,
Hinxton
Cambridge CB10 1SD,
UK.