[Bioc-devel] Change of schema of ENSEMBL biomart

Thomas Maurel maurel at ebi.ac.uk
Thu Dec 18 17:53:46 CET 2014


Dear Tiphaine,

Yes, we are trying to apply all the attribute renaming done on ensembl.org to the GRCh37 archive as well.
We plan to publish a "news" page, similar to the one we have for ensembl.org, with our next GRCh37 update (e!79), planned for March 2015.

Please find below all the changes I've made to the GRCh37 Ensembl marts in Ensembl release 78:

Ensembl Genes 78
- Added "Associated Gene Name" (internal name "external_gene_name") in the id list limit filter section
- Renamed "external_gene_id" attribute to "external_gene_name"
- Renamed "Reference ID" to "Variation Name" and "Source description" to "Variation source description" in the Variation attribute section for germline and somatic
- Renamed internal names:
  - For the "Variation name" attribute, from "external_id" to "variation_name"
  - For the "Variation Source" attribute, from "source_name" to "germ_line_variation_source"
  - For the "Variation name" somatic attribute, from "somatic_reference_id" to "somatic_variation_name"
- Removed DN/DS attribute for species that don't have the data
- Added a "Transcript Type" filter and renamed the Gene biotype and Transcript biotype attribute display names to "Gene type" and "transcript type"
- Re-ordered all the id list filters

Ensembl Variation 78
- Renamed the following attributes in snp, structvar, structvar som and snp_som:
  - 1000 genomes global MAF (all) to "1000 genomes global Minor Allele Frequency (all)"
  - 1000 genomes global MAC (all) to "1000 genomes global Minor Allele Count (all)"
  - Position on chromosome (bp) to "Chromosome position start (bp)"
  - Added "Chromosome position end (bp)"
  - sequence region start to "Chromosome position start (bp)" and sequence region end to "Chromosome position end (bp)"
- Added Band filter for the structvar and structvar somatic templates
- Added Marker filter for the structvar and structvar somatic templates

Vega 58
- Added "Associated Gene Name" (internal name "external_gene_name") in the id list limit filter section
- Renamed "external_gene_id" attribute to "external_gene_name"
- Added a "Transcript Type" filter and renamed the Gene biotype and Transcript biotype attribute display names to "Gene type" and "transcript type"
- Renamed all the "External ID and DB" display names to "Associated Name" and "Associated Source"
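
For scripts that still use the old internal names, the renames listed above could be applied with a small lookup table before a query is sent. The following is only a sketch under stated assumptions: the helper name migrate_attributes and the sample attribute list are hypothetical and are not part of biomaRt or of any Ensembl tool, and the table covers only the internal-name renames mentioned in this message. The variation entries apply to the Variation attribute section of the gene mart, so they should be checked against the target mart before use.

```python
# Old -> new internal names, copied from the change list in this message.
# Hypothetical helper for illustration; not part of biomaRt or Ensembl.
RENAMED_INTERNAL_NAMES = {
    "external_gene_id": "external_gene_name",
    "external_id": "variation_name",
    "source_name": "germ_line_variation_source",
    "somatic_reference_id": "somatic_variation_name",
}


def migrate_attributes(attributes, renames=RENAMED_INTERNAL_NAMES):
    """Return (updated, changed).

    updated: the attribute list with old internal names replaced.
    changed: the (old, new) pairs that were actually rewritten.
    """
    updated = []
    changed = []
    for name in attributes:
        new_name = renames.get(name, name)  # unknown names pass through
        if new_name != name:
            changed.append((name, new_name))
        updated.append(new_name)
    return updated, changed


if __name__ == "__main__":
    # Sample attribute list as an older script might have requested it.
    old_query = ["ensembl_gene_id", "external_gene_id", "chromosome_name"]
    new_query, changed = migrate_attributes(old_query)
    print(new_query)
    for old, new in changed:
        print(f"renamed {old} -> {new}")
```

Returning the list of rewritten pairs alongside the updated attributes makes it easy to log which names were outdated, so the calling code can be fixed at the source rather than patched on every run.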

Hope this helps,
Regards,
Thomas
> On 18 Dec 2014, at 16:37, Martin, Tiphaine <tiphaine.martin at kcl.ac.uk> wrote:
> 
> Hi Thomas,
> 
> I would like to know whether you have also changed the schema of the ENSEMBL mart at http://grch37.ensembl.org/biomart/martview, for example the internal name of the "Associated Gene Name" attribute from "external_gene_id" to "external_gene_name", as in the new release 77. My requests to the ENSEMBL mart worked well until at least 25/11/2014, but they no longer seem to work.
>
> Could you also send a message to bioc-devel whenever you change the schema of the ENSEMBL mart?
> 
> Thanks,
> 
> Tiphaine
> 
> From: Thomas Maurel <maurel at ebi.ac.uk>
> Date: Friday, 17 October 2014 14:06
> To: Tiphaine Martin <tiphaine.martin at kcl.ac.uk>
> Cc: "bioc-devel at r-project.org" <bioc-devel at r-project.org>
> Subject: Re: [Bioc-devel] Change of schema of ENSEMBL biomart
> 
> Dear Tiphaine,
> 
> The Ensembl marts on the biomart.org portal were recently updated from Ensembl release 75 to release 77. In release 76, we renamed the "Associated Gene Name" attribute internal name from "external_gene_id" to "external_gene_name". We have also added a new "Associated Gene Name" id list limit filter in release 77.
> From release 77 onward, we have decided to declare all attribute/filter internal name changes on the Ensembl website declaration page: http://www.ensembl.org/info/website/news.html#cat-other.
> The data in the "hsapiens_gene_ensembl" dataset have also changed a lot since we moved from the human assembly GRCh37 to GRCh38 in Ensembl release 76. If you are still interested in the GRCh37 human assembly, you can access our archive Ensembl marts on the following page: http://grch37.ensembl.org/biomart/martview
> 
> Hope this helps,
> Apologies for any inconvenience caused,
> Regards,
> Thomas
> On 17 Oct 2014, at 14:48, Martin, Tiphaine <tiphaine.martin at kcl.ac.uk> wrote:
> 
>> Hi Everybody,
>> 
>> 
>> Yesterday, I observed that ENSEMBL slightly changed the schema of the dataset "hsapiens_gene_ensembl".
>>
>> In my case, some of my functions were impacted by the change from "external_gene_id" to "external_gene_name".
>>
>> There may be other changes in this dataset or in other datasets.
>> 
>> 
>> Regards,
>> 
>> Tiphaine
>> 
> 
> --
> Thomas Maurel
> Bioinformatician - Ensembl Production Team
> European Bioinformatics Institute (EMBL-EBI)
> European Molecular Biology Laboratory
> Wellcome Trust Genome Campus
> Hinxton
> Cambridge CB10 1SD
> United Kingdom
> 

