[BioC] biomaRt query ??

Rhoda Kinsella rhoda at ebi.ac.uk
Wed Oct 26 12:04:33 CEST 2011


Hi Tim
As you are only asking for the DBASS5 name and ID, you will only get
the aberrant 5' splice sites generated as a result of disease-causing
mutations in human genes (see here for more information: http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/).
You should take a look at the transcript event attributes from the
Ensembl BioMart (see here for a paper about this project: http://www.ncbi.nlm.nih.gov/pubmed/18978772),
as these will give you the alternative splice site data I think you
are looking for. An example BioMart XML query for those attributes:


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query  virtualSchemaName = "default" formatter = "TSV" header = "0"  
uniqueRows = "0" count = "" datasetConfigVersion = "0.6" >
			
	<Dataset name = "hsapiens_gene_ensembl" interface = "default" >
		<Attribute name = "ensembl_gene_id" />
		<Attribute name = "ensembl_transcript_id" />
		<Attribute name = "ensembl_peptide_id" />
		<Attribute name = "name_1078" />
		<Attribute name = "splicing_event__dm_name_1059" />
		<Attribute name = "splicing_event_type" />
		<Attribute name = "name_106" />
		<Attribute name = "seq_region_start_1078" />
		<Attribute name = "seq_region_end_1078" />
		<Attribute name = "seq_region_strand_1078" />
	</Dataset>
</Query>
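
The same attribute list could also be requested from biomaRt with getBM(). What follows is only a minimal sketch: the attribute names are copied from the XML query above and may not exist in other Ensembl releases, so the code checks them against listAttributes() first. The helper name get.splice.events, the variable splice.atb, and the restriction to a single chromosome (to keep the download small) are illustrative assumptions, not part of the original query.

```r
# Attribute names copied from the XML query above (assumed to still be
# offered by the mart you connect to; they can change between releases).
splice.atb <- c("ensembl_gene_id", "ensembl_transcript_id",
                "ensembl_peptide_id", "name_1078",
                "splicing_event__dm_name_1059", "splicing_event_type",
                "name_106", "seq_region_start_1078",
                "seq_region_end_1078", "seq_region_strand_1078")

# Hypothetical helper: fetch the transcript event attributes for one
# chromosome. 'mart' is a Mart object as returned by biomaRt::useMart().
get.splice.events <- function(mart, chromosome = "21") {
  # Names of all attributes this mart actually exposes
  available <- biomaRt::listAttributes(mart)$name
  missing <- setdiff(splice.atb, available)
  if (length(missing) > 0) {
    warning("Not available in this mart: ", paste(missing, collapse = ", "))
  }
  # Request only the attributes that exist, filtered to one chromosome
  # (the chromosome filter is an assumption added to limit the result size)
  biomaRt::getBM(attributes = intersect(splice.atb, available),
                 filters = "chromosome_name",
                 values = chromosome,
                 mart = mart)
}
```

With the biomaRt package installed and a network connection, calling get.splice.events(useMart("ensembl", dataset = "hsapiens_gene_ensembl")) should return a data frame with one column per available attribute.
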
I hope that helps
Regards
Rhoda


On 24 Oct 2011, at 20:10, Tim Smith wrote:

> Hi,
>
> I wanted to determine the locations for all the alternative splicing  
> sites. I've made the query in biomaRt, but am not sure if this is  
> giving me what I want. Any help would be appreciated!
>
> #####
> library(biomaRt)
> ensembl = useMart("ensembl", dataset = "hsapiens_gene_ensembl")
> atb2 <- c("dbass5_id", "dbass5_name", "hgnc_symbol",
>           "chromosome_name", "start_position", "end_position", "strand")
> splice.locs <- getBM(attributes=atb2, mart=ensembl)
> print(splice.locs[1:5,])
>
> ####
>
> thanks!

Rhoda Kinsella Ph.D.
Ensembl Production Project Leader,
European Bioinformatics Institute (EMBL-EBI),
Wellcome Trust Genome Campus,
Hinxton
Cambridge CB10 1SD,
UK.


