[BioC] biomaRt query ??
Rhoda Kinsella
rhoda at ebi.ac.uk
Wed Oct 26 12:04:33 CEST 2011
Hi Tim
As you are only asking for the DBASS5 name and ID, you will only get
the aberrant 5' splice sites generated as a result of disease-causing
mutations in human genes (see here for more information: http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/).
You should take a look at the transcript event attributes from the
Ensembl BioMart (see here for the paper about this project: http://www.ncbi.nlm.nih.gov/pubmed/18978772),
as these will give you the alternative splice site data I think you
are looking for. For example, the following BioMart XML query retrieves them:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query virtualSchemaName = "default" formatter = "TSV" header = "0"
uniqueRows = "0" count = "" datasetConfigVersion = "0.6" >
<Dataset name = "hsapiens_gene_ensembl" interface = "default" >
<Attribute name = "ensembl_gene_id" />
<Attribute name = "ensembl_transcript_id" />
<Attribute name = "ensembl_peptide_id" />
<Attribute name = "name_1078" />
<Attribute name = "splicing_event__dm_name_1059" />
<Attribute name = "splicing_event_type" />
<Attribute name = "name_106" />
<Attribute name = "seq_region_start_1078" />
<Attribute name = "seq_region_end_1078" />
<Attribute name = "seq_region_strand_1078" />
</Dataset>
</Query>
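The same request can be sketched in biomaRt roughly as follows. This is only a sketch: the attribute names are copied from the XML query above and are assumed to exist in the mart you connect to (listAttributes() will show what is actually available), and get_splicing_events is a hypothetical helper name, not part of biomaRt.

```r
# Attribute names copied from the XML query above; they are assumed to
# exist in the mart you connect to (check with biomaRt::listAttributes()).
event_attributes <- c(
  "ensembl_gene_id", "ensembl_transcript_id", "ensembl_peptide_id",
  "name_1078", "splicing_event__dm_name_1059", "splicing_event_type",
  "name_106", "seq_region_start_1078", "seq_region_end_1078",
  "seq_region_strand_1078"
)

# Hypothetical helper: the query is only sent when the function is called,
# so defining it needs neither network access nor a loaded biomaRt.
get_splicing_events <- function(attributes = event_attributes) {
  ensembl <- biomaRt::useMart("ensembl", dataset = "hsapiens_gene_ensembl")
  biomaRt::getBM(attributes = attributes, mart = ensembl)
}

# Usage (requires network access and the biomaRt package):
# splice.locs <- get_splicing_events()
# head(splice.locs)
```

If getBM() rejects one of the names, compare the vector against the output of listAttributes() for the same mart and drop or rename the offending entries.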
I hope that helps
Regards
Rhoda
On 24 Oct 2011, at 20:10, Tim Smith wrote:
> Hi,
>
> I wanted to determine the locations for all the alternative splicing
> sites. I've made the query in biomaRt, but am not sure if this is
> giving me what I want. Any help would be appreciated!
>
> #####
> library(biomaRt)
> ensembl = useMart("ensembl", dataset = "hsapiens_gene_ensembl")
> atb2 <- c("dbass5_id", "dbass5_name", "hgnc_symbol",
>           "chromosome_name", "start_position", "end_position", "strand")
> splice.locs <- getBM(attributes=atb2, mart=ensembl)
> print(splice.locs[1:5,])
>
> ####
>
> thanks!
Rhoda Kinsella Ph.D.
Ensembl Production Project Leader,
European Bioinformatics Institute (EMBL-EBI),
Wellcome Trust Genome Campus,
Hinxton
Cambridge CB10 1SD,
UK.