[BioC] Obtaining exon structure of a gene via Bioconductor
Steve Lianoglou
mailinglist.honeypot at gmail.com
Tue Feb 2 17:35:03 CET 2010
Hi,
On Tue, Feb 2, 2010 at 11:08 AM, Ruppert Valentino <ruppert7 at hotmail.com> wrote:
> Hello,
>
> I want to do heteroduplex analysis on each exon of around 50 genes. Getting the exon structure for each gene from Ensembl and manually identifying the exon sequences is very laborious.
>
> Is there a way to use a Bioconductor package to get the exon sequences for all the transcripts of a gene? If so, how can I do this? Would biomaRt do it, and if so, how?
>
> Any example scripts or ideas would be greatly appreciated, as it takes hours to get all the exon sequences for a gene split up into files to use for PCR.
>
> thanks in advance for any help on this.
I'm not sure that it really takes hours to get the exon structure ...
I've actually been developing and using a package to do this:
http://wiki.github.com/lianos/GenomeAnnotations
I'm not necessarily recommending that you use this package, but I
outlined the steps you could take to download the RefSeq gene
annotations for mm9 in the "Downloading the Gene Annotation File"
section of this page:
http://wiki.github.com/lianos/GenomeAnnotations/installing-annotation-packages
You'll get a tab-delimited file with one line per transcript. The
exonStart and exonEnd columns are comma-separated lists of numbers
that hold the information you're looking for.
If you only want a few genes, then parsing that file shouldn't be too bad ...
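A minimal sketch of that parsing in Python, in case it helps. It
assumes the file follows the UCSC refGene table layout (transcript
name in column 1, exon starts and ends in columns 9 and 10, gene
symbol in column 12, counting from 0); those column positions, the
function name read_exons and the sample lines are all assumptions for
illustration, so check them against the file you actually download.

```python
import csv
import io


def read_exons(handle, wanted_genes, name_col=1, chrom_col=2, strand_col=3,
               starts_col=9, ends_col=10, gene_col=12):
    """Yield one dict per transcript whose gene symbol is in wanted_genes.

    Column indexes default to the assumed UCSC refGene layout; override
    them if your file is laid out differently.
    """
    wanted = set(wanted_genes)
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank or truncated lines, and genes we did not ask for.
        if len(row) <= gene_col or row[gene_col] not in wanted:
            continue
        # The lists usually end with a trailing comma, so drop empty pieces.
        starts = [int(x) for x in row[starts_col].split(",") if x]
        ends = [int(x) for x in row[ends_col].split(",") if x]
        yield {
            "transcript": row[name_col],
            "gene": row[gene_col],
            "chrom": row[chrom_col],
            "strand": row[strand_col],
            "exons": list(zip(starts, ends)),
        }


# Two made-up lines in the assumed layout (not real annotations).
sample = (
    "0\tNM_000001\tchr1\t+\t100\t900\t150\t850\t3\t"
    "100,400,700,\t200,500,900,\t0\tGeneA\tcmpl\tcmpl\t0,1,2,\n"
    "0\tNM_000002\tchr2\t-\t50\t300\t60\t290\t2\t"
    "50,200,\t120,300,\t0\tGeneB\tcmpl\tcmpl\t0,2,\n"
)

records = list(read_exons(io.StringIO(sample), wanted_genes=["GeneA"]))
for rec in records:
    print(rec["transcript"], rec["chrom"], rec["strand"], rec["exons"])
```

For a real file you would pass open("refGene.txt") (a hypothetical
file name) instead of the StringIO object. If the file does come from
UCSC, the start coordinates are 0-based and the ends are exclusive, so
adjust before extracting sequence with a 1-based tool.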
-steve
--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact