[BioC] Create transcriptDb using gff3 files? - library GenomicFeatures and rtracklayer

Sang Chul Choi schoi at cornell.edu
Thu Apr 5 03:27:22 CEST 2012


I think that gff3 is the most advanced version of gff. See the following website:

http://www.sequenceontology.org/gff3.shtml

The specification is not crystal clear, though.  mRNA features seem to correspond to transcripts, with exon and CDS features as their parts, and mRNA features in turn seem to be children of gene features.
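
To make that hierarchy concrete, here is a minimal Python sketch (not anything from GenomicFeatures or rtracklayer). It assumes that the Parent attribute in column 9 is what links an exon or CDS to its mRNA, and an mRNA to its gene. The toy records, IDs and the helper names parse_attributes and children_by_parent are invented for illustration.

```python
from collections import defaultdict

# Toy GFF3 records (tab-separated, 9 columns); IDs and coordinates are invented.
GFF3_TEXT = "\n".join([
    "##gff-version 3",
    "chr1\tsrc\tgene\t100\t900\t.\t+\t.\tID=gene1",
    "chr1\tsrc\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1",
    "chr1\tsrc\texon\t100\t300\t.\t+\t.\tID=exon1;Parent=mrna1",
    "chr1\tsrc\texon\t600\t900\t.\t+\t.\tID=exon2;Parent=mrna1",
    "chr1\tsrc\tCDS\t150\t300\t.\t+\t0\tID=cds1;Parent=mrna1",
])


def parse_attributes(column9):
    """Split 'key=value;key=value' into a dict."""
    pairs = (item.split("=", 1) for item in column9.split(";") if "=" in item)
    return {key.strip(): value.strip() for key, value in pairs}


def children_by_parent(gff3_text):
    """Map each parent ID to a list of (type, start, end, ID) child tuples."""
    children = defaultdict(list)
    for line in gff3_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # not a well-formed feature line
        attrs = parse_attributes(fields[8])
        # A feature may list several parents, separated by commas.
        for parent in filter(None, attrs.get("Parent", "").split(",")):
            children[parent].append(
                (fields[2], int(fields[3]), int(fields[4]), attrs.get("ID"))
            )
    return dict(children)


tree = children_by_parent(GFF3_TEXT)
print(tree["gene1"])                          # the mRNA(s) under the gene
print([child[0] for child in tree["mrna1"]])  # feature types under the mRNA
```

Files that omit Parent, or that hang exons directly off a gene, would need extra handling that this sketch does not attempt.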

I've found that the gff3 file I have been parsing does not seem to be typical: it has no alternatively spliced mRNAs (the genome I am working on is still at a draft stage). This makes things easier.  I've also found that not all exons have corresponding CDSs, which makes it harder to create a TranscriptDb object.
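
One possible way to cope with exons that have no CDS is to emit one row per exon and leave the CDS columns empty where nothing matches. The Python sketch below illustrates only that idea. The column names (tx_id, exon_rank, exon_start, exon_end, cds_start, cds_end) are borrowed from my understanding of the splicings argument of makeTranscriptDb and should be checked against its man page. The toy coordinates and the helper name splicings_rows are invented, and the sketch assumes each CDS segment lies entirely within one exon.

```python
# Toy input: exon and CDS ranges for one transcript on the + strand (invented).
EXONS = {"mrna1": [(100, 300), (600, 900), (1200, 1500)]}
CDSS = {"mrna1": [(150, 300), (600, 750)]}  # the third exon is entirely UTR


def splicings_rows(exons, cdss, minus_strand=()):
    """Build one row per exon; CDS columns are None when no CDS lies within the exon."""
    rows = []
    for tx_id, tx_exons in sorted(exons.items()):
        # Exon rank follows transcription order, so reverse on the minus strand.
        ordered = sorted(tx_exons, reverse=tx_id in minus_strand)
        for rank, (ex_start, ex_end) in enumerate(ordered, start=1):
            # Look for a CDS segment contained in this exon.
            inside = [
                (start, end)
                for (start, end) in cdss.get(tx_id, [])
                if ex_start <= start and end <= ex_end
            ]
            cds_start, cds_end = inside[0] if inside else (None, None)
            rows.append({
                "tx_id": tx_id,
                "exon_rank": rank,
                "exon_start": ex_start,
                "exon_end": ex_end,
                "cds_start": cds_start,
                "cds_end": cds_end,
            })
    return rows


rows = splicings_rows(EXONS, CDSS)
for row in rows:
    print(row)
```

In R the None values would presumably become NA in the corresponding data frame columns.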

Thank you,

SangChul

________________________________________
From: bioconductor-bounces at r-project.org [bioconductor-bounces at r-project.org] on behalf of Marc Carlson [mcarlson at fhcrc.org]
Sent: Wednesday, April 04, 2012 8:44 PM
To: bioconductor at r-project.org
Subject: Re: [BioC] Create transcriptDb using gff3 files? - library GenomicFeatures and rtracklayer

I was looking at this during the course, and this is on my TODO list for
the next release cycle.  I think it is long overdue and I don't think
that the community is going to get it done in spite of all the
enthusiasm.  There has not been time to do it before now but I am hoping
that will now change.  It should be simple enough in principle, but it
might not be exactly trivial as I have discovered (on closer inspection)
that the gff specification is not as concrete as one would like it to
be.  Also there have been several different versions.

Some things that can help speed me along:

1) which version is most important?  gff3?  Or one of the other
versions?  It is likely that with the older versions we may not be able
to extract as much meaningful information.

2) where is the best place to find some typical gff3 files for
examples?  This should not be difficult, but when I was looking before I
was finding that people were surprisingly stingy about sharing these.


   Marc



On 04/03/2012 03:57 PM, Michael Lawrence wrote:
> Marc was working on this during the course in Feb. Not sure what happened
> to it. He said it was simple. Maybe just waiting for the release to pass.
>
> Michael
>
> On Tue, Apr 3, 2012 at 3:40 PM, Steve Lianoglou<
> mailinglist.honeypot at gmail.com>  wrote:
>
>> Hi,
>>
>> On Tue, Apr 3, 2012 at 4:41 PM, Sang Chul Choi<schoi at cornell.edu>  wrote:
>>> Hi,
>>>
>>> I am wondering if I could create a TranscriptDb object (library
>> GenomicFeatures) using a gff3 file.  I could read a gff3 file using
>> import.gff3, but I could not find a way to create TranscriptDb object from
>> the object from import.gff3.
>>> Two arguments for makeTranscriptDb are required: transcripts, splicings.
>> It does not seem to be easy to parse this information out of the object returned by
>> import.gff3.  I would appreciate any help.
>>
>> As far as I know, this functionality isn't there yet ...
>>
>> I once (early feb, 2012) suggested I might take a crack at making this
>> happen but haven't actually found the time to do it ... I'm not sure
>> anyone in bioc-core land (hi, Marc) has found the time to do it
>> either, so I think you're out of luck.
>>
>> Sorry for that. But the good news is that I bet a patch that does this
>> would be welcome ;-)
>>
>> -steve
>>
>> --
>> Steve Lianoglou
>> Graduate Student: Computational Systems Biology
>>   | Memorial Sloan-Kettering Cancer Center
>>   | Weill Medical College of Cornell University
>> Contact Info: http://cbio.mskcc.org/~lianos/contact
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
