[BioC] Can subjunc use splice sites from a reference annotation (i.e. GTF file)?

Wei Shi shi at wehi.EDU.AU
Fri Jan 17 07:38:03 CET 2014

Dear Ryan,

I agree that leveraging the knowledge of known splice sites is likely to increase the accuracy of read mapping. 

When performing read alignments for RNA-seq data, subjunc first identifies high-confidence exon-spanning reads and uses them to compile a list of splice sites. It then uses the discovered splice sites to re-align all the reads in order to achieve the best mapping results. This is the fundamental difference between subjunc and other splice-aware aligners. Our evaluation results have already shown that subjunc was more accurate in identifying splice sites and in mapping exon-spanning reads (Tables 6 and 7 in PMID:23558742).

Allowing users to provide an annotation would complement the list of splice sites discovered from the data and could further improve the mapping results. It is on our to-do list to investigate this and possibly implement it.
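
As an illustration of the tab-separated splice-site list mentioned in the question below, here is a minimal Python sketch that derives splice sites from consecutive exons of the same transcript in a GTF file. It is not something subjunc or Rsubread can read as input at present. The function names (splice_sites_from_gtf, write_splice_sites) and the output columns (chromosome, intron start, intron end, strand, with 1-based inclusive coordinates) are assumptions made for this example only.

```python
import csv
import re
from collections import defaultdict

# Assumes the GTF attribute column carries transcript_id "..." entries.
_TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def splice_sites_from_gtf(lines):
    """Return sorted, unique (chrom, intron_start, intron_end, strand) tuples.

    Coordinates are 1-based and inclusive (an assumed convention).
    """
    exons = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # only exon records define splice sites
        match = _TRANSCRIPT_ID.search(fields[8])
        if match is None:
            continue
        chrom, strand = fields[0], fields[6]
        start, end = int(fields[3]), int(fields[4])
        exons[(chrom, strand, match.group(1))].append((start, end))

    sites = set()
    for (chrom, strand, _tid), spans in exons.items():
        spans.sort()
        # Each pair of consecutive exons in a transcript implies one intron.
        for (_, prev_end), (next_start, _) in zip(spans, spans[1:]):
            if next_start > prev_end + 1:
                sites.add((chrom, prev_end + 1, next_start - 1, strand))
    return sorted(sites)


def write_splice_sites(sites, out):
    """Write one splice site per line as tab-separated values."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerows(sites)
```

Passing an open GTF file handle to splice_sites_from_gtf and the result to write_splice_sites (with sys.stdout or an output file) would give one line per unique annotated intron. Splice sites shared by several transcripts are reported once.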

Best wishes,

On Jan 17, 2014, at 11:20 AM, Ryan C. Thompson wrote:

> Hello,
> I am looking to test out subjunc for aligning my RNA-seq data. I have a reference genome and GTF file describing annotated transcripts, with splice sites implied by consecutive exons in the same transcript. If necessary, I could easily generate a tab-separated file just describing all the splice sites in the annotation. Many other spliced aligners can use this information to better align reads to known splice forms. Is there any way for subjunc to use this information? I don't see any option for this in either the command line subjunc program or the "align" function in the Rsubread package.
> -Ryan Thompson
