[BioC] Devel version of easyRNASeq: using the simpleRNASeq method gives an error

Thu Mar 27 16:26:27 CET 2014

On 27 Mar 2014, at 15:54, Sylvain Foisy Ph. D. <sylvain.foisy at diploide.net> wrote:

> Hi Nicolas,
> 
> On Mar 27, 2014, at 10:40 AM, Nicolas Delhomme <nicolas.delhomme at umu.se> wrote:
> 
>>> I realized that my first attempts at DE analysis were using tophat2 alignments unfiltered for badly mapped reads.
>> 
>> That’s a good idea since easyRNASeq was not paying attention to that previously apart from mentioning it in the vignette. The new function which you tried is meant to fix this among other things.
> 
> Am I to understand that easyRNASeq::simpleRNASeq will do the filtering? :-)

Well, it’s going to do what is standard nowadays, i.e. dropping multimapping and/or ambiguously mapping reads. Adding a mapping quality cutoff should not be too difficult either; is that what you had in mind by ‘filtering’? If you meant something like what “trimmomatic” does, then the answer is no; we’re very happy using trimmomatic :-) No need to re-invent the wheel, right ;-)
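
To make the kind of filtering discussed here concrete, below is a minimal sketch in plain Python. It is not what easyRNASeq does internally. It assumes SAM-formatted text records, treats the NH tag (as written by aligners such as TopHat2) as the marker for multimapping reads, and uses hypothetical names (keep_alignment, filter_sam) and an arbitrary default cutoff of 10. It does not address ambiguously mapping reads, i.e. reads overlapping several features.

```python
def keep_alignment(sam_line, min_mapq=10):
    """Return True for a mapped, primary, uniquely placed record passing the MAPQ cutoff.

    The function name and the default cutoff are illustrative assumptions.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])   # bitwise FLAG (column 2)
    mapq = int(fields[4])   # mapping quality (column 5)

    if flag & 0x4:          # read is unmapped
        return False
    if flag & 0x100:        # secondary alignment
        return False
    if mapq < min_mapq:     # below the mapping quality cutoff
        return False

    # Optional fields start at column 12; NH:i:<n> is the number of reported hits.
    for tag in fields[11:]:
        if tag.startswith("NH:i:") and int(tag[5:]) > 1:
            return False    # multimapping read
    return True


def filter_sam(lines, min_mapq=10):
    """Yield header lines unchanged and only the alignment records that pass the filter."""
    for line in lines:
        if line.startswith("@") or keep_alignment(line, min_mapq):
            yield line
```

Passing the lines of a SAM file through filter_sam keeps the header and drops unmapped, secondary, low-quality and multiply reported alignments; setting min_mapq=0 disables the quality cutoff while still dropping multimappers.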

> 
>>> 
>>> If I tried using "genes" and summarization=“geneModels”, I would get these, but I am also unable to write the count data to file; I get a complaint that the count.table object cannot be written to file using the write.table function…
>> 
>> Try 1.8.7 then. And instead of the genes / geneModels paradigm, have a look at section 7.1 of the vignette for a more accurate (IMO) and faster approach to building geneModels (what I now call "synthetic transcripts" rather than "gene models", as that last term was confusing because of its many meanings, e.g. if you think of a genic locus in an assembly project).
> 
> I actually read that section ;-) I have to admit that the process to create the synthetic transcripts looks a bit daunting… I have used BioC/R mostly with canned methods so some of the high-end operations are still bizarre to me. Well, let’s dive!
> 

I’ll try to “can” that function asap ;-), but annotation formats can be a challenge as they are so lenient. Luckily there are some Bioc resources for these already.
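
As a rough illustration of the idea only, here is a minimal Python sketch. It assumes that a synthetic transcript is simply the union of the exons of all transcripts of a gene, with overlapping or directly adjacent exons collapsed into single blocks. The function name and the (start, end) tuple representation are hypothetical, and this is not the vignette's code.

```python
def synthetic_transcript(exons):
    """Collapse the exons of one gene into non-overlapping blocks.

    `exons` is an iterable of (start, end) pairs, 1-based and inclusive,
    pooled from every transcript of the gene (an assumed representation).
    """
    merged = []
    for start, end in sorted(exons):
        # Overlapping or directly adjacent to the previous block: extend it.
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(block) for block in merged]
```

For example, the exons (100, 200), (150, 300), (500, 600) and (601, 700) would collapse into the two blocks (100, 300) and (500, 700), so that each position of the gene is counted only once.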

>> 
>>> None of these were observed on the unfiltered data, which suggests to me that this is something samtools did…
>> 
>> This looks more like some quality-based read trimming was done on the data rather than having anything to do with samtools. Anyway, using 1.8.7 should help get your count.table written out properly. Let me know if not.
> 
> I have used Trimmomatic to remove/trim low-quality reads and nucleotides prior to Tophat 2 alignments and that was all. I am upgrading as I write this and I’ll start the runs anew; I’ll let you know how it worked out.

Great. Thanks for the feedback!

Nico

> 
> Thanks for the inputs 
> 
> Best regards
> 
> S
>