[Bioc-devel] sensitivity and specificity in trimLRpatterns

Leandro Roser |e@ro@er @end|ng |rom gm@||@com
Wed Mar 13 12:32:02 CET 2019


Hello Bioc,

I am running some tests to compare trimLRpatterns vs other trimming
tools (skewer, cutadapt, AdapterRemoval).

I have generated simulated data using ART
(https://www.niehs.nih.gov/research/resources/software/biostatistics/art/index.cfm).
Specifically, I used a modified version of the program, written by the
authors of Skewer, that can simulate contamination with adaptors
(http://sourceforge.net/projects/skewer/files/Simulator/).

For my simulations, I created 150 bp reads at 20x coverage with a
fragment size of 200 ± 50 bp, so that reads from fragments shorter
than the read length are contaminated with adaptors. The quality
profiles were taken from actual MiSeq E. coli FASTQ files.
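
A minimal sketch of the arithmetic behind these settings, in Python.
It assumes that fragment sizes are normally distributed and that the
50 bp figure is a standard deviation; neither is stated above, and the
function names are hypothetical, not part of ART or Skewer.

```python
import math

READ_LEN = 150     # read length in bp, from the simulation settings
FRAG_MEAN = 200.0  # mean fragment size in bp
FRAG_SD = 50.0     # ASSUMPTION: the 50 bp spread is a standard deviation


def adaptor_bases(read_len, fragment_len):
    """Number of adaptor bases at the 3' end of a read.

    A read runs into the adaptor only when the fragment is shorter
    than the read itself.
    """
    return max(0, read_len - fragment_len)


def fraction_contaminated(read_len, mean, sd):
    """Expected fraction of fragments shorter than the read length.

    ASSUMPTION: fragment sizes follow a normal distribution.
    """
    z = (read_len - mean) / sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    print(adaptor_bases(READ_LEN, 120))  # a 120 bp fragment leaves 30 adaptor bases
    print(fraction_contaminated(READ_LEN, FRAG_MEAN, FRAG_SD))
```

Under these assumptions only a minority of the reads would contain any
adaptor sequence, and many of those would contain just a few adaptor
bases at the 3' end.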

Most of the programs achieve a sensitivity and specificity of 99%.
trimLRpatterns shows high specificity (99%) but very low sensitivity
(16% at most), so it fails to remove the adaptors from most
contaminated reads. I have tried changing different parameters, but I
can't improve the value.

Attached is a test script for a portion of the simulated data, where
I'm varying max.Rmismatch from 1 to 50.

I know the exact length of each correctly trimmed read; the genomic
locations are in the attached BED file. So the expected width can be
compared with the width in each program's output. I am using the same
statistics as the AdapterRemoval paper.
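
The width comparison described above could be sketched as follows, in
Python. The classification rules are an assumption made for
illustration and may differ from the definitions in the AdapterRemoval
paper: a contaminated read counts as a true positive only when it is
trimmed to exactly its true width, and an uncontaminated read counts
as a false positive when it is trimmed at all. All function names are
hypothetical.

```python
def classify(read_len, true_width, observed_width):
    """Label one read as TP, FN, TN or FP.

    ASSUMPTION: a read is contaminated when its true width is shorter
    than the full read length.
    """
    contaminated = true_width < read_len
    if contaminated:
        # Correct only if trimmed to exactly the true width.
        return "TP" if observed_width == true_width else "FN"
    # No adaptor present: any trimming is a false positive.
    return "FP" if observed_width < read_len else "TN"


def sensitivity_specificity(read_len, true_widths, observed_widths):
    """Return (sensitivity, specificity) over paired width lists."""
    counts = {"TP": 0, "FN": 0, "TN": 0, "FP": 0}
    for true_w, obs_w in zip(true_widths, observed_widths):
        counts[classify(read_len, true_w, obs_w)] += 1
    positives = counts["TP"] + counts["FN"]
    negatives = counts["TN"] + counts["FP"]
    sens = counts["TP"] / positives if positives else float("nan")
    spec = counts["TN"] / negatives if negatives else float("nan")
    return sens, spec


if __name__ == "__main__":
    # Made-up widths for six 150 bp reads, for illustration only.
    true_widths = [150, 150, 120, 100, 150, 90]
    observed_widths = [150, 140, 120, 150, 150, 95]
    print(sensitivity_specificity(150, true_widths, observed_widths))
```

Counting a read trimmed to the wrong width as a false negative is a
strict choice; a looser rule could accept widths within a few bases of
the true value.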

Any advice on this?

Thanks!

Leandro

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: bed_example.txt
URL: <https://stat.ethz.ch/pipermail/bioc-devel/attachments/20190313/6942962f/attachment.txt>

