[BioC] Trimming of partial adaptor sequences
Ryan C. Thompson
rct at thompsonclan.org
Mon Jul 22 22:37:37 CEST 2013
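The solution I have used in the past for a similar application is
Biostrings::pairwiseAlignment with type="overlap". An overlap alignment
does not penalize gaps at the ends of the sequences, so an adaptor that
is only partially present at the 3' end of a read can still align. It
is also a full alignment, so it allows for mismatches and gaps and
gives you an alignment score to threshold on.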
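
To make the idea concrete, here is a minimal Python sketch of what an
overlap-style match does for the three scenarios quoted below. It is an
ungapped stand-in, not the pairwiseAlignment algorithm and not any
Biostrings API: the function names, the made-up 12 nt adaptor, and the
mismatch-rate and minimum-overlap thresholds are all assumptions chosen
for illustration.

```python
# Hypothetical adaptor, for illustration only (not a real Illumina sequence).
ADAPTOR = "ACGTTGCAAGCT"


def find_adaptor_start(read, adaptor, max_mismatch_rate=0.1, min_overlap=3):
    """Return the 0-based position in `read` where the adaptor begins.

    Each candidate start position is compared against the adaptor prefix
    that fits in the remainder of the read, so an adaptor truncated by the
    end of the read can still match. Returns len(read) if nothing matches.
    Unlike a real alignment, this comparison does not allow gaps.
    """
    for start in range(len(read)):
        overlap = min(len(adaptor), len(read) - start)
        if overlap < min_overlap:
            break  # remaining tail is too short to call with any confidence
        window = read[start:start + overlap]
        mismatches = sum(1 for a, b in zip(window, adaptor) if a != b)
        if mismatches <= max_mismatch_rate * overlap:
            return start
    return len(read)


def trim_read(read, adaptor, **kwargs):
    """Drop everything from the detected adaptor start onward."""
    return read[:find_adaptor_start(read, adaptor, **kwargs)]


# Scenario 1: long insert, no read-through, so the read is left alone.
no_readthrough = trim_read("GGGGGGGGGGGGGGGG", ADAPTOR)

# Scenario 2: empty vector, so the read starts with the adaptor.
empty_vector = trim_read(ADAPTOR + "AAAA", ADAPTOR)

# Scenario 3: useful insert followed by only the first 5 nt of the adaptor.
partial = trim_read("GGGGGGGGGG" + ADAPTOR[:5], ADAPTOR)

print(repr(no_readthrough), repr(empty_vector), repr(partial))
```

The minimum-overlap cutoff is the main design choice here: a very short
adaptor prefix will match by chance at the end of many reads, so how
short a partial match you are willing to trim is a trade-off between
leftover adaptor bases and lost insert bases.
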
On 07/22/2013 01:02 PM, Taylor, Sean D wrote:
> We have been experimenting with an NGS protocol in which we insert sheared genomic fragments into a custom plasmid for sequencing on an Illumina MiSeq instrument. The insertion site of this plasmid is flanked by our own custom barcodes (N7) and ~80 nt Illumina-based adaptor sequence. We then PCR out the insert with barcodes and adaptors for sequencing. Our adaptor sequence is similar to the Illumina adaptor, but we use custom primer binding sites. We are not sure if the Illumina software will be able to recognize and trim our custom adaptors. We are trying to figure out the best way to trim read-through into the 3' adaptor ourselves. We have roughly three scenarios:
>
> (1) The insert is long enough that we have no read-through
> (2) The vector is empty, in which case the entire adaptor sequence is present
> (3) The insert is long enough to have useful data, but we get read-through into the 3' adaptor sequence that must be trimmed.
>
> The solution we are currently working on is to identify the minimal sequence that is recognizable as the adaptor sequence and trim that using trimLRPatterns() in the Biostrings package. Ideally we would like to give trimLRPatterns() the entire adaptor sequence and have it recognize the adaptor on our reads even if it is only partially present. However, in my experimenting it did not seem to be able to do this. I thought I would ask the Bioconductor community if there are any better solutions for recognizing and trimming partial adaptor sequences.
>
> Thanks in advance for any input.
>
> Sean Taylor
> Post-doctoral Fellow
> Fred Hutchinson Cancer Research Center
> 206-667-5544