[Bioc-sig-seq] finding the final nucleotide of trimmed reads

Hervé Pagès hpages at fhcrc.org
Sat Aug 28 04:30:03 CEST 2010


On 08/26/2010 04:58 PM, Martin Morgan wrote:
> On 08/26/2010 03:22 PM, joseph franklin wrote:
>> Many thanks for all the advice.  For the record, to get subseq() to work I had to flatten the trimmed reads to a character object:
>>
>> trimmedseq <- as.character(sread(trimmed))
>
> Hi Joe -- that shouldn't be necessary, just
>
>     subseq(sread(trimmed), start=width(trimmed), width=1)
>
> or one of the other suggestions. Or what am I missing?

Or alternatively:

   subseq(sread(trimmed), start=-1)
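
To illustrate why the two forms agree, here is a minimal sketch on a toy
DNAStringSet. The object `reads` and its contents are made up for the
example and only stand in for sread(trimmed); the sketch assumes Biostrings
is installed. A negative start is counted from the right end of each
sequence, and with end and width left out the subsequence runs to the end
of the read, so start=-1 selects the last base only.

```r
library(Biostrings)

# Hypothetical reads of unequal width, standing in for sread(trimmed)
reads <- DNAStringSet(c("ACAT", "CAG", "AGGGCGT"))

# Last base of each read, counting from the right end
last1 <- subseq(reads, start=-1)

# Same thing, spelled out with an explicit start and width
last2 <- subseq(reads, start=width(reads), width=1)

# Both forms should select the same bases
stopifnot(identical(as.character(last1), as.character(last2)))

# Nucleotide composition of the final cycle, as proportions
consensusMatrix(last1, as.prob=TRUE, baseOnly=TRUE)
```

Both calls stay on the DNAStringSet, so no as.character() conversion is
needed before consensusMatrix().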

Cheers,
H.

>
> Martin
>
>>
>> There may be a better way to accomplish that.
>>
>> Then this works great:
>>
>> last <- subseq(trimmedseq, start=width(trimmedseq), width=1)
>> consensusMatrix(last, as.prob=TRUE)
>>
>> -joe
>>
>>
>> On 26 Aug 2010, at 9:26, Steve Lianoglou wrote:
>>
>>> Howdy,
>>>
>>> On Thu, Aug 26, 2010 at 9:29 AM, Joern Toedling <Joern.Toedling at curie.fr> wrote:
>>>> Hi,
>>>>
>>>> have a look at the "shift" argument of the function consensusMatrix from
>>>> Biostrings.
>>>>
>>>> This code example should correspond to your question. Three nucleotide strings
>>>> are aligned at their last position and the sequence composition is obtained:
>>>>
>>>> A <- DNAStringSet(c("ACAT", "CAT", "AGGGCGT"))
>>>> maxlen <- max(nchar(A))
>>>> consensusMatrix(A, shift=maxlen-nchar(A), baseOnly=TRUE)
>>>
>>> Alternatively:
>>>
>>> R>  last <- subseq(A, start=width(A), width=1)
>>> R>  consensusMatrix(last)
>>>
>>> -steve
>>>
>>>> I tested this with Biostrings_2.17.29, but I guess that it works with the
>>>> current release version, too.
>>>>
>>>> Regards,
>>>> Joern
>>>>
>>>>
>>>> On Thu, 26 Aug 2010 07:35:01 -0500, joseph franklin wrote
>>>>> Hi,
>>>>>
>>>>> I've been trimming adapters from reads using trimLRPatterns.  The
>>>>> resulting trimmed set contains a heterogeneous mix of widths: from
>>>>> ~18-35 nt.  Can anyone guide me toward an elegant way to find the
>>>>> nucleotide composition of the final (right-most) cycle for each of
>>>>> the trimmed reads?
>>>>>
>>>>> Many thanks,
>>>>> Joe Franklin
>>>>
>>>> ---
>>>> Joern Toedling
>>>> Institut Curie -- U900
>>>> 26 rue d'Ulm, 75005 Paris, FRANCE
>>>> Tel. +33 (0)156246927
>>>>
>>>> _______________________________________________
>>>> Bioc-sig-sequencing mailing list
>>>> Bioc-sig-sequencing at r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>>>
>>>
>>>
>>>
>>> --
>>> Steve Lianoglou
>>> Graduate Student: Computational Systems Biology
>>>   | Memorial Sloan-Kettering Cancer Center
>>>   | Weill Medical College of Cornell University
>>> Contact Info: http://cbio.mskcc.org/~lianos/contact
>>
>
>


-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


