[Bioc-sig-seq] finding the final nucleotide of trimmed reads

Martin Morgan mtmorgan at fhcrc.org
Fri Aug 27 01:58:42 CEST 2010


On 08/26/2010 03:22 PM, joseph franklin wrote:
> Many thanks for all the advice.  For the record, to get subseq() to work I had to flatten the trimmed reads to a character object:
> 
> trimmedseq<-as.character(sread(trimmed))

Hi Joe -- that shouldn't be necessary; just

   subseq(sread(trimmed), start=width(trimmed), width=1)

or one of the other suggestions. Or what am I missing?
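
For a self-contained check of the call above, here is a minimal sketch. It assumes Biostrings is installed; the `reads` object and its sequences are made up for illustration and stand in for sread(trimmed), which is likewise a DNAStringSet.

```r
library(Biostrings)

# Hypothetical reads of unequal width, standing in for sread(trimmed)
reads <- DNAStringSet(c("ACATG", "CAT", "AGGGCGA", "TTGCT"))

# One-letter window starting at each read's own last position
last <- subseq(reads, start=width(reads), width=1)

# Base composition of that final position, as proportions
cm <- consensusMatrix(last, as.prob=TRUE, baseOnly=TRUE)
cm
```

No as.character() step is involved here; the subsetting is done on the DNAStringSet itself.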

Martin

> 
> There may be a better way to accomplish that.
> 
> Then this works great:
> 
> last <- subseq(trimmedseq, start=width(trimmedseq), width=1)
> consensusMatrix(last, as.prob=TRUE)
> 
> -joe
> 
> 
> On 26 Aug 2010, at 9:26, Steve Lianoglou wrote:
> 
>> Howdy,
>>
>> On Thu, Aug 26, 2010 at 9:29 AM, Joern Toedling <Joern.Toedling at curie.fr> wrote:
>>> Hi,
>>>
>>> have a look at the "shift" argument of the function consensusMatrix from
>>> Biostrings.
>>>
>>> This code example should correspond to your question. Three nucleotide strings
>>> are aligned at their last position and the sequence composition is obtained:
>>>
>>> A <- DNAStringSet(c("ACAT", "CAT", "AGGGCGT"))
>>> maxlen <- max(nchar(A))
>>> consensusMatrix(A, shift=maxlen-nchar(A), baseOnly=TRUE)
>>
>> Alternatively:
>>
>> R> last <- subseq(A, start=width(A), width=1)
>> R> consensusMatrix(last)
>>
>> -steve
>>
>>> I tested this with Biostrings_2.17.29, but I guess that it works with the
>>> current release version, too.
>>>
>>> Regards,
>>> Joern
>>>
>>>
>>> On Thu, 26 Aug 2010 07:35:01 -0500, joseph franklin wrote
>>>> Hi,
>>>>
>>>> I've been trimming adapters from reads using trimLRPatterns.  The
>>>> resulting, trimmed set contains a heterogeneous mix of widths: from
>>>> ~18-35 nt.  Can anyone guide me toward an elegant way to find the
>>>> nucleotide composition of the final (right-most) cycle for each of
>>>> the trimmed reads?
>>>>
>>>> Many thanks,
>>>> Joe Franklin
>>>
>>> ---
>>> Joern Toedling
>>> Institut Curie -- U900
>>> 26 rue d'Ulm, 75005 Paris, FRANCE
>>> Tel. +33 (0)156246927
>>>
>>> _______________________________________________
>>> Bioc-sig-sequencing mailing list
>>> Bioc-sig-sequencing at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>>
>>
>>
>>
>> -- 
>> Steve Lianoglou
>> Graduate Student: Computational Systems Biology
>>  | Memorial Sloan-Kettering Cancer Center
>>  | Weill Medical College of Cornell University
>> Contact Info: http://cbio.mskcc.org/~lianos/contact
> 


-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


