[Bioc-devel] phred qualities

Martin Morgan mtmorgan at fhcrc.org
Wed Jun 27 20:26:25 CEST 2012


On 06/27/2012 11:22 AM, Martin Morgan wrote:
> On 06/27/2012 08:02 AM, Kasper Daniel Hansen wrote:
>> Phred qualities are usually presented as ASCII-encoded numbers with an
>> offset of either 33 or 64. Some packages return them as a
>> BStringSet. I can convert a character vector "charvec" to a list of
>> integers using code like
>> sapply(charvec, function(xx) as.integer(charToRaw(xx)) - 33L)
>>
>> Do we have fast(er) ways of doing this, when charvec is really long
>> and not necessarily with the same number of chars in each string? I
>> am thinking of implementing the sapply() above in C (directly
>> vectorizing it), but surely someone has done something like that
>> somewhere.
>
> I think you get this with XStringSet, e.g., PhredQuality, with
>
> x = PhredQuality(c("HH", "III"))
> y = as.numeric(unlist(x)) - 33L

   as.integer

> z = relist(y, x)

or for a simple list

   split(y, rep(seq_along(x), elementLengths(x)))
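
The same flatten-then-split idea can be sketched in base R for a plain character vector, without going through an XStringSet. This is only a rough sketch: the input vector, the variable names, and the Phred+33 offset are assumptions for illustration, and it assumes the strings are plain ASCII.

```r
# Hypothetical input: Phred+33 quality strings of unequal length.
charvec <- c("HH", "III", "!5?")

offset <- 33L  # assumption: Sanger-style Phred+33 encoding

# Number of characters per string (bytes, since the input is ASCII).
lens <- nchar(charvec, type = "bytes")

# Decode every character in a single pass over the concatenated string.
flat <- as.integer(charToRaw(paste(charvec, collapse = ""))) - offset

# Cut the flat vector back into one integer vector per input string.
# Using a factor with explicit levels keeps empty strings as integer(0).
grp <- factor(rep(seq_along(charvec), lens), levels = seq_along(charvec))
quals <- unname(split(flat, grp))
```

The conversion itself happens once on the whole input rather than once per string, so only the final split() depends on the number of strings.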

I have a recollection that there is something built-in...

Martin

>
> Martin
>
>>
>> Kasper
>
>


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


