[Bioc-devel] phred qualities

Martin Morgan mtmorgan at fhcrc.org
Wed Jun 27 20:22:15 CEST 2012


On 06/27/2012 08:02 AM, Kasper Daniel Hansen wrote:
> Phred qualities are usually presented as ASCII-encoded numbers with an
> offset of either 33 or 64.  Some packages return these as a
> BStringSet.  I can convert a character vector "charvec" to a list of
> integers using code like
>    lapply(charvec, function(xx) as.integer(charToRaw(xx)) - 33L)
>
> Do we have fast(er) ways of doing this, when charvec is really long
> and not necessarily with the same number of chars in each string?  I
> am thinking of implementing the sapply() above in C (directly
> vectorizing it), but surely someone has done something like that
> somewhere.

I think you get this with XStringSet, e.g., PhredQuality, with

   x = PhredQuality(c("HH", "III"))
   y = as.numeric(unlist(x)) - 33L
   z = relist(y, x)
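
For comparison, the same flatten / subtract / re-split idea can be sketched in
base R, without Biostrings. This is only an illustration: phred_to_int is a
hypothetical helper name (not a function from any package), and it assumes
single-byte ASCII quality strings with a default offset of 33.

```r
# Hypothetical helper (not part of Biostrings): decode Phred quality strings
# with one charToRaw() call for the whole vector instead of one per string.
# Assumes single-byte ASCII input; the offset defaults to 33.
phred_to_int <- function(charvec, offset = 33L) {
    # Number of bytes in each string, used to cut the flat vector back up.
    widths <- nchar(charvec, type = "bytes")
    # Concatenate once, decode once, subtract the offset once.
    scores <- as.integer(charToRaw(paste(charvec, collapse = ""))) - offset
    # Group index for every score; explicit levels keep empty strings
    # as zero-length elements in the result.
    grp <- factor(rep.int(seq_along(charvec), widths),
                  levels = seq_along(charvec))
    unname(split(scores, grp))
}

res <- phred_to_int(c("HH", "III"))
```

Here res is a list with one integer vector per input string, so strings of
different lengths are handled without padding.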

Martin

>
> Kasper
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


