[BioC] what is the best way to get scores for matches from matchPWM() ?
Lucas Carey
lucas.carey at gmail.com
Wed Jan 20 17:40:24 CET 2010
Hi All,
I'm wondering what the best way is to get the score for every match
from matchPWM() in Biostrings.
Right now, to score all matches to a PWM across a genome, I do this:
library(Biostrings)
library(seqinr)  # provides s2c(), c2s() and comp() used below

# Find PWM hits for the forward and reverse-complement PWM on every chromosome
mmf <- sapply(1:Nchr, function(chr) {
  matchPWM(pwm, genome[[chr]], min.score = cutoff)
})
mmr <- sapply(1:Nchr, function(chr) {
  matchPWM(reverseComplement(pwm), genome[[chr]], min.score = cutoff)
})
mmm <- c(mmf, mmr)

# Extract the matched sequences, reverse-complementing the reverse-strand hits
Sequences <- c(
  rapply(mmf, as.character, how = 'unlist'),
  sapply(rapply(mmr, as.character, how = 'unlist'),
         function(x) { c2s(rev(comp(s2c(x)))) })
)

# Convert to a DNAStringSet in order to score. This is quite slow
lcl_set <- DNAStringSet(as.character(Sequences))
Scores <- sapply(lcl_set, PWMscoreStartingAt, pwm = pwm)
This is incredibly inefficient. What is the best way to do this?
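My suspicion is that the string round-trip for the reverse-strand hits is avoidable: scoring the reverse-complemented PWM against the forward-strand window should give the same number as scoring the original PWM against the reverse-complemented window. Below is a minimal base-R sketch of that equivalence. The PWM values and the helper names (score_window, revcomp_pwm, revcomp_seq) are made up for illustration and are not Biostrings functions.

```r
# Toy PWM (hypothetical values): rows are bases, columns are motif positions.
pwm <- matrix(c(0.7, 0.1, 0.1, 0.1,
                0.1, 0.6, 0.2, 0.1,
                0.1, 0.1, 0.1, 0.7),
              nrow = 4,
              dimnames = list(c("A", "C", "G", "T"), NULL))

# Score of one window: sum of the PWM entry for the observed base at each position.
score_window <- function(pwm, window) {
  bases <- strsplit(window, "")[[1]]
  sum(pwm[cbind(match(bases, rownames(pwm)), seq_along(bases))])
}

# Reverse complement of a PWM: swap the A/T and C/G rows, reverse the column order.
revcomp_pwm <- function(pwm) {
  rc <- pwm[c("T", "G", "C", "A"), ncol(pwm):1, drop = FALSE]
  rownames(rc) <- c("A", "C", "G", "T")
  rc
}

# Reverse complement of a plain character sequence.
revcomp_seq <- function(s) {
  paste(rev(strsplit(chartr("ACGT", "TGCA", s), "")[[1]]), collapse = "")
}

window <- "AGT"  # forward-strand window where the reverse-strand motif matched
s1 <- score_window(revcomp_pwm(pwm), window)  # scored in place on the forward strand
s2 <- score_window(pwm, revcomp_seq(window))  # scored after reverse-complementing
```

If s1 and s2 agree in general, the reverse-strand hits could presumably be scored directly on each chromosome at their start positions, without building a new DNAStringSet, but I have not confirmed this against Biostrings itself.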
Thanks,
-Lucas