[Bioc-sig-seq] Bioc short read directions

Martin Morgan mtmorgan at fhcrc.org
Wed Apr 2 21:44:29 CEST 2008

Herve Pages wrote:
> Hi Harris,
>> Nobody seems to have mentioned it, but what about a "both strand" 
>> mode?  If RC is reverse-complement,
>> this feature would basically automate the first statement here:
>>     Dict = PDict(c(patterns, RC(patterns)))
>>     matchPDict(Dict, Seq)
>> The user would just pass 'patterns' and say whether he wants forward
>> and reverse matches distinguished.  The result would be of length
>> 2*length(patterns), as it is now, if they should be distinguished,
>> but of length length(patterns) if they can be combined.
> Instead of putting this at the PDict() level, I would rather build some
> higher level function _on top_ of PDict() that would handle this.
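
A rough sketch of what such a higher-level wrapper could look like, written in plain Python so that it stands alone. This is not the Biostrings API: `find_all` is a hypothetical stand-in for PDict()/matchPDict() that does exact matching only, and `match_both_strands` and its `distinguish` argument are names made up for illustration.

```python
# Plain-Python stand-in for the idea; none of these names come from Biostrings.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_all(pattern, subject):
    """0-based starts of all exact matches, overlapping ones included."""
    starts = []
    i = subject.find(pattern)
    while i != -1:
        starts.append(i)
        i = subject.find(pattern, i + 1)
    return starts


def match_both_strands(patterns, subject, distinguish=True):
    """Match each pattern and its reverse complement against the subject.

    distinguish=True  -> 2 * len(patterns) entries: forward hits, then reverse hits
    distinguish=False -> len(patterns) entries: hits of both strands merged
    All positions are start coordinates on the forward strand of the subject.
    """
    forward = [find_all(p, subject) for p in patterns]
    reverse = [find_all(reverse_complement(p), subject) for p in patterns]
    if distinguish:
        return forward + reverse
    # A set union avoids reporting a palindromic pattern twice at one position.
    return [sorted(set(f) | set(r)) for f, r in zip(forward, reverse)]
```

The wrapper leaves the dictionary-building step untouched and only decides how to lay out the result, which is the reason for keeping it on top of PDict() rather than inside it.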

Also, for Solexa-style data and approximate matching, you wouldn't want 
to reverseComplement the reads, because the start and end of the reads 
are not equally trustworthy, and reverse-complementing a read swaps its 
ends. Reverse-complementing the subject instead is one approach (though 
quite expensive if the subject is human chromosome 1, for instance).
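
To make the asymmetry concrete, here is a small sketch in plain Python (not Biostrings code). It assumes, purely for illustration, that the first `prefix_len` bases of a read are the trusted ones and must match exactly, while up to `max_tail_mismatches` are tolerated in the rest. The function names and this particular mismatch model are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def match_trusted_prefix(read, subject, prefix_len, max_tail_mismatches):
    """Starts where the read's prefix matches exactly and the tail nearly does."""
    hits = []
    for start in range(len(subject) - len(read) + 1):
        window = subject[start:start + len(read)]
        if window[:prefix_len] != read[:prefix_len]:
            continue  # the trusted prefix must match exactly
        mismatches = sum(a != b for a, b in zip(read[prefix_len:], window[prefix_len:]))
        if mismatches <= max_tail_mismatches:
            hits.append(start)
    return hits


def match_minus_strand(read, subject, prefix_len, max_tail_mismatches):
    """Find minus-strand hits by reverse-complementing the subject, not the read."""
    rc_subject = reverse_complement(subject)  # a full copy of the subject: the costly part
    hits = match_trusted_prefix(read, rc_subject, prefix_len, max_tail_mismatches)
    # Translate each hit back to a start coordinate on the forward strand.
    return sorted(len(subject) - (h + len(read)) for h in hits)


read = "GGTTCA"          # trusted prefix "GGTT", less trusted tail "CA"
subject = "TTCGAACCTT"   # minus strand carries "GGTTCG": one mismatch, in the tail

hits_rc_subject = match_minus_strand(read, subject, 4, 1)
hits_rc_read = match_trusted_prefix(reverse_complement(read), subject, 4, 1)
```

Here `hits_rc_subject` finds the hit, whereas `hits_rc_read` comes back empty: once the read is reverse-complemented, its uncertain tail sits where the matcher expects the trusted prefix, so the single mismatch is no longer tolerated.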


