[BioC] BioStrings questions
Patrick Aboyoun
paboyoun at fhcrc.org
Fri Oct 17 12:33:38 CEST 2008
Raffaele,
The pairwiseAlignment function uses an O(nm) dynamic programming
algorithm (n and m being the lengths of the two sequences being aligned)
that is designed to find an optimal alignment and, as you have
discovered, isn't intended for use with a long reference sequence. Do
your PSR sequences map nearly exactly to a location on your reference
sequence, and are these sequences of equal length? If so, see the
matchPDict function. It matches a pattern dictionary consisting of
equal-length fragments to a reference sequence. The pseudo code looks
something like:
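library(Biostrings)
# PSRDNAStringSet: placeholder for your DNAStringSet of equal-length PSR sequences
psrPDict <- PDict(PSRDNAStringSet)
matchPDict(psrPDict, refseq)  # refseq: placeholder for one reference DNAString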
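To make the two calls above concrete, here is a minimal sketch on toy data. The three 8-mer fragments and the short reference sequence are made up for illustration (they are not real PSR or refseq data), and the variable names psrs, counts and hits are arbitrary. It only shows exact matching of equal-length fragments against a single reference sequence.

```r
library(Biostrings)

# Hypothetical toy data: three equal-width (8-mer) fragments
psrs <- DNAStringSet(c("ACGTACGG", "CCGGTTAA", "GGGGGGGG"))

# Hypothetical short reference containing the first fragment twice,
# the second fragment once, and the third fragment not at all
refseq <- DNAString("TTACGTACGGAACCGGTTAAGGACGTACGGA")

# Preprocess the fragments once into a pattern dictionary
psrPDict <- PDict(psrs)

# Number of exact hits of each fragment on the reference
counts <- countPDict(psrPDict, refseq)

# Locations of the hits; hits[[i]] holds the ranges for the i-th fragment
hits <- matchPDict(psrPDict, refseq)
start(hits[[1]])
```

The dictionary is built once and can then be reused, so for several reference sequences one would presumably repeat the matchPDict call for each of them.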
To answer your second question, the append function should get you what
you want:
> append(DNAStringSet(c("AAA", "GA")), DNAStringSet(c("ACTG", "TTTACCC")))
  A DNAStringSet instance of length 4
    width seq
[1]     3 AAA
[2]     2 GA
[3]     4 ACTG
[4]     7 TTTACCC
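A small sketch for checking the merged object programmatically, using the same four toy sequences as in the session above. The use of c() as an alternative to append is an assumption about your installed Biostrings version, so the last line compares the two results rather than taking the equivalence for granted.

```r
library(Biostrings)

x <- DNAStringSet(c("AAA", "GA"))
y <- DNAStringSet(c("ACTG", "TTTACCC"))

# Same call as in the session above: y is appended after x
merged <- append(x, y)
length(merged)   # number of sequences in the merged set
width(merged)    # width of each sequence, in order

# Assumed alternative: c() concatenating the two sets in the same order
merged2 <- c(x, y)
identical(as.character(merged), as.character(merged2))
```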
Patrick
rcaloger wrote:
> Hi,
> In my onechannelGUI package I am developing a section related to
> Affymetrix exon array analysis, creating a few functions that allow the
> association of exon-level Probe Selection Regions (PSRs) to refseq.
>
> 1st question:
> I have implemented a function that blasts a list of PSR sequences
> against all refseq.
> However, I would like to know if there is any way of doing something
> similar using the Biostrings package.
> I tried the pairwiseAlignment function, but it is quite slow compared
> to blast.
>
> 2nd question:
> Is there any way of merging two DNAStringSets?
>
> Cheers
> Raffaele
>
>
>