[BioC] BioStrings questions
rcaloger
raffaele.calogero at gmail.com
Fri Oct 17 13:51:27 CEST 2008
Dear Patrick,
many thanks for the quick answer. Unfortunately PSR can be located in
any position of the refseq and the sequence length can be very
different. Therefore, I cannot apply your suggestion.
Cheers
Raffaele
Patrick Aboyoun wrote:
> Raffaele,
> The pairwiseAlignment function uses an O(nm) {with n and m being the
> length of the two sequences being aligned} dynamic programming
> algorithm that is designed to find an optimal alignment and as you
> have discovered isn't intended for use with a long reference sequence.
> Do your PSR sequences map nearly exactly to a location on your
> reference sequence and are these sequences of equal length? If so, see
> the matchPDict function. It matches a pattern dictionary consisting of
> equal length fragments to a reference sequence. The pseudo code looks
> something like:
>
> psrPDict <- PDict(PSRDNAStringSet)
> matchPDict(psrPDict, refseq)
>
> To answer your second question, the append function should get you
> what you want:
>
> > append(DNAStringSet(c("AAA", "GA")), DNAStringSet(c("ACTG",
> "TTTACCC")))
> A DNAStringSet instance of length 4
> width seq
> [1] 3 AAA
> [2] 2 GA
> [3] 4 ACTG
> [4] 7 TTTACCC
>
>
> Patrick
>
>
> rcaloger wrote:
>> Hi,
>> In my onechannelGUI package I am developing a section related to
>> Affymetrix exon array analysis, creating few functions that allow the
>> association of exon-level Probe Selection Region (PSR) to refseq
>>
>> 1st question:
>> I have implemented a function that blast a list of PSR sequences over
>> all refseq.
>> However, I would like to know if there is any way of doing something
>> similar using the Biostring package.
>> I tried the pairwiseAlignment function but it is quite slow compared
>> to blast.
>>
>> 2nd question:
>> there is any way of merging two DNAStringSets ?
>>
>> Cheers
>> Raffaele
>>
>>
>>
>
--
----------------------------------------
Prof. Raffaele A. Calogero
Bioinformatics and Genomics Unit
Dipartimento di Scienze Cliniche e Biologiche
c/o Az. Ospedaliera S. Luigi
Regione Gonzole 10, Orbassano
10043 Torino
tel. ++39 0116705417
Lab. ++39 0116705408
Fax ++39 0119038639
Mobile ++39 3333827080
email: raffaele.calogero at unito.it
raffaele[dot]calogero[at]gmail[dot]com
www: http://www.bioinformatica.unito.it
Info: http://publicationslist.org/raffaele.calogero
More information about the Bioconductor
mailing list