[BioC] BioStrings questions

rcaloger raffaele.calogero at gmail.com
Fri Oct 17 13:51:27 CEST 2008


Dear Patrick,
many thanks for the quick answer. Unfortunately, a PSR can be located at 
any position in the refseq, and the sequence lengths can differ 
widely. Therefore, I cannot apply your suggestion.
Cheers
Raffaele

Patrick Aboyoun wrote:
> Raffaele,
> The pairwiseAlignment function uses an O(nm) dynamic programming 
> algorithm (n and m being the lengths of the two sequences being 
> aligned) that is designed to find an optimal alignment and, as you 
> have discovered, isn't intended for use with a long reference sequence. 
> Do your PSR sequences map nearly exactly to a location on your 
> reference sequence, and are these sequences of equal length? If so, see 
> the matchPDict function. It matches a pattern dictionary consisting of 
> equal-length fragments to a reference sequence. The pseudocode looks 
> something like:
>
> psrPDict <- PDict(PSRDNAStringSet)
> matchPDict(psrPDict, refseq)
>
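A minimal runnable sketch of the pseudocode above, assuming a current Biostrings installation. The reference sequence and the three equal-width patterns are made-up toy data, not the PSRs or refseq discussed in this thread, and the object names are illustrative only.

```r
library(Biostrings)

# Toy stand-ins (hypothetical data, not real PSR or refseq sequences).
refseq <- DNAString("ACGTTGCAACGTAGCTAGGCTTACGATCG")
psrs   <- DNAStringSet(c("TTGCAA", "GCTAGG", "AAAAAA"))

# Preprocess the patterns into a dictionary.
# In this form all patterns must have the same width (6 here).
psrPDict <- PDict(psrs)

# Exact matching of every pattern against the single reference sequence.
hits <- matchPDict(psrPDict, refseq)

# Number of matches found for each pattern, in dictionary order.
counts <- countPDict(psrPDict, refseq)
print(counts)

# Start positions of the matches for the first pattern.
print(start(hits[[1]]))
```

Because the dictionary is built from fixed-width patterns and the search is for exact matches, this only covers the case described above: equal-length fragments that map exactly to the reference.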
> To answer your second question, the append function should get you 
> what you want:
>
> > append(DNAStringSet(c("AAA", "GA")), DNAStringSet(c("ACTG", "TTTACCC")))
>  A DNAStringSet instance of length 4
>    width seq
> [1]     3 AAA
> [2]     2 GA
> [3]     4 ACTG
> [4]     7 TTTACCC
>
>
> Patrick
>
>
> rcaloger wrote:
>> Hi,
>> In my oneChannelGUI package I am developing a section related to 
>> Affymetrix exon array analysis, creating a few functions that allow the 
>> association of exon-level Probe Selection Regions (PSRs) with refseq.
>>
>> 1st question:
>> I have implemented a function that BLASTs a list of PSR sequences against 
>> all refseq.
>> However, I would like to know if there is any way of doing something 
>> similar using the Biostrings package.
>> I tried the pairwiseAlignment function, but it is quite slow compared 
>> to BLAST.
>>
>> 2nd question:
>> Is there any way of merging two DNAStringSets?
>>
>> Cheers
>> Raffaele
>>
>>
>>
>


-- 

----------------------------------------
Prof. Raffaele A. Calogero
Bioinformatics and Genomics Unit
Dipartimento di Scienze Cliniche e Biologiche
c/o Az. Ospedaliera S. Luigi
Regione Gonzole 10, Orbassano
10043 Torino
tel.   ++39 0116705417
Lab.   ++39 0116705408
Fax    ++39 0119038639
Mobile ++39 3333827080
email: raffaele.calogero at unito.it
       raffaele[dot]calogero[at]gmail[dot]com
www:   http://www.bioinformatica.unito.it
Info: http://publicationslist.org/raffaele.calogero


