[BioC] pairwiseAlignment generates different outcomes for the same input sequences

Martin Morgan mtmorgan at fhcrc.org
Mon Mar 14 17:00:29 CET 2011


On 03/13/2011 06:58 PM, Alogmail2 at aol.com wrote:
> Dear Mailing List,
>
> Why does pairwiseAlignment() give different results for the same input
> sequences when they are supplied as different classes (see showMethods below):
>
> for
>
> pattern="character", subject="character"

Hi Alex --

here 'character' could be anything (e.g., 'the quick brown fox'), whereas

>
> vs.
>
> pattern="DNAString", subject="DNAString"

here pattern and subject are drawn from a restricted alphabet, and some
symbols have a particular meaning (e.g., 'N' is not a nucleotide called
'N' but the ambiguity code for any nucleotide).
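
As a rough illustration of the distinction above (a sketch under stated assumptions, not code from this thread): a DNAString refuses symbols outside the DNA alphabet, and 'N' is treated as an IUPAC ambiguity code. The help page for pairwiseAlignment() describes a quality-based default when no substitutionMatrix is given, so one way to make runs on different classes comparable is to pass the scoring matrix explicitly. The toy sequences `p` and `s` and the match/mismatch values are hypothetical, and whether this fully explains the scores reported below is an assumption.

```r
# Sketch only. Assumes Biostrings is installed; in recent Bioconductor
# releases pairwiseAlignment() and nucleotideSubstitutionMatrix() are
# provided by the 'pwalign' package, so attach it when it is available.
library(Biostrings)
if (requireNamespace("pwalign", quietly = TRUE)) library(pwalign)

# A character string may hold any symbols; a DNAString may not.
free_text <- "the quick brown fox"
ok  <- tryCatch(DNAString("ACGTN"),   error = function(e) NULL)
bad <- tryCatch(DNAString(free_text), error = function(e) NULL)
is.null(ok)    # FALSE: "ACGTN" uses only letters of the DNA alphabet
is.null(bad)   # TRUE: letters such as 'q' and the spaces are rejected

# 'N' is an IUPAC ambiguity code standing for any of A, C, G or T.
IUPAC_CODE_MAP[["N"]]   # "ACGT"

# Hypothetical toy inputs (not the sequences from this thread).
p <- "ACGTTGCA"
s <- "TTACGTAGCATT"

# State the scoring explicitly so that it is the same for both calls.
mat <- nucleotideSubstitutionMatrix(match = 1, mismatch = -3, baseOnly = TRUE)

aln_dna  <- pairwiseAlignment(DNAString(p), DNAString(s),
                              substitutionMatrix = mat, type = "global-local")
aln_char <- pairwiseAlignment(p, s,
                              substitutionMatrix = mat, type = "global-local")

# Compare the two scores rather than relying on class-specific defaults.
score(aln_dna)
score(aln_char)
```

If the inputs are known to be DNA, converting both to DNAString before aligning also makes the method dispatch explicit instead of depending on how each argument happened to be created.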

Martin

>
> ?
>
> It generates the same outcomes for the case of
>
>
> pattern="character", subject="character"
>
> vs.
>
> pattern="character", subject="DNAString"
>
>
> It looks like a bug.
>
> Thanks
>
> Alex
>
>
>
>
>
>
> #showMethods("pairwiseAlignment")
> #Function: pairwiseAlignment (package Biostrings)
> #pattern="character",  subject="character"
> #pattern="character",  subject="DNAString"
> #    (inherited from: pattern="character",  subject="XString")
> #pattern="DNAString", subject="DNAString"
>
>> pattern.1
> 50-letter "DNAString" instance
> seq:  CTGCCATGGCAAAGCTCGCTGCCTCAGAGGCCGCCACAATGGTTGCGCAC
>> pattern.2
> [1]  "CTGCCATGGCAAAGCTCGCTGCCTCAGAGGCCGCCACAATGGTTGCGCAC"
>> subject.1
> 543-letter "DNAString" instance
> seq: AAAAAAAAAAAAAAAAAAAAATGAAATCCGAACTTCTTGGAGCCTCGTCTGAAGGCCATCTCGGCTCTTCAGATT...CAGTCAACAAGTTCCAACGGGACTAGATTACGGGGCGTATACGCCGTACGGCCAGGCCAGTAGTTACGCGTCGT
>> subject.2
> [1]
> "AAAAAAAAAAAAAAAAAAAAATGAAATCCGAACTTCTTGGAGCCTCGTCTGAAGGCCATCTCGGCTCTTCAGATTCGCCCTTCGTCAGCGACGCTCTGGCTGCCGTCACCGGTGACTACCAATCGGCCTACGCTGCTTCCTATTAC
> AGCAGCGCGATGCAGGCCTACAATAGTCAATCGACGTCGGCCTACATGCCAAGCAGTGGATTCTATAATGGCGCAT
> CTTCGCAGACGCCCTACGGAGTCCTGGCGCCCTCCACTTACACAACGATGGGCGTTCCCAGTACAAGAGGTTTAGG
> CCAACAATGTAAAAATGGACAATCATTAGCACAAACGCCTCCGTATTTGAGCTCGTACGGGTCGGCATTCGGTGGT
> GTCACAGCCAGCAGTTCGCCTTCGGGTCCACCCGCCTACGCGTCCGCTTATGGATCGGCATACAATAGCGCCACCG
> CCGCCCAATCGTTCACCAACAGTCAACAAGTTCCAACGGGACTAGATTACGGGGCGTATACGCCGTACGGCCAGGC
> CAGTAGTTACGCGTCGT"
>
>> pairwiseAlignment(pattern=pattern.1,subject=subject.1,type="global-local")
> Global-Local  PairwiseAlignedFixedSubject (1 of 1)
> pattern:   [1]  CTGCCATGGCAAAGCTCGCTGCCTCAGAGGCCGCCACAATGGTTGCGCAC
> subject: [429]  TCGGCATACAATAGC--GCCACC------GCCGCC-CAATCGTT---CAC
> score: -91.50367
>> pairwiseAlignment(pattern=pattern.2,subject=subject.2,type="global-local")
> Global-Local  PairwiseAlignedFixedSubject (1 of 1)
> pattern:  [1]  CTGC--CATGGCAAAGCTC--GCTGCC-TCAGAGGCCGCCAC-AATGGTTGCGCAC
> subject: [80]  CTTCGTCA--GCGACGCTCTGGCTGCCGTCACCGGTGACTACCAATCG--GCCTAC
> score: 95.69296
>> pairwiseAlignment(pattern=pattern.2,subject=subject.1,type="global-local")
> Global-Local  PairwiseAlignedFixedSubject (1 of 1)
> pattern:   [1]  CTGCCATGGCAAAGCTCGCTGCCTCAGAGGCCGCCACAATGGTTGCGCAC
> subject: [429]  TCGGCATACAATAGC--GCCACC------GCCGCC-CAATCGTT---CAC
> score: -91.50367
>
>> sessionInfo()
> R version 2.12.2 (2011-02-25)
> Platform:  i386-pc-mingw32/i386 (32-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C                           LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats     graphics   grDevices utils     datasets  methods    base
>
> other attached packages:
> [1] altcdfenvs_2.12.0 hypergraph_1.22.0 graph_1.28.0      makecdfenv_1.28.0 affy_1.28.0       Biobase_2.10.0    GeneR_2.20.0      seqinr_3.0-1
> [9] Biostrings_2.18.4 IRanges_1.8.9     limma_3.6.9
>
> loaded via a namespace (and not attached):
> [1]  affyio_1.18.0          preprocessCore_1.12.0  tools_2.12.2
>
>
>
>
> ###############################
>
> Nutritional Sciences and Toxicology,
> 119 Morgan Hall
> UC.Berkeley,CA 94720
>
>


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793


