[BioC] pairwiseAlignment generates different outcomes for the same input sequences
Martin Morgan
mtmorgan at fhcrc.org
Mon Mar 14 17:00:29 CET 2011
On 03/13/2011 06:58 PM, Alogmail2 at aol.com wrote:
> Dear Mailing List,
>
> Why does pairwiseAlignment() generate different outcomes for the same input
> sequences defined differently in terms of classes (see showMethods below):
>
> for
>
> pattern="character", subject="character"
Hi Alex --
here 'character' could be any text at all (e.g., 'the quick brown fox'),
so no letter has a special meaning, whereas
>
> vs.
>
> pattern="DNAString", subject="DNAString"
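here pattern and subject are drawn from a restricted alphabet, and some
symbols have a particular meaning (e.g., 'N' does not mean a nucleotide
called 'N'; it is the IUPAC ambiguity code for 'any nucleotide'). The two
kinds of input are therefore not treated the same way when scored.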
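
To make the alphabet point concrete, here is a toy sketch in base R. It
is not how Biostrings scores alignments internally; the 'iupac' table (a
small subset of the IUPAC codes) and the two helper functions are made up
for illustration. It only shows that the same letter can compare
differently depending on whether it is read as plain text or as a DNA
symbol.

```r
# Illustration only: hypothetical helpers, not part of Biostrings.
# A small subset of the IUPAC nucleotide codes, mapping each symbol
# to the set of bases it can stand for.
iupac <- c(A = "A", C = "C", G = "G", T = "T",
           R = "AG", Y = "CT", N = "ACGT")

# Plain text: two letters match only if they are identical.
literal_match <- function(x, y) x == y

# DNA alphabet: two symbols are compatible if the sets of bases
# they stand for overlap.
iupac_match <- function(x, y) {
  xs <- strsplit(iupac[[x]], "")[[1]]
  ys <- strsplit(iupac[[y]], "")[[1]]
  length(intersect(xs, ys)) > 0
}

literal_match("N", "A")  # FALSE: 'N' is just another letter
iupac_match("N", "A")    # TRUE: 'N' can stand for 'A'
```

If you want the same answer regardless of how the sequences were created,
one option is to convert both with DNAString() before calling
pairwiseAlignment(), so that the same method is dispatched each time.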
Martin
>
> ?
>
> It generates the same outcomes for the case of
>
>
> pattern="DNAString", subject="DNAString"
>
> vs.
>
> pattern="character", subject="DNAString"
>
>
> It looks like a bug.
>
> Thanks
>
> Alex
>
> #showMethods("pairwiseAlignment")
> #Function: pairwiseAlignment (package Biostrings)
> #pattern="character", subject="character"
> #pattern="character", subject="DNAString"
> # (inherited from: pattern="character", subject="XString")
> #pattern="DNAString", subject="DNAString"
>
>> pattern.1
> 50-letter "DNAString" instance
> seq: CTGCCATGGCAAAGCTCGCTGCCTCAGAGGCCGCCACAATGGTTGCGCAC
>> pattern.2
> [1] "CTGCCATGGCAAAGCTCGCTGCCTCAGAGGCCGCCACAATGGTTGCGCAC"
>> subject.1
> 543-letter "DNAString" instance
> seq: AAAAAAAAAAAAAAAAAAAAATGAAATCCGAACTTCTTGGAGCCTCGTCTGAAGGCCATCTCGGCTCTTCAGATT...CAGTCAACAAGTTCCAACGGGACTAGATTACGGGGCGTATACGCCGTACGGCCAGGCCAGTAGTTACGCGTCGT
>> subject.2
> [1] "AAAAAAAAAAAAAAAAAAAAATGAAATCCGAACTTCTTGGAGCCTCGTCTGAAGGCCATCTCGGCTCTTCAGATTCGCCCTTCGTCAGCGACGCTCTGGCTGCCGTCACCGGTGACTACCAATCGGCCTACGCTGCTTCCTATTACAGCAGCGCGATGCAGGCCTACAATAGTCAATCGACGTCGGCCTACATGCCAAGCAGTGGATTCTATAATGGCGCATCTTCGCAGACGCCCTACGGAGTCCTGGCGCCCTCCACTTACACAACGATGGGCGTTCCCAGTACAAGAGGTTTAGGCCAACAATGTAAAAATGGACAATCATTAGCACAAACGCCTCCGTATTTGAGCTCGTACGGGTCGGCATTCGGTGGTGTCACAGCCAGCAGTTCGCCTTCGGGTCCACCCGCCTACGCGTCCGCTTATGGATCGGCATACAATAGCGCCACCGCCGCCCAATCGTTCACCAACAGTCAACAAGTTCCAACGGGACTAGATTACGGGGCGTATACGCCGTACGGCCAGGCCAGTAGTTACGCGTCGT"
>
>> pairwiseAlignment(pattern=pattern.1,subject=subject.1,type="global-local")
> Global-Local PairwiseAlignedFixedSubject (1 of 1)
> pattern: [1] CTGCCATGGCAAAGCTCGCTGCCTCAGAGGCCGCCACAATGGTTGCGCAC
> subject: [429] TCGGCATACAATAGC--GCCACC------GCCGCC-CAATCGTT---CAC
> score: -91.50367
>> pairwiseAlignment(pattern=pattern.2,subject=subject.2,type="global-local")
> Global-Local PairwiseAlignedFixedSubject (1 of 1)
> pattern: [1] CTGC--CATGGCAAAGCTC--GCTGCC-TCAGAGGCCGCCAC-AATGGTTGCGCAC
> subject: [80] CTTCGTCA--GCGACGCTCTGGCTGCCGTCACCGGTGACTACCAATCG--GCCTAC
> score: 95.69296
>> pairwiseAlignment(pattern=pattern.2,subject=subject.1,type="global-local")
> Global-Local PairwiseAlignedFixedSubject (1 of 1)
> pattern: [1] CTGCCATGGCAAAGCTCGCTGCCTCAGAGGCCGCCACAATGGTTGCGCAC
> subject: [429] TCGGCATACAATAGC--GCCACC------GCCGCC-CAATCGTT---CAC
> score: -91.50367
>
>> sessionInfo()
> R version 2.12.2 (2011-02-25)
> Platform: i386-pc-mingw32/i386 (32-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] altcdfenvs_2.12.0 hypergraph_1.22.0 graph_1.28.0 makecdfenv_1.28.0 affy_1.28.0 Biobase_2.10.0 GeneR_2.20.0 seqinr_3.0-1
> [9] Biostrings_2.18.4 IRanges_1.8.9 limma_3.6.9
>
> loaded via a namespace (and not attached):
> [1] affyio_1.18.0 preprocessCore_1.12.0 tools_2.12.2
>
>
> ###############################
>
> Nutritional Sciences and Toxicology,
> 119 Morgan Hall
> UC.Berkeley,CA 94720
>
--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: M1-B861
Telephone: 206 667-2793