[Bioc-sig-seq] Biostrings: problem to access indel-details from pairwiseAlignment()

Patrick Aboyoun paboyoun at fhcrc.org
Tue Jul 21 18:16:55 CEST 2009


Wolfgang,
Below is code that retrieves the indel locations you are looking for. I 
like your attempts at using indel, insertion, and deletion on 
PairwiseAlignment objects, and I'll add those methods to BioC 2.5 (devel) 
shortly, using the conventions shown below: gaps in the aligned pattern 
are reported as deletions and gaps in the aligned subject as insertions.

 > suppressMessages(library(Biostrings))
 > ref1 <- DNAString("GGGATACTTCACCAGCTCCCTGGC") # my pattern
 > samp1 <- DNAStringSet(c("GGGATACTACACCAGCTCCCTGGC",
 +                         "GGGATACTTACACCAGCTCCCTGGC",
 +                         "ATACTTCACCAGCTCCCTG"))
 > # 1st has a mutation, 2nd has an insertion, the 3rd is simply shorter ...
 >
 > align <- pairwiseAlignment(samp1,ref1)
 >
 > nindel(align)
An object of class “InDel”
Slot "insertion":
     Length WidthSum
[1,]      0        0
[2,]      1        1
[3,]      0        0

Slot "deletion":
     Length WidthSum
[1,]      0        0
[2,]      0        0
[3,]      0        0

 > deletions <- indel(pattern(align))
 > deletions
CompressedIRangesList: 3 elements
 > insertions <- indel(subject(align))
 > insertions
CompressedIRangesList: 3 elements
 > insertions[[2]]
IRanges instance:
    start end width
[1]    10  10     1
 > sessionInfo()
R version 2.10.0 Under development (unstable) (2009-06-28 r48863)
i386-apple-darwin9.7.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] Biostrings_2.13.26 IRanges_1.3.41

loaded via a namespace (and not attached):
[1] Biobase_2.5.4
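
For a table along the lines of mismatchTable() but for indels, the accessors shown above can be combined by hand. The following is only a sketch under stated assumptions: indelTable() is a hypothetical helper (not a Biostrings function), the column names are arbitrary, and it assumes that indel() on the aligned pattern and subject returns a list of IRanges as in the session above. In recent Bioconductor releases the alignment functions may live in the pwalign package instead of Biostrings.

```r
suppressMessages(library(Biostrings))
# Assumption: pairwiseAlignment(), pattern(), subject() and indel() are
# available from Biostrings, as in the session above. If your installation
# provides them in the pwalign package, load that package as well.

ref1  <- DNAString("GGGATACTTCACCAGCTCCCTGGC")
samp1 <- DNAStringSet(c("GGGATACTACACCAGCTCCCTGGC",
                        "GGGATACTTACACCAGCTCCCTGGC",
                        "ATACTTCACCAGCTCCCTG"))
align <- pairwiseAlignment(samp1, ref1)

# Hypothetical helper: flatten a list of IRanges (one element per aligned
# sequence) into a data.frame with one row per indel.
indelTable <- function(rangesList, type) {
  rows <- lapply(seq_along(rangesList), function(i) {
    r <- rangesList[[i]]
    data.frame(PatternId = rep(i, length(r)),
               Type      = rep(type, length(r)),
               Start     = start(r),
               End       = end(r),
               Width     = width(r))
  })
  do.call(rbind, rows)
}

# Same convention as above: gaps in the aligned pattern are deletions,
# gaps in the aligned subject are insertions.
indels <- rbind(indelTable(indel(subject(align)), "insertion"),
                indelTable(indel(pattern(align)), "deletion"))
print(indels)
```

Each row names the sequence (PatternId) and the range of the gap. In the session above the only non-empty entry is the insertion in the second sequence, so the table would be expected to hold the same range as insertions[[2]].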


Wolfgang Raffelsberger wrote:
> Dear list,
>
> Previously I extracted indel information directly from sequences 
> aligned by the Biostrings function pairwiseAlignment(). That was 
> probably not the best approach: the class 
> 'PairwiseAlignedFixedSubject' has evolved and my old code no longer 
> works. I am now trying to use the functions provided by the package 
> to access the details of the indels (i.e. their location on the 
> pattern and, if possible, the indel sequence). However, I can't find 
> a function that extracts this information, which (to the best of my 
> knowledge) is part of the alignment object.
>
> ## here an example :
> library(Biostrings)
> ref1 <- DNAString("GGGATACTTCACCAGCTCCCTGGC") # my pattern
> samp1 <- DNAStringSet(c("GGGATACTACACCAGCTCCCTGGC",
>                         "GGGATACTTACACCAGCTCCCTGGC",
>                         "ATACTTCACCAGCTCCCTG"))
>
> # 1st has a mutation, 2nd has an insertion, the 3rd is simply shorter ...
>
> align <- pairwiseAlignment(samp1,ref1)
>
> nindel(align) # the insertion is found, but I can't see at which nt
>               # position the indel occurs (nor whether it is an
>               # insertion or a deletion)
> indel(align) # Error in function (classes, fdef, mtable) unable to
>              # find an inherited method for function...
> insertion(align) # Error in function (classes, fdef, mtable) unable to
>                  # find an inherited method for function ...
> deletion(align) # neither ...
> ?AlignedXStringSet # says under 'Accessor methods' that indel() exists ..
>
> ## ideally I'd be looking for something like
> mismatchTable(align) # but addressing indels ...
>
>
> ## for completeness :
> > sessionInfo()
> R version 2.9.1 (2009-06-26)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252 
>
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
> other attached packages:
> [1] ShortRead_1.2.1 lattice_0.17-25 BSgenome_1.12.3 Biostrings_2.12.7 
> IRanges_1.2.3
> loaded via a namespace (and not attached):
> [1] Biobase_2.4.1 grid_2.9.1 hwriter_1.1
>
> Thanks in advance,
> Wolfgang Raffelsberger
>
> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
> Wolfgang Raffelsberger, PhD
> Laboratoire de BioInformatique et Génomique Intégratives
> CNRS UMR7104, IGBMC, 1 rue Laurent Fries, 67404 Illkirch Strasbourg, 
> France
> Tel (+33) 388 65 3300 Fax (+33) 388 65 3276
> wolfgang.raffelsberger (at) igbmc.fr
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> Bioc-sig-sequencing at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
