[Bioc-devel] Best object structure for representing a pairwise genome alignment ?
Vincent Carey
stvjc at channing.harvard.edu
Fri Sep 18 11:41:47 CEST 2020
Starting from

PairwiseAlignments-class      package:Biostrings      R Documentation

PairwiseAlignments, PairwiseAlignmentsSingleSubject, and
PairwiseAlignmentsSingleSubjectSummary objects

Description:

     The ‘PairwiseAlignments’ class is a container for storing a set of
     pairwise alignments.

     The ‘PairwiseAlignmentsSingleSubject’ class is a container for
     storing a set of pairwise alignments with a single subject.

     The ‘PairwiseAlignmentsSingleSubjectSummary’ class is a container
     for storing the summary of a set of pairwise alignments.

Usage:

     ## Constructors:
     ## When subject is missing, pattern must be of length 2
     ## S4 method for signature 'XString,XString'
     PairwiseAlignments(pattern, subject,
       type = "global", substitutionMatrix = NULL, gapOpening = 0,
       gapExtension = 1)
     ## S4 method for signature 'XStringSet,missing'
     PairwiseAlignments(pattern, subject,
       type = "global", substitutionMatrix = NULL, gapOpening = 0,
       gapExtension = 1)
     ## S4 method for signature 'character,character'
     PairwiseAlignments(pattern, subject,
       type = "global", substitutionMatrix = NULL, gapOpening = 0,
       gapExtension = 1,
       baseClass = "BString")
     ...

my question is whether this is a relevant starting place. The focus there
is clearly not on coordinates, but perhaps a structure that maintains
genomic content and coordinates together would be of use?
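
To make the idea of keeping content and coordinates together concrete, here is a minimal sketch, not a worked-out design. It assumes Biostrings and GenomicRanges are installed; in recent Bioconductor releases pairwiseAlignment() is provided by the pwalign package, so the sketch attaches it when available. The sequences, the names "subjectSeq" and "patternSeq", and the metadata column names are hypothetical.

```r
library(Biostrings)
# Assumption: newer Bioconductor releases ship pairwiseAlignment() in pwalign.
if (requireNamespace("pwalign", quietly = TRUE)) library(pwalign)
library(GenomicRanges)

# Toy sequences (hypothetical, not taken from any genome)
pat  <- DNAString("ACGTACGT")
subj <- DNAString("TTACGTACGTTT")

# Local alignment with the default scoring scheme
aln <- pairwiseAlignment(pat, subj, type = "local")

# Coordinates of the aligned block on the subject sequence
block <- GRanges("subjectSeq",
                 IRanges(start(subject(aln)), end(subject(aln))))

# Attach the score, the matching pattern coordinates, and the
# subject sequence content of the block as metadata columns
mcols(block)$score   <- score(aln)
mcols(block)$query   <- GRanges("patternSeq",
                                IRanges(start(pattern(aln)), end(pattern(aln))))
mcols(block)$content <- as.character(subseq(subj, start(block), end(block)))

block
```

The alignment object holds the sequence content, while the GRanges derived from it holds the coordinates; whether the two should live in one class is exactly the open question.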
On Fri, Sep 18, 2020 at 2:49 AM Charles Plessy <charles.plessy using oist.jp>
wrote:
> Dear Bioc developers,
>
> I am currently analysing pairwise genome alignments with Bioconductor,
> and I represent them with a GRanges object of the first genome,
> containing one element per alignment block, and storing the coordinates
> in the other genome in a metadata column containing another GRanges object.
>
> Something like this.
>
> GRanges object with 36582 ranges and 2 metadata columns:
>           seqnames      ranges strand |     score                query
>              <Rle>   <IRanges>  <Rle> | <numeric>            <GRanges>
>       [1]       S1     162-550      + |       861    XSR:909374-909853
>       [2]       S1    833-3738      + |      7238    XSR:910181-913291
>       [3]       S1   3769-4212      + |      1165    XSR:913510-913953
>       [4]       S1   4246-4381      + |       359    XSR:914134-914275
>       [5]       S1   4532-5990      + |      2977 chr2:6694031-6695569
>       ...      ...         ...    ... .       ...                  ...
>   [36578]      S99 17228-17759      - |       793 chr1:2375870-2376379
>   [36579]      S99 16417-16935      - |       632 chr1:2376612-2377077
>   [36580]      S99 12370-12759      - |       773 chr1:2379949-2380343
>   [36581]      S99   5270-5384      - |       295   chr1:843397-843511
>   [36582]      S99   1949-3053      - |      2105   chr1:845358-846326
>   -------
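
For readers who want to reproduce the structure printed above, a minimal sketch follows, assuming the GenomicRanges package is installed. The variable names `target` and `query` are hypothetical, and only the first three blocks of the printout are reproduced.

```r
library(GenomicRanges)

# Alignment blocks on the first genome (first three rows of the printout)
target <- GRanges(
  seqnames = c("S1", "S1", "S1"),
  ranges   = IRanges(start = c(162, 833, 3769),
                     end   = c(550, 3738, 4212)),
  strand   = "+"
)

# Matching blocks on the other genome, parallel to `target`
query <- GRanges(
  seqnames = "XSR",
  ranges   = IRanges(start = c(909374, 910181, 913510),
                     end   = c(909853, 913291, 913953))
)

# Store the score and the paired coordinates as metadata columns
mcols(target)$score <- c(861, 7238, 1165)
mcols(target)$query <- query

# Subsetting the outer object keeps each block paired with its query range
hits <- target[target$score > 1000]
hits
```

Because the nested GRanges is an ordinary metadata column, subsetting and reordering the outer object carry the paired coordinates along, but range operations such as overlaps only see the first genome.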
>
> Using "Pairwise genome alignment" as a keyword in a search engine, I
> found that the CNEr package does something similar, although it
> uses a dedicated "GRangePairs" object for the purpose.
>
> Before I start to invest time in either direction, I wanted to check on
> this mailing list whether other solutions already exist, in
> particular closer to the core packages?
>
> Have a nice day,
>
> Charles
>
> --
> Charles Plessy - - ~ ~ ~ ~ ~ ~~~~ ~ ~ ~ ~ ~ - - charles.plessy using oist.jp
> Okinawa Institute of Science and Technology Graduate University
> Staff scientist in the Luscombe Unit - ~ - https://groups.oist.jp/grsu
> Toots from work - ~ ~~ ~ - https://mastodon.technology/@charles_plessy