[Bioc-devel] Build error in released version

Kasper Daniel Hansen kasperdanielhansen at gmail.com
Wed Jul 6 03:00:56 CEST 2016


This is my opinion.  I think the class should have a very general
structure.  Depending on the assay, users may be particularly interested in
the original case described by Herve (alignments on opposite strands, same
chromosome), but I dont think it should be enforced.

We have first(), last(), second() for getting the information in each pair.

I suggest that the functionality inherent in granges() instead by moved to
a function like
  getFragment()
What this function does, is really to get the actual DNA fragment that was
sequenced.  I think this function should have an argument like
standRequirement which defaults to "opposite" and which has other options
"same" and "ignored" and which returns a GRanges with the "outer"
coordinates exactly as now.  That accessor function would silently drop
alignment pairs which does not satisfy the criteria.

An alternative is that this could be set as options inside the
GAlignmentPairs object: what is considered a "valid" alignment.  Then this
could dictate the behaviour of say seqnames().

I strongly support this class having the possibility of alignments to
different chromosomes.  But I can also see the advantage in having default
methods drop alignments which does this, as this will be the majority
usecase.  Then people with special requirements can set a few arguments.

Kasper

On Tue, Jul 5, 2016 at 7:15 PM, Zhu, Lihua (Julie) <Julie.Zhu at umassmed.edu>
wrote:

> Thanks for the clarification , Hervé!
>
> That makes sense to me! I prefer option b and c.
>
> Best,
>
> Julie
>
> > On Jul 5, 2016, at 6:05 PM, Hervé Pagès <hpages at fredhutch.org> wrote:
> >
> > Hi Julie,
> >
> >> On 07/05/2016 01:03 PM, Zhu, Lihua (Julie) wrote:
> >> Dear Herve,
> >>
> >> Thank you so much for the detailed explanation and quick fix!
> >>
> >> When fusion occurs, the 2 alignments in a pair could be on different
> >> chromosomes. Not sure what is the best way to handle this situation. If
> >> set seqnames to *, then the mapping location is lost.
> >
> > The exact mapping location is stored in the GAligmentPairs object and
> > will be preserved when the user coerces to GRangesList, either with
> > grglist(.) or as(. , "GRangesList"). I was thinking in using * only
> > when coercing to GRanges, either with granges(.) or as(. , "GRanges").
> >
> >> Maybe set seqnames
> >> to one of the chromosome, and have a mcol seqname2? Indicating
> chromosome
> >> fusion?
> >
> > There is really no good way to represent features that span multiple
> > chromosomes with a GRanges object. These features are better represented
> > with a GRangesList object. That's really what GRangesList objects are
> > for.
> >
> > So after giving it a 2nd thought, I'm tempted to either (a) just let
> > coercion of GAligmentPairs to GRanges fail (with an informative error
> > message) when some pairs span more than 1 chromosome, or (b) drop
> > these pairs with a warning, or (c) add an extra argument to granges()
> > to let the user decide between (a) and (b).
> >
> > Hopefully that makes sense.
> >
> > H.
> >
> >>
> >> Best,
> >>
> >> Julie
> >>
> >>
> >>
> >>
> >>
> >>
> >>> On 7/4/16 12:28 AM, "Hervé Pagès" <hpages at fredhutch.org> wrote:
> >>>
> >>> Hi Julie,
> >>>
> >>> The GAlignmentPairs container didn't support discordant strand until
> >>> BioC 3.4 (current devel). In the current release (and in previous
> >>> versions of BioC) strand discordance was not supported. I recently
> >>> fixed a bug in the released version of GenomicAlignments where the
> >>> strand() getter was returning an incorrect strand for pairs with
> >>> discordant strand, instead of raising an error. This fix is what
> >>> breaks the GUIDEseq vignette which creates and manipulates a
> >>> GAlignmentPairs object with discordant strand.
> >>>
> >>> I just changed this again (in GenomicAlignments 1.8.4) so the
> >>> strand() getter now returns * instead of raising an error in case
> >>> of discordant strand (this is actually what strand() does in devel
> >>> where strand discordance is now fully supported). That seems to
> >>> fix GUIDEseq's vignette.
> >>>
> >>> I guess there was no solid reason for not supporting strand
> discordance.
> >>> It was just a matter of deciding how the various GAlignmentPairs
> getters
> >>> and extractors should handle this. Having strand() or granges() return
> >>> * is probably the natural thing to do.
> >>>
> >>> A more complicated situation, which is also much less common, is
> >>> chromosome discordance i.e. when the 2 alignments in a pair are on
> >>> different chromosomes. Not clear what seqnames() or granges() should
> >>> do in that case. Maybe return/set the seqname to * but that means
> >>> introducing * as a special seqlevel. Still need to think about the
> >>> implications of this.
> >>>
> >>> Cheers,
> >>> H.
> >>>
> >>>> On 07/01/2016 02:23 PM, Zhu, Lihua (Julie) wrote:
> >>>> Hi,
> >>>>
> >>>> I just noticed that the released version of GUIDEseq failed at build
> >>>> stage, which did not occur previously and I did not make any change to
> >>>> the release version.
> >>>>
> >>>> The error points to the GAlignmentPairs container. Is this an intended
> >>>> change and should I modify my code to accommodate the change? If yes,
> >>>> what is the rational to enforce the rule that
> >>>>
> >>>> GAlignmentPairs container supports pairs where the 2 alignments are on
> >>>> opposite strands of the same  chromosome?
> >>>>
> >>>> Thanks for your help!
> >>>>
> >>>> Error in .local(x, ...) :
> >>>>    For some pairs in 'x', the 2 alignments are not on opposite
> strands.
> >>>> Cannot
> >>>>    associate a strand to them. Note that the GAlignmentPairs container
> >>>> only
> >>>>    supports pairs where the 2 alignments are on opposite strands of
> the
> >>>> same
> >>>>    chromosome at the moment.
> >>>>
> >>>> Best regards,
> >>>>
> >>>> Julie
> >>>>
> >>>>    [[alternative HTML version deleted]]
> >>>>
> >>>> _______________________________________________
> >>>> Bioc-devel at r-project.org mailing list
> >>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
> >>>
> >>> --
> >>> Hervé Pagès
> >>>
> >>> Program in Computational Biology
> >>> Division of Public Health Sciences
> >>> Fred Hutchinson Cancer Research Center
> >>> 1100 Fairview Ave. N, M1-B514
> >>> P.O. Box 19024
> >>> Seattle, WA 98109-1024
> >>>
> >>> E-mail: hpages at fredhutch.org
> >>> Phone:  (206) 667-5791
> >>> Fax:    (206) 667-1319
> >
> > --
> > Hervé Pagès
> >
> > Program in Computational Biology
> > Division of Public Health Sciences
> > Fred Hutchinson Cancer Research Center
> > 1100 Fairview Ave. N, M1-B514
> > P.O. Box 19024
> > Seattle, WA 98109-1024
> >
> > E-mail: hpages at fredhutch.org
> > Phone:  (206) 667-5791
> > Fax:    (206) 667-1319
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>

	[[alternative HTML version deleted]]



More information about the Bioc-devel mailing list