[Bioc-devel] readGAlignmentPairs with discordant strand

Michael Lawrence lawrence.michael at gene.com
Fri Oct 16 18:58:36 CEST 2015


Sure, "*" makes more sense for strand, given the precedent.

On Fri, Oct 16, 2015 at 9:55 AM, Hervé Pagès <hpages at fredhutch.org> wrote:
> On 10/16/2015 09:28 AM, Michael Lawrence wrote:
>>
>> I kind of wish it would return NA for things like seqnames and strand,
>> but yes that would be very useful.
>
>
> Could do this for seqnames() but I'm hesitant to do this for strand().
> If you look at ?strand in BiocGenerics, ‘*’ is used when the exact
> strand of the location is unknown, or irrelevant, or when the "feature"
> at that location belongs to both strands. A pair with discordant strand
> belongs to both strands. Also there is a lot of code around that
> assumes strand() never returns NAs.
>
>
> H.
>
>>
>> On Fri, Oct 16, 2015 at 9:15 AM, Hervé Pagès <hpages at fredhutch.org> wrote:
>>>
>>> Hi Michael, Martin,
>>>
>>> On 10/16/2015 06:48 AM, Michael Lawrence wrote:
>>>>
>>>>
>>>> It does seem like starting with the more general data structure is the
>>>> better approach, but I couldn't find an easy way to move the paired
>>>> subset
>>>> of GAlignmentsList to GAlignmentPairs. You mention a coercion, but it's
>>>> not
>>>> obvious to me, unfortunately.
>>>>
>>>> Another approach would be a GAlignmentPairs where the unpaired reads
>>>> have
>>>> "missing" mates. I know GAlignments has no concept of missing, but it
>>>> would
>>>> get everything into a single data structure that is convenient for
>>>> computing on pairs.
>>>
>>>
>>>
>>> I could modify readGAlignmentPairs() to have the discordant and/or
>>> ambiguous pairs end up in th GAlignmentPairs. The ambiguous pairs
>>> could be marked as such thru a metadata col of the object or thru
>>> a proper slot. The seqnames() and strand() accessors will return
>>> * on discordant pairs. Does that sound reasonable?
>>>
>>> H.
>>>
>>>
>>>>
>>>> On Fri, Oct 16, 2015 at 5:21 AM, Morgan, Martin <
>>>> Martin.Morgan at roswellpark.org> wrote:
>>>>
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Bioc-devel [mailto:bioc-devel-bounces at r-project.org] On Behalf
>>>>>> Of
>>>>>> Michael Lawrence
>>>>>> Sent: Friday, October 16, 2015 7:41 AM
>>>>>> To: bioc-devel at r-project.org
>>>>>> Subject: [Bioc-devel] readGAlignmentPairs with discordant strand
>>>>>>
>>>>>> Now that GAlignmentPairs supports discordant strand between mates, how
>>>>>> hard would it be to relax that restriction on readGAlignmentPairs()?
>>>>>>
>>>>>> Also, would be nice if getDumpedAlignments() returned those dumped by
>>>>>> readGAlignmentPairs(). Right now, I'm reading a GAlignments (with the
>>>>>
>>>>>
>>>>> extra
>>>>>>
>>>>>>
>>>>>> mcols) and calling makeGAlignmentPairs(). Not so convenient.
>>>>>
>>>>>
>>>>>
>>>>> I'm not sure whether this is relevant to your use case but
>>>>> readGAlignmentsList returns a list of paired mates, and if appropriate
>>>>> (based on ScanBamParam) list elements with solo travelers. The paired
>>>>> portion of the list can be coerced to GAlignmentPairs if the additional
>>>>> structure of that class is required.
>>>>>
>>>>> Martin
>>>>>
>>>>>>
>>>>>> Michael
>>>>>>
>>>>>>         [[alternative HTML version deleted]]
>>>>>>
>>>>>> _______________________________________________
>>>>>> Bioc-devel at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> This email message may contain legally privileged and/or confidential
>>>>> information.  If you are not the intended recipient(s), or the employee
>>>>> or
>>>>> agent responsible for the delivery of this message to the intended
>>>>> recipient(s), you are hereby notified that any disclosure, copying,
>>>>> distribution, or use of this email message is prohibited.  If you have
>>>>> received this message in error, please notify the sender immediately by
>>>>> e-mail and delete this email message from your computer. Thank you.
>>>>>
>>>>
>>>>          [[alternative HTML version deleted]]
>>>>
>>>> _______________________________________________
>>>> Bioc-devel at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>>
>>>
>>> --
>>> Hervé Pagès
>>>
>>> Program in Computational Biology
>>> Division of Public Health Sciences
>>> Fred Hutchinson Cancer Research Center
>>> 1100 Fairview Ave. N, M1-B514
>>> P.O. Box 19024
>>> Seattle, WA 98109-1024
>>>
>>> E-mail: hpages at fredhutch.org
>>> Phone:  (206) 667-5791
>>> Fax:    (206) 667-1319
>
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpages at fredhutch.org
> Phone:  (206) 667-5791
> Fax:    (206) 667-1319



More information about the Bioc-devel mailing list