[BioC] problems with strand in predictCoding

Jeremiah Degenhardt degenhardt.jeremiah at gene.com
Fri Apr 20 21:32:49 CEST 2012


Hi Steve,

Out of curiosity, could you provide an example of an instance when you
prefer the function to only return hits on the same strand? I have
tried hard to come up with an example but can't think of one. It's
probably due more to my background though...

best,

Jeremiah

On Fri, Apr 20, 2012 at 10:07 AM, Steve Lianoglou
<mailinglist.honeypot at gmail.com> wrote:
> Seems like people are piling up on the "ignore.strand=FALSE is a bad
> idea" bandwagon .. for what it's worth, I think the default to "honor
> the strand" in overlap queries is a sensible one, to me.
>
> -steve
>
> On Fri, Apr 20, 2012 at 11:58 AM, Jeremiah Degenhardt
> <degenhardt.jeremiah at gene.com> wrote:
>>>
>>>
>>> There is an "ignore.strand" argument to findOverlaps, so we have a switch. I
>>> have always thought that strand should be ignored by default in operations
>>> like overlap detection, and only considered as a "direction" rather than as
>>> separate in space. It's very useful for resize() and flank() to consider
>>> strand, but not so useful for findOverlaps. The ignore.strand=FALSE default
>>> in those cases would qualify for the eighth circle if there were a Bioc
>>> Inferno book. It's only the default that I argue with though; having the
>>> capability to consider strand is useful.
>>
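To make the two behaviours concrete, here is a minimal sketch in plain
Python. It is not the GenomicRanges implementation: the Range type and the
overlaps() helper are hypothetical names used only for illustration, and
the rule that "*" is compatible with either strand is my assumption about
how an unstranded range should behave.

```python
from typing import NamedTuple


class Range(NamedTuple):
    """Hypothetical stand-in for one genomic range (1-based, inclusive)."""
    seqname: str
    start: int
    end: int
    strand: str  # "+", "-", or "*" for unstranded


def strands_compatible(a: str, b: str) -> bool:
    # Assumption: an unstranded range ("*") is compatible with either strand.
    return a == "*" or b == "*" or a == b


def overlaps(query: Range, subject: Range, ignore_strand: bool = False) -> bool:
    """True if the two ranges share at least one position.

    With ignore_strand=False the strands must also be compatible;
    with ignore_strand=True only the coordinates are compared.
    """
    if query.seqname != subject.seqname:
        return False
    if query.end < subject.start or subject.end < query.start:
        return False
    return ignore_strand or strands_compatible(query.strand, subject.strand)


# A variant recorded on "+" that falls inside a gene on "-".
variant = Range("chr1", 150, 150, "+")
gene_minus = Range("chr1", 100, 200, "-")

same_strand_hit = overlaps(variant, gene_minus)
any_strand_hit = overlaps(variant, gene_minus, ignore_strand=True)
```

The strand-honoring call misses the gene even though the variant lies
inside it; the strand-ignoring call finds it.
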
>> I had forgotten about the ignore.strand option, thanks for the
>> reminder Michael. So, given that it's there I agree with you fully. It
>> seems the default should be changed to TRUE for the Overlap functions
>> and for precede() and follow() as well.
>>
>> Note however, that this would not fully correct the issue in the
>> predictCoding function as the function still needs to correctly
>> reverse complement the varAllele to get the annotation correct.
>>
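The reverse-complement step can be sketched as follows, again in plain
Python rather than with the actual predictCoding code. The function names
are hypothetical, and the sketch assumes the variant allele is reported on
the reference "+" strand, so it has to be flipped before being compared
with the coding sequence of a gene on "-".

```python
# Translation table for DNA bases; N stays N, case is preserved.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def allele_on_transcript_strand(var_allele: str, gene_strand: str) -> str:
    """Express a "+"-strand allele on the strand of the overlapping gene.

    Assumption: var_allele is given relative to the "+" strand.
    """
    if gene_strand == "-":
        return reverse_complement(var_allele)
    return var_allele


plus_gene_allele = allele_on_transcript_strand("AG", "+")
minus_gene_allele = allele_on_transcript_strand("AG", "-")
```

Skipping the flip for a gene on "-" means the wrong bases go into the
codon, which is how the annotations end up incorrect.
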
>> As a further note on how big of an issue this is, if you go to the
>> BioC home page and look at the tutorial on "Using Bioconductor to
>> annotate genetic variants" you will find that the example makes this
>> exact mistake. The variants in the VCF are unstranded, and two of the
>> genes in the example are on the negative strand and one is on the positive.
>> Following the code you will get incorrect annotations for all variants
>> on the negative strand genes.
>>
>> Jeremiah
>>
>>
>>
>> --
>> Jeremiah Degenhardt, Ph.D.
>> Computational Biologist
>> Bioinformatics and Computational Biology
>> Genentech, Inc.
>> degenhardt.jeremiah at gene.com
>>
>
>
>
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
>  | Memorial Sloan-Kettering Cancer Center
>  | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact



-- 
Jeremiah Degenhardt, Ph.D.
Computational Biologist
Bioinformatics and Computational Biology
Genentech, Inc.
degenhardt.jeremiah at gene.com


