[Bioc-devel] Strand-Awareness for Restrict Function

Hervé Pagès hpages at fredhutch.org
Tue Feb 16 19:30:30 CET 2016


Hi Kasper,

On 02/16/2016 07:05 AM, Kasper Daniel Hansen wrote:
> upstream / downstream is what we have previously used for strand awareness.

These names are good for specifying *relative* positions in a
way that is strand-aware. In the case of restrict() though, where we
need to be able to specify *absolute* positions, I don't think that
works. How would you call restrict() with these arguments to perform
Dario's strand-specific trimming?
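
To make the distinction concrete, here is a minimal base-R sketch (no
Bioconductor objects) of the *absolute* bounds that Dario's trimming
implies. The toy data and the helper resolve_pos() are made up for
illustration and are not part of any package; resolve_pos() only mimics
the subseq()-style convention for negative positions suggested in the
quoted message below.

```r
# Hypothetical toy data: three ranges on one chromosome of length 1000.
seqlen  <- c(chr1 = 1000L)
seqname <- c("chr1", "chr1", "chr1")
strand  <- c("+", "-", "+")
start   <- c(10L, 20L, 940L)
end     <- c(100L, 120L, 990L)

# Resolve a possibly negative position against the sequence length:
# -1 is the last base, -51 is the 51st base counted from the end.
# (Illustrative helper, not an existing function.)
resolve_pos <- function(pos, len) ifelse(pos < 0L, len + pos + 1L, pos)

# Per-range chromosome length.
len <- unname(seqlen[seqname])

# Strand-specific window: drop bases 1-50 on "-", the last 50 otherwise.
win_start <- resolve_pos(ifelse(strand == "-", 51L, 1L), len)
win_end   <- resolve_pos(ifelse(strand == "-", -1L, -51L), len)

# Clip each range to its window. Ranges lying entirely outside their
# window would have to be dropped; that case is not handled here.
new_start <- pmax(start, win_start)
new_end   <- pmin(end, win_end)
```

The window bounds depend only on the strand and the chromosome length,
not on where each range sits, which is why relative offsets don't map
onto them directly.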

H.

>
> Kasper
>
> On Tue, Feb 16, 2016 at 3:41 AM, Hervé Pagès <hpages at fredhutch.org> wrote:
>
>     Hi Dario,
>
>     AFAIK 'start' and 'end' are strand-independent concepts, so it
>     wouldn't be a good idea to let the user specify a strand-specific
>     window through these arguments. That means a strand-aware restrict()
>     would need 2 additional arguments. But how should we name them?
>
>     My preference would be to support negative values for 'start' and 'end'
>     like we do for subseq(). When negative, the position is counted from
>     the end of the sequence (-1 being the last nucleotide). If we had this,
>     then you could do your strand-specific trimming with:
>
>        restrict(gr, start=ifelse(strand(gr) == "-", 51, 1),
>                     end=ifelse(strand(gr) == "-", -1, -51))
>
>     Note that using negative values is convenient but not strictly needed:
>
>        start <- ifelse(strand(gr) == "-", 51, 1)
>        end <- extractROWS(seqlengths(gr), seqnames(gr))
>        end <- ifelse(strand(gr) == "-", end, end - 50)
>        restrict(gr, start=start, end=end)
>
>     Cheers,
>     H.
>
>
>
>     On 02/14/2016 09:00 PM, Dario Strbenac wrote:
>
>         Hello,
>
>         The restrict function currently has no strand settings. This
>         would be useful if I am creating fixed size windows, 50 bases
>         wide, by sampling the start positions of the windows from a
>         GRanges object. I'd like to restrict the GRanges object being
>         sampled from, but only on the - strand when removing positions 1
>         to 50 and only on the positive strand when removing positions
>         within the last 50 bases of the chromosome. So, a strand-aware
>         version would be useful, to avoid sampling start positions too
>         close to the ends of chromosomes. I suppose that setdiff will be
>         a suitable alternative, if the first and last 50 bases of each
>         chromosome are calculated.
>
>         --------------------------------------
>         Dario Strbenac
>         PhD Student
>         University of Sydney
>         Camperdown NSW 2050
>         Australia
>
>         _______________________________________________
>         Bioc-devel at r-project.org mailing list
>         https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
>
>     --
>     Hervé Pagès
>
>     Program in Computational Biology
>     Division of Public Health Sciences
>     Fred Hutchinson Cancer Research Center
>     1100 Fairview Ave. N, M1-B514
>     P.O. Box 19024
>     Seattle, WA 98109-1024
>
>     E-mail: hpages at fredhutch.org
>     Phone: (206) 667-5791
>     Fax: (206) 667-1319
>
>
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


