[Bioc-devel] Strand-Awareness for Restrict Function

Hervé Pagès hpages at fredhutch.org
Tue Feb 16 09:41:51 CET 2016


Hi Dario,

AFAIK 'start' and 'end' are strand-independent concepts, so it
wouldn't be a good idea to let the user specify a strand-specific
window through these arguments. That means a strand-aware restrict()
would need 2 additional arguments. But how should we name them?

My preference would be to support negative values for 'start' and 'end'
like we do for subseq(). When negative, the position is counted from
the end of the sequence (-1 being the last nucleotide). If we had this,
then you could do your strand-specific trimming with:

   restrict(gr, start=ifelse(strand(gr) == "-", 51, 1),
                end=ifelse(strand(gr) == "-", -1, -51))
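
To spell out the negative-value convention described above: a negative
position k on a sequence of length L would correspond to absolute
position L + k + 1. Below is a minimal sketch in plain R. Note that
resolve_pos() is a hypothetical helper written only for illustration;
it is not part of any Bioconductor package, and the sequence length is
made up.

```r
# Hypothetical helper: translate a possibly negative position into an
# absolute 1-based position, with -1 meaning the last nucleotide.
resolve_pos <- function(pos, seqlen) {
    ifelse(pos < 0, seqlen + pos + 1L, pos)
}

seqlen <- 1000L             # made-up chromosome length
resolve_pos(-1L, seqlen)    # 1000: the last nucleotide
resolve_pos(-51L, seqlen)   # 950: the last 50 bases are excluded
resolve_pos(51L, seqlen)    # 51: positive values are left unchanged
```

So end=-51 keeps everything up to and including position L - 50, which
is why it removes exactly the last 50 bases.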

Note that using negative values is convenient but not strictly needed:

   start <- ifelse(strand(gr) == "-", 51, 1)
   end <- extractROWS(seqlengths(gr), seqnames(gr))
   end <- ifelse(strand(gr) == "-", end, end - 50)
   restrict(gr, start=start, end=end)
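
To make the window arithmetic concrete, here is a rough sketch in plain
R that computes the same per-range 'start' and 'end' on a toy table and
clips each range to its window. The data frame is only a stand-in for a
GRanges object, and all names and coordinates in it are made up for
illustration.

```r
# Toy stand-in for a GRanges object: one row per range (values made up).
tbl <- data.frame(
    seqnames = c("chr1", "chr1", "chr2", "chr2"),
    start    = c(10L, 10L, 400L, 400L),
    end      = c(120L, 120L, 500L, 500L),
    strand   = c("+", "-", "+", "-"),
    stringsAsFactors = FALSE
)
seqlengths <- c(chr1 = 1000L, chr2 = 500L)

# Per-range window: drop positions 1..50 on the minus strand and the
# last 50 bases of the chromosome on the plus strand.
win_start <- ifelse(tbl$strand == "-", 51L, 1L)
chrom_len <- unname(seqlengths[tbl$seqnames])
win_end   <- ifelse(tbl$strand == "-", chrom_len, chrom_len - 50L)

# Clip each range to its window.
trimmed <- tbl
trimmed$start <- pmax(tbl$start, win_start)
trimmed$end   <- pmin(tbl$end, win_end)
trimmed
```

Unlike restrict(), which by default drops ranges that fall entirely
outside the window, this simple clipping would leave such ranges with
end < start, so they would need to be filtered out separately.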

Cheers,
H.


On 02/14/2016 09:00 PM, Dario Strbenac wrote:
> Hello,
>
> The restrict function currently has no strand settings. A strand-aware version would be useful when creating fixed-size windows, 50 bases wide, by sampling the start positions of the windows from a GRanges object. I'd like to restrict the GRanges object being sampled from, but only on the - strand when removing positions 1 to 50, and only on the + strand when removing positions within the last 50 bases of the chromosome. This would avoid sampling start positions too close to the ends of chromosomes. I suppose that setdiff will be a suitable alternative, if the first and last 50 bases of each chromosome are calculated.
>
> --------------------------------------
> Dario Strbenac
> PhD Student
> University of Sydney
> Camperdown NSW 2050
> Australia

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


