[Bioc-sig-seq] Semantics of strandedness for gaps function

Patrick Aboyoun paboyoun at fhcrc.org
Mon Jun 28 18:09:41 CEST 2010


Dario,
Each element in a GRanges object is interpreted loosely as a feature. If
you have one feature on the positive strand of a particular chromosome at
a particular range, and another feature on the negative strand of the
same chromosome at the same range, you would represent them as two
separate elements in a GRanges object. You would not collapse the two
features into a single feature with a strand designation of '*'.
Similarly, you would not break up a feature that occurs on both strands
into two separate elements on opposing strands. The gaps method for
GRanges therefore treats '+', '-', and '*' as three independent groups:
it finds the set complement of the features on the positive strand, of
those on the negative strand, and of those on both strands
simultaneously, each on its own. That is why you see the results that
you do in the vignette example: the '-' feature at [1, 7] is not a '*'
feature, so the '*' gap still spans the whole chromosome. Does this make
sense?
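
A minimal sketch of the per-strand complement described above, on toy
data. The 20 bp "chr1" and the '*' feature at 10-12 are made up for
illustration and are not the vignette's object; the sketch assumes a
GenomicRanges version in which the GRanges constructor accepts a
seqlengths argument.

```r
library(GenomicRanges)

# Toy data (hypothetical, not the vignette's object): a 20 bp "chr1" with
# one feature on the '-' strand and one feature on both strands ('*').
g <- GRanges(seqnames = c("chr1", "chr1"),
             ranges = IRanges(start = c(1, 10), end = c(7, 12)),
             strand = c("-", "*"),
             seqlengths = c(chr1 = 20))

# gaps() takes the complement within [1, seqlength] separately for each
# strand level, so the '-' feature has no effect on the '*' result.
gp <- gaps(g)

gp[strand(gp) == "+"]  # no '+' features: one gap spanning 1-20
gp[strand(gp) == "-"]  # '-' feature at 1-7: the gap is 8-20
gp[strand(gp) == "*"]  # '*' feature at 10-12: gaps are 1-9 and 13-20
```

Only the feature explicitly marked '*' shortens the '*' gaps; the '-'
feature at 1-7 shortens the '-' gap alone.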


Patrick


On 6/27/10 10:30 PM, Dario Strbenac wrote:
> Hello,
>
> I have a question that relates to the GenomicRanges vignette.
>
> An element of the GRanges on page 9 is :
>
> > reduce(g)
> GRanges with 3 ranges and 0 elementMetadata values
>     seqnames    ranges strand |
>        <Rle> <IRanges>  <Rle> |
> [1]     chr1  [ 1, 7]      - |
>      ...       ...    ...
>
> and the first part of the gaps() on page 10 is :
>
> > gaps(g)
> GRanges with 11 ranges and 0 elementMetadata values
>     seqnames          ranges strand |
>        <Rle>       <IRanges>  <Rle> |
> [1]     chr1 [ 1, 249250621]      + |
> [2]     chr1 [ 8, 249250621]      - |
> [3]     chr1 [ 1, 249250621]      * |
>      ...             ...    ...
>
> I would've thought that the gaps for the '*' strand would have the same start and end as row 2, because I had the impression that '*' means "on both strands" and there's something on the '-' strand between 1 and 7, meaning there isn't a gap "on both strands" between 1 and 7? It seems that gaps changes the semantics to: (NOT '+') OR (NOT '-').
>
> --------------------------------------
> Dario Strbenac
> Research Assistant
> Cancer Epigenetics
> Garvan Institute of Medical Research
> Darlinghurst NSW 2010
> Australia
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> Bioc-sig-sequencing at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>


