[Bioc-sig-seq] Creating TSS Regions IRanges with Strand Information

Dario Strbenac D.Strbenac at garvan.org.au
Fri Sep 18 04:46:10 CEST 2009


Hi again,

Thanks for your reply. That's not quite what I intended. I'm trying to create a RangesList that I can call overlap() on later. If the TSS is on the negative strand, I want to subtract the value of downstream to get the start of the range and add the value of upstream to get the end of the range (the opposite of what happens for TSSs on the positive strand).

In pseudocode, something like:

strands <- TSSTable$strand
foreach(row in TSSTable)
  if(strand is +)
    startPositions.add(row$start)
  else
    startPositions.add(row$end)

foreach(startPosition)
  if(strand is +)
    startRange <- startPosition - upstream
    endRange <- startPosition + downstream
  else
    startRange <- startPosition - downstream
    endRange <- startPosition + upstream

Can this be done while keeping the mapply function, or do I need to implement it similarly to the second for loop?
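
As a rough sketch of how the pseudocode above might be vectorised without a loop, the strand test can be done once and reused in ifelse(). The example data, the window sizes and the names plus, tss, rangeStart and rangeEnd are assumptions for illustration only; only the column names (chr, start, end, strand, name) come from the code in this thread.

```r
# Hypothetical example table; real data would come from TSSDataTable.
TSSDataTable <- data.frame(
  chr    = c("chr1", "chr1", "chr2"),
  start  = c(1000, 5000, 2000),
  end    = c(3000, 8000, 4000),
  strand = c("+", "-", "-"),
  name   = c("geneA", "geneB", "geneC"),
  stringsAsFactors = FALSE
)
upstream   <- 500  # assumed window sizes
downstream <- 100

plus <- TSSDataTable$strand == "+"

# The TSS is the start coordinate on the + strand and the end on the - strand.
tss <- ifelse(plus, TSSDataTable$start, TSSDataTable$end)

# On the - strand, upstream lies at higher coordinates, so the offsets swap.
rangeStart <- ifelse(plus, tss - upstream, tss - downstream)
rangeEnd   <- ifelse(plus, tss + downstream, tss + upstream)

# Every range now satisfies start <= end, whatever the strand.
stopifnot(all(rangeStart <= rangeEnd))

# Split by chromosome, ready to be handed to mapply() as before.
startsByChr <- split(rangeStart, TSSDataTable$chr)
endsByChr   <- split(rangeEnd, TSSDataTable$chr)
namesByChr  <- split(TSSDataTable$name, TSSDataTable$chr)
```

If this is right, startsByChr, endsByChr and namesByChr could be passed as the start, end and names arguments of the existing mapply(IRanges, ...) call, with the strand kept in a separate column.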

Thanks,
       Dario.
  

---- Original message ----
>Date: Mon, 14 Sep 2009 13:47:50 -0700
>From: lawremi at gmail.com (on behalf of Michael Lawrence <mflawren at fhcrc.org>)
>Subject: Re: [Bioc-sig-seq] Creating TSS Regions IRanges with Strand  Information  
>To: D.Strbenac at garvan.org.au
>Cc: bioc-sig-sequencing at r-project.org
>
>   On Wed, Sep 9, 2009 at 6:18 PM, Dario Strbenac
>   <D.Strbenac at garvan.org.au> wrote:
>
>     Hello,
>
>     I'm trying to create a RangesList of intervals
>     around the TSSs. So far, I have :
>
>     # TSS is the start on the + strand, the end on the - strand
>     startPositions <- ifelse(TSSDataTable$strand == "+",
>                              TSSDataTable$start, TSSDataTable$end)
>     strand2numeric <- c(-1,1)
>     names(strand2numeric) <- c("-","+")
>     TSSranges <- mapply(IRanges,
>       start=split(startPositions -
>         strand2numeric[TSSDataTable$strand]*upstream, TSSDataTable$chr),
>       end=split(startPositions +
>         strand2numeric[TSSDataTable$strand]*downstream, TSSDataTable$chr),
>       names=split(TSSDataTable$name, TSSDataTable$chr))
>
>     TSSL <- do.call(RangesList, TSSranges)
>
>     Now, I'm stuck when the IRanges constructor gets
>     start > end for - strand TSSs. Is there any way to
>     do this easily without resorting to rewriting the
>     code with for loops ?
>
>   Just received this message, even though it's dated 5
>   days ago.
>
>   I'm a little confused as to your goal. IRanges will
>   not support start > (end + 1), so you'll need to
>   store the strand information separately, e.g. with a
>   RangedData object.
>
>   Something like:
>   RangedData(IRanges(startPositions - upstream,
>                      startPositions + downstream),
>              strand = TSSDataTable$strand,
>              space = TSSDataTable$chr)
>    
>   Note that the GenomicFeatures experimental data
>   package already has this information for the UCSC
>   predicted TSS's.
>
>   Michael
>
>    
>
>     Thanks, Dario.
>
>     _______________________________________________
>     Bioc-sig-sequencing mailing list
>     Bioc-sig-sequencing at r-project.org
>     https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing


