[Bioc-sig-seq] Creating TSS Regions IRanges with Strand Information
Dario Strbenac
D.Strbenac at garvan.org.au
Fri Sep 18 04:46:10 CEST 2009
Hi again,
Thanks for your reply. That's not quite what I intended. I'm trying to create a RangesList so that I can run overlap() on it later. If the TSS is on the negative strand, I want to subtract the value of downstream to get the start of the range and add the value of upstream to get the end of the range (the opposite of what is done for TSSs on the positive strand).
In pseudocode, something like:
strands <- TSSTable$strand
foreach(row in TSSTable)
    if(strand is +)
        startPositions.add(row$start)
    else
        startPositions.add(row$end)
foreach(startPosition)
    if(strand is +)
        startRange <- startPosition - upstream
        endRange <- startPosition + downstream
    else
        startRange <- startPosition - downstream
        endRange <- startPosition + upstream
Can this be done while keeping the mapply function, or do I need to implement it similarly to the second for loop?
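One way the pseudocode above might be vectorised, as a rough sketch rather than a tested solution: ifelse() picks the strand-dependent offsets for every row at once, so start <= end holds before anything reaches the IRanges constructor. The toy table and its values are made up for illustration; the column names (name, chr, start, end, strand) and the upstream/downstream variables are assumed to match the ones used in this thread.

```r
# Toy TSS table: column names follow the thread, values are invented.
TSSDataTable <- data.frame(
  name   = c("geneA", "geneB", "geneC"),
  chr    = c("chr1", "chr1", "chr2"),
  start  = c(1000, 5000, 2000),
  end    = c(1500, 5800, 2600),
  strand = c("+", "-", "+"),
  stringsAsFactors = FALSE
)
upstream   <- 200
downstream <- 50

onPlus <- TSSDataTable$strand == "+"

# The TSS is the start of a + strand feature and the end of a - strand one.
tss <- ifelse(onPlus, TSSDataTable$start, TSSDataTable$end)

# Swap the two offsets on the - strand so that start <= end always holds.
regionStart <- ifelse(onPlus, tss - upstream,   tss - downstream)
regionEnd   <- ifelse(onPlus, tss + downstream, tss + upstream)

# Split by chromosome, mirroring the original mapply() call.
startsByChr <- split(regionStart, TSSDataTable$chr)
endsByChr   <- split(regionEnd,   TSSDataTable$chr)
namesByChr  <- split(TSSDataTable$name, TSSDataTable$chr)

# Build the per-chromosome IRanges only if the package is installed.
if (requireNamespace("IRanges", quietly = TRUE)) {
  TSSranges <- mapply(IRanges::IRanges,
                      start = startsByChr, end = endsByChr,
                      names = namesByChr, SIMPLIFY = FALSE)
}
```

The resulting list could then be passed to do.call(RangesList, TSSranges) as before. The strand itself is not stored in the ranges, so it would still need to be kept alongside them if it is needed later.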
Thanks,
Dario.
---- Original message ----
>Date: Mon, 14 Sep 2009 13:47:50 -0700
>From: lawremi at gmail.com (on behalf of Michael Lawrence <mflawren at fhcrc.org>)
>Subject: Re: [Bioc-sig-seq] Creating TSS Regions IRanges with Strand Information
>To: D.Strbenac at garvan.org.au
>Cc: bioc-sig-sequencing at r-project.org
>
> On Wed, Sep 9, 2009 at 6:18 PM, Dario Strbenac
> <D.Strbenac at garvan.org.au> wrote:
>
> Hello,
>
> I'm trying to create a RangesList of intervals
> around the TSSs. So far, I have :
>
> startPositions <- as.numeric(apply(TSSDataTable,
> 1, function(x) ifelse(x$strand=="+", x$start,
> x$end)))
> strand2numeric <- c(-1,1)
> names(strand2numeric) <- c("-","+")
> TSSranges <- mapply(IRanges,
> start=split(startPositions -
> strand2numeric[TSSDataTable$strand]*upstream,
> TSSDataTable$chr), end=split(startPositions +
> strand2numeric[TSSDataTable$strand]*downstream,
> TSSDataTable$chr), names=split(TSSDataTable$name,
> TSSDataTable$chr))
>
> TSSL <- do.call(RangesList, TSSranges)
>
> Now, I'm stuck when the IRanges constructor gets
> start > end for - strand TSSs. Is there any way to
> do this easily without resorting to rewriting the
> code with for loops ?
>
> Just received this message, even though it's dated 5
> days ago.
>
> I'm a little confused as to your goal. IRanges will
> not support start > (end + 1), so you'll need to
> store the strand information separately, e.g. with a
> RangedData object.
>
> Something like:
> RangedData(IRanges(startPositions - upstream,
> startPositions + downstream),
> strand =
> TSSDataTable$strand, space = TSSDataTable$chr)
>
> Note that the GenomicFeatures experimental data
> package already has this information for the UCSC
> predicted TSS's.
>
> Michael
>
>
>
> Thanks, Dario.
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> Bioc-sig-sequencing at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing