[BioC] Turning a GRanges Metadata Column into Rle List
Paul Leo
p.leo at uq.edu.au
Tue Jan 15 08:59:18 CET 2013
Actually this may be better. than my last post:
But making the Rle object of scores using "coverage and weights" (is the
only way I could think of).
In won't work in cases where the intervals in ir.encode.data are not
unique (that is the coverage values must always be one else you get
coverage*score instead of just score when you use the weights)
Is there a better way to make an Rle of non-contigious scores ???
### assume encode.data is table with columns start,end,score
### assume you.intervals is table with columns start,end
# ir<-IRanges(start=you.intervals[,"start"],end=you.intervals[,"end"])
ir.encode.data<-IRanges(start=encode.data[,"start"],end=encode.data[,"end"])
cov<-coverage(ir.encode.data,weight=as.numeric(ir.encode.data[,"score"])) ## DANGER HERE
myViews<-Views(cov,start=you.intervals[,"start"],end=you.intervals[,"end"] )
viewMaxs(myViews)
if ir.encode.data and ir are RleLists use
regionViews <- RleViewsList(rleList = cov, rangesList =ir )
instesd of the Views
I'm not sure which solution is best but the above should use less
memory?
Dr Paul Leo
Senior Bioinformatician
The University of Queensland Diamantina Institute
---------------------------------------------------------------------
TRI, level , 37 Kent Street, Woolloongabba QLD 4102
Tel: +61 7 3443 7072 Mob: 041 303 8691 Fax: +61 7 3443 6966
-----Original Message-----
From: Paul Leo <p.leo at uq.edu.au>
Reply-to: <p.leo at uq.edu.au>
To: Martin Morgan <mtmorgan at fhcrc.org>
Cc: bioconductor at r-project.org
Subject: Re: [BioC] Turning a GRanges Metadata Column into Rle List
Date: Tue, 15 Jan 2013 16:38:33 +1000
If you want the maximum of the overlap you can use something like this
too:
ir<-IRanges(start=you.intervals[,"start"],end=you.intervals[,"end"])
ir.encode.data<-IRanges(start=encode.data[,"start"],end=encode.data[,"end"])
over<-findOverlaps(ir,ir.encode.data)
query<-queryHits(over)
subject<-subjectHits(over)
tapply(as.numeric(encode.data[subject,"score"]),query,function(x)
max(x,na.rm=TRUE))
slight modification if have different chrs ....
This works for getting GERP scores in indels etc
Dr Paul Leo
Senior Bioinformatician
The University of Queensland Diamantina Institute
---------------------------------------------------------------------
TRI, level , 37 Kent Street, Woolloongabba QLD 4102
Tel: +61 7 3443 7072 Mob: 041 303 8691 Fax: +61 7 3443 6966
-----Original Message-----
From: Martin Morgan <mtmorgan at fhcrc.org>
To: D.Strbenac at garvan.org.au
Cc: bioconductor at r-project.org
Subject: Re: [BioC] Turning a GRanges Metadata Column into Rle List
Date: Mon, 14 Jan 2013 22:17:17 -0800
On 01/14/2013 10:00 PM, Dario Strbenac wrote:
>> I was thinking you could just do (assuming score is always >= 0, and that
>> the ranges do not overlap, which seems to be the case from the initial
>> code):
>
> In another scenario, what if I have data on multiple cell lines, such as one
> of the ENCODE integrated tracks, and I would like to find the maximum value
> at each base within a specified genomic region ? In this case, the ranges in
> the supertrack would overlap often.
>
> I would imagine making a separate coverage RleList for each cell line, to
> avoid complications with overlapping ranges. Then, it would be nice to have a
> pmax function in the API that could take a number of RleList objects, each of
> the same length, and return one RleList. Are there any plans for pmin and
> pmax of this style ?
I think there is one?
library(IRanges)
showMethods(class="RleList", where=search())
> r1 = RleList(a=Rle(c(0, 0, 1, 0)), b=Rle(c(1, 1, 0, 0)))
> r2 = RleList(a=Rle(c(0, 0, 2, 2)), b=Rle(c(1, 2, 1, 1)))
> pmax(r1, r2)
SimpleRleList of length 2
$a
numeric-Rle of length 4 with 2 runs
Lengths: 2 2
Values : 0 2
$b
numeric-Rle of length 4 with 3 runs
Lengths: 1 1 2
Values : 1 2 1
>
> _______________________________________________ Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
_______________________________________________
Bioconductor mailing list
Bioconductor at r-project.org
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
More information about the Bioconductor
mailing list