[Bioc-sig-seq] weighted coverage on GRanges object

Janet Young jayoung at fhcrc.org
Fri May 6 21:48:30 CEST 2011


Hi there,

I have a suggestion - is it possible to make it a little easier to directly calculate weighted coverage on a GRanges object that has regions on >1 chromosome?   I think the code below explains what I mean.

I've figured out a workaround, but it took me quite a while to get there - perhaps putting the coercion in the underlying coverage/weights code will help others avoid that in future?

thanks very much,

Janet

------------------------------------------------------------------- 

Dr. Janet Young (Trask lab)

Fred Hutchinson Cancer Research Center
1100 Fairview Avenue N., C3-168, 
P.O. Box 19024, Seattle, WA 98109-1024, USA.

tel: (206) 667 1471 fax: (206) 667 6524
email: jayoung  ...at...  fhcrc.org

http://www.fhcrc.org/labs/trask/

------------------------------------------------------------------- 


library(GenomicRanges)


dat1 <- GRanges(seqnames="chr2L",IRanges(start=((1:4)*5),width=20),score=c(3,5,10,20) )

dat2 <- GRanges(seqnames=rep(c("chr2L","chr3L"),c(4,3)),IRanges(start=c(  (1:4)*5 , (1:3)*5 ),width=20),score=c(3,5,10,20,6,10,30) )
seqlengths(dat2) <- c(50,40)

#### this works
coverage(dat1,weight=list(elementMetadata(dat1)$score))

## SimpleRleList of length 1
## $chr2L
## 'integer' Rle of length 39 with 8 runs
##   Lengths:  4  5  5  5  5  5  5  5
##   Values :  0  3  8 18 38 35 30 20


#### this works
myscores <- list(c(3,5,10,20), c(6,10,30))
coverage(dat2,weight=myscores)

## SimpleRleList of length 2
## $chr2L
## 'integer' Rle of length 50 with 9 runs
##   Lengths:  4  5  5  5  5  5  5  5 11
##   Values :  0  3  8 18 38 35 30 20  0

## $chr3L
## 'integer' Rle of length 40 with 7 runs
##   Lengths:  4  5  5 10  5  5  6
##   Values :  0  6 16 46 40 30  0

##### but it'd be great if this could work too....
coverage(dat2,weight=list(elementMetadata(dat2)$score))
## Error in normargWeight(weight, nseq) : 'weight' is longer than 'x'

### for now I'll coerce my scores into a list by chromosome: does this seem to be a reliable way to do it?  this is the part it took me a while to figure out on my own...
myscores <- split( elementMetadata(dat2)$score , as.character(seqnames(dat2)) )
coverage(dat2,weight=myscores)

sessionInfo()

R version 2.13.0 (2011-04-13)
Platform: i386-apple-darwin9.8.0/i386 (32-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] GenomicRanges_1.4.3 IRanges_1.10.0     



More information about the Bioc-sig-sequencing mailing list