[Bioc-sig-seq] seqselect using GRanges object on multiple chromosomes

Janet Young jayoung at fhcrc.org
Thu Jun 23 01:01:03 CEST 2011


Hi there,

I'm updating some of my older code - I was previously storing regions of interest as RangedData but now I'm switching to GRanges.  I'm running into a little trouble with seqselect - I've found a workaround but wanted to suggest extending seqselect so it can work with GRanges objects directly.

I have some scores for each base-pair I've stored as a SimpleRleList object.  I want to use GRanges object with seqselect to pull out scores from my regions of interest, but to make that work I first have to do an odd (and slightly wrong-looking, to me) coercion of my GRanges object to a CompressedIRangesList.

I think the code below explains all (?).

thanks very much,

Janet





> library(GenomicRanges)
Loading required package: IRanges

Attaching package: 'IRanges'

The following object(s) are masked from 'package:base':

    cbind, eval, intersect, Map, mapply, order, paste, pmax, pmax.int, pmin,
    pmin.int, rbind, rep.int, setdiff, table, union

> 
> ### make some scores objects, for single chromosomes, or across several chrs
> tempscores <- Rle(1:20)
> tempscores2 <- Rle(101:120)
> allscores <- RleList(chr1=tempscores,chr2=tempscores2) # yields a SimpleRleList
> 
> ## make some ranges objects
> myIR <- IRanges(start=3,end=5)
> myGR <- GRanges(seqnames=c("chr1","chr2"),ranges=IRanges(start=3,end=5))
> myRD <- RangedData(space=c("chr1","chr2"),IRanges(start=c(3,3),end=c(5,5)) )
> 
> 
> ### test seqselect:
> seqselect(tempscores,myIR) #works
'integer' Rle of length 3 with 3 runs
  Lengths: 1 1 1
  Values : 3 4 5
> seqselect(allscores,myGR) # doesn't work
Error in seqselect(allscores, myGR) : unrecognized 'start' type
> seqselect(allscores,myRD) # doesn't work
Error in seqselect(allscores, myRD) : unrecognized 'start' type
> 
> seqselect(allscores,ranges(myRD)) # works. ranges(myRD) is a CompressedIRangesList
SimpleRleList of length 2
$chr1
'integer' Rle of length 3 with 3 runs
  Lengths: 1 1 1
  Values : 3 4 5

$chr2
'integer' Rle of length 3 with 3 runs
  Lengths:   1   1   1
  Values : 103 104 105

> 
> seqselect(allscores,ranges(myGR)) # doesn't work
Error in .bracket.Index(start, length(x), names(x), asRanges = TRUE) : 
  range index out of bounds
> seqselect(allscores,split(myGR)) # doesn't work
Error in seqselect(allscores, split(myGR)) : unrecognized 'start' type
> 
> #### coerce myGR to something that looks a bit like a CompressedIRangesList
> myGR_convert <- split(myGR)
> names(myGR_convert) <- names(seqlengths(myGR_convert))
> myGR_convert <- lapply(myGR_convert, ranges )
> class(myGR_convert) <- "CompressedIRangesList"
> 
> 
> seqselect(allscores,myGR_convert) # works
SimpleRleList of length 2
$chr1
'integer' Rle of length 3 with 3 runs
  Lengths: 1 1 1
  Values : 3 4 5

$chr2
'integer' Rle of length 3 with 3 runs
  Lengths:   1   1   1
  Values : 103 104 105

> 
> sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: i386-apple-darwin9.8.0/i386 (32-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] GenomicRanges_1.4.6 IRanges_1.10.4     



More information about the Bioc-sig-sequencing mailing list