[Bioc-sig-seq] Fwd: seqselect using GRanges object on multiple chromosomes

Janet Young jayoung at fhcrc.org
Thu Jun 23 02:33:08 CEST 2011


Hi again,

Thanks, Michael - looking forward to seqselect on GRanges being implemented, if possible. Getting a little further through my old code, brings up another related request.

Here are some slightly different example ranges:
	myGR <- GRanges(seqnames=c("chr1","chr2","chr2"),ranges=IRanges(start=c(25,40,60),end=c(35,45,65)) ) 
	myRD <- RangedData(space=c("chr1","chr2","chr2"),IRanges(start=c(25,40,60),end=c(35,45,65)) ) 

And I had previously been using this command:
	restrict(ranges(myRD), 30,50)
I'm doing that because I'm interested in ranges within the same region on every single "chromosome". With my real data, they spaces/seqnames are not chromosomes, they're promoter regions centered around the transcription start site, and I want to take various equivalent sub-portions of the whole set of promoters)

I can't get "restrict" to work on the GRanges object at the moment (even if I try that same coercion first):
	restrict( myGR , 30,50)
	 Error in function (classes, fdef, mtable)  : 
	   unable to find an inherited method for function "restrict", for signature "GRanges"

	restrict( as(myGR,"RangesList") , 30,50)
	 Error in sapply(listData, function(Xi) extends(class(Xi), elementTypeX)) : 
	  error in evaluating the argument 'X' in selecting a method for function 'sapply': Error in .CompressedList.list.subscript(X = X,   INDEX = seq_len(length(X)),  : 
	  invalid output element of class "IRanges"

It would be great if restrict could work directly in the same way as it did for RangedData.  In the meantime I can probably figure out a way to do it with lapply.

thanks,

Janet



On Jun 22, 2011, at 4:19 PM, Michael Lawrence wrote:

> Hi, 
> 
> There is already a coercion from GRanges to RangesList, i.e., as(gr, "RangesList"). That said, seqselect() should have a method for GRanges, if possible.
> 
> Michael
> 
> On Wed, Jun 22, 2011 at 4:15 PM, Janet Young <jayoung at fhcrc.org> wrote:
> Hi again,
> 
> A further note - the workaround I first emailed only works when there's one range per chromosome.   I have a better workaround now - the coercion looks like this instead:
> 
> convertGRangesToCompressedIRangesList <- function (myGRobject) {
>    myGR_convert <- split(myGRobject, seqnames(myGRobject) )
>    names(myGR_convert) <- unlist(lapply(myGR_convert, function(x) { as.character(seqnames(x))[1] }))
> 
>    myGR_convert <- lapply(myGR_convert, ranges )
>    class(myGR_convert) <- "CompressedIRangesList"
>    myGR_convert
> }
> 
> Janet
> 
> 
> 
> Begin forwarded message:
> 
> > From: Janet Young <jayoung at fhcrc.org>
> > Date: June 22, 2011 4:01:03 PM PDT
> > To: Bioc-sig-sequencing at r-project.org
> > Subject: seqselect using GRanges object on multiple chromosomes
> >
> > Hi there,
> >
> > I'm updating some of my older code - I was previously storing regions of interest as RangedData but now I'm switching to GRanges.  I'm running into a little trouble with seqselect - I've found a workaround but wanted to suggest extending seqselect so it can work with GRanges objects directly.
> >
> > I have some scores for each base-pair I've stored as a SimpleRleList object.  I want to use GRanges object with seqselect to pull out scores from my regions of interest, but to make that work I first have to do an odd (and slightly wrong-looking, to me) coercion of my GRanges object to a CompressedIRangesList.
> >
> > I think the code below explains all (?).
> >
> > thanks very much,
> >
> > Janet
> >
> >
> >
> >
> >
> >> library(GenomicRanges)
> > Loading required package: IRanges
> >
> > Attaching package: 'IRanges'
> >
> > The following object(s) are masked from 'package:base':
> >
> >    cbind, eval, intersect, Map, mapply, order, paste, pmax, pmax.int, pmin,
> >    pmin.int, rbind, rep.int, setdiff, table, union
> >
> >>
> >> ### make some scores objects, for single chromosomes, or across several chrs
> >> tempscores <- Rle(1:20)
> >> tempscores2 <- Rle(101:120)
> >> allscores <- RleList(chr1=tempscores,chr2=tempscores2) # yields a SimpleRleList
> >>
> >> ## make some ranges objects
> >> myIR <- IRanges(start=3,end=5)
> >> myGR <- GRanges(seqnames=c("chr1","chr2"),ranges=IRanges(start=3,end=5))
> >> myRD <- RangedData(space=c("chr1","chr2"),IRanges(start=c(3,3),end=c(5,5)) )
> >>
> >>
> >> ### test seqselect:
> >> seqselect(tempscores,myIR) #works
> > 'integer' Rle of length 3 with 3 runs
> >  Lengths: 1 1 1
> >  Values : 3 4 5
> >> seqselect(allscores,myGR) # doesn't work
> > Error in seqselect(allscores, myGR) : unrecognized 'start' type
> >> seqselect(allscores,myRD) # doesn't work
> > Error in seqselect(allscores, myRD) : unrecognized 'start' type
> >>
> >> seqselect(allscores,ranges(myRD)) # works. ranges(myRD) is a CompressedIRangesList
> > SimpleRleList of length 2
> > $chr1
> > 'integer' Rle of length 3 with 3 runs
> >  Lengths: 1 1 1
> >  Values : 3 4 5
> >
> > $chr2
> > 'integer' Rle of length 3 with 3 runs
> >  Lengths:   1   1   1
> >  Values : 103 104 105
> >
> >>
> >> seqselect(allscores,ranges(myGR)) # doesn't work
> > Error in .bracket.Index(start, length(x), names(x), asRanges = TRUE) :
> >  range index out of bounds
> >> seqselect(allscores,split(myGR)) # doesn't work
> > Error in seqselect(allscores, split(myGR)) : unrecognized 'start' type
> >>
> >> #### coerce myGR to something that looks a bit like a CompressedIRangesList
> >> myGR_convert <- split(myGR)
> >> names(myGR_convert) <- names(seqlengths(myGR_convert))
> >> myGR_convert <- lapply(myGR_convert, ranges )
> >> class(myGR_convert) <- "CompressedIRangesList"
> >>
> >>
> >> seqselect(allscores,myGR_convert) # works
> > SimpleRleList of length 2
> > $chr1
> > 'integer' Rle of length 3 with 3 runs
> >  Lengths: 1 1 1
> >  Values : 3 4 5
> >
> > $chr2
> > 'integer' Rle of length 3 with 3 runs
> >  Lengths:   1   1   1
> >  Values : 103 104 105
> >
> >>
> >> sessionInfo()
> > R version 2.13.0 (2011-04-13)
> > Platform: i386-apple-darwin9.8.0/i386 (32-bit)
> >
> > locale:
> > [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
> >
> > attached base packages:
> > [1] stats     graphics  grDevices utils     datasets  methods   base
> >
> > other attached packages:
> > [1] GenomicRanges_1.4.6 IRanges_1.10.4
> >
> >
> 
> _______________________________________________
> Bioc-sig-sequencing mailing list
> Bioc-sig-sequencing at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
> 



More information about the Bioc-sig-sequencing mailing list