[Bioc-sig-seq] Fwd: seqselect using GRanges object on multiple chromosomes
Janet Young
jayoung at fhcrc.org
Thu Jun 23 01:15:41 CEST 2011
Hi again,
A further note - the workaround I first emailed only works when there's one range per chromosome. I have a better workaround now - the coercion looks like this instead:
convertGRangesToCompressedIRangesList <- function (myGRobject) {
myGR_convert <- split(myGRobject, seqnames(myGRobject) )
names(myGR_convert) <- unlist(lapply(myGR_convert, function(x) { as.character(seqnames(x))[1] }))
myGR_convert <- lapply(myGR_convert, ranges )
class(myGR_convert) <- "CompressedIRangesList"
myGR_convert
}
Janet
Begin forwarded message:
> From: Janet Young <jayoung at fhcrc.org>
> Date: June 22, 2011 4:01:03 PM PDT
> To: Bioc-sig-sequencing at r-project.org
> Subject: seqselect using GRanges object on multiple chromosomes
>
> Hi there,
>
> I'm updating some of my older code - I was previously storing regions of interest as RangedData but now I'm switching to GRanges. I'm running into a little trouble with seqselect - I've found a workaround but wanted to suggest extending seqselect so it can work with GRanges objects directly.
>
> I have some scores for each base-pair I've stored as a SimpleRleList object. I want to use GRanges object with seqselect to pull out scores from my regions of interest, but to make that work I first have to do an odd (and slightly wrong-looking, to me) coercion of my GRanges object to a CompressedIRangesList.
>
> I think the code below explains all (?).
>
> thanks very much,
>
> Janet
>
>
>
>
>
>> library(GenomicRanges)
> Loading required package: IRanges
>
> Attaching package: 'IRanges'
>
> The following object(s) are masked from 'package:base':
>
> cbind, eval, intersect, Map, mapply, order, paste, pmax, pmax.int, pmin,
> pmin.int, rbind, rep.int, setdiff, table, union
>
>>
>> ### make some scores objects, for single chromosomes, or across several chrs
>> tempscores <- Rle(1:20)
>> tempscores2 <- Rle(101:120)
>> allscores <- RleList(chr1=tempscores,chr2=tempscores2) # yields a SimpleRleList
>>
>> ## make some ranges objects
>> myIR <- IRanges(start=3,end=5)
>> myGR <- GRanges(seqnames=c("chr1","chr2"),ranges=IRanges(start=3,end=5))
>> myRD <- RangedData(space=c("chr1","chr2"),IRanges(start=c(3,3),end=c(5,5)) )
>>
>>
>> ### test seqselect:
>> seqselect(tempscores,myIR) #works
> 'integer' Rle of length 3 with 3 runs
> Lengths: 1 1 1
> Values : 3 4 5
>> seqselect(allscores,myGR) # doesn't work
> Error in seqselect(allscores, myGR) : unrecognized 'start' type
>> seqselect(allscores,myRD) # doesn't work
> Error in seqselect(allscores, myRD) : unrecognized 'start' type
>>
>> seqselect(allscores,ranges(myRD)) # works. ranges(myRD) is a CompressedIRangesList
> SimpleRleList of length 2
> $chr1
> 'integer' Rle of length 3 with 3 runs
> Lengths: 1 1 1
> Values : 3 4 5
>
> $chr2
> 'integer' Rle of length 3 with 3 runs
> Lengths: 1 1 1
> Values : 103 104 105
>
>>
>> seqselect(allscores,ranges(myGR)) # doesn't work
> Error in .bracket.Index(start, length(x), names(x), asRanges = TRUE) :
> range index out of bounds
>> seqselect(allscores,split(myGR)) # doesn't work
> Error in seqselect(allscores, split(myGR)) : unrecognized 'start' type
>>
>> #### coerce myGR to something that looks a bit like a CompressedIRangesList
>> myGR_convert <- split(myGR)
>> names(myGR_convert) <- names(seqlengths(myGR_convert))
>> myGR_convert <- lapply(myGR_convert, ranges )
>> class(myGR_convert) <- "CompressedIRangesList"
>>
>>
>> seqselect(allscores,myGR_convert) # works
> SimpleRleList of length 2
> $chr1
> 'integer' Rle of length 3 with 3 runs
> Lengths: 1 1 1
> Values : 3 4 5
>
> $chr2
> 'integer' Rle of length 3 with 3 runs
> Lengths: 1 1 1
> Values : 103 104 105
>
>>
>> sessionInfo()
> R version 2.13.0 (2011-04-13)
> Platform: i386-apple-darwin9.8.0/i386 (32-bit)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] GenomicRanges_1.4.6 IRanges_1.10.4
>
>
More information about the Bioc-sig-sequencing
mailing list