[BioC] IRanges package: findOverlaps on blobs
Hervé Pagès
hpages at fhcrc.org
Fri Jun 17 07:29:49 CEST 2011
On 11-06-08 01:56 PM, Hervé Pagès wrote:
> Hi Fahim,
[...]
> So it looks like the first thing you might want to do is to import
> your file into a GRangesList object. Which can be done with something
> like:
>
> library(GenomicRanges)
> refseqs <- read.table("RefSeqs.txt", header=TRUE,
> stringsAsFactors=FALSE)
> starts <- strsplitAsListOfIntegerVectors(refseqs$targetStart)
> widths <- strsplitAsListOfIntegerVectors(refseqs$blockSizes)
> ranges <- IRanges(start=unlist(starts), width=unlist(widths))
> seqnames <- Rle(factor(refseqs$targetName), elementLengths(starts))
> strand <- Rle(strand(refseqs$strand), elementLengths(starts))
> gr <- GRanges(seqnames, ranges, strand)
> grl <- split(gr, rep.int(seq_len(length(starts)),
> elementLengths(starts)))
> names(grl) <- refseqs$RefSeqID
FWIW, I've added a utility function to the devel version of the
GenomicRanges package that takes care of making a GRangesList object
from this type of input:
  library(GenomicRanges)
  refseqs <- read.table("RefSeqs.txt", header=TRUE,
                        stringsAsFactors=FALSE)
  grl <- with(refseqs,
              makeGRangesListFromFeatureFragments(
                  seqnames=targetName,
                  fragmentStarts=targetStart,
                  fragmentWidths=blockSizes,
                  strand=strand))
Sorry for the long and ugly name... (suggestions welcome).
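In case it helps to see what "this type of input" means, here is a
minimal base-R sketch (no Bioconductor needed) of the expansion from one
row per feature to one row per fragment. The toy values and the helper
name split_ints are made up for illustration, and I'm assuming the
targetStart and blockSizes columns hold comma-separated integers with
1-based starts. It only mimics the bookkeeping; it is not how the new
function is implemented.

```r
# Hypothetical toy input mimicking the columns of RefSeqs.txt
# (all values invented for illustration).
refseqs <- data.frame(
  RefSeqID    = c("tx1", "tx2"),
  targetName  = c("chr1", "chr2"),
  strand      = c("+", "-"),
  targetStart = c("100,300,500", "1000,2000"),
  blockSizes  = c("50,60,70", "150,250"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split comma-separated strings into a list of
# integer vectors, one vector per feature.
split_ints <- function(x) lapply(strsplit(x, ",", fixed = TRUE), as.integer)

starts <- split_ints(refseqs$targetStart)
widths <- split_ints(refseqs$blockSizes)

# Number of fragments per feature; starts and widths must agree.
n_frag <- lengths(starts)
stopifnot(identical(n_frag, lengths(widths)))

# One row per fragment: per-feature columns are repeated n_frag times.
fragments <- data.frame(
  feature = rep(refseqs$RefSeqID, times = n_frag),
  seqname = rep(refseqs$targetName, times = n_frag),
  strand  = rep(refseqs$strand, times = n_frag),
  start   = unlist(starts),
  width   = unlist(widths),
  stringsAsFactors = FALSE
)

# Closed-interval end, assuming 1-based starts (the IRanges convention).
fragments$end <- fragments$start + fragments$width - 1L

# Group the fragments back by feature, keeping the original feature order.
by_feature <- split(fragments,
                    factor(fragments$feature, levels = refseqs$RefSeqID))
```

Here by_feature is a plain list of data frames, one per RefSeqID, which
is roughly the shape a GRangesList gives you with proper genomic ranges
in place of the start/width columns.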
Cheers,
H.
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319