[BioC] [somehow-OT] Storing/quickly accessing "genome length" data.
Steve Lianoglou
mailinglist.honeypot at gmail.com
Wed Feb 9 22:08:02 CET 2011
Hi,
I guess a lot of us have this problem: I'm storing "genome long"
integer/double vectors, with one value for each position along each
chromosome.
I want to quickly access parts of these vectors, as conveniently and
efficiently as we can access the reads in a given region of a BAM
file. I'm curious what you folks are using to store this type of info.
Currently I just have RData objects of Rle's or XIntegers, etc. for
each strand of each chromosome. I'll load these data files, query the
info over the ranges I want, then junk the (usually large) vector I
just loaded. It's not the best, but it works.
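
To make the "query the info over the ranges I want" step concrete, here
is a minimal sketch of the run-length idea behind an Rle. It is written
in Python purely for illustration and is not the IRanges implementation;
the class name RunLengthVector and its window() method are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate, groupby


class RunLengthVector:
    """Toy run-length-encoded vector (hypothetical, loosely Rle-like)."""

    def __init__(self, values):
        self.values = []
        self.lengths = []
        for value, run in groupby(values):
            self.values.append(value)
            self.lengths.append(sum(1 for _ in run))
        # ends[i] is the exclusive 0-based end position of run i
        self.ends = list(accumulate(self.lengths))

    def __len__(self):
        return self.ends[-1] if self.ends else 0

    def window(self, start, end):
        """Return the decoded values for the half-open range [start, end)."""
        if not 0 <= start <= end <= len(self):
            raise IndexError("range outside vector")
        out = []
        # Binary search for the first run whose end lies past `start`
        i = bisect_right(self.ends, start)
        pos = start
        while pos < end:
            stop = min(self.ends[i], end)
            out.extend([self.values[i]] * (stop - pos))
            pos = stop
            i += 1
        return out


# Made-up per-position values for a tiny "chromosome"
coverage = RunLengthVector([0, 0, 0, 5, 5, 2, 2, 2, 2, 0])
print(coverage.window(2, 7))  # [0, 5, 5, 2, 2]
```

The range lookup itself is cheap (a binary search over run ends); the
cost in the workflow above comes from loading the whole object from
disk before the first query.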
In the bioinformatics world, I guess these data are best stored as
bigWig files, yes? And AFAIK, there's no (convenient or otherwise) way
to query bigWigs from within R/Bioc, right?
Then I wonder if storing these in hdf/netcdf files isn't actually the
way to go ... and if so, why not go whole-hog and work on a bioc
interface to the somehow-defined biohdf format?
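
The appeal of an on-disk format is that a range query only reads the
bytes it needs. The sketch below shows that idea with a plain
fixed-width binary file standing in for hdf/netcdf/bigWig; it is not
any of those formats, and the function names and the file name are
assumptions made up for the example.

```python
import os
import struct
import tempfile

ITEM = struct.calcsize("<i")  # bytes per little-endian int32 value


def write_track(path, values):
    """Write one int32 per genomic position, in position order."""
    with open(path, "wb") as fh:
        fh.write(struct.pack(f"<{len(values)}i", *values))


def read_window(path, start, end):
    """Read positions [start, end) without loading the rest of the file."""
    with open(path, "rb") as fh:
        fh.seek(start * ITEM)  # jump straight to the first requested value
        raw = fh.read((end - start) * ITEM)
    return list(struct.unpack(f"<{len(raw) // ITEM}i", raw))


# Hypothetical layout: one file per chromosome and strand
path = os.path.join(tempfile.mkdtemp(), "chr1_plus.int32")
write_track(path, list(range(0, 1000, 10)))
print(read_window(path, 3, 6))  # [30, 40, 50]
```

A real format would add chunking, compression and an index on top of
this, which is what makes a shared, well-defined container attractive
compared with ad hoc files.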
Any thoughts?
-steve
--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact