[Bioc-sig-seq] large BAM files and large BED files
mtmorgan at fhcrc.org
Fri Sep 16 23:29:01 CEST 2011
On 09/16/2011 02:11 PM, Michael Lawrence wrote:
> It sounds like you're trying to use BED as an alternative to BAM? Probably
> not a good idea, especially at this scale. Why are you aiming for a
> GenomeData? A GappedAlignments might be more appropriate. See
> GenomicRanges::readGappedAlignments() for bringing a BAM into a
The 'which' argument to readGappedAlignments (it will become 'param' in
the next release, and take a ScanBamParam object) allows you to select
regions to process, e.g., one chromosome at a time, to help with file size.
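
A minimal sketch of the chromosome-at-a-time idea, assuming an indexed BAM
file and the Rsamtools / GenomicRanges packages as they stood at the time of
this message. The helper name countReadsByChromosome and the file name in the
usage note are hypothetical; counting reads is only a stand-in for whatever
per-chromosome summary is actually wanted.

```r
library(Rsamtools)      # scanBamHeader()
library(GenomicRanges)  # GRanges(); readGappedAlignments() at the time of this thread

# Hypothetical helper: read one chromosome at a time so that only that
# chromosome's alignments are held in memory, and keep a small summary.
countReadsByChromosome <- function(bamFile) {
    # Reference sequence names and lengths, taken from the BAM header
    targets <- scanBamHeader(bamFile)[[1]][["targets"]]

    sapply(names(targets), function(chr) {
        # A range spanning the whole chromosome
        region <- GRanges(chr, IRanges(1, targets[[chr]]))

        # 'which' restricts the read to alignments overlapping 'region'.
        # With the next release, as noted above, this would instead be
        #   readGappedAlignments(bamFile, param = ScanBamParam(which = region))
        aln <- readGappedAlignments(bamFile, which = region)

        length(aln)  # replace with the per-chromosome computation needed
    })
}
```

Usage would look like countReadsByChromosome("aln.bam"), where "aln.bam" is a
placeholder and a matching index file (aln.bam.bai) is assumed to sit next to
it. In later Bioconductor releases the reader was renamed readGAlignments()
and moved to the GenomicAlignments package, so the call needs adjusting there.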
> This page might help:
> But it could really be improved.
> On Fri, Sep 16, 2011 at 1:44 PM, Rene Paradis<rene.paradis at genome.ulaval.ca
>> I am having trouble loading 30 GB BED files into memory. My call to
>> read.table fails with the error: Error in unique(x) :
>> length xxxxxx is too large for hashing.
>> This is raised by the function MKsetup in the file unique.c. Even after
>> increasing the value 10,000-fold, the error persists. I believe the
>> function is pushing more data into RAM, but I am not sure this is the
>> right thing to focus on.
>> Ultimately, I would like to produce a GenomeData object from either a
>> BAM file or a bed file.
>> Has anyone worked with very large BAM files (about 30 GB)?
>> Rene Paradis
>> Bioc-sig-sequencing mailing list
>> Bioc-sig-sequencing at r-project.org
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Telephone: 206 667-2793