[BioC] Rsamtools: Realloc integer overflow?
Hervé Pagès
hpages at fhcrc.org
Tue Jun 4 04:33:08 CEST 2013
Hi Martin,
On 06/03/2013 06:26 PM, Martin Morgan wrote:
> On 06/03/2013 05:27 PM, Michael Lawrence wrote:
>> Hey guys,
>>
>> Whenever I try to calculate the coverage for a BAM file with more than,
>> say, 500 million reads, I get this error:
>>
>> Error in coverage(readBamGappedAlignments(x, param = param), shift =
>> shift, : \n error in evaluating the argument 'x' in selecting a method
>> for function 'coverage': Error in value[[3L]](cond) (from #2) : \n
>> 'Realloc' could not re-allocate memory (18446744065128005632 bytes)\n
>>
>> This looks like integer overflow, possibly within _grow_SCAN_BAM_DATA().
>> Could we just use long there?
>
> I wonder if it would be more sensible, if less convenient, to do this
> (under Bioc-devel)
>
>     bf <- open(BamFile(fl, yieldSize=100000000))
>     cvg <- coverage(readGAlignmentsFromBam(bf))
>     while (length(aln <- readGAlignmentsFromBam(bf)))
>         cvg <- cvg + coverage(aln)
>     close(bf)
>
> ? It opens the door for better memory management and parallel evaluation.
>
> I'm concerned that using size_t (Realloc casts to this) or ptrdiff_t
> (the size of R long vectors) would only get us through the C code; the
> representation of this in R would require R long vectors, and Rsamtools
> does not (yet?) support that.
Sorry if I'm missing something obvious but why would the representation
of 500 million reads (either as a GappedAlignments object or as a plain
list as returned by scanBam()) require R long vectors?
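
For reference, the two numbers in this thread can be checked with a small C
sketch. It rests on an assumption that is not confirmed from the Rsamtools
source: that a 32-bit element count went negative and was multiplied by an
8-byte element size before being converted to an unsigned 64-bit size. The
names requested_bytes and fits_in_r_vector are hypothetical helpers for
illustration, not Rsamtools or R API functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the byte count an allocator would see if a signed
 * 32-bit element count is widened, converted to an unsigned 64-bit value,
 * and multiplied by the element size (arithmetic is modulo 2^64). */
static uint64_t requested_bytes(int32_t n_elt, uint64_t elt_size)
{
    assert(elt_size > 0);
    return (uint64_t)(int64_t)n_elt * elt_size;
}

/* Hypothetical helper: does an element count fit in an ordinary
 * (non-long) R vector, whose length is limited to 2^31 - 1? */
static int fits_in_r_vector(int64_t n_elt)
{
    return n_elt >= 0 && n_elt <= INT32_MAX;
}
```

Under that assumption, requested_bytes(-1072693248, 8) gives
18446744065128005632, the size reported in the error message, while
fits_in_r_vector(500000000) is true: 500 million elements is well below
2^31 - 1 = 2147483647.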
Thanks,
H.
>
> Martin
>
>>
>> Michael
>>
>
>
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319