[BioC] Integer overflow when summing an 'integer' Rle
Nicolas Delhomme
delhomme at embl.de
Fri Feb 10 17:04:19 CET 2012
Hi all,
While calculating some statistics for an RNA-seq experiment, I stumbled upon the following problem. Applying the IRanges coverage function to my IRanges, I get back an integer Rle object. However, trying to get the mean or sum of that Rle object results in an integer overflow. The following example illustrates the overflow.
library(IRanges)
rC <- Rle(values=as.integer(c(1,(2^31)-1,1)))
sum(rC)
mean(rC)
Both result in an integer overflow.
[1] NA
Warning message:
In sum(runValue(x) * runLength(x), ..., na.rm = na.rm) :
Integer overflow - use sum(as.numeric(.))
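For context, the same behaviour can be reproduced in base R without IRanges. The snippet below is only a minimal sketch of why the warning appears, assuming standard 32-bit R integers; the variable names are made up for illustration.

```r
# Base R integers are 32-bit; the largest representable value is 2^31 - 1.
int_max <- .Machine$integer.max

# Integer arithmetic that leaves that range yields NA (with a warning).
overflowed <- suppressWarnings(int_max + 1L)

# sum() over an integer vector behaves the same way when the total is too large.
int_sum <- suppressWarnings(sum(c(1L, int_max, 1L)))

# Converting to numeric (double) first keeps the result representable.
as_double <- as.numeric(int_max) + 1
```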
A workaround is to convert to numeric before multiplying and summing, so that neither the products nor the sum can overflow:
sum(as.numeric(runLength(rC)) * runValue(rC))
However, IMO this should be handled in the Rle code itself; i.e. an integer Rle can clearly have a sum, a mean, etc. whose result lies outside the integer range. Is there anything that speaks against having these functions internally convert the integer values to numeric before calculating the sum or mean?
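To make the suggestion concrete, here is a rough sketch of what such an internal conversion could look like. The helper names safeRleSum and safeRleMean are hypothetical and not part of IRanges, and the functions take plain vectors of run values and run lengths so that the example runs without the package.

```r
# Hypothetical helpers (not part of IRanges): sum and mean of a
# run-length encoded integer vector, computed in double precision.
safeRleSum <- function(values, lengths) {
  # Convert before multiplying so the products cannot overflow either.
  sum(as.numeric(values) * as.numeric(lengths))
}

safeRleMean <- function(values, lengths) {
  # Total divided by the number of encoded elements.
  safeRleSum(values, lengths) / sum(as.numeric(lengths))
}

# Same data as in the example above, as plain vectors: three runs of length 1.
values  <- as.integer(c(1, 2^31 - 1, 1))
lengths <- c(1L, 1L, 1L)

total <- safeRleSum(values, lengths)
avg   <- safeRleMean(values, lengths)
```

With an actual Rle, the inputs would be runValue(rC) and runLength(rC). The trade-off is that the result is always numeric rather than integer, even when the total would have fit in the integer range.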
Looking forward to hearing your thoughts on this,
Cheers,
Nico
sessionInfo()
R Under development (unstable) (2012-02-07 r58290)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] IRanges_1.13.24 BiocGenerics_0.1.4
loaded via a namespace (and not attached):
[1] tools_2.15.0
---------------------------------------------------------------
Nicolas Delhomme
Genome Biology Computational Support
European Molecular Biology Laboratory
Tel: +49 6221 387 8310
Email: nicolas.delhomme at embl.de
Meyerhofstrasse 1 - Postfach 10.2209
69102 Heidelberg, Germany