[Bioc-devel] GenomicRanges: Storing 'seqlengths' as numeric

Martin Morgan mtmorgan at fhcrc.org
Tue Dec 3 19:07:21 CET 2013


On 12/03/2013 02:29 AM, Julian Gehring wrote:
> Hi,
>
> Some of the chromosomes out in the world are fairly large (e.g. wheat chr 3B
> with > 995 Mbp [1]).  Currently, the 'seqlengths' of the reference sequence are
> stored as 'integers', which cannot hold lengths of this size.  Are
> there any plans to switch to 'doubles' or 64-bit integers for the
> 'seqlengths' slot?  Or extending the slot such that a user can store it either
> as integer or floating-point number?

But

 > .Machine$integer.max
[1] 2147483647

so we at least survive wheat chr 3B (995 Mbp is about 1e9 bp, which is below
this limit)?
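
A quick sanity check along these lines, as a sketch only: the 995e6 figure is
just the lower bound quoted above (the exact chr 3B length is not given in this
thread), and the variable names are made up for illustration.

```r
# Lower bound on the wheat chr 3B length taken from the quoted message
# (> 995 Mbp); illustrative, not the exact chromosome length.
chr3B_lower_bound <- 995e6

# Largest value an R integer can hold
int_max <- .Machine$integer.max

# Does the lower bound fit into an R integer?
fits <- chr3B_lower_bound <= int_max

# Going past the limit does not wrap around: integer overflow gives NA
# (with a warning, suppressed here to keep the output quiet).
overflow <- suppressWarnings(int_max + 1L)

print(fits)
print(is.na(overflow))
```

So lengths up to about 2.1 Gbp are fine as integers; anything beyond that would
become NA rather than a wrong number.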

If there is movement to support this, I'd encourage exact representation as
double (this is how R deals with long vectors, and I believe it is the
JavaScript representation of integers, so not completely unprecedented) rather
than 64-bit integers (which have no native support in R).
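
To illustrate what "exact representation as double" would buy us, here is a
minimal sketch. It assumes the usual IEEE 754 doubles, where whole numbers are
represented exactly up to 2^53; the 3e9 length is a hypothetical value chosen
only because it exceeds the integer range, and the variable names are
illustrative.

```r
# Whole numbers are exact in a double up to 2^53 (about 9e15)
big <- 2^53
exact_below   <- (big - 1) + 1 == big  # arithmetic below 2^53 stays exact
inexact_above <- big + 1 == big        # 2^53 + 1 is not representable

# A hypothetical sequence length beyond the integer range
len <- 3e9

# Stored as a double, it is still an exact whole number ...
is_whole <- len == floor(len)

# ... whereas coercing it to integer gives NA (with a warning)
as_int <- suppressWarnings(as.integer(len))

print(c(exact_below, inexact_above, is_whole, is.na(as_int)))
```

Code that accepts such lengths would need to check for whole-number values
itself, since a double does not enforce that the way an integer does.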

I guess this would be quite a big undertaking so real use cases need to be 
present. And support for larger integers would seem to be useful to R generally 
rather than just to Bioc.

Martin

>
> Best wishes
> Julian
>
>
> [1] http://www.sciencemag.org/content/322/5898/101


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


