[BioC] Biostrings - size limits

Steve Lianoglou mailinglist.honeypot at gmail.com
Thu Jun 24 17:57:59 CEST 2010


On Thu, Jun 24, 2010 at 9:24 AM, Erik Wright <eswright at wisc.edu> wrote:
> Hello,
> What limits the number of sequences and size of sequences that are part of a DNAStringSet?

(1) the amount of RAM on your computer; and
(2) the architecture (32-bit vs. 64-bit) of the R build you are running.
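
A quick way to check (2) from inside R is sketched below. It uses only base R (`.Machine` and `R.version`); the variable names are just for illustration, and it says nothing about how much RAM the OS will actually hand over.

```r
# Pointer size in bytes: 4 on a 32-bit build of R, 8 on a 64-bit build.
pointer_bits <- .Machine$sizeof.pointer * 8

# Architecture string R was built for (e.g. "i386" or "x86_64").
arch <- R.version$arch

cat(sprintf("R build: %s (%d-bit pointers)\n", arch, as.integer(pointer_bits)))
```

Running this on both machines should show right away whether one of them is using a 32-bit build.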

> Is there any way to change these limits?

Yes, change (1) or (2): add RAM, or move to a 64-bit build of R on a 64-bit OS.

> On my home laptop I can create a DNAStringSet of about sixty thousand sequences of length 1500.  On my desktop I can only make a DNAStringSet of twenty thousand sequences.

What's the error you get when you try to create these large
DNAStringSets on your desktop?

> Does anyone know what is creating this difference in maximum size?

What are the differences between the computers, specifically with
respect to RAM and architecture?

> Is there any way I can "up" the amount of memory available to a sequence set?  I would like to be able to have at least one hundred thousand sequences (of width 1500 nucleotides) in memory.

R doesn't limit the amount of memory on a per-object basis[*] -- it
just takes as much RAM as it can get from the OS.

[*] with the exception of the number of elements in a vector, which is
limited to 2^31 - 1 because R indexes vectors with signed 32-bit
integers, but AFAIK (i) Biostrings does an end run around this limit,
and (ii) you're only talking about 100,000 elements anyway, so this
isn't your problem.
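
To put rough numbers on this, here is a back-of-the-envelope sketch in base R. It assumes about one byte per nucleotide and ignores any per-object overhead, so treat the result as a lower bound rather than what Biostrings will actually allocate.

```r
# Figures taken from the question: 100,000 sequences of 1,500 nucleotides.
n_seqs <- 100000
seq_width <- 1500

# Assumption: roughly one byte per nucleotide, no per-object overhead.
total_bytes <- n_seqs * seq_width
total_mib <- total_bytes / 2^20

# Largest value of R's signed 32-bit integer type, i.e. 2^31 - 1.
max_elements <- .Machine$integer.max

cat(sprintf("Sequence data: about %.0f MiB\n", total_mib))
cat(sprintf("Sequences requested: %d (vector element limit: %d)\n",
            as.integer(n_seqs), max_elements))
```

Under that assumption the raw sequence data comes to roughly 143 MiB, and 100,000 elements is nowhere near the vector length limit, so available RAM and architecture remain the likelier suspects.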

>> sessionInfo()
>R version 2.11.0 (2010-04-22)

Is this for your laptop, desktop, or both?

Also, R 2.11.1 is out -- never hurts to upgrade ;-)


Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
