[Rd] Vector binding on harddisk
Simon Urbanek
simon.urbanek at r-project.org
Thu Feb 14 16:32:08 CET 2008
On Feb 14, 2008, at 6:32 AM, _ wrote:
> Hi all,
> Using big vectors (more than 4GB) is unfortunately not possible under
> Windows or other OSes if there is not enough RAM.
> Would it be possible to implement a new data type in R, like a
> vector, but instead of holding the information in memory, the data
> would reside in a file? When the data is accessed, the vector type
> would fetch it from the file automatically.
> There is a package out there (named ff), but the boundaries to be
> accessed have to be declared by the user, which is a disadvantage.
>
I don't think you have read the ff documentation carefully enough:
the package doesn't impose any limits itself. Whatever limits you hit
with it are due to the OS and/or R, so you cannot write the kind of
package you describe without hitting those same limits. They are as
follows: the size of an integer in R, which limits the length of a
single vector (2^31-1 ~ 2G entries on 32-bit machines), and the file
size limit of your OS. The former is a really hard limit; the only way
to overcome it (without modifying R) is to use multiple indices (which
the ff package suggests). You can overcome the file size limit by
simply using multiple files (or by using a more reasonable OS).
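
A rough sketch of the "multiple indices" idea mentioned above, written in
Python purely for the index arithmetic. It is not R code and makes no claim
about how ff is implemented. A position in a logical vector longer than
2^31-1 is split into a chunk number and an offset within that chunk, where
each chunk could be a separate R vector or a separate file. The names
MAX_CHUNK_LENGTH, split_index and chunks_needed are made up for illustration.

```python
# The per-vector length limit quoted above: the largest 32-bit signed integer.
MAX_CHUNK_LENGTH = 2**31 - 1


def split_index(global_index, chunk_length=MAX_CHUNK_LENGTH):
    """Map a 0-based global position to (chunk number, offset within chunk)."""
    if global_index < 0:
        raise ValueError("global_index must be non-negative")
    if not 0 < chunk_length <= MAX_CHUNK_LENGTH:
        raise ValueError("chunk_length must be between 1 and 2^31-1")
    # Quotient selects the chunk (vector or file), remainder the offset in it.
    return divmod(global_index, chunk_length)


def chunks_needed(total_length, chunk_length=MAX_CHUNK_LENGTH):
    """Number of chunks required to hold total_length entries."""
    if total_length < 0:
        raise ValueError("total_length must be non-negative")
    return -(-total_length // chunk_length)  # ceiling division
```

With these assumptions, a logical vector of 5 billion entries needs
chunks_needed(5 * 10**9) == 3 chunks, and the 0-based position 2^31-1 is the
first entry of the second chunk: split_index(2**31 - 1) == (1, 0). The same
arithmetic applies when the chunks are files kept below the OS file size
limit, with a smaller chunk_length.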
Cheers,
Simon