[Rd] data frame subset patch, take 2
Marcus G. Daniels
mgd at santafe.edu
Tue Dec 12 18:32:14 CET 2006
Hi Martin,
Conventions for optimizing away long, useless row name vectors sound very
useful. Nice timings too!
I've noticed that problem before and not been quite sure what to do about
it. For example, the hdf5 module just gives up past a certain threshold, as
the long vectors cause performance problems and HDF5 doesn't allow giant
attributes anyway. The common case for me is no row names except numbers.
> Note however that some of these changes are backward
> incompatible. I do hope that the changes gaining efficiency
> for such large data frames are worth some adaption of
> current/old R source code..
>
On numerous occasions I've used 64-bit Altix systems, e.g. with a
terabyte of RAM, for loading and preprocessing data, just so I can zip
around in the image once it is done (either on that system or
another). R works great for big datasets, even though it has a few of
these rough edges.
Marcus