[R] clara - memory limit
Martin Maechler
maechler at stat.math.ethz.ch
Wed Aug 3 19:23:45 CEST 2005
>>>>> "Nestor" == Nestor Fernandez <nestor.fernandez at ufz.de>
>>>>> on Wed, 03 Aug 2005 18:44:38 +0200 writes:
Nestor> I'm trying to estimate clusters from a
Nestor> very large dataset using clara but the program stops
Nestor> with a memory error. The (very simple) code and the
Nestor> error:
Nestor> mydata <- read.dbf(file = "fnorsel_4px.dbf")
Nestor> my.clara.7k <- clara(mydata, k = 7)
Nestor> Error: cannot allocate vector of size 465108 Kb
Nestor> The dataset contains >3,000,000 rows and 15
Nestor> columns. I'm using a Windows computer with 1.5 GB RAM;
Nestor> I also tried changing the memory limit to the
Nestor> maximum possible (4000M). Is there a way to calculate
Nestor> clara clusters from such large datasets?
One way to start is to read the help page, ?clara, more carefully
and hence use
clara(mydata, k=7, keep.data = FALSE)
^^^^^^^^^^^^^^^^^^^
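The failed allocation is suggestive here: 465108 Kb is about 476 million
bytes, i.e. roughly 59.5 million doubles, which at 15 columns is one
full double-precision copy of the roughly 4-million-row data matrix,
exactly the copy that keep.data = TRUE stores in the returned object.
A minimal sketch of a more memory-friendly call, assuming mydata is
all numeric (clara() needs numeric data); the samples/sampsize values
below are only illustrative, not recommendations:

    library(cluster)                  # clara() lives in the cluster package
    my.clara.7k <- clara(mydata, k = 7,
                         keep.data = FALSE, # don't store a copy of mydata in the result
                         medoids.x = FALSE, # nor the medoids as rows of mydata
                         samples   = 50,    # draw more subsamples than the default 5
                         sampsize  = 1000)  # rows per subsample (default is 40 + 2*k)

Since clara() only ever runs PAM on sampsize rows at a time, memory
use is dominated by copies of the full data rather than by the
clustering itself; dropping those copies is what keep.data = FALSE
buys you.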
But that might not be enough:
You may need a 64-bit CPU and an operating system (with system
libraries and an R version) that uses 64-bit addressing, i.e.,
not any current version of M$ Windows.
Nestor> Thanks a lot.
You're welcome.
Martin Maechler, ETH Zurich