[R] Error: cannot allocate vector of size 18.4 Gb (NbClust)

Ben Harrison harb at student.unimelb.edu.au
Thu Aug 22 13:39:08 CEST 2013


I have a 70363 x 5 matrix of doubles that I am exploring.

 > head(df)
         GR       SP         SN         LN       NEUT
1 1.458543 1.419946 -0.2928088 -0.2615358 -0.5565227
2 1.432041 1.418573 -0.2942713 -0.2634204 -0.5927334
3 1.406642 1.418226 -0.2958296 -0.2652920 -0.6267121
4 1.382284 1.418843 -0.2974732 -0.2671464 -0.6585127
5 1.358903 1.420360 -0.2991920 -0.2689792 -0.6881888
6 1.336436 1.422717 -0.3009756 -0.2707864 -0.7157941

In an attempt to explore it using clustering, I have tried the NbClust 
package with the following code:

library(NbClust)
nc <- NbClust(df, min.nc=5, max.nc=7, method="kmeans")

which returns the error
Error: cannot allocate vector of size 18.4 Gb

My workstation is an Intel Xeon with 23.5 GiB of memory.
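
As a rough check on where a figure like 18.4 Gb could come from: a pairwise distance object of the kind stats::dist produces stores n(n-1)/2 doubles. The sketch below only does that arithmetic for n = 70363. It assumes, without confirmation from the package source, that NbClust builds such an object internally.

```r
# Size of a pairwise distance object for n observations.
# Assumption: one double (8 bytes) per unordered pair, as in stats::dist.
n <- 70363                      # rows in the data set
n_pairs <- n * (n - 1) / 2      # number of unordered pairs of rows
bytes <- n_pairs * 8            # 8 bytes per double
gib <- bytes / 2^30             # convert bytes to GiB
round(gib, 1)                   # 18.4
```

If that assumption holds, the size depends only on the number of rows, so rounding the variables would not change it, while reducing the number of rows would.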

I do not know the memory requirements of the package, but for 
comparison, clustering the same data set with stats::kmeans is no problem.
What is causing this? Can anyone spell it out for me, so that 
perhaps I can do something to reduce the problem a little?
Or can anyone suggest a way to work around the memory restrictions?
Should I round off the variables?
Should I sample it, and analyse the sample?
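
For concreteness, the sampling idea might look something like the sketch below. It is only an illustration: the data are synthetic stand-ins with the same shape as df, the subsample size n_sub is an arbitrary hypothetical choice, and whether a random subsample is representative enough for this data is not something the sketch can answer.

```r
set.seed(1)  # make the random subsample reproducible

# Synthetic stand-in for the real data (same shape, made-up values).
df <- matrix(rnorm(70363 * 5), ncol = 5,
             dimnames = list(NULL, c("GR", "SP", "SN", "LN", "NEUT")))

n_sub <- 5000                        # hypothetical subsample size
idx <- sample(nrow(df), n_sub)       # row indices, drawn without replacement
sub <- df[idx, , drop = FALSE]       # the subsample, still 5 columns

# A pairwise distance object on the subsample needs n_sub*(n_sub-1)/2 doubles,
# which is roughly 95 MiB instead of many GiB.
dist_mib <- n_sub * (n_sub - 1) / 2 * 8 / 2^20

# The original call could then be tried on the subsample (needs the NbClust package):
# nc <- NbClust(sub, min.nc = 5, max.nc = 7, method = "kmeans")

# Base-R check that the subsample clusters without trouble.
km <- kmeans(sub, centers = 5, nstart = 5, iter.max = 50)
table(km$cluster)                    # cluster sizes on the subsample
```

Repeating this with a few different seeds would show whether the result is stable across subsamples.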

Thanks,
Ben.


