[R] pam() clustering for large data sets
    Lilia Nedialkova 
    lbravewo at princeton.edu
       
    Tue May 17 00:26:25 CEST 2011
    
    
  
Hello everyone,
I need to do k-medoids clustering on a data set of 50,000
observations.  I have already computed the pairwise distances between
the observations separately and tried to pass them to pam().
I got a "cannot allocate vector of length" error, so I realize this
job is too memory-intensive.  I am at a bit of a loss as to what to do
at this point.
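For a rough sense of scale, here is a back-of-the-envelope estimate of the
memory needed just to hold the distances.  It assumes they are stored as a
"dist" object (lower triangle only, 8-byte doubles); that storage layout is
my assumption, and pam() will likely need additional working memory on top
of this.

```r
# Rough memory estimate for the pairwise distances.
# Assumes a "dist"-style lower triangle of 8-byte doubles; this is an
# estimate, not a measurement of what pam() actually allocates.
n       <- 50000
n_pairs <- n * (n - 1) / 2      # number of unique pairs of observations
bytes   <- n_pairs * 8          # 8 bytes per double
gib     <- bytes / 2^30         # convert bytes to GiB
cat(sprintf("%.0f distances, about %.1f GiB\n", n_pairs, gib))
```

That works out to roughly 1.25 billion distances, or about 9.3 GiB, before
any copies are made.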
I can't use clara() because it takes the raw data rather than a
dissimilarity matrix, and I want to use the distances I have already computed.
What do people do to cluster data sets this large?
I would greatly appreciate any suggestions that people may have.
Thank you very much in advance.