[R] memory issue trying to solve too large a problem using hclust

Agustin Lobo alobo at ija.csic.es
Fri Nov 30 09:59:02 CET 2001


On Thu, 29 Nov 2001, Wiener, Matthew wrote:

> Hi, all.
> 
> I'm trying to cluster 12,500 objects using hclust from package mva.  The

But does this make sense? I often use R for the statistical analysis of
remotely sensed imagery, so I have much larger datasets. Might I suggest
the following (a rough R sketch of these steps is given after the list):

1. Study a subsample, applying many different methods (including hclust).
2. Define the centroids (both means and dispersions).
3. Use IDL, C, or R with C programs to assign all the objects to a centroid.
4. Select those objects with low maximum similarity and perform a dedicated
   analysis. Maybe there are rare classes that must be added to the set
   produced in step 2, or maybe there are just rare objects that should be
   left unclassified.
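
A minimal sketch of what I mean, assuming the objects are rows of a numeric
matrix x and that Euclidean distance to the cluster means is acceptable; the
subsample size, the number of clusters k, and the 99% cutoff below are
arbitrary placeholders, not recommendations:

## library(mva)   # needed in R 1.x; hclust now lives in package stats

set.seed(1)
x <- matrix(rnorm(12500 * 10), ncol = 10)   # stand-in for the real objects

## 1. Cluster a manageable subsample
idx  <- sample(nrow(x), 1000)
hc   <- hclust(dist(x[idx, ]), method = "average")
memb <- cutree(hc, k = 8)

## 2. Centroids: per-cluster means and a crude per-cluster dispersion
parts <- split(as.data.frame(x[idx, ]), memb)
cent  <- t(sapply(parts, colMeans))
disp  <- sapply(parts, function(d) mean(apply(d, 2, sd)))

## 3. Assign every object to its nearest centroid
d2   <- apply(cent, 1, function(ct) rowSums(sweep(x, 2, ct)^2))
near <- max.col(-d2)                      # centroid with minimum distance
dmin <- sqrt(d2[cbind(seq_len(nrow(x)), near)])

## 4. Objects that match no centroid well get a dedicated look
suspect <- which(dmin > quantile(dmin, 0.99))

The assignment step only forms an n x k distance matrix rather than the
n x n matrix hclust needs, so it should fit in memory; a dedicated C or
IDL routine should only be needed if the data do not fit in RAM at all.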

This procedure would have the advantage that you spend more of your time
exploring the data than dealing with system administration issues.

But this is just a suggestion.

Agus


