[R] Simple clustering help

donvolencia bdoshaughnessy1 at gmail.com
Thu Jul 14 19:59:10 CEST 2011


Hi all,

I have just begun to use R and am hoping to receive some advice about a
problem I need to solve.  I have a file of xy points in which I need to
find all significant clusters and write the xy coordinates of each cluster
to file (about 75000 points in total; a cluster is significant at 2500
points).  I want to use a Euclidean distance threshold to determine whether
a point belongs to a cluster.

My initial thought is to take a random (seed) point and write a region
growing method to determine how many points belong to its cluster
(basically, add to the cluster all points that are within the threshold of
any point already in it, and continue until no more points are within the
threshold).  Once there are no neighboring points within the threshold, if
the number of points added to the region (cluster) is greater than 2500 I
would write all of the points' coordinates to a text file, remove them from
the list of seed candidates and begin again.  If the cluster size is less
than 2500 I would simply remove the points, as they are not significant.
The process would continue until there are fewer than 2500 points remaining.
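
The procedure described above could be sketched in base R roughly as
follows.  This is only an illustration under stated assumptions, not a
tested solution for the real data: the function name grow_clusters, its
arguments (xy, eps, min_size) and the output file names are all made up
for the example, the size cutoff is treated as inclusive, and seeds are
taken in row order rather than at random (the resulting clusters do not
depend on which point is used as the seed).  The neighbour search is brute
force, so it is likely to be slow on ~75000 points.

```r
# Region growing with a Euclidean distance threshold (brute-force search).
# xy:       two-column numeric matrix of point coordinates
# eps:      distance threshold for joining a cluster (assumed parameter name)
# min_size: smallest cluster size regarded as significant (assumed inclusive)
# Returns an integer label per point; 0 means "not in a significant cluster".
grow_clusters <- function(xy, eps, min_size) {
  n <- nrow(xy)
  label <- integer(n)
  visited <- logical(n)
  next_id <- 0L
  for (seed in seq_len(n)) {
    if (visited[seed]) next
    # Stop once too few candidate points remain to form a significant cluster.
    if (sum(!visited) < min_size) break
    visited[seed] <- TRUE
    members <- seed
    frontier <- seed
    while (length(frontier) > 0) {
      p <- frontier[1]
      frontier <- frontier[-1]
      cand <- which(!visited)
      if (length(cand) == 0) break
      # Squared distances from point p to every point not yet assigned.
      d2 <- (xy[cand, 1] - xy[p, 1])^2 + (xy[cand, 2] - xy[p, 2])^2
      hits <- cand[d2 <= eps^2]
      visited[hits] <- TRUE
      members <- c(members, hits)
      frontier <- c(frontier, hits)
    }
    if (length(members) >= min_size) {
      next_id <- next_id + 1L
      label[members] <- next_id
    }
  }
  label
}

# Toy data: two 10 x 10 grids of points far apart, plus three isolated points.
a <- as.matrix(expand.grid(x = 0:9, y = 0:9))
xy <- rbind(a, a + 100, cbind(x = c(50, 60, 70), y = c(50, 50, 50)))
label <- grow_clusters(xy, eps = 1.5, min_size = 50)

# Write each significant cluster to its own text file (hypothetical names).
for (k in seq_len(max(label))) {
  write.table(xy[label == k, , drop = FALSE],
              file = file.path(tempdir(), sprintf("cluster_%d.txt", k)),
              row.names = FALSE, col.names = FALSE)
}
```

In the toy data each grid forms one cluster and the three isolated points
are left with label 0.  For the real file, xy would instead be read in
(for example with read.table) and eps and min_size set to the actual
threshold and to 2500.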

Is there a package that would be helpful for this task?

Thanks
Don


